User:ErAck/WorkFlow
Here I'm noting down my work flow and environment I use when working on code, hopefully giving some useful hints for others.
System
Preferably Debian. Etch did it. Lenny does it better. Which is what I use at home. At work I use Kubuntu, Solaris/SPARC, Solaris/x86, MacOSX/x86, and with a ten-foot pole only when necessary Windows XP.
Shell environment
With the migration to SVN the .svn/* subdirectories' content got in the way when using grep -r ..., so in ~/.bashrc I have
export GREP_OPTIONS='--exclude-dir=.svn'
respectively in ~/.cshrc
setenv GREP_OPTIONS '--exclude-dir=.svn'
Note: Some systems have an old version of grep that doesn't know that option and bails out if it is set. This will break the build later, so check by setting the variable and invoking grep to see whether it complains, and if so do not set the variable.
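The check described in the note can be automated in ~/.bashrc. This is a minimal sketch, not part of the original setup; grep_supports_exclude_dir is a hypothetical helper name, and the probe simply feeds grep one line on stdin to see whether it accepts the option.

```shell
# Hypothetical helper: succeed only if the installed grep accepts --exclude-dir.
grep_supports_exclude_dir() {
    # A grep that does not know the option exits non-zero with a usage error.
    echo probe | grep --exclude-dir=.svn probe >/dev/null 2>&1
}

# Set the variable only when it cannot break later grep invocations.
if grep_supports_exclude_dir; then
    export GREP_OPTIONS='--exclude-dir=.svn'
else
    unset GREP_OPTIONS
fi
```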
With the migration to Mercurial this is obsolete unless you plan to work on old branches of the SVN repository, or have other sources managed by SVN.
Vim editor
I've been a Vim addict since the mid 90s, so I use the stuff outlined in Editor Vim.
A useful link for gvim
I usually use plain vim in a build environment shell, but to be able to also compile sources from within a detached gvim I set up a symbolic link in $SRC_ROOT after having configured and sourced the build environment:
ln -s LinuxX86Env.Set.sh ENV.$INPATH
for bash respectively
ln -s LinuxX86Env.Set ENV.$INPATH
for tcsh.
Building
Out of habits introduced at Sun Hamburg labs, I usually follow the same naming scheme also at home when setting up CWSs, which is to checkout source code into a subdirectory of $CWS_WORK_STAMP/$WORK_STAMP/ooo, for example cwsname/DEV300/ooo
configure
When I don't need a full-blown tree because I don't plan to work on globally used stuff that would also affect the much disregarded binfilter binary filter module, I of course exclude that from the build, and I exclude many others as well. This boils build time down to 3.5 hours for the entire tree [hey, you Windows guys are getting envious, aren't you? ;-)]. My configure call currently (2009-03-14) is
./configure --enable-dbgutil --disable-strip-solver --with-use-shell=bash --disable-binfilter --without-fonts --without-ppds --disable-build-mozilla --with-system-stdlibs --disable-systray --with-build-version="Built by erAck" --with-vendor="erAck" --disable-odk --disable-qadevooo --disable-pdfimport --disable-mediawiki --disable-reportdesign --disable-neon --with-system-zlib --with-system-openssl --with-system-jpeg
Especially note
- --enable-dbgutil
- This builds a non-product version with assertions and various checks during runtime enabled. The output directories are without the .pro extension, for example unxlngi6 instead of unxlngi6.pro
- --disable-strip-solver
- Symbols are not stripped from the libraries, so we'll have useful information in the debugger for backtraces.
- --with-use-shell=bash
- use bash shell instead of the default tcsh.
Note: if you do not specify this and source the resulting LinuxX86Env.Set.sh to work in bash, the SHELL environment variable will still contain /bin/tcsh. This may be needed for the build process [does it really? I'd consider it a bug], but will interfere with other tools attempting to invoke a shell that is thought to match the current shell, e.g. if you source ENV.$INPATH from within gvim as mentioned above in A useful link for gvim.
More shell variables
I add these to the end of LinuxX86Env.Set.sh, replace cwsname and m42 and the content of my_OOO_TREE as appropriate. For tcsh replace export var="..." with setenv var "..." and add to LinuxX86Env.Set instead.
export CWS_WORK_STAMP="cwsname"
export my_UPDMINOR="m42"
export WORKSPACE_STAMP="$CWS_WORK_STAMP"
export my_OOO_TREE="$HOME/ooo/src/$WORKSPACE_STAMP/$WORK_STAMP"
export TMP="/tmp"
export CCACHE_DIR="$my_OOO_TREE/.ccache_${INPATH}"
ccache -M 2G -F 100000
export LOCALINSTALLDIR="$my_OOO_TREE/inst.${my_UPDMINOR}"
export PKGFORMAT="installed"
export BUILD_COMMAND="perl $SRC_ROOT/solenv/bin/build.pl"
- TMP
- For some reason (is there any?) configure does not inherit that, so set again.
- CCACHE_DIR and ccache
- Speeds up things significantly when rebuilding source. Note that the cache directory is set up such that different milestones and product and non-product versions don't interfere.
- LOCALINSTALLDIR and PKGFORMAT
- Building the installation set in module instsetoo_native creates a directly usable installation instead of packages. The location is, for example, .../cwsname/inst.m42/...
See also GullFOSS entry.
- BUILD_COMMAND
- build actually is an alias, setting up a variable enables use in inherited shells, from within the editor, or simply in a
time $BUILD_COMMAND --all
invocation.
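For the tcsh variant of the additions above, the replacement of export var="..." with setenv var "..." can be done mechanically. This is a minimal sketch; export_to_setenv is a hypothetical helper name, and lines that are not export statements (such as the ccache call) pass through unchanged.

```shell
# Hypothetical helper: convert bash "export VAR=value" lines read from stdin
# into tcsh "setenv VAR value" lines; all other lines are printed unchanged.
export_to_setenv() {
    sed -E 's/^export[[:space:]]+([A-Za-z_][A-Za-z0-9_]*)=(.*)$/setenv \1 \2/'
}
```

Assuming the bash additions were saved in a file of their own, here called additions.sh for illustration, export_to_setenv < additions.sh >> LinuxX86Env.Set would append the tcsh form.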
Using LOCALINSTALLDIR with a build in instsetoo_native is not deterministic if not building with PKGFORMAT installed or building for several languages or building language packs.
In these cases it is necessary to not set LOCALINSTALLDIR and let the build create deb or rpm packages, and after a complete build use
export LOCALINSTALLDIR=YourLocation
cd $SRC_ROOT/instsetoo_native/util
dmake openoffice_en-US PKGFORMAT=installed
If you also set FORCE2ARCHIVE=TRUE you'd get a .tar.gz archive you could extract to any place.
YMMV. See this mail for details about LOCALINSTALLDIR.
ccache
On my system I have set up a symbolic link /usr/local/bin/gcc -> /usr/bin/ccache so every source I build uses ccache; I never encountered problems with that. If you want to use ccache selectively for OOo, add the following variables:
export CC="ccache gcc"
export CXX="ccache g++"
build
In the OOo tree's root, effectively being .../cwsname/DEV300/ooo in this example, execute
source LinuxX86Env.Set.sh
./bootstrap
cd instsetoo_native
build --all -- -P2
Note that I don't invoke the dmake command in the tree's root. Reasons are:
- Fine grained call of the build command, specifying the number of processes to use. Here I create 2 dmake processes per source directory entered by the build script. As a rule of thumb, use 2 processes per CPU core, so if one is waiting for disk IO the other can do useful things. Yes, this is a much simplified view. For a CPU having 2 cores, this would make 4 processes. However, instead of simply specifying -P4 for dmake I prefer 2 build processes and 2 dmake processes per build process, which would be
build -P2 --all -- -P2
- In case the build breaks, for example in module svx, after having fixed things the build can be easily continued in that module by retrieving and editing the command line, while still in module instsetoo_native:
build --all:svx -- -P2
- In case I'm short of disk space, which may happen if I build both product and non-product, I add the --dlv_switch -link option, which creates hard links of the files delivered to the solver instead of copying them and may save 500MB or so per build.
build --dlv_switch -link --all -- -P2
- Build can create an HTML status page to load in the browser and watch progress. Add the --html option, and to still see the console output add --dontgraboutput as well. The HTML page is created as $SRC_ROOT/$INPATH.build.html
build --html --dontgraboutput --dlv_switch -link --all -- -P2
Ah, and of course I use
time $BUILD_COMMAND --html --dontgraboutput --dlv_switch -link --all -- -P2
instead and hope the build doesn't break so I can see 1:55:03 or some such ;-)
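The rule of thumb for process counts can be turned into a small helper. This is only a sketch under one assumption of mine: the 2-core example (2 build processes, each with 2 dmake processes) is generalized to one build process per core. suggest_build_args is a hypothetical name, and getconf _NPROCESSORS_ONLN is used to count cores where available.

```shell
# Hypothetical helper: print build arguments for a given number of CPU cores.
# Assumption: one build process per core, each running 2 dmake processes,
# which gives the "2 processes per core" rule of thumb.
suggest_build_args() {
    # Use the first argument if given, otherwise ask the system, falling back to 1.
    cores=${1:-$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)}
    printf '%s\n' "-P$cores --all -- -P2"
}
```

With 2 cores this prints -P2 --all -- -P2, so time $BUILD_COMMAND $(suggest_build_args) matches the invocation shown above, minus the optional switches.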
product and non-product
To build a product version in addition to the non-product version it would be enough to configure again but omit the --enable-dbgutil option. However, that would overwrite the already existing environment files, so
mv LinuxX86Env.Set LinuxX86.non-pro.Set
mv LinuxX86Env.Set.sh LinuxX86.non-pro.Set.sh
rm ENV.$INPATH
ln -s LinuxX86.non-pro.Set.sh ENV.$INPATH
./configure ... with all other options except --enable-dbgutil ...
# Again, edit the environment files as mentioned above to add variables, but
# this time set LOCALINSTALLDIR to "$my_OOO_TREE/inst_pro.${my_UPDMINOR}"
# to not overwrite the non-pro installation!
mv LinuxX86Env.Set LinuxX86.pro.Set
mv LinuxX86Env.Set.sh LinuxX86.pro.Set.sh
source LinuxX86.pro.Set.sh
ln -s LinuxX86.pro.Set.sh ENV.$INPATH
Note that it is not necessary to execute ./bootstrap again, as the resulting dmake executable is copied to and executed from solenv/$OUTPATH/bin for both pro and non-pro. Having renamed LinuxX86Env.Set.sh, it wouldn't even work because bootstrap sources that.
Note also that $INPATH contains either unxlngi6 for non-product or unxlngi6.pro for product, whereas $OUTPATH always contains unxlngi6 without any extension. This may be confusing, as one would assume $OUTPATH would be used for the output directories. Just remember that $INPATH is used in the solver path to pull in header files and libraries. In fact, $OUTPATH is a base name that gets extended in makefiles with .pro for product output directories, or may be extended with other extensions, for legacy reasons, see solenv/inc/settings.mk
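The relation between the two names can be illustrated in a few lines of shell. This only mimics the naming, it is not how the makefiles compute it; inpath_for is a hypothetical helper name.

```shell
# Illustration only: derive the $INPATH-style name from the $OUTPATH-style base name.
# $1: base name as held by $OUTPATH, $2: "pro" for product, anything else for non-product.
inpath_for() {
    case "$2" in
        pro) printf '%s\n' "$1.pro" ;;
        *)   printf '%s\n' "$1" ;;
    esac
}
```

inpath_for unxlngi6 pro prints unxlngi6.pro, while inpath_for unxlngi6 non-pro prints unxlngi6.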
Running
Run the office from the installation. Do not attempt to run it from the solver/bin directory; it won't work. Also do not run it from within the build environment's shell, as libraries from the wrong path would be pulled in and subsequent libraries not be found. Use a clean shell instead. In the example used here, the executable to run would be .../cwsname/DEV300/inst.m42/openoffice.org3/program/soffice. Execute it once to verify successful installation and step through the initial wizard.
To load documents attached to issues it is a good idea to set macro security to the highest possible value, i.e. execute macros only if the document resides in a specific location. Go to Tools → Options → OpenOffice.org → Security → Macro Security and choose Security Level Very high. If you need to execute macros to reproduce a bug you may add a directory to Trusted Sources in which you put such documents after having inspected the macros coming with the document.
Other options I modify:
- OpenOffice.org → Paths
- Change My Documents to the desired location; I have a bugdocs folder.
- Load/Save → General
- Uncheck Save AutoRecovery information every ... minutes.
It usually gets in the way at the most inconvenient point when debugging.
- Uncheck Size optimization for ODF format.
It's much easier to view the XML streams with each element on its own line when necessary, instead of everything on just one line.
Hint: use the XML pretty printer (xmlpp) to reformat streams of documents that were saved with this option enabled.
- Language Settings → Languages
- I check Enabled for Asian languages and Enabled for complex text layout (CTL) because I also work on i18n, not needed otherwise.
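To look at an XML stream of a saved document without unpacking it by hand, something like the following may help. It is a sketch that swaps in xmllint --format as the pretty printer instead of xmlpp, and falls back to the raw stream if xmllint is not installed; odf_stream is a hypothetical helper name and content.xml is the default stream.

```shell
# Hypothetical helper: print one XML stream of an ODF document to stdout.
# $1: document file, $2: stream name inside the package (default content.xml).
odf_stream() {
    doc=$1
    stream=${2:-content.xml}
    if command -v xmllint >/dev/null 2>&1; then
        # Pretty-print so that each element ends up on its own line.
        unzip -p "$doc" "$stream" | xmllint --format -
    else
        # No pretty printer available: print the stream as stored.
        unzip -p "$doc" "$stream"
    fi
}
```

For example, odf_stream bugdoc.ods styles.xml | less pages through the styles of a hypothetical bugdoc.ods.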
Little Helpers
To tame the source base I heavily use exuberant ctags and GNU id-utils and sometimes cscope. For scripts generating databases suitable for OOo see the Little Helpers.
Setting up the ctags and ID databases
In the build environment shell I create the ctags and GNU ID-utils databases using the scripts mentioned:
cd $SRC_ROOT
mkid-script '*'
ctags-script --global
cd sc
tagsID '{.,../formula}'
Note that '*' and '{.,../formula}' are enclosed in single quotes to prevent pathname and brace expansion on the command line. This creates
- ID
- Id-utils database for the entire OOo tree; generating this takes some time, 20 minutes or so.
- tags
- Tag file for $SRC_ROOT/solver/inc, containing symbols of all delivered header files.
- sc/tags
- Tag file for modules sc and formula.
- sc/ID
- Id-utils database for modules sc and formula.
- sc/cscope.*
- Cscope database for modules sc and formula.
Because I work within the sc module and the compiler and tokens are now derived from classes declared in module formula, I set up the combined databases, so Vim sees them as one entity. The global tags file is pulled in when an identifier is not found in the local tags file, and the global ID file comes in handy when working on changes that affect the entire office, for example to look up where a certain method is used.
Debugging
Let's assume we want to debug a call to an interpreter function. Let's further assume we don't know the implementation name and we don't bother looking it up through the chain resource file, resource header file, opcode file, and guess the corresponding interpreter function's method (for details see implementation of spreadsheet functions).
Per debug session, I prefer having a terminal with 3 tabs open:
- Shell with build environment, let's call it Build.
- Shell from which I run the office, let's call it Run.
- Shell for debugger, let's call it Debug.
This way they don't interfere with each other, preserving all stdout/stderr from the office run and having a clean screen for the debugger, which should not be started from within the build environment.
Build selected files with debug
Of course we could build just the entire sc module with debug, but I suppose you're just too impatient to wait for it, as I am. Instead, let's build just some objects with debug in the Build shell:
cd $SRC_ROOT/sc/source/core/tool
dmake killobj
dmake debug=t
cd ../../../util
dmake debug=t
- dmake killobj
- Removes object files corresponding to all source files in the current directory.
- dmake debug=t
- Builds all object files with debug that need to be rebuilt.
The final dmake debug=t in util links the shared libraries.
The sc/unxlngi6/lib/libscli.so shared library now has debug information for those objects.
mkd script
Now wait, instead of killing all objects in the tool directory, selectively building only interpreter relevant files would be sufficient. A script mkd called as
mkd interpr*.cxx
is a reusable solution. The second dmake call in the script is to build the object archive or other targets for the directory if all sources were successfully compiled. After that we still need to cd into the util directory to link the library using dmake debug=t. This can be accomplished by passing the --link or -l option to the script, so
mkd -l interpr*.cxx
does it all.
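For readers without the script at hand, the behaviour described can be approximated as follows. This is explicitly not the author's mkd script but a hypothetical stand-in, here called mkd_sketch. It relies on two assumptions of mine: touching a source file is enough to make dmake rebuild its object, and the module's util directory sits at ../../../util (override with UTIL_DIR). The DMAKE variable exists only so the command can be swapped out for a dry run.

```shell
# Hypothetical stand-in for the mkd helper; NOT the author's script.
mkd_sketch() {
    link=no
    case "$1" in
        -l|--link) link=yes; shift ;;
    esac
    [ "$#" -gt 0 ] || { echo "usage: mkd_sketch [-l|--link] source..." >&2; return 2; }
    dmake_cmd=${DMAKE:-dmake}
    # Refuse unmatched glob patterns or mistyped names instead of creating empty files.
    for f in "$@"; do
        [ -f "$f" ] || { echo "mkd_sketch: no such source: $f" >&2; return 1; }
    done
    # Mark the selected sources as modified so only their objects are out of date.
    touch "$@" || return 1
    # Rebuild the out-of-date objects and the directory's targets with debug.
    $dmake_cmd debug=t || return 1
    if [ "$link" = yes ]; then
        # Link the shared libraries in the module's util directory.
        ( cd "${UTIL_DIR:-../../../util}" && $dmake_cmd debug=t )
    fi
}
```

Running DMAKE=echo mkd_sketch -l interpr*.cxx in sc/source/core/tool shows which dmake calls would be made without building anything.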
Formula compiler, tokens, interpreter and document access
To not only have the interpreter with debug but also the compiler and methods that access the document and retrieve values from cells or interpret recursively, this comes in handy:
# To get down to the compiler roots add module formula core
cd $SRC_ROOT/formula/source/core/api
mkd -l *.cxx
# Calc related
cd $SRC_ROOT/sc/source/core/tool
mkd compiler*.cxx token*.cxx interpr*.cxx
cd ../data
mkd -l doc*.cxx tab*.cxx col*.cxx cell*.cxx
Run the office
In Run shell
# cd into the 3-layer office library directory
cd .../cwsname/DEV300/inst.m42/openoffice.org/basis3.0/program
# Create a backup of the original libraries and symbolic links to the debug
# libraries, only needed the first time of course:
mkdir bkp
cp -p libsc* bkp
cp -p libvba* bkp
cp -p libfor* bkp
ln -sf ../../../../ooo/sc/unxlngi6/lib/libsc* .
ln -sf ../../../../ooo/sc/unxlngi6/lib/libvba* .
ln -sf ../../../../ooo/formula/unxlngi6/lib/libfor* .
Note: This is explicitly for debugging Calc; for other modules use other libraries, of course. You do not need this to run the office. Creating symbolic links doesn't necessarily work with all libraries, as some of them then won't find the appropriate run path to dlopen other libraries. Some need to be copied instead. This may need experimenting. However, it works for the Calc libraries.
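To return to the original libraries later, the backups have to replace the symbolic links rather than be copied over them: cp onto an existing symbolic link follows the link and would overwrite the debug library in the build tree. A minimal sketch, with restore_originals as a hypothetical helper name, to be run in the same program directory that contains bkp:

```shell
# Hypothetical helper: put the backed-up libraries from ./bkp back in place.
restore_originals() {
    for f in bkp/*; do
        # Skip the literal pattern if bkp is empty.
        [ -e "$f" ] || continue
        name=${f##*/}
        # Remove the symbolic link first so cp cannot write through it.
        rm -f "./$name"
        cp -p "$f" "./$name"
    done
}
```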
Then execute the office:
../../../openoffice.org3/program/scalc & sleep 8 ; ps
This starts Calc in the background, sleeps for 8 seconds during startup (you may have to adapt the duration on a slower machine), and then displays the process list, one of them being soffice.bin, for example
1234 pts/6 00:00:01 soffice.bin
Remember the PID or copy it to the clipboard or selection.
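Instead of a fixed sleep followed by reading the ps listing, the PID can also be polled for. This sketch assumes pgrep is installed; office_pid and wait_for_office are hypothetical helper names, and the 30 second limit is an arbitrary choice.

```shell
# Hypothetical helper: print the PID of the newest process with the given name.
# -x matches the process name exactly, -n selects the most recently started one.
office_pid() {
    pgrep -n -x "${1:-soffice.bin}"
}

# Poll once per second, for up to 30 seconds, until the process shows up.
wait_for_office() {
    tries=0
    until office_pid "$@"; do
        tries=$((tries + 1))
        [ "$tries" -lt 30 ] || return 1
        sleep 1
    done
}
```

In the Debug shell this allows, for example, gdbtui --pid=$(wait_for_office) without copying the PID by hand.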
Debug the beast
In the Debug shell it is best to cd into some source code subdirectory, for example sc/source/core/tool, otherwise the debugger sometimes may not find included header files when stepping through inline methods. You may of course also use the --cd option instead, or put a cd command in a gdb command file and execute it on startup for frequent use. Consult the documentation for details. Then invoke the gdb debugger with the Text User Interface, gdbtui, and attach it to the office executable already running:
gdbtui --pid=1234
or, if that doesn't work
gdbtui .../cwsname/DEV300/inst.m42/openoffice.org3/program/soffice.bin 1234
where 1234 is of course the actual PID of the running process. If your system doesn't have gdbtui, try
gdb -tui
or
gdb --tui
or
gdb --interpreter=tui
instead. If that still doesn't come up with an extra text frame, you're out of luck or on MacOSX or both ;-)
When it loads the libraries you may have to press the enter key a few times during the listing, and then it waits for command input, with the executable being interrupted. To debug some interpreter function as mentioned above, the central entry point would be to set a breakpoint at ScInterpreter::Interpret(), so enter
b 'ScInterpreter::Interpret()'
c
where b sets a breakpoint, here at the entry of the desired function, and c continues running the program. Note that you don't have to type the full classname::method for the breakpoint; it is sufficient to type an unambiguous portion of the name and press the Tab key for completion. This might be
b 'ScInterpreter::Int<TAB>
Note that the leading single quote is needed for this functionality.
If you now enter a formula in a cell, the debugger breaks as soon as ScInterpreter::Interpret() is reached and you may start stepping through. In command line mode this would be pressing n Enter for next program line, stepping over function calls, or s Enter stepping into function calls. The previous command could be repeated by just pressing Enter.
However, we started with the TUI, so we will take advantage of it. Press Ctrl-x s to switch to SingleKey mode; from now on you can use just n or s. A single c continues execution. Other single keys do different things, experiment or read the fine manual. You may leave SingleKey mode at any time by pressing q.
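For frequent use, the cd command and the breakpoint can be kept in a gdb command file, as hinted at earlier. The sketch below writes such a file; write_gdb_commands and the file name calc.gdb are hypothetical. It assumes gdb runs the commands given with -x only after it has attached to the process, so that the Calc library is already loaded when the breakpoint is set.

```shell
# Hypothetical helper: write a gdb command file for a Calc interpreter session.
# $1: command file to create, $2: source directory gdb should change into.
write_gdb_commands() {
    cat > "$1" <<EOF
cd $2
break 'ScInterpreter::Interpret()'
EOF
}
```

After write_gdb_commands calc.gdb .../cwsname/DEV300/ooo/sc/source/core/tool, start the session with gdb --tui -x calc.gdb --pid=1234 and enter c to continue.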
As detailed documentation for gdb and TUI sometimes is not installed on systems (check info gdb) and the info program is cumbersome to use if not used to, here's the online documentation: Debugging with GDB and the GDB Text User Interface
See also
The Debugging page gives useful hints and tips.
Profiling
I use the great Callgrind and KCachegrind tools to create call graphs and see where performance bottlenecks are.
Performance profiling of course needs to be done in a product version, compiled with optimizations and without assertions and test code. So cd into .../cwsname/DEV300/ooo and source ENV.unxlngi6.pro. Everything needed to be done hopefully is explained in profiling OOo with callgrind.