This guide is becoming obsolete - see the wiki for better maintained information
Building and hacking on OpenOffice.org (OO.o) entails climbing a fairly lengthy incline. Hopefully this document will make the learning curve somewhat steeper and more abrupt, and will give you a walking stick to help you out. Older hacker's guides targeting the 1.1 tree are available (also in German & Japanese).
This document assumes that you'll be using a reasonably current Linux system, as a time saving feature. Real hackers use Free software, and don't have time to read about non-Free stuff.
We aim to answer at least the following questions:
If you need help getting OO.o built, and you intend to hack on it, please join the mailing list here and ask questions there.
There are loads of versions of OO.o, and several choices of branch, with multiple outstanding patch sets. I recommend you build from up-stream CVS HEAD milestones (SRC680 milestones), with patch sets to make them easier to build from here.
The very latest ooo-build (a small ~1.5Mb build wrapper) can be got from CVS thus:
export CVSROOT=':pserver:anonymous@anoncvs.gnome.org:/cvs/gnome'
cvs login
cvs -z3 checkout -P ooo-build
Note: You are going to need to download an additional ~170Mb of compressed source, and have ~3Gb of space to unpack and build it in.
The build process is pretty complicated; you have a choice of commands now; although running both won't actually hurt:
./autogen.sh # only for the CVS version
./configure # the packaged version
This will guess which branch snapshot you want to build; if
you have other ideas use the --with-tag option; eg.
--with-tag=src680-m65 for a legacy branch.
If for some reason you have a 31337 multi-threaded computer, with great slabs of RAM; you'll want to use --with-num-cpus=8 etc. NB. it's not clever to force the build to swap like a demented pawnbroker by using an artificially high number; C++ compilation is seriously memory hungry.
In particular, building SRC680 requires a recent jdk & a
version of apache-ant. If you use a Novell system, just do:
sudo rug in apache-ant, alternatively download
a package from
rpmfind.net, or failing that see
Ant download & set the ANT environment variable
appropriately before configuring.
By the time you've upgraded your system to the point that it
has all the packages you need to start building OO.o (mozilla,
recent libart etc. etc.) you're almost at the point that you
can download the bulk of the source. To do this, after a
successful configure simply type: ./download and wait.
If for whatever reason this fails, you can verify your download
by fetching the equivalent .md5 file & comparing it to
the result of md5sum <archive>. The source
archives are
here - put the source in ooo-build/src.
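The comparison described above can be scripted. What follows is a minimal sketch, not something shipped with ooo-build: verify_md5 is a hypothetical helper name, and it assumes the .md5 file starts with the hex digest, as in the output md5sum itself writes.

```shell
#!/bin/sh
# verify_md5 ARCHIVE MD5FILE (hypothetical helper, not part of ooo-build)
# Succeeds only if the digest of ARCHIVE equals the first field of MD5FILE.
verify_md5() {
    # Assumption: the first whitespace-separated field of the .md5 file
    # is the hex digest, as in the output of md5sum.
    expected=$(awk 'NR == 1 { print $1 }' "$2")
    actual=$(md5sum "$1" | awk '{ print $1 }')
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Example: verify_md5 src/<archive> src/<archive>.md5 && echo "download OK"
```

A non-zero exit status means the archive is missing, truncated or corrupt, and is worth fetching again before starting a multi-hour build.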
This is the taxing bit - type make and don't forget
to press enter. Quite possibly you want to log the output, so
why not make 2>&1 | tee /tmp/log.
Since ooo-build wraps the actual OO.o configuration & build process, there are a number of internal config checks that also need to pass. For a first time build it's well worth staying near the console while everything unpacks, and the internal configure runs; if that completes without incident - you're usually into the heavy-duty thumb twiddling.
When everything has finished building; you should get some happy
looking message. The easiest way to install is:
bin/ooinstall -l <path-to-install-to>
(I often use /opt/OOInstall).
If you are a packager, you'll want to run make install
which honours DESTDIR & does other packager-like things.
Note: The '-l' to ooinstall runs a linkoo on the installed result.
Now wander into /opt/OOInstall/program and do:
source ./ooenv, which will set up your (bash) shell for
running OO.o directly. Then simply run ./soffice.bin -writer.
This is better than running soffice, or a wrapper script since
it's very easy to use the debugger: gdb soffice.bin.
Note: ooenv was formerly known as
env. It was renamed not to conflict with /usr/bin/env.
So - we've built and run OO.o, and we want to prove to ourselves that it is in fact possible to hack on it. So in a new terminal do this:
cd build/src680-m66
. ./LinuxIntelEnv.Set.sh
cd vcl
Now have a hack at vcl/source/window/toolbox2.cxx; I suggest
adding (eg.) an nPos = 0 anywhere before the
m_aItems.insert in the 4th InsertItem method:
void ToolBox::InsertItem( USHORT nItemId, const XubString& rText,
ToolBoxItemBits nBits, USHORT nPos ).
Then save.
You're still in vcl/ yes ? then type 'build debug=true'; wait
for the scrolling text to stop; (5 seconds?). Now re-run
soffice -writer. You should notice the effect.
If not, ensure the previous soffice.bin was dead with
killall -9 soffice.bin
You can find more things to hack in the tutorials.
Note: for day to day hacking you want to just run 'build' inside the source tree. It is also highly recommended to work inside a copy of the build tree, and generate / test patches in an un-hacked version. To copy just the build/src680-m66 directory elsewhere, you need to use the relocate tool.
With the power of C++ comes the ability to shoot yourself in the foot all the more easily; (and implicitly), cf. Holub, Rules for C and C++ programming, McGraw-Hill, 95.
The best way to prepare yourself for battle is to read the OpenOffice coding guidelines here, and for the easily confused c'tor / d'tor is short for constructor / destructor.
It is seldom clear from bugzilla which module a patch resides in.
A quick way to try and work this out is to do:
cvs status <somefile> | head
This should give a 'Repository Revision:' line, with a path; the
2nd fragment of this is the project name.
ooo-build/bin/owner automates that process for you.
In addition, since the mapping of module names to IssueZilla tickets is rather contorted & un-documented, if you know what module the bug is in, use this page to file it.
As you start soffice.bin, there are several useful parameters to
use to accelerate your debugging experience; particularly
-writer, -calc, -draw,
and (the wizardly painful) -impress arguments.
While the build system is similar to many other systems, it is also perhaps slightly different. The overview is that each module is built, and then the results are delivered into the solver. Each module builds against the headers in the solver. Thus there are a few intricacies.
build then un-winds internal module dependencies, and builds each module with a chdir, dmake pair.
There are various standard directories and files in most of the modules that make up OO.o, here are some of the more useful:
Build's mode of operation is to invoke 'dmake' in each of the projects' directories with a given dependency order. dmake then executes the rules in makefile.mk.
On first view build.lst looks scary:
vc vcl : NAS:nas FREETYPE:freetype psprint rsc sot ucbhelper unotools sysui NULL
vc vcl usr1 - all vc_mkout NULL
vc vcl\source\unotypes nmake - all vc_unot NULL
vc vcl\source\glyphs nmake - all vc_glyphs vc_unot NULL
so we need to try and un-pack what's going on here, which is in fact not
as odd as it might seem at first glance. Firstly lists are terminated
by the 'NULL' string. Every line is prefixed by a shortcut which is
irrelevant.
[shortcut] [path to dir to build] nmake - [flags] [unique-name] [deps...] NULL
vc vcl\source\glyphs nmake - all vc_glyphs vc_unot NULL
shortcut is not used.
flags determines which platforms this builds on; usually single char platform codes: 'dnpum', 'u' being Unix. The higher up the system, the more stuff is flagged 'all'.
unique-name is a magic name, used by other lines to describe an internal dependency.
deps... is any number of names of other directories in this file, that must be built before this one.
There is also documentation here on it.
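To make the field layout concrete, here is a small sketch that pulls the directory, flags, unique name and dependencies out of the 'nmake' lines of a build.lst. list_build_dirs is a hypothetical name, not a tool from the OO.o tree, and the script assumes fields are separated by plain whitespace exactly as in the example lines.

```shell
#!/bin/sh
# list_build_dirs FILE (hypothetical helper, not part of the OO.o tree)
# For every "nmake" line print: directory, flags, unique-name, deps.
# Deps are comma-separated; "-" means the line lists none.
list_build_dirs() {
    awk '$3 == "nmake" {
        deps = ""
        # Fields 7.. are dependency names, up to the NULL terminator.
        for (i = 7; i <= NF && $i != "NULL"; i++)
            deps = deps (deps == "" ? "" : ",") $i
        print $2, $5, $6, (deps == "" ? "-" : deps)
    }' "$1"
}

# Example: list_build_dirs vcl/prj/build.lst
```

Lines that do not have 'nmake' as their third field, such as the module line with the ':' and the 'usr1' line, are simply skipped by this sketch.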
The syntax of d.lst is more comprehensible than that of build.lst; it omits some default actions, such as copying build.lst into inc/<module>/build.lst.
A line is of the form:
[action]: [arguments]
mkdir: %_DEST%\inc%_EXT%\external
where if '[action]:' is omitted, it defaults to the 'copy' action.
Typical actions are copy, mkdir, touch, hedabu,
dos and linklib.
The 'hedabu' action is particularly interesting, inasmuch as it cosmetically re-formats the header to shrink it on install (otherwise it's much like the copy action).
During the action, various macro variables are expanded some of which are:
..\%__SRC%\inc\sal\*.h %_DEST%\inc%_EXT%\sal\*.h
NB. relative paths are relative to the 'prj/' directory.
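The "defaults to copy" rule is easy to see with a tiny sketch that normalises d.lst lines into an explicit action plus its arguments. dlst_actions is a hypothetical helper name, and the sketch assumes an explicit action is a lower-case word immediately followed by a colon; it does not expand any of the macro variables.

```shell
#!/bin/sh
# dlst_actions FILE (hypothetical helper, not part of the OO.o tree)
# Print "<action><TAB><arguments>" for each non-empty line of a d.lst.
dlst_actions() {
    awk 'NF > 0 {
        action = "copy"                 # default when "[action]:" is omitted
        if ($1 ~ /^[a-z]+:$/) {         # assumption: actions look like "mkdir:"
            action = substr($1, 1, length($1) - 1)
            $1 = ""                     # drop the action field...
            sub(/^ +/, "")              # ...and the separator left behind
        }
        print action "\t" $0
    }' "$1"
}

# Example: dlst_actions external/prj/d.lst
```

Run against the two example lines above, this reports one 'mkdir' action and one 'copy' action.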
char *, please?
Just barely. OO.o has at least six string wrappers, although the C implementations are of little interest:
rtl_String — sal/inc/rtl/string.h
"Normal" string plus reference counting.
rtlstring->buffer is useful, as is
rtlstring->length. This object encapsulates
a generic 8-bit string - of unknown encoding. Feel free to treat
rtlstring->buffer as your beloved char *.
If you really want to look at the implementation of some
rtl_String function and neither lxr nor grep can help you, have
a look at sal/rtl/source/strtmpl.c.
OString — sal/inc/rtl/string.hxx
Simply a rtl_String wrapped inside a class; you
can use ostring.pData to get at the rtl_String
(it's public). OString has reasonably useful
methods for if you need them.
rtl_uString — sal/inc/rtl/ustring.h
"Normal" Unicode string, similar to rtl_String, and
refcounted as well. However, this one always comes in UCS-2
encoding, presumably to be compatible with Java's
questionable choices.
See rtl_String above to find where the implementation
of some rtl_uString functions is hidden.
OUString — sal/inc/rtl/ustring.hxx
An rtl_uString wrapped inside a class. This is
what most of the OO.o code uses to pass strings around.
To convert an OString to an OUString
it is necessary to specify the character set of the
OString; see sal/inc/rtl/textenc.h
— the only interesting case is RTL_TEXTENCODING_UTF8
String — tools/inc/string.hxx
This is an obsolete string class, aliased to 'UniString'.
It has a number of limitations such as a 64k length limit.
A couple of conversion functions are really useful here, particularly:
rtl::OString aOString = ::rtl::OUStringToOString (aOUString, RTL_TEXTENCODING_UTF8);
And the reverse:
rtl::OUString aOUString = ::rtl::OStringToOUString (aOString, RTL_TEXTENCODING_UTF8);
If you just want to programmatically print out a string for debugging purposes you probably want to see this.
Linkoo is the tool that implements the -l
functionality of bin/ooinstall. It essentially
sym-links files of similar names into your local tree,
allowing a fast development iteration.
It is however slightly limited - some of the modules
cannot be linked for various reasons; these are:
cppuhelper and configmgr,
thus in the rare case that these are altered, they
must be copied manually into /opt/OOInstall/program.
In addition symlinks cannot be used for soffice.bin, and
this is more commonly altered - it has to be installed
from desktop/unxlngi4.pro/bin/soffice NB.
with an appended '.bin'
This section assumes use of gdb, from the console.
OO.o includes a way to add debugging code in per module, via
the build debug=true command in each module.
This also adds lots of runtime assertions,
churning warnings etc. in addition to debug symbols - which
can be useful. To do just a plain build with debug symbols
though use build debug=true dbg_build_only=true
You can also configure OO.o with --enable-symbols to build with symbolic generation.
We start in 'main' with a sal wrapper, that calls vcl/source/app/svmain.cxx (SVMain). It invokes Main on pSVData->mpApp; but pSVData is an in-line local. To debug this use the pImplSVData global variable. eg:
p pImplSVData->maAppData
This 'Main' method is typically:
desktop/source/app/app.cxx (Main).
We have already seen that OO.o has
its own set of string classes, none of which gdb understands.
You need to use:
(gdb) print dbg_dump(sWhatEver) to print the contents
of a UniString/ByteString/rtl::OUString/rtl::OString regardless
of the type when debugging C++ code. See Caolan's write-up
here for details.
The build dependencies of the modules are clearly crucial to getting a clean build. When you type 'build' in a module, first build examines prj/build.lst, eg. neon/prj/build.lst:
xh neon : soltools external expat NULL
this specifies that 'soltools', 'external' and 'expat' have to
be satisfactorily built and delivered before neon can be built.
Occasionally these rules get broken, and people don't notice for
a while.
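One way to eyeball whether the declared dependencies are consistent is to turn the module lines into pairs and feed them to tsort, which prints an order in which every module comes after the modules it depends on. This is only a sketch of the idea, not how 'build' itself is implemented. module_deps is a hypothetical name, and stripping an upper-case "PREFIX:" from entries such as NAS:nas is an assumption about what those entries mean.

```shell
#!/bin/sh
# module_deps FILE... (hypothetical helper, not how "build" itself works)
# Print "dependency module" pairs, one per line, suitable for tsort.
module_deps() {
    awk '$3 == ":" {
        for (i = 4; i <= NF && $i != "NULL"; i++) {
            dep = $i
            # Assumption: "NAS:nas" names the module "nas".
            sub(/^[A-Z0-9_]+:/, "", dep)
            print dep, $2
        }
    }' "$@"
}

# Example: module_deps */prj/build.lst | tsort
```

If the declarations ever contain a cycle, tsort complains about it on stderr, which is a cheap hint that one of the rules has been broken.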
What fun — you symlinked desktop/unxlngi4.pro/bin/soffice to soffice.bin in your install tree didn't you. That works fine if you just run it, but it seems gdb unpacks the symlink and passes a fully qualified path as argv[0], which defeats the hunting for the binary in the path, so it assigns the program base path as /opt/OpenOffice/OOO_STABLE_1/desktop/unxlngi4.pro/bin and starts looking for (eg. applicat.rdb) in there. Of course when it fails to find any setup information, it silently crashes somewhere else yards away from the original problem.
For various reasons signal handlers are trapped and life can get rather confusing; thus it's best for builders to apply something like this:
--- sal/osl/unx/signal.c
+++ sal/osl/unx/signal.c
@@ -188,6 +188,8 @@ static sal_Bool InitSignal()
bSetILLHandler = sal_True;
}
+ bSetSEGVHandler = bSetWINCHHandler = bSetILLHandler = bDoHardKill = sal_False;
+
SignalListMutex = osl_createMutex();
act.sa_handler = SignalHandlerFunction;
NB. trailing space.
Some methods are described as having a special linkage, such that they can be used in callbacks; these typically have a prefix: 'LinkStub', so search for the latter part of the identifier in a freetext search. eg.
IMPL_LINK( Window, ImplHandlePaintHdl, void*, EMPTYARG )
builds the 'LinkStubImplHandlePaintHdl' method.
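If you would rather search a local checkout than use a web text search, the same trick can be sketched with grep. find_link_handler is a hypothetical helper name, and the pattern assumes the handler is declared with an IMPL_LINK-style macro laid out like the example above (class name first, handler name second).

```shell
#!/bin/sh
# find_link_handler STUBNAME [DIR] (hypothetical helper)
# Strip the LinkStub prefix and grep for the macro that declares the handler.
find_link_handler() {
    handler=${1#LinkStub}   # LinkStubImplHandlePaintHdl -> ImplHandlePaintHdl
    # Assumption: declarations look like IMPL_LINK( Class, Handler, ... ).
    grep -rnE "IMPL_LINK[A-Z_]*\( *[A-Za-z0-9_]+ *, *$handler *," "${2:-.}"
}

# Example: find_link_handler LinkStubImplHandlePaintHdl vcl/source
```

The output is grep's usual file:line:text form, so it can be fed straight to your editor.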
Often when you run gdb on a build without debugging symbols, you get an unhelpful gdb trace, but yet you can't afford the time/space to recompile all of OO.o with debugging symbols. Thus we have created a small perl helper, which will hunt for & touch files containing the symbols from your trace. This sub-set can then be re-built with debugging enabled for a better trace next time around:
gdb ./soffice.bin
...
bt
#0 0x40b4e0a1 in kill () from /lib/libc.so.6
#1 0x409acfe6 in raise () from /lib/libpthread.so.0
#2 0x447bcdbd in SfxMedium::DownLoad(Link const&) () from ./libsfx641li.so
#3 0x447be151 in SfxMedium::SfxMedium(String const&, unsigned short, unsigned char, SfxFilter const*, SfxItemSet*) ()
from ./libsfx641li.so
#4 0x448339d3 in getCppuType(com::sun::star::uno::Reference const*) () from ./libsfx641li.so
...
quit
cd base/OOO_STABLE_1/sfx2
ootouch SfxMedium
build debug=true
Thus, all files referencing / implementing anything with SfxMedium will be touched, and hence rebuilt with debugging symbols.
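For the curious, the idea behind the helper can be sketched in a few lines of shell. This is not the real ootouch script: touch_symbol_users is a hypothetical name, and restricting the search to *.cxx and *.hxx files is an assumption made to keep the example short.

```shell
#!/bin/sh
# touch_symbol_users SYMBOL [DIR] (hypothetical sketch, not the real ootouch)
# Touch every source file mentioning SYMBOL so the next build recompiles it,
# and print the name of each file touched.
touch_symbol_users() {
    # Assumption: only *.cxx and *.hxx files are of interest.
    find "${2:-.}" -type f \( -name '*.cxx' -o -name '*.hxx' \) \
        -exec grep -lF -- "$1" {} + |
    while IFS= read -r file; do
        touch "$file"
        printf '%s\n' "$file"
    done
}

# Example: touch_symbol_users SfxMedium . && build debug=true
```

Because only the matching files get a new timestamp, the follow-up build with debug=true recompiles just that sub-set.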
If you want to recompile the code in just your current directory, you can use the killobj dmake target to remove the object files:
dmake killobj
dmake
You are a victim of asynchronous X error reporting;
export SAL_SYNCHRONIZE=1 will make all the X traffic
synchronous, and report the error by the method that caused it,
it'll also make OO.o far slower, and the timing different.
Caolan suggests: put breakpoints in ww8par.cxx top and tail of SwWW8ImplReader::LoadDoc, and confirm that the document gets as far as the import filter.
A handy human place to put a breakpoint is in SwWW8ImplReader::ReadPlainChars, you can see chunks of text as they are read in. Alternatively SwWW8ImplReader::AppendTxtNode as each paragraph is inserted.
So OO.o contains some hefty debugging infrastructure; pictured here. Unfortunately enabling it is not altogether trivial. Firstly - none of it is built into a product build; so we need to re-build some core parts of OO.o as non-product builds; and then we need to re-run linkoo to link those new builds into our set.
First create a debug Environment file; I call it LinuxIntelEnv.Set.debug:
TMPFILE=~/.Env.Set.debug
# Purge .pro bits
sed 's/\.pro//g' LinuxIntelEnv.Set.sh > $TMPFILE
. $TMPFILE
rm $TMPFILE
# Clobber product parts
unset PRODUCT PROSWITCH PROFULLSWITCH
Now do source ./LinuxIntelEnv.Set.debug; this
sets up your environment for a non-product build.
cd vcl; build dbgutil=true --all
linkoo
Now - just run OO.o, and when it's in full-flow, press
<Alt>-<Shift>-<Control> 'D' in that
order; this should popup a debugging options window.
The debugging options
are subsequently saved to the .dbgsv.init file for the
next run; you can control the location of that with:
export DBGSV_INIT=$HOME/.dbgsv.init eg. It
is (unfortunately) a binary file.
This is fairly easy; edit sc/source/filter/inc/biffdump.hxx,
define EXC_INCL_DUMPER to 1, and re-build 'sc'. Also, copy
sc/source/filter/excel/biffrecdumper.ini to ~. Then run
soffice.bin foo.xls and you should get a
foo.txt with the debug data in it.
OO.o is a fairly threaded program, you're prolly just looking
at the wrong thread: there are not likely to be bugs in poll.
Use thread apply all backtrace to get a backtrace
of all threads - this will most likely fail. When it does do:
thread 1 then bt - most crashers
occur in the 'main' thread.
There are several typical stack-traces that come up again and again, one would be:
#15 0x4164a501 in raise () from /lib/tls/libc.so.6
#16 0x4164bcd9 in abort () from /lib/tls/libc.so.6
#17 0x415fb5a5 in std::set_unexpected ()
from /home/mnagashree/m72install/program/libstdc++.so.5
#18 0x415fb5e2 in std::terminate ()
from /home/mnagashree/m72install/program/libstdc++.so.5
#19 0x415fb69c in __cxa_rethrow ()
This section of trace means (essentially) that an exception was thrown - but there was no-one trying to catch it. Often this means there was a missing 'try {} catch()' clause in one of the calling frames.
A great way to debug exceptions is to add a breakpoint
in catch/throw, do this with catch throw or
catch catch in gdb.
Always use unified diffs 'cvs -z3 diff -u', since they are the most readable, (and sensible) types of diff to read and apply.
It tends to be a good idea to work out how best to implement your fix, and/or discuss it with a developer or two beforehand. Some of the best ways to do this are to post to dev@openoffice.org or lurk on IRC at irc.freenode.net on the #OpenOffice.org channel. IRC is an awfully poor communication medium, but better than no communication. See here to unwind who is whom.
See here for more information on our patching infrastructure.
See here for a sane / hackers interface to OpenOffice's IssueZilla.
Since we can often extract the owner of a module by checking for the ADMIN_FILE_OWNER tag; there is a little tool in ooo-build: bin/owner <file-name> that helps you find out who to E-mail / interact with about a given module; it's worth assigning very specifically located bugs to that person.
This is the process for getting CVS accounts for the up-stream CVS server, ooo-build accounts are handled differently. To see how the issue raising process works see eg. issue #7270. Having got the account setup, you need to tunnel to the secure CVS server something like:
ssh -f -2 -P -L 2401:localhost:2401 tunnel@openoffice.org sleep 1400 < /dev/null > /dev/null
Then you need to change your CVSROOT to point at your local machine, since this is the endpoint of the tunnel:
:pserver:mmeeks@localhost:/cvs
Your account name and password - will be the same as you use for filing bugs etc. in the SourceCast system. Login, and ... you'll soon notice that you'll need to migrate your CVS settings to the new server, to do this without wasting B/W with duplicate checkouts do:
bin/re-root /path/to/checkout ":pserver:<account-name-here>@localhost:/cvs"
Of course, to commit anything, you'll need various project privileges - and to battle the bureaucracy.
Patch/diff are wonderful tools, however people often provide data that confuses them in a messy and difficult to un-tangle sort of a way. Here are some hints on untangling the mess:
Before committing a patch to ooo-build, test it with
make patch.apply in the top-level, NB. it really
pays to have 2 copies of the tree - 1 hacked, 1 pristine.
Just use dmake clean in the build/src680 directory.
Or for a more destructive version in ooo-build try rm -Rf build.
In order to make efficient use of bandwidth, generate sensible diffs by default, and follow the trend, you need this in your ~/.cvsrc.
cvs -z3 -q
diff -upN
update -dP
checkout -P
status -v
Adding header files to the OO.o build is notoriously clunky. To
add header files under external/, make sure you list them in
external/prj/d.lst so that they get copied under the
solver/680/unxlngi4.pro/inc/external directory when building.
Often there is some GUI element used near the thing you're trying to locate / fix. So, find some sufficiently unusual string and search for it in LXR's text search; this should reveal an identifier related to that string; eg. SID_AUTOFORMAT, or FN_NUM_BULLET_ON. Having obtained that, do a new text search for that string, and you'll find the usage [ or a chained define to something else ]. For eg. menus/toolbar buttons the functionality is usually in a case statement eg. case SID_AUTOFORMAT: ...
This is slightly more complex build wise than you might expect.
This should result in your type information being built into types.rdb & installed. This is however only part of the mix: the module 'offuh' builds & installs the .hdl/.hpp files we need (for C++), so if 'wherever' is a new path we need to update offuh/prj/d.lst to install those files too.
Finally, check that the types.rdb in the install set has your
types; a regview types.rdb / | grep 'whatever' -i
would work well for that. If not, copy it in from the solver.
While much of the initial openoffice.org structure seems not to be orientated towards hackers, there is much useful documentation if you dig for it.
For OO.o news, and a distinctive perspective on OO.o see ooodocs.org.
Other related pages are: OOExtras provides extra templates, macros, and clip art (curiously licensed under the LGPL). Quickstart applet for GNOME (and KDE). Dictionaries & Docs from Kevin Hendricks.
And an interesting portal.
While productising various releases of OpenOffice, different projects have come up with (quite huge) patch sets against OO.o. These have mostly been folded back into 2.0 but, there are still a few outstanding. The separate packaging efforts can be found here:
So no-one ever asked me these, I just made them up to astro-turf a bit (safer, wipe-clean, more durable questions).
By consulting various oracles, entrails etc. it transpires that
in theory this number once incremented weekly, there being weekly
freezes and hence solvers, development environments.
The 'mws' stands for 'Master Workspace'. The latest 2.0 development
is done with the SRC680 stem; with auto-incrementing milestones; hence
tags like SRC680_m66 would be common.
Essentially it seems there are a lot of XML files involved in component registration, and various other services. Also, the person who designed the XML files fell in love with trendy XML-things and used not-very-standard, very complicated bits gratuitously. It turns out that using Java is the best/only way to get this manipulation done. Also, Java can be used nicely at run-time if it's on the machine.
But from tag SRC680_m44 onwards there is an alternative python script included to address the issue of processing these XML files used for registration, so it should be possible to build versions after that date without java, though your mileage may vary as the default build is with java.
This is rather inscrutable; some particularly curious brokenness
would be the way piping commands on stdin is crucially different
to inputting them from the tty thus:
echo 'echo #define DLL_NAME "libsch641li.so" >./foo.hxx' | /bin/tcsh -s
fails to do anything whereas typing the same thing into the shell
works just fine. Even more oddly:
tcsh -fc 'echo #define DLL_NAME "libsch641li.so" >./foo.hxx'
does do the right thing. See also csh.
The simple answer is: you need to run relocate /path/to/new/build;
another more complex answer is:
Well, assuming you have re-configured things (LinuxIntelEnv.Set will
need paths tweaking too — and re-importing to your shell) — then it's most
likely down to the ubiquitous non-relative paths, coded in lots
of generated / built files, particularly '.dpc*' (dependency)
files. Try:
find -name '*.dpc*' -exec rm {} \;
The stlport does some really broken things, so you will also need to edit the 'stl_gcc.h' inside the solver/, and replace the two path instances there (see inc/stl/config/stl_gcc.h).
While of course it's possible that your user name is not registered; often this just means your ~/.cvspass got lost and/or that you haven't logged in. cvs login, and repeat the command.
Product — isn't it obvious ?
Today I found a photograph of it on my system, so I stuck it in here:
OOo does some very odd things with X resources, thus some conventional
screenshot apps fail to take accurate shots. ImageMagick's 'import'
however does a good job; use:
import foo.png from the console, or
sleep 2; import -window root foo.png instead.
NB. unless you want your world to look tiny, you need to turn on large
toolbar icons first.
The authors must be using a really strange editor. It thinks tab stops are on every fourth column. Of course, the files come out ugly in Unix editors which know that tabs are eight characters wide.
If you happen to use a Real Editor, we have some pink glasses to sell to you. Paste the contents of http://go-oo.org/emacs.el into your .emacs, or load it with a line like this: (load "/path/to/that/file.el"). Don't forget to adapt my-openoffice-path-regexp to your needs.
Henceforth emacs will use 4-column tabs for your OOo source files.
(And use C++-Mode for sdi-, hrc-,
and src-files.) Alternatively if you are sufficiently set in
your ways that you can't cope with investing these few seconds do:
M-x set-variable\ntab-width 4 & learn to love change.
Apparently if you use vi you can do: :set ts=4, and
good luck to you.
See the About ooo-build document.
If you have more hacking tips, corrections, a grip of correct spelling etc. please do mail me, at michael.meeks@novell.com.