The Allegro Hacker's Guide

This is a guide to some of the internal workings of Allegro, for people who are interested in hacking on it. This document is far from complete, and may not always be 100% accurate. Remember that when in doubt, the sources are always the definitive reference. Suggestions for what to include in this document will be very welcome: there is far too much code for me to go over it all in any kind of detail, so I want to concentrate on the things that people find most confusing...

Coding Style
Build Process
Header Files
Definitions
Unicode Support
Asm Routines
Other Stuff
How to contribute patches

Coding Style

I'm not going to be a fascist about this, but it does make life easier if all the code uses a consistent layout. If you are going to write and maintain more than one complete source file of your own, I think you are entitled to do that however you like, but for smaller contributions, I will probably reformat your code to fit in with my existing style. It will obviously save me time if you write it this way in the first place, hence this description:

Basic Allegro style: K&R, with 3 space indentation. On disk, though, tab stops are 8 spaces, so if for example a line was indented by 12 spaces, this would be saved out as either 12 space characters or 1 tab and 4 spaces, not as 4 tabs. If your editor can't handle the difference between 3 char internal and 8 char external tab stops, either get a better editor or use indent to clean up after yourself. The indent.pro file included with the Allegro distribution comes close to getting this layout right, but doesn't quite manage it, so some things still need to be cleaned up by hand.
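The tab arithmetic above can be sketched in a few lines of C. This is only an illustration, assuming 8 column tab stops on disk: the name sketch_split_indent and the SKETCH_TAB_STOP constant are made up for this example and are not part of Allegro.

```c
/* Hypothetical constant: the on-disk tab stop width assumed by this guide. */
#define SKETCH_TAB_STOP 8


/* sketch_split_indent:
 *  Splits an indentation width (in columns) into the number of tab
 *  characters and trailing spaces that would represent it on disk.
 */
void sketch_split_indent(int columns, int *tabs, int *spaces)
{
   *tabs = columns / SKETCH_TAB_STOP;     /* whole tab stops */
   *spaces = columns % SKETCH_TAB_STOP;   /* remainder, stored as spaces */
}
```

Four levels of 3 space indentation make 12 columns, which this splits into 1 tab and 4 spaces, matching the example in the text.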

Preprocessor defines and structure names are UPPER_CASE. Function and variable names are lower_case. MixedCaseNames are evil and should not be used. That silly m_pHungarian notation is _really_ evil and should not even be thought about.

All symbols should be declared as static unless that is absolutely not possible, in which case they should be prefixed with an underscore.

Functions look like this:

/* foobar:
 *  Description of what it does.
 */
void foobar(int foo, int bar)
{
   /* do some stuff */
}

Three blank lines between functions.

Conditionals look like:

   if (foo) {
      /* stuff */
   }
   else {
      /* stuff */
   }

The only time when something comes on the same line after a closing brace is at the end of a do/while loop, eg:

   do {
      /* stuff */
   } while (foo);

Case statements look like this:

   switch (foo) {

      case bar:
         /* stuff */
         break;

      default:
         /* stuff */
         break;
   }

Examples of where to put spaces:

   char *p;
   if (condition) { }
   for (x=0; x<10; x++) { }
   function(foo, bar);
   (BITMAP *)data[id].dat;

All sources should begin with the standard header:

/*         ______   ___    ___
 *        /\  _  \ /\_ \  /\_ \
 *        \ \ \L\ \\//\ \ \//\ \      __     __   _ __   ___
 *         \ \  __ \ \ \ \  \ \ \   /'__`\ /'_ `\/\`'__\/ __`\
 *          \ \ \/\ \ \_\ \_ \_\ \_/\  __//\ \L\ \ \ \//\ \L\ \
 *           \ \_\ \_\/\____\/\____\ \____\ \____ \ \_\\ \____/
 *            \/_/\/_/\/____/\/____/\/____/\/___L\ \/_/ \/___/
 *                                           /\____/
 *                                           \_/__/
 *
 *      Brief description of what this file does.
 *
 *      By Author.
 *
 *      Cool stuff added by Someone Else.
 *
 *      Stupid bug fixed by a Third Person.
 *
 *      See readme.txt for copyright information.
 */

Author credits should be added in chronological order, and email addresses should not be included: those can be found in the main credits file, and if they only exist in one place, it is easier to update them when people change address.

People only need to be listed in the source file header if they've made a significant contribution to it (one-line fixes don't count), but no matter how small their addition, they must be added to the docs/thanks._tx file. This is sorted alphabetically by name. If they are already in it, update the text to describe the new addition, otherwise make a new entry for the new contributor. Also, anything more than very tiny modifications should be added to the docs/changes._tx file, which grows from the top in reverse chronological order. This file should briefly describe both the nature of the modification and who did it.

Build Process

This is very different depending on whether you are using autoconf or a fixed makefile. For most platforms, though, the fixup script (eg. fixdjgpp.bat) will create a small makefile, which defines MAKEFILE_INC to the name of another file (eg. makefile.dj), and then includes makefile.all. This contains a lot of generic rules, and includes the file named in MAKEFILE_INC to provide additional platform-specific information. The actual source files are listed in makefile.lst.

There are three library targets: alleg (release), alld (debugging), and allp (profiling). Objects go in obj/compiler/version/, where version is one of alleg, alld, or allp. Libraries go in lib/compiler/. A few generated things (asmdef.inc, mmxtest.s, etc) go in the root of obj/compiler/. Dependencies are generated by "make depend", and go in obj/compiler/version/makefile.dep, which is included by makefile.all.

When you run "make clean", this only deletes harmless generated files like the objects. "make distclean" strips you right back to the original distribution, including getting rid of the test executables and the library itself. For the ultimate in personal hygiene, run "make veryclean", which will wipe absolutely all generated files. After doing this, you will have to run "make depend" before you can build the library, and also "fixdll.bat" if you are working on a Windows platform.

To pass long command lines to the MSVC and Watcom linkers, the program runner.exe is compiled using gcc, so make can pass it a decent number of arguments. This just saves the parameters into a temporary file, and then invokes the real command using that as an argument file.
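The argument file trick can be sketched roughly like this. It is not the runner source, just a minimal illustration: the function names are hypothetical, one argument per line is an arbitrary choice, and the "@file" syntax used to hand the file to the real command is an assumption about the target tool rather than something this guide specifies.

```c
#include <stdio.h>


/* sketch_write_arg_file:
 *  Hypothetical helper. Saves argv[first] onwards into the file at path,
 *  one argument per line. Returns zero on success, -1 on failure.
 */
int sketch_write_arg_file(const char *path, int argc, char *argv[], int first)
{
   FILE *f;
   int i;

   f = fopen(path, "w");
   if (!f)
      return -1;

   for (i=first; i<argc; i++)
      fprintf(f, "%s\n", argv[i]);

   if (fclose(f) != 0)
      return -1;

   return 0;
}



/* sketch_build_command:
 *  Hypothetical helper. Builds a short command line that refers to the
 *  argument file ("@" prefix assumed). Returns zero if it fitted in buf.
 */
int sketch_build_command(char *buf, size_t size, const char *cmd, const char *path)
{
   int n = snprintf(buf, size, "%s @%s", cmd, path);

   return ((n >= 0) && ((size_t)n < size)) ? 0 : -1;
}
```

The resulting short command is what would actually be executed, so the length limit of the shell no longer matters.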

All the makefiles currently use gcc for dependency generation, because this is easier than trying to get MSVC or Watcom to output the right info.

The symbol LIBRARY_VERSION, defined at the top of makefile.ver, is used for including a version number in things like the DLL filename.

Header Files

allegro.h lives in the include/ directory. It is only a placeholder which includes other headers which live in the include/allegro/ tree. The reason for this slightly odd approach is that allegro.h can include things like "allegro/keyboard.h", which will work both in-situ within the build directory, and if we copy allegro.h to the system include directory and the other headers into system_include/allegro/. This avoids cluttering the system directories with lots of our headers, while still allowing programs to just #include <allegro.h>, and also makes it possible for people to access keyboard stuff with #include <allegro/keyboard.h>.

base.h includes alconfig.h, which checks the current platform and includes a helper header for this compiler (aldjgpp.h, almsvc.h, alwatcom.h, etc). That helper header defines a bunch of macros describing the system, emulates whatever things are needed to make the code compile properly, and optionally defines ALLEGRO_EXTRA_HEADER and ALLEGRO_INTERNAL_HEADER if it is going to need any other platform-specific includes.

After including the platform header, the rest of alconfig.h defines a lot of generic helper macros to their default values, but only if the platform header hasn't already overridden these to something specific.
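The "default unless already overridden" pattern looks something like the following. All of the SKETCH_ names are invented for this example and do not appear in the real headers: the first part stands in for a platform header, the second for the generic tail of alconfig.h.

```c
/* --- stand-in for a platform helper header (hypothetical contents) --- */

#define SKETCH_PLATFORM_STR      "sketchos"
#define SKETCH_PATH_SEPARATOR    '\\'     /* this platform overrides it */


/* --- stand-in for the generic part of the config header --- */

#ifndef SKETCH_PATH_SEPARATOR
   #define SKETCH_PATH_SEPARATOR '/'      /* skipped: already defined */
#endif

#ifndef SKETCH_MAX_PATH
   #define SKETCH_MAX_PATH       512      /* used: platform left it unset */
#endif
```

Because every default sits inside an #ifndef, a new platform header only has to mention the things that differ from the generic case.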

Every module-specific header contains structure definitions and function prototypes. At the end of the file, it may include a header from the include/allegro/inline/ directory which defines related inline routines. If inline asm is supported, this can in turn include asm.inl, which imports routines from one of the compiler-specific files al386gcc.h, al386vc.h and al386wat.h; otherwise C versions are used instead. The header alinline.h is a placeholder which includes all the headers defining inline functions.

If ALLEGRO_EXTRA_HEADER is defined, allegro.h includes this at the very end. This is used to include one of the files aldos.h, alwin.h, etc, which define platform-specific things like ID values for the hardware drivers. Unlike the platform files included from the top of allegro.h, these are specific per-OS rather than per-compiler, so the same alwin.h can be used by both MSVC and MinGW32. They describe library functions that relate to this platform, while the earlier header described the basic language syntax.

aintern.h is like the internal.h in earlier Allegro versions, defining routines that are shared between multiple sources, but that we don't generally want user programs to see.

On platforms which have specific, non-portable API routines of their own, these should go in a special header in the root of the include directory, eg. winalleg.h. This can be included by user programs that want to access these routines, while making it very clear to them that by including this header, they are writing non-portable code.

Definitions

All header function prototypes should use the macro AL_FUNC(). Inline routines use the macro AL_INLINE(). Global variables use AL_VAR() or AL_ARRAY(). Global pointers to functions use AL_FUNCPTR(). Pointers to functions which are passed as parameters to other routines or stored in a structure typedef use AL_METHOD(). This may seem like something of an overkill, but it gives us a lot of flexibility to add DLL import/export specifiers, calling convention markers like __cdecl, and even to mangle symbol names on some compilers. If you forget to use these macros, your code won't work on some platforms.
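As a rough sketch of how these macros read in practice: the stand-in definitions below are the plainest possible expansions, assumed here only so the example compiles on its own (the real ones live in the headers and vary by platform), and every sketch_ or SKETCH_ name is hypothetical.

```c
#include <stddef.h>

/* Stand-in definitions (assumed): plain expansions with no DLL or
 * calling convention decoration.
 */
#ifndef AL_FUNC
   #define AL_FUNC(type, name, args)      type name args
#endif
#ifndef AL_VAR
   #define AL_VAR(type, name)             extern type name
#endif
#ifndef AL_ARRAY
   #define AL_ARRAY(type, name)           extern type name[]
#endif
#ifndef AL_FUNCPTR
   #define AL_FUNCPTR(type, name, args)   extern type (*name) args
#endif
#ifndef AL_METHOD
   #define AL_METHOD(type, name, args)    type (*name) args
#endif


/* What the header would contain. */
typedef struct SKETCH_DRIVER
{
   AL_METHOD(int, init, (void));          /* function pointer in a struct */
} SKETCH_DRIVER;

AL_FUNC(int, sketch_add, (int a, int b));
AL_VAR(int, sketch_counter);
AL_ARRAY(int, sketch_table);
AL_FUNCPTR(int, sketch_callback, (int value));


/* What the C source would contain: normal code, no macros. */
int sketch_counter = 0;
int sketch_table[] = { 1, 2, 3 };
int (*sketch_callback)(int value) = NULL;


/* sketch_add:
 *  Adds two numbers and counts how often it has been called.
 */
int sketch_add(int a, int b)
{
   sketch_counter++;
   return a + b;
}
```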

This only applies to header files, though: you can write normal code in the C sources.

The symbol ALLEGRO_SRC is defined while compiling library source files. If you want to inline a function in one of your sources, use the INLINE macro. To declare a zero-sized array in terminal position inside a structure, use the ZERO_SIZE_ARRAY(type, name) macro. To use 64 bit integers, declare a LONG_LONG variable (this won't be defined on all platforms). To do things with filenames, check the macros ALLEGRO_LFN, OTHER_PATH_SEPARATOR, and DEVICE_SEPARATOR. See the headers for details.
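Here is a small sketch of the zero-sized array idiom. The stand-in definition of ZERO_SIZE_ARRAY below expands to a C99 flexible array member, which is an assumption made so the example compiles alone: check the headers for what your compiler really gets. SKETCH_MESSAGE and sketch_create_message are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

#ifndef ZERO_SIZE_ARRAY
   #define ZERO_SIZE_ARRAY(type, name)  type name[]   /* assumed stand-in */
#endif


typedef struct SKETCH_MESSAGE
{
   int len;
   ZERO_SIZE_ARRAY(char, text);           /* must be the last member */
} SKETCH_MESSAGE;


/* sketch_create_message:
 *  Allocates the structure and its trailing text in a single block.
 *  Returns NULL if out of memory. The caller frees it with free().
 */
SKETCH_MESSAGE *sketch_create_message(const char *s)
{
   int len = (int)strlen(s);
   SKETCH_MESSAGE *msg = malloc(sizeof(SKETCH_MESSAGE) + len + 1);

   if (!msg)
      return NULL;

   msg->len = len;
   memcpy(msg->text, s, len + 1);         /* copies the terminator too */

   return msg;
}
```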

Unicode Support

Do not assume that strings are ASCII. They aren't. If you assume they are, your code might work for a while as long as people are only using it with UTF-8 data, but it will die horribly as soon as someone tries to run it with 16 bit Unicode strings, or Chinese GB-code, or some strange MIME format, etc. Whenever you see a char * being passed around, you must be aware that this will actually contain text in whatever format is currently selected, so you have to be damn careful when manipulating strings. Don't ever forget and call a regular libc routine on them!

Use the Unicode functions for all your text manipulation: see the docs for details. When allocating a scratch string on the stack, assume that each character will occupy at most four bytes: this will give you more than enough space for any of the current encoding schemes.

If you want to specify a constant string, use the function uconvert_ascii("my string", buf) to obtain a copy of "my string" in the current encoding format. If buf is NULL, this will use an internal static buffer, but the converted string will be overwritten by the next call to any format conversion routines, so you shouldn't pass it down into other library functions. Normally you should provide the conversion space yourself, allocating buf as a temporary object on the stack.

To convert the other way (eg. before passing an Allegro string to an OS routine that expects ASCII data), call uconvert_toascii(mystring, buf).
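The buffer handling for both directions can be sketched without the library. The two converters below are hypothetical stand-ins, not the Allegro routines: they pretend that the current encoding is 16 bit little-endian, purely to show a caller supplying its own scratch space sized at four bytes per character.

```c
#include <stddef.h>

#define SKETCH_MAX_CHAR_BYTES 4   /* per-character upper bound from the text */


/* sketch_from_ascii:
 *  Hypothetical stand-in. Writes the ASCII string s into buf as 16 bit
 *  little-endian units plus a 16 bit terminator, truncating to fit in
 *  size bytes. Returns buf.
 */
char *sketch_from_ascii(const char *s, char *buf, size_t size)
{
   size_t pos = 0;

   while ((*s) && (pos + 4 <= size)) {    /* room for char + terminator */
      buf[pos++] = *s++;
      buf[pos++] = 0;
   }

   if (pos + 2 <= size) {
      buf[pos++] = 0;
      buf[pos] = 0;
   }

   return buf;
}



/* sketch_to_ascii:
 *  Hypothetical stand-in for the reverse direction. Units that do not
 *  fit in one byte are replaced by '?'. Returns buf.
 */
char *sketch_to_ascii(const char *s, char *buf, size_t size)
{
   size_t pos = 0;

   while ((s[0] || s[1]) && (pos + 1 < size)) {
      buf[pos++] = s[1] ? '?' : s[0];
      s += 2;
   }

   if (size > 0)
      buf[pos] = 0;

   return buf;
}
```

A caller would declare something like char tmp[10 * SKETCH_MAX_CHAR_BYTES] on the stack for a ten byte ASCII string (terminator included) and pass it as buf, rather than relying on any shared static buffer.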

For any messages that may be seen by the user, you can call get_config_text("my ascii string") instead of uconvert_ascii(). This will return a pointer to persistent memory (so it is ok to keep the string around indefinitely), after converting into the current text encoding format. This function is cool because it saves you having to bother allocating space for the converted data, and because it allows the string to be replaced by the translations in language.dat. You should be sure to always pass a constant string to get_config_text(), rather than any generated text or data from other string variables: this is so that the findtext.sh script can easily locate all the strings that need to be translated.

Hardware drivers should initialise their name and desc fields to the global empty_string, and store an ASCII driver name in their ascii_name field. The framework code will automatically translate and convert this value, storing the result in both the name and desc fields. For most drivers this will be enough, but if you want to provide a more detailed description, it is up to your driver to set this up from its init routine, and take care of all the required conversions.

Asm Routines

Structure offsets are defined in asmdef.inc, which is generated by asmdef.c. This allows the asm code to use human readable names for the structure members, and to automatically adjust whenever new fields are added, so it will always exactly match the layout of the C structures.
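The idea behind the generator can be sketched with offsetof(). This is not the real asmdef.c: SKETCH_BITMAP is a made-up structure and the SKETCH_BMP_ names are invented, but the principle of letting the C compiler work out the offsets and printing them as defines is the same.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical structure standing in for one of the library's C structs. */
typedef struct SKETCH_BITMAP
{
   int w, h;
   int clip;
   void *dat;
} SKETCH_BITMAP;


/* sketch_write_offsets:
 *  Writes one define per structure member to f, giving its byte offset
 *  as computed by the compiler that builds the C code.
 */
void sketch_write_offsets(FILE *f)
{
   fprintf(f, "#define SKETCH_BMP_W     %d\n", (int)offsetof(SKETCH_BITMAP, w));
   fprintf(f, "#define SKETCH_BMP_H     %d\n", (int)offsetof(SKETCH_BITMAP, h));
   fprintf(f, "#define SKETCH_BMP_CLIP  %d\n", (int)offsetof(SKETCH_BITMAP, clip));
   fprintf(f, "#define SKETCH_BMP_DAT   %d\n", (int)offsetof(SKETCH_BITMAP, dat));
}
```

If a field is added to the structure, rerunning the generator updates every offset, so the asm code never needs hand-edited numbers.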

Asm code should use the macro FUNC(name) to declare the start of a routine, and GLOBL(name) whenever it wants to refer to an external symbol (eg. a C variable or function). This is to handle name mangling in a portable way (COFF requires an underscore prefix, ELF does not).

You can modify %ds and %es from asm, as long as you put them back. If USE_FS and FSEG are defined, you can also change %fs, otherwise this is not required and you can safely use nearptr access for everything.

Don't assume that the MMX opcodes will be supported: not every assembler version knows about these. Check the ALLEGRO_MMX macro, and be sure to give up gracefully if these instructions are not available.

Other Stuff

Any portable routines that run inside a timer handler or input callback must be sure to lock all the code and data that they touch. This is done by placing an END_OF_FUNCTION(x) or END_OF_STATIC_FUNCTION(x) after each function definition (this is not required if you declare the function as INLINE, though), and then calling LOCK_FUNCTION() somewhere in your init code. Use LOCK_VARIABLE() to lock global variables, and LOCK_DATA() to lock allocated memory.
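The locking pattern looks roughly like this. To keep the sketch compilable without Allegro, the macros are given empty stand-in definitions, which is an assumption for this example only (the real definitions come from the library headers). The sketch_ names and the sample buffer are hypothetical.

```c
#include <stdlib.h>

/* Stand-ins (assumed): here the locking macros expand to nothing. */
#ifndef END_OF_FUNCTION
   #define END_OF_FUNCTION(x)
   #define LOCK_FUNCTION(x)
   #define LOCK_VARIABLE(x)
   #define LOCK_DATA(d, s)
#endif

#define SKETCH_NUM_SAMPLES 16

volatile int sketch_ticks = 0;
int *sketch_samples = NULL;


/* sketch_timer_handler:
 *  Would run from a timer callback, so everything it touches is locked.
 */
void sketch_timer_handler(void)
{
   sketch_samples[sketch_ticks % SKETCH_NUM_SAMPLES] = sketch_ticks;
   sketch_ticks++;
}

END_OF_FUNCTION(sketch_timer_handler)



/* sketch_init:
 *  Allocates the sample buffer and locks the handler, the globals it
 *  uses, and the allocated memory. Returns zero on success.
 */
int sketch_init(void)
{
   sketch_samples = malloc(SKETCH_NUM_SAMPLES * sizeof(int));
   if (!sketch_samples)
      return -1;

   LOCK_FUNCTION(sketch_timer_handler);
   LOCK_VARIABLE(sketch_ticks);
   LOCK_VARIABLE(sketch_samples);
   LOCK_DATA(sketch_samples, SKETCH_NUM_SAMPLES * sizeof(int));

   return 0;
}
```

Note that both the pointer variable and the block it points to get locked: LOCK_VARIABLE() covers the former, LOCK_DATA() the latter.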

Any modules that have cleanup code should register their exit function by calling _add_exit_func(). This will ensure that they are closed down gracefully no matter whether the user calls allegro_exit(), falls off the bottom of main(), or the program dies suddenly due to a runtime error. You must call _remove_exit_func() from inside your shutdown routine, or you will find yourself stuck in an endless loop.
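To see why forgetting the removal call hangs, here is a self-contained sketch of an exit function list. It is not the library code: all the sketch_ names are hypothetical, and the shutdown loop (keep calling the head of the list until the list is empty) is an assumption inferred from the endless loop warning above.

```c
#include <stdlib.h>

typedef struct SKETCH_EXIT_FUNC
{
   void (*funcptr)(void);
   struct SKETCH_EXIT_FUNC *next;
} SKETCH_EXIT_FUNC;

static SKETCH_EXIT_FUNC *sketch_exit_list = NULL;
static int sketch_shutdown_count = 0;


/* sketch_add_exit_func:
 *  Pushes a cleanup routine onto the front of the list.
 */
void sketch_add_exit_func(void (*func)(void))
{
   SKETCH_EXIT_FUNC *n = malloc(sizeof(SKETCH_EXIT_FUNC));

   if (!n)
      return;

   n->funcptr = func;
   n->next = sketch_exit_list;
   sketch_exit_list = n;
}



/* sketch_remove_exit_func:
 *  Unlinks a cleanup routine from the list, if it is present.
 */
void sketch_remove_exit_func(void (*func)(void))
{
   SKETCH_EXIT_FUNC *iter = sketch_exit_list, *prev = NULL;

   while (iter) {
      if (iter->funcptr == func) {
         if (prev)
            prev->next = iter->next;
         else
            sketch_exit_list = iter->next;
         free(iter);
         return;
      }
      prev = iter;
      iter = iter->next;
   }
}



/* sketch_run_exit_funcs:
 *  Calls the head of the list until the list is empty. A routine that
 *  never removes itself would be called forever.
 */
void sketch_run_exit_funcs(void)
{
   while (sketch_exit_list)
      (*(sketch_exit_list->funcptr))();
}



/* sketch_module_exit:
 *  Example shutdown routine: does its cleanup, then unregisters itself.
 */
void sketch_module_exit(void)
{
   sketch_shutdown_count++;
   sketch_remove_exit_func(sketch_module_exit);
}
```

A module would call sketch_add_exit_func(sketch_module_exit) from its init code. Because the shutdown routine unregisters itself, it is also safe for the user to call it directly before the program exits.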

How to contribute patches

Once you are willing to contribute that beautiful hack which does what everybody has been waiting for, the fix for that hideous bug which has been driving you mad for several nights, the nice improved documentation you would have liked to read in the manual for the first time, etc, you have already done the hardest part. Now you only need a way to let the Allegro developers merge your changes in the main distribution.

You could probably send your patch to one of the people working on Allegro, but this is not very safe, because it depends on the person you chose being available and willing to do the work for you. The best thing you can do is to send your patch to the Allegro Developers mailing list. Read the readme.txt file for information on how to subscribe to this list. Alternatively, updated subscription instructions should always be available at http://alleg.sourceforge.net/maillist.en.

Sending your patches to the mailing list instead of a single person is good, because all the subscribed developers can take a look at your modifications, suggest improvements, or find problems, which you can discuss on the same mailing list, letting other developers join the conversation when they consider it appropriate. If the modifications are good, they will probably be accepted and merged into the WIP version for the next release. If you aren't lucky, or your patch still needs some work, you will be told why it's not accepted, or what you have to do to improve it. If you aren't subscribed to the list, remember to say so in your email.

If you have obtained Allegro from an existing release, stable or unstable, you will have all the source code contained in some archive format. You will need it, because to create a patch you need two versions of each modified file: the original version, and your modified version. You will also need the diff tool, which is used to create the patches. Most GNU/Linux distributions provide this tool as a standalone package of the same name. For DOS, you can get a port from http://www.delorie.com/djgpp/. Just choose a mirror from http://www.delorie.com/djgpp/getting.html, enter the v2gnu directory and download the difxxb.zip package. While you are at it, you can also get a tool named patch (patxxb.zip), which is used to apply patches generated by diff, in case you have to apply the patches somebody else sends to you. Install the binaries in some directory of your path, so that you can use them from anywhere.

If you are planning to modify only one file, you will usually copy this file to the same name in the same directory with the appended extension '.old' before starting to work on it. After you have made your modifications to the file, and verified that they please you, go to the directory containing the modified and original files and type at the prompt:

   diff -u file.c.old file.c > patch

This command will generate a text file which contains the differences between both files in unified output format. Open it with your preferred editor and verify that it contains the modifications you wanted to make: lines you have added will be marked with a plus sign '+', lines you have removed will be marked with a minus sign '-'. If the file is bigger than a few kilobytes, compress it before sending to the developers mailing list, and of course remember to add an explanation of what the patch is meant to do, why it's needed, and any other information you consider relevant.

If the modifications you want to make are scattered through several files and/or directories, this form of patch generation is very tiresome for both ends (you, and the developers). So unpack a fresh copy of the Allegro source somewhere and move it to the parent directory where your current version is, after giving it another name of course, so as to obtain two complete source trees side by side. Modify the files you wish in your working directory. Once you are finished, go back to the parent directory housing the two source trees and type:

   diff -ur fresh_original_directory working_directory > patch

The '-r' switch makes diff compare directories recursively. Again, do the previous steps of verifying your patch, compressing and sending with correct instructions. If your patch adds or removes files, you will have to add the '-N' switch, because by default diff will ignore files which are only in one of the trees. Of course, you might want to run a 'make clean' in your working directory before running this command, or you will include lots of generated files which have nothing to do with your patch. Or you could edit the resulting patch, but that can be error prone.

If you are working with the cvs version of Allegro which you can get from Sourceforge (http://sourceforge.net/projects/alleg/), you won't need to copy any files at all. Just modify the files you want, go to the root directory of the cvs copy and type:

   cvs diff -u > patch

Unlike the standalone diff, the cvs diff command will work recursively through the Allegro source tree, comparing each file against the Sourceforge repository. The patch will have slightly different headers, but that's ok: once you have it, follow the previous process to send it to the developers mailing list. Of course, check the cvs manual for more information and options.