A benefit of an integrated programmable build system

Macoy Madson

This article is mirrored on my blog.

I recently released File Helper, a file organization application I wrote using Cakelisp.

This application had only two external files that were necessary for it to fully function:

  • A font
  • An application icon

I packaged File Helper in a .zip or .tar.gz for Windows or Linux respectively. These archives contain the platform executable as well as a license file and the two necessary font and icon files.

However, wouldn't it be nice if instead I shipped a single executable, thereby eliminating the extract step?

It might sound trivial, but eliminating that extra step has many benefits:

  • Less technical users won't get confused. Double-clicking an archive usually opens it in a browser rather than extracting it, which might confuse them and cause them to not use my product.
  • The application has no risk of breaking if the executable is moved.
  • The user doesn't have to delete or move the archive after they extract it.

Bundling files into executables

An executable is just a file format which your operating system understands. It is essentially a header and a whole bunch of sections filled with binary data.

Typically, a linker converts a collection of object files into a single executable. Because executables are containers which can hold various kinds of data, we can package data only our application understands in the same container as the application code.

The operating system is fine with this because it only needs to map the executable into memory and start executing code at a designated entry point. It is then up to the program to decide how to interpret the various executable sections.

Platform differences

There are many different file formats for executables. Usually, an operating system only supports one executable file format. On Windows, it's the Win32 Portable Executable format, typically with extension .exe. On Linux, it's usually ELF.

I am only targeting those two platforms, so I can add code to specifically support those formats when building Cakelisp programs.

On Windows, data is added to executables via Resource Files. I wrote a tutorial on how to do this.

On Linux, data can be added by dumping it to an object file which defines a couple of symbols marking where the data starts and ends. This is a great tutorial on how to do that.

Good and bad ways

Like everything in programming, you'll hear different advice on how to bundle data.

The most common alternative method is to convert your data to a C-style array definition. This has many limitations, and in my opinion should be avoided:

  • Some compilers (MSVC included) limit the number of elements in an array, which therefore limits the size of the bundled data.
  • Your compiler has to do extra processing (tokenization, parsing, etc.) on data which it should just treat as a giant binary blob. Extra unnecessary processing means longer build times.
  • An extra stage has to be created and compiled as part of your build system, which adds complexity.
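For reference, the array method criticized above usually means generating source text like the output of the sketch below. This is only an illustration of that alternative, not something File Helper or Cakelisp uses; `write_c_array` is a hypothetical helper name, and it assumes at least one byte of input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes `size` bytes of `data` to `out` as a C array
 * definition named `name`, 12 bytes per line. Returns the number of array
 * elements written. Assumes size > 0, because an empty initializer list is
 * not valid in older C standards. */
size_t write_c_array(FILE *out, const char *name,
                     const unsigned char *data, size_t size)
{
    assert(out != NULL && data != NULL && size > 0);
    assert(name != NULL && strlen(name) > 0);

    fprintf(out, "const unsigned char %s[] = {\n", name);
    for (size_t i = 0; i < size; ++i) {
        /* Every input byte becomes about six characters of source text
         * which the compiler then has to tokenize and parse again. */
        fprintf(out, "%s0x%02x,", (i % 12 == 0) ? "    " : " ",
                (unsigned)data[i]);
        if (i % 12 == 11 || i + 1 == size)
            fputc('\n', out);
    }
    fprintf(out, "};\n");
    return size;
}
```

A multi-megabyte font turns into several times that much source text, which is where the extra compile time and the array-size limits come from.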

We are going to proceed with the platform-dependent but much more robust approach, which is to convert our data to object files without using a C/C++ compiler.

Integrated build system

Whether we are on Windows or Linux, we need to process our data file into some other form in order for the linker to properly understand the data package. This means adding a step to our build to process the data, because we want it to automatically stay up-to-date when linked in the executable.

Cakelisp includes a simple C/C++ build system as well as compile-time code execution. We need to create a new build step to process our binary data into object files. In order to do that, we use a compile-time build hook to execute a function which performs the conversion.

The full code is here.

The end-user interface is simply:

(import "DataBundle.cake")
(bundle-file data-start data-end (const char)
             "../data/MyFont.ttf")

We declare data-start and data-end to represent pointers to the symbols associated with our data.
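To make the pointer pair concrete: on the C side, a bundled file can be thought of as two addresses, with the size being their difference. The sketch below is an assumption about how such symbols are typically consumed, not Cakelisp's actual generated code. `BundledFile`, `bundled_file_size`, and the stand-in array are hypothetical names; in a real build the linker would supply the two addresses.

```c
#include <assert.h>
#include <stddef.h>

/* A bundled file as seen from C: a pointer to its first byte and a pointer
 * one past its last byte. */
typedef struct BundledFile {
    const char *start;
    const char *end;
} BundledFile;

/* Number of bytes between the two symbols. The data is raw bytes, so it is
 * not null-terminated and must always be handled together with its size. */
size_t bundled_file_size(BundledFile file)
{
    assert(file.start != NULL && file.end >= file.start);
    return (size_t)(file.end - file.start);
}

/* Stand-in for linked data so the sketch compiles on its own. In a real
 * build these addresses would come from the generated object file. */
static const char stand_in_font[] = {'O', 'T', 'T', 'O', 0, 1, 2, 3};

BundledFile example_font(void)
{
    BundledFile font = {stand_in_font,
                        stand_in_font + sizeof(stand_in_font)};
    return font;
}
```

The start pointer and the computed size can then be handed to any API that loads a resource from memory rather than from a file path.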

That bundle-file invocation is a macro that adds the data file to a list. It also generates the variables we can use to refer to the data.

Finally, a compile-time function convert-all-bundle-files calls the necessary objcopy (or Resource Compiler on Windows[^1]) to generate the actual object file for each bundle-file. It only does this if the data files have changed or the object files don't already exist in the cache.
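The "only if changed or missing" check can be sketched in C as below. This is a minimal illustration that assumes modification times are used to detect changes; the article does not say how Cakelisp's cache actually decides, and `output_is_stale` and `bundle_needs_conversion` are hypothetical names. It also assumes a platform that provides `stat` via `<sys/stat.h>`.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

/* Pure decision: convert when there is no cached object, or when the data
 * file was modified after the cached object was written. */
bool output_is_stale(bool output_exists, time_t source_modified,
                     time_t output_modified)
{
    if (!output_exists)
        return true;
    return source_modified > output_modified;
}

/* Looks up both files on disk and applies the decision above. */
bool bundle_needs_conversion(const char *data_path, const char *object_path)
{
    struct stat data_info;
    struct stat object_info;
    assert(data_path != NULL && object_path != NULL);

    /* If the data file cannot be read, report "needs conversion" so the
     * conversion step runs and surfaces the real error. */
    if (stat(data_path, &data_info) != 0)
        return true;

    if (stat(object_path, &object_info) != 0)
        return output_is_stale(false, data_info.st_mtime, 0);

    return output_is_stale(true, data_info.st_mtime, object_info.st_mtime);
}
```

Skipping the conversion when nothing changed keeps incremental builds fast even when the bundled files are large.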

The function also adds each generated object file to the linker command line, so the objects are linked into the executable alongside our code object files.

This function is integrated into the Cakelisp build sequence like so:

(add-compile-time-hook-module pre-build convert-all-bundle-files)

Conclusion

This is pretty great: we extended our build system to support bundling arbitrary data files, all without touching Cakelisp's internals.

Not only that, we extended the system in the same language we write our application code in, and within the same invocation; we didn't need to create some other phase. We were also able to provide the user with an extremely simple interface to bundling files.

[^1]: On Windows, we need to generate a .rc file with a list of all the resources that should be compiled into a single object file. Because Cakelisp allows arbitrary compile-time code execution, we can easily do this by writing the filenames out to the generated rc, then invoking the Resource Compiler on that file. This platform-specific step can be completely automated!
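The .rc generation described in the footnote could look roughly like the C sketch below. It is not the code from DataBundle.cake; `BundleEntry` and `write_resource_script` are hypothetical names. It relies on the Resource Compiler's `name RCDATA "file"` statement form, and it assumes paths contain no double quotes or backslashes, which would need escaping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One bundled file: the resource name the program looks up at runtime and
 * the path of the data file on disk. */
typedef struct BundleEntry {
    const char *resource_name;
    const char *file_path;
} BundleEntry;

/* Writes one RCDATA statement per entry to `out`. Returns the number of
 * entries written, or -1 if writing failed. */
int write_resource_script(FILE *out, const BundleEntry *entries, size_t count)
{
    assert(out != NULL && (entries != NULL || count == 0));

    for (size_t i = 0; i < count; ++i) {
        /* Assumption stated above: no characters that need escaping. */
        assert(strchr(entries[i].file_path, '"') == NULL);
        assert(strchr(entries[i].file_path, '\\') == NULL);

        if (fprintf(out, "%s RCDATA \"%s\"\n", entries[i].resource_name,
                    entries[i].file_path) < 0)
            return -1;
    }
    return (int)count;
}
```

Running the Resource Compiler on the generated file then yields a single object containing every listed resource.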

I tried the trial version of File Helper; here is some feedback.

I don't know if the trial is up to date, because it has a data folder, which is what you said you changed in the blog post.

  • My first impression while trying to get the application was a bit off, because I perceived the web page as pushing the paid version too much. It comes down to basic things like putting the paid link before the trial link and labeling it as "get" instead of "buy". I would also prefer a "support" link instead, as, as far as I know, there are no differences between the trial and the "paid" versions. I want to be clear that I'm not against you selling the application, just a bit put off by the way it's presented.

  • The order of the links is important to me because I don't want to buy something without knowing what it does, and the description is missing an important detail in my opinion. The detail is "The application generates one text file per category, containing the file paths in that category". Without that it's not clear at all how to use the application or how it could interact with other software. By itself the application doesn't do anything useful.

  • Note on the website about free/freedom. I (and I suspect most people) consider "free" to mean "no money" in this context, so to say "freedom", I would use the word freedom.

  • I also think that the application (and web page) don't look good. That might not seem important, but having things look good gives me the idea that you know what you're doing and puts me in a better disposition to buy something. The reason I think the application looks bad is the theme (which I now believe is auto-generated based on the wallpaper): the choice of colors is very bad. That is obviously partially because of my taste, but I also think (no data to support that) that most people wouldn't like the colors in those screenshots. The UI also seems a bit cluttered (I can't be more specific, it's a feeling).

  • The copying and license files should have a .txt extension (at least on Windows) to make them accessible to "less technical users".

  • The first thing I did after starting the application was maximizing it. It had the issue that the panels weren't docked and so I had to manually do the layout myself, which isn't great as I don't know what I want in a new application.

  • The first thing I noticed after that was that the theme colors were bad. I understand now that the theme was generated based on my wallpaper (default Windows 10 wallpaper), but it didn't look good and didn't contain enough contrast to distinguish panels when they overlap. In my opinion it's never a good idea to generate a theme like that, as you might lose important information conveyed by colors. One example here was that the imgui panel border was the same color (or close) as the panel background, so when panels overlapped there was no distinction between the two panels.

  • After searching the menu I found an imgui style editor, where I was able to select the imgui dark theme. But when restarting the application it resets to what I imagine is the default theme, which I don't think looks good.

  • Not being able to close the "Free trial" panel is annoying. You can put it in a tab and forget about it, so it's just user "harassment" for no reason.

  • The directory browser "path" part doesn't look good. It's not instantly clear what it expresses (the path) because it just looks like a list of buttons with no relation between them. This is also because by default it's using the application path, which is a path less "familiar" to me than "C:". On the next run of the program I would like it to either restart from the previous session's directory or from a user-defined directory.

  • I would like "Parent directory", "Refresh", "Explorer here" buttons to be on the "path line".

  • I would like to be able to hide the filter textfield if I don't need it.

  • I would like to be able to hide the categories above the file list. Also it's not clear that clicking one of those buttons sets it as the last used category.

  • There are unnecessary borders around the file list (particularly at the top, but also left right and bottom).

  • I would like to have a textbox where I can just type the path I want to go to.

  • Clicking the "drive letter" dropdown doesn't work unless I hide the tab bar. If the tab bar is visible, clicking the dropdown acts as if I clicked on the tab bar.

  • When the filter textfield has the focus, pressing "Enter/Return" still enters directories in the list (I'm expecting a textfield to "capture" keyboard inputs).

  • Clicking an item in the list leaves the blinking cursor in the filter textfield.

  • If I have an optical drive selected (DVD, Blu-ray...) with no disc in it and press "Enter/Return", it appends "/??<?" to the path.

  • Non-ASCII characters aren't displayed properly.

  • I would like to be able to resize and reorder columns in the file list view. Also I think the category an item has isn't visible enough.

  • The mouse "line hover" rectangle is smaller than the selected line rectangle which doesn't look good.

  • In the dropdown to assign a category to the selected item, I can't click on the colored hexagon to select the category.

  • I'm not sure but I would suspect that assigning multiple categories to a single item would be useful.

  • The file names text is lined up to the left of the line rectangles without spacing which doesn't look good.

  • The "Use*" button could be renamed to "Set xxx" where xxx is the name of the category. The tooltip when hovering "Use*" wasn't directly clear to me. At the beginning I thought that pressing * on the keyboard would set the last used category on an item.

  • There doesn't seem to be a way to select multiple files to assign the category at the same time.

  • There is a gap between lines when trying to select them using the mouse which is really annoying.

  • I don't like the wrapping when at the bottom or top of the list and pressing up or down.

  • I had instances of trying to use shortcuts in the list (ctrl + home, expecting that to select the first line) that were selecting a button in the UI (the back button). I can't reproduce it now, so I assume I had another gui element selected somehow.

  • It seems that the application tries to keep the 6th line in view, which is weird. When I use down to move in the list, arrive on the 6th line and try to move down, the 7th line is selected, then on the next frame the 7th line moves to the spot where the 6th line was. First, I don't want that behavior. I like to keep some lines in view, but the 6th line when there are 22 lines doesn't seem like the correct value. Second, the fact that it happens over two frames doesn't look or feel good.

  • In the category panel, hovering the name of the category doesn't highlight the line and clicking it doesn't expand it. I need to click on the arrow or hexagon which is harder for no reason.

  • When adding a new category I would like the UI to expand the line and give focus to the name textfield to make it easier to edit.

  • I would like to be able to delete the categories without expanding the line (an X at the end of the line for example).

  • Sometimes the dropdown to set the hotkey for a category displays the names next to the hotkeys, but not always.

  • The default colors for the tree map aren't good. Especially black for the default color, makes it hard to distinguish small files.

  • The "export all to text" button isn't placed at a good location. I would put it as a direct button in the menu bar for quick access (not a menu item, a button next to the menu bar). But it should also get a menu entry. It being in the category panel makes sense in terms of concept, but not for usability (it's the thing that allows you to get some output from the program).

  • I would like the category panel to allow me to set the export path and file name per category.

  • Non-ASCII characters produce a file with a wrong file name (not the correct character), e.g. à produces Ã.

  • A way to execute a command directly with the generated file(s) or content of the files would probably be useful.

Hope it helps.