Macoy Madson

This post is mirrored on my blog.

I have been relatively busy lately due to unpacking all my things from my cross-country move. Some of my hobbies have been receiving more time. However, don't worry, Cakelisp and GameLib development still continues.

For this month, I'm going to write a short article about an approach I take to difficult programming tasks. I call it "surgical programming".

The key difference between junior and mid-level software engineers

In my career, I think one of the most important skills I had to develop was learning how to read code.

My less-experienced self would frequently reach out for documentation, or only read function signatures. I would call functions other people had written without reading them myself to confirm they did what they claimed.

Now, I rarely ever read online documentation. The code is the ground truth, and additional context can be gleaned from reading the version control system logs for the file. I try to read many more functions in their entirety before using them. When I have a question about some functionality, I try to read the code before asking the original developers for help.

I think reading code is a skill that you can practice, but it definitely takes discipline. If you respect the programmer who wrote the code, it gets a bit easier.

John Carmack recommends[^1] stepping through code from main() to understand what's going on:

An exercise that I try to do every once in a while is to "step a frame" in the game, starting at some major point like common->Frame(), game->Frame(), or renderer->EndFrame(), and step into every function to try and walk the complete code coverage. This usually gets rather depressing long before you get to the end of the frame. Awareness of all the code that is actually executing is important, and it is too easy to have very large blocks of code that you just always skip over while debugging, even though they have performance and stability implications.

This is also a good way to force yourself to read the code---the instruction pointer acts as a virtual bookmark, and you can go statement-by-statement rather than having to find the best place to start in the myriad of files in a codebase.

Surgical programming

Some tasks require a large amount of code or a complex system to be comprehended before the correct modification can be discovered and implemented. Surgical programming is a way to systematically approach these tasks.

It relates to reading code because it essentially divides hard problems into two phases: a pre-op (reading) phase, and an operation (writing) phase.

Pre-op

The pre-operation is the first phase in approaching a difficult task. The goal of pre-op is to understand the system and the task, answer any questions you have, and define a clear implementation sequence for the operation.

Importantly, the pre-op puts you on the hook: you don't get to write code until you've read enough to complete the operation plan.

When I'm embarking on a difficult task or hairy investigation, I make notes[^2] under a "Pre-op" heading where I list everything I encounter while reading that is relevant to the operation.

I also think of things I don't know and add them as to-dos on the pre-op. I can't start writing code until I've read enough to have good answers to the to-dos. They can be questions like "how did they handle X?" or "what do I need to modify to get Y?". It also includes things like "what do designers mean by Z?" where I have to talk to concerned parties to gain more context and requirements.

It feels good to call it a pre-op because it's more cool sounding, and feels like you're still making progress and spending time wisely. It could also be called the "research phase", but in my opinion that sounds much more boring.

When trying to understand complex systems, you may need to insert logging, visualizations, or other instrumentation to help illustrate the system's behavior. This is appropriate to do in pre-op, and will likely help with future investigation, so it should be kept in the code (perhaps behind boolean toggles or #if clauses, if necessary).
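As a minimal sketch of keeping pre-op instrumentation in the code behind toggles, the C below combines a compile-time `#if` with a runtime boolean. The names `ENABLE_TRACE`, `g_trace_enabled`, and `count_open_nodes` are hypothetical stand-ins for whatever system you are investigating:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical compile-time switch: set to 0 to strip the tracing entirely. */
#ifndef ENABLE_TRACE
#define ENABLE_TRACE 1
#endif

/* Hypothetical runtime toggle, so tracing can be flipped from a debugger. */
static bool g_trace_enabled = false;

#if ENABLE_TRACE
#define TRACE(...)                        \
    do {                                  \
        if (g_trace_enabled)              \
            fprintf(stderr, __VA_ARGS__); \
    } while (0)
#else
#define TRACE(...) do { } while (0)
#endif

/* Stand-in for the system under investigation: counts the set flags. */
static int count_open_nodes(const bool *is_open, int num_nodes)
{
    int num_open = 0;
    for (int i = 0; i < num_nodes; ++i) {
        if (is_open[i]) {
            ++num_open;
            TRACE("node %d is open (running total %d)\n", i, num_open);
        }
    }
    return num_open;
}
```

With the runtime toggle off, the instrumentation costs one branch per trace. With `ENABLE_TRACE` set to 0, it disappears from the build entirely.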

The key with this technique is not starting "work" on the task until you are sure you know what to do. In my career I remember making false starts where I would write a bunch of code only to find the approach wouldn't work half-way through implementing it. In almost every case the problem was a lack of understanding of the existing code. The pre-op helps to reduce chances of false-starts, because you deliberately seek out your blind-spots in understanding a system and illuminate them.

Once all your questions are answered and you feel you have a good understanding of the situation, you can write out an operation plan. This is a step-by-step outline of the things you need to do in order to make the modification correctly. It is useful as you are reading code to take note of functions and whatnot that are going to be relevant to the operation. If you do this, you will be able to jump straight to the definitions, signatures, etc. that need to be modified.

Operation

Once you have answered all the questions in the pre-op and have an operation plan, you can proceed with the operation.

It feels good to write the code now, because you can just blaze through it. You're no longer "feeling around" while at the same time fighting compiler errors, which is what happens when your code is written with only a half-baked understanding of the system you're changing.

The key during the operation is to notice when you still trip up. Could that have been handled in the pre-op instead? Did you start the operation before you were ready? You can write these instances down and form a pre-op checklist for the next time, if you find yourself consistently forgetting them.

The final part of the operation is validation. You should step through code in a debugger the first time you run it, checking all of your assumptions and confirming the data is modified as you intend. This is a great way to cut down on iteration time, because you don't waste time getting your hopes up and skipping straight to testing. Off-by-one errors, inverted conditionals, and error-handling mistakes are usually very obvious when stepping through code, but difficult to spot when only testing.

Exploratory programming

There are some problems where it is necessary to make a few different attempts at implementations. This is sometimes called exploratory programming. The surgical approach would consider this style of development part of the pre-op, because it's about gaining more understanding before writing the eventually committed operation code.

These types of problems don't fit as well into the surgical method, which is okay. It's mainly important to recognize when you are flailing due to lack of understanding versus exploring in order to gain more insight. The goal of flailing is to complete the task[^3], whereas the goal of exploring is to learn new things about the system.

Debrief

Once you have completed a task, it can be beneficial to analyze various things at the meta level:

  • How long it took to complete the task
  • What things during pre-op made understanding the system difficult (software architecture, etc.)
  • Why the task was required. If it was due to a bug, why did the bug occur? Could that type of bug be automatically or systematically prevented?
  • How you can improve future operations

Conclusion

Surgical programming provides a deliberate structure for approaching complex problems. It becomes automatic after you've done it for a while, but I hope that having explicit pre-op and operation phases will make solving hard problems easier, especially for more junior programmers (or for anyone on very complex problems).

[^1]: John Carmack on Inlined Code (archive.org)

[^2]: I make all my notes in Org-mode, which is simply unmatched in its suitability for complex note-taking. In this case, the nesting and folding of headings helps manage the complexity. You can also insert direct links to files and lines of code which are relevant. I also copy-paste code snippets into my notes for easier reference.

[^3]: In The Pragmatic Programmer they call this Programming by Coincidence---if your flailing ends up working, it's only by chance, not by a deliberate, systematic approach.

Macoy Madson

This article is mirrored on my blog.

I started working on Cakelisp a little more than a year ago. I started the project after searching for a programming language that met all my requirements, and found nothing which matched.

I figure this is a good time to take stock and talk about what is in the future for Cakelisp.

Popularity

My goal was never to make a language that everyone wanted to use. I always approached Cakelisp with the goal of making the perfect language for me, without considering too much how other people would like it.

This gives me a whole lot of flexibility when making decisions. I can play more fast-and-loose and make big changes without worrying.

Besides, even if I did try to make a language for "everyone", I believe I would have failed to get much more attention.

As it stands, Cakelisp has over two hundred "stars" on GitHub. My original Hacker News thread got a few hundred upvotes, which is relatively good for Hacker News. I get emails maybe once every three months or so with questions about Cakelisp.

I think I'm still the only person in the world using the language. I think this is fine for several reasons:

  • I can make changes without worrying about breaking other people's projects.
  • I don't feel a lack of existing libraries. I have easy access to essentially any C or C++ project, which is a massive ecosystem. GameLib shows how easy it is to expose C/C++ libraries, and still with no bindings necessary!
  • Supporting a language used by many people quickly becomes a full-time job. By staying small, I can focus on my goals rather than spend too much time helping other people with their goals.

If you are interested in using Cakelisp, don't let this deter you. Because the language is so small, it means you can have a big impact on the direction of the language, and can make big changes yourself without much trouble. I think even though general applicability wasn't my goal, the language is still useful in a variety of cases, and is appealing not only to me.

Purpose

It is important for me to clarify that I never made Cakelisp for its own sake. I was running into real pain points with C++ on my personal projects, and made Cakelisp to alleviate that.

From that perspective, I consider Cakelisp a great success. There are several angles I see where the language succeeded.

I am more motivated to write in a language I made, because I can tailor it to my tastes. This is very eye-opening. At my job, I now realize just how much pain is caused by using C++, and I know how "easy" it would be to stop experiencing that pain. When you like your environment and have control over it, you are much more productive. Things don't feel arbitrary, and you are empowered to make changes at every level.

I no longer need to futz with 3rd-party build systems, which never seemed to work exactly how I wanted. Admittedly, I now futz with Cakelisp's build system[^1], but we all know we're much more forgiving of our own projects than other people's.

Projects

The language was never the end goal; making projects with the language was. I have successfully made several projects, including an Android game for my girlfriend, a file organization application for sale, an automatic color scheme generator, and various other prototypes.

GameLib provides me with easy access to a huge amount of awesome C and C++ libraries. Each time I add a new module to GameLib, all my future projects benefit from how easy it is to import. GameLib handles acquiring the 3rd party source and versioning, which was a noticeable point of friction for me that prevented me from making quick prototypes.

Plan

I have no intention of stopping using Cakelisp. It continues to provide value every time I start a new project.

In terms of major updates to the language, I don't have anything explicitly planned. I let my projects dictate what needs to get done on the language. I have about 2,600 lines of TODOs and various notes on Cakelisp in my personal notes, so there is no doubt work to be done, but it doesn't really need to get done if I can still successfully ship projects without them.

I have been itching to move towards pure C output support. This is valuable because it should speed up compilation times, make embedded systems programming in Cakelisp more reasonable, and I like the elegance of C over C++. The barrier here is mostly compile-time code execution, because a lot of Cakelisp's public APIs use C++-only features like generics (std::vector and std::unordered_map) and std::string[^2].

This change will require a relatively large refactor of both the compile-time APIs and the build system interface. I'm not sure when exactly I will get to it, but it will be nice once it's done.

Conclusion

I'm very happy with Cakelisp's progress and what I've been able to get done with it. I am not concerned by its lack of popularity, because that wasn't a goal of mine. I plan to continue refining the language and making cool projects with it. Here's to another year of Cakelisp!

[^1]: Which, to be fair, is much simpler and more limited than something like CMake, which is more complex and riddled with features. (To put it out there: I hate CMake so much, but that's beside the point. Don't use something just because everyone else is using it, etc. etc.)

[^2]: One of my biggest regrets now on Cakelisp is using std::string. It no doubt saved me time getting started, but I'm going to pay a much bigger price now trying to get rid of all of it. If you find yourself making a language, do not use it! Write a string interning system or at least your own linear allocator instead, and keep your APIs C-compatible!
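To illustrate the linear allocator suggested in the footnote above, here is a minimal C sketch of one that stores strings behind a C-compatible interface. The type and function names are hypothetical and are not part of Cakelisp:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical linear (bump) allocator: one block, freed all at once. */
typedef struct {
    char *memory;
    size_t capacity;
    size_t used;
} LinearAllocator;

/* Returns 1 on success, 0 if the block could not be allocated. */
static int linear_init(LinearAllocator *allocator, size_t capacity)
{
    allocator->memory = malloc(capacity);
    allocator->capacity = allocator->memory ? capacity : 0;
    allocator->used = 0;
    return allocator->memory != NULL;
}

/* Copies a NUL-terminated string into the block; returns NULL when full. */
static const char *linear_copy_string(LinearAllocator *allocator, const char *text)
{
    size_t num_bytes = strlen(text) + 1;
    if (num_bytes > allocator->capacity - allocator->used)
        return NULL;
    char *destination = allocator->memory + allocator->used;
    memcpy(destination, text, num_bytes);
    allocator->used += num_bytes;
    return destination;
}

/* Releases every string at once; individual strings are never freed. */
static void linear_destroy(LinearAllocator *allocator)
{
    free(allocator->memory);
    allocator->memory = NULL;
    allocator->capacity = 0;
    allocator->used = 0;
}
```

Because callers only ever see `const char *`, an API built this way stays usable from plain C, which is the property the footnote is asking for.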

Macoy Madson

This article is mirrored on my blog.

I recently released File Helper, a file organization application I wrote using Cakelisp.

This application had only two external files that were necessary for it to fully function:

  • A font
  • An application icon

I packaged File Helper in a .zip or .tar.gz for Windows or Linux respectively. These archives contain the platform executable as well as a license file and the two necessary font and icon files.

However, wouldn't it be nice if instead I shipped a single executable, thereby eliminating the extract step?

It might sound trivial, but eliminating that extra step has many benefits:

  • Less technical users won't get confused. Double-clicking an archive usually opens it in a browser rather than extracting it, which might confuse them and cause them to not use my product.
  • The application has no risk of breaking if the executable is moved.
  • The user doesn't have to delete or move the archive after they extract it.

Bundling files into executables

An executable is just a file format which your operating system understands. It is essentially a header and a whole bunch of sections filled with binary data.

Typically, a linker converts a collection of object files into a single executable. Because executables are containers which can hold various kinds of data, we can package data only our application understands in the same container as the application code.

The operating system is fine with this because it only needs to map the executable into memory and start executing code at a designated entry point. It is then up to the program to decide how to interpret the various executable sections.

Platform differences

There are many different file formats for executables. Usually, an operating system only supports one executable file format. On Windows, it's the Win32 Portable Executable format, typically with extension .exe. On Linux, it's usually ELF.
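As a rough illustration of how distinct these containers are, the two formats can be told apart by their first few bytes. The sketch below is only a guess at the format from the leading magic numbers, with hypothetical names. A thorough check of a PE file would also follow the MS-DOS header to the PE signature:

```c
#include <stddef.h>
#include <string.h>

typedef enum {
    EXECUTABLE_FORMAT_UNKNOWN,
    EXECUTABLE_FORMAT_ELF,
    EXECUTABLE_FORMAT_PE
} ExecutableFormat;

/* Guesses the container format from the first bytes of a file's contents. */
static ExecutableFormat identify_executable(const unsigned char *bytes, size_t num_bytes)
{
    /* ELF files begin with 0x7F followed by the letters "ELF". The literal is
     * split so the hex escape does not swallow the 'E'. */
    if (num_bytes >= 4 && memcmp(bytes, "\x7f" "ELF", 4) == 0)
        return EXECUTABLE_FORMAT_ELF;
    /* PE files begin with an MS-DOS header whose first two bytes are "MZ". */
    if (num_bytes >= 2 && bytes[0] == 'M' && bytes[1] == 'Z')
        return EXECUTABLE_FORMAT_PE;
    return EXECUTABLE_FORMAT_UNKNOWN;
}
```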

I am only targeting those two platforms, so I can add code to specifically support those formats when building Cakelisp programs.

On Windows, data is added to executables via Resource Files. I wrote a tutorial on how to do this.

On Linux, data can be added by dumping the data to an object file which defines a couple of symbols. This is a great tutorial on how to do that.

Good and bad ways

Like everything in programming, you'll hear different advice on how to bundle data.

The most common alternative method is to convert your data to a C-style array definition. This has many limitations, and in my opinion should be avoided:

  • Some compilers (MSVC included) limit the number of elements in an array, which therefore limits the size of the bundled data.
  • Your compiler has to do extra processing (tokenization, parsing, etc.) to that data which it should actually just treat as a giant binary blob. Extra unnecessary processing means longer build times.
  • An extra stage has to be created and compiled as part of your build system, which adds complexity.
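For reference, the array approach being argued against produces source along these lines, similar in shape to what a tool like `xxd -i` emits. The names and the five bytes are hypothetical. A real font would expand to hundreds of thousands of literals:

```c
#include <stddef.h>

/* What a generated array file might look like for a five-byte data file.
 * Every byte becomes a token the compiler must lex and parse. */
static const unsigned char my_font_ttf[] = {
    0x00, 0x01, 0x00, 0x00, 0x00
};
static const size_t my_font_ttf_len = sizeof(my_font_ttf);

/* Example consumer: sums the bytes to show the data is ordinary memory. */
static unsigned int sum_bundled_bytes(void)
{
    unsigned int sum = 0;
    for (size_t i = 0; i < my_font_ttf_len; ++i)
        sum += my_font_ttf[i];
    return sum;
}
```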

We are going to proceed with the platform dependent but much more robust approach, which is to convert our data to object files without using a C/C++ compiler.

Integrated build system

Whether we are on Windows or Linux, we need to process our data file into some other form in order for the linker to properly understand the data package. This means adding a step to our build to process the data, because we want it to automatically stay up-to-date when linked in the executable.

Cakelisp includes a simple C/C++ build system as well as compile-time code execution. We need to create a new build step to process our binary data into object files. In order to do that, we use a compile-time build hook to execute a function which performs the conversion.

The full code is here.

The end-user interface is simply:

(import "DataBundle.cake")
(bundle-file data-start data-end (const char)
             "../data/MyFont.ttf")

We declare data-start and data-end to represent pointers to the symbols associated with our data.

That bundle-file invocation is a macro that adds the data file to a list. It also generates the variables we can use to refer to the data.
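To make the start and end pointers concrete, here is a small C sketch of how a pair like that is typically consumed. With GNU objcopy and binary input, the generated symbols have names of the form `_binary_<file>_start` and `_binary_<file>_end`. How DataBundle.cake maps them to `data-start` and `data-end` is an assumption here, and a static array stands in for the linked-in bytes so the snippet compiles on its own:

```c
#include <stddef.h>

/* Stand-in for the bundled bytes. In a real build these would come from the
 * object file generated from the data file, not from a C array. */
static const char stand_in_data[] = {'f', 'o', 'n', 't'};

/* Hypothetical C view of the data-start and data-end variables. */
static const char *data_start = stand_in_data;
static const char *data_end = stand_in_data + sizeof(stand_in_data);

/* The end pointer is one past the last byte, so the size is a subtraction. */
static size_t bundled_data_size(void)
{
    return (size_t)(data_end - data_start);
}
```

From there the application can hand `data_start` and the computed size to whatever loader it uses, such as a font library that accepts in-memory data.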

Finally, a compile-time function convert-all-bundle-files calls the necessary objcopy (or Resource Compiler on Windows[^1]) to generate the actual object file for each bundle-file. It only does this if the data files are changed or the object files don't already exist in the cache.

The function also adds each generated object file to the linker command line, so the generated objects are linked into the executable alongside our code object files.
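The "only if changed or missing" check can be approximated by comparing file modification times. This is a sketch using POSIX `stat`, with a hypothetical helper name. Cakelisp's actual cache check may work differently:

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Returns true when the object file is missing or older than the data file.
 * Hypothetical helper, not Cakelisp's implementation. */
static bool object_needs_rebuild(const char *data_path, const char *object_path)
{
    struct stat data_info;
    struct stat object_info;
    if (stat(object_path, &object_info) != 0)
        return true; /* No cached object yet */
    if (stat(data_path, &data_info) != 0)
        return true; /* Cannot read the source; let the build step report it */
    return data_info.st_mtime > object_info.st_mtime;
}
```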

This function is integrated into the Cakelisp build sequence like so:

(add-compile-time-hook-module pre-build convert-all-bundle-files)

Conclusion

This is pretty great: we extended our build system to support bundling arbitrary data files, all without touching Cakelisp's internals itself.

Not only that, we extended the system in the same language we write our application code, and within the same invocation---we didn't need to create some other phase. We were also able to provide the user with an extremely simple interface to bundling files.

[^1]: On Windows, we need to generate a .rc file with a list of all the resources that should be compiled into a single object file. Because Cakelisp allows arbitrary compile-time code execution, we can easily do this by writing the filenames out to the generated rc, then invoking the Resource Compiler on that file. This platform-specific step can be completely automated!

Macoy Madson

For my post this month, I wanted to present the entirety of a tutorial I just created for learning to use Cakelisp. The most up-to-date version will be kept here.

Enjoy!


This tutorial will introduce you to Cakelisp's most unique feature, compile-time code generation. I'm not going to introduce fundamental programming constructs like variables or conditional logic---I'm going to focus on what makes Cakelisp special. This is the most important thing to cover because it is the least familiar to new users from other languages.

Prerequisites

  • Experience writing C or C++ programs. If you're just learning how to program, you should learn a different language rather than Cakelisp for now.

Setup

First, download Cakelisp. You can also clone it through git. The URL is https://github.com/makuto/cakelisp.

Unzip the master.zip file, if you downloaded it manually.

Cakelisp's repository

The following section may be skipped. It serves as a quick introduction to the collection of files you downloaded.

Cakelisp consists of the following major components:

A collection of C++ files

The core of Cakelisp itself is in the src/ directory.

These files define the functionality of Cakelisp:

  • Tokenizer: Turns .cake text files into arrays of tokens, which are easier to work with
  • Evaluator: Uses the arrays of tokens as instructions on how to manipulate the "Environment"
  • Generators: Invoked by the evaluator, generators create C/C++ text output
  • Writer: Writes generated outputs to C/C++-language text files
  • Module manager: Handles the separation of files into modules and performs the high-level procedure
  • Build system: Invokes the compiler, linker, and dynamic loader as necessary to build your program

You don't need to know exactly what these do for now.

Runtime

The runtime/ directory stores .cake files which provide various features:

  • CHelpers.cake provides various helper macros and generators for writing C/C++ code
  • CppHelpers.cake provides C++-only features
  • Cakelisp.cake makes it possible to run cakelisp while within another cakelisp compile-time phase
  • ComptimeHelpers.cake gives powerful tools for writing macros, generators, and other compile-time-only code

...and more. The C/C++ helper files hold any language feature that wasn't essential to include in Generators.cpp as a "built-in".

Nothing in runtime/ will actually affect your program unless you explicitly import it.

Supplementary things

  • doc/ folder contains Cakelisp documentation
  • tools/ holds 3rd-party configuration files for making other tools work with Cakelisp
  • test/ consists of several .cake files used to test the language while it is developed

Preparing your environment

Cakelisp relies on a C++ compiler and linker to perform various things. Your system needs to have a C++ toolchain set up.

  • On Windows, download and install Visual Studio for best results
  • On Linux, your system should already have g++ or clang++ installed
  • On Mac, you need clang++

Once these prerequisites are satisfied, do the following:

  • Windows: Run Build.bat
  • Linux: Run Build.sh
  • Mac: (TODO) Run Build_Mac.sh

If the script fails, please email [email protected] so I can help you and make this build step more robust.

If it succeeds, you now have a working cakelisp binary in the bin/ directory!

A note on installs

The language is changing fast enough that I recommend against doing a system-wide installation of cakelisp. If you are using version control, you should check in the entirety of Cakelisp as a submodule so that you always have the compatible version for that project.

First program

Let's make sure everything is working. Create a new file Hello.cake and edit it to have the following:

(c-import "<stdio.h>")

(defun main (&return int)
  (fprintf stderr "Hello, Cakelisp!\n")
  (return 0))

If you're familiar with C (which you probably should be; I will basically assume you are in this tutorial), this should be pretty simple.

We're just getting started though; this language is much more than C with more parentheses.

Build the file with the following command (adjust to make it cakelisp.exe on Windows, if necessary):

./bin/cakelisp --execute Hello.cake

If everything is set up properly, you should see:

Successfully built and linked a.out
Hello, Cakelisp!

You can see that it not only built, but ran the output executable for us, thanks to that --execute option.

If you run that same command again, you'll see slightly different output:

No changes needed for a.out
Hello, Cakelisp!

Cakelisp's build system automatically caches build artifacts and only rebuilds things when you make changes.

Special sauce

"Hello World" is pretty boring. Let's write a program that would be difficult to write in a language without Cakelisp's features.

Let's write a program which takes the name of a command and executes it, much like git does (e.g. git add or git commit, where add and commit are commands).

However, to show off Cakelisp, we're going to have the following rule:

Adding a command should be as easy as writing a function.

This means no boilerplate is allowed.

Taking user input

Modify our main function to take command-line arguments:

(defun main (num-arguments int
             arguments ([] (* char))
             &return int)
  (unless (= 2 num-arguments)
    (fprintf stderr "Expected command argument\n")
    (return 1))
  (fprintf stderr "Hello, Cakelisp!\n")
  (return 0))

By convention, names are written in Kebab style, e.g. num-arguments rather than numArguments or num_arguments. This is purely up to you to follow or ignore, however.

Now, if we build, we should see the following:

Successfully built and linked a.out
Expected command argument
/home/macoy/Repositories/cakelisp/a.out
error: execution of a.out returned non-zero exit code 256

You can see that Cakelisp --execute output additional info because we returned a non-zero exit code. This is useful if you are using --execute in a process chain to run Cakelisp code just like a script.
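The 256 in that output is probably the raw wait status rather than the value returned from main: on Linux the exit code occupies the upper byte of the status, so returning 1 shows up as 1 << 8. That is an inference from the output, not something confirmed from Cakelisp's source. A small POSIX C sketch (with a hypothetical function name) shows how the status is decoded:

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Runs a shell command and returns the exit code its process reported,
 * or -1 if it could not be run or did not exit normally. */
static int run_and_get_exit_code(const char *command)
{
    int status = system(command);
    if (status == -1 || !WIFEXITED(status))
        return -1;
    /* WEXITSTATUS extracts the exit code from the raw wait status. */
    return WEXITSTATUS(status);
}
```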

TODO: Currently, Cakelisp --execute has no way to forward arguments to your output executable. From now on, remove the --execute and run it like so, adjusting accordingly for your platform (e.g. output.exe instead of a.out):

./bin/cakelisp Hello.cake && ./a.out MyArgument

Doing the build on the same command as your execution will make sure that you don't forget to build after making changes.

You should now see:

Hello, Cakelisp!

Getting our macro feet wet

In order to associate a function with a string input by the user, we need a lookup table. The table will have a string as a key and a function pointer as a value.
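Written by hand in C, such a table might look like the sketch below. The command names and functions are hypothetical examples. Note that each command has to be written twice: once as a function, and once as a table entry:

```c
#include <stdio.h>
#include <string.h>

static int say_hello(void) { fprintf(stderr, "Hello!\n"); return 0; }
static int say_goodbye(void) { fprintf(stderr, "Goodbye!\n"); return 0; }

typedef int (*CommandFunction)(void);

typedef struct {
    const char *name;
    CommandFunction function;
} Command;

/* The boilerplate: every new command needs a matching entry here. */
static const Command command_table[] = {
    {"say-hello", say_hello},
    {"say-goodbye", say_goodbye},
};

/* Linear search by name; returns NULL for an unknown command. */
static CommandFunction find_command(const char *name)
{
    size_t num_commands = sizeof(command_table) / sizeof(command_table[0]);
    for (size_t i = 0; i < num_commands; ++i) {
        if (strcmp(command_table[i].name, name) == 0)
            return command_table[i].function;
    }
    return NULL;
}
```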

However, we need to follow our rule that no human should have to write boilerplate like this, because that would make it more difficult than writing a function.

We will accomplish this by creating a macro. Macros in Cakelisp let you execute arbitrary code at compile time and generate new tokens for the evaluator to evaluate.

These are unlike C macros, which only do string pasting.
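For comparison, here is roughly the most a C macro does: paste tokens together and turn them into strings. The macro name below is made up for this sketch:

```c
#include <string.h>

/* The ## operator pastes tokens together; # turns a token into a string. */
#define DEFINE_GREETING(name)             \
    static const char *greet_##name(void) \
    {                                     \
        return "Hello, " #name "!";       \
    }

/* Expands to a function named greet_cakelisp returning "Hello, cakelisp!". */
DEFINE_GREETING(cakelisp)
```

The preprocessor substitutes text here, but it cannot run arbitrary code, remember anything between invocations, or check what kind of argument it was given.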

Let's write our first macro:

(defmacro hello-from-macro ()
  (tokenize-push output
    (fprintf stderr "Hello from macro land!\n"))
  (return true))

tokenize-push is a generator where the first argument is a token array to output to, and the rest are tokens to output.

We will learn more about it as we go through this tutorial.

Every macro can decide whether it succeeded or failed, which is why we (return true) to finish the macro. This gives you the chance to perform input validation, which isn't possible in C macros.

Invoke the macro in main:

(defun main (num-arguments int
             arguments ([] (* char))
             &return int)
  (unless (= 2 num-arguments)
    (fprintf stderr "Expected command argument\n")
    (return 1))
  (fprintf stderr "Hello, Cakelisp!\n")
  (hello-from-macro)
  (return 0))

And observe that "Hello from macro land!" is now output.

Why use a macro?

In this simple example, our macro should just be a function. It would look exactly the same, though wouldn't need a return or tokenize-push:

(defun hello-from-function ()
  (fprintf stderr "Hello from function land!\n"))

We're going to use the macro to generate additional boilerplate, which is what a function cannot do.

Making our macro do more

Let's make a new macro for defining commands:

(defmacro defcommand (command-name symbol arguments array &rest body any)
  (tokenize-push output
    (defun (token-splice command-name) (token-splice arguments)
      (token-splice-rest body tokens)))
  (return true))

This macro now defines a function (defun) with name command-name spliced in for the name token, as well as function arguments and a body.

We now take arguments to the macro, which are defined similarly to function arguments, but do not use C types.

The arguments say defcommand must take at least three arguments, where the last argument may mark the start of more than three arguments (it will take the rest, hence &rest).

There are only a few types which can be used to validate macro arguments:

  • symbol, e.g. my-thing, 4.f, 'my-flag, or even 'a'
  • array, always an open parenthesis
  • string, e.g. "This is a string"
  • any, which will take any of the above types. This is useful in cases where the macro can accept a variety of types

The first argument is going to be the name of the command. We chose type symbol because we want the command definition to look just like a function:

(defun hello-from-function () ;; hello-from-function is a symbol
  (fprintf stderr "Hello from function land!\n"))

(defcommand hello-from-command () ;; hello-from-command is also a symbol
  (fprintf stderr "Hello from command land!\n"))

;;(defcommand "hello-from-bad-command" () ;; "hello-from-bad-command" is a string
;;  (fprintf stderr "Hello from command land!\n"))
;; This would cause our macro to error:
;; error: command-name expected Symbol, but got String

In this example, defcommand will output the following in its place:

(defun hello-from-command ()
  (fprintf stderr "Hello from command land!\n"))

Compile-time variables

Okay, but a C macro could slap some strings around like that! Let's do something a C macro could not: create the lookup table automatically.

We need to add the command to a compile-time list so that code can be generated for runtime to look up the function by name.

For this, we need some external help, because we don't know how to save data for later during compile-time. Add this to the top of your Hello.cake:

(import &comptime-only "ComptimeHelpers.cake")

This ComptimeHelpers.cake file provides a handy macro, get-or-create-comptime-var. We import it to tell Cakelisp that we need that file to be loaded into the environment. We include &comptime-only because we know we won't use any code in it at runtime.

However, if we try to build now, we get an error:

Hello.cake:1:24: error: file not found! Checked the following paths:
Checked if relative to Hello.cake
Checked search paths:
    .
error: failed to evaluate Hello.cake

Cakelisp doesn't know where ComptimeHelpers.cake is. We need to add its directory to our search paths before the import:

(add-cakelisp-search-directory "runtime")
(import &comptime-only "ComptimeHelpers.cake")

This allows you to move things around as you like without having to update all the imports. You would otherwise need relative or absolute paths to find files. You only need to add the directory once. The entire Environment and any additional imports will use the same search paths.

Next, let's invoke the variable creation macro. You can look at its signature to see what you need to provide:

(defmacro get-or-create-comptime-var (bound-var-name (ref symbol) var-type (ref any)
                                    &optional initializer-index (index any))

It looks just like a regular variable declaration, only this one will share the variable's value during the entire compile-time phase.

Let's create our lookup list. We'll use a C++ std::vector, as it is common in Cakelisp internally and accessible from any macro or generator (TODO: This will change once the interface becomes C-compatible):

(defmacro defcommand (command-name symbol arguments array &rest body any)

  (get-or-create-comptime-var command-table (<> (in std vector) (* (const Token))))
  (call-on-ptr push_back command-table command-name)

  (tokenize-push output
    (defun (token-splice command-name) (token-splice arguments)
      (token-splice-rest body tokens)))
  (return true))

We store a pointer to a const Token for each command, which holds the command's function name.

Finally, let's invoke our defcommand macro to test it:

(defcommand say-your-name ()
  (fprintf stderr "your name.\n"))

If we build and run this, nothing visibly changes! We are storing the command-table, but not outputting it anywhere useful.

Compile-time hooks

defcommand is collating a list of command names in command-table. We want to take that table and convert it to a static array for use at runtime.

The problem is we don't know when defcommand commands are going to finish being defined. We don't know the right time to output the table, because more commands might be discovered during compile-time evaluation.

The solution to this is to use a compile-time hook. These hooks are special points in Cakelisp's build procedure where you can insert arbitrary compile-time code.

In this case, we want to use the post-references-resolved hook. This hook is invoked when Cakelisp runs out of missing references, which are things like an invocation of a macro which hasn't yet been defined.

This hook is the perfect time to add more code for Cakelisp to evaluate.

It can be executed more than once. This is because we might add more references that need to be resolved from our hook. Cakelisp will continue to run this phase until the dust settles and no more new code is added.

Creating our compile-time code generator

We use a special generator, defun-comptime, to tell Cakelisp to compile and load the function for compile-time execution.

Compile-time functions can be attached to compile-time hooks, or called from macros or generators.

It's time to create a compile-time function which will create our runtime command look-up table.

(defun-comptime create-command-lookup-table (environment (& EvaluatorEnvironment)
                                             was-code-modified (& bool) &return bool)
  (return true))

(add-compile-time-hook post-references-resolved
                       create-command-lookup-table)

Each hook has a pre-defined signature, which is why our function takes environment and was-code-modified and returns a bool. If you use the wrong signature, you will get a helpful error saying what the expected signature was.

From our previous note on post-references-resolved we learned that our hook can be invoked multiple times. Let's store a comptime var so that the body of our hook only runs once:

(defun-comptime create-command-lookup-table (environment (& EvaluatorEnvironment)
                                             was-code-modified (& bool) &return bool)
  (get-or-create-comptime-var command-table-already-created bool false)
  (when (deref command-table-already-created)
    (return true))
  (set (deref command-table-already-created) true)
  (return true))

We have to make the decision to do this ourselves because we might actually want a hook to respond to many iterations of post-references-resolved. In this case however, we want it to run only once.
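To make the "run until the dust settles" behavior and the run-once guard more concrete, here is a minimal sketch in plain C. It is not how Cakelisp implements hooks internally; hook_function, run_until_settled, and table_build_count are hypothetical names made up for illustration. The loop keeps calling the hook until a pass reports no modifications, and the guard makes sure the real work happens only on the first pass.

```c
#include <stdbool.h>

/* Hypothetical hook signature: returns false on error, and sets
 * *was_code_modified when the hook added new code to evaluate. */
typedef bool (*hook_function)(bool* was_code_modified);

/* Counts how many times the real work was done (for illustration only) */
static int table_build_count = 0;

/* The body runs once, no matter how many times the loop below calls it */
static bool create_command_lookup_table(bool* was_code_modified)
{
    static bool already_created = false;
    if (already_created)
        return true; /* Nothing left to do on later passes */
    already_created = true;

    ++table_build_count;
    *was_code_modified = true; /* We "generated" code, so ask for another pass */
    return true;
}

/* Keeps invoking the hook until a pass makes no modifications.
 * Returns the number of passes, or -1 if the hook reported an error. */
static int run_until_settled(hook_function hook)
{
    int passes = 0;
    bool modified = true;
    while (modified)
    {
        modified = false;
        if (!hook(&modified))
            return -1;
        ++passes;
    }
    return passes;
}
```

In this sketch the hook is called twice: the first pass does the work and requests another pass, and the second pass hits the guard and changes nothing, which ends the loop.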

Our compile-time function is now hooked up and running when all references are resolved, but it's doing nothing.

Let's get our command table and make a loop to iterate over it, printing each command:

(defun-comptime create-command-lookup-table (environment (& EvaluatorEnvironment)
                                             was-code-modified (& bool) &return bool)
  (get-or-create-comptime-var command-table-already-created bool false)
  (when (deref command-table-already-created)
    (return true))
  (set (deref command-table-already-created) true)

  (get-or-create-comptime-var command-table (<> (in std vector) (* (const Token))))
  (for-in command-name (* (const Token)) (deref command-table)
    (printFormattedToken stderr (deref command-name))
    (fprintf stderr "\n"))
  (return true))

You can see we called printFormattedToken, which is a function available to any compile-time code. It uses a camelCase style to tell you it is defined in C/C++, not Cakelisp.

If all goes well, we should see this output:

say-your-name
No changes needed for a.out
Hello, Cakelisp!
Hello from macro land!

You can see it lists the name before the "No changes needed for a.out" line. This is a sign it is running during compile-time, because the "No changes" line doesn't output until the build system stage.

It's Tokens all the way down

At this point, we know it's printing successfully, so we have our list. We now need to get this list from compile-time to generated code for runtime.

To do this, we will generate a new array of Tokens and tell Cakelisp to evaluate them, which results in generating the code to define the lookup table.

We need to create the Token array such that it can always be referred back to in case there are errors. We do this by making sure to allocate it on the heap so that it does not go away on function return or scope exit:

(var command-data (* (<> std::vector Token)) (new (<> std::vector Token)))
(call-on push_back (field environment comptimeTokens) command-data)

We add to the Environment's comptimeTokens list so that the Environment will helpfully clean up the tokens for us at the end of the process.

We know we need two things for each command:

  • Name of the command, as a string
  • Function pointer to the command, so it can be called at runtime

We're going to use the name provided to defcommand for the name, but we need to turn it into a string so that it is properly written:

(var command-name-string Token (deref command-name))
(set (field command-name-string type) TokenType_String)

We copy command-name into command-name-string, which copies the contents of command-name and various other data. We then change the type of command-name-string to TokenType_String so that it is parsed and written to have double quotation marks.
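If the copy-then-retag step feels abstract, the following plain C sketch shows the same idea. SketchToken and its two token types are simplified stand-ins invented for this example; the real Token holds more data and has more types. The point is only that the copy is independent of the original, and that the type decides whether the contents are written with quotation marks.

```c
#include <stdio.h>

/* Hypothetical, simplified token. The real Token holds more data. */
typedef enum
{
    SketchTokenType_Symbol,
    SketchTokenType_String
} SketchTokenType;

typedef struct
{
    SketchTokenType type;
    char contents[64];
} SketchToken;

/* Writes the token as it would appear in source: strings get quotes */
static void write_token(const SketchToken* token, char* buffer, size_t buffer_size)
{
    if (token->type == SketchTokenType_String)
        snprintf(buffer, buffer_size, "\"%s\"", token->contents);
    else
        snprintf(buffer, buffer_size, "%s", token->contents);
}

/* Copies the token by value, then retags only the copy as a string */
static SketchToken make_string_token(const SketchToken* name)
{
    SketchToken name_string = *name;
    name_string.type = SketchTokenType_String;
    return name_string;
}
```

The original token is left untouched, so it can still be spliced in as a symbol wherever the function itself needs to be referenced.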

The function pointer will actually just be command-name spliced in, because the name of the command is the same as the function that defines it.

We can use tokenize-push to create the data needed for each command:

(tokenize-push (deref command-data)
  (array (token-splice-addr command-name-string)
         (token-splice command-name)))

We use token-splice-addr because command-name-string is a Token, not a pointer to a Token like command-name.

Let's output the generated command data to the console to make sure it's good. Here's the full create-command-lookup-table so far:

(defun-comptime create-command-lookup-table (environment (& EvaluatorEnvironment)
                                             was-code-modified (& bool) &return bool)
  (get-or-create-comptime-var command-table-already-created bool false)
  (when (deref command-table-already-created)
    (return true))
  (set (deref command-table-already-created) true)

  (get-or-create-comptime-var command-table (<> (in std vector) (* (const Token))))

  (var command-data (* (<> std::vector Token)) (new (<> std::vector Token)))
  (call-on push_back (field environment comptimeTokens) command-data)

  (for-in command-name (* (const Token)) (deref command-table)
    (printFormattedToken stderr (deref command-name))
    (fprintf stderr "\n")

    (var command-name-string Token (deref command-name))
    (set (field command-name-string type) TokenType_String)

    (tokenize-push (deref command-data)
      (array (token-splice-addr command-name-string)
             (token-splice command-name))))

  (prettyPrintTokens (deref command-data))
  (return true))

And our full output:

say-your-name
(array "say-your-name" say-your-name)
No changes needed for a.out
Hello, Cakelisp!
Hello from macro land!

Creating the lookup table

We need to define the runtime structure to store the lookup table's data for each command. We also need to define a fixed signature for the commands so that C/C++ knows how to call them.

Add this before main:

;; Our command functions take no arguments and return nothing
(def-function-signature command-function ())

(defstruct-local command-metadata
  name (* (const char))
  command command-function)

Now the runtime knows what the layout of the data is. In create-command-lookup-table, let's generate another array of tokens to hold the runtime lookup table variable:

(var command-table-tokens (* (<> std::vector Token)) (new (<> std::vector Token)))
(call-on push_back (field environment comptimeTokens) command-table-tokens)

(tokenize-push (deref command-table-tokens)
  (var command-table ([] command-metadata)
    (array (token-splice-array (deref command-data)))))

(prettyPrintTokens (deref command-table-tokens))

We declare command-table to be an array of command-metadata, which we just defined.

We then splice in the whole command-data array, which should now contain all the commands.

We now get:

say-your-name
(array "say-your-name" say-your-name)
(var command-table ([] command-metadata)
  (array (array "say-your-name" say-your-name)))
Successfully built and linked a.out
Hello, Cakelisp!
Hello from macro land!
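It may help to picture what this generated declaration roughly corresponds to in C. The sketch below is hand-written, not actual Cakelisp output: the identifier spellings (say_your_name, command_table_count) are assumptions, and the names Cakelisp really emits may differ.

```c
#include <stddef.h>
#include <stdio.h>

/* Commands take no arguments and return nothing */
typedef void (*command_function)(void);

typedef struct
{
    const char* name;
    command_function command;
} command_metadata;

static void say_your_name(void)
{
    fprintf(stderr, "your name.\n");
}

/* One (name, function pointer) pair per defcommand invocation */
static command_metadata command_table[] = {
    {"say-your-name", say_your_name},
};

/* Hypothetical helper constant: the number of entries in the table */
static const size_t command_table_count =
    sizeof(command_table) / sizeof(command_table[0]);
```

Each inner (array ...) in the token output becomes one brace-enclosed entry, which is why the command name had to be converted to a string token first.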

Putting it somewhere

We have created our code, but we need to find a place to put it relative to the other code in our Hello.cake module.

This matters because Cakelisp is constrained by declaration/definition order, a constraint imposed by using C/C++ as output languages.

We know we want to use command-table in main to run the command indicated by the user-provided argument. That means we need to declare command-table before main is defined.

We use a splice point to save a spot to insert code later. Define a splice point right above the (defun main definition:

(splice-point command-lookup-table)

Finally, let's evaluate our generated code, outputting it to the splice point. We'll change create-command-lookup-table to return the result of the evaluation. We set was-code-modified to tell Cakelisp that we actually made changes that may need more processing.

(set was-code-modified true)
(return (ClearAndEvaluateAtSplicePoint environment "command-lookup-table" command-table-tokens))

And to make sure it works, we will reference command-table in main. We will list all the available commands, but this time, at runtime.

Update our import to include CHelpers.cake, which has a handy macro for iterating over static arrays:

(import &comptime-only "ComptimeHelpers.cake" "CHelpers.cake")

In main, add the code to list commands. Put it at the very start of the function so it always occurs:

(fprintf stderr "Available commands:\n")
(each-in-array command-table i
  (fprintf stderr "  %s\n"
           (field (at i command-table) name)))

And check the output:

say-your-name
(array "say-your-name" say-your-name)
(var command-table ([] command-metadata)
  (array (array "say-your-name" say-your-name)))
Successfully built and linked a.out
Available commands:
  say-your-name
Hello, Cakelisp!
Hello from macro land!

Try adding another defcommand to make sure it is added to the list.

Running commands

Let's finish up by actually taking the user input and calling the appropriate command.

We need strcmp, so we'll update our c-import to include it straight from the C standard library:

(c-import "<stdio.h>" "<string.h>")

And, in main, after we've confirmed we have enough arguments, we check the command table and run the command!

(var found bool false)
(each-in-array command-table i
  (when (= 0 (strcmp (field (at i command-table) name) (at 1 arguments)))
    (call (field (at i command-table) command))
    (set found true)
    (break)))
(unless found
  (fprintf stderr "error: could not find command '%s'\n" (at 1 arguments))
  (return 1))
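In plain C terms, the lookup above amounts to a linear search with strcmp. This is a hand-written sketch under assumed names (run_command, times_called), not the code Cakelisp generates; the stand-in command only counts its calls so the behavior is easy to check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef void (*command_function)(void);

typedef struct
{
    const char* name;
    command_function command;
} command_metadata;

/* Stand-in command which only records that it ran */
static int times_called = 0;
static void say_your_name(void)
{
    ++times_called;
}

static const command_metadata command_table[] = {
    {"say-your-name", say_your_name},
};

/* Calls the first command whose name matches exactly.
 * Returns false if no command has that name. */
static bool run_command(const char* requested_name)
{
    size_t num_commands = sizeof(command_table) / sizeof(command_table[0]);
    for (size_t i = 0; i < num_commands; ++i)
    {
        if (strcmp(command_table[i].name, requested_name) == 0)
        {
            command_table[i].command();
            return true;
        }
    }
    return false;
}
```

A linear search is fine for a handful of commands. With a very large table you might sort the entries and use a binary search instead, but that is beyond what this tutorial needs.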

Now, we can see our output in different scenarios.

Building only:

> ./bin/cakelisp test/Tutorial_Basics.cake
  say-your-name
  (array "say-your-name" say-your-name)
  (var command-table ([] command-metadata)
    (array (array "say-your-name" say-your-name)))
  Successfully built and linked a.out

Running with no arguments:

> ./a.out
  Available commands:
    say-your-name
  Expected command argument

Running with an invalid command:

> ./a.out foo
  Available commands:
    say-your-name
  Hello, Cakelisp!
  Hello from macro land!
  error: could not find command 'foo'

And finally, running a valid command:

> ./a.out say-your-name
  Available commands:
    say-your-name
  Hello, Cakelisp!
  Hello from macro land!
  your name.

Conclusion

The complete tutorial code can be found in test/Tutorial_Basics.cake.

You can see it's now as easy to define a command as defining a new function, so we achieved our goal.

We had to do work up-front to generate the code, but that work is amortized over all the time saved each time we add a new command. It also changes how willing we are to make commands.

Going further

There are a number of different things you could do with this:

  • Commands could optionally provide a help string
  • Code modification could be used to read all functions rather than requiring the use of defcommand
  • Support for arguments could be added

You made it!

If you are feeling overwhelmed, it's okay. Most languages do not expose you to these types of features.

This tutorial threw you into the deep end of the most advanced Cakelisp feature. This is to showcase the language and to reassure you---if you can understand compile-time code generation, you can understand Cakelisp!

It can take some time to appreciate the power that compile-time code generation and code modification give you. It really is a different way of thinking. Here are some examples where it really was a killer feature:

You can see that this one feature makes possible many things which would be very cumbersome to do without it.

Learning more

Reading documentation

The doc/ folder contains many files of interest, especially Cakelisp.org. There you will find much more detailed documentation than this tutorial provides.

Cakelisp self-documentation

Cakelisp provides some features to inspect its built-in generators. From the command line:

./bin/cakelisp --list-built-ins

...lists all the possible generators built in to Cakelisp. This is especially useful when you forget the exact name of a built-in.

./bin/cakelisp --list-built-ins-details

This version will list all built-ins as well as provide details for them.

Reading code

The best way to learn Cakelisp is to read existing code.

There are examples in test/ and runtime/. You can find extensive real-world usage of Cakelisp on macoy.me.

GameLib is the closest thing to a package manager you will find in Cakelisp land. It provides powerful features as well as easy importing for a number of 3rd-party C and C++ libraries.

Macoy Madson

This post is mirrored on my website.

It is common advice that it isn't worth automating something unless the time saved doing the task is greater than the time required to automate it.

Randall Munroe has two relevant xkcd comics (license):

This comic is the most relevant to this article:

When I talk about automation, I include things like code generation and macros, which automate code writing.

I am going to argue that doing work to automate things may still be worth your time, even if it takes longer than it would have without the automation.

Ethics of unergonomic interfaces

Recently, I have been thinking about user interface ergonomics, or how well an interface, be it textual code, physical hardware, or application UI, is designed around humans.

The goal of automation is typically to increase the ergonomic quality of some interface. Rather than doing a mindless and repetitive task (unergonomic), it is done automatically so one can spend their time doing more valuable work (ergonomic).

Jonathan Blow originally made me see interfaces in this way in an On Doubt interview:

If by spending 10 hours of your time, you can save a typical person using software 1 minute of bad experience, then at some level of usage that number just dwarfs your hours of time, and it becomes an ethical failing to not do that work.

Level editing

I heard an especially good example of this. Some game designers I know spent entire weeks hand-writing 2,000-line XML files. These files determined the basic setup of a level in the game.

They would open the game in a specific world, then hand-copy hundreds of floating point coordinates provided by their mouse pointer. These coordinates would be input (by hand!) to this file to precisely place items.

The file would look something like this, only there would be thousands of unique entries:

<level>
  <building x="1234.23" y="928.2233" z="977.22">Hut</building>
  <building x="928.23" y="110.2233" z="5638.22">Shack</building>
</level>

If an artist wanted a building rotated, they would send the designer a message to have them tweak the value. These are creative people with Master's degrees, typing numbers into a file for days on end. These files could have been generated by the computer---there was no creativity involved in their copying to these files.
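To give a sense of how little code the generation side needs, here is a minimal C sketch of a level file writer. Everything in it is assumed for illustration: building_placement and write_level_xml are hypothetical names, and the real tool would capture placements from the in-game pointer rather than from a hard-coded array. Building names are written as-is, so a real version would also need to escape XML special characters.

```c
#include <stdio.h>

/* Hypothetical placement record, as an in-game tool might capture it */
typedef struct
{
    const char* name;
    float x, y, z;
} building_placement;

/* Writes the placements as a <level> document into buffer.
 * Returns the number of characters written, or -1 if it did not fit. */
static int write_level_xml(char* buffer, size_t buffer_size,
                           const building_placement* buildings, size_t num_buildings)
{
    size_t used = 0;
    int written = snprintf(buffer, buffer_size, "<level>\n");
    if (written < 0 || (size_t)written >= buffer_size)
        return -1;
    used += (size_t)written;

    for (size_t i = 0; i < num_buildings; ++i)
    {
        written = snprintf(buffer + used, buffer_size - used,
                           "  <building x=\"%.2f\" y=\"%.2f\" z=\"%.2f\">%s</building>\n",
                           buildings[i].x, buildings[i].y, buildings[i].z,
                           buildings[i].name);
        if (written < 0 || (size_t)written >= buffer_size - used)
            return -1;
        used += (size_t)written;
    }

    written = snprintf(buffer + used, buffer_size - used, "</level>\n");
    if (written < 0 || (size_t)written >= buffer_size - used)
        return -1;
    used += (size_t)written;
    return (int)used;
}
```

The hard part of such a tool is capturing the placements in the game, not writing the file. Once the positions are in memory, the thousands of entries cost nothing extra.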

This is an example of an unethical interface. One week of programmer time would have easily saved multiple weeks of designer time. Not only that, but the programmer would likely get some amount of enjoyment and satisfaction from creating a good tool.

When developing something a human will interact with, you should consider how much you value your own lifetime, and have empathy for your users' lifetimes.

Effects of automation on creations

Automation not only changes how you do something, but what you will be willing to do.

In the level editing example, designers would be much more willing to create levels if they didn't have to suffer every time they did it. They would spend less time copying numbers to index cards and more time being creative and iterating.

I have spent a large amount of time on a new programming language, Cakelisp. I could have implemented all the software I have written in Cakelisp with existing languages. One could argue that, thanks to Turing completeness, any program I can write in Cakelisp could have been written in C instead, saving all the up-front time spent making Cakelisp.

This is too naive. I work on both C and C++ programs professionally. I am very familiar with how to get real things done with those languages. I am also intimately familiar with their limitations. When I worked on projects at home, I became more frustrated with those limitations.

Cakelisp was a revelation because it was my chance to escape from these limitations. I had to pay my time in order to make something I consider better, but it was absolutely worth it. I was now in charge of the interface. I shaped the language, the language didn't shape me.

This resulted in much higher motivation to do my own projects. If my goal is to eventually become self-sufficient by selling my software, doing more projects in a more enjoyable way is valuable.

Macro magic

Modern computer processors are very parallel. A good processor nowadays has sixteen or more cores. However, writing multi-threaded software to exploit these processors remains as difficult as ever.

In my File Helper project, I needed to keep the user interface responsive while performing a scan of the entire filesystem. My first implementation used SDL's thread API to create a thread dedicated to file system scanning. I used a heavy-handed mutex to ensure the main thread and the scanning thread never stomped on each other's shared memory.

This system was complicated, fragile, and difficult to change.

Task systems assist the programmer by abstracting work that needs to be done into tasks or jobs. In this example, scanning the filesystem would be considered a task.

The problem is, C doesn't have a great interface to these systems.

Here's an example of the C I would have to write to create and run a task in EnkiTS:

enkiTaskScheduler* taskScheduler = enkiNewTaskScheduler();
enkiInitTaskScheduler(taskScheduler);

// myLongTask is a function which must match a special signature
enkiTaskSet* myTask = enkiCreateTaskSet(taskScheduler, myLongTask);

enkiParamsTaskSet taskParams = enkiGetParamsTaskSet(myTask);
// How many things need processing
taskParams.setSize = 100;
// How many things should be processed in each task bucket
taskParams.minRange = 10;
enkiSetParamsTaskSet(myTask, taskParams);

// Set task on-complete
enkiCompletionAction* completionAction =
    enkiCreateCompletionAction(taskScheduler,
                               onMyLongTaskComplete);
enkiParamsCompletionAction completionArgs =
    enkiGetParamsCompletionAction(completionAction);
completionArgs.pDependency = enkiGetCompletableFromTaskSet(myTask);
enkiSetParamsCompletionAction(completionAction, completionArgs);

// Start the tasks
enkiAddTaskSet(taskScheduler, myTask);
enkiWaitforAllAndShutdown(taskScheduler);

enkiDeleteTaskSet(taskScheduler, myTask);
enkiDeleteCompletionAction(taskScheduler, completionAction);
enkiDeleteTaskScheduler(taskScheduler);

As you can see, this is a lot worse than a simple function call:

myLongTask();

This friction in the interface changes how often I am willing to use the task system. Additionally, it imposes a maintenance burden simply by virtue of being many more lines of code, which encourages bad habits like copy-pasting.

Cakelisp provides me the opportunity to eliminate this friction via full-power macros, which allow me to do arbitrary compile-time code generation.

This means I get to define the interface. The task system becomes a domain-specific language where I only type what I must:

;; I define a task just like a function, with regular arguments
;; These arguments are automatically structured and de-structured
(def-task treemap-update-state-task (state (* treemap-state))
 (do-the-long-scanning state))

;; Start the task, passing the arguments from the current context (state)
(task-system-execute
 (treemap-update-state-task state)
 ;; This will run after the previous task, on any thread
 (classify-treemap-paths state)
 (treemap-update-state-done :pin-to-main-thread state))

This task-system-execute macro took about one week to implement (it's here). That is a large time investment, but it will continue to pay dividends the more I use it. In fact, I have already used it three times in File Helper, with great success.

This macro not only eliminated the fragile mess that was the hand-rolled file system thread code, it also encouraged me to move even more things off the main thread.

Another example, this time with tasks that run in parallel:

(task-system-execute
 (export-category-paths copied-entries copied-categories)
 (parallel ;; These tasks can run at the same time
  (open-userdata-system-file-explorer
   open-file-explorer-on-complete)
  (on-export-complete :pin-to-main-thread
                      copied-entries copied-categories)))
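To show what "automatically structured and de-structured" arguments save me from writing, here is a rough C sketch of the boilerplate involved. It is not the output of my macro and it does not use the EnkiTS API. The names task_function, run_task, and the num_paths argument are hypothetical, and the stand-in scheduler simply runs the task immediately on the calling thread.

```c
/* Hypothetical generic task entry point: one opaque argument */
typedef void (*task_function)(void* arguments);

/* Stand-in for the real state */
typedef struct
{
    int num_paths_scanned;
} treemap_state;

/* What I want to write: a normal function with normal arguments */
static void treemap_update_state(treemap_state* state, int num_paths)
{
    state->num_paths_scanned += num_paths;
}

/* Boilerplate a macro can generate: a struct to hold the arguments... */
typedef struct
{
    treemap_state* state;
    int num_paths;
} treemap_update_state_arguments;

/* ...and a wrapper which unpacks them and calls the real function */
static void treemap_update_state_task(void* arguments)
{
    treemap_update_state_arguments* unpacked =
        (treemap_update_state_arguments*)arguments;
    treemap_update_state(unpacked->state, unpacked->num_paths);
}

/* Stand-in scheduler: runs the task right away on the calling thread */
static void run_task(task_function task, void* arguments)
{
    task(arguments);
}
```

Written by hand, the argument struct and the wrapper have to be kept in sync with the function every time its arguments change. Generating them from a single definition removes that chore.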

There are some drawbacks, of course. This macro adds a small amount of time to the compile phase, and produces code that is likely slower than hand-rolled threading code. However, the amount of time recovered by being more willing to make things threaded should easily make up for its drawbacks.

This tool changes how willing I am to write things which require multi-threading. It changes what kinds of things I even consider making.

Conclusion

Automation not only saves you time, it changes what you do. The ergonomics and interface friction influence how you approach tasks. More ergonomic interfaces give you more time to do more interesting work.