Writing a compiler in Ruby, bottom up
(To follow my compiler related posts, either subscribe to my main RSS feed, or the compiler specific one )
Back in March 2008 I started publishing a series on how to write a compiler in Ruby, bottom up, that is, starting with the code generator and working my way up instead of the more traditional approach of writing the parser first. Here are the parts I've published on my blog so far.
(The bits labeled "interlude" are sort-of related articles in that the cover issues with writing/developing compilers, or issues related to compilation, but are not directly tied into the actual article series)
- Step 1 - Creating a simple prolog / epilog for the main function
- Step 2 - Function calls / Hello World
- Step 3 - Chaining expressions ; Adding sub-expressions
- Step 4 - Defining functions ; Adding support for a runtime
- Step 5 - Number literals ; if .. then .. else
- Step 6 - Anonymous functions: lamba / call
- Step 7 - Making use of lambda / call ; Function arguments
- Step 8 - Assignment, arithmetic and comparisons
- Step 9 - A builtin "while"
- Step 10 - Testing the compiler: A primitive "parser"
- Step 11 - A separate emitter class to output assembler
- Step 12 - Adding support for arrays
- Interlude - A simple operator precedence parser
- Step 13 - Local variables
- Step 14 - Variable length arguments
- Step 15 - Starting on the real parser
- Step 16 - Extending the parser
- Step 17 - Inferring "let"'s
- Step 18 - Plugging in an operator precedence parser
- Interlude - The Ruby Object Model - useful overview as the next steps is to start fleshing out an object model
- Step 19 - The Object Model
- Interlude - The problem with compiling Ruby - Overview of some of the problems with compiling Ruby
- Step 20 - Taking stock of where we are; detail walkthrough of the shunting yard parse component
- Step 21 - Basic support for the Symbol class; starting to work
- Step 22 - A diversion into method_missing
- Step 23 - Real Strings: Turning strings into objects
- Interlude - How to implement closures
- Step 24 - The Outer Scope: Cleaning up and adding the "main" object
- Step 25 - Towards define_method via closures
- Step 26 - Taking a stab at STABS support (debugging/gdb)
- Step 27 - The Operators - Part 1: Parsing; The start of the removal of 'runtime.c'
- Step 28 - The Operators - Part 2: Turning numbers into Numbers (or small enough integers into Fixnum's at least)
- Step 29 - The Operators - Part 3: Operating on our newly minted Fixnums
- Step 30 - The Operators - Part 4: Comparisons
- Step 31 - Stringing things together; basic string output and string interpolation
- Step 32 - Testing code generation
- Step 33 - Register Allocation
- Step 34 - Putting self in a register
- Step 35 - Towards Self Hosting
- Step 36 - Scanning for problems
- Step 37 - Default Arguments
- Step 38 - Super
- Step 39 - To be or not to be nil
- Step 40 - Untangling "expect"
- Step 41 - Messy further steps towards self-hosting
- Step 42 - Re-opening class definitions (to be published November '14)
- Step 43 - Eigenclasses (to be published November or early December '14)
- Step 44 - Wrapping up self-hosting of the tokenization code (planned for December '14)
- Step 45 - '#send' and method lookup (planned for December '14 or early Jan '14)
All the code can now be found in this GitHub repository
A recent (as of 2014) article series on "reconstructing" Ruby (in C)
Someone commenting on this series on Hackernews pointed to Nicklaus Wirth's excellent book on compiler construction (PDF available on his homepage) as a good starting point from a traditional, top down point of view. It's a good and very approachable book, and well worth a read.
Another great series with a more traditional approach is a famous series by Jack Crenshaw: Let's build a compiler
This page at Stackoverflow has a great list of other compiler writing resources.