Tag: compiler

2010-02-22 17:35 UTC Writing a (Ruby) compiler in Ruby bottom up - step 24

This is part of a series I started in March 2008 - you may want to go back and look at older parts if you're new to this series.

Apologies for the delay... This part was ready before christmas, but for various reasons I never got around to posting it. And to make matters worse I managed to post the wrong part yesterday. Sigh.

The Outer Scope

Stepping back from the attr_* debacle for a bit... I wanted to look at something else, mostly because it affects all my ad-hoc testing.

One of the downsides of starting this without a specific design in mind is that after I decided to go down the Ruby path we've been dragging along quite a bit of cruft.

But that is the cost of experimentation.

In particular, the compiler has a weird distinction between functions and methods: Define something with def outside of a class and it becomes a function, inside it becomes a method.

We need support for functions to make implementation reasonably easy: It makes bootstrapping the object model far easier, as well as integration with the outside C-based world.

But Ruby doesn't have functions.

So what to do?

Recently I wrote about hiding the "low level plumbing" and the change I'm about to show you gets us a bit closer to that while at the same time starting to clean up the function vs. method mess.

In Ruby, self outside of a class is the main object - an instance of Object. Logically then, for a def outside of a class, the method will be defined on `Object. So, we'll create that object.

What about function calls?

Defining and calling functions will still be possible, but only using the s-expression syntax. That reasonably cleanly "hides" the plumbing we use functions for (in fact, it means we could stop the annoying convention I started of prefixing the names with "__" since they effectively now live in their own namespace, though I haven't changed that yet).

Some preparations

First we must rewrite a few functions fully as s-expressions. This is straightforward. For example:

    -def __get_string(str)
    -  s = String.new
    -  %s(callm s __set_raw (str))
    +%s(defun __get_string (str) (let (s)
    +  (assign s (callm String new))
    +  (callm s __set_raw (str))
       s
    -end
    +))

Otherwise the upcoming changes will turn them into methods, which is not what we want.

I'm not going to present all of them. The biggest one is probably __new_class_object, but all of these are straight forward translations.

These changes were done on the master branch prior to starting this feature proper.

Splitting up "defun"

You can follow these remaining changes on the main-object branch

So far defun has been made up by two mostly different branches: If occurring inside a class, it's compiled as a method, if not it'd compile a function.

This is both messy and not very logical.

Going forward we'll separate it into defm that defines a Ruby style method, and defun, which, as before, defines a function. defun will be completely hidden from normal Ruby code - you'll have to dip down into the s-expression syntax to access it.

First our list of compile_* methods in compiler.rb:

    -                   :do, :class, :defun, :if, :lambda,
    +                   :do, :class, :defun, :defm, :if, :lambda,

The new, simplified compile_defun:

    # Compiles a function definition.
    # Takes the current scope, in which the function is defined,
    # the name of the function, its arguments as well as the body-expression
    # that holds the actual code for the function's body.
    #
    # Note that compile_defun is now only accessed via s-expressions
    def compile_defun(scope, name, args, body)
      f = Function.new(args, body,scope)
      name = clean_method_name(name)

# add function to the global list of functions defined so far @global_functions[name] = f

# a function is referenced by its name (in assembly this is a label). # wherever we encounter that name, we really need the adress of the label. # so we mark the function with an adress type. return [:addr, clean_method_name(name)] end

This is really more or less just tearing out the method part, and brings it roughly back to how it was before we added the method calling bit.

But compile_defm is also somewhat simpler than the old method part:

    # Compiles a method definition and updates the
    # class vtable.
    def compile_defm(scope, name, args, body)
      scope = scope.class_scope

f = Function.new([:self]+args, body, scope) # "self" is "faked" as an argument to class methods.

@e.comment("method #{name}")

body.depth_first do |exp| exp.each do |n| scope.add_ivar(n) if n.is_a?(Symbol) and n.to_s[0] == ?@ && n.to_s[1] != ?@ end end

cleaned = clean_method_name(name) fname = "__method_#{scope.name}_#{cleaned}" scope.set_vtable_entry(name, fname, f)

# Save to the vtable. v = scope.vtable[name] compile_eval_arg(scope,[:sexp, [:call, :__set_vtable, [:self,v.offset, fname.to_sym]]])

# add the method to the global list of functions defined so far # with its "munged" name. @global_functions[fname] = f

# This is taken from compile_defun - it does not necessarily make sense for defm return [:addr, clean_method_name(fname)] end

The most important changes?

  • The call to scope.class_scope at the top.
  • The change to call __set_vtable near the bottom instead of hardcoding the asm to set the vtable entry.

So, why those changes?

If you look at GlobalScope, we've added a @class_scope instance variable that is initialized to this:

    @class_scope = ClassScope.new(self,"Object",@vtableoffsets)

This means that if you have a :defm occurring in the global scope (we'll get to that) it will be compiled in that class scope, as a method of class Object.

Poof and our functions are (almost) gone.

(In ClassScope we just return self from the new class_scope method.)

For the second change, adding __set_vtable is just the purist in me wanting to expel as much assembler as possible. Here it is (from lib/core/class.rb):

    # FIXME: This only works correctly for the initial
    # class definition. On subsequent re-opens of the class
    # it will fail to correctly propagate vtable changes 
    # downwards in the class hierarchy if the class has
    # since been overloaded.
    %s(defun __set_vtable (vtable off ptr)
      (assign (index vtable off) ptr)
    )

Ok, so not just the purist in me. In fact, this makes re-opening classes "cleaner" in that this function will actually get reasonably complex as we fix this issue, and also later handle define_method and otherwise deal with cases where no vtable slot is actually available.

Anything else? A few pieces of cleanups to deal with the scope changes, and this little gem / turd depending on your point of view, in lib/core/core.rb:

    +
    +# OK, so perhaps this is a bit ugly...
    +self = Object.new
    +

Ehm. We probably should put that in the compiler, or at least prevent arbitrary assigning to self in user code. But it works for now, and we've done worse in the past. To reiterate: Get things working first. Make them fast/pretty later.

The last vital change is this, in Parser#parse_def:

    -    return E[pos, :defun, name, args, exps]
    +    return E[pos, :defm, name, args, exps]
       end

It does what it looks like - make the parser for the Ruby code spit out only :defm nodes instead of :defun nodes. Our code is now free of functions anywhere outside of s-expressions.

What's the effect?

These changes make this work as expectd:

    class Foo
      def hello
        puts "hello"
      end
    end

def hello_world puts "hello world!" end

%s(puts "foo") # Calls the C function Foo.new.hello # Calls Foo#hello which calls Object#puts puts "world" # Calls Object#puts (via the "main" instance)

hello_world # Calls Object#hello_world (via the "main" instance)

Adding this on the end on the other hand still fails (but it works in MRI) because __set_vtable doesn't propagate the vtable update downwards to subclasses of Object:

    Foo.new.hello_world

When re-opening a class, new methods needs to be added "properly" by processing all child classes and updating their vtables.

The alternative (which moves the cost from the point of defining a method to the point of calling the method), is to implement a proper default method missing that ascends the class hierarchy to look for an implementation before giving an error.

We really should do both, since we eventually need to handle methods without vtable entries.

But that's for a future installment



2009-12-21 16:45 UTC How to implement closures

This is a sort-of interlude to my regular compiler series.

The goal is to give a brief overview of some techniques for implementing closures in a programming language. I will use C for my examples, mostly because it's low level enough that a further translation to assembler etc. is straight forward, and many compilers target C directly anyway.

Before we start, most of the code examples in this post are available in this Gist so you don't need to cut and paste bits and pieces (which won't work anyway, as the text below will omit details such as header includes)

A closure) is in it's simplest form a block of code that can be passed around as a value, and that can reference variables in the scope it was created in even after exiting from that scope.

(For a more formal description look at the Wikipedia page linked above)

A Ruby block forms a closure, for example:

    def foo x
      printf "x is %d\n",x

lambda do x += 1 printf "block: x is %d\n",x

end end

c = foo(5) c.call

c.call

Every time foo is called it will return a closure that has access to the x that was passed in to foo. In the example, x will start out at 5 and get incremented to 7.

The most important implication of this is that any variables used in the closure must be guaranteed to live as long as the closure does.

We'll get back to this example throughout this article.

There are a number of ways of handling this, for example:

  • Instead of a traditional stack, put activation frames (arguments and local variables) for function/method calls on the heap, as a linked list. When creating a closure, you just keep a reference to it. When returning from a function you unlink the frame from the ones below. Presto, the gc handles the remaining dirty details. Some variations of this is called a spaghetti stack
  • Rewriting. You can rewrite any function that may create a closure to create a separate heap allocated environment, and to copy/locate all arguments and variables that may be reused into it. The environment can safely be returned with the closure.
  • Copy. You can copy the current activation frame into a separate closure environment when returning the closure, ensure the closure refers to the variables via a reference, that is replaced with a references to the environment copy.
  • More? Probably...

Each of these have advantages and disadvantages.

We're going to look at the rewriting method mainly, though most of what you find below also apply to the copying method.

These methods are pretty similar - they both involve returning a pointer to the function implementing the code in the closure combined with a pointer to the data in the closure. The difference lies in how that data is accessed.

The rewriting approach creates the closure environment as early as possible, and changes the surrounding function to refer to that environment. It incures the closure creation cost whenever the closure generating function is created, but since local variables are initialized straight into the environment it avoids later copying.

The copy approach delays the creation of the closure environment as much as possible, and then copies the data. It can avoid unnecessary creation cost if there are paths that don't lead to creation of a closure, but when the creation happens, it needs to handle full copying of any objects or object references involved, which may be more expensive.

For both of these methods, care must be taken to create a single environment for any set of closures created during the same execution of the function that creates the closures.

"Fat pointers", objects and thunks

When returning a closure we also need to be able to pass along the environment.

As it turns out, there are a number of ways of handling this:

  • We can create a "fat pointer". I.e. instead of passing around only the address to the code, we also pass around a pointer to the environment, and it is the callers responsibility to load that address onto the stack or into a register so the code can get at it.
  • We can turn the closure into an object, like Ruby's Proc, and simply treat the variables used as instance variables of that object.
  • We can create our own half-assed almost object by storing the function pointer in the environment.
  • We can create a "thunk" on the fly - a small piece of code that will load the address of the environment and jump straight into the real code - and return that instead of a pointer to the real code of the closure. The major downside here is that it requires turning off protection against executing code from the heap.

The "fat pointer" approach is simple but involves more work at every call site, and doubles the size of the data that is passed around.

Turning it into an object is simple, and for my Ruby compiler it's even necessary a lot of the time since most of the time when you handle blocks in Ruby, you'll actually get a Proc object. But it has the full overhead of method dispatch.

The "half-assed object" approach still requires each call site to do a little bit more work, but less than the fat pointer approach. It also doesn't require additional data to be passed around (the function pointer is stored in the environment instead of copied around)

Creating a "thunk" also has overhead, but isn't as scary as it may sound - the code to create is very simple, and really it consists of copying a few bytes around.

We'll start with a "sort of" object, and then take a look at the thunk approach.

Simulating the rewriting method in C

closures-basic.c

C lacks pretty much everything that could make this convenient and easy, so it really lays the implementation bare, for better or worse.

First, let's create a structure to hold the function pointer and environment:


struct closure { void (* call)(struct closure *); int x; };

If we had more local variables in foo, we'd add them to this structure.

Then we need to create a function with the code for the lambda block:


    void block(struct closure * env) {
      env->x += 1;
      printf ("block: x is %d\n", env->x);
    }

Finally we can implement foo:


    struct closure * foo(int x)
    {
      struct closure * closure = (struct closure *)malloc(sizeof(struct closure *));
      closure->x = x;
      printf ("x is %d\n",closure->x);
      closure->call = █
      return closure;
    }

Ewww..

A couple of observations: If we want to be able to return multiple closures (say, an array of them), the variables needs to be acessed via one more indirection, which makes this even more disgustingly convoluted.

It's also annoying, because it means lots of extra overhead in order to handle a situation that might very well never arise.

If the language requires supporting multiple closures (like Ruby), a compiler could support both approaches to optimize - adding the cost of the extra redirection only:

  • when more than one closure can potentially be returned at the same time.
  • when those closures need access to the same variables.

For an example of these restrictions, consider:

     def foo x
       a = 1

b = 2 if x return lambda { a += 1 }

else return lambda { b+= 1 }

end end

First of all, this function is guaranteed to only return one lambda at the time, so applying that rule, we can just assing the appropriate function pointer to the call member variable, and do away with the extra indirection.

Secondly, even if we decide to return both of them at the same time, they are guaranteed to never access the same variables, so we could instead create two distinct closure environments, and still avoid the extra indirection.

But let's take a look at the complete example with the extra indirection anyway, as a worst case scenario:

Indirection hell

closures-indirection.c

First we create an environment for the free variables we wish to "capture":


struct env { int x; };

Our modified closure structure holds a pointer to it, instead of holding the variables:


    struct closure {
      void (* call)(struct env *);
      struct env * env;
    };

The block takes a pointer to the environment:


    void block(struct env * env) {
      env->x += 1;
      printf ("block: x is %d\n", env->x);
    }

And the closure function itself needs to first allocate the environment, and then the closure:


    struct closure * foo(int x) {
      struct env * env = (struct env *)malloc(sizeof(struct env));
      env->x = x;

printf ("x is %d\n",env->x);

struct closure * closure = (struct closure *)malloc(sizeof(struct closure *)); closure->env = env; closure->call = block;

return closure; }

Finally we call it with the env:


    int main() {
      struct closure * c = foo(5);

c->call(c->env); c->call(c->env); }

Fat pointers

The example above is actually really simple to use to illustrate the fat pointer approach. Instead of returning a pointer to the closure object, we simply return the object itself:

closures-fatptr.c


    struct closure foo(int x)
    {
      struct env * env = (struct env *)malloc(sizeof(struct env));
      env->x = x;

printf ("x is %d\n",env->x);

struct closure closure; closure.env = env; closure.call = block;

return closure; }

int main() { struct closure c = foo(5);

c.call(c.env); c.call(c.env); }

As you can see it makes some things simpler (no need for the second memory allocation step).

Variable substitution

An optimization worth keeping in mind is variable substitution. In cases where it can be guaranteed that the variables in question keeps the same value, the need for a separate environment may go away if variables are substituted for their values in the closures body.

Furthermore, if the variables can be guaranteed not to change in any closure returned from the function, then whether or not the variable changes in the function outside the closures, the closures may keep separate environments (and hence do the optimization to avoid indirection) with copies of the variables in question.

Of course this would require the function to carry out any updates once for each generated closure.

(you might have spotted here one of the advantages for functional languages with little or no mutation of variables - they have a lot fewer issues to worry about with respect to sharing of viarable state)

Moving on to thunks

closures-thunks.c

Now that we have the basics down, how do we return just a "plain" function pointer, so we can simplify the call sites?

What we want to create is something like the code below.

Note that we take a shortcut and don't create a proper stack frame - this kind of thing can easily confuse gdb and other debuggers, which is not necessarily very nice.

The thunk below also does not directly allow passing any arguments to the block - if we wanted to do that it gets hairier, since we'd need to manipulate the parameters passed, while the caller doesn't know we've altered the size of the parameter space set aside. In a compiler this would likely be solved by making either the caller or callee aware that it's a closure call, and adjusting the stack accordingly separate from the thunk.

Of course, the downside of using asm here is that it's architecture specific, and the code to generate the thunk will need to be modified accordingly to port the code, but then if you do this in a compiler that is outputting native assembly, you have to do that anyway.

    my_closure_instance:
        pushl $my_environment ; Push the environment onto the stack as the first arg
        call  $the_block      ; Go to the real code
        addl  $4,%esp         ; Throw the environment pointer away
        ret                   ; Return to the caller

Putting some arbitrary values in there, and assembling it with gcc/gas, and then doing objdump -D to the resulting binary gives this (and other bits and pieces I've cut - use the -nostdlib option to avoid dealing with a bunch of initialization code):

    08048055 :
    8048055:       68 00 08 af 2f          push   $0x2faf0800
    804805a:       e8 00 00 00 00          call   804805f 
    804805f:       83 c4 04                add    $0x4,%esp
    8048062:       c3                      ret  

This shows us the values to put in our thunk. Couple of observations: The push is just followed by the address itself, but the call uses an offset from the first byte of the following instruction.

This approach of using structs to generate the thunk, I've blatantly stolen from Joe Damato

because I like how it makes the code that manipulates the thunk more readable:


    struct __attribute__((packed)) thunk {
      unsigned char push_op;
      void * env_addr;
      unsigned char call_op;
      signed long call_offset;
      unsigned char add_esp_ops[3];
      unsigned char ret_op;
    };

struct thunk default_thunk = {0x68, 0, 0xe8, 0, {0x83, 0xc4, 0x04}, 0xc3};

(the __attribute__ stuff is the gcc specific way of avoiding padding for alignement)

We change the rest like this:


    typedef void (* cfunc)();

cfunc foo (int x) { struct env * env = (struct env *)malloc(sizeof(struct env)); env->x = x;

printf ("x is %d\n",env->x);

struct thunk * thunk = (struct thunk *)mmap(0,sizeof(struct thunk), PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); *thunk = default_thunk; thunk->env_addr = env; thunk->call_offset = (void *)&block - (void *)&thunk->add_esp[0]; // Pretty! mprotect(thunk,sizeof(struct thunk), PROT_EXEC); return (cfunc)thunk; }

The typedef is a workaround for C's absolutely atrocious ptr-to-function declarations...

The interesting bit is at the end where we allocate the thunk, and fill in the addresses

Then we cast the thunk data structure to a function pointer. That ought to make you feel dirty, and a bit queasy. It's ok, though.

We use mmap to avoid problems on systems with executable heaps turned off (execshield etc. that wil cause a segmentation fault if you try to execute code in malloc()'d memory or on the stack), and the mprotect() turns off write access to the page after we're done. For a production approach you may want a dedicated allocation function to properly manage this and perhaps avoid doing a separate mmap for every thunk created.

All of this could really be wrapped up into a nice generic function. Something like this:


    struct thunk * make_thunk(struct env * env, void * code) {
      struct thunk * thunk = (struct thunk *)mmap(0,sizeof(struct thunk), PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      *thunk = default_thunk;
      thunk->env_addr = env;
      thunk->call_offset = code - (void *)&thunk->add_esp[0]; // Pretty!
      mprotect(thunk,sizeof(struct thunk), PROT_EXEC);
      return thunk;
    }

Finally, this makes our main function look like this:


    int main() {
      cfunc c = foo(5);

c(); c(); }

Now that's nicer...

Since this is already gcc/Linux/x86-32 specific, it can be made even nicer. gcc supports inner functions, and with a tiny bit of restructuring and a couple of macros I did this, mostly for fun to see how close to a "natural" syntax for closures I could get in C without changing the compiler:


    #define initenv(__vars__) struct env { __vars__ ; } * env = (struct env *)malloc(sizeof(struct env));
    #define new_closure(__block__) (closure)make_thunk(env,&__block__)

closure foo (int x) { initenv(int x) env->x = x;

printf ("x is %d\n",env->x);

void block (struct env * env) { env->x += 1; printf ("block: x is %d\n", env->x); }

return new_closure(block); }

I'll still stick to Ruby...

Some parting comments

I wrote this while exploring various approaches to add closure support to my Ruby compiler. I haven't quite made up my mind yet. The object approach is tempting because Ruby already has the Proc class, but I'll probably go for an environment + fat pointer approach that will be converted into a Proc object if assigned to anything (as opposed to just used for yield)

The thunk approach is somewhat appealing too, but if turned into a Proc object it may be as much or more overhead than the fat pointer approach (and there will only be one call site: In Proc#call) and this approach would mean the fat pointer wouldn't be passed around much - typical usage would be passing a block to a method, and then yield'ing to it, where the yield would be the only call site.



2009-12-16 18:26 UTC Writing a (Ruby) compiler in Ruby bottom up - step 23

This is part of a series I started in March 2008 - you may want to go back and look at older parts if you're new to this series.

Continuing down the rabbit hole: String

A couple of parts ago we established some of the problems with supporting even the seemingly simple attr_reader, attr_writer and attr_accessor. To reiterate, a naive implementation looks like this:

def attr_accessor sym
  attr_reader sym
  attr_writer sym
end

def attr_reader sym define_method sym do %s(ivar self sym) end end

def attr_writer sym define_method "#{sym.to_s}=".to_sym do |val| %s(assign (ivar self sym) val) end end

We resolved the issue of having a working sym.to_s, but that still leaves a number of things to address. For example, we need at least the start of a "proper" String class that we can add #to_sym to. That's what we'll look at this time.

The first bit is very similar to what we did for Symbol: We won't type tag, and we'll create String objects for literal strings by calling a low level function.

Quick sidebar: Hiding the low level implementation

MRI "hides" the gory details of String and other basic classes by not making them real Ruby objects per-se. There's data there that is simply not accessible without dipping into the C API. We may eventually want to take a similar tack - add a mechanism for allocating space for, setting and retrieving low level variables via the s-expression syntax that are completely invisible for the higher level API.

Currently we don't. The "raw" string used for Symbol and for String objects is a normal instance variable, and it's perfectly possible to expose it, and accessing it thinking it's a real object will trivially easily cause nasty crashes.

In general, we would eventually benefit from creating a mechanism to "hide" the low level plumbing from accidental access by normal Ruby. For now we'll retain the ability to shoot ourselves in the foot easily.

The basics

The String class will for now contain just a pointer to a C-style string. It will be immutable in it's first incarnation. This gets us a decent step further very easily - we can just keep the address to a C-style string constant.

As it turns out, this will be an easy foundation to build on:

MRI uses a number of flags for strings, for example to do "copy on write". Once we need to make strings mutable, we'll do the same. Strings created from constants will start out marked as "copy on write", and when doing an update, the string buffer will be copied into freshly allocated space first. The hope is that many string constants will never be updated.

We also need to implement String#to_sym. This is thankfully also reasonably easy: We need to access the low level function we added that will retrieve a symbol based on a "raw" c-style string.

The code

As the last couple of times, this was unfortunately committed a bit piecemeal, so the commits aren't easy to follow. But lets go through this step by step.

First of all, we're going to rewrite the tree, to alter a literal string into

%s(call __get_string ConstantReferringToTheRawString)

So, we alter compile to add a call to rewrite_strconst:

#
# Re-write string constants outside %s() t
# %s(call __get_string [original string constant])
def rewrite_strconst(exp)
  exp.depth_first do |e|
    next :skip if e[0] == :sexp
    is_call = e[0] == :call
    e.each_with_index do |s,i|
      if s.is_a?(String)
        lab = @string_constants[s]
        if !lab
          lab = @e.get_local
          @string_constants[s] = lab
        end
        e[i] = [:sexp, [:call, :__get_string, lab.to_sym]]
        # FIXME: This is a horrible workaround to deal
        # with a parser inconsistency that leaves calls
        # with a single argument with the argument "bare"
        # if it's not an array, which breaks with this rewrite.
        e[i] = [e[i]] if is_call && i > 1 
      end
    end
  end
end

Effectively this visits all nodes that are not s-expressions, and if it finds a string, it will add a string constant. It will then use the label that is allocated to rewrite the expression like this:

[:sexp, [:call, :__get_string, lab.to_sym]]

As the following line says, it's a nasty workaround, and it will be torn out as soon as I get a parser, so just ignore that... pretend it's not even there ;)

Now, the reason we wrap the call to __get_string is that otherwise it will be rewritten to a method call because there really is no such thing as function calls in Ruby, but for some of our low-level plumbing we need actual C-style function calls (such as for calling out to C code).

We're also changing strconst, the method used to get the "value" of a string constant as follows:

-  def strconst str
-    lab = @string_constants[str]
-    return lab if lab
-    lab = @e.get_local
-    @string_constants[str] = lab
-    return lab
+  def strconst(a)
+    lab = @string_constants[a]
+    if !lab # For any constants in s-expressions
+      lab = @e.get_local
+      @string_constants[a] = lab
+    end
+    return [:addr,lab]
   end

What's left? Not much, really. You can see the basi c String class

Here's the important bit:

class String
   def __set_raw(str)
     @buffer = str
   end
end

def __get_string(str) s = String.new %s(callm s __set_raw (str)) s end

It's worthwhile reading the comments in at the link above if you want to see more about where this is heading.

We'll flesh out the String class in coming parts, but this groundwork now means we're working on a "real" object instead of on a raw pointer to a c-style string, which means it's mostly a runtime library issue instead of requiring much in terms of compiler changes.

One of the things worth mentioning that is covered in one of the comments in lib/core/string.rb is that the whole __get_string and __get_symbol thing is a bit ugly. As I mentioned earlier in this part, hiding the low level implementation would be nice, and that includes preventing direct calls to __get_string and _get_symbol, and that can be done trivially by simply ensuring the Ruby code can't directly do function calls (as opposed to method calls). But that doesn't solve String#__set_raw. To fix that, I'll need to either remove the need for the method, or by making it possible to "hide" certain methods entirely from the Ruby code...

It's not a priority now, though - first we get things working, then we make it pretty.

What about String#to_sym? Well, this ought to do the trick:

class String
  def to_sym
    __get_symbol(@buffer)
  end
end

So, that leaves us with a working define_method and %s(lambda) with support for variables to get attr_reader,attr_writer and attr_accessor working... We're heading down that road soon.



2009-12-10 18:53 UTC Ruby gets a spec

A week or so ago, the Ruby Draft specification made the rounds

Yes... Ruby is finally getting a standard. While RubySpec has been around for a while, and is a great, it is an executable specification that tells you what, not why, and it aims to be "complete" while the new project aims to define a shared core suitable for implementation by all the different Ruby implementations that are springing up.

I think they're complementary enough that the Ruby Draft Specification could be very beneficial.

For my compiler project this means there's now a compelling reason to revisit some things (such as the parser) to look at conformance, and a clear roadmap for some others (classes) to ensure it at least meets the spec.

Of course, in many ways MRI will still be the benchmark, but the standard will provide a minimal level of conformance that will likely be easier to achieve.

In particular, the spec includes a formal grammar for Ruby that doesn't involve walking through thousands of lines of code written for another implementation to verify that you're doing things right. I'm particularly looking forward to going through my parser and aligning it more closely with the draft spec... (I'm sure I'll be swearing a lot while doing it, though)

It also includes descriptions of expected semantics for things like the object model that have previously been something a lot of people (including me) have spent time revers engineering to figure out.

Combined with all the Ruby implementations in progress, this is a clear indication that Ruby is growing up and becoming a contender in areas where it would previously not be acceptable.

Keep an eye on the Ruby standard project...



2009-11-10 14:20 UTC Writing a (Ruby) compiler in Ruby bottom up - step 22

This is part of a series I started in March 2008 - you may want to go back and look at older parts if you're new to this series.

A diversion into Method missing

So far the method_missing implementation has just printed a notice and quit.

During the trip down the rabbit hole that is attr_accessor and friends that became a major annoyance.

The problem is that this notice has not included a stack backtrace or any way to figure out where it occurred. It's also been impossible to override it and actually figure out what method was being called because we currently get to method_missing by jumping straight into the vtable.

We get method_missing when we hit the pointer that's installed there as default when nothing has overridden it.

So how do we get better debug output? And how do we support users overriding method_missing and actually getting a symbol to use?

Thunks

A "thunk" in terms of a object oriented languages is generally a small piece of compiler generated code that gets inserted to "adjust" a function or method call.

In this specific case, we will generate a separate thunk for each vtable entry.

Instead of inserting a pointer to method_missing directly, we will insert the address of a small thunk. The thunk will not even create a full stack frame, but simply add the address of the Symbol corresponding to the vtable slot as the first argument on the stack, and then jump straight into method_missing, thereby simulating a "direct" call to method_missing with the symbol as the first argument.

It's actually very simple - we just need to pop the real return address off the stack, push the symbol onto the stack, and then push the real return address back on.

Ok, so we still cheat a bit. Eventually we need to make our method_missing into a real method, but for now it's a function. Here's the code I've added to create a "base" vtable that is used to initialize the vtable slots of Object:

    def output_vtable_thunks
      @vtableoffsets.vtable.each do |name,_|
        @e.label("__vtable_missing_thunk_#{clean_method_name(name)}")
        # FIXME: Call get_symbol for these during initalization 
        # and then load them from a table instead.  
        compile_eval_arg(GlobalScope.new, ":#{name.to_s}".to_sym)
        @e.popl(:edx) # The return address 
        @e.pushl(:eax)
        @e.pushl(:edx)
        @e.jmp("__method_missing")
      end
      @e.label("__base_vtable")
      # For ease of implementation of __new_class_object we
      # pad this with the number of class ivar slots so that the
      # vtable layout is identical as for a normal class 
      ClassScope::CLASS_IVAR_NUM.times { @e.long(0) }
      @vtableoffsets.vtable.to_a.sort_by {|e| e[1] }.each do |e|
        @e.long("__vtable_missing_thunk_#{clean_method_name(e[0])}")
      end
    end

I hope it's reasonably easy to follow. First it generates a number of functions that will look like this:

    __vtable_missing_thunk_to_yaml:
        subl    $4, %esp
        movl    $1, %ebx
        movl    $.L110, (%esp)
        movl    $__get_symbol, %eax
        call    *%eax
        addl    $4, %esp
        popl    %edx
        pushl   %eax
        pushl   %edx
        jmp __method_missing

This can be optimized a lot, but if you've followed this series, you know getting something working is higher priority. In this case we generate a call to __get_symbol, which was introduced in the last part, and we pass the string corresponding to the name:

    .L110:
        .string "to_yaml"

Then we adjust the stack as mentioned above.

The next step is to create the __base_vtable. Here's an excerpt:

    __base_vtable:
        .long 0
        .long 0
        .long __vtable_missing_thunk_new
        .long __vtable_missing_thunk___send__
        .long __vtable_missing_thunk___get_symbol
        .long __vtable_missing_thunk___method_missing
        .long __vtable_missing_thunk_array
        .long __vtable_missing_thunk___new_class_object
        .long __vtable_missing_thunk_define_method
        ...

Then we need to modify __new_class_object to assign entries from __base_vtable instead of just blindly assigning a pointer to __method_missing:

    # size <= ssize *always* or something is severely wrong.                                                                                                                
    def __new_class_object(size,superclass,ssize)
      ob = 0
      %s(assign ob (malloc (mul size 4))) # Assumes 32 bit                                                                                                                  
      i = 1
      %s(while (lt i ssize) (do
          (assign (index ob i) (index superclass i))
          (assign i (add i 1))
      ))
      %s(while (lt i size) (do
           # Installing a pointer to a thunk to method_missing                                                                                                              
           # that adds a symbol matching the vtable entry as the                                                                                                            
           # first argument and then jumps straight into __method_missing                                                                                                   
           (assign (index ob i) (index __base_vtable i))
           (assign i (add i 1))
      ))
      %s(assign (index ob 0) Class)
      ob
    end

Finally we make __method_missing output the symbol, instead of just spitting out "Method missing":

    def __method_missing sym
      %s(printf "Method missing: %s\n" (callm sym to_s))
      %s(exit 1)
      0
    end



Older Entries

  • 2009-11-01 Writing a (Ruby) compiler in Ruby bottom up - step 21
  • 2009-05-05 Writing a (Ruby) compiler in Ruby bottom up - step 20
  • 2009-04-19 The problem with compiling Ruby
  • 2009-04-16 Writing a compiler in Ruby bottom up - Milestone: It can parse itself...
  • 2009-03-10 Writing a compiler in Ruby bottom up - step 19
  • 2009-02-23 Writing a compiler in Ruby bottom up - step 18
  • 2009-02-21 Writing a compiler in Ruby bottom up - step 17
  • 2009-02-17 Writing a compiler in Ruby bottom up - step 16
  • 2009-02-14 Writing a compiler in Ruby bottom up - step 15
  • 2009-02-12 Writing a compiler in Ruby bottom up - step 14
  • 2009-02-01 Just added a github repository for my compiler series
  • 2009-01-26 Writing a compiler in Ruby bottom up - step 13
  • 2008-12-08 A simple Operator Precedence parser
  • 2008-10-27 Writing a compiler in Ruby bottom up - step 12
  • 2008-09-29 Writing a compiler in Ruby bottom up - step 11
  • 2008-07-10 Writing a compiler in Ruby bottom up - step 10
  • 2008-06-19 Writing a compiler in Ruby bottom up - step 9
  • 2008-06-01 Writing a compiler in Ruby bottom up - step 8
  • 2008-05-16 Writing a compiler in Ruby bottom up - step 7
  • 2008-05-06 Unholy: Converting Ruby 1.9 bytecode to Python bytecode
  • 2008-05-03 Writing a compiler in Ruby bottom up - step 6
  • 2008-04-27 Writing a compiler in Ruby bottom up - step 5
  • 2008-04-17 Writing a compiler in Ruby bottom up - step 4
  • 2008-04-08 Strawman for a new parser generator
  • 2008-04-06 Ur-Scheme: A tiny self-hosting Scheme to x86 asm compiler
  • 2008-04-05 - Why am I forced to optimize when choosing my language?
  • 2008-04-05 Writing a compiler in Ruby bottom up - step 3
  • 2008-03-27 Writing a compiler in Ruby bottom up - step 2/??
  • 2008-03-26 Writing a compiler in Ruby bottom up - step 1/??
  • 2008-03-21 Shotgun: The Rubinius virtual machine and some musings on compiling Ruby
  • About me

    E-mail: vidar@hokstad.com Skype: vhokstad
    Twitter: vhokstad
    View my LinkedIn profile.

    I was born April 21st, 1975, in Oslo, Norway. Since 2000 I've been living in London, UK. I'm married and we just had our first child, Tristan Ikemefuna Hokstad.

    I'm working for Aardvark Media as Director of Technology. I'm also currently on the board of SpatialQ, a startup in the GIS space, and an advisor to Skoach, a startup doing a time management app for people with ADD.

    Twitter Updates

      follow me on Twitter