Writing a (Ruby) compiler in Ruby bottom up - step 38 2014-09-28


This is part of a series I started in March 2008 - you may want to go back and look at older parts if you're new to this series.

I've been lazy again. Just today I realized that not only has it been two months since the last post, but I still have one more post queued than I thought. I hope to clean up the next post in 2-3 weeks' time, and another one in the following 2-3 weeks. I've actually just started preparing part 42.

Super

This time I'm going to cover a small, simple change for once. We want to make this compile:


      class ScannerString < String
        def initialize str
          super(str)
          @position = 0
        end
      end

Which means implementing super. Conceptually this is easy. Consider that "under the hood", a method call effectively compiles to roughly the following pseudo code (imagine we're compiling to C instead of assembler). "self.bar(baz)" becomes:


        self->__class_pointer[offset_of_bar](baz)

(This assumes the method has a vtable slot, but we don't yet handle methods that lack one anyway.)

We want to translate this into something like this:


        self->__class_pointer->__super_class_pointer[offset_of_bar](baz)

In other words: We find the pointer to the Class object, and then we need to look up the super-class of that class.
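
To make the two pseudo-code lines above a bit more concrete, here is a toy model of them in plain Ruby. It is only a sketch: ToyClass, ToyObject, OFFSETS, toy_send and toy_super are names I made up for illustration, and the real compiler emits assembler rather than doing any of this at runtime.

```ruby
# Toy model of vtable dispatch. Every name here is hypothetical.
ToyClass  = Struct.new(:name, :superclass, :vtable)
ToyObject = Struct.new(:class_pointer, :ivars)

# Every method name gets one fixed slot offset, shared by all classes.
OFFSETS = { initialize: 0, to_s: 1 }

def define_toy_class(name, superclass, methods)
  # A subclass starts from a copy of its parent's vtable and overrides slots.
  vtable = superclass ? superclass.vtable.dup : Array.new(OFFSETS.size)
  methods.each { |meth, body| vtable[OFFSETS.fetch(meth)] = body }
  ToyClass.new(name, superclass, vtable)
end

# self->__class_pointer[offset_of_meth](args)
def toy_send(obj, meth, *args)
  obj.class_pointer.vtable[OFFSETS.fetch(meth)].call(obj, *args)
end

# self->__class_pointer->__super_class_pointer[offset_of_meth](args)
def toy_super(obj, meth, *args)
  obj.class_pointer.superclass.vtable[OFFSETS.fetch(meth)].call(obj, *args)
end

toy_string = define_toy_class(:String, nil, {
  initialize: ->(obj, str = "") { obj.ivars[:buffer] = str; obj },
  to_s:       ->(obj) { obj.ivars[:buffer] }
})

toy_scanner = define_toy_class(:ScannerString, toy_string, {
  initialize: ->(obj, str) {
    toy_super(obj, :initialize, str) # stands in for super(str)
    obj.ivars[:position] = 0
    obj
  }
})

s = ToyObject.new(toy_scanner, {})
toy_send(s, :initialize, "hello")
```

The only difference between toy_send and toy_super is the extra hop through the superclass pointer before indexing the vtable, which is the same one-step change we are about to make in the compiler.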

A quick aside about eigenclasses

If you're experienced with Ruby, you've probably at least heard of eigenclasses or metaclasses.

An eigenclass in Ruby is a class that is private to the instance. This may look like a complicating factor, especially since if you call #class on an object after defining a method on the eigenclass, you still get the same class as before. But what actually happens is that the eigenclass is added into the inheritance chain, but "hidden" from certain types of lookup.

For method calls, everything is as before, so we continue to ignore eigenclasses (at this point it looks like I'll cover them in part 42 or 43).
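
If you want to see the "hidden" part for yourself, the following works in a regular Ruby interpreter such as MRI (it is not something our compiler handles yet). It is just a small illustration; the method name shout is made up.

```ruby
s = String.new("hello")

# Defining a method on one object creates (or reuses) its eigenclass.
def s.shout
  upcase + "!"
end

s.shout                         # works for this one instance only
s.class                         # still String
s.singleton_class               # the eigenclass itself
s.singleton_class.superclass    # String: the eigenclass sits in front of it
s.singleton_methods             # [:shout]
String.new("other").respond_to?(:shout) # false
```

So #class skips over the eigenclass, while method lookup for s goes through it first.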

Getting the superclass

First things first: We currently don't actually store a pointer to the super-class anywhere.

In lib/core/class.rb, we handle the storage (in b05160c):


      (assign (index ob 3) superclass)

We then increase ClassScope::CLASS_IVAR_NUM in scope.rb. At the same time, we add a #method method that will be needed to get the name of the method we're currently compiling, in order to compile a call to the super-class version of the method correctly.

compile_super and friends

Next, we add the function/method name to Function, which includes a few changes to actually pass it in when it is known. But the real work gets done by Compiler#compile_super, which we'll look at shortly, and this new line in #compile_call (in cd29057):


     +    return compile_super(scope, args,block) if func == :super

This is because super-calls will look like receiver-less method-calls, which are handled by compile_call.

And here's #compile_super:


      # Compiles a super method call
      #
      def compile_super(scope, args, block = nil)
        method = scope.method.name
        @e.comment("super #{method.inspect}")
        trace(nil,"=> super #{method.inspect}\n")
        ret = compile_callm(scope, :self, method, args, block, true)
        trace(nil,"<= super #{method.inspect}\n")
        ret
      end

It pulls the name of the surrounding method, and then calls #compile_callm, with an added new flag. Other than that it's debugging/tracing only. The new flag to #compile_callm is do_load_super. #compile_callm changes like this:


    -  def compile_callm(scope, ob, method, args, block = nil)
    +  def compile_callm(scope, ob, method, args, block = nil, do_load_super = false)

and this:


      load_class(scope) # Load self.class into %eax
    + load_super(scope) if do_load_super

and finally, #load_super looks like this:


    +  # Load the super-class pointer
    +  def load_super(scope)
    +    @e.load_instance_var(:eax, 3)
    +  end

Basically #load_super depends on the hardcoded slot for the super-class pointer as an instance variable. Note: We could alternatively expose this pointer as an instance variable of some sort, but this keeps it anonymous and inaccessible. (Keep in mind that asking for the super-class is different to following this field, as following the inheritance chain from Ruby excludes eigenclasses, and also modules.)
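
As a small illustration of that last point, here is how a regular Ruby interpreter such as MRI reports things. The class and module names are made up for the example, and none of this says anything about how our compiler will eventually lay out modules.

```ruby
module Greets
  def greet
    "hi"
  end
end

class Base; end

class Derived < Base
  include Greets
end

d = Derived.new

Derived.superclass            # Base: the included module is skipped
Derived.ancestors.take(3)     # [Derived, Greets, Base]: lookup order includes it
d.class                       # Derived: the eigenclass is skipped
d.singleton_class.superclass  # Derived
```

Method lookup has to walk through Greets, but #superclass pretends it isn't there. That is why the raw pointer in slot 3 and the value Ruby code sees from #superclass can't be assumed to be the same thing in the long run.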

Are we there yet?

Not quite. We're going to make String take an optional argument to #initialize too, in order to be able to handle the example I started with.

Here is the change we make (in 55d076e). See the comments inline.


     class String
    -  def initialize
    +  # NOTE
    +  # Changing this to '= ""' is likely to fail, as it
    +  # gets translated into '__get_string("")', which
    +  # will do (callm String new), leading to a loop that
    +  # will only terminate when the stack blows up.
    +  #
    +  # Setting it "= <some other default>" may work, but
    +  # will be prone to order of bootstrapping the core
    +  # classes.
    +  #
    +  def initialize *str
         # @buffer contains the pointer to raw memory
         # used to contain the string.
         # 
    @@ -13,7 +23,13 @@ class String
         # 0 outside of the s-expression eventually will
         # be an FixNum instance instead of the actual
         # value 0.
    -    %s(assign @buffer 0)
    +
    +    %s(if (lt numargs 3)
    +         (assign @buffer 0)
    +         (do 
    +            (assign len (callm (index str 0) length))
    +            (callm self __copy_raw ((callm (index str 0) __get_raw) len)))
    +        )
       end
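
For comparison, here is roughly the same "optional argument via splat" idea in plain Ruby, without the s-expression escape. This is only a sketch of the pattern: ToyBuffer is a made-up class, and it uses nil and #dup where the real String uses a raw pointer and __copy_raw.

```ruby
class ToyBuffer
  attr_reader :buffer

  # A splat instead of '= ""' avoids evaluating a default value at all.
  def initialize(*str)
    if str.empty?
      @buffer = nil          # no argument given: start out empty
    else
      @buffer = str[0].dup   # copy the contents of the first argument
    end
  end
end

empty  = ToyBuffer.new
filled = ToyBuffer.new("hello")
```

The splat means there is no default expression for the compiler to translate, so there is nothing that can recurse back into String#new.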

Next time

Next time it will be time to fix "nil". And since we're going there, maybe "true" and "false" too.

