2010-03-03 00:31 UTC Minimig
That is all.
That is all.
Stepping back from the attr_* debacle for a bit... I wanted to look
at something else, mostly because it affects all my ad-hoc testing.
One of the downsides of starting this without a specific design in mind is that after I decided to go down the Ruby path we've been dragging along quite a bit of cruft.
But that is the cost of experimentation.
In particular, the compiler has a weird distinction between functions
and methods: Define something with def outside of a class and it
becomes a function, inside it becomes a method.
We need support for functions to make implementation reasonably easy: It makes bootstrapping the object model far easier, as well as integration with the outside C-based world.
But Ruby doesn't have functions.
So what to do?
Recently I wrote about hiding the "low level plumbing" and the change I'm about to show you gets us a bit closer to that while at the same time starting to clean up the function vs. method mess.
In Ruby, self outside of a class is the main object - an instance
of Object. Logically then, for a def outside of a class, the method
will be defined on `Object. So, we'll create that object.
Defining and calling functions will still be possible, but only using the s-expression syntax. That reasonably cleanly "hides" the plumbing we use functions for (in fact, it means we could stop the annoying convention I started of prefixing the names with "__" since they effectively now live in their own namespace, though I haven't changed that yet).
First we must rewrite a few functions fully as s-expressions. This is straightforward. For example:
-def __get_string(str)
- s = String.new
- %s(callm s __set_raw (str))
+%s(defun __get_string (str) (let (s)
+ (assign s (callm String new))
+ (callm s __set_raw (str))
s
-end
+))
Otherwise the upcoming changes will turn them into methods, which is not what we want.
I'm not going to present all of them. The biggest one is probably
__new_class_object, but all of these are straight forward translations.
These changes were done on the master branch prior to starting this feature proper.
You can follow these remaining changes on the main-object branch
So far defun has been made up by two mostly different branches: If
occurring inside a class, it's compiled as a method, if not it'd compile
a function.
This is both messy and not very logical.
Going forward we'll separate it into defm that defines a Ruby style
method, and defun, which, as before, defines a function. defun
will be completely hidden from normal Ruby code - you'll have to dip
down into the s-expression syntax to access it.
First our list of compile_* methods in compiler.rb:
- :do, :class, :defun, :if, :lambda,
+ :do, :class, :defun, :defm, :if, :lambda,
The new, simplified compile_defun:
# Compiles a function definition.
# Takes the current scope, in which the function is defined,
# the name of the function, its arguments as well as the body-expression
# that holds the actual code for the function's body.
#
# Note that compile_defun is now only accessed via s-expressions
def compile_defun(scope, name, args, body)
f = Function.new(args, body,scope)
name = clean_method_name(name)
# add function to the global list of functions defined so far
@global_functions[name] = f
# a function is referenced by its name (in assembly this is a label).
# wherever we encounter that name, we really need the adress of the label.
# so we mark the function with an adress type.
return [:addr, clean_method_name(name)]
end
This is really more or less just tearing out the method part, and brings it roughly back to how it was before we added the method calling bit.
But compile_defm is also somewhat simpler than the old method part:
# Compiles a method definition and updates the
# class vtable.
def compile_defm(scope, name, args, body)
scope = scope.class_scope
f = Function.new([:self]+args, body, scope) # "self" is "faked" as an argument to class methods.
@e.comment("method #{name}")
body.depth_first do |exp|
exp.each do |n|
scope.add_ivar(n) if n.is_a?(Symbol) and n.to_s[0] == ?@ && n.to_s[1] != ?@
end
end
cleaned = clean_method_name(name)
fname = "__method_#{scope.name}_#{cleaned}"
scope.set_vtable_entry(name, fname, f)
# Save to the vtable.
v = scope.vtable[name]
compile_eval_arg(scope,[:sexp, [:call, :__set_vtable, [:self,v.offset, fname.to_sym]]])
# add the method to the global list of functions defined so far
# with its "munged" name.
@global_functions[fname] = f
# This is taken from compile_defun - it does not necessarily make sense for defm
return [:addr, clean_method_name(fname)]
end
The most important changes?
scope.class_scope at the top.__set_vtable near the bottom instead of hardcoding the asm to set the vtable entry.So, why those changes?
If you look at GlobalScope, we've added a @class_scope instance variable
that is initialized to this:
@class_scope = ClassScope.new(self,"Object",@vtableoffsets)
This means that if you have a :defm occurring in the global scope (we'll get to that)
it will be compiled in that class scope, as a method of class Object.
Poof and our functions are (almost) gone.
(In ClassScope we just return self from the new class_scope method.)
For the second change, adding __set_vtable is just the purist in me
wanting to expel as much assembler as possible. Here it is (from lib/core/class.rb):
# FIXME: This only works correctly for the initial
# class definition. On subsequent re-opens of the class
# it will fail to correctly propagate vtable changes
# downwards in the class hierarchy if the class has
# since been overloaded.
%s(defun __set_vtable (vtable off ptr)
(assign (index vtable off) ptr)
)
Ok, so not just the purist in me. In fact, this makes re-opening classes
"cleaner" in that this function will actually get reasonably complex as we
fix this issue, and also later handle define_method and otherwise deal
with cases where no vtable slot is actually available.
Anything else? A few pieces of cleanups to deal with the scope changes, and this little gem / turd depending on your point of view, in lib/core/core.rb:
+
+# OK, so perhaps this is a bit ugly...
+self = Object.new
+
Ehm. We probably should put that in the compiler, or at least prevent arbitrary assigning to self in user code. But it works for now, and we've done worse in the past. To reiterate: Get things working first. Make them fast/pretty later.
The last vital change is this, in Parser#parse_def:
- return E[pos, :defun, name, args, exps]
+ return E[pos, :defm, name, args, exps]
end
It does what it looks like - make the parser for the Ruby code spit out
only :defm nodes instead of :defun nodes. Our code is now free of
functions anywhere outside of s-expressions.
These changes make this work as expectd:
class Foo
def hello
puts "hello"
end
end
def hello_world
puts "hello world!"
end
%s(puts "foo") # Calls the C function
Foo.new.hello # Calls Foo#hello which calls Object#puts
puts "world" # Calls Object#puts (via the "main" instance)
hello_world # Calls Object#hello_world (via the "main" instance)
Adding this on the end on the other hand still fails (but it works in
MRI) because __set_vtable doesn't propagate the vtable update
downwards to subclasses of Object:
Foo.new.hello_world
When re-opening a class, new methods needs to be added "properly" by processing all child classes and updating their vtables.
The alternative (which moves the cost from the point of defining a method to the point of calling the method), is to implement a proper default method missing that ascends the class hierarchy to look for an implementation before giving an error.
We really should do both, since we eventually need to handle methods without vtable entries.
But that's for a future installment
This is a sort-of interlude to my regular compiler series.
The goal is to give a brief overview of some techniques for implementing closures in a programming language. I will use C for my examples, mostly because it's low level enough that a further translation to assembler etc. is straight forward, and many compilers target C directly anyway.
Before we start, most of the code examples in this post are available in this Gist so you don't need to cut and paste bits and pieces (which won't work anyway, as the text below will omit details such as header includes)
A closure) is in it's simplest form a block of code that can be passed around as a value, and that can reference variables in the scope it was created in even after exiting from that scope.
(For a more formal description look at the Wikipedia page linked above)
A Ruby block forms a closure, for example:
def foo x
printf "x is %d\n",x
lambda do
x += 1
printf "block: x is %d\n",x
end
end
c = foo(5)
c.call
c.call
Every time foo is called it will return a closure that has access to the x that was
passed in to foo. In the example, x will start out at 5 and get incremented to 7.
The most important implication of this is that any variables used in the closure must be guaranteed to live as long as the closure does.
We'll get back to this example throughout this article.
There are a number of ways of handling this, for example:
Each of these have advantages and disadvantages.
We're going to look at the rewriting method mainly, though most of what you find below also apply to the copying method.
These methods are pretty similar - they both involve returning a pointer to the function implementing the code in the closure combined with a pointer to the data in the closure. The difference lies in how that data is accessed.
The rewriting approach creates the closure environment as early as possible, and changes the surrounding function to refer to that environment. It incures the closure creation cost whenever the closure generating function is created, but since local variables are initialized straight into the environment it avoids later copying.
The copy approach delays the creation of the closure environment as much as possible, and then copies the data. It can avoid unnecessary creation cost if there are paths that don't lead to creation of a closure, but when the creation happens, it needs to handle full copying of any objects or object references involved, which may be more expensive.
For both of these methods, care must be taken to create a single environment for any set of closures created during the same execution of the function that creates the closures.
When returning a closure we also need to be able to pass along the environment.
As it turns out, there are a number of ways of handling this:
Proc, and simply treat the
variables used as instance variables of that object.
The "fat pointer" approach is simple but involves more work at every call site, and doubles the size of the data that is passed around.
Turning it into an object is simple, and for my Ruby compiler it's even necessary a lot
of the time since most of the time when you handle blocks in Ruby, you'll actually get a
Proc object. But it has the full overhead of method dispatch.
The "half-assed object" approach still requires each call site to do a little bit more work, but less than the fat pointer approach. It also doesn't require additional data to be passed around (the function pointer is stored in the environment instead of copied around)
Creating a "thunk" also has overhead, but isn't as scary as it may sound - the code to create is very simple, and really it consists of copying a few bytes around.
We'll start with a "sort of" object, and then take a look at the thunk approach.
C lacks pretty much everything that could make this convenient and easy, so it really lays the implementation bare, for better or worse.
First, let's create a structure to hold the function pointer and environment:
struct closure {
void (* call)(struct closure *);
int x;
};
If we had more local variables in foo, we'd add them to this structure.
Then we need to create a function with the code for the lambda block:
void block(struct closure * env) {
env->x += 1;
printf ("block: x is %d\n", env->x);
}
Finally we can implement foo:
struct closure * foo(int x)
{
struct closure * closure = (struct closure *)malloc(sizeof(struct closure *));
closure->x = x;
printf ("x is %d\n",closure->x);
closure->call = █
return closure;
}
Ewww..
A couple of observations: If we want to be able to return multiple closures (say, an array of them), the variables needs to be acessed via one more indirection, which makes this even more disgustingly convoluted.
It's also annoying, because it means lots of extra overhead in order to handle a situation that might very well never arise.
If the language requires supporting multiple closures (like Ruby), a compiler could support both approaches to optimize - adding the cost of the extra redirection only:
For an example of these restrictions, consider:
def foo x
a = 1
b = 2
if x
return lambda { a += 1 }
else
return lambda { b+= 1 }
end
end
First of all, this function is guaranteed to only return one lambda at the
time, so applying that rule, we can just assing the appropriate function
pointer to the call member variable, and do away with the extra indirection.
Secondly, even if we decide to return both of them at the same time, they are guaranteed to never access the same variables, so we could instead create two distinct closure environments, and still avoid the extra indirection.
But let's take a look at the complete example with the extra indirection anyway, as a worst case scenario:
First we create an environment for the free variables we wish to "capture":
struct env {
int x;
};
Our modified closure structure holds a pointer to it, instead of holding the variables:
struct closure {
void (* call)(struct env *);
struct env * env;
};
The block takes a pointer to the environment:
void block(struct env * env) {
env->x += 1;
printf ("block: x is %d\n", env->x);
}
And the closure function itself needs to first allocate the environment, and then the closure:
struct closure * foo(int x) {
struct env * env = (struct env *)malloc(sizeof(struct env));
env->x = x;
printf ("x is %d\n",env->x);
struct closure * closure = (struct closure *)malloc(sizeof(struct closure *));
closure->env = env;
closure->call = block;
return closure;
}
Finally we call it with the env:
int main() {
struct closure * c = foo(5);
c->call(c->env);
c->call(c->env);
}
The example above is actually really simple to use to illustrate the fat pointer approach. Instead of returning a pointer to the closure object, we simply return the object itself:
struct closure foo(int x)
{
struct env * env = (struct env *)malloc(sizeof(struct env));
env->x = x;
printf ("x is %d\n",env->x);
struct closure closure;
closure.env = env;
closure.call = block;
return closure;
}
int main() {
struct closure c = foo(5);
c.call(c.env);
c.call(c.env);
}
As you can see it makes some things simpler (no need for the second memory allocation step).
An optimization worth keeping in mind is variable substitution. In cases where it can be guaranteed that the variables in question keeps the same value, the need for a separate environment may go away if variables are substituted for their values in the closures body.
Furthermore, if the variables can be guaranteed not to change in any closure returned from the function, then whether or not the variable changes in the function outside the closures, the closures may keep separate environments (and hence do the optimization to avoid indirection) with copies of the variables in question.
Of course this would require the function to carry out any updates once for each generated closure.
(you might have spotted here one of the advantages for functional languages with little or no mutation of variables - they have a lot fewer issues to worry about with respect to sharing of viarable state)
Now that we have the basics down, how do we return just a "plain" function pointer, so we can simplify the call sites?
What we want to create is something like the code below.
Note that we take a shortcut and don't create a proper stack frame - this kind of thing can easily confuse gdb and other debuggers, which is not necessarily very nice.
The thunk below also does not directly allow passing any arguments to the block - if we wanted to do that it gets hairier, since we'd need to manipulate the parameters passed, while the caller doesn't know we've altered the size of the parameter space set aside. In a compiler this would likely be solved by making either the caller or callee aware that it's a closure call, and adjusting the stack accordingly separate from the thunk.
Of course, the downside of using asm here is that it's architecture specific, and the code to generate the thunk will need to be modified accordingly to port the code, but then if you do this in a compiler that is outputting native assembly, you have to do that anyway.
my_closure_instance:
pushl $my_environment ; Push the environment onto the stack as the first arg
call $the_block ; Go to the real code
addl $4,%esp ; Throw the environment pointer away
ret ; Return to the caller
Putting some arbitrary values in there, and assembling it with gcc/gas, and then doing
objdump -D to the resulting binary gives this (and other bits and pieces
I've cut - use the -nostdlib option to avoid dealing with a bunch of initialization
code):
08048055 :
8048055: 68 00 08 af 2f push $0x2faf0800
804805a: e8 00 00 00 00 call 804805f
804805f: 83 c4 04 add $0x4,%esp
8048062: c3 ret
This shows us the values to put in our thunk. Couple of observations: The push is just followed by the address itself, but the call uses an offset from the first byte of the following instruction.
This approach of using structs to generate the thunk, I've blatantly stolen from Joe Damato
because I like how it makes the code that manipulates the thunk more readable:
struct __attribute__((packed)) thunk {
unsigned char push_op;
void * env_addr;
unsigned char call_op;
signed long call_offset;
unsigned char add_esp_ops[3];
unsigned char ret_op;
};
struct thunk default_thunk = {0x68, 0, 0xe8, 0, {0x83, 0xc4, 0x04}, 0xc3};
(the __attribute__ stuff is the gcc specific way of avoiding padding for alignement)
We change the rest like this:
typedef void (* cfunc)();
cfunc foo (int x) {
struct env * env = (struct env *)malloc(sizeof(struct env));
env->x = x;
printf ("x is %d\n",env->x);
struct thunk * thunk = (struct thunk *)mmap(0,sizeof(struct thunk), PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
*thunk = default_thunk;
thunk->env_addr = env;
thunk->call_offset = (void *)&block - (void *)&thunk->add_esp[0]; // Pretty!
mprotect(thunk,sizeof(struct thunk), PROT_EXEC);
return (cfunc)thunk;
}
The typedef is a workaround for C's absolutely atrocious ptr-to-function declarations...
The interesting bit is at the end where we allocate the thunk, and fill in the addresses
Then we cast the thunk data structure to a function pointer. That ought to make you feel dirty, and a bit queasy. It's ok, though.
We use mmap to avoid problems on systems with executable heaps turned off (execshield etc. that wil cause a segmentation fault if you try to execute code in malloc()'d memory or on the stack), and the mprotect() turns off write access to the page after we're done. For a production approach you may want a dedicated allocation function to properly manage this and perhaps avoid doing a separate mmap for every thunk created.
All of this could really be wrapped up into a nice generic function. Something like this:
struct thunk * make_thunk(struct env * env, void * code) {
struct thunk * thunk = (struct thunk *)mmap(0,sizeof(struct thunk), PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
*thunk = default_thunk;
thunk->env_addr = env;
thunk->call_offset = code - (void *)&thunk->add_esp[0]; // Pretty!
mprotect(thunk,sizeof(struct thunk), PROT_EXEC);
return thunk;
}
Finally, this makes our main function look like this:
int main() {
cfunc c = foo(5);
c();
c();
}
Now that's nicer...
Since this is already gcc/Linux/x86-32 specific, it can be made even nicer. gcc supports inner functions, and with a tiny bit of restructuring and a couple of macros I did this, mostly for fun to see how close to a "natural" syntax for closures I could get in C without changing the compiler:
#define initenv(__vars__) struct env { __vars__ ; } * env = (struct env *)malloc(sizeof(struct env));
#define new_closure(__block__) (closure)make_thunk(env,&__block__)
closure foo (int x)
{
initenv(int x)
env->x = x;
printf ("x is %d\n",env->x);
void block (struct env * env) {
env->x += 1;
printf ("block: x is %d\n", env->x);
}
return new_closure(block);
}
I'll still stick to Ruby...
I wrote this while exploring various approaches to add closure support to my
Ruby compiler. I haven't quite made up my mind yet. The object approach is
tempting because Ruby already has the Proc class, but I'll probably go for
an environment + fat pointer approach that will be converted into a Proc
object if assigned to anything (as opposed to just used for yield)
The thunk approach is somewhat appealing too, but if turned into a Proc object it may be as much or
more overhead than the fat pointer approach (and there will only be one
call site: In Proc#call) and this approach would mean the fat pointer
wouldn't be passed around much - typical usage would be passing a block to
a method, and then yield'ing to it, where the yield would be the only
call site.
A couple of parts ago we established some of the problems with supporting even the
seemingly simple attr_reader, attr_writer and attr_accessor. To reiterate,
a naive implementation looks like this:
def attr_accessor sym attr_reader sym attr_writer sym end def attr_reader sym define_method sym do %s(ivar self sym) end end def attr_writer sym define_method "#{sym.to_s}=".to_sym do |val| %s(assign (ivar self sym) val) end end
We resolved the issue of having a working sym.to_s, but that still leaves a number
of things to address. For example, we need at least the start of a "proper" String
class that we can add #to_sym to. That's what we'll look at this time.
The first bit is very similar to what we did for Symbol: We won't type tag, and we'll
create String objects for literal strings by calling a low level function.
MRI "hides" the gory details of String and other basic classes by not making them real Ruby objects per-se. There's data there that is simply not accessible without dipping into the C API. We may eventually want to take a similar tack - add a mechanism for allocating space for, setting and retrieving low level variables via the s-expression syntax that are completely invisible for the higher level API.
Currently we don't. The "raw" string used for Symbol and for String objects is a normal instance variable, and it's perfectly possible to expose it, and accessing it thinking it's a real object will trivially easily cause nasty crashes.
In general, we would eventually benefit from creating a mechanism to "hide" the low level plumbing from accidental access by normal Ruby. For now we'll retain the ability to shoot ourselves in the foot easily.
The String class will for now contain just a pointer to a C-style string. It will be immutable in it's first incarnation. This gets us a decent step further very easily - we can just keep the address to a C-style string constant.
As it turns out, this will be an easy foundation to build on:
MRI uses a number of flags for strings, for example to do "copy on write". Once we need to make strings mutable, we'll do the same. Strings created from constants will start out marked as "copy on write", and when doing an update, the string buffer will be copied into freshly allocated space first. The hope is that many string constants will never be updated.
We also need to implement String#to_sym. This is thankfully also reasonably easy:
We need to access the low level function we added that will retrieve a symbol based
on a "raw" c-style string.
As the last couple of times, this was unfortunately committed a bit piecemeal, so the commits aren't easy to follow. But lets go through this step by step.
First of all, we're going to rewrite the tree, to alter a literal string into
%s(call __get_string ConstantReferringToTheRawString)
So, we alter compile to add a call to rewrite_strconst:
# # Re-write string constants outside %s() t # %s(call __get_string [original string constant]) def rewrite_strconst(exp) exp.depth_first do |e| next :skip if e[0] == :sexp is_call = e[0] == :call e.each_with_index do |s,i| if s.is_a?(String) lab = @string_constants[s] if !lab lab = @e.get_local @string_constants[s] = lab end e[i] = [:sexp, [:call, :__get_string, lab.to_sym]] # FIXME: This is a horrible workaround to deal # with a parser inconsistency that leaves calls # with a single argument with the argument "bare" # if it's not an array, which breaks with this rewrite. e[i] = [e[i]] if is_call && i > 1 end end end end
Effectively this visits all nodes that are not s-expressions, and if it finds a string, it will add a string constant. It will then use the label that is allocated to rewrite the expression like this:
[:sexp, [:call, :__get_string, lab.to_sym]]
As the following line says, it's a nasty workaround, and it will be torn out as soon as I get a parser, so just ignore that... pretend it's not even there ;)
Now, the reason we wrap the call to __get_string is that otherwise it will be
rewritten to a method call because there really is no such thing as function calls
in Ruby, but for some of our low-level plumbing we need actual C-style function calls
(such as for calling out to C code).
We're also changing strconst, the method used to get the "value" of a string constant
as follows:
- def strconst str - lab = @string_constants[str] - return lab if lab - lab = @e.get_local - @string_constants[str] = lab - return lab + def strconst(a) + lab = @string_constants[a] + if !lab # For any constants in s-expressions + lab = @e.get_local + @string_constants[a] = lab + end + return [:addr,lab] end
What's left? Not much, really. You can see the basi c String class
Here's the important bit:
class String def __set_raw(str) @buffer = str end end def __get_string(str) s = String.new %s(callm s __set_raw (str)) s end
It's worthwhile reading the comments in at the link above if you want to see more about where this is heading.
We'll flesh out the String class in coming parts, but this groundwork now means we're working on a "real" object instead of on a raw pointer to a c-style string, which means it's mostly a runtime library issue instead of requiring much in terms of compiler changes.
One of the things worth mentioning that is covered in one of the comments in lib/core/string.rb
is that the whole __get_string and __get_symbol thing is a bit ugly. As I mentioned
earlier in this part, hiding the low level implementation would be nice, and that includes
preventing direct calls to __get_string and _get_symbol, and that can be done trivially
by simply ensuring the Ruby code can't directly do function calls (as opposed to method
calls). But that doesn't solve String#__set_raw. To fix that, I'll need to either remove
the need for the method, or by making it possible to "hide" certain methods entirely from
the Ruby code...
It's not a priority now, though - first we get things working, then we make it pretty.
What about String#to_sym? Well, this ought to do the trick:
class String def to_sym __get_symbol(@buffer) end end
So, that leaves us with a working define_method and %s(lambda) with support for variables
to get attr_reader,attr_writer and attr_accessor working... We're heading down that
road soon.
A week or so ago, the Ruby Draft specification made the rounds
Yes... Ruby is finally getting a standard. While RubySpec has been around for a while, and is a great, it is an executable specification that tells you what, not why, and it aims to be "complete" while the new project aims to define a shared core suitable for implementation by all the different Ruby implementations that are springing up.
I think they're complementary enough that the Ruby Draft Specification could be very beneficial.
For my compiler project this means there's now a compelling reason to revisit some things (such as the parser) to look at conformance, and a clear roadmap for some others (classes) to ensure it at least meets the spec.
Of course, in many ways MRI will still be the benchmark, but the standard will provide a minimal level of conformance that will likely be easier to achieve.
In particular, the spec includes a formal grammar for Ruby that doesn't involve walking through thousands of lines of code written for another implementation to verify that you're doing things right. I'm particularly looking forward to going through my parser and aligning it more closely with the draft spec... (I'm sure I'll be swearing a lot while doing it, though)
It also includes descriptions of expected semantics for things like the object model that have previously been something a lot of people (including me) have spent time revers engineering to figure out.
Combined with all the Ruby implementations in progress, this is a clear indication that Ruby is growing up and becoming a contender in areas where it would previously not be acceptable.
Keep an eye on the Ruby standard project...
E-mail: vidar@hokstad.com
Skype: vhokstad
Twitter: vhokstad
View my LinkedIn profile.
I was born April 21st, 1975, in Oslo, Norway. Since 2000 I've been living in London, UK. I'm married and we just had our first child, Tristan Ikemefuna Hokstad.
I'm working for Aardvark Media as Director of Technology. I'm also currently on the board of SpatialQ, a startup in the GIS space, and an advisor to Skoach, a startup doing a time management app for people with ADD.