This is part of a series I started in March 2008 - you may want to go back and look at older parts if you're new to this series.
To be or not to be nil
One of the big elephants sauntering around the room for a long time has been
the issue of how to handle the specifics of how Ruby handles nil, true and
false. To a lesser extent this issue also affects numbers, but it is those
three values that are most critical right now.
The reason is control flow. So far, we've treated these values the way C does:
nil is simply the null pointer;
true is any non-zero value, and false is zero
(and thus for most practical intents the same as nil).
The problem, of course, is that this is not the way it is in Ruby.
true, false and nil are values distinct from the numbers, and they compare with each other
and with other values in different ways than in C.
They are also objects. Which means we lose out on some of the simplest ways of
doing comparisons and turning the comparison results into a value. We may find
people doing things like
if <some expression>.nil?.
nil and false both evaluate to false in a conditional, but
nil != false.
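For reference, the behaviour we need to match can be checked in stock Ruby (MRI). This is only a minimal sketch; the helper name truthy? is made up for this illustration and is not part of the compiler:

```ruby
# Hypothetical helper: reports which arm of an `if` a value selects.
def truthy?(value)
  value ? true : false
end

p truthy?(nil)    # false
p truthy?(false)  # false
p truthy?(0)      # true  - unlike C, zero is not "false" in Ruby
p nil == false    # false - nil and false are distinct objects
p nil.nil?        # true
p false.nil?      # false
```

The third line is the one that bites us: with the C-style encoding, 0, nil and false are all the same bit pattern.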
So far "faking it" has worked, because with a few exceptions like the ones above, the C and Ruby variations are relatively compatible. But it's not a lasting solution.
There is another problem: If we change basic constructs like the s-expression %s(if ..) to
work on Ruby objects, we'll find it hard to implement the "plumbing" under Ruby.
Introducing some static typing
Don't bring out the pickaxe just yet (groan). As it happens, our compiler compiles two very different languages: The s-expression inspired low level language used both as the compilation target for Ruby and implementation language for low level features, and Ruby itself.
The former language is de facto typeless, like BCPL: We pass values around with wild abandon, and we even clobber Ruby local variables and instance variables with it, but what meaning these values have depends entirely on usage rather than the type of the variable (as in C) or a type attached to the value itself, like in Ruby.
And as it happens, here lies both the problem and solution to our conundrum from above:
If only the compiler can know when it is dealing with real Ruby values, and when
it is dealing with something else, then, e.g.
compile_if can generate different
code in these situations.
Not only that: We will need this information when we eventually get tired of leaking memory and start adding a garbage collector - otherwise we're stuck with a conservative collector, so we get twice the benefit.
It will also help us contain the "leakage" of untyped values into Ruby, by letting us define and narrow the rules for when and where and how we're allowed to work with them.
As it happens, we don't need a very complicated type-system: For now we can get away with knowing if a reasonable subset of constructs returns either an object or may contain anything.
That's it. That's the grand total of the static typing we'll introduce this time.
However the changes start laying the groundwork for more static typing that we can use for optimizations and sanity checks. Ultimately I wish to relegate the "s-expression plumbing" to a very restricted space.
Apart from just categorizing stuff into two types, there's another limitation too
for now: Where we act on type information, we will treat all variables as typed to
objects, and all return variables from method calls to be typed as objects. We
will implicitly assume that the s-expression syntax will be contained, though we
are not yet verifying that. In some cases this will be outright wrong. E.g. this
prevents if foo; bar; end from working correctly if
foo is not an
object, and happens to contain 0, and in any number of similar instances, so it
is likely introducing some regressions (I caught one while writing this - there are
First, let's put some basic test cases in place. You'll find them in d22b95f
Then let's start putting our new typing into place. Let's start with a Value class
to hold a possibly typed value (in 3ec81cb):
    require 'delegate'

    # Used to hold a possibly-typed value
    # Currently, valid values for "type"
    # are :object or nil.
    class Value < SimpleDelegator
      attr_reader :type

      def initialize ob, type = nil
        super(ob)
        @type = type
      end

      # Evil. Since we explicitly check for Symbol some places
      def is_a?(ob)
        __getobj__.is_a?(ob)
      end
    end
To simplify refactoring, we have it be a delegator, so we only selectively add/change
behaviour as needed. For now, the only new thing is that
#type will return the associated
type tag, or nil. We only support
:object for now, to indicate we know the value to be a
pointer to a Ruby object.
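To make the delegation concrete, here is a small usage sketch. It repeats the Value class so that it runs on its own under MRI; the wrapped values are made up for the example and do not come from the compiler:

```ruby
require 'delegate'

# Same class as in the post, repeated so the example is self-contained.
class Value < SimpleDelegator
  attr_reader :type

  def initialize(ob, type = nil)
    super(ob)
    @type = type
  end

  def is_a?(ob)
    __getobj__.is_a?(ob)
  end
end

typed   = Value.new([:subexpr], :object)  # known to be a Ruby object
untyped = Value.new(:foo)                 # could be anything

p typed.type             # :object
p untyped.type           # nil
p typed.first            # :subexpr (forwarded to the wrapped Array)
p untyped.is_a?(Symbol)  # true, thanks to the is_a? override
p untyped == :foo        # true (the comparison is forwarded too)
```

Because everything except #type and #is_a? is forwarded, existing code that expects a plain Array or Symbol can mostly keep working while the type tags are rolled out.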
I'm not going to go through every detail of the changes in
compiler.rb. You can find the full set
Apart from a number of changes to return objects of the new
Value class, the main things
to notice are as follows:
We add true, false and nil to @global_constants to prevent them from being treated
as method calls:
    +    @global_constants << :false
    +    @global_constants << :true
    +    @global_constants << :nil
Next up is this change in compile_if:
    -    @e.jmp_on_false(l_else_arm, res)
    +
    +    if res && res.type == :object
    +      @e.save_result(res)
    +      @e.cmpl(@e.result_value, "nil")
    +      @e.je(l_else_arm)
    +      @e.cmpl(@e.result_value, "false")
    +      @e.je(l_else_arm)
    +    else
    +      @e.jmp_on_false(l_else_arm, res)
    +    end
    +
What's happening here is that instead of assuming an untyped value, we check to see if we know we have an object. If we do, and we come across "if result; ...; else ...; end", we change the code to effectively do the equivalent of:
    if result != nil && result != false
      # if block
    else
      # else block
    end
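The difference between the two code paths can be modelled in plain Ruby. This is a sketch under stated assumptions, not the emitted assembly: NIL_ADDR and FALSE_ADDR are hypothetical stand-ins for wherever the runtime's nil and false objects end up in memory, and if_arm_taken? is a made-up name.

```ruby
# Hypothetical addresses of the runtime's nil and false objects.
NIL_ADDR   = 0x1000
FALSE_ADDR = 0x2000

# Returns true if the "if" arm would run, false if we would jump to the else arm.
def if_arm_taken?(raw, type = nil)
  if type == :object
    # Typed path: only the nil and false objects are falsy.
    raw != NIL_ADDR && raw != FALSE_ADDR
  else
    # Untyped path: C-style test against zero.
    raw != 0
  end
end

p if_arm_taken?(0)                    # false
p if_arm_taken?(42)                   # true
p if_arm_taken?(NIL_ADDR, :object)    # false
p if_arm_taken?(FALSE_ADDR, :object)  # false
p if_arm_taken?(0, :object)           # true
```

The last case is the regression risk: a raw 0 that is wrongly assumed to be an object now takes the "if" arm.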
There's an equivalent change for
Furthermore, there are a few minor additional changes to
scope.rb to prevent true/false/nil from
being treated as method calls in 52f31ad3
We also need to check in
transform.rb that we're not trying to treat true, false and nil as
local variables. See f2af5fc
Changes to the runtime
In order to make these changes work, we also need to modify the runtime in various ways.
Most obviously, we need to actually make
true, false and nil real objects. We do that in
    +require 'core/true'
    +true = TrueClass.new # FIXME: MRI does not allow creating an object of TrueClass
    +require 'core/false'
    +false = FalseClass.new # FIXME: MRI does not allow creating an object of FalseClass
    +require 'core/nil'
    +nil = NilClass.new # FIXME: MRI does not allow creating an object of NilClass.
    +
     # OK, so perhaps this is a bit ugly...
     self = Object.new
    @@ -59,9 +66,6 @@ STDERR = 1
     STDOUT = IO.new
     ARGV=7
     Enumerable=8 #Here because modules doesn't work yet
    -nil = 0 # FIXME: Should be an object of NilClass
    -true = 1 # FIXME: Should be an object of TrueClass
    -false = 0 # FIXME: Should be an object of FalseClass
These depend on very basic initial implementations of
TrueClass, FalseClass and NilClass - see c356591
Another change is in
lib/core/fixnum.rb, where all the comparison operators need to change:
     def == other
    -  %s(eq @value (callm other __get_raw))
    +  %s(if (eq @value (callm other __get_raw)) true false)
     end
This is because
%s(eq ..) etc. do not handle typing yet (and they may not necessarily
ever need it), so we use our newly
typed %s(if ..) coupled with explicitly returning the
right objects instead of the numeric values we'd previously get.
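The same idea, modelled in plain Ruby as a rough sketch. BoxedInt and raw_eq are hypothetical names invented for this illustration and are not the real lib/core/fixnum.rb; raw_eq plays the role of %s(eq ..), which yields a machine-level 1 or 0:

```ruby
# Hypothetical stand-in for the compiler's Fixnum; not the real implementation.
class BoxedInt
  def initialize(value)
    @value = value
  end

  def __get_raw
    @value
  end

  # Models %s(eq a b): a machine-level comparison yielding 1 or 0.
  def raw_eq(a, b)
    a == b ? 1 : 0
  end

  # Models %s(if (eq ...) true false): map the raw result to true or false.
  def ==(other)
    raw_eq(@value, other.__get_raw) != 0 ? true : false
  end
end

p BoxedInt.new(2) == BoxedInt.new(2)  # true
p BoxedInt.new(2) == BoxedInt.new(3)  # false
```

Without the conversion, a raw 0 coming back from == would match neither the nil nor the false object, so a typed "if" would treat an unequal comparison as true.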
It is important to do this in particular as one of the changes I snuck past in
assumes that method calls return Ruby objects.
Almost done now, but there's also a minor change to
lib/core/object.rb to remove the horribly
false methods we used previously.
To be or not to be and more?
As it happens, we have a few more things to do:
%s(and ..) and
%s(or ...) need to take
type information into account to be able to generate proper code for e.g.:
    if a and b
      ...
    elsif a or c
      ...
    end
In our new world,
a and b (or
a && b) will always be true, because both
have integer values that are non-null. Similarly
a or c /
a || c will always be true as well,
since both values will be seen to evaluate to true.
First of all, I've added a test case to catch this, in 1dfe043. But one of our other test
cases shows a regression as well.
features/inputs/strcmp.rb now gives wrong results, because
we previously relied on being able to use "plain Ruby"
if to check the result of a call to
strcmp that we stored in a local variable. But for now at least, we're assuming variables
contain objects. We'll likely want to refine that, but for now we'll apply a workaround that
will work (in 32ddcde):
%s(assign res (if (strcmp @buffer (callm other __get_raw)) false true))
By explicitly assigning with the values
false and true, from the result of a value that
will get an indeterminate type, it will work again.
But let's fix "&&"/"and". Firstly we need to actually store the return value from
if_arm and else_arm (in 2ae727d):
    -    compile_eval_arg(scope, if_arm)
    +    ifret = compile_eval_arg(scope, if_arm)
         @e.jmp(l_end_if_arm) if else_arm
         @e.local(l_else_arm)
    -    compile_eval_arg(scope, else_arm) if else_arm
    +    elseret = compile_eval_arg(scope, else_arm) if else_arm
Secondly, we need to determine type based on them. Most importantly, we can only
safely return a type that is shared by both of them if both are
present (also in 2ae727d):
    -    return Value.new([:subexpr])
    +    # We only return a specific type if there's either only an "if"
    +    # expression, or both the "if" and "else" expressions have the
    +    # same type.
    +    #
    +    type = nil
    +    if ifret && (!elseret || ifret.type == elseret.type)
    +      type = ifret.type
    +    end
    +
    +    return Value.new([:subexpr], type)
       end
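Pulled out of the compiler, the rule for combining the two arms looks roughly like this. It is a sketch for illustration only: Typed is a hypothetical stand-in for Value that carries just the type tag, and merged_type is a made-up helper name.

```ruby
# Hypothetical stand-in for the compiler's Value: only the type tag matters here.
Typed = Struct.new(:type)

# Returns the type of an if-expression given the results of its two arms.
# elseret is nil when there is no else arm.
def merged_type(ifret, elseret = nil)
  return nil unless ifret
  return ifret.type if elseret.nil? || ifret.type == elseret.type
  nil
end

p merged_type(Typed.new(:object), Typed.new(:object))  # :object
p merged_type(Typed.new(:object))                      # :object
p merged_type(Typed.new(:object), Typed.new(nil))      # nil
p merged_type(Typed.new(nil), Typed.new(nil))          # nil
```

Falling back to nil (untyped) whenever the arms disagree is the conservative choice: the compiler then uses the old zero test rather than claiming something is an object when it might not be.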
Other than that, we're just adding our missing compile_or.
(EDIT: This implementation is broken; a correct version will be in part 41)
    +  def compile_or scope, left, right
    +    compile_if(scope, left, false, right)
    +  end
And that's it for this time.