Vidar Hokstad V2.0

Tag: typing

2008-05-13 13:22 UTC - Giant balls of typeless source files

Cedric posted this entry about an article by Steve Yegge called Dynamic languages strike back (it's slightly interesting, but nothing really new if you already like dynamic languages, though the section on trace trees is well worth a read). Most of Cedric's comments were unremarkable, but this made me cringe:

What will keep preventing dynamically typed languages from displacing statically typed ones in large scale software is not performance, it's the simple fact that it's impossible to make sense of a giant ball of typeless source files, which causes automatic refactorings to be unreliable, hence hardly applicable, which in turn makes developers scared of refactoring. And it's all downhill from there. Hello bit rot.

First of all, let me be generous and assume he meant source files without type annotations, because most prominent dynamic languages are strongly typed. In Ruby, for example, everything is an object, and every object has a type and a class (note the distinction: a Ruby object has a type that may or may not coincide with just the class, because of the concept of eigenclasses, which allow you to extend a single object without changing the class).
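To make the type/class distinction concrete, here is a minimal sketch. The Duck class and the #swim method are hypothetical names made up for illustration; the point is only that one object gains a method through its eigenclass while its class stays untouched.

```ruby
# Hypothetical class used only for this illustration.
class Duck
  def speak
    "quack"
  end
end

plain = Duck.new
odd   = Duck.new

# Define a method on the eigenclass (singleton class) of `odd` only.
def odd.swim
  "paddling"
end

puts plain.class == odd.class     # both are still instances of Duck
puts odd.swim                     # only the extended object responds
puts plain.respond_to?(:swim)     # the other Duck is unaffected
p odd.singleton_methods           # methods that live in the eigenclass
```

Both objects report the same class, yet they no longer accept the same set of messages, which is why "type" and "class" are not interchangeable here.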

This makes the argument slightly more tenable, but not much so. Type annotations, or the lack of them, are not a distinguishing feature of static languages vs. dynamic. Many static languages, such as Haskell, can forgo most (in some cases all) type annotations because of good type inference engines.

But one of the things I love about Ruby is the lack of unnatural type restrictions. In fact, the few times I've come across code with explicit type annotations (using #is_a? etc.) it's usually been a hindrance rather than a help, because the implementor of the class in question made assumptions about what classes could be provided as input that just plain were more restrictive than they needed to be.
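A minimal sketch of the kind of over-restriction described above. Both method names are hypothetical; the strict version checks the class with #is_a?, while the relaxed version only relies on the argument responding to #each.

```ruby
# Overly strict: rejects anything that is not literally an Array,
# even though the body only needs to iterate.
def total_strict(values)
  raise ArgumentError, "expected an Array" unless values.is_a?(Array)
  sum = 0
  values.each { |v| sum += v }
  sum
end

# Relaxed: any object that implements #each will do.
def total_relaxed(values)
  sum = 0
  values.each { |v| sum += v }
  sum
end

puts total_strict([1, 2, 3])
puts total_relaxed(1..3)          # a Range works fine here

begin
  total_strict(1..3)              # same data, rejected by the class check
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

The two bodies are identical; the only thing the #is_a? check adds is a refusal to work with perfectly usable input.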

The "giant balls of typeless source files" just have yet to be a problem for me. First of all, they're not giant balls - I find myself routinely writing far less code to achieve the same goals than what I'd do in most other languages that I know.

But more than that, I end up writing code where type largely doesn't matter, and where the small bits of type that do matter are clearly documented. The small bits that do matter are NOT type or class names, but things like this:

  • Argument "foo" must implement #each to iterate over a collection.
  • Argument "bar" must support Comparable.
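As a rough sketch of what such documentation can look like in practice: the method below states its requirements in a comment and nothing else. The names items_after and Version are hypothetical, invented for this example.

```ruby
# Returns the elements of foo that sort strictly after bar.
#
# Argument "foo" must implement #each to iterate over a collection.
# Argument "bar" must support Comparable.
def items_after(foo, bar)
  result = []
  foo.each { |item| result << item if bar < item }
  result
end

# Hypothetical class that supports Comparable by defining #<=>.
class Version
  include Comparable
  attr_reader :parts

  def initialize(str)
    @parts = str.split(".").map { |part| part.to_i }
  end

  def <=>(other)
    parts <=> other.parts
  end

  def to_s
    parts.join(".")
  end
end

p items_after(1..5, 3)
versions = [Version.new("1.2"), Version.new("1.10")]
p items_after(versions, Version.new("1.9")).map { |v| v.to_s }
```

The same method works on a Range of integers and an Array of Version objects, because neither requirement mentions a class name.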

Coupling tends to be far reduced, allowing me to reason about, and test, the code in much smaller units. Which again means that I rarely - if ever - need to even consider "giant balls" of source files at all. Most of my Ruby applications are on the order of a few hundred to a few thousand lines of code, but they build on a shitload of libraries. And while I like having the source to them in case I need a new feature, I can honestly say that I've never looked at the source of most of them for any reason other than curiosity about how they do something.

Most of the time, I won't look at source or even documentation. I'll be doing something like this:

$ irb -rmodel
irb(main):001:0> (Item.methods - Object.methods).sort
=> ["[]", "add_hook", "after_create", "after_destroy", "after_initialize", "after_save", "after_update", "all_association_reflections", "associate", "association_reflection", "associations", "before_create", "before_destroy", "before_save", "before_update", "belongs_to", "cache_key_from_values", "cache_store", "cache_ttl", "columns", "create", "create_table", "create_table!", "create_with", "create_with_params", "database_opened", "dataset", "db", "db=", "def_hook_method", "delete_all", "destroy_all", "drop_table", "fetch", "find", "find_or_create", "has_and_belongs_to_many", "has_hooks?", "has_many", "has_validations?", "hooks", "implicit_table_name", "is", "is_a", "is_dataset_magic_method?", "join", "load", "many_to_many", "many_to_one", "method_missing", "no_primary_key", "one_to_many", "one_to_one", "plugin_gem", "plugin_module", "primary_key", "primary_key_hash", "schema", "serialize", "set_cache", "set_cache_ttl", "set_dataset", "set_primary_key", "set_schema", "skip_superclass_validations", "subset", "super_dataset", "table_exists?", "table_name", "validate", "validates", "validates_acceptance_of", "validates_confirmation_of", "validates_each", "validates_format_of", "validates_length_of", "validates_numericality_of", "validates_presence_of", "validations"]
irb(main):002:0> 

... and I tend to find what I'm looking for.
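The same trick works outside irb. Here is a small self-contained sketch with a hypothetical Widget class standing in for a real model; note that on Ruby 1.9 and later the method names come back as symbols rather than the strings shown in the session above.

```ruby
# Hypothetical stand-in for a model class; the methods do nothing useful.
class Widget
  def self.find(id); end
  def self.create(attrs = {}); end
  def self.table_name
    "widgets"
  end
end

# Subtract everything every object has, leaving what is specific to Widget.
class_methods = (Widget.methods - Object.methods).sort
p class_methods
```

Subtracting Object.methods is what keeps the list short enough to scan by eye.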

I have so far not once felt even the slightest need to use a refactoring tool with Ruby. I've never once done a mass renaming of methods in Ruby that spanned more than a single file, or where emacs search-and-replace wasn't more than sufficient.

This is an impedance mismatch: an idea of what is necessary that just doesn't match reality for a lot of people working with dynamic languages.

I don't want a refactoring tool. It doesn't even place in my top 10 of features or functionality I'd like to have for Ruby, to the point where I've never looked to see if one already exists.

Does that mean I don't refactor? No it doesn't. But refactoring when the module you're working on is measured in hundreds of lines and a handful of files, and is meaningfully separate and not coupled to the rest of your code, is pretty trivial, and not something that needs tool support.

What it boils down to is that the very need for advanced refactoring tools is a big red flashing warning sign. It means the language has failed in making life easy for you and/or you have failed as a designer.

It is one of the reasons I truly loathe highly coupled systems - it's a design smell that I've had to endure enough in the past that I'll go to great lengths to avoid it. The result is better software, with far more reusability and maintainability.

And systems without giant balls of source files - typeless or not.


2008-04-05 19:19 UTC - Why am I forced to optimize when choosing my language?

Martin C. Martin wrote a great post on performance and dynamic languages. I particularly loved this quote:

Premature optimization is the root of all evil. Why am I forced to optimize when I am choosing my language, before writing a single line of code?

The post and comments are both well worth a read. I'm frustrated myself about the performance of dynamic languages, and while I don't particularly like the idea of annotating programs to speed them up, I am a strong believer in designing dynamic languages to make the job easier for a compiler when it can be done without making things harder for the users. Optional type annotation, though, can definitely be in that category if done right.

That said, I also believe that the compilers can be made smart enough. I do have some ideas on making Ruby run at near native speeds, for example, and I plan on writing more about that later. They are still ideas, though, and given the amazing efforts of the Rubinius team and others, hopefully I won't have to actually try to implement them...

Overall, the best approaches to improving the speed of dynamic languages seem to be to use stats and analysis to guess a lot of static or semi-static properties of the program, and then cop out and include a full JIT code generator in the runtime system in order to be able to reverse various optimizations or fall back on a "slow path" when your assumptions were wrong. In Ruby at least, which is the dynamic language I know best, there is plenty of room to do that, and it's also pretty close to the approach the Rubinius team is taking with their "send sites", and also to what Smalltalk VMs do (which is where Rubinius got it from).
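To illustrate the "guess, then fall back on a slow path" idea, here is a toy sketch in plain Ruby of a call site that caches the method it found for the last receiver class. This is not how Rubinius or any Smalltalk VM implements it; the SendSite class, its counters, and its behaviour are assumptions made up for illustration, and it deliberately ignores singleton methods and method redefinition, both of which a real runtime would have to handle by invalidating the cache.

```ruby
# Toy call-site cache (hypothetical, for illustration only).
class SendSite
  attr_reader :hits, :misses

  def initialize(name)
    @name = name
    @cached_class = nil
    @cached_method = nil
    @hits = 0
    @misses = 0
  end

  def call(receiver, *args)
    if receiver.class.equal?(@cached_class)
      # Fast path: the guess about the receiver's class held.
      @hits += 1
    else
      # Slow path: the guess was wrong, so do a full lookup and re-cache.
      @misses += 1
      @cached_class = receiver.class
      @cached_method = receiver.class.instance_method(@name)
    end
    @cached_method.bind(receiver).call(*args)
  end
end

site = SendSite.new(:to_s)
results = [1, 2, "x", "y"].map { |obj| site.call(obj) }
p results
puts "hits=#{site.hits} misses=#{site.misses}"
```

As long as the same class keeps showing up at a given call site the lookup is skipped; when a different class appears, the site simply pays for one full lookup and carries on.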

I believe a lot of those techniques can be carried over to a "mostly static" native compiler (I'm sure they already have been, but I haven't had nearly as much time as I'd like to read up on the work that's been done in that area), and I hope to eventually get some chance to explore that. Some of you may be aware that I'm posting a series on writing a compiler in Ruby. What I've posted so far is trivial stuff, but towards the end of the parts I've written but not yet posted, it's getting close to being what I want in a toy platform for experimentation with stuff like this.


About me

E-mail: vidar@hokstad.com
Skype: vhokstad

I was born April 21st, 1975, in Oslo, Norway. Since 2000 I've been living in London, UK. I'm married.

I'm working for Aardvark Media as Director of Technology. I'm also currently on the board of SpatialQ, a startup in the GIS space, and an advisor to Skoach, a startup doing a time management app for people with ADD.
