Toxic Elephant

Don't bury it in your back yard!

How many s-expression formats are there for Ruby?

Posted by matijs 04/11/2012 at 13h34

Once upon a time, there was only UnifiedRuby, a cleaned up representation of the Ruby AST.

Now, what do we have?

  • RubyParser before version 3; this is the UnifiedRuby format: "foobar(1, 2, 3)"
    # => s(:call, nil, :foobar, s(:arglist, s(:lit, 1), s(:lit, 2), s(:lit, 3)))
  • RubyParser version 3: "foobar(1, 2, 3)"
    # => s(:call, nil, :foobar, s(:lit, 1), s(:lit, 2), s(:lit, 3))
  • Rubinius; this is basically the UnifiedRuby format, but using Arrays:
      # => [:call, nil, :foobar, [:arglist, [:lit, 1], [:lit, 2], [:lit, 3]]]
  • RipperRubyParser; a wrapper around Ripper producing UnifiedRuby: "foobar(1,2,3)"
      # => s(:call, nil, :foobar, s(:arglist, s(:lit, 1), s(:lit, 2), s(:lit, 3)))

How do these fare with new Ruby 1.9 syntax? Let’s try hashes. RubyParser before version 3 and Rubinius (even in 1.9 mode) can’t handle this.

  • RubyParser 3: "{a: 1}"
      # => s(:hash, s(:lit, :a), s(:lit, 1))
  • RipperRubyParser: "{a: 1}"
      # => s(:hash, s(:lit, :a), s(:lit, 1))

And what about stabby lambdas?

  • RubyParser 3: "->{}"
      # => s(:iter, s(:call, nil, :lambda), 0, nil)
  • RipperRubyParser: "->{}"
      # => s(:iter, s(:call, nil, :lambda, s(:arglist)),
      #      s(:masgn, s(:array)), s(:void_stmt))

That looks like a big difference, but this is just the degenerate case. When the lambda has some arguments and a body, the difference is minor:

  • RubyParser 3: "->(a){foo}"
      # => s(:iter, s(:call, nil, :lambda),
      #      s(:lasgn, :a), s(:call, nil, :foo))
  • RipperRubyParser: "->(a){foo}"
      # => s(:iter, s(:call, nil, :lambda, s(:arglist)),
      #      s(:lasgn, :a), s(:call, nil, :foo, s(:arglist)))

So, what’s the conclusion? For parsing Ruby 1.9 syntax, there are really only two options: RubyParser and RipperRubyParser. The latter stays closer to the UnifiedRuby format, but the difference is small.

RubyParser’s results are a little neater, so RipperRubyParser should probably conform to the same format. Reek can then be updated to use the cleaner format, and use either library for parsing.
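In the examples above, the most visible difference between the two formats is the :arglist wrapper around call arguments. As a rough illustration of what conforming to the newer format involves, here is a minimal sketch that splices :arglist nodes into their parent. It works on plain nested Arrays (as in the Rubinius output) rather than on Sexp objects, and flatten_arglists is a hypothetical name, not part of RubyParser or RipperRubyParser.

```ruby
# Hypothetical helper: splice the children of every [:arglist, ...] node
# into the node that contains it, leaving everything else untouched.
def flatten_arglists(node)
  return node unless node.is_a?(Array)

  node.flat_map do |child|
    if child.is_a?(Array) && child.first == :arglist
      # Drop the :arglist tag and hand its arguments to the parent.
      child.drop(1).map { |arg| flatten_arglists(arg) }
    else
      # Wrap in an Array so flat_map keeps this child as a single element.
      [flatten_arglists(child)]
    end
  end
end

old_style = [:call, nil, :foobar, [:arglist, [:lit, 1], [:lit, 2], [:lit, 3]]]
new_style = flatten_arglists(old_style)
```

This only covers the :arglist difference; the other differences shown for the empty lambda would need their own handling.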

Building a Simple Markdown Viewer with GirFFI

Posted by matijs 17/04/2012 at 07h41

This morning, I found myself looking for a simple markdown previewer that would run on the desktop. Using GirFFI, it was ridiculously easy to create it myself.

The simple version, based on the Webkit example in the GirFFI repository, goes something like this:

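require 'ffi-gtk3'
require 'github/markup'

# Generate bindings for WebKit next to the Gtk+ 3 ones loaded by ffi-gtk3.
GirFFI.setup :WebKit, '3.0'

Gtk.init

# A top-level window holding a scrollable web view.
win = Gtk::Window.new :toplevel
scr = Gtk::ScrolledWindow.new nil, nil
wv = WebKit::WebView.new
win.add scr
scr.add wv
win.set_default_geometry 700, 500
win.show_all

# Render the file given on the command line and load the resulting HTML.
file = ARGV[0]
fullpath = File.expand_path(file, Dir.pwd)
html = GitHub::Markup.render fullpath
wv.load_string html, nil, nil, "file://#{fullpath}"

# Leave the main loop when the window is closed.
GObject.signal_connect(win, "destroy") { Gtk.main_quit }

Gtk.main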

I got the basic version working in about 10 minutes. The more complex version adds a keyboard handler to allow reloading the viewed file.

Books for Programmers

Posted by matijs 19/02/2012 at 12h46

My list of all-time-favorite books for programmers. I’m not saying everyone should read these, but each of these had an important impact on my growth as a programmer. These are not necessarily in chronological order, by the way.

First, books that are mostly independent of your choice of programming language:

Design Patterns and Refactoring are not books to be read cover to cover, since they devote quite a large part of their volume to cataloguing. The other two definitely are.

The following books are each really about a particular language. They’re well written, but it’s hard to separate the impact of the books from the impact of the languages.

  • Programming Perl (a.k.a. The Camel Book). This book made me grasp object-oriented programming for the first time by breaking it down to a very basic level. I did most of my learning Perl from this book.
  • Programming Ruby (a.k.a. The Pickaxe Book). I learned Ruby from the free online edition. It got me hooked.

You Need Some Isolation

Posted by matijs 11/12/2011 at 18h08

Something weird just happened. While refactoring GirFFI, I had managed to remove all use of a particular module. So, I removed the corresponding file, ran the tests using

rake test

And the tests passed. Committed, done.

Then, I took a walk down to the library. By the time I got back, as soon as I looked at my code again, there it was: A giant require statement requiring the file I had just removed. Huh, why do my tests pass?

Well, duh, I have GirFFI installed as a gem, and my code is just picking up the missing file from there. So, I run

bundle exec rake test

The tests fail, showing me exactly the line I need to remove. Commit amended, done.

So, the moral of the story: If you’re developing a gem, use your isolation tool of choice, be it Bundler, Isolate, or something else, to shield your gem development environment from older installed versions.
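If you suspect this is happening, one way to check is to look at where the loaded files actually came from. The following is a minimal sketch, not something from the GirFFI code base: stray_features is a hypothetical helper, and the paths in the usage example are made up.

```ruby
# Hypothetical helper: list loaded files that belong to `gem_name` but were
# not loaded from the working copy rooted at `project_root`.
# By default it inspects $LOADED_FEATURES, Ruby's list of required files.
def stray_features(gem_name, project_root, features = $LOADED_FEATURES)
  root = File.expand_path(project_root) + File::SEPARATOR
  features.select do |path|
    path.include?(gem_name) && !File.expand_path(path).start_with?(root)
  end
end

# Usage with made-up paths: the second file comes from an installed gem.
example_features = [
  "/home/me/src/gir_ffi/lib/gir_ffi/builder.rb",
  "/var/lib/gems/1.9.1/gems/gir_ffi-0.0.9/lib/gir_ffi/old_module.rb",
  "/usr/lib/ruby/1.9.1/set.rb"
]
strays = stray_features("gir_ffi", "/home/me/src/gir_ffi", example_features)
```

Called at the end of a test run without the third argument, a non-empty result would hint that an installed copy of the gem is filling in for files missing from the working copy.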

A tiny replacement for RVM

Posted by matijs 31/07/2011 at 17h31

Recently, there was a change in where Debian’s rubygems packages store each gem’s files. Instead of having a separate bin directory for each version of ruby, now both the 1.8 and the 1.9 version store scripts in /usr/local/bin. In fact, they will happily overwrite each other’s scripts. This can be very confusing when you think you’re running a script with Ruby 1.8, but in fact it’s running with 1.9, and hence, 1.9’s set of installed gems.

All this made me seriously consider using RVM. Which was quite shocking, as I consider it to be an ugly hack, both in concept and in execution. So, rather than admitting defeat, I decided to create my own hack.

Choosing a Distributed File System

Posted by matijs 30/06/2011 at 18h15

It’s happening, like it happens to all of us: My hard disk is getting full, and although the free space would have seemed like an ocean just a decade ago, now it’s a worryingly small pool of tiny little gigabytes. I could try freeing up some by tediously going through all the photos I never bothered to cull before, but with GB-sized videos being added on a regular basis, that isn’t a long-term solution. Where long term is anything that will tide me over to my next laptop.

But, what if I could offload some of those files to some other storage medium? I’m not really that fond of external hard disks, but perhaps a file server? Great! You mount some remote directory, and it’s like it’s right there on your machine.

There’s only one problem with that: My computer is a laptop, and as such, it gets carried around. Not a lot, but still. I won’t be able to choose beforehand which files I want to access (again, too tedious). So, what I really want is good offline behaviour.

So, what are the options? After some poking around on Wikipedia, it seems I want a Distributed Fault-Tolerant File System. Look, it says right there:

for […] offline (disconnected) operation.

Yay! What follows is a full evening of reading about Lustre, MooseFS, Tahoe-LAFS, 9P, Ceph, AFS and Coda. The situation is not uplifting:

The only system that actually promises offline operation is Coda. It is derived from AFS, for which this feature was conceived as early as 1997. Unfortunately, not much has happened with Coda for more than a year. This wouldn’t be a problem if it were rock-solid, but there’s a bug that has been open for three years describing how loss of network connectivity during a write can cause both the original file and its replacement to be lost.

The next best thing (feature-wise, at least) is OpenAFS. Another descendant of AFS, it seems more solid. In 2008, offline operation was a Google Summer of Code project. This has been integrated into the main code base, but is disabled by default. It also seems to need explicit commands to go offline and online, which is not ideal.

All the other options really don’t seem to provide offline operation at all. Everyone seems busy developing different flavors of massive petabyte-size storage systems for clusters of machines linked through rock-solid gigabit-per-second or faster networks. Offline operation is clearly not a use case there.

One honorable mention goes to InterMezzo, a descendant of Coda. It seems to have supported offline operation, but managed to become obsolete before its parent, because its developers are now working on Lustre, yet another multi-petabyte high-performance cluster file system.

After all that, where do we stand? There is basically no production-ready solution for my needs, so I guess for now I’ll have to resort to getting rid of crappy files, removing installed debug packages or shrinking my hardly-used Mac OS X partition.

GirFFI - An Introduction

Posted by matijs 10/05/2011 at 07h09

Over two years ago, I had the idea that it should be possible to combine two great technologies, ruby-ffi and GObject Introspection, to dynamically create bindings for GLib-based libraries.

This idea, like many, was born from frustration: The development of Ruby-GNOME2 is labour-intensive, and therefore, it lags behind the development of Gnome libraries. In particular, I wanted to use the Gio library, which had no bindings at the time, to fetch generated icons for images.

Serious development started in October 2009, with a basic proof-of-concept to show that it was at least possible to use FFI to bind Gtk+.

About a year later, GirFFI 0.0.1 was finally released as a gem. Several more releases followed, and now, with release 0.0.9, it seems to be nearly feature-complete.

So, what can GirFFI do for you?

Given any GLib-based library with GObject Introspection data, it generates bindings for that library. These bindings are generated dynamically at runtime, meaning that methods and classes are not generated until first use.
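To give a feel for what "not generated until first use" means, here is a toy sketch in plain Ruby. It is not GirFFI's actual implementation: LazyBindings and its INTROSPECTION_DATA table are made-up stand-ins for a namespace module and the introspection data behind it.

```ruby
# Toy stand-in for a namespace whose methods are created on first use.
# INTROSPECTION_DATA is a made-up table playing the role of the
# GObject Introspection data; nothing here is GirFFI's real code.
module LazyBindings
  INTROSPECTION_DATA = {
    get_answer: -> { 42 },
    get_greeting: -> { "hello" }
  }

  def self.method_missing(name, *args)
    body = INTROSPECTION_DATA[name]
    return super unless body

    # Define the real method now, so later calls bypass method_missing.
    define_singleton_method(name, &body)
    send(name, *args)
  end

  def self.respond_to_missing?(name, include_private = false)
    INTROSPECTION_DATA.key?(name) || super
  end
end

defined_before = LazyBindings.singleton_methods.include?(:get_answer)
answer = LazyBindings.get_answer
defined_after = LazyBindings.singleton_methods.include?(:get_answer)
```

The first call pays the cost of looking up the data and defining the method; every later call goes straight to the generated method.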

Because GirFFI uses FFI to call into the C libraries, it is not tied to a particular Ruby, but is known to work with MRI 1.8.7 and 1.9.2, and with JRuby 1.6.1. Support for Rubinius is planned.

GirFFI supports both Gtk+ 2, and the new Gtk+ 3. You can choose between them by specifying the version when you set up the bindings.

The bindings GirFFI generates are less Ruby-like than those of Ruby-GNOME2. For example, it uses the standard method names provided by the library, and doesn’t try to change, e.g., get_name and set_name to name and name=. This also means it is not a drop-in replacement for Ruby-GNOME2.
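If you do want Ruby-style accessors on top of such an interface, you can add them yourself. The sketch below assumes a hypothetical Widget class standing in for a generated class; it does not use GirFFI itself.

```ruby
# Hypothetical stand-in for a class with C-style accessor names.
class Widget
  def get_name
    @name
  end

  def set_name(name)
    @name = name
  end
end

# Reopen the class and add Ruby-style aliases next to the original names.
class Widget
  alias_method :name, :get_name
  alias_method :name=, :set_name
end

widget = Widget.new
widget.name = "toolbar"
```

Both spellings now work side by side, so code written against the library's own method names keeps running.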

Of course, some work remains to be done.

GirFFI is currently very conservative when it comes to freeing allocated memory. As a result, it leaks memory. There are tools to at least test this, but I have no experience using these. Help would definitely be appreciated.

Also, documentation is rather lacking. In particular, there’s not much information on how to translate from the C function calls described in the Gtk+ documentation to the Ruby method calls needed for GirFFI. Ideally, GirFFI could generate Ruby-oriented documentation straight from the GObject-introspection data.

Finally, there are certain functions that are deemed ‘unintrospectable’ by GObject Introspection. These include functions taking varargs or generic pointers. These will need to be hand-bound. GirFFI includes several such hand-bound methods, but the set is far from complete.

Do give GirFFI a try. The gem’s name is gir_ffi, and you can fork the code on GitHub. Comments, bug reports, and pull requests are all welcome.

Benchmarking Dynamic Method Creation in Ruby

Posted by matijs 22/04/2011 at 09h30

Let’s look at dynamic method generation. I need it for GirFFI, and if you do any kind of metaprogramming, you probably need it too. It was already shown a long time ago that using string evaluation is preferable to using define_method with a block.

That is, if you care at all about speed.
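A minimal way to compare the two approaches yourself might look like the sketch below. The class name Subject, the method names, and the iteration count are arbitrary choices, and the timings will differ between Ruby versions and implementations, so treat whatever it prints as a measurement of your own setup only.

```ruby
require 'benchmark'

class Subject
  # Variant 1: the method body is compiled from a string.
  class_eval <<-RUBY, __FILE__, __LINE__ + 1
    def via_string(x)
      x + 1
    end
  RUBY

  # Variant 2: the method body is a block, which is a closure.
  define_method(:via_block) { |x| x + 1 }
end

subject = Subject.new
iterations = 200_000

# Time calling each generated method the same number of times.
Benchmark.bm(14) do |bm|
  bm.report("string eval") { iterations.times { subject.via_string(1) } }
  bm.report("define_method") { iterations.times { subject.via_block(1) } }
end
```

This measures the cost of calling the generated methods, not the cost of generating them.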

Materialized Path to Nested Set

Posted by matijs 13/12/2010 at 23h14

On twitter, @clemensk asks:

Hey SQL experts, is it somehow possible in pure (My)SQL to extract a nested set from a table full of paths (think: Category 1 > Category 2)?

To do this, you need to do two things: Extract the names of the nodes, and calculate values for lft and rgt. Here’s my take on the latter part:
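For readers who want to see the lft and rgt calculation spelled out in Ruby rather than SQL, a minimal sketch follows. It is an illustration only, not the SQL solution referred to above: nested_set is a hypothetical helper, and the " > " separator is an assumption taken from the example in the question.

```ruby
# Hypothetical helper: turn a list of materialized paths into rows carrying
# nested-set bounds. Siblings keep the order in which they first appear.
def nested_set(paths, separator = " > ")
  # Build a tree of nested hashes keyed by node name.
  tree = {}
  paths.each do |path|
    path.split(separator).inject(tree) { |node, name| node[name] ||= {} }
  end

  rows = []
  counter = 0
  # Depth-first walk: lft is assigned on the way down, rgt on the way up.
  walk = lambda do |children, prefix|
    children.each do |name, grandchildren|
      lft = (counter += 1)
      walk.call(grandchildren, prefix + [name])
      rgt = (counter += 1)
      rows << { path: (prefix + [name]).join(separator), lft: lft, rgt: rgt }
    end
  end
  walk.call(tree, [])

  rows.sort_by { |row| row[:lft] }
end

rows = nested_set([
  "Category 1",
  "Category 1 > Category 2",
  "Category 1 > Category 3",
  "Category 4"
])
```

A node's descendants are then exactly the rows whose lft lies between that node's lft and rgt.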

Redefined Accessors

Posted by matijs 10/12/2010 at 09h29

If you’re going to do this:

def foo=(f)
  @foo = f + " bar"
end

Then don’t first do this:

attr_accessor :foo

But instead do this:

attr_reader :foo

That way, there won’t be “method redefined” warnings all over the place.

Let’s make this more general: Before you release your gem, make sure it runs without warnings. They should stick out like a sore thumb when you run your tests, anyway.
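To see the difference without scanning test output by eye, you can capture what Ruby writes to standard error while a class is being defined. This is a minimal sketch: capture_warnings is a hypothetical helper, and the exact wording of the warning may vary between Ruby versions.

```ruby
require 'stringio'

# Hypothetical helper: run the block with warnings switched on and return
# everything that was written to $stderr in the meantime.
def capture_warnings
  old_stderr = $stderr
  old_verbose = $VERBOSE
  $stderr = StringIO.new
  $VERBOSE = true
  yield
  $stderr.string
ensure
  $stderr = old_stderr
  $VERBOSE = old_verbose
end

# attr_accessor defines foo=, which is then immediately redefined.
noisy = capture_warnings do
  Class.new do
    attr_accessor :foo

    def foo=(f)
      @foo = f + " bar"
    end
  end
end

# attr_reader only defines foo, so nothing is redefined.
quiet = capture_warnings do
  Class.new do
    attr_reader :foo

    def foo=(f)
      @foo = f + " bar"
    end
  end
end
```

A check along these lines could run as part of a test suite, failing whenever the captured output is not empty.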

