Toxic Elephant

Don't bury it in your back yard!

In Ruby, negation is a method

Posted by matijs 30/01/2014 at 06h16

These past few days, I’ve been busy updating RipperRubyParser to make it compatible with RubyParser 3. This morning, I discovered that one thing that was changed from RubyParser 2 is the parsing of negations.

Before, !foo was parsed like this:

s(:not, s(:call, nil, :foo))

Now, !foo is parsed like this:

s(:call, s(:call, nil, :foo), :!)

That looks a lot like a method call. Could it be that in fact, it is a method call? Let’s see.

Tags , no comments no trackbacks

Some thoughts on Ruby's speed

Posted by matijs 02/03/2013 at 16h42

Yesterday, I read Alex Gaynor’s slides on dynamic language speed. It’s an interesting argument, but I’m not totally convinced.

At a high level, the argument is as follows, it seems:

  • For a comparable algorithm, Ruby et al. do much more work behind the scenes than ‘fast’ languages such as C.
  • In particular, they do a lot of memory allocation.
  • Therefore, we should add tools to those languages that allow us to do memory allocation more efficiently.

Allocation

As an example, the creation of an array of squared integers is presented, where the runtime needs to dynamically allocate ever larger arrays to hold its result.

I can see the argument there, but I’m not convinced the solution is correct, for the example given. In particular, how much time is spent allocating and re-allocating the array of object references, and how much is spent elsewehere?

<typo:code lang=”ruby”> require ‘benchmark’

def doubles_unallocated n buf = [] i = 1; j = 0 while j < n do

buf[i - 1] = i * i
i += 1; j += 1

end buf end

def doubles_preallocated n buf = Array.new n i = 1; j = 0 while j < n do

buf[j] = i * i
i += 1; j += 1

end buf end

n = 10000

Benchmark.bmbm do |x| x.report(“unallocated”) { n.times { doublesunallocated 1000 } } x.report(“preallocated”) { n.times { doublespreallocated 1000 } } end </typo:code>

A typical result on Ruby 1.9.3 would be (I’m not showing the rehearsal section here):

                   user     system      total        real
unallocated    1.480000   0.020000   1.500000 (  1.493959)
preallocated   1.370000   0.030000   1.400000 (  1.404668)

Taking several runs into account, I conclude that pre-allocating the array shaves off 7 to 13 percent of the runtime.

Algorithms

The algorithms used above are obviously not idiomatic Ruby, but they have the advantage of allowing a comparison of allocated and unallocated arrays. Normally, I would probably use map with a range. Similarly, the Python code presented in the slides looks odd to me. I would have expected something using list comprehensions.

Here is the idiomatic Ruby implementation:

<typo:code lang=”ruby”> def doubles_map n (1..n).map {|i| i * i } end </typo:code>

And here is the Ruby equivalent of the Python example given (using each versus for gives no difference in speed):

<typo:code lang=”ruby”> def doubles_each n buf = [] (1..n).each do |i|

buf << i * i

end buf end </typo:code>

For Ruby, doubles_map is the idiomatic implementation. It also has the advantage that a dedicated implementation of Range#map would be able to allocate the result array in one go, transparantly avoiding the array growing overhead. Using list comprehensions in Python would probably have the same advantage.

Unfortunately, doubles_map is also a lot slower than doubles_each, except on MRI 1.8. Here is the benchmark I ran with a fixed-array version thrown in for good measure:

<typo:code lang=”ruby”> ARR = (1..1000).toa def doublesfixed_array ARR.map {|i| i * i } end

n = 10000

Benchmark.bmbm do |x| x.report(“using map”) { n.times { doublesmap 1000 } } x.report(“using each”) { n.times { doubleseach 1000 } } x.report(“using a fixed array”) { n.times { doublesfixedarray } } end </typo:code>

Here is the result for MRI 1.9.3:

                          user     system      total        real
using map             1.220000   0.000000   1.220000 (  1.217607)
using each            1.070000   0.020000   1.090000 (  1.091302)
using a fixed array   0.940000   0.020000   0.960000 (  0.969685)

And for MRI 1.8.7:

                          user     system      total        real
using map             2.790000   0.000000   2.790000 (  2.795956)
using each            3.630000   0.010000   3.640000 (  3.644272)
using a fixed array   2.120000   0.010000   2.130000 (  2.133636)

A better Range#map

These benchmark results made me wonder how Range#map is implemented. Since it is written in Ruby, I checked the Rubinius implementation, and there at least the problem seems due to the lack of a dedicated Range#map implementation: The generic Enumerable#map is used, effectively turning it into doubles_each. I’m not entirely sure why it is actually slower, perhaps that’s due to the double yield involved and/or argument splatting.

Here are the results of the previous benchmark, run on Rubinius 2.0.0-rc1:

                          user     system      total        real
using map             1.312082   0.004000   1.316082 (  1.316717)
using each            0.996062   0.000000   0.996062 (  0.997418)
using a fixed array   0.720045   0.000000   0.720045 (  0.724615)

I made the following simple dedicated implementation for Range#map. It falls back to the default Enumerable#map for most cases, except for Fixnums:

<typo:code lang=”ruby”> class Range def map

if block_given? && Fixnum === @begin
  first, last = @begin, @end
  last -= 1 if @excl
  size = last - first + 1

  out = Array.new size
  out_tuple = out.tuple

  i = first
  j = 0
  while i <= last
    out_tuple[j] = yield i
    i += 1
    j += 1
  end

  out
else
  super
end

end end </typo:code>

This makes doubles_map run slightly faster than the fixed array version:

                          user     system      total        real
using map             0.648040   0.000000   0.648040 (  0.650103)
using each            0.928058   0.000000   0.928058 (  0.930109)
using a fixed array   0.680042   0.004001   0.684043 (  0.686719)

Benchmarking Still Needed

While examining these options, I found that there are many ways to write this code, each subtly different, each with a different performance on different Rubies (the examples using while are the slowest on MRI 1.9.3, but the fastest on Rubinus). So, if you really want your code to be fast, you still have to benchmark, you still have to know which Ruby implementation your code will run on, and you still have to try all the options.

Tags , , , no comments no trackbacks

How I found a bug in GirFFI using Travis and Git

Posted by matijs 17/02/2013 at 20h15

I love Travis CI. I love git bisect. I used both recently to track down a bug in GirFFI.

Suddenly, builds were failing on JRuby. The problem did not occur on my own, 64 bit, machine, so it seemed hard to debug. I tried making Travis use different JVMs, but that didn’t help, apart from crashing in a different way (faster, too, which was nice).

Building a Travis box

Using the travis-boxes repository, I created a VM as used by Travis. This is currently not documented well in the READMEs, so I’m writing it down here, slightly out of order of actual events.

I cloned the following three repositories:

travis-cookbooks travis-boxes veewee-definitions

First, I created a base box in veewee-definitions, according to its README. In this case, I created a precise32 box, since that’s the box Travis uses for the builds. The final, export, stage creates a precise32.box file.

Then, I moved the precise32.box file to travis-boxes/boxes, making a base box available there. There is a Thor task to create just such a base box right there, but it doesn’t work, and seems to be deprecated anyway, since veewee is no longer supposed to be used in that repository.

So, a base box being available in travis-boxes, I used the following to create a fully functional box for testing Rubies:

bundle exec thor travis:box:build -b precise32 ruby

Oddly, this didn’t produce a box travis-ruby, but it did produce travis-development, which I could then manipulate using vagrant.

Hunting down the bug

I ssh’d into my fresh travis box using vagrant ssh. After a couple of minutes getting to know rvm (I use rbenv myself), I was able to confirm the crash on JRuby. After some initial poking around trying to pin down the problem to one particular test case and failing, I decided to use git bisect. As my check I used the test:introspection task, which reliably crashed when the problem was present.

While it’s possible to automate git bisect, I like to use it manually, since a particular test used may fail for unrelated reasons. Also, since git bisect is a really fast process, there is a pleasent lack of tedium.

Anyway, after a couple of iterations, I was able to locate the problematic commit. By checking the different bits of the commit I then found the culprit: I accidentally broke the code that creates layout definitions, in particular the one used by GValue. Going back to master, I added a simple test and fix. I will have to revisit the code later to clean it up and make it more robust.

Tags , , , , no comments no trackbacks

How many s-expression formats are there for Ruby?

Posted by matijs 04/11/2012 at 13h34

Once upon a time, there was only UnifiedRuby, a cleaned up representation of the Ruby AST.

Now, what do we have?

  • RubyParser before version 3; this is the UnifiedRuby format:

    RubyParser.new.parse "foobar(1, 2, 3)"
    # => s(:call, nil, :foobar, s(:arglist, s(:lit, 1), s(:lit, 2), s(:lit, 3)))
    
  • RubyParser version 3:

    Ruby18Parser.new.parse "foobar(1, 2, 3)"
    # => s(:call, nil, :foobar, s(:lit, 1), s(:lit, 2), s(:lit, 3))
    
    Ruby19Parser.new.parse "foobar(1, 2, 3)"
    # => s(:call, nil, :foobar, s(:lit, 1), s(:lit, 2), s(:lit, 3))
    
  • Rubinius; this is basically the UnifiedRuby format, but using Arrays.

      "foobar(1,2,3)".to_sexp
      # => [:call, nil, :foobar, [:arglist, [:lit, 1], [:lit, 2], [:lit, 3]]]
    
  • RipperRubyParser; a wrapper around Ripper producing UnifiedRuby:

      RipperRubyParser::Parser.new.parse "foobar(1,2,3)"
      # => s(:call, nil, :foobar, s(:arglist, s(:lit, 1), s(:lit, 2), s(:lit, 3)))
    

How do these fare with new Ruby 1.9 syntax? Let’s try hashes. RubyParser before version 3 and Rubinius (even in 1.9 mode) can’t handle this.

  • RubyParser 3:

      Ruby19Parser.new.parse "{a: 1}"
      # => s(:hash, s(:lit, :a), s(:lit, 1))
    
  • RipperRubyParser:

      RipperRubyParser::Parser.new.parse "{a: 1}"
      # => s(:hash, s(:lit, :a), s(:lit, 1))
    

And what about stabby lambda’s?

  • RubyParser 3:

      Ruby19Parser.new.parse "->{}"
      # => s(:iter, s(:call, nil, :lambda), 0, nil)
    
  • RipperRubyParser:

      RipperRubyParser::Parser.new.parse "->{}"
      # => s(:iter, s(:call, nil, :lambda, s(:arglist)),
      #      s(:masgn, s(:array)), s(:void_stmt))
    

That looks like a big difference, but this is just the degenerate case. When the lambda has some arguments and a body, the difference is minor:

  • RubyParser 3:

      Ruby19Parser.new.parse "->(a){foo}"
      # => s(:iter, s(:call, nil, :lambda),
      #      s(:lasgn, :a), s(:call, nil, :foo))
    
  • RipperRubyParser:

      RipperRubyParser::Parser.new.parse "->(a){foo}"
      # => s(:iter, s(:call, nil, :lambda, s(:arglist)),
      #      s(:lasgn, :a), s(:call, nil, :foo, s(:arglist)))
    

So, what’s the conclusion? For parsing Ruby 1.9 syntax, there are really only two options: RubyParser and RipperRubyParser. The latter stays closer to the UnifiedRuby format, but the difference is small.

RubyParser’s results are a little neater, so RipperRubyParser should probably conform to the same format. Reek can then be updated to use the cleaner format, and use either library for parsing.

Tags , , , no comments no trackbacks

Building a Simple Markdown Viewer with GirFFI

Posted by matijs 17/04/2012 at 07h41

This morning, I found myself looking for a simple markdown previewer that would run on the desktop. Using GirFFI, it was ridiculously easy to create it myself.

The simple version, based on the Webkit example in the GirFFI repository, goes something like this:

require 'ffi-gtk3'
require 'github/markup'

GirFFI.setup :WebKit, '3.0'
Gtk.init
win = Gtk::Window.new :toplevel
scr = Gtk::ScrolledWindow.new nil, nil
wv = WebKit::WebView.new
win.add scr
scr.add wv
win.set_default_geometry 700, 500
win.show_all

file = ARGV[0]
fullpath = File.expand_path(file, Dir.pwd)
html = GitHub::Markup.render fullpath
wv.load_string html, nil, nil, "file://#{fullpath}"

GObject.signal_connect(win, "destroy") { Gtk.main_quit }
Gtk.main

I got the basic version working in about 10 minutes. The more complex version adds a keyboard handler to allow reloading the viewed file.

Tags , 3 comments no trackbacks

You Need Some Isolation

Posted by matijs 11/12/2011 at 18h08

Something weird just happened. While refactoring GirFFI, I had managed to remove all use of a particular module. So, I removed the corresponding file, ran the tests using

rake test

And the tests passed. Committed, done.

Then, I took a walk down to the library. By the time I got back, as soon as I looked at my code again, there it was: A giant require statement requiring the file I had just removed. Huh, why do my tests pass?

Well, duh, I have GirFFI installed as a gem, and my code is just picking up the missing file from there. So, I run

bundle exec rake test

The tests fail, showing me exactly the line I need to remove. Commit amended, done.

So, the moral of the story: If you’re developing a gem, use your isolation tool of choice, be it Bundler, Isolate, or something else, to shield your gem development environment from older installed versions.

Tags no comments no trackbacks

A tiny replacement for RVM

Posted by matijs 31/07/2011 at 17h31

Recently, there was a change in where Debian’s rubygems packages store each gem’s files. Instead of having a separate bin directory for each version of ruby, now both the 1.8 and the 1.9 version store scripts in /usr/local/bin. In fact, they will happily overwrite each other’s scripts. This can be very confusing when you think you’re running a script with Ruby 1.8, but in fact it’s running with 1.9, and hence, 1.9’s set of installed gems.

All this made me seriously consider using RVM. Which was quite shocking, as I consider it to be an ugly hack, both in concept and in execution. So, rather than admitting defeat, I decided to create my own hack.

Tags no comments no trackbacks

GirFFI - An Introduction

Posted by matijs 10/05/2011 at 07h09

Over two years ago, I had the idea, that it should be possible to combine two great technologies, ruby-ffi, and GObject Introspection, to dynamically create bindings for GLib-based libraries.

This idea, like many, was born from frustration: The development of Ruby-GNOME2 is labour-intensive, and therefore, it lags behind the development of Gnome libraries. In particular, I wanted to use the Gio library, which had no bindings at the time, to fetch generated icons for images.

Tags no comments no trackbacks

Benchmarking Dynamic Method Creation in Ruby

Posted by matijs 22/04/2011 at 09h30

Let’s look at dynamic method generation. I need it for GirFFI, and if you do any kind of metaprogramming, you probably need it too. It was already shown a long time ago that using string evaluation is preferable to using define_method with a block.

That is, if you care at all about speed.

How preferable? About a factor of 1.8 on Ruby 1.8.7:

                    user     system      total        real
regular:        0.270000   0.000000   0.270000 (  0.267164)
string eval:    0.270000   0.000000   0.270000 (  0.273170)
define_method:  0.490000   0.000000   0.490000 (  0.493258)

These numbers are relatively easy to explain. The string evaluation basically has the exact same result as regular method definition, because you actually do the same thing: You define a method using def. With define_method, you have a lot of overhead because a block is more than a bunch of code. It’s actually a closure, and Ruby has to set up the closure’s binding every time you call the method.

(Aside: There are of course other factors to consider. Using string evaluation gives you much more power to build exactly the method you want, while the fact that the blocks passed to define_method are closures allows you to do things now ordinary method can do.)

On Ruby 1.9.2, the results are quite similar, although the difference is now only a factor of 1.5:

                    user     system      total        real
regular:        0.140000   0.010000   0.150000 (  0.142998)
string eval:    0.140000   0.000000   0.140000 (  0.141439)
define_method:  0.210000   0.000000   0.210000 (  0.211936)

Too bad. It seems we’re stuck with string eval. Let’s look at JRuby:

                    user     system      total        real
regular:        0.324000   0.000000   0.324000 (  0.324000)
string eval:    0.208000   0.000000   0.208000 (  0.208000)
define_method:  0.690000   0.000000   0.690000 (  0.690000)

Wait, that can’t be right. Let’s run that again:

                    user     system      total        real
regular:        0.424000   0.000000   0.424000 (  0.424000)
string eval:    0.241000   0.000000   0.241000 (  0.241000)
define_method:  0.756000   0.000000   0.756000 (  0.756000)

Hm, it got a little slower, but the pattern is the same: The method defined using string eval is the fastest of the lot. What is going on here?

Quick, let’s try rubinius.

                    user     system      total        real
regular:        0.348022   0.000000   0.348022 (  0.183686)
string eval:    0.380024   0.004000   0.384024 (  0.196908)
define_method:  0.412026   0.008001   0.420027 (  0.215781)

Uh-huh. Again please.

                    user     system      total        real
regular:        0.376023   0.004000   0.380023 (  0.192683)
string eval:    0.356022   0.000000   0.356022 (  0.189258)
define_method:  0.164010   0.000000   0.164010 (  0.138058)

Huh? Like, maybe the benchmark is wrong?

Okay, let’s try bmbm instead of bm. For rubinius:

Rehearsal --------------------------------------------------
regular:         0.332020   0.004001   0.336021 (  0.172166)
string eval:     0.352022   0.004000   0.356022 (  0.185557)
define_method:   0.192012   0.000000   0.192012 (  0.152656)
----------------------------------------- total: 0.884055sec

                     user     system      total        real
regular:         0.052003   0.000000   0.052003 (  0.050464)
string eval:     0.052003   0.000000   0.052003 (  0.050471)
define_method:   0.076005   0.000000   0.076005 (  0.076912)

That looks more sane. Let’s try JRuby:

Rehearsal --------------------------------------------------
regular:         0.408000   0.000000   0.408000 (  0.408000)
string eval:     0.196000   0.000000   0.196000 (  0.196000)
define_method:   0.657000   0.000000   0.657000 (  0.657000)
----------------------------------------- total: 1.261000sec

                     user     system      total        real
regular:         0.095000   0.000000   0.095000 (  0.096000)
string eval:     0.109000   0.000000   0.109000 (  0.109000)
define_method:   0.416000   0.000000   0.416000 (  0.416000)

Much better. Notice that the difference between string eval and define_method is a stunning factor of four!

Now go back to rubinius. Did you notice how fast it was? That’s stunning. So what’s the difference there between define_method and string eval? Not so big. But the numbers are small so there may be some influence from the environment. Let’s look at how things scale: 10 million iterations:

Rehearsal --------------------------------------------------
regular:         1.240078   0.004000   1.244078 (  1.034290)
string eval:     1.076067   0.000000   1.076067 (  1.058813)
define_method:   0.848053   0.000000   0.848053 (  0.838424)
----------------------------------------- total: 3.168198sec

                     user     system      total        real
regular:         0.496031   0.000000   0.496031 (  0.496565)
string eval:     0.528033   0.000000   0.528033 (  0.500421)
define_method:   0.864054   0.000000   0.864054 (  0.863875)

That’s a factor of about 1.6, somewhere between MRI 1.8.7 and 1.9.2.

Conclusions

The basic conclusion holds: String evaluation leads to faster methods. How much of a difference it makes depends on which Ruby you’re using, and will probably depend on the particular method you’re creating.

Benchmarking is tricky: If you don’t watch out, you could draw the wrong conclusions.

Finally, Rubinius is fast. Really fast. Wow.

The Code

Benchmarks were generated with the following program:

<typo:code lang=”ruby”> require ‘benchmark’

class Foo def regular; end

eval “def stringeval; end”

define_method(:block) {} end

foo = Foo.new

n = 10000000

Benchmark.bmbm do |x| x.report(“regular: “) { n.times { foo.regular } } x.report(“string eval: “) { n.times { foo.stringeval } } x.report(“define_method:”) { n.times { foo.block } } end </typo:code>

Tags no comments no trackbacks

Materialized Path to Nested Set

Posted by matijs 13/12/2010 at 23h14

On twitter, @clemensk asks:

Hey SQL experts, is it somehow possible in pure (My)SQL to extract a nested set from a table full of paths (think: Category 1 > Category 2)?

To do this, you need to do two things: Extract the names of the nodes, and calculate values for lft and rgt. Here’s my take on the latter part:

Finding the left and right values basically corresponds to traversing a number of boundaries. For each enclosing set, we traverse its boundary once, and for each neighboring set, we traverse its boundary twice. Therefore:

  • We must count the enclosing sets: SUBSTRING(this.path, 1, LENGTH(other.path)) = other.path. Each counts as one, including the current node itself.
  • We must count the neighboring sets: other.path < this.path AND SUBSTRING(this.path, 1, LENGTH(other.path)) <> other.path

This will give us the value for lft.

For rgt, it’s almost the same:

  • We must count the enclosing sets: SUBSTRING(this.path, 1, LENGTH(other.path)) = other.path. Each counts as one, including the current node itself.
  • We must count the neighboring sets, but this is more complicated, as the ordering is not entirely alphabetical: SUBSTRING(other.path, 1, LENGTH(this.path) > this.path
  • Finally, we have to subtract from the maximum value of rgt (which is twice the number of nodes) and add 1 to take care of off-by-one errors.

And here is the resulting query:

<typo:code lang=”sql”> SELECT ( SELECT COUNT(*)

FROM paths b
WHERE SUBSTRING(a.path, 1, LENGTH(b.path)) = b.path ) +

( SELECT COUNT(*)

FROM paths b
WHERE SUBSTRING(a.path, 1, LENGTH(b.path)) <> b.path
  AND b.path < a.path ) * 2 AS lft,

( SELECT COUNT() FROM paths ) * 2 - ( SELECT COUNT()

FROM paths b
WHERE SUBSTRING(a.path, 1, LENGTH(b.path)) = b.path ) -

( SELECT COUNT(*)

FROM paths b
WHERE SUBSTRING(b.path, 1, LENGTH(a.path)) > a.path) * 2 + 1 AS rgt,

a.path FROM paths a </typo:code>

Tags 1 comment no trackbacks