Toxic Elephant

Don't bury it in your back yard!

Repo size

Posted by matijs 25/09/2015 at 08h11

I just realized one important factor for attracting casual open source contributions is code/repo size. A huge repo is a barrier. So, it’s hugely important to either use off-the-shelf libraries, or split off parts of your code into their own components. These components need to live in their own repositories, so no monorepos.

Of course, a high-status, high-visibility project can get away with more. Rails, for example, has all its components in one repository and does not seem to be lacking in contributions. On the other hand, for a long time GNOME required the full source for everything to be checked out and built together. That demands a serious commitment for even the most trivial bug fix.

Why the sudden insight? A project I’m involved in has problems with wkhtmltopdf: The version that used to work crashes after a server upgrade, and the version that works has problems with fonts and images. A simple solution could be to just recompile the old version on the new server. However, because it essentially forks all of Qt, checking out the source will require 1GB of disk space, while building will require another 2.5GB (and a commensurate amount of time). This is not undertaken lightly.

Try to avoid try

Posted by matijs 28/07/2015 at 10h52

Because of a pull request I was working on, I had cause to benchmark activesupport’s #try. Here’s the code:

require 'benchmark'
require 'active_support/core_ext/object/try'

class Bar
  def foo

  end
end

class Foo

end

bar = Bar.new
foo = Foo.new

n = 1000000
Benchmark.bmbm(15) do |x|
  x.report('straight') { n.times { bar.foo } }
  x.report('try - success') { n.times { bar.try(:foo) } }
  x.report('try - failure') { n.times { foo.try(:foo) } }
  x.report('try on nil') { n.times { nil.try(:foo) } }
end

Here is a sample run:

Rehearsal ---------------------------------------------------
straight          0.150000   0.000000   0.150000 (  0.147271)
try - success     0.760000   0.000000   0.760000 (  0.762529)
try - failure     0.410000   0.000000   0.410000 (  0.413914)
try on nil        0.210000   0.000000   0.210000 (  0.207706)
------------------------------------------ total: 1.530000sec

                      user     system      total        real
straight          0.140000   0.000000   0.140000 (  0.143235)
try - success     0.740000   0.000000   0.740000 (  0.742058)
try - failure     0.380000   0.000000   0.380000 (  0.379819)
try on nil        0.210000   0.000000   0.210000 (  0.207489)

Obviously, calling the method directly is much faster. I often see #try used defensively, without any reason warranted by the logic of the application. This makes the code harder to follow, and now this benchmark shows that this kind of cargo-culting can actually harm the application’s performance in the long run.

Some more odd things stand out:

  • Successful #try is slower than failed #try plus a straight call. This is because #try actually does some checks and then calls #try!, which does one of the checks all over again.
  • Calling #try on nil is slower than calling a nearly identical empty method on foo. I don’t really have an explanation for this, but it may have something to do with the fact that nil is a special built-in class that may have different logic for method lookup.

Bottom line: #try is pretty slow because it needs to do a lot of checking before actually calling the tried method. Try to avoid it if possible.
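
To make the double checking concrete, here is a minimal sketch of the two-step structure described above. It is a simplified, hypothetical re-creation and not activesupport's actual source: the module and method names (SketchTry, sketch_try, sketch_try!) are made up, and which check gets repeated is an assumption.

```ruby
# Hypothetical, simplified re-creation of a try/try! pair.
# This is NOT activesupport's code; all names here are invented.
module SketchTry
  def sketch_try(*args, &block)
    # First round of checks: was a method name given, and does the
    # receiver respond to it? If not, fall through and return nil.
    sketch_try!(*args, &block) if args.empty? || respond_to?(args.first)
  end

  def sketch_try!(*args, &block)
    # args.empty? is evaluated a second time here (assumed duplicate check).
    if args.empty? && block_given?
      yield self
    else
      # Only now does the actual method call happen.
      public_send(*args, &block)
    end
  end
end

class SketchBar
  include SketchTry

  def foo
    :called
  end
end

class SketchFoo
  include SketchTry
end

SketchBar.new.sketch_try(:foo) # => :called
SketchFoo.new.sketch_try(:foo) # => nil
```

Even in this stripped-down form, a successful call goes through two extra method invocations and three checks before foo runs, while a failed one stops after the first method.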

In Ruby, negation is a method

Posted by matijs 30/01/2014 at 06h16

These past few days, I’ve been busy updating RipperRubyParser to make it compatible with RubyParser 3. This morning, I discovered that one thing that was changed from RubyParser 2 is the parsing of negations.

Before, !foo was parsed like this:

s(:not, s(:call, nil, :foo))

Now, !foo is parsed like this:

s(:call, s(:call, nil, :foo), :!)

That looks a lot like a method call. Could it be that in fact, it is a method call? Let’s see.
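
A quick way to check is to define ! on a class and see whether the prefix operator picks it up. This is a small sketch for Ruby 1.9 and later; the class name Contrarian is made up for illustration.

```ruby
# Contrarian is a hypothetical class that overrides the ! method.
class Contrarian
  def !
    :negated
  end
end

c = Contrarian.new

!c                # => :negated, the prefix operator dispatches to our method
c.!               # => :negated, explicit method-call syntax
c.public_send(:!) # => :negated, dispatch by name

# Built-in objects respond to it as well.
true.!            # => false
5.respond_to?(:!) # => true
```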

Things: A classification

Posted by matijs 19/01/2014 at 11h33

  • Things needed every day
  • Things needed every week
  • Things needed only during a certain season
  • Things needed for administrative purposes
  • Things kept for sentimental reasons
  • Things kept for beauty

Some thoughts on Ruby's speed

Posted by matijs 02/03/2013 at 16h42

Yesterday, I read Alex Gaynor’s slides on dynamic language speed. It’s an interesting argument, but I’m not totally convinced.

At a high level, the argument seems to be as follows:

  • For a comparable algorithm, Ruby et al. do much more work behind the scenes than ‘fast’ languages such as C.
  • In particular, they do a lot of memory allocation.
  • Therefore, we should add tools to those languages that allow us to do memory allocation more efficiently.
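
As a small illustration of the allocation point (my own example, not one taken from the slides), the sketch below counts allocated objects for two ways of building the same string. It assumes Ruby 2.2 or later for GC.stat(:total_allocated_objects); the helper and method names are made up.

```ruby
PIECE = "x".freeze

# Returns the number of objects allocated while the block runs.
def allocated_objects
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# String#+ builds a new String on every iteration.
def concat_with_plus(n)
  allocated_objects do
    s = ""
    n.times { s = s + PIECE }
  end
end

# String#<< grows the same String in place.
def concat_in_place(n)
  allocated_objects do
    s = String.new
    n.times { s << PIECE }
  end
end

puts "String#+  allocated #{concat_with_plus(10_000)} objects"
puts "String#<< allocated #{concat_in_place(10_000)} objects"
```

Both versions produce the same result, but the first one allocates at least one object per iteration. A tool like String#<< is exactly the kind of thing that lets the programmer avoid that.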

How I found a bug in GirFFI using Travis and Git

Posted by matijs 17/02/2013 at 20h15

I love Travis CI. I love git bisect. I used both recently to track down a bug in GirFFI.

Suddenly, builds were failing on JRuby. The problem did not occur on my own, 64-bit, machine, so it seemed hard to debug. I tried making Travis use different JVMs, but that didn’t help, apart from crashing in a different way (faster, too, which was nice).

Building a Travis box

Using the travis-boxes repository, I created a VM as used by Travis. This is currently not documented well in the READMEs, so I’m writing it down here, slightly out of order of actual events.

I cloned the following three repositories:

  • travis-cookbooks
  • travis-boxes
  • veewee-definitions

First, I created a base box in veewee-definitions, according to its README. In this case, I created a precise32 box, since that’s the box Travis uses for the builds. The final, export, stage creates a precise32.box file.

Then, I moved the precise32.box file to travis-boxes/boxes, making a base box available there. There is a Thor task to create just such a base box right there, but it doesn’t work, and seems to be deprecated anyway, since veewee is no longer supposed to be used in that repository.

So, a base box being available in travis-boxes, I used the following to create a fully functional box for testing Rubies:

bundle exec thor travis:box:build -b precise32 ruby

Oddly, this didn’t produce a box named travis-ruby, but it did produce travis-development, which I could then manipulate using vagrant.

Hunting down the bug

I ssh’d into my fresh travis box using vagrant ssh. After a couple of minutes getting to know rvm (I use rbenv myself), I was able to confirm the crash on JRuby. After some initial poking around trying to pin down the problem to one particular test case and failing, I decided to use git bisect. As my check I used the test:introspection task, which reliably crashed when the problem was present.

While it’s possible to automate git bisect, I like to use it manually, since the particular test used may fail for unrelated reasons. Also, since git bisect is a really fast process, there is a pleasant lack of tedium.

Anyway, after a couple of iterations, I was able to locate the problematic commit. By checking the different bits of the commit I then found the culprit: I accidentally broke the code that creates layout definitions, in particular the one used by GValue. Going back to master, I added a simple test and fix. I will have to revisit the code later to clean it up and make it more robust.
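
For completeness, here is a rough sketch of what an automated check for git bisect run could look like. I did not use this; the file name bisect_check.rb, the --run flag, and the use of bundle install as the "can this commit be tested at all?" step are all assumptions. The exit codes are the ones git bisect run documents: 0 for good, 125 for skip, and any other code from 1 to 127 for bad.

```ruby
# bisect_check.rb (hypothetical file name)

# Maps the outcome of a check onto the exit codes git bisect run understands:
#   0   -> this commit is good
#   125 -> this commit cannot be tested, skip it
#   1   -> this commit is bad
def bisect_exit_code(setup_ok, test_passed)
  return 125 unless setup_ok
  test_passed ? 0 : 1
end

# Hypothetical driver; it only runs when the script is given --run.
if ARGV.include?("--run")
  # Assumption: a failing bundle install means the commit is untestable.
  setup_ok = system("bundle", "install", "--quiet")
  test_passed = setup_ok && system("bundle", "exec", "rake", "test:introspection")
  exit bisect_exit_code(setup_ok, test_passed)
end
```

It would be started with something like git bisect run ruby bisect_check.rb --run. Skipping untestable commits covers only one kind of unrelated failure, which is part of why I still prefer doing it by hand.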

How many s-expression formats are there for Ruby?

Posted by matijs 04/11/2012 at 13h34

Once upon a time, there was only UnifiedRuby, a cleaned up representation of the Ruby AST.

Now, what do we have?

  • RubyParser before version 3; this is the UnifiedRuby format:

    RubyParser.new.parse "foobar(1, 2, 3)"
    # => s(:call, nil, :foobar, s(:arglist, s(:lit, 1), s(:lit, 2), s(:lit, 3)))
    
  • RubyParser version 3:

    Ruby18Parser.new.parse "foobar(1, 2, 3)"
    # => s(:call, nil, :foobar, s(:lit, 1), s(:lit, 2), s(:lit, 3))
    
    Ruby19Parser.new.parse "foobar(1, 2, 3)"
    # => s(:call, nil, :foobar, s(:lit, 1), s(:lit, 2), s(:lit, 3))
    
  • Rubinius; this is basically the UnifiedRuby format, but using Arrays.

      "foobar(1,2,3)".to_sexp
      # => [:call, nil, :foobar, [:arglist, [:lit, 1], [:lit, 2], [:lit, 3]]]
    
  • RipperRubyParser; a wrapper around Ripper producing UnifiedRuby:

      RipperRubyParser::Parser.new.parse "foobar(1,2,3)"
      # => s(:call, nil, :foobar, s(:arglist, s(:lit, 1), s(:lit, 2), s(:lit, 3)))
    

How do these fare with the new Ruby 1.9 syntax? Let’s try the new hash syntax. RubyParser before version 3 and Rubinius (even in 1.9 mode) can’t handle this.

  • RubyParser 3:

      Ruby19Parser.new.parse "{a: 1}"
      # => s(:hash, s(:lit, :a), s(:lit, 1))
    
  • RipperRubyParser:

      RipperRubyParser::Parser.new.parse "{a: 1}"
      # => s(:hash, s(:lit, :a), s(:lit, 1))
    

And what about stabby lambdas?

  • RubyParser 3:

      Ruby19Parser.new.parse "->{}"
      # => s(:iter, s(:call, nil, :lambda), 0, nil)
    
  • RipperRubyParser:

      RipperRubyParser::Parser.new.parse "->{}"
      # => s(:iter, s(:call, nil, :lambda, s(:arglist)),
      #      s(:masgn, s(:array)), s(:void_stmt))
    

That looks like a big difference, but this is just the degenerate case. When the lambda has some arguments and a body, the difference is minor:

  • RubyParser 3:

      Ruby19Parser.new.parse "->(a){foo}"
      # => s(:iter, s(:call, nil, :lambda),
      #      s(:lasgn, :a), s(:call, nil, :foo))
    
  • RipperRubyParser:

      RipperRubyParser::Parser.new.parse "->(a){foo}"
      # => s(:iter, s(:call, nil, :lambda, s(:arglist)),
      #      s(:lasgn, :a), s(:call, nil, :foo, s(:arglist)))
    

So, what’s the conclusion? For parsing Ruby 1.9 syntax, there are really only two options: RubyParser and RipperRubyParser. The latter stays closer to the UnifiedRuby format, but the difference is small.

RubyParser’s results are a little neater, so RipperRubyParser should probably conform to the same format. Reek can then be updated to use the cleaner format, and use either library for parsing.
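
Judging by the examples above, the main difference is the s(:arglist) wrapper around call arguments. Converting from one format to the other could be sketched as follows. This is only an illustration, not code from either library: it uses plain Arrays (as in the Rubinius output) instead of Sexp objects so that it runs without any gems, it only handles the :call case, and the method name flatten_arglist is made up.

```ruby
# Rewrites [:call, recv, name, [:arglist, *args]] into [:call, recv, name, *args],
# recursively, leaving every other node unchanged.
def flatten_arglist(node)
  return node unless node.is_a?(Array)

  children = node.map { |child| flatten_arglist(child) }
  last = children.last

  if children.first == :call && last.is_a?(Array) && last.first == :arglist
    # Drop the wrapper and splice its contents into the call node.
    children[0...-1] + last.drop(1)
  else
    children
  end
end

unified = [:call, nil, :foobar, [:arglist, [:lit, 1], [:lit, 2], [:lit, 3]]]
flatten_arglist(unified)
# => [:call, nil, :foobar, [:lit, 1], [:lit, 2], [:lit, 3]]
```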

Building a Simple Markdown Viewer with GirFFI

Posted by matijs 17/04/2012 at 07h41

This morning, I found myself looking for a simple markdown previewer that would run on the desktop. Using GirFFI, it was ridiculously easy to create it myself.

The simple version, based on the Webkit example in the GirFFI repository, goes something like this:

require 'ffi-gtk3'
require 'github/markup'

GirFFI.setup :WebKit, '3.0'
Gtk.init
win = Gtk::Window.new :toplevel
scr = Gtk::ScrolledWindow.new nil, nil
wv = WebKit::WebView.new
win.add scr
scr.add wv
win.set_default_geometry 700, 500
win.show_all

file = ARGV[0]
fullpath = File.expand_path(file, Dir.pwd)
html = GitHub::Markup.render fullpath
wv.load_string html, nil, nil, "file://#{fullpath}"

GObject.signal_connect(win, "destroy") { Gtk.main_quit }
Gtk.main

I got the basic version working in about 10 minutes. The more complex version adds a keyboard handler to allow reloading the viewed file.

Books for Programmers

Posted by matijs 19/02/2012 at 12h46

My list of all-time-favorite books for programmers. I’m not saying everyone should read these, but each of these had an important impact on my growth as a programmer. These are not necessarily in chronological order, by the way.

First, books that are mostly independent of your choice of programming language:

Design Patterns and Refactoring are not books to be read cover to cover, since they devote quite a large part of their volume to cataloguing. The other two definitely are.

The following books are each really about a particular language. They’re well written, but it’s hard to separate the impact of the books from the impact of the languages.

  • Programming Perl (a.k.a. The Camel Book). This book made me grasp object-oriented programming for the first time by breaking it down to a very basic level. I did most of my learning Perl from this book.
  • Programming Ruby (a.k.a. The Pickaxe Book). I learned Ruby from the free online edition. It got me hooked.

You Need Some Isolation

Posted by matijs 11/12/2011 at 18h08

Something weird just happened. While refactoring GirFFI, I had managed to remove all use of a particular module. So, I removed the corresponding file, ran the tests using

rake test

And the tests passed. Committed, done.

Then, I took a walk down to the library. By the time I got back, as soon as I looked at my code again, there it was: A giant require statement requiring the file I had just removed. Huh, why do my tests pass?

Well, duh, I have GirFFI installed as a gem, and my code is just picking up the missing file from there. So, I run

bundle exec rake test

The tests fail, showing me exactly the line I need to remove. Commit amended, done.

So, the moral of the story: If you’re developing a gem, use your isolation tool of choice, be it Bundler, Isolate, or something else, to shield your gem development environment from older installed versions.
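
If you suspect this is happening, one way to check is to look at where the loaded files actually come from. The helper below is a minimal sketch; the name features_outside and the gir_ffi pattern in the usage comment are made up for illustration.

```ruby
# Returns the loaded files matching pattern that do NOT live under root.
# $LOADED_FEATURES holds the path of every file loaded through require.
def features_outside(root, pattern)
  prefix = File.join(File.expand_path(root), "")
  $LOADED_FEATURES.grep(pattern).reject { |path| path.start_with?(prefix) }
end

# Hypothetical usage, run at the end of a test run from the project root:
#
#   stray = features_outside(Dir.pwd, /gir_ffi/)
#   warn "Loaded from outside the working copy:\n#{stray.join("\n")}" if stray.any?
```

Any file listed there was picked up from somewhere other than the working copy, for example from an installed gem.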
