Warning: Includes Known Bugs

February 25th, 2009 | 10 comments

UPDATE: In further discussions with Jim Deville of IronRuby it appears that there may be a legal issue preventing IronRuby devs from patching Ruby code themselves. However it may be possible for IronRuby to use a commonly maintained and patched version of the standard library.

Reviewing the logs and considering this was Shri’s first major discussion in the IRC channel, I unfortunately grouped him in with Charles’ intolerable behavior and personal attacks which have occurred on a number of occasions in #rubyspec and #rubinius. My apologies to Shri. The struck out text below remains merely for historical accuracy.

UPDATE: Charles’ response to this post wasn’t exactly positive, but I think it’s fair to have this discussion in public: http://pastie.org/400493 Also, please note that I’ve struck out Shri’s name below as I may have misunderstood him in the earlier discussion.


You, the trusting consumer, would probably like to receive such cautionary advertisement were you to use a product that did, in fact, ship to you code that includes known bugs. And not just known bugs, but known bugs that have fixes for them.

You would like to know this, right? I mean, I’m not just some hard-headed asshole that thinks there’s something a bit whack here, am I? Please, do tell me.

Well, as luck would have it, you can also tell this to Charles Oliver Nutter of JRuby and Shri Borde of IronRuby.

Here’s the drama: There’s this project RubySpec. You may have heard of it. It attempts to describe the behavior of the Ruby programming language. All the alternative Ruby implementations use the RubySpec project to attempt to show that they are “Ruby”.

All the alternative implementations also choose to ship some version or other of the Ruby standard library. At least the parts written in Ruby. Makes sense, since they all implement the Ruby programming language.

As is the case with all software, from time to time bugs are discovered in Ruby. Usually, these are fixed soon after they are discovered and the fix is committed to the trunk version of MRI (Matz’s Ruby Implementation). Eventually, trunk becomes another stable release with a particular patchlevel.

The RubySpecs deal with this situation with a ruby_bug guard. You can read the details of RubySpec guards. This particular guard has two functions:

  1. It prevents the guarded spec from executing on any version of MRI less than or equal to the version specified in the guard. This is because MRI cannot re-release a particular patchlevel after it has been released. And the bugs are discovered after a release.
  2. It documents the spec, which shows what the correct behavior should be.

A key feature of the ruby_bug guard is that it does not prevent the spec from running on any alternative implementation. That is because every alternative implementation is expected to have the correct behavior. Additionally, these guards are only added after Matz or ruby-core has stated that the behavior at issue is a bug and the behavior of the spec is the correct behavior.
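To make those two rules concrete, here is a minimal sketch of the decision the guard makes. This is not MSpec's code; the method names (version_parts, run_guarded_spec?) and the engine symbols are made up for illustration, and the version strings are just examples.

```ruby
# Hypothetical, simplified model of the ruby_bug guard's decision.
# Not MSpec's implementation; every name here is illustrative.

# Turn "1.8.7.22" into [1, 8, 7, 22] so versions compare numerically.
def version_parts(version)
  version.split(".").map { |part| part.to_i }
end

# Returns true when the guarded spec should run.
def run_guarded_spec?(engine, engine_version, guard_version)
  # Alternative implementations always run the spec: they are
  # expected to have the correct behavior.
  return true unless engine == :mri

  # MRI runs the spec only on versions newer than the one named in
  # the guard, because a released patchlevel cannot be re-released.
  (version_parts(engine_version) <=> version_parts(guard_version)) > 0
end

puts run_guarded_spec?(:mri, "1.8.7.22", "1.8.7.22")   # MRI at the guarded version
puts run_guarded_spec?(:mri, "1.8.7.72", "1.8.7.22")   # MRI after the fix shipped
puts run_guarded_spec?(:jruby, "1.8.7.22", "1.8.7.22") # an alternative implementation
```

The point of the sketch is the asymmetry: only MRI at or below the guarded version skips the spec. Everything else runs it and reports whatever happens.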

Now here is the rub: Charles does not want to manage patching the Ruby standard library that he ships with JRuby with the patches that already exist for known bugs. He wants to ship whatever version MRI has most recently released. Further, when you run the RubySpecs with JRuby, he wants to MASK those bugs because he doesn’t think it’s fair that JRuby fails a spec which shows a known bug in the Ruby standard library for which patches are available.

That’s Charles’ choice of strategies for managing JRuby packaging. I’m strongly of the opinion that you, the user, would like to know that. Charles apparently disagrees.

In fact, he disagrees so vehemently that he takes to calling me names in the #rubyspec IRC channel because I refuse to change the fact that the ruby_bug guard will not silently mask spec failures on JRuby or any other alternative implementation. Aside from the name-calling being immature, I think there is a real problem with this. Don’t you?

Charlie will argue that it is simply impossible to ship the trunk version of Ruby standard library because it is an unknown quantity. However, the best defense against bugs in the Ruby standard library is better specs. And we’re talking about specs here that show the bugs and for which patches exist. Furthermore, there are actually relatively few bugs noted in the specs and most of those are in older versions of Ruby, not the current stable release.

So, here’s my question to you: Would you like to know that JRuby and possibly IronRuby ship you code that contains known bugs for which patches exist? Would you also like to know that Charles wants you to run RubySpec on JRuby and not know there is a bug?

Caveat Lector

February 20th, 2009 | 6 comments

I really did double-check this time and I won’t be making any wild claims here. Sorry to disappoint.

We’re going to be running Antonio’s Ruby Benchmark Suite daily to track our progress on performance in Rubinius. The current RBS is a bit of a beast so I imported the files into the Rubinius repository and did some refactoring. You can read the details and up-vote that if you’d like to see this merged back.

Now, for some baseline RBS results. If you want to follow along at home, here’s what I did. I generated these by running the rake bench task using the VM option (see the benchmark/utils/README in the Rubinius repository) for Rubinius on the stackfull and master branch and for MRI using the version installed on Debian lenny, 1.8.7p22. The system is a dual Intel® Xeon™ CPU 2.40GHz. Then I ran the rake bench:to_csv task, imported the CSV file into Google Docs, added the comparison columns and colors, and exported to PDF.

Here’s what I got. The green is faster, the red is slower. The reported time is the minimum time recorded in five “iterations” of each benchmark per input. The maximum time allowed to run five iterations is 300 seconds, or an average of 60 seconds per iteration.
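If the “minimum of five iterations under a time cap” rule sounds fuzzy, here is a rough sketch of the idea. This is not the RBS code; minimum_time is a hypothetical helper, and the defaults simply mirror the five iterations and 300 seconds described above.

```ruby
require "timeout"

# Hypothetical helper, not taken from the Ruby Benchmark Suite.
# Runs the block `iterations` times and returns the fastest wall-clock
# time in seconds, or nil if the whole set exceeds `limit` seconds.
def minimum_time(iterations = 5, limit = 300)
  times = []
  Timeout.timeout(limit) do
    iterations.times do
      started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      yield
      times << Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    end
  end
  times.min
rescue Timeout::Error
  nil # report a timeout rather than a time
end

fastest = minimum_time { (1..100_000).inject(0) { |sum, n| sum + n } }
puts fastest ? "fastest of five: #{fastest}s" : "timed out"
```

Taking the minimum rather than the mean is a deliberate choice: it reports the run that was least disturbed by whatever else the machine was doing.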

A few notes about these numbers:

  • We’re still fixing the breakage on the stackfull branch, so it is not surprising, for instance, that all the thread benchmarks errored out. The new native thread support is not 100% done.
  • There are a couple speed regressions on the stackfull branch, most minor. We’ll fix those.
  • Most of the benches do run on the stackfull branch.
  • On most of the benches that run slower in stackfull than MRI, we’re 2x or less slower than MRI.
  • We are a lot faster than MRI on quite a few benchmarks.
  • Rubinius on either branch does quite well relative to MRI on benches that MRI times-out on for certain inputs.

Perhaps the biggest point about the stackfull branch is that we haven’t done much optimization at all. Evan’s been coding in the basic new interpreter architecture, fixing the GC interaction, adding the native threading. We’re fixing breakage now so we can get this merged into the master branch. The JIT is not hooked up. The new GC work is not done. There is no inlining. In other words, there is lots of head room. And that is the key point. You can’t just “make it faster”. Architecture is crucial. Since RailsConf 2008, we’ve been working hard to lay the architectural foundations. With those (and the switch away from stackless), we can start focusing on the real dynamic language optimizations.

While the benchmarks tell part of the story, there’s another part that is even more interesting IMO. And this is the part that got me so excited I, um, well I just got excited...

The two biggest pieces of Ruby software that we most often run are the Rubinius compiler and the RubySpecs. The RubySpecs are much more “real-world” than these benchmarks. Here are the results of two complete CI runs on master and stackfull. Note that we are not quite running all the basic CI specs on stackfull, but we’ll figure in that difference in our calculations below.

First, on master:

  $ bin/mspec ci --gc-stats
  rubinius 0.10.0 (ruby 1.8.6) (f4c5576c4 12/31/2009) [i686-apple-darwin9.6.0]

  Finished in 131.248169 seconds

  1430 files, 6927 examples, 23006 expectations, 0 failures, 0 errors

  Time spent in GC: 51.6s (39.3%)

And then on stackfull:

  $ bin/mspec ci --gc-stats
  rubinius 0.11.0-dev (ruby 1.8.6) (e7b6a2d56 12/31/2009) [i686-apple-darwin9.6.0]

  Finished in 66.357996 seconds

  1349 files, 6298 examples, 21344 expectations, 0 failures, 0 errors

  Time spent in GC: 12.7s (19.1%)

Let’s calculate how we do in expectations per second:

  $ irb
  >> master = 23006 / 131.248169
  => 175.286254850534
  >> stackfull = 21344 / 66.357996
  => 321.649255351232
  >> stackfull / master
  => 1.83499416782851

So, compiling and running the specs is about 1.8 times faster on stackfull. This is upside down from the normal results. Normally, we do better on the micro benchmarks and see that invert on “macro” benchmarks. On the RBS benches, stackfull is not 1.8 times faster than master. If I average the “x Master” column, I get 1.39.

There was something else in those spec run numbers I wanted to talk about… oh yeah, GC stats. We have a very simple GC timer stat right now. I’m going to be adding a few more stats. But what we see here is that the overall percentage of time spent in GC drops by half in stackfull. Even so, 19% is too much time to spend in GC. We expect to drop that by half again. Basically, leaning more on structures alloca'd on the C stack reduces a lot of pressure on the GC.
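For the curious, the GC percentages come straight from the two CI runs above. A quick sketch of the arithmetic (gc_share is just a throwaway name for this example):

```ruby
# Share of total run time spent in GC, as a percentage rounded to
# one decimal place. Numbers are copied from the CI runs above.
def gc_share(gc_seconds, total_seconds)
  (gc_seconds / total_seconds * 100).round(1)
end

master_share    = gc_share(51.6, 131.248169)
stackfull_share = gc_share(12.7, 66.357996)

puts "master:    #{master_share}%"
puts "stackfull: #{stackfull_share}%"
puts "ratio:     #{(stackfull_share / master_share).round(2)}"
```

The last line is the “drops by half” claim: the stackfull share is a bit under half of the master share.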

Some would toss out that it’s not hard to be faster than MRI. Perhaps. But it is an accomplishment to write a reasonably good VM, garbage collector, compiler, and Ruby standard library without importing anyone else’s code. And, lest we forget, that is two VMs in about 27 months of a public project.

Some would also question the sanity of writing a VM and garbage collector when crazy smart people do things like that. Well, crazy smart people write papers that reasonably smart people can read and understand. From the benchmark result above, that is working pretty well.

Here’s the point: Don’t ever let anyone tell you that something is a bad idea. Make your own decisions. We probably wouldn’t have Ruby itself if Matz fretted over whether Larry Wall or Adele Goldberg were smarter than he. My most recent favorites in this space: Factor, Clojure, and yes, tinyrb.

We’re working frantically to get the stackfull branch breakages fixed and the branch merged back into master. Feel free to poke around and ask questions.

This is NOT cold fusion

February 12th, 2009 | 4 comments

Um, whoops. It was really late last night. Have I mentioned you’re wearing a great outfit today? Ok, already.

There’s this slight matter of DEV=1 rake build in Rubinius. Yes, I was debugging something. Started running some stuff under the stackfull branch, was intrigued by what I was seeing, decided to make some comparisons, could swear I ran rake clean; rake in the master branch, had a lot of green tea yesterday…

All right already. It’s not 4x faster. Here’s some new numbers:

Master branch:

$ bin/mspec ci core/string
rubinius 0.10.0 (ruby 1.8.6) (781eb14d3 12/31/2009) [i686-apple-darwin9.6.0]
.....................................................................

Finished in 10.576468 seconds

69 files, 763 examples, 5632 expectations, 0 failures, 0 errors

Stackfull:

rubinius 0.10.0 (ruby 1.8.6) (325174a8e 12/31/2009) [i686-apple-darwin9.6.0]
..................E.....E...E...F............E..E.EE..E........F..F..

Finished in 6.124444 seconds

69 files, 763 examples, 5545 expectations, 6 failures, 19 errors

That’s about 58% as long, or 42% less time, or roughly 1.7x faster and creeping up on 2x.
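Since I got the numbers wrong once already, here is the arithmetic spelled out using the two “Finished in” times above. Nothing fancy, just a sketch so you can check my work:

```ruby
# Wall-clock times from the two core/string runs above, in seconds.
master    = 10.576468
stackfull = 6.124444

fraction_of_time = stackfull / master    # how long stackfull takes relative to master
time_saved       = 1 - fraction_of_time  # how much less time that is
speedup          = master / stackfull    # how many times faster

puts "stackfull takes #{(fraction_of_time * 100).round}% as long"
puts "that is #{(time_saved * 100).round}% less time"
puts "or #{speedup.round(2)}x faster"
```

Keep in mind the stackfull run still had failures and errors and ran slightly fewer expectations, so treat the ratio as approximate.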

If you rushed out and bought Evan a Valentine’s bear, I do apologize. But send it anyway. All the rest in my previous post about this being the beginning of a very good thing still holds. We’ll be getting more results soon and fixing the spec breakage on the stackfull branch. Stay tuned!

All shiny and new

February 12th, 2009 | 8 comments

UPDATE 2.0: You really did see the update below, right? You’re getting Charlie all worried with your enthusiasm for Rubinius.

UPDATE: Ahem, you should probably also read: This is NOT cold fusion. No, it’s not April 1st. Sorry about that. Are you still excited? Read on!

It’s a pattern I’m fairly familiar with now. Evan will be pondering an issue with Rubinius. I’ll catch wind of it when he starts asking some questions of smart people, reading academic CS papers, other implementations’ code, and tossing out some “what if…” questions. Next thing you know, he’s frenetically churning out code. Suddenly, Rubinius is much better, and in this case, faster.

Well, it’s happened again and the preliminary results are outstanding. A couple weeks ago, Evan began coding some changes to the way the Rubinius bytecode interpreter works. He changed the stackless execution architecture that implemented an optimized kind of spaghetti stack to use the C stack more directly and naturally. This better enables the CPU optimizations of the past dozen years to work. It also significantly simplifies the code for our FFI, C-API for C extensions, JIT, and for potentially leveraging LLVM much more effectively. This change also brings native threads, and a much better GC for the mature generation is also in the works.

Now, for some details. Again, these results are preliminary. There is still a lot of breakage on the stackfull branch but MSpec is already running and many of the CI specs run. I’ll be getting a new CI set in place today and we’ll get the remaining breakage fixed quickly (don’t ya just love those specs).

Here’s some numbers for compiling and running the String specs.

First, on the Rubinius master branch:

    Finished in 25.829773 seconds

    69 files, 763 examples, 5632 expectations, 0 failures, 0 errors

Now, on the Rubinius stackfull branch:

    Finished in 5.834874 seconds

    69 files, 754 examples, 5563 expectations, 6 failures, 19 errors

Here’s the numbers for running after the specs have been compiled.

Again, on the master branch:

    Finished in 5.101799 seconds

    69 files, 763 examples, 5632 expectations, 0 failures, 0 errors

And now the stackfull branch:

    Finished in 1.564942 seconds

    69 files, 754 examples, 5563 expectations, 6 failures, 19 errors

I’ll let that sink in a bit…

The numbers for Hash with compilation are similar.

Master:

    Finished in 5.379050 seconds

    48 files, 195 examples, 425 expectations, 0 failures, 0 errors

Stackfull:

    Finished in 1.295544 seconds

    48 files, 193 examples, 421 expectations, 0 failures, 0 errors

That’s right, between 4.1 and 4.4 approaching 2 times faster (see the UPDATE above). And we are just getting started. The significant GC changes are not in yet. We are not yet doing any significant optimizations in the compiler, no profile-directed optimizations at runtime, and our nascent JIT is not hooked up by default. As I said at the outset, these optimizations are made easier by this architecture change.

While I’m breaking the news, Evan deserves the credit for the architecture decisions and generally being courageous enough to try and learn (some would say fail) and try again. Some have doubted that the lofty goals Rubinius has set are realistic. Doubters, have a seat.

If you want to try this at home, clone the Rubinius Github repository and do the following:

    $ git branch --track stackfull origin/stackfull
    $ rake build
    $ bin/mspec ci core/string

Thanks to Engine Yard for trusting in Evan’s excellent judgment and system architecture talents and in all our hard work even if it doesn’t look immediately relevant. The path is clear. The goods are in the truck and they will be delivered.