I just finished the first stop on my seven language tour. Ruby was in some ways an easy start, although not as easy as I would have thought. While Ruby is enough like Python to seem familiar, I was still surprised by some of the differences I found.
The author of the book, Bruce Tate, is a Rubyist as I understand it, and he was definitely pushing metaprogramming as one of Ruby’s cool features. And in that department (and in others) Ruby definitely has some cool tricks.
After some easy build-ups, including arrays and hashes (pretty much equivalent to lists and dictionaries in Python), a sort of slick implementation of a Roman numeral class, and using inject to cumulatively operate on all of the elements of an array, the final problem to be solved in Ruby is creating a module which creates a class to read a CSV file. The file is opened automatically when an object of the class is instantiated, assuming a filename derived from the class name. That is, if the class is ‘ActAsCsv’, it will open a file ‘actascsv.txt’.
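For readers who haven't seen these two pieces, here is a minimal sketch of inject and of deriving a file name from a class name. It is not the book's code: the `filename` method and the empty class body are hypothetical stand-ins, and only the name `ActAsCsv` comes from the description above.

```ruby
# inject folds an array into a single value, carrying an accumulator along.
total = [1, 2, 3, 4].inject(0) { |sum, n| sum + n }

# A class can ask for its own name and build a file name from it.
# The filename method is a hypothetical helper for illustration only.
class ActAsCsv
  def filename
    self.class.to_s.downcase + '.txt'
  end
end

puts total                  # 10
puts ActAsCsv.new.filename  # actascsv.txt
```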
The first line of the CSV file is assumed to be a header containing the names of the fields, and one is supposed to create a CsvRow class which supplies accessor methods corresponding to each field. So if the first line contains “one, two, three”, you should be able to access the third field of any row with ‘row.three’.
A few things in this exercise left me feeling uncomfortable. First of all, the automatic opening of a file based on the class name struck me as more of a parlor trick to illustrate Ruby’s capabilities for introspection than a practical strategy. While I can do the same thing in Python, I can’t think of a single instance, even in simple scripts, where I would be happy linking all instances of a class to a single file, or even constraining what file is opened. I try to write my classes to handle arbitrary file names instead. I may be making too much of this, but as a teacher I’ve come to really dislike teaching examples which make you say “huh?” right at the beginning.
I’ve always advocated to my programming students that they write programs in such a way that runtime errors are obvious. If an error matters, it should be as obvious as possible, so that you can find and fix it. So I was surprised and a little uneasy to discover that Ruby seems to take the opposite approach – if a hash key, instance variable, or array slot is missing, just return nil and carry on. That’s right, with ordinary [] indexing there are no index out of bounds exceptions… ever.
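A short sketch of the behavior in question, using made-up data (`letters` and `ages` are just illustrative names). Plain `[]` lookups quietly return nil, while the stricter `fetch` methods on Array and Hash do raise, so the loud behavior is available if you ask for it.

```ruby
letters = ['a', 'b', 'c']
ages = { 'alice' => 30 }

# Plain [] access past the end of an array, or with an unknown key, returns nil.
missing_slot = letters[10]
missing_key = ages['bob']
p missing_slot  # nil
p missing_key   # nil

# fetch is the opt-in strict version: it raises instead of returning nil.
begin
  letters.fetch(10)
rescue IndexError => e
  puts "IndexError: #{e.message}"
end

begin
  ages.fetch('bob')
rescue KeyError => e
  puts "KeyError: #{e.message}"
end
```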
In fact, Ruby takes things one step further. Objects have an overridable method ‘method_missing’ which gets called whenever you try to call a method that the class doesn’t actually have. The default version raises a NoMethodError exception, but once you override it, the behavior is up to you. This is powerful – if you construct your method correctly you can do some interesting things, including dynamically supplying field accessors for rows in a CSV file. OTOH, if you don’t construct it right, if you don’t remember to raise a NoMethodError for the truly unexpected values, you run the risk of masking important information.
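Here is one way the careful version might look. This is a sketch under my own assumptions, not the book's solution: the constructor signature and the `@fields` hash are hypothetical. The important part is the call to `super`, which hands anything unrecognized back to the default implementation so it still raises NoMethodError.

```ruby
class CsvRow
  # headers and values are parallel arrays of strings (an assumed layout).
  def initialize(headers, values)
    @fields = headers.zip(values).to_h
  end

  # Answer zero-argument calls that match a known header;
  # defer everything else to the default, which raises NoMethodError.
  def method_missing(name, *args)
    key = name.to_s
    return @fields[key] if args.empty? && @fields.key?(key)
    super
  end

  # Keep respond_to? consistent with the dynamic accessors.
  def respond_to_missing?(name, include_private = false)
    @fields.key?(name.to_s) || super
  end
end

row = CsvRow.new(%w[one two three], %w[lions tigers bears])
puts row.three  # bears

begin
  row.four
rescue NoMethodError
  puts 'no such field: four'
end
```

Leaving out the `super` call is exactly the mistake described above: every misspelled method name would then come back as nil instead of an error.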
In my solution, the contents of the row are a hash in the CsvRow, and the method_missing implementation of the accessors is really just syntactic sugar hiding a key, value hash lookup. Add to that the fact that Ruby returns a nil rather than raising an exception if a key isn’t found, and my code can’t tell the difference between an empty field and a missing one. I freely admit my solution may not be the “Ruby way”, but I would argue it’s at least plausible.
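To make the ambiguity concrete, here is a tiny sketch with a hand-built hash standing in for a row. I am assuming an empty field gets stored as nil, which depends on how the line is parsed. Asking the hash whether the key exists is one way to tell the two cases apart.

```ruby
# 'two' is present in the header but empty in this row (assumed to be stored as nil);
# 'three' is not a field at all.
row = { 'one' => 'lions', 'two' => nil }

empty_field = row['two']
missing_field = row['three']
p empty_field    # nil
p missing_field  # nil

# key? distinguishes a field that is empty from one that does not exist.
p row.key?('two')    # true
p row.key?('three')  # false
```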
So while I’m glad I had the chance to play with a bit of Ruby, and I’m still liking the book’s approach, I end up with two issues, one regarding Ruby and the other with the example problem. The first is what I’ve just mentioned – I was guided into using Ruby in a way that I think conceals potential bugs, and Ruby was complicit in that.
The second issue is with the exercise. While I’ll certainly concede that Ruby has some syntactic flexibility that Python doesn’t, this problem doesn’t really present a convincing use case. I don’t intend to make my reading of this book into just a ‘so what, I can do that in Python’ response, but in this case it does seem to me that a Pythonic version takes care of the problem more efficiently. If I’m OK with using dictionary access rather than accessor methods – saying ‘row["one"]’ rather than ‘row.one’ – this example is a non-problem in Python, thanks to the csv module in the standard library.
[Edit: Yes, I know that you can override __getattr__ in Python to provide the same functionality. See the comments for suggested versions. My point was that Ruby's method_missing approach was what made me uneasy. And it's more inviting - in Ruby you would override the method_missing method of a class derived from Hash, and nothing else. In Python, I would argue, the surgery is a bit more serious.]