Wednesday, February 20, 2008

Mork is evil, but...

I decided today to finally start poking around the Mork reader code that was introduced to import history data, since the plan is to use it to migrate mork to SQLite code. I had read earlier enough to know that I would have to look into having morkreader handle multiple tables, but what I saw was just astounding. You see, morkreader is essentially a 580-line hack.

Don't get me wrong, I have nothing against hacks. My patch for fixing searching in base64-encoded messages was quite hacky as well. But a few things justify the hack. First, the legacy code needed some pretty severe refactoring to handle the recursive nature of MIME. Second, the improperly-handled cases should be rather rare in nature: the simplest way to generate a case is to forward-as-attached a message with a base64-encoded attachment. (I concede: a third reason is that I wrote it, but that pales in importance to the other two, I swear...) Morkreader, however, did not need to work around crufty legacy APIs, nor are its improperly-handled components uncommon.

The first assumption that morkreader makes is that there is only one table (mailnews loves having several tables). Adding in support for multiple tables would not be too difficult if the parser wasn't already broken in other ways. The number two assumption it makes is that no line is longer than 80-characters (which should be safe) and that no line is continued more than once (i.e., no more than 160-characters)... which is complete BS, as anyone who has ever subscribed to a mailing list can recognize (think of the References: header). Finally, the code will not handle aborted changesets properly, the banality of which I cannot determine. (Does mailnews code ever use the ! change type?)

So, to assess the accuracy of morkreader, I read the mork specification. Cross-referencing with some mork files of mine (a FF 2 history.dat, an abook.mab, and an inbox msf file), I discovered that the specification itself is inaccurate. Once again, two failings here: (atomScope=c) should be (a=c), and the spec implies that -[...] is the proper way to remove a row, whereas [-...] is the actual method. A subtle statement in the spec says that a + can be omitted from changesets.

I had to fix mork to get my first patch for bug 413260 working, looks like I'll have to fix morkreader as well. Oh well, magic will happen if Friday is as predicted...

No comments: