I could have written it myself faster

2020-12-18

When starting a new project, we face a choice: use an existing tool, package or library, or write our own.

The modern, fashionable way seems to be to use an existing package. I lean strongly in the other direction and usually write my own. Recently, however, since I was writing for an audience, I decided to try using a package instead. This is what happened.

The project, in this case, was reading and parsing a .wasm file. I wanted to understand the .wasm format for another project I had planned.

I might have done this in C, but I decided on Common Lisp since I wanted a formal description of the format which I could use for other tools later. CL is good at that.

I chose a Lisp binary-parsing package, lisp-binary, pretty much at random from Quicklisp.

wasm reader - using lisp-binary

(Disclaimer: I am disparaging neither lisp-binary nor its author. Writing a generic system to parse binaries is non-trivial, and the author gives example parsers for a few real-world binary formats. His package is good. This article discusses the higher-level choice between reusing and writing from scratch.)

In the beginning I was struggling to understand the .wasm spec and the package API simultaneously. Many times I knew exactly what code I wanted to write, but struggled instead to find out how lisp-binary wanted me to describe it. This often led me to think I could have written my own parser in less time.

I wanted a reader, and then I wanted to build arbitrary tools which took the reader's output and produced a completely different format (i.e. not .wasm).

I was curious as to why the author creates structs rather than simple tagged lists. Using structs means an external tool has to include the definitions of those structs too, which implies it will also depend on lisp-binary where there is no real need.
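To make the difference concrete, here is a minimal sketch of the two representations. The names (wasm-export, describe-entry, the :export and :start tags) are made up for illustration; this is not what lisp-binary generates, only an assumption about what such output could look like.

```lisp
;; Struct form: a consumer needs this DEFSTRUCT (or whatever library
;; defined it) loaded before it can call the accessors.
;; WASM-EXPORT is a hypothetical name, not taken from lisp-binary.
(defstruct wasm-export name kind index)

(defun describe-export-struct (e)
  "Format a WASM-EXPORT struct as a short human-readable string."
  (format nil "~A -> ~(~A~) ~D"
          (wasm-export-name e) (wasm-export-kind e) (wasm-export-index e)))

;; Tagged-list form: plain conses and keywords. Any tool can dispatch
;; on the first element without loading extra definitions.
(defun describe-entry (entry)
  "Format a tagged list such as (:export \"main\" :func 3)."
  (ecase (first entry)
    (:export (destructuring-bind (name kind index) (rest entry)
               (format nil "~A -> ~(~A~) ~D" name kind index)))
    (:start  (format nil "start function ~D" (second entry)))))
```

With the tagged-list form, (describe-entry '(:export "main" :func 3)) needs nothing beyond standard CL, and the same list can be printed to a file and read back by an unrelated tool.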

The main lisp-binary tutorial stops just before getting to the part in which I was most interested, namely how the structures produced by the reader can be pattern-matched by another tool.

I later realized that lisp-binary seems targeted at creating a reader/filter/writer style of program which probably explains the mismatch with what I was trying to do (just a reader) and explains the design decisions that seemed strange from my viewpoint. No package author can guess all the things a user might want to do with the package. That cannot be expected of him.

Again, I'm sure that lisp-binary's author could quickly achieve what I wanted using his own system, but it wasn't clear to me.

I did, finally, create a .wasm parser using lisp-binary: /src/wasm-read.lisp.

Afterwards I decided that it was worth rewriting the parser as an exercise, to see whether I really could have written my own parser faster than learning and using an existing one, or whether I was mistaken.

Points of note:

wasm reader - scratch-written using standard CL

I realize that writing the same program a second time should go faster. After all, I should now know the .wasm spec (or at least be better able to read it). I also reused some code, for example the LEB128 reader, but very little, and it was code I needed to write regardless of the approach I took. I attempted to discount these advantages.
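For readers who have not met LEB128: it stores an integer seven bits per byte, least significant group first, with the high bit of each byte set when more bytes follow. The following is a minimal sketch in standard CL, assuming a stream opened with element type (unsigned-byte 8). The function names are made up for this sketch and are not taken from either reader.

```lisp
(defun read-uleb128 (stream)
  "Decode an unsigned LEB128 integer from a byte STREAM.
READ-BYTE signals END-OF-FILE if the input is truncated."
  (loop with result = 0
        for shift from 0 by 7
        for byte = (read-byte stream)
        ;; The low seven bits of each byte are payload.
        do (setf result (logior result (ash (logand byte #x7F) shift)))
        ;; Bit 7 set means another byte follows.
        while (logbitp 7 byte)
        finally (return result)))

(defun read-sleb128 (stream)
  "Decode a signed LEB128 integer from a byte STREAM."
  (loop with result = 0
        ;; SHIFT counts the payload bits consumed so far, including
        ;; the current byte.
        for shift from 7 by 7
        for byte = (read-byte stream)
        do (setf result (logior result
                                (ash (logand byte #x7F) (- shift 7))))
        while (logbitp 7 byte)
        ;; Bit 6 of the last byte is the sign bit: sign-extend if set.
        finally (return (if (logbitp 6 byte)
                            (- result (ash 1 shift))
                            result))))
```

Because CL integers are arbitrary precision, neither function needs an overflow check; a stricter reader would still want to reject encodings longer than the .wasm spec allows for the integer width being read.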

Writing this parser was a much more pleasurable experience than using the package. I'm fairly familiar with Common Lisp's reading and byte manipulation functions, so there was no cognitive struggle between what I wanted to do and actually writing the code.
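As a taste of what those standard functions look like in use, here is a small sketch that checks the 8-byte header every .wasm module starts with: the magic bytes 00 61 73 6D ("\0asm") followed by a 32-bit little-endian version number. It is an illustration written for this purpose, with hypothetical function names, and not an excerpt from wasmrd.lisp.

```lisp
(defparameter *wasm-magic* #(#x00 #x61 #x73 #x6D)
  "The four bytes that open every .wasm module (a NUL, then asm).")

(defun read-u32le (stream)
  "Read a 32-bit little-endian unsigned integer from a byte STREAM."
  (let ((value 0))
    (dotimes (i 4 value)
      ;; Deposit each byte into its 8-bit field, lowest field first.
      (setf (ldb (byte 8 (* 8 i)) value) (read-byte stream)))))

(defun read-wasm-header (stream)
  "Check the magic bytes on STREAM and return the module version."
  (let ((magic (make-array 4 :element-type '(unsigned-byte 8))))
    (unless (and (= 4 (read-sequence magic stream))
                 (equalp magic *wasm-magic*))
      (error "Not a .wasm module: bad magic ~S" magic))
    (read-u32le stream)))

(defun wasm-file-version (path)
  "Open PATH as a binary file and return its .wasm version number."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (read-wasm-header in)))
```

Everything used here (read-byte, read-sequence, ldb, with-open-file) is part of the CL standard, so a consumer of this code needs no additional packages.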

Is it possible that, if I hadn't known how to write this kind of parser, using a package would have been easier? I cannot really comment. I feel that already knowing how to write the parser let me understand the lisp-binary package more easily, since I could understand its architecture. I think I would have struggled a lot more with lisp-binary as a beginner.

My final reader code, /src/wasmrd.lisp, ended up a little longer, although mine includes a reader for signed LEB128, which is missing from the original. Considering that mine also includes all the required reader infrastructure, which the original got from the lisp-binary package, my scratch-written reader could in fact be considered substantially shorter.

$ wc -l wasm-read.lisp wasmrd.lisp
301 wasm-read.lisp (lisp-binary)
340 wasmrd.lisp (scratch-written)

Points of note:

the code

Tags: software-reuse bloat not-invented-here-syndrome