Despite what I may have said in an earlier post, it's not all that nice to code directly in wat.
So I've started work on a compiler. A very thin compiler. The language should feel very similar to our, beloved,
wat, but save us from
typing all those
Some points up front:
WASM). This means using
sbcldebug trace if you make the slightest mistake).
watson, watsup, watsamatta, watulookin. Also cos it's just syntactical sugar, get it?
sugarlangor sugarland if you prefer.
wasmfile. The best standalone wasm runtime I've found so far is
wasm-interp. Please don't say node.js. You can say
That feeling when ...
People say Minimalist System
Then say "just use npm install ..."
Here is the
malloc.wat example from the previous blog post on webasm re-written in
It's quite a bit more concise and convenient than the original raw
wat (see below).
(previous post on
; malloc.sugar - sugar example code (memory (import "js" "mem") 10) (import "console" "log" (func js_log (i i32))) (global memend i32 #x1000) (global sbrk (mut i32) 0) ; ----- allocate some memory in wasm-space (defun (export "malloc") (len i32) :result i32 (locals (i32 (newbrk (+ sbrk len)) (mem sbrk))) (if (>= newbrk memend) (return 0)) (set sbrk newbrk) mem) ; ----- write bytes in wasm-space memory using js callback (defun (export dump_range) (start i32 len i32) (locals (i32 i (end (+ start len)))) (for (i start end) (js_log (get.u8* i))))
And this is what you would have written if you were coding directly in
(module (memory (import "js" "mem") 10) (import "console" "log" (func $js_log (param $i i32))) (global $memend i32 (i32.const 4096)) (global $sbrk (mut i32) (i32.const 0)) ;; ----- allocate some memory in wasm-space (func (export "malloc") (param $len i32) (result i32) (local $newbrk i32) (local $mem i32) (local.set $newbrk (i32.add (global.get $sbrk) (local.get $len))) (if (i32.ge_u (local.get $newbrk) (global.get $memend)) (return (i32.const 0))) (local.set $mem (global.get $sbrk)) (global.set $sbrk (local.get $newbrk)) (local.get $mem)) ;; ----- write bytes in wasm-space memory using js callback (func (export "dump_range") (param $start i32) (param $len i32) (local $i i32) (local $end i32) (local.set $end (i32.add (local.get $start) (local.get $len))) (local.set $i (local.get $start)) (block $break2 (loop $head1 (br_if $break2 (i32.eq (local.get $i) (local.get $end))) (call $js_log (i32.load8_u (local.get $i))) (local.set $i (i32.add (i32.const 1) (local.get $i))) (br $head1)))))
your-source.sugar source file in your favorite text editor. Then run
sugar on it to produce
This is how I use sugar (with a makefile for calling sugar, then wat2wasm): /src/sugar.mk
$ mk malloc.wasm ./sugar malloc.sugar >malloc.wat summary: globals:2 functions:3 macros:2 wat2wasm malloc.wat
(see below for installing common lisp)
The compiler is very small and may be interesting to read, for that reason. /src/sugar
It includes a macro system, which I have used to implement
for but it's not yet available for use from sugar
itself. Also the full CL defmacro/destructuring-bind lambda list syntax is non trivial to replicate; not to mention backquote.
Here is the overall compiler structure
loop: read a form [from source file] (compile form) => stdout compile form: integer => (i32.const int) symbol => (local/global.get varname) [check in environment/scope] list/s-expr/cons => if (is-macro? form) (compile (expand-macro form)) else: is-special? form => (compile-special form) [defun/import/export/global/set...] is-function? => for all args (compile arg) call func with args otherwise => error undefined function
Too many brackets?
Rule 1: There are no brackets, only indentation.
Rule 0: We suffer the brackets cos it gives us defmacro and that's worth a lot of suffering.
Did you ever generate code? Did you think that was cool? You'll love common lisp. Try it for 2 weeks. Money back guarantee if not satisfied.
Lisp has value-IF i.e. you can use IF anywhere you use a value. That's like C's ternary but more powerful.
At first I thought WAT doesn't have it, but (if) has an optional
(result type) clause, like (func) so that it's branches
can return (type matched) values. So you can do clever things like this:
note the (result i32) clause between if and condition.
(set x (if (result i32) (>= z limit) (comp-true-value y) (comp-false-value s 14 q)))
wat (select) is like the C ternary operator although it can have blocks/statement sequences inside.
select v if: select has all 3 clauses always evaluated while (if) has only the conditional and the chosen branch evaluated. The other (if) branch is not evaluated.
Implement (defmacro ...) for sugar.
At the moment I only have the bitwise and/or operators '&' and 'bor' and next on the list is having shortcutting logical
Ideally I'd like to have lisp-style
and/or where the expression value is
returned rather than just 0 or 1. Also standalone
and/or so we can have things like
(and (file-existsp fname) (process-file fname)) (or (setup-completep) (error setup-failed))Shortcut
and/oractually compile to nested (if) expressions.
A whole slew of function counterparts for other types (i64,f32,f64) are missing. The compiler cannot type-infer which one (see below)
so these would have explicit names like
+f64, set.f64 etc.
There are many other relops missing etc. They are fairly trivial to add.
Currently I'm using pointer
get*/set* but I'd like to have
aref and implicit index to pointer arithmetic.
Some more loops.
while repeat etc. The underlying
block, loop, br_if, br are exposed so writing loops is possible.
It would only be the case of adding some more macros to enable:
(while condition body...) (forever body...)
continue, break would be nice.
Some kind of struct might be nice to have. Even if all the fields had to be i32 initially i.e. just a vector with named, instead of numbered, fields.
Even the current error messages drop you into the SBCL debugger, which can be a scary place - [
ctrl-d] to exit.
Knowing the types of values would allow the compiler to check signatures of calls of user functions and builtins.
It would also allow me to infer the correct type for
set rather than requiring
Locals must be defined at the start of a function. This means implementing LET (scoped blocks with local variables) is not trivial. The current, single-pass, compiler would need to become far more complex to first decide how many locals are required in a function, then allocate them up-front and possibly rename them if there are clashes.
The holy grail of compiler writers is that they rewrite their compiler in their new language. Common Lisp is so far above
the level of
wat that this would be a large undertaking. First I'd need to write malloc...
Setting up sugar/common lisp may be trivial or hard depending on how familiar with common lisp you are :)
Installing common lisp (I use SBCL) is outside the scope of this post. (Tell me it's harder than npm ...)
I refer you to [Zach Beane's] https://www.quicklisp.org/beta/
SBCL might even be in your package manager:
$ pacman -Ss sbcl
sugar is run as a unix script. I've made a custom sbcl core since I prefer
Here is a very brief explanation of how to make such a core, starting with plain, vanilla sbcl.
$ sbcl * (ql:quickload '#:iterate) * (sb-ext:save-lisp-and-die "sbcl-iterate") $ sudo mv sbcl-iterate /usr/share/sbcl/sbcl-iterate
[2020-12-07] My original compiler structure had a macro expansion bug. I was misled since I was in a block where I had just checked that (form) was a consp/list and I assumed this remained true inside that block. However, I missed the fact that I am modifying the form inside the block, with (expand-macro). It's possible (expand-macro) returns a non-list e.g. a symbol. Then my list assumptions break.
compile form: ... list/s-expr/cons => while: (is-macro? form) form := (expand-macro form) <---- bug: this can change form into a non-list is-special? form => ... is-function? => ...
First I fixed this with an extra list check in is-macro and also in the is-special/is-function parts, and a recursive call to compile for non-list forms.
I later thought if I had used a functional approach and avoided the mutation, I would not have made this mistake. My visual hint that I was dealing with a list would have remained true. It turns out the functional solution is even more elegant than the original and doesn't have the bug (+1 to functional purists).
The functional solution, doesn't loop, it recurses on compile. Then compile gets the chance to decide again what type of form expand-macro returned (nested macros will be recursively expanded).
compile form: ... list/s-expr/cons => if: (is-macro? form) (compile (expand-macro form)) <--- recurse, don't loop, no mutation else: is-special?/is-function? ...
I have backported the fix into the attached compiler source code: /src/sugarI discovered the bug while implementing the base cases for shortcut and/or i.e. (and x) and (or x).