WebAsm: Forth with Lisp Syntax

2020-12-03

The combination of forth semantics and s-expr syntax makes the WebAsm text format (WAT/.wat) probably the most powerful programming language extant today. Soon I expect it to obviate all other languages. (joke)

There is a common myth floating around that webasm is just a binary format for a javascript AST (abstract syntax tree). That is really selling webasm short.

forth

webasm is like forth insofar as it is a stack machine: instructions pop their operands from an implicit value stack and push their results back onto it. There are no registers in the webasm virtual machine.

Unlike forth, webasm functions have explicit parameters and local variables, and return only a single value (currently). webasm also lacks forth's rich set of words for stack manipulation [dup swap roll ...].

lisp

webasm's text format uses lisp/s-expr syntax. The language syntax also has many lisp style influences. Impressive! Finally, someone in a position of power has realized that we should be using s-expr's for everything and discarding xml and json.

static typing

All webasm globals, locals, parameters and functions are strongly, statically typed. The core types are four simple numeric ones, two integer and two float [types::=i32,i64,f32,f64], plus reference types (funcref, and externref where the newer reference-types extension is supported).

Note the absence of strings or even arrays. No unsigned types either: the integer types are sign-agnostic, and the instructions where signedness matters come in signed and unsigned variants [right-shift: i32.shr_s/i32.shr_u, greater-equal: i32.ge_s/i32.ge_u ...].
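To make the "same bits, two readings" idea concrete, here is a small javascript sketch. The helper names (ge_s, ge_u, shr_s, shr_u) are only borrowed from the wasm instruction names for illustration; they are plain js stand-ins, not anything the WebAssembly API provides.

```javascript
// The same 32 bits, read two ways.
const bits = 0xffffffff;
const asSigned = bits | 0;     // two's-complement reading: -1
const asUnsigned = bits >>> 0; // unsigned reading: 4294967295

// Toy stand-ins for the signed/unsigned instruction pairs.
// Comparisons yield 1 or 0, as wasm comparisons do.
const ge_s = (a, b) => ((a | 0) >= (b | 0) ? 1 : 0);
const ge_u = (a, b) => ((a >>> 0) >= (b >>> 0) ? 1 : 0);
const shr_s = (a, n) => a >> (n & 31);          // arithmetic shift, keeps the sign bit
const shr_u = (a, n) => (a >>> (n & 31)) | 0;   // logical shift, fills with zeros

console.log(ge_s(-1, 1), ge_u(-1, 1));     // the two variants disagree on -1 vs 1
console.log(shr_s(-1, 28), shr_u(-1, 28)); // and so do the two shifts
```

The value itself carries no sign; only the instruction you pick decides how the top bit is interpreted.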

memory

wasm memory layout is like what you find on a simple 80's computer or a not-too-modern microcontroller. The memory starts at zero and proceeds to a specified max value (the size is dynamically growable in 64KiB pages, cf. Unix's brk()).
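A minimal sketch of growing a memory from the javascript side, using the standard WebAssembly.Memory object. The sizes (1 page initially, 4 pages maximum) are arbitrary values picked for illustration.

```javascript
const PAGE_SIZE = 64 * 1024; // one wasm page is 64 KiB

// Illustrative limits only: start with 1 page, allow at most 4.
const mem = new WebAssembly.Memory({ initial: 1, maximum: 4 });
const oldBuffer = mem.buffer;
const bytesBefore = oldBuffer.byteLength;

// grow() takes a page count and returns the previous size, in pages.
const previousPages = mem.grow(1);
const bytesAfter = mem.buffer.byteLength;

// Growing detaches the old ArrayBuffer, so always re-read mem.buffer.
const detachedLength = oldBuffer.byteLength;

// Growing past the declared maximum fails with a RangeError.
let growFailed = false;
try {
  mem.grow(10);
} catch (e) {
  growFailed = e instanceof RangeError;
}

console.log(bytesBefore / PAGE_SIZE, bytesAfter / PAGE_SIZE, detachedLength, growFailed);
```

The detached-buffer behaviour is the usual trap: any typed array created over the old buffer silently becomes zero-length after a grow.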

Not even malloc() is provided.

Memory pointers are just i32/i64 integer offsets in the memory array (exactly the same as real pointers ...).
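Seen from javascript, a wasm "pointer" is just a number you use to index the memory's ArrayBuffer. The sketch below assumes a hypothetical address of 1024 and uses a DataView; wasm memory is little-endian, hence the `true` flag.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 });
const view = new DataView(memory.buffer);

const ptr = 1024; // hypothetical address, just an integer offset
view.setInt32(ptr, 0x12345678, true); // true = little-endian, as wasm stores it

// The same four bytes viewed individually: least significant byte first.
const rawBytes = Array.from(new Uint8Array(memory.buffer, ptr, 4));
const readBack = view.getInt32(ptr, true);

// "Pointer arithmetic" is ordinary integer arithmetic.
const nextWord = ptr + 4;

// An offset past the end is an error here (wasm code would trap instead).
let outOfBounds = false;
try {
  view.getInt32(memory.buffer.byteLength, true);
} catch (e) {
  outOfBounds = e instanceof RangeError;
}

console.log(rawBytes.map((b) => b.toString(16)), readBack.toString(16), nextWord, outOfBounds);
```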

assembler

wasm instructions are very much on the assembler end of the spectrum, with i32.add, f32.neg and the like. There are some slightly higher-level (more controlled) control forms like (block) and (loop). Arbitrary jumps are not possible [security feature]: a branch can only target the label of an enclosing control form. Indeed code and data are not in the same address space, and code space is completely opaque to the wasm programmer.
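If you know javascript's labeled statements, (block) and (loop) will feel familiar: branching to a block label behaves like a labeled break, and branching to a loop label behaves like a labeled continue. The sketch below is only an analogy; the function name countUp and the labels are made up for illustration.

```javascript
// Collects the integers in [start, end) using only labeled break/continue.
function countUp(start, end) {
  const visited = [];
  let i = start;
  break2: {                          // like (block $break2
    head1: for (;;) {                //   like (loop $head1
      if (i === end) break break2;   //     (br_if $break2 ...): leave the block
      visited.push(i);
      i = (i + 1) | 0;               //     i32 arithmetic wraps, hence | 0
      continue head1;                //     (br $head1): jump back to the loop head
    }
  }
  return visited;
}

console.log(countUp(3, 6));
```

One difference worth noting: a wasm (loop) does not repeat by itself. Falling off its end exits the loop, so the explicit branch back to the head is required there, whereas in the js version the `continue` is redundant.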

overall structure

The top-level form is a (module ...) which can contain globals and functions. There are also import, export, memory size specs and even type definitions.

(module
  (func $malloc (param $len i32) (result i32)
    ;; ... body returning an i32
    ))

Canonically instructions are laid out flat but the text format also allows for a much nicer nested s-expr syntax.

flat syntax (making the stack machine more obvious):

global.get $dot
local.get $len
i32.add
global.set $dot

s-expression alternative (same code)

(global.set $dot (i32.add (global.get $dot) (local.get $len)))
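The relationship between the two notations can be modelled in a few lines of javascript. This is a toy sketch, not how any real wat tool works: nested arrays stand in for s-expressions, and the names flatten, run and the env object are invented for the example. Flattening is just a post-order walk, and running the flat form is a loop over an explicit stack.

```javascript
// Nested form -> flat form: emit operands first, then the instruction itself.
function flatten(expr) {
  const [op, ...rest] = expr;
  const operands = rest.filter(Array.isArray);               // nested instructions
  const immediates = rest.filter((x) => !Array.isArray(x));  // e.g. "$dot"
  return [...operands.flatMap(flatten), [op, ...immediates]];
}

// A tiny stack machine that knows only the four instructions used above.
function run(flat, env) {
  const stack = [];
  for (const [op, arg] of flat) {
    switch (op) {
      case "global.get": stack.push(env.globals[arg]); break;
      case "local.get": stack.push(env.locals[arg]); break;
      case "global.set": env.globals[arg] = stack.pop(); break;
      case "i32.add": {
        const b = stack.pop();
        const a = stack.pop();
        stack.push((a + b) | 0); // wrap to 32 bits
        break;
      }
      default: throw new Error("unsupported instruction: " + op);
    }
  }
  return stack;
}

const nested = ["global.set", "$dot",
  ["i32.add", ["global.get", "$dot"], ["local.get", "$len"]]];
const flat = flatten(nested);
const env = { globals: { $dot: 100 }, locals: { $len: 16 } }; // made-up values
const leftover = run(flat, env);

console.log(flat.map((ins) => ins.join(" ")).join("\n"));
console.log(env.globals.$dot, leftover.length);
```

The printed flat form matches the four-line listing above, and the stack ends up empty, which is the kind of balance a wat compiler checks statically.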

Parameters, locals and even functions can all be accessed by index but they can be given optional labels too. This and the s-expr option make it a breeze to code directly in the language itself. (joke)

runtime environment

The entire wasm code is called from javascript, which has an API designed to load a wasm file and import functions from it (see below for an example).

import/export

webasm functions can be exported to javascript like this. Note how the wasm return value is implicitly whatever is on the stack at exit; wat compilers check stack arity and type.

[full code example at the end]

(func (export "addNumbers") (param $a i32) (param $b i32) (result i32)
  (i32.add (local.get $a) (local.get $b)))
// called from javascript like this (after some incantations)
var someNumber = 17;
const sum = wasm.instance.exports.addNumbers(12, someNumber);
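For a version you can paste straight into a js console or node, here is a sketch that skips the fetch step. The byte array is a hand-assembled binary that should correspond to the addNumbers function above (normally wat2wasm produces it); the variable names are mine. Because it uses the synchronous WebAssembly.Module/Instance constructors rather than instantiateStreaming, the exports hang directly off the instance.

```javascript
// Hand-assembled module: one exported function (i32, i32) -> i32.
const addBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // "\0asm", version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,       // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                                     // function section: one func of type 0
  0x07, 0x0e, 0x01, 0x0a,                                     // export section: one entry, 10-byte name
  0x61, 0x64, 0x64, 0x4e, 0x75, 0x6d, 0x62, 0x65, 0x72, 0x73, // "addNumbers"
  0x00, 0x00,                                                 // kind: function, index 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                               // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                         // local.get 0, local.get 1, i32.add, end
]);

const addModule = new WebAssembly.Module(addBytes);
const addInstance = new WebAssembly.Instance(addModule);
const addNumbers = addInstance.exports.addNumbers;

const total = addNumbers(12, 17);
// i32 arithmetic wraps around, and the result comes back as a signed number.
const wrapped = addNumbers(2147483647, 1);
// js arguments are coerced to i32 at the boundary (1.9 -> 1, "3" -> 3).
const coerced = addNumbers(1.9, "3");

console.log(total, wrapped, coerced);
```

The last two calls show where the static typing actually bites: javascript will happily pass anything, and the boundary converts it to an i32 before the wasm code sees it.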

Javascript can also export functions into wasm-space, using a 2-level object passed into the WebAssembly incantation.

// export a console.log(i32) function
var importObject = {
  console: { log: function(num) { console.log(num + " 0x" + (num>>>0).toString(16)); } },
  };
WebAssembly.instantiateStreaming(fetch("malloc.wasm"), importObject)
  .then(function(wasm){
      // ... access wasm functions in here
      });

The javascript exports must be explicitly imported by the wasm module and given a local function signature (and name). Unlike its js counterpart, the wasm function is strongly, statically typed: it takes a single i32 parameter and nothing else.

(import "console" "log" (func $js_log (param $i i32)))
...
(call $js_log (i32.const 15)) ;; example of calling out to javascript
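Here is the same round trip as a self-contained sketch. The bytes are a hand-assembled module that should be equivalent to the import line above plus one exported function, which I have called "run" for the example, whose body is just the (call $js_log (i32.const 15)). The seen array exists only so the callout can be checked.

```javascript
// Hand-assembled module: imports console.log as (i32) -> (), exports run().
const importBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // "\0asm", version 1
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00,             // type 0: (i32) -> ()
  0x60, 0x00, 0x00,                                     // type 1: () -> ()
  0x02, 0x0f, 0x01,                                     // import section: one entry
  0x07, 0x63, 0x6f, 0x6e, 0x73, 0x6f, 0x6c, 0x65,       // module name "console"
  0x03, 0x6c, 0x6f, 0x67,                               // field name "log"
  0x00, 0x00,                                           // kind: function, type 0
  0x03, 0x02, 0x01, 0x01,                               // function section: one func of type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01, // export "run" = function index 1
  0x0a, 0x08, 0x01, 0x06, 0x00,                         // code section: one body, no locals
  0x41, 0x0f, 0x10, 0x00, 0x0b,                         // i32.const 15, call 0, end
]);

const seen = []; // records every value wasm passes out
const importObject = {
  console: { log: function (num) { seen.push(num); console.log(num); } },
};

const importModule = new WebAssembly.Module(importBytes);
// The module itself can be asked what it expects to be given.
const wanted = WebAssembly.Module.imports(importModule);

const importInstance = new WebAssembly.Instance(importModule, importObject);
importInstance.exports.run(); // wasm calls out to our js function with 15

console.log(wanted, seen);
```

The two levels of importObject line up with the two strings in the wat import: first the module name ("console"), then the field name ("log").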

externref - the part of the spec we wish we had

It would be nice to have another wasm type, an opaque, read-only, javascript reference object which we could receive from javascript, store and later pass back to javascript in our callouts/callbacks.

You will miss this feature almost as soon as you write your first substantial wat program.

Alas the externref spec is not part of the original webasm spec and is a newer addition. I am not sure how widely adopted it is.
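Without externref, a common workaround (my suggestion here, not something the wasm spec provides) is to keep the javascript objects on the js side in a table and hand wasm a plain i32 handle instead. The class name HandleTable and its methods are invented for this sketch.

```javascript
// Maps small integer handles to arbitrary js objects.
// Handle 0 is never issued, so wasm code can treat 0 as "no object".
class HandleTable {
  constructor() {
    this.next = 1;
    this.objects = new Map();
  }
  // Register an object and return the i32 handle to pass into wasm.
  add(obj) {
    const handle = this.next++;
    this.objects.set(handle, obj);
    return handle;
  }
  // Look up the object when wasm passes the handle back in a callout.
  get(handle) {
    if (!this.objects.has(handle)) throw new Error("unknown handle: " + handle);
    return this.objects.get(handle);
  }
  // Drop the entry once wasm is done with it, so it can be garbage collected.
  release(handle) {
    return this.objects.delete(handle);
  }
}

const handles = new HandleTable();
const canvasLike = { name: "some js object" }; // stand-in for a DOM node, closure, etc.
const h = handles.add(canvasLike);

// wasm would store h (an i32) and later call an imported js function with it:
const resolved = handles.get(h);
const released = handles.release(h);

console.log(h, resolved === canvasLike, released);
```

The cost compared with a real reference type is that lifetime management is manual: forget to release a handle and the object leaks; release it too early and the next lookup throws.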

a more substantial example

(the full html/wat code is linked below, also as a live example)

The WAT source code for the (cheap) malloc() example function, and another (even cheaper) dump_range() function which calls back to javascript to write bytes into the debug console:

(module
  (memory (import "js" "mem") 10)
  (import "console" "log" (func $js_log (param $i i32)))
  (global $memend i32  (i32.const 4096))
  (global $sbrk (mut i32)  (i32.const 0))
  ;; ----- allocate some memory in wasm-space
  (func (export "malloc") (param $len i32) (result i32)
    (local $newbrk i32)
    (local $mem i32)
    (local.set $newbrk (i32.add (global.get $sbrk) (local.get $len)))
    (if (i32.ge_u (local.get $newbrk) (global.get $memend))
      (then (return (i32.const 0)))) ;; out of space: return 0
    (local.set $mem (global.get $sbrk))
    (global.set $sbrk (local.get $newbrk))
    (local.get $mem))
  ;; ----- write bytes in wasm-space memory using js callback
  (func (export "dump_range") (param $start i32) (param $len i32)
    (local $i i32)
    (local $end i32)
    (local.set $end (i32.add (local.get $start) (local.get $len)))
    (local.set $i (local.get $start))
    (block $break2 (loop $head1
      (br_if $break2 (i32.eq (local.get $i)  (local.get $end)))
      (call $js_log (i32.load8_u (local.get $i)))
      (local.set $i (i32.add  (i32.const 1) (local.get $i)))
      (br $head1))))
  )
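To see what this allocator does without compiling anything, here is a javascript model of the same logic. It is a sketch that mirrors the wat line by line as I read it; makeBumpAllocator is an invented name, and the sizes passed in are arbitrary.

```javascript
// js model of the wat malloc: a bump pointer that only ever moves up.
function makeBumpAllocator(memend) {
  let sbrk = 0; // mirrors (global $sbrk (mut i32) (i32.const 0))
  return function malloc(len) {
    const newbrk = (sbrk + len) | 0;                         // i32.add wraps to 32 bits
    if ((newbrk >>> 0) >= (memend >>> 0)) return 0;          // i32.ge_u, then return 0
    const mem = sbrk;
    sbrk = newbrk;
    return mem;
  };
}

const malloc = makeBumpAllocator(4096); // same limit as $memend
const first = malloc(16);    // the very first block starts at address 0
const second = malloc(16);
const tooBig = malloc(5000); // does not fit: also reports 0
const third = malloc(16);    // a failed request leaves the pointer untouched
const exact = malloc(4048);  // would end exactly at 4096: rejected by ge_u

console.log(first, second, tooBig, third, exact);
```

Two things the model makes visible: the first successful allocation and a failed one both return 0, so a caller cannot tell them apart, and a request that would exactly fill the region is refused because the comparison is greater-or-equal. There is also no free(). Cheap, as promised.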

Compile it to malloc.wasm like so:

$ wat2wasm malloc.wat
https://webassembly.github.io/wabt/doc/wat2wasm.1.html

The HTML (full file linked below) which sets up the webasm instance and calls the functions we wrote:

Live example: (open your javascript console to see things happen, e.g. press [F12] on firefox) /src/malloc.html

<script>
"use strict";
(function(){
  var wasm;
  var memory = new WebAssembly.Memory({initial:10}); // 640K, more than anyone could ever use
  var importObject = {
    console: { log: function(arg) { console.log(arg); } },
    js: { mem: memory }
    };
  WebAssembly.instantiateStreaming(fetch("malloc.wasm"), importObject)
    .then(function(_wasm){
      wasm=_wasm;
    });
  function write_seq(ptr, len){
    var bytes = new Uint8Array(memory.buffer, ptr, len);
    for(var i=0; i<len; ++i){
      bytes[i]=i;
    }
  }
  function malloc_test(e){
    const nbytes = 16;
    const ptr = wasm.instance.exports.malloc(nbytes);
    console.log("before write:");
    wasm.instance.exports.dump_range(ptr, nbytes);
    write_seq(ptr, nbytes);
    console.log("after write:");
    wasm.instance.exports.dump_range(ptr, nbytes);
  }
  runbtn.addEventListener("click", malloc_test);
})();
</script>

Tags: lisp forth webasm javascript low-level