Posts

ccgo/v4 experiment: Trying the new runtime.Pinner

tl;dr: Looking forward to future Pinner.Pin performance improvements. The upcoming Go version 1.21, scheduled for release next month, is currently available for download as Go 1.21rc2 in the "Unstable version" section here. Go 1.21 introduces a new runtime type, Pinner. ccgo/v4, the next, also not yet released version of the C to Go transpiler, uses pinning to "freeze" the addresses of local Go variables whose addresses are passed around in the original C code. ccgo produces Go code where any C pointer points to memory not managed by the Go runtime. So ccgo simply puts such "escaping" variables in memory not visible to the garbage collector, with stable, immovable addresses. Those are provided by the modernc.org/memory package. Otherwise a goroutine stack resizing can change the address of a local variable. Another problem ccgo has to solve is the runtime pointer validity checks. I'm not aware of the details being documented somewhere outside of
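As a rough illustration of the pinning idea mentioned in this excerpt, here is a minimal sketch of runtime.Pinner as exposed by Go 1.21. The function name pinnedAddressStable is hypothetical and this is not the code ccgo/v4 generates; it only shows the Pin/Unpin pairing and that the object's address can be observed as a plain integer while pinned.

```go
package main

import (
	"fmt"
	"runtime"
	"unsafe"
)

// pinnedAddressStable is a hypothetical helper, not part of ccgo. It pins a
// Go object, forces a garbage collection cycle, and reports whether the
// object's address and value are unchanged afterwards.
func pinnedAddressStable() bool {
	var pinner runtime.Pinner
	defer pinner.Unpin() // releases every object pinned through this Pinner

	v := new(int32)
	*v = 42
	pinner.Pin(v) // v must not be moved or freed until Unpin is called

	before := uintptr(unsafe.Pointer(v))
	runtime.GC()
	after := uintptr(unsafe.Pointer(v))

	return before == after && *v == 42
}

func main() {
	fmt.Println(pinnedAddressStable())
}
```

Requires Go 1.21 or later. Every Pin should be matched by an Unpin on the same Pinner, which is why the sketch defers it right after declaring the Pinner.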

Producing a Go scanner in 1,219 bytes of code

modernc.org/rec is a regexp to Go code compiler tool. It is still a bit rough around the edges. For example, it converts the regexps to a DFA, but it does not yet support intersecting character classes ending up in the same DFA state. Anyway, rec can already handle some nontrivial tasks, like generating a usable, working Go scanner. Here are those 1,219 bytes, in scanner.sh. The shell script is used in the generate target of the Makefile. Note the Perl Unicode character classes in the regexp for Go identifiers. The respective EBNF lexical grammar part of the Go specification is identifier = letter { letter | unicode_digit } . The above production expands and translates to the regexp (\pL|_)(\pL|_|\p{Nd})* . As mentioned above, constructing DFAs from regexps using character classes is a bit challenging per se when considering Unicode. A similar program, lx(1), part of libfsm, seems to not support Unicode so far, possibly because it faces similar difficulties. The resulting sca
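To make the identifier regexp from this excerpt concrete, the sketch below tries it with Go's standard regexp package, which accepts the same \pL and \p{Nd} Unicode classes. This is not how rec is used: rec compiles the regexp to Go code ahead of time, while this sketch interprets it at run time. The ^ and $ anchors and the name isIdentifier are additions for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// identRE is the identifier regexp from the post, anchored so that the whole
// input has to match. The anchors are an addition made for this sketch.
var identRE = regexp.MustCompile(`^(\pL|_)(\pL|_|\p{Nd})*$`)

// isIdentifier is a hypothetical helper that reports whether s has the
// lexical shape of a Go identifier. It does not exclude keywords.
func isIdentifier(s string) bool {
	return identRE.MatchString(s)
}

func main() {
	for _, s := range []string{"foo_1", "_", "αβγ", "1foo", "a-b", ""} {
		fmt.Printf("%q: %v\n", s, isIdentifier(s))
	}
}
```

Only the first three inputs are accepted: "1foo" starts with a digit, "a-b" contains a character that is neither a letter nor a digit, and the empty string has no leading letter.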

Producing a Go parser from the language specification mechanically, mostly

There's yet another, possibly never-to-be-completed Go front end, modernc.org/gc/v3. This time I'm trying something new with respect to the Go parser. It takes three main steps, initially.

1. Extract the EBNF specification from the language specs. The unmodified EBNF grammar is not a well-formed PEG grammar:

    $ go test -v -run Spec
    === RUN   TestSpecEBNF
        all_test.go:68: left recursive: Expression
        all_test.go:68: left recursive: PrimaryExpr
    --- PASS: TestSpecEBNF (0.01s)
    PASS
    ok  modernc.org/gc/v3/internal/ebnf 0.011s
    $

2. Manually rewrite spec.ebnf to peg.ebnf. The goals are to

- Remove the left recursion.
- Reorder the terms to obtain the correct parse tree.
- Rewrite selected parts of the grammar to get the backtracking on a large corpus of Go code down to something like, say, an acceptable* 10% on average. (* - acceptable as a starting point)

For the last part a PEG interpreter, or actually in this case an EBNF interpreter, is needed. To clarify, a particular PEG grammar can be used to g
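The left recursion check reported in step 1 can be sketched roughly as follows. This is not the code in modernc.org/gc/v3/internal/ebnf; the grammar type, the leftRecursive function and the four-production fragment are assumptions made for illustration, and the fragment is a heavily simplified subset of the Go specification's expression grammar.

```go
package main

import (
	"fmt"
	"sort"
)

// grammar is a hypothetical representation: a production name maps to its
// alternatives, and each alternative is a sequence of symbols. A symbol that
// is a key of the map is a nonterminal, anything else is a terminal.
type grammar map[string][][]string

// leftRecursive returns, sorted, the productions that can reach themselves
// through the leftmost symbol of some alternative. Simplification: only the
// first symbol of each alternative is followed, so no production is assumed
// to match the empty string.
func leftRecursive(g grammar) []string {
	var out []string
	for name := range g {
		seen := map[string]bool{}
		var reach func(n string) bool
		reach = func(n string) bool {
			for _, alt := range g[n] {
				if len(alt) == 0 {
					continue
				}
				first := alt[0]
				if first == name {
					return true
				}
				if _, ok := g[first]; ok && !seen[first] {
					seen[first] = true
					if reach(first) {
						return true
					}
				}
			}
			return false
		}
		if reach(name) {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Simplified fragment, for illustration only.
	g := grammar{
		"Expression":  {{"UnaryExpr"}, {"Expression", "binary_op", "Expression"}},
		"UnaryExpr":   {{"PrimaryExpr"}, {"unary_op", "UnaryExpr"}},
		"PrimaryExpr": {{"Operand"}, {"PrimaryExpr", "Selector"}},
		"Operand":     {{"identifier"}},
	}
	fmt.Println(leftRecursive(g)) // prints [Expression PrimaryExpr]
}
```

A PEG parser would loop forever on such productions, which is why the manual rewrite has to remove them first, typically by turning `A = A x | y` into the repetition `A = y { x }`.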