Modern C

Posts

Taming the Flat AST: Ergonomics in the Age of Zero Allocations

- February 03, 2026

Last week, I introduced egg , a modern LL(1) parser generator for Go. The response from the community was insightful, particularly regarding egg 's most distinct feature: the flat AST. To achieve high performance and low garbage collection pressure, egg encodes the entire Abstract Syntax Tree (AST) into a single []int32 slice. While this is excellent for the CPU cache, it can be intimidating for the developer. One commenter on the Gophers Slack put it perfectly: "The 'Traversing the AST' section really dissuades me from using this tool... I think coming up with a clean, well-typed API might be harder than I think." They were right. In my previous post, I showed a raw recursive walker that involved manual index arithmetic. It was fast, but it wasn't pretty. However, shortly after that post, I implemented ImportEBNF —a feature allowing egg to bootstrap itself by reading EBNF from HTML files. In doing so, I had to "eat my own dog food"...

Building a High-Performance JSON Parser in Go with Egg

- January 30, 2026

Egg: A Modern LL(1) Parser Generator for Go Parsing is a staple of computer science, yet finding the right tool for the job in Go often feels like a choice between heavy, complex frameworks or writing a brittle parser by hand. Enter egg (Expression Grammar Generator). egg is a new parser generator that prioritizes simplicity, performance, and Go-native conventions. It takes an EBNF grammar—very similar to the one used in the Go language specification—and generates a fast, dependency-light recursive descent parser. In this post, we’ll build a fully functional JSON parser from scratch to demonstrate how egg works, and we'll dive deep into its most controversial and powerful feature: its flat AST encoding. The Grammar: Familiar and Concise One of egg 's strengths is that it doesn't invent a complex new syntax. If you've read the Go language spec, you already know how to write an egg grammar. It combines lexical definitions (regex and literals) and syntax de...

ccgo/v4 experiment: Trying the new runtime.Pinner

- July 13, 2023

tl;dr: Looking forward future Pinner.Pin performance improvements. The upcoming Go version 1.21, scheduled for release next month, is currently available for download as Go 1.21rc2 in the "Unstable version" section here . Go 1.21 introduces a new runtime type, Pinner . ccgo/v4, the next, also not yet released version of the C to Go transpiler, uses pinning to "freeze" addresses of local Go variables, addresses of which are passed around in the original C code. ccgo produces Go code where any C pointer points to memory not managed by the Go runtime. So ccgo simply puts such "escaping" variables in the memory not visible to the garbage collector, with stable, immovable addresses. Those are provided by the modernc.org/memory package. Otherwise a goroutine stack resizing can change the address of a local variable. Another problem ccgo has to solve are the runtime pointer validity checks. I'm not aware of the details being documented somewhere outside of...

Producing a Go scanner in 1,219 bytes of code

- May 14, 2023

modernc.org/rec is a regexp to Go code compiler tool. It is still a bit rough around the edges. For example, it converts the regexps to a DFA, but it does not yet support intersecting character classes ending up in the same DFA state. Anyway, rec can already handle some nontrivial tasks, like generating a usable, working Go scanner. Here are those 1,219 bytes - in scanner.sh . The shell script is used in the generate target of the Makefile . Note the Perl Unicode character classes in the regexp for Go indentifiers. The respective EBNF lexical grammar part of Go specification is identifier = letter { letter | unicode_digit } . The above production expands and translates to the regexp (\pL|_)(\pL|_|\p{Nd})* . As mentioned above, constructing DFAs from regexps using character classes is a bit challenging per se when considering Unicode. A similar program, lx(1) , part of libfsm , seems to not support Unicode so far, possibly because facing similar difficulties. The...

Producing a Go parser from the language specification mechanically, mostly

- November 25, 2022

There's yet another, possibly never-to-be-completed Go front end, modernc.org/gc/v3 . This time I'm trying something new with respect to the Go parser. It takes three main steps, initially. 1. Extract the EBNF specification from the language specs . The unmodified EBNF grammar is not a well-formed PEG grammar : $ go test -v -run Spec === RUN TestSpecEBNF all_test.go:68: left recursive: Expression all_test.go:68: left recursive: PrimaryExpr --- PASS: TestSpecEBNF (0.01s) PASS ok modernc.org/gc/v3/internal/ebnf 0.011s $ 2. Manually rewrite spec.ebnf to peg.ebnf The goals are to Remove the left recursion Reorder the terms to obtain the correct parse tree. Rewrite selected parts of the grammar to get the backtracing on a large corpus of Go code to something like, say acceptable* 10% on average. (* - acceptable as a starting point) For the last part a PEG, actually in this case an EBNF interpreter is needed . To clarify, ...

Search This Blog