Rodin Aarssen: Concrete Syntax with Black Box Parsers


Event Details


Meta programming is the art of writing software that takes source code as input for manipulation, analysis or code generation. Many meta programming systems reason about abstract syntax trees representing this source code, which requires intimate knowledge of the data type that describes the abstract syntax. Concrete syntax patterns allow meta programmers to create and match syntax trees using the actual concrete syntax of the object language. However, meta programming systems that support these concrete syntax patterns generally require a concrete grammar of the object language, written in their own formalism. Writing such a grammar is a daunting, error-prone task, especially for non-trivial programming languages such as C++ and Java.
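
To make the cost of working on abstract syntax concrete, here is a minimal sketch in Python (chosen only for illustration; the work itself targets Rascal, C++ and Java). It uses the standard library `ast` module to find expressions of the form `<expr> + 0`. The function name `is_add_zero` is hypothetical. Note how the matcher has to spell out node classes and field names, which is the "intimate knowledge" the abstract refers to.

```python
import ast


def is_add_zero(node: ast.AST) -> bool:
    """Abstract-syntax match for `<expr> + 0`.

    The author must know that additions are `BinOp` nodes with an `Add`
    operator, and that literals are `Constant` nodes with a `value` field.
    """
    return (
        isinstance(node, ast.BinOp)
        and isinstance(node.op, ast.Add)
        and isinstance(node.right, ast.Constant)
        and type(node.right.value) is int  # exclude booleans such as False
        and node.right.value == 0
    )


# Walk a small program and collect every node that matches.
tree = ast.parse("y = x + 0\nz = x + 1")
matches = [n for n in ast.walk(tree) if is_add_zero(n)]
```

With a concrete syntax pattern, the same intent could be written as the source fragment `e + 0`, with `e` standing for any expression, and without naming a single AST class.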

We present Concretely, a technique to augment meta programming systems with pluggable concrete syntax, reusing external, black box parsers. Concretely allows the meta programmer to use concrete syntax patterns in the absence of a concrete grammar. Additionally, we present Tympanic, a DSL to declaratively map external parsers’ AST structures to the internal data structures of the Rascal meta programming language. Algebraic data types (ADTs) for the abstract grammar, together with marshalling code that maps the external parser’s AST to the generated Rascal ADT, are automatically generated from a Tympanic specification. Tympanic allows implementors of Concretely to solve the impedance mismatch problem between Rascal’s algebraic data types and object-oriented class hierarchies in Java representing a grammar.
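
The general idea of reusing a black box parser for patterns can be sketched outside Rascal. The Python example below is an illustration under stated assumptions and is not Concretely's implementation: it treats Python's own `ast.parse` as the external parser, parses the pattern text with it, and then compares trees structurally. The `HOLE_` prefix for metavariables and the function names `match` and `find` are hypothetical conventions invented for this sketch; repeated holes are not checked for consistency.

```python
import ast

HOLE_PREFIX = "HOLE_"  # hypothetical convention: HOLE_x matches any subtree


def match(pattern: ast.AST, subject: ast.AST, bindings: dict) -> bool:
    """Structurally compare two trees produced by the same black box parser."""
    # A hole matches any subtree and records what it matched.
    if isinstance(pattern, ast.Name) and pattern.id.startswith(HOLE_PREFIX):
        bindings[pattern.id] = subject
        return True
    if type(pattern) is not type(subject):
        return False
    for field, p_val in ast.iter_fields(pattern):
        s_val = getattr(subject, field, None)
        if isinstance(p_val, ast.AST):
            if not isinstance(s_val, ast.AST) or not match(p_val, s_val, bindings):
                return False
        elif isinstance(p_val, list):
            if not isinstance(s_val, list) or len(p_val) != len(s_val):
                return False
            for p, s in zip(p_val, s_val):
                if isinstance(p, ast.AST):
                    if not isinstance(s, ast.AST) or not match(p, s, bindings):
                        return False
                elif p != s:
                    return False
        elif type(p_val) is not type(s_val) or p_val != s_val:
            # Plain values (identifiers, literals) must agree exactly.
            return False
    return True


def find(pattern_src: str, subject_src: str) -> list:
    """Parse the pattern with the external parser, then match it everywhere."""
    pattern = ast.parse(pattern_src, mode="eval").body
    results = []
    for node in ast.walk(ast.parse(subject_src)):
        bindings = {}
        if match(pattern, node, bindings):
            results.append({k: ast.unparse(v) for k, v in bindings.items()})
    return results


found = find("HOLE_e + 0", "y = (a * b) + 0\nz = c + 1")
```

The pattern `HOLE_e + 0` is ordinary source text, so no grammar had to be written in the meta programming system's own formalism; the only language-specific part is the call to the existing parser. `ast.unparse` requires Python 3.9 or later.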

We show that for realistic programming languages, such as C++ and Java, the effort of adding support for concrete syntax patterns with Concretely is on the order of dozens of source lines of code (SLOC). Similarly, we show that using Tympanic for grammar mapping yields a significant reduction in SLOC compared to manually implementing the AST data types and marshalling code.