Parsing
Gnome-class obtains a TokenStream
from the Rust compiler in the
entry point for the procedural macro, and parses that stream of tokens
into an Abstract Syntax Tree (AST). We use the syn
crate for the
parsing machinery: it is able to parse arbitrary Rust code, and
allows creating new parsers for our extensions to the language.
Overview of the Abstract Syntax Tree (AST)
The AST is defined in src/ast.rs
. The AST is intended to match the
user's code pretty much verbatim. For example, consider a call like this:
# #![allow(unused_variables)] #fn main() { gobject_gen! { class Counter { f: Cell<u32> } impl Counter { pub fn add(&self, x: u32) -> u32 { self.get_priv().f.set(self.get() + x); self.get() } pub fn get(&self) -> u32 { self.get_priv().f.get() } } } #}
First, there is the actual invocation of the gobject_gen!
macro. It
has two items, a class
and an impl
. Even though Rust does not
have a class
item by itself, we use the same terminology to indicate
that this is a toplevel thing in the user's code. (FIXME: replace
"thing" with something more meaningful?)
The contents of the gobject_gen!
invocation will be parsed into the
following; see src/ast.rs
for the actual definitions of these
structs/enums:
Program {
items: [
Item::Class(
Class {
name: Ident("Counter"),
extends: None,
fields: FieldsNamed {
brace_token: Brace,
named: Punctuated { /* the "f: Cell<u32>" field goes here */ }
}
}
),
Item::Impl(
Impl {
trait_: None,
self_path: Ident("Counter"),
items: [
ImplItem {
attrs: [empty vector],
node: ImplItemKind::Method(
ImplItemMethod {
public: true,
virtual_: false,
signal: false,
name: Ident("add"),
inputs: Punctuated {...},
output: ReturnType, // u32
body: Some(Block {...}),
}
),
node: ImplItemKind::Method(
ImplItemMethod {
public: true,
virtual_: false,
signal: false,
name: Ident("get"),
inputs: Punctuated {...},
output: ReturnType, // u32
body: Some(Block {...}),
}
),
}
],
}
),
],
}
Whew! Fortunately, within the parsing functions we only need to deal with one thing at a time, and not the entire tree of code.
In summary: the macro call that looks like
gobject_gen! {
class Counter {
... field definitions for the per-instance private struct ...
}
impl Counter {
... two method definitions ...
}
}
gets parsed into
Program {
items: [
Item::Class(
Class {
name: Ident("Counter"),
items: a syn::Punctuated that contains
an f member of type Cell<u32> ...
]
}
),
Item::Impl(
Impl {
self_path: Ident("Counter"),
items: [
... two ImplItemKind::Method ...
],
}
),
],
}
That is, we parse the invocation above into. a Program
with two
items, an Item::Class
and an Item::Impl
. In turn, each of these
items has a detailed description of the corresponding constructs.
The parsing process
Gnome-class uses the syn
crate to parse a TokenStream
into our AST
structures. To define a parser for SomeStruct
, one creates an impl Synom for SomeStruct
. The Synom
trait has a parse
method; Syn
provides a set of parser combinators that let one "fill out" the
resulting structs by recursively parsing their fields.
Parser combinators are recursive-descent parsers that let one compose
big parsers from small parsers. Syn implements parser combinators
with macros similar to the nom
crate. We won't go into a full
description of how syn works here, and just focus on the peculiarities
of gnome-class. (FIXME: link to syn/nom docs)
The parsing code — the bunch of impl Synom
and parser combinators
that gnome-class uses — is in parser/mod.rs
.
We define parsers for the constructs in the gobject_gen!
macro that
are not normally part of Rust, like the class
item and the signal
keyword. In the deep part of these structures, we use plain Syn
structs like syn::FnArg
to represent function arguments, or
syn::Ident
for identifiers.