I've been reading the recent posts on CodeTalker with interest. I've written a handful of parsers using two different parser generators for Python: PLY and SimpleParse. My most recent work with parsing has had me gravitating toward SimpleParse so I thought I'd see how it stacks up against CodeTalker.
First I checked the web to see if someone had written a JSON parser using SimpleParse. I found Rob Lanphier's JsonOrder. It at least had a grammar that I could yank as a jumping off point.
The result after about an hour of coding and benchmarking is spjson.py. At first I tried to adapt Rob's version but I switched back to SimpleParse's dispatch processor model. Pretty much the only thing that remains from JsonOrder is the tweaked grammar.
How does it measure up to CodeTalker's JSON parser?
It's a bit slower. I added a simple timeit benchmark to the spjson.py file. I used the same JSON file that Jared (the CodeTalker author) used in his benchmarks. Here are the results of running it against the latest version of CodeTalker at the time:
CodeTalker 0.0484498786926 SimpleParse 0.0623928356171
In terms of lines of code they are nearly identical. I didn't do anything fancy to omit docstrings or comments (neither module has many of either).
$ # use 'head' to strip the 'if __name__ ...' section $ head spjson.py -n 71 | grep -v '^\s*$' | wc -l 55 $ cat src/codetalker/codetalker/contrib/json.py | grep -v '^\s*$' | wc -l 55
The style is the biggest difference. SimpleParse uses an EBNF defined in a string to create the parser. CodeTalker uses an EBNF defined using Python code.
Both libraries let you write specialized processors for the grammar. This is one thing that I really value in SimpleParse over PLY. Your grammar and the code to act on the parse tree (or token stream) are neatly separated.
I'm still quite happy with SimpleParse. I like having the grammar in a nice contained EBNF rather than defined with Python + syntactic sugar. I'll probably give CodeTalker a look next time a new parsing task comes up though. It's great to see how the options for parsing in Python are expanding.