The Occasional Occurence

More On Organic Code

November 29, 2010 at 02:07 PM | categories: Software, General

I blogged a couple years ago about Code Farming (tl;dr - the web is a more organic programming environment than packaged software). That post was mostly a commentary on another post I had read.

This past weekend I read "The Biology of Sloppy Code". It touches on some of those same ideas that inspired me to post about Code Farming.

Despite the title, it's less about sloppy code than it is about attempting to categorize different types of software development according to more traditional sciences.

I don't have any specific comments on the article, other than suggesting you read it. I will say that I particularly enjoyed the references to abstraction, Play-Doh and a few zingers the author threw in (real programmers vs. sissy Nancy-boy bedwetters).

A JSON Parser Using SimpleParse

July 21, 2010 at 02:31 PM | categories: Python, Software, computing, General

I've been reading the recent posts on CodeTalker with interest. I've written a handful of parsers using two different parser generators for Python: PLY and SimpleParse. My most recent work with parsing has had me gravitating toward SimpleParse so I thought I'd see how it stacks up against CodeTalker.

First I checked the web to see if someone had written a JSON parser using SimpleParse. I found Rob Lanphier's JsonOrder. It at least had a grammar that I could yank as a jumping-off point.

The result after about an hour of coding and benchmarking is spjson.py. At first I tried to adapt Rob's version but I switched back to SimpleParse's dispatch processor model. Pretty much the only thing that remains from JsonOrder is the tweaked grammar.

How does it measure up to CodeTalker's JSON parser?

It's a bit slower. I added a simple timeit benchmark to the spjson.py file. I used the same JSON file that Jared (the CodeTalker author) used in his benchmarks. Here are the results of running it against the latest version of CodeTalker at the time:

CodeTalker 0.0484498786926
SimpleParse 0.0623928356171
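For readers who want to run a similar comparison, here is a minimal sketch of a timeit harness. It is not the benchmark from spjson.py: the `benchmark` helper and the `SAMPLE` document are made up for illustration, and the standard library's `json.loads` stands in for the real parsers, so the numbers it prints are not comparable to the ones above.

```python
import json
import timeit

# Stand-in input. The original benchmark used a JSON file that is not
# reproduced here, so this small document is purely illustrative.
SAMPLE = json.dumps({
    "name": "example",
    "values": list(range(100)),
    "nested": {"ok": True, "nothing": None},
})


def benchmark(parsers, text, number=100):
    """Time each parser callable on the same text.

    `parsers` maps a label to a one-argument callable. Returns a dict
    mapping each label to the total seconds taken for `number` calls.
    """
    results = {}
    for label, parse in parsers.items():
        # Bind `parse` as a default argument so each lambda keeps its own parser.
        timer = timeit.Timer(lambda parse=parse: parse(text))
        results[label] = timer.timeit(number=number)
    return results


if __name__ == "__main__":
    # Assumption: swap in the parse functions you actually want to compare.
    parsers = {"stdlib json": json.loads}
    for label, seconds in benchmark(parsers, SAMPLE).items():
        print(label, seconds)
```

Feeding every parser the same text and the same `number` of iterations is what makes the timings comparable with each other.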

In terms of lines of code they are nearly identical. I didn't do anything fancy to omit docstrings or comments (neither module has many of either).

$ # use 'head' to strip the 'if __name__ ...' section
$ head spjson.py -n 71 | grep -v '^\s*$' | wc -l
55
$ cat src/codetalker/codetalker/contrib/json.py | grep -v '^\s*$' | wc -l
55

The style is the biggest difference. SimpleParse uses an EBNF defined in a string to create the parser. CodeTalker uses an EBNF defined using Python code.

Both libraries let you write specialized processors for the grammar. This is one thing that I really value in SimpleParse over PLY. Your grammar and the code to act on the parse tree (or token stream) are neatly separated.

I'm still quite happy with SimpleParse. I like having the grammar in a nice contained EBNF rather than defined with Python + syntactic sugar. I'll probably give CodeTalker a look next time a new parsing task comes up though. It's great to see how the options for parsing in Python are expanding.

Rebasing to a New Branch with Mercurial

July 09, 2010 at 08:53 AM | categories: Python, Software, work, computing, General

I had a situation at work the other day where I had made a number of local commits to the default branch of my repo. I wanted to push them upstream to our central server but the feature was incomplete and I didn't want to break anything in case someone needed to make a tweak to the current code in the default branch.

I had the idea to use the hg rebase command (http://mercurial.selenic.com/wiki/RebaseProject) to move all my local commits to a new branch before pushing. It worked, and here's how I did it.

1. hg clone localrepo temp-localrepo. I always try crazy ideas in another local clone in case I trash the repo.
2. hg up to the revision before my local commits that I want to put in a branch.
3. hg branch newbranchname && hg ci -m "Branching for reason foo." This creates a new branch and head that can be used as a rebase destination.
4. Now for the rebase. I needed to rebase the first changeset in my series of local changes onto the newbranchname changeset. Something like hg rebase --source 94 --dest 105.

That's about it. After verifying that it did what I wanted all that was left was to repeat the steps in my main local repo and push to the central server.*

* Of course what I actually did was push these changes to my main local repo and then to the central repo. Whoops. That left me with the changes both in default and in newbranchname.

Don't do that. Repeat the steps after you've tried them in another local clone. I leave recovering from such a situation with hg strip (http://mercurial.selenic.com/wiki/Strip) as an exercise for the reader.
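The update, branch, commit, and rebase steps can be sketched as a small Python helper that builds the hg command lines and, by default, only prints them. This is an illustration under a few assumptions: the helper names are hypothetical, revision 93 is a guess at "the revision before my local commits", the rebase extension is enabled in your hgrc, and the branch name is passed as the rebase destination instead of a revision number like 105.

```python
import shlex
import subprocess


def rebase_to_branch_commands(base_rev, first_local_rev, branch, message):
    """Build the hg commands that move local commits onto a new branch.

    base_rev        -- revision just before the local commits
    first_local_rev -- first changeset in the series of local commits
    branch          -- name of the new branch to create
    message         -- commit message for the branch-opening changeset
    """
    return [
        # Go back to the revision the local work started from.
        ["hg", "update", "--rev", str(base_rev)],
        # Open the new branch and commit, giving the rebase a destination head.
        ["hg", "branch", branch],
        ["hg", "commit", "-m", message],
        # Move the first local changeset and its descendants onto that head.
        # Assumption: the branch name is used to identify the new head.
        ["hg", "rebase", "--source", str(first_local_rev), "--dest", branch],
    ]


def run(commands, cwd=".", dry_run=True):
    """Print each command and execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(shlex.quote(arg) for arg in argv))
        if not dry_run:
            subprocess.check_call(argv, cwd=cwd)


if __name__ == "__main__":
    # Hypothetical revision numbers; try this in a throwaway clone first.
    plan = rebase_to_branch_commands(93, 94, "newbranchname",
                                     "Branching for reason foo.")
    run(plan, cwd="temp-localrepo", dry_run=True)
```

Keeping `dry_run=True` as the default matches the advice above: look at what would happen, in a scratch clone, before touching the main repo.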

Getting Version Information from Mercurial

May 13, 2010 at 04:52 PM | categories: Python, work, computing, Software

In an application I'm building at work I want to be able to display various bits of version information in the UI. This goes for both production deployments from Python egg files and development runs straight out of the repository.

We use Mercurial for revision control so it is a logical choice for a version information source. The result of some hacking is my first Mercurial extension and commit hook.

Mercurial Version Info Plugin.

See the README file rendered at that link for the details. Click here to download version 1.0.

My Take on Multiple Constructors

March 17, 2010 at 05:38 PM | categories: Python, Software, computing, General

I noticed the same question on c.l.p that Steve Ferg responded to on his blog. I was feeling too lazy to respond to the thread earlier but I thought I'd throw my idea up on the ol' blog before wrapping up for the day.

I think this is a classic use-case for class methods. Here is my implementation.

class Vector(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    @classmethod
    def from_sequence(cls, sequence):
        return cls(*sequence)

    @classmethod
    def from_vector(cls, vec):
        return cls(vec.x, vec.y, vec.z)

    def __repr__(self):
        return "Vector(%s, %s, %s)" % (self.x, self.y, self.z)

if __name__ == '__main__':
    # Single-argument print() calls behave the same on Python 2 and Python 3.
    print(Vector(1, 2, 3))
    print(Vector.from_sequence([4, 5, 6]))
    print(Vector.from_sequence((7, 8, 9)))
    v = Vector(10, 11, 12)
    print(Vector.from_vector(v))

Here is the output from running the script:

Vector(1, 2, 3)
Vector(4, 5, 6)
Vector(7, 8, 9)
Vector(10, 11, 12)

I like the classmethod route because it is obvious what the code is doing, it makes it easy to add new from_* methods, and it keeps the general __init__ method clean.
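To illustrate how cheap a new from_* method is, here is a small sketch that adds one more alternate constructor. The from_string method and the Velocity subclass are hypothetical additions for this example, not part of the original script. Because each classmethod builds its result through cls, a subclass gets all the alternate constructors for free.

```python
class Vector(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    @classmethod
    def from_sequence(cls, sequence):
        return cls(*sequence)

    @classmethod
    def from_string(cls, text, sep=","):
        """Hypothetical extra constructor: parse text such as '1,2,3'."""
        return cls.from_sequence(float(part) for part in text.split(sep))

    def __repr__(self):
        # Use the runtime class name so subclasses print their own name.
        return "%s(%s, %s, %s)" % (type(self).__name__, self.x, self.y, self.z)


class Velocity(Vector):
    """Hypothetical subclass; it inherits every from_* constructor."""


if __name__ == '__main__':
    print(Vector.from_string("1,2,3"))       # Vector(1.0, 2.0, 3.0)
    print(Velocity.from_sequence([4, 5, 6])) # Velocity(4, 5, 6)
```

The new method never touches __init__, which stays a plain three-argument constructor.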