The Occasional Occurence

Customizing the Python Import System

July 31, 2008 at 10:39 PM | categories: Python, Software, work, computing, General

So I've been programming with Python since 2001 and I've never needed anything that the standard import system didn't provide - until this week. We are planning a little code reorganization for a project at work in preparation for collaboration with more developers. I wrote a simple custom importer/loader that lets a developer write

from application.widgets import foobar

instead of the longer

from application.widgets.foobar.widget import foobar

and the class foobar winds up in globals().

It's not groundbreaking functionality but it actually does add a little clarity in our situation. The whole task was made quite simple by the import hooks introduced in PEP 302 (that document is a great reference). Now, before anyone suggests that we could have just pulled the classes in via an __init__.py in the application/widgets directory, note that some components might depend on others which have not been imported yet, and thus their imports would fail.
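The work importer itself isn't shown in this post. As a rough illustration of the same shortcut, here is a minimal sketch for modern Python (3.7+) that swaps in a different technique: a module-level __getattr__ (PEP 562) in the package's __init__.py rather than a PEP 302 importer. The directory layout, the WIDGETS_INIT string and build_demo_tree are all made up for the example. Names are resolved lazily, so nothing gets imported until somebody asks for it.

```python
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical contents of application/widgets/__init__.py.
WIDGETS_INIT = textwrap.dedent('''
    import importlib

    def __getattr__(name):
        # Runs only when normal attribute lookup on the package fails.
        try:
            module = importlib.import_module("%s.%s.widget" % (__name__, name))
        except ImportError:
            raise AttributeError(name)
        widget = getattr(module, name)
        # The import above bound the subpackage to this name; rebind it
        # to the class so the short import form returns the class.
        globals()[name] = widget
        return widget
''')


def build_demo_tree(root):
    """Write a throwaway application/widgets/foobar/widget.py layout."""
    app = Path(root) / "application"
    widget_dir = app / "widgets" / "foobar"
    widget_dir.mkdir(parents=True)
    (app / "__init__.py").write_text("")
    (app / "widgets" / "__init__.py").write_text(WIDGETS_INIT)
    (widget_dir / "__init__.py").write_text("")
    (widget_dir / "widget.py").write_text("class foobar(object):\n    pass\n")


root = tempfile.mkdtemp()
build_demo_tree(root)
sys.path.insert(0, root)

from application.widgets import foobar  # the short form
print(foobar.__module__)  # application.widgets.foobar.widget
```

One wrinkle of this approach: after the first lookup, application.widgets.foobar refers to the class, while the subpackage is still reachable through sys.modules.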

Anyhow, like I said, it isn't groundbreaking, but the very fact that you can customize Python's import system is neat. I got to thinking about other ways I could hack the import system, and came up with a little web importer. I'll post the code below, only because I think it is a clever trick, not because it is something to use in the development of a Real Application.

"""
Stupid Python Trick - import modules over the web.
Author: Christian Wyglendowski
License: MIT (http://dowski.com/mit.txt)
"""

import httplib  # Python 2 module (renamed http.client in Python 3)
import imp
import sys

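def register_domain(name):
    """Register a domain such as 'dowski.com'. This also records the
    placeholder packages 'com' and 'com.dowski' that imports hang off."""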
    WebImporter.registered_domains.add(name)
    parts = reversed(name.split('.'))
    whole = []
    for part in parts:
        whole.append(part)
        WebImporter.domain_modules.add(".".join(whole))

class WebImporter(object):
    domain_modules = set()
    registered_domains = set()

    def find_module(self, fullname, path=None):
        if fullname in self.domain_modules:
            return self
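        # Only handle names whose parent is a registered placeholder package;
        # rsplit('.') without a limit would compare just the TLD.
        if fullname.rsplit('.', 1)[0] not in self.domain_modules: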
            return None
        try:
            r = self._do_request(fullname, method="HEAD")
        except ValueError:
            return None
        else:
            r.close()
            if r.status == 200:
                return self
        return None

    def load_module(self, fullname):
        if fullname in sys.modules:
            return sys.modules[fullname]
        mod = imp.new_module(fullname)
        mod.__loader__ = self
        sys.modules[fullname] = mod
        if fullname not in self.domain_modules:
            url = "http://%s%s" % self._get_host_and_path(fullname)
            mod.__file__ = url
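            r = self._do_request(fullname)
            code = r.read()
            if r.status != 200:
                # assert vanishes under -O; also drop the half-built module.
                del sys.modules[fullname]
                raise ImportError("could not fetch %s (HTTP %d)" % (url, r.status))
            exec code in mod.__dict__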
        else:
            mod.__file__ = "[fake module %r]" % fullname
            mod.__path__ = []
        return mod

    def _do_request(self, fullname, method="GET"):
        host, path = self._get_host_and_path(fullname)
        c = httplib.HTTPConnection(host)
        c.request(method, path)
        return c.getresponse()

    def _get_host_and_path(self, fullname):
        tld, domain, rest = fullname.split('.', 2)
        path = "/%s.py" % rest.replace('.', '/')
        return ".".join([domain, tld]), path

# Append rather than overwrite, so any hooks already installed keep working.
sys.meta_path.append(WebImporter())

You can use it like so:

import webimport
webimport.register_domain('dowski.com')
from com.dowski import test

That would fetch and import http://dowski.com/test.py.
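The listing above targets Python 2 (httplib, imp, the exec statement, find_module/load_module). For comparison, here is a hedged sketch of how the same trick might look on Python 3, where meta path importers implement find_spec and exec_module instead. The class name WebFinder, the injectable fetch argument and the in-memory PAGES dict are assumptions added so the example runs offline; none of them are part of the original module.

```python
import importlib.abc
import importlib.util
import sys
import urllib.request


class WebFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Python 3 sketch of the web importer (names here are hypothetical)."""

    def __init__(self, fetch=None):
        self.placeholders = set()   # "com", "com.dowski", ...
        self.hosts = {}             # "com.dowski" -> "dowski.com"
        # fetch(url) -> source text; injectable so this can be tried offline.
        self.fetch = fetch or self._fetch_http

    def register_domain(self, name):
        labels = name.split(".")[::-1]
        for i in range(1, len(labels) + 1):
            self.placeholders.add(".".join(labels[:i]))
        self.hosts[".".join(labels)] = name

    @staticmethod
    def _fetch_http(url):
        with urllib.request.urlopen(url) as response:
            return response.read().decode("utf-8")

    def _url_for(self, fullname):
        parent, _, leaf = fullname.rpartition(".")
        return "http://%s/%s.py" % (self.hosts[parent], leaf)

    def find_spec(self, fullname, path=None, target=None):
        if fullname in self.placeholders:
            # Empty package standing in for "com" or "com.dowski".
            return importlib.util.spec_from_loader(fullname, self, is_package=True)
        if fullname.rpartition(".")[0] not in self.hosts:
            return None  # not ours; let the next finder try
        return importlib.util.spec_from_loader(
            fullname, self, origin=self._url_for(fullname))

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        name = module.__name__
        if name in self.placeholders:
            return  # nothing to execute for placeholder packages
        url = self._url_for(name)
        try:
            source = self.fetch(url)
        except OSError as exc:  # urllib's URLError is an OSError subclass
            raise ImportError("could not fetch %s" % url, name=name) from exc
        exec(compile(source, url, "exec"), module.__dict__)


# Offline demonstration: a dict stands in for the web server.
PAGES = {"http://dowski.com/test.py": "answer = 42\n"}
finder = WebFinder(fetch=PAGES.__getitem__)
finder.register_domain("dowski.com")
sys.meta_path.insert(0, finder)

from com.dowski import test
print(test.answer)  # 42
```

Leaving out the fetch argument falls back to urllib.request.urlopen, which would issue a real GET request. The finder is inserted at the front of sys.meta_path because Python 3 keeps its own default finders on that list.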

There may be other Python libraries out there that do this better - I couldn't find any with a quick Google search. I can think of a number of features that would be needed for a serious implementation of something like this (caching, HTTP-AUTH, signatures, remote package support, etc). For now though I'm just throwing this out there because I think it is neat.
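Of the features listed, caching is probably the easiest to picture. Here is a minimal Python 3 sketch, assuming the download step were factored out into a hypothetical fetch(url) function that returns source text. The importer above has no such function, and cached_fetch and its cache layout are made up for illustration.

```python
import hashlib
import os
import tempfile


def cached_fetch(fetch, cache_dir):
    """Wrap fetch(url) -> str so each URL is downloaded at most once."""
    os.makedirs(cache_dir, exist_ok=True)

    def wrapper(url):
        # Hash the URL to get a safe, fixed-length file name.
        key = hashlib.sha256(url.encode("utf-8")).hexdigest()
        path = os.path.join(cache_dir, key + ".py")
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                return f.read()
        source = fetch(url)
        with open(path, "w", encoding="utf-8") as f:
            f.write(source)
        return source

    return wrapper


# Demonstration with a stand-in fetch function that records its calls.
calls = []

def fake_fetch(url):
    calls.append(url)
    return "x = 1\n"

fetch = cached_fetch(fake_fetch, tempfile.mkdtemp())
fetch("http://dowski.com/test.py")
fetch("http://dowski.com/test.py")
print(len(calls))  # 1
```

There is no expiry or invalidation here, so a changed remote file would never be picked up.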

Anyone else doing neat tricks with the import hooks that Python exposes?

cw