The Occasional Occurrence
HTTP Utilities with CherryPy
January 08, 2009 at 12:56 PM | categories: Python, Software, cherrypy, General

Eric Florenzano posted a detailed blog entry on creating fast web utilities with bare WSGI. In it he argued that the larger Python web frameworks are overkill for small utility-like applications, and he went on to build a small utility app that conforms to the WSGI spec.
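For readers who haven't seen Eric's post, a bare-WSGI counter along those lines might look something like this. This is my own reconstruction for illustration (written against modern Python), not Eric's actual code:

```python
from collections import defaultdict

# In-memory play counts, keyed by song id (illustrative sketch,
# not Eric's actual application).
counts = defaultdict(int)

def application(environ, start_response):
    """A minimal WSGI callable: GET /song/<id> returns the count,
    POST /song/<id> increments and returns it."""
    parts = environ.get('PATH_INFO', '').strip('/').split('/')
    if len(parts) == 2 and parts[0] == 'song':
        song_id = parts[1]
        if environ['REQUEST_METHOD'] == 'POST':
            counts[song_id] += 1
        body = str(counts[song_id]).encode('utf-8')
        start_response('200 OK', [('Content-Type', 'text/plain'),
                                  ('Content-Length', str(len(body)))])
        return [body]
    start_response('404 Not Found', [('Content-Type', 'text/plain')])
    return [b'Not Found']
```

Everything — routing, method dispatch, response headers — is done by hand, which is exactly the trade-off Eric is making for speed.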
I like to use CherryPy to write HTTP utility-style applications. It lets you write RESTful WSGI-compliant HTTP applications without even knowing that you are doing so. Here is Eric's song counter application rewritten using CherryPy and using the builtin MethodDispatcher.
from collections import defaultdict

import cherrypy

counts = defaultdict(int)

class SongCounts(object):
    exposed = True

    def GET(self, id):
        return str(counts[id])

    def POST(self, id):
        counts[id] += 1
        return str(counts[id])

class CounterApp(object):
    exposed = True
    song = SongCounts()

    def GET(self):
        return ','.join(['%s=%s' % (k, v) for k, v in counts.iteritems()])

    def DELETE(self):
        counts.clear()
        return 'OK'

sc_conf = {
    '/': {
        'request.dispatch': cherrypy.dispatch.MethodDispatcher(),
    },
}

application = cherrypy.tree.mount(CounterApp(), config=sc_conf)
It might just be personal preference and experience with CherryPy, but that code is much more expressive and readable to me than a raw WSGI callable. There is also what I'd call skill scalability: while you can use CherryPy to build small utilities like this, the same knowledge carries over to writing full-scale web applications once you combine it with a template engine and a database.
Of course, the main thrust of Eric's argument is that raw WSGI is faster, not more readable. Here are some benchmarks from my machine running Eric's raw WSGI app and my CherryPy WSGI app.
The specs on my machine are: IBM Thinkpad T61, Intel(R) Core(TM)2 Duo CPU T8300 @ 2.40GHz, 4GB 667 MHz DDR2 SDRAM
First Eric's app:
$ spawn -t 0 -p 8080 counter.application
$ curl -X POST -H "Content-Length:0" http://127.0.0.1:8080/song/1
$ ab -n 10000 http://127.0.0.1:8080/song/1
...
Concurrency Level:      1
Time taken for tests:   7.38927 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      1020000 bytes
HTML transferred:       10000 bytes
Requests per second:    1420.67 [#/sec] (mean)
Time per request:       0.704 [ms] (mean)
Time per request:       0.704 [ms] (mean, across all concurrent requests)
Transfer rate:          141.50 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       7
Processing:     0    0   0.5      0      11
Waiting:        0    0   0.3      0      11
Total:          0    0   0.5      0      11
Now my CherryPy app:
$ spawn -t 0 -p 8080 songcounter.application
$ curl -X POST -H "Content-Length:0" http://127.0.0.1:8080/song/1
$ ab -n 10000 http://127.0.0.1:8080/song/1
...
Concurrency Level:      1
Time taken for tests:   14.529259 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      1680000 bytes
HTML transferred:       10000 bytes
Requests per second:    688.27 [#/sec] (mean)
Time per request:       1.453 [ms] (mean)
Time per request:       1.453 [ms] (mean, across all concurrent requests)
Transfer rate:          112.88 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       8
Processing:     0    1   0.8      1      31
Waiting:        0    0   1.1      1      14
Total:          0    1   0.8      1      31
Wow! Eric's raw WSGI app is over 2x faster in req/s. Of course the measly ~688 req/s from my CherryPy application translates to over 59 million req/24hrs*. Not too shabby either. ;-)
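The back-of-the-envelope arithmetic behind those daily totals is easy to check:

```python
# Sanity-checking the requests-per-day figures quoted above.
seconds_per_day = 24 * 60 * 60  # 86400

cherrypy_per_day = round(688.27 * seconds_per_day)
wsgi_per_day = round(1420.67 * seconds_per_day)

print(cherrypy_per_day)  # 59466528  -> "over 59 million"
print(wsgi_per_day)      # 122745888 -> roughly 123 million
```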
Looking at the benchmarks side by side, raw WSGI is going to get you the most bang for your buck out of your hardware. But just as I'd rather write things in Python and drop down to a lower-level language only when speed demands it, I'd rather write my HTTP utilities in CherryPy and only fall back to raw WSGI when the need arises.
cw
* Eric's app would service over 122 million requests in the same time period. :P
UPDATE:
Ok, after taking Robert's suggestion to profile, I realized that my CherryPy application was logging to stdout in addition to Spawning's own stdout logging. Adding the following line to my application bumped the speed up to 842.34 req/s.
cherrypy.config.update({'log.screen':False})
That should bring the speed closer to what Tim Parkin measured using his restish framework.