The Occasional Occurence

Caching HTTP Responses with CherryPy

February 25, 2009 at 10:53 AM | categories: Python, Software, computing, cherrypy, General

The most basic case is very simple.

import time
import cherrypy

class WebSvc(object):
    @cherrypy.tools.caching(delay=300)
    @cherrypy.expose
    def quadruple(self, number):
        time.sleep(1) # make the real call somewhat costly
        return str(int(number) * 4)

cherrypy.quickstart(WebSvc())

That uses an in-memory cache and, with delay=300, expires items from the cache after 300 seconds (5 minutes). If you want to tweak that setting or others, you can configure the caching tool to your liking.
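As a rough sketch of what that configuration might look like, the caching tool can also be switched on through a config dict instead of the decorator. The `tools.caching.on` and `tools.caching.delay` keys follow CherryPy's usual `tools.<name>.<setting>` naming; the name `cache_conf` and the 60 second value are only illustrative assumptions.

```python
# Hypothetical config: enable the caching tool for every URL under '/'
# and expire cached responses after 60 seconds (an arbitrary example value).
cache_conf = {
    '/': {
        'tools.caching.on': True,
        'tools.caching.delay': 60,
    },
}

# With the config in place the decorator is no longer needed on the handler:
# cherrypy.quickstart(WebSvc(), config=cache_conf)
```

Keeping the setting in config rather than in a decorator makes it possible to change the expiry per deployment without touching the handler code.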

This is in response to `a post that asks if setting up caching in other web frameworks is as easy as in Ruby with Sinatra <http://slightlynew.blogspot.com/2009/02/full-web-service-with-http-caching-in-7.html>`_.

HTTP Utilities with CherryPy

January 08, 2009 at 12:56 PM | categories: Python, Software, cherrypy, General

Eric Florenzano posted a detailed blog entry on creating fast web utilities with bare WSGI. In the blog he shared that the larger Python web frameworks are overkill for small utility-like applications. He then proceeded to build a small utility app that conforms to the WSGI spec.
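For readers who have not seen bare WSGI, a minimal callable looks roughly like the sketch below. This is not Eric's code; the `application` name and the response text are assumptions made purely for illustration, and the sketch is written for Python 3.

```python
def application(environ, start_response):
    """A minimal WSGI callable: echo the requested path as plain text."""
    path = environ.get('PATH_INFO', '/')
    body = ('You asked for %s\n' % path).encode('utf-8')
    # WSGI requires the status line and a list of (name, value) header tuples.
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    # The return value is an iterable of byte strings.
    return [body]
```

Everything a framework normally does (routing, parsing parameters, picking a handler per HTTP method) has to be written by hand inside that one function.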

I like to use CherryPy to write HTTP utility-style applications. It lets you write RESTful WSGI-compliant HTTP applications without even knowing that you are doing so. Here is Eric's song counter application rewritten using CherryPy and using the builtin MethodDispatcher.

from collections import defaultdict

import cherrypy

counts = defaultdict(int)

class SongCounts(object):
    exposed = True

    def GET(self, id):
        return str(counts[id])

    def POST(self, id):
        counts[id] += 1
        return str(counts[id])


class CounterApp(object):
    exposed = True
    song = SongCounts()

    def GET(self):
        return ','.join('%s=%s' % (k, v) for k, v in counts.items())

    def DELETE(self):
        counts.clear()
        return 'OK'

sc_conf = {
    '/': {
        'request.dispatch': cherrypy.dispatch.MethodDispatcher(),
    },
}

application = cherrypy.tree.mount(CounterApp(), config=sc_conf)

It might just be personal preference and experience with CherryPy, but that code is much more expressive and readable to me than a raw WSGI callable. Another thing, related to scalability, is skill scalability. While you can use CherryPy to build small utilities like this, it is also useful for combining with a template engine and a database for writing full-scale web applications.
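The idea behind dispatching on the HTTP method can be sketched in a few lines of plain Python. This is not how CherryPy implements MethodDispatcher; `dispatch` is a hypothetical helper that only illustrates the core idea of looking up a handler named after the HTTP verb on an exposed resource.

```python
from collections import defaultdict

counts = defaultdict(int)


class SongCounts(object):
    exposed = True

    def GET(self, id):
        return str(counts[id])

    def POST(self, id):
        counts[id] += 1
        return str(counts[id])


def dispatch(resource, http_method, *args):
    """Hypothetical helper: call the method on `resource` named after the verb."""
    handler = getattr(resource, http_method.upper(), None)
    if not getattr(resource, 'exposed', False) or handler is None:
        # No handler for this verb on this resource.
        return '405 Method Not Allowed'
    return handler(*args)


song = SongCounts()
print(dispatch(song, 'POST', '1'))    # increments the counter for song '1'
print(dispatch(song, 'GET', '1'))     # reads it back
print(dispatch(song, 'DELETE', '1'))  # SongCounts defines no DELETE
```

Because each verb maps to a method name, adding support for another verb is just a matter of adding another method to the class.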

Of course, the main thrust of Eric's argument is that raw WSGI is faster, not more readable. Here are some benchmarks from my machine running Eric's raw WSGI app and my CherryPy WSGI app.

The specs on my machine are: IBM Thinkpad T61, Intel(R) Core(TM)2 Duo CPU T8300 @ 2.40GHz, 4GB 667 MHz DDR2 SDRAM

First Eric's app:

$ spawn -t 0 -p 8080 counter.application
$ curl -X POST -H "Content-Length:0" http://127.0.0.1:8080/song/1
$ ab -n 10000 http://127.0.0.1:8080/song/1
...
Concurrency Level:      1
Time taken for tests:   7.038927 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      1020000 bytes
HTML transferred:       10000 bytes
Requests per second:    1420.67 [#/sec] (mean)
Time per request:       0.704 [ms] (mean)
Time per request:       0.704 [ms] (mean, across all concurrent requests)
Transfer rate:          141.50 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       7
Processing:     0    0   0.5      0      11
Waiting:        0    0   0.3      0      11
Total:          0    0   0.5      0      11

Now my CherryPy app:

$ spawn -t 0 -p 8080 songcounter.application
$ curl -X POST -H "Content-Length:0" http://127.0.0.1:8080/song/1
$ ab -n 10000 http://127.0.0.1:8080/song/1
...
Concurrency Level:      1
Time taken for tests:   14.529259 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      1680000 bytes
HTML transferred:       10000 bytes
Requests per second:    688.27 [#/sec] (mean)
Time per request:       1.453 [ms] (mean)
Time per request:       1.453 [ms] (mean, across all concurrent requests)
Transfer rate:          112.88 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       8
Processing:     0    1   0.8      1      31
Waiting:        0    0   1.1      1      14
Total:          0    1   0.8      1      31

Wow! Eric's raw WSGI app is over 2x faster in req/s. Of course the measly ~688 req/s from my CherryPy application translates to over 59 million req/24hrs*. Not too shabby either. ;-)

Looking at the benchmarks side by side, raw WSGI is going to get you the most bang for your buck out of your hardware. But just as I'd rather write things in Python before dropping down to a lower-level language for speed, I'd rather write my HTTP utilities in CherryPy and only fall back to raw WSGI when the need arises.

* Eric's app would service nearly 123 million requests in the same time period. :P
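The per-day figures are simple arithmetic on the two "Requests per second" lines from the ab output above. A small sketch, assuming the measured rate could be sustained for a full 24 hours (the helper name `per_day` is made up for this example):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

raw_wsgi_rps = 1420.67  # "Requests per second" from the raw WSGI run
cherrypy_rps = 688.27   # "Requests per second" from the CherryPy run


def per_day(requests_per_second):
    """Requests served in 24 hours at a constant rate."""
    return requests_per_second * SECONDS_PER_DAY


print('raw WSGI: %.1f million req/day' % (per_day(raw_wsgi_rps) / 1e6))
print('CherryPy: %.1f million req/day' % (per_day(cherrypy_rps) / 1e6))
print('speed ratio: %.2fx' % (raw_wsgi_rps / cherrypy_rps))
```

That works out to roughly 122.7 million requests per day for the raw WSGI app and roughly 59.5 million for the CherryPy app, a ratio of a little over 2x.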

UPDATE:

Ok, after taking Robert's suggestion to profile, I realized that my CherryPy application was logging to stdout in addition to Spawning doing its own stdout logging. Adding the following line to my application bumped the speed up to 842.34 req/s.

cherrypy.config.update({'log.screen': False})

That should bring the speed closer to what Tim Parkin measured using his restish framework.

Reading Chunked HTTP/1.1 Responses

April 02, 2008 at 12:35 AM | categories: Python, work, computing, cherrypy, General

For work today I wanted a way to iterate over an HTTP response with chunked transfer-coding on a chunk-for-chunk basis. I didn't see a builtin way to do that with httplib. It supports chunked reads but you have to specify the amount that you want to read if you don't want it to buffer. I just wanted it to read and yield each chunk that it received from the server.

For my first crack at it I really just tried to use the httplib basics:

import httplib

conn = httplib.HTTPConnection('localhost:8080')
conn.request('GET', '/')
r = conn.getresponse()
data = r.read(10)
while data:
    print data
    data = r.read(10)

That worked, but since I won't know the chunk size in real life, I would probably get output similar to this:

Chunk 0
Ch
unk 1
Chun
k 2
Chunk
3
Chunk 4
...

I really wanted that chunk-for-chunk iteration. After taking a look at the very readable httplib source this evening, it wasn't very hard to accomplish. I basically just took the httplib.HTTPResponse._read_chunked method and modified it to be a generator. I subclassed HTTPResponse and stuck my generator in an __iter__ method. Behold; now you can do this sort of thing:

if __name__ == "__main__":
    import httplib
    import iresponse
    conn = httplib.HTTPConnection('localhost:8080')
    conn.response_class = iresponse.IterableResponse
    conn.request('GET', '/')
    r = conn.getresponse()
    for chunk in r:
        print chunk

With nice results like this:

Chunk 0
Chunk 1
Chunk 2
Chunk 3
Chunk 4
...

You can download the iresponse module from my projects site. There is also a small CherryPy application that serves some data with chunked transfer-coding in case any of you want to fiddle with it.
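The iresponse module itself is not reproduced here. As a rough illustration of the generator idea, the sketch below parses chunked transfer-coding from any binary file-like object and yields one chunk at a time. It is written for Python 3 and is not the author's code; `iter_chunks` and the sample payload are assumptions made for the example.

```python
import io


def iter_chunks(fp):
    """Yield each chunk body from a chunked transfer-coded byte stream."""
    while True:
        size_line = fp.readline()
        if not size_line:
            return  # stream ended before the terminating zero-size chunk
        # The size is hexadecimal; anything after ';' is a chunk extension.
        size = int(size_line.split(b';', 1)[0].strip(), 16)
        if size == 0:
            # Last chunk: skip optional trailer lines up to the blank line.
            while fp.readline().strip():
                pass
            return
        yield fp.read(size)
        fp.read(2)  # discard the CRLF that follows every chunk body


# Sample payload standing in for a socket: two 7-byte chunks, then the end marker.
raw = io.BytesIO(b'7\r\nChunk 0\r\n7\r\nChunk 1\r\n0\r\n\r\n')
for chunk in iter_chunks(raw):
    print(chunk.decode('ascii'))
```

Because the function is a generator, the caller sees each chunk as soon as it has been read instead of waiting for the whole body to be buffered.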

Book Review: CherryPy Essentials

May 11, 2007 at 06:03 AM | categories: Python, cherrypy, General

`CherryPy Essentials <http://www.cherrypyessentials.com/>`_

Author: Sylvain Hellegouarch
Publisher: Packt Publishing
Pages: 257

Introduction

The title CherryPy Essentials is a bit of a misnomer, as this book covers far more than the bare essentials of CherryPy 3. Admittedly, a book on simply the essentials would be little more than a leaflet, as CherryPy is very easy to understand and be productive with. In this book, which has two main sections, the author covers CherryPy specifics in the first four chapters and then delves into a host of other web-related Python libraries in the remaining chapters. The use of those libraries is tied together with CherryPy through the construction of an example "photo blog" application. Here follows a more in-depth look at these two main sections and their strengths and weaknesses.

Chapters 1-4: Pure CherryPy

From how to acquire and install CherryPy to creating custom "Tools" to extend it, the first four chapters cover various topics. Some of the content is geared toward beginners and some toward the seasoned CherryPy developer. There's something for everyone.

The first chapter covers the history and rationale behind the project. While it is interesting, it does not delve into anything that you need to know to get up and running with CherryPy. Newcomers to CherryPy might want to give it a glance, but people who have been using it for a while don't really need to bother.

The second chapter covers installation. Those of you who have installed Python packages won't find any surprises here. For the beginners, the content is excellent. Three methods of installation (source tarball, easy_install, Subversion) are covered in nice detail. Definitely go over this chapter if you are new to Python in general or if you would like to learn more about any of those installation methods.

Things really get going once you hit chapter three. After a brief intro, it starts off with source code for a demo "note taking" application. The code is well documented and should be fairly easy to digest. It makes use of a number of basic features of CherryPy and the author refers back to the code throughout the chapter to highlight specific features. Highlights include configuring your application and the builtin HTTP server, how the object publishing system works, handling exceptions and customizing errors, and overviews of the various utility modules that CherryPy is built upon (but also exposes to the developer!).

One thing that could be slightly confusing in this and subsequent chapters is the use of the term "CherryPy engine". The author uses it in reference to CherryPy itself, but CherryPy has a distinct "engine" (also introduced in the chapter) with specific functionality (process and signal control, generating Request objects, etc). The use of the term in the context of CherryPy-at-large could lead to some confusion - but now you know, and knowing is half the battle.

Chapter four goes even deeper into the inner workings of CherryPy and the power that is available to the developer. The various URL dispatchers (HTTP method, Routes and virtual hosting) included with the framework are introduced with usage examples. Although it is mentioned that there is a "simple way" to create your own dispatchers, there is no content related to actually creating one.

Hooking into the core request/response processing system is covered in this chapter. The various hook points and how to attach callbacks to them are described. The "Tool" interface, which makes working with the various hook points much friendlier, is also introduced. All of the builtin Tools are covered, including expected parameters, example code and a description of what the Tool accomplishes. This section is reminiscent of the Python Cookbook in its approach and is one of the really useful references in the book. The coverage of Tools winds up with a section on creating your own custom Tool.

One slight quirk about the section on Tools is that the Tools for serving static content are discussed separately from the other tools. They are introduced after the section on creating a custom Tool. Not a huge deal, but an interesting decision. Perhaps it was done that way because serving static content is covered in greater detail than the other Tools.

Chapter four closes with a section on the WSGI support that CherryPy provides. Examples of hosting WSGI applications using CherryPy as well as hosting a CherryPy application within a third-party WSGI server are presented. The diagram of WSGI leaves a little to be desired, but the content of this section is very good otherwise.

I think that the first four chapters are what most people have been looking for in a CherryPy book. These are the Essentials as referenced in the title, in my opinion. But the book is just getting started...

Chapters 5-10: Developing a Web Application with CherryPy

Chapter five kicks off the rest of the book, which is a sort of case-study covering the creation of a "photoblog application". The "model" of the photoblog is introduced in this chapter, as well as three different Python object-relational mappers (ORMs): SQLObject, SQLAlchemy and Dejavu. Excellent usage examples are presented for each ORM, but oddly enough the examples are not geared to the photoblog but to an imaginary music cataloging system or something.

Though the three ORMs are introduced, only one is chosen to be used with the photoblog. The lucky ORM is Dejavu. Some might question this choice, as it is not one of the more popular ORMs (at least compared to SQLObject and SQLAlchemy), but I think it was a good decision. Dejavu has a solid design and compares well with other ORMs. Whichever ORM you prefer, the Dejavu code should be easy enough to follow and you could follow along with your favorite ORM if you like.

After the ORM is settled on, the topic of the application as a web service is covered in chapter six. Two methods of exposing the application's API are described: REpresentational State Transfer (REST) and the Atom Publishing Protocol (APP). The description of the principles of REST is good and the photoblog API implementation using REST and the MethodDispatcher helps to solidify the concepts. APP is also covered with good introductory detail.

The user interface for the photoblog is covered in chapter seven. Like the database/ORM layer, CherryPy can work with any templating language, but the decision was made to use Kid in the book. A custom Tool for rendering Kid templates is introduced in the chapter and it is a nice example of a simple but worthwhile Tool. The MochiKit JavaScript library is also introduced in this chapter. While I am personally a fan of MochiKit, the only problem that I have with its use in the book is that the latest stable release is quite dated. Developers wanting to follow along with the book will have to fetch the latest revision from the MochiKit Subversion repository.

Ajax, the darling of the web, is presented in chapter eight. The chapter includes some history and a description of Ajax in general. MochiKit handles the client-side while CherryPy (of course) and the simplejson library are used to generate JSON output on the server-side. Some knowledge of Javascript is expected in this chapter, but the decisions that are made in the photoblog application are covered in great detail. One particularly interesting topic that is covered in the chapter is Basic/Digest authentication with XMLHttpRequest objects. A form-based login page that does Basic/Digest auth "under the covers" is presented complete with code.

Chapter nine offers excellent coverage of testing your CherryPy applications. Unit, functional and load testing are all covered in this chapter, with copious examples and descriptions. The unittest and doctest standard library modules are introduced, as well as the webtest extension to unittest provided by CherryPy specifically for testing web applications. For functional testing, Selenium is demonstrated. It is a top-notch testing tool that is invaluable for testing Ajax applications. I usually do crude load testing with ab, but the FunkLoad tool introduced in this chapter is very interesting.

The final chapter in the book delves into deploying your CherryPy applications. Configuration, some of the various deployment methods (mod_rewrite, mod_python, etc) and targets (Apache, lighttpd) are covered in good detail. Surprisingly (to me at least) deployment using mod_proxy was not covered. I find this to be the best deployment option for a small site. Thankfully, there are good online docs for that deployment method.

Conclusions

As I mentioned in the introduction, the title CherryPy Essentials is misleading - this book is much more than that. It gives a good picture of the power and flexibility of CherryPy and introduces the reader to a number of good libraries and tools that are useful for web development.

A well-done index follows the meat of the book, which is always important in this type of book, since you are likely to refer back to it for various topics.

The only fairly large disappointment to me was the poor quality of the copy editing. The editors let too many awkward phrases slip through. Thankfully, the content of the book more than makes up for that shortcoming. Kudos to the author.

If you are looking to expand your knowledge of CherryPy in particular or web development using Python in general, I think that you would do well reading CherryPy Essentials.

Disclaimer: Packt sent me a copy of the book in exchange for doing a review. I already had a copy, so I gave it away. I am also involved with the CherryPy project, so I am somewhat biased.

My Projects Site

December 06, 2006 at 09:34 PM | categories: Python, cherrypy, General

I had to disable comments on my projects site because of comment spam. They won't be re-enabled until I get some decent anti-spam measures in place. Really, the software that runs the projects site is in need of a serious rewrite. As I mentioned to someone who inquired about it being upgraded to CherryPy 3 the other day, I have a long list of things to fix/enhance with the project. Perhaps I just need to make the list shorter...