Wednesday, March 31, 2010

Easy Performance Profiling with Appstats

Since App Engine debuted 2 years ago, we’ve written extensively about best practices for writing scalable apps on App Engine. We make writing scalable apps as easy as possible, but even so, getting everything right can be a challenge - and when you don’t, it can be tough to figure out what’s going wrong, or what’s slowing your app down.

Thanks to a new library, however, you no longer need to guess. Appstats is a library for App Engine that enables you to profile your App Engine app’s performance, and see exactly what API calls your app is making for a given request, and how long they take. With the ability to get a quick overview of the performance of an entire page request, and to drill down to the details of an individual RPC call, it’s now easy to detect and eliminate redundancies and inefficiencies in the way your App Engine app works. Appstats is available for both the Python and Java runtimes.

Enabling Appstats is remarkably straightforward. In Python, it can be made to work with any Python webapp framework. For the full lowdown, see the docs, but here’s the quickstart if you’re using a supported framework. First, create or open a file called in your app’s root directory, and add the following to it:

def webapp_add_wsgi_middleware(app):
   from google.appengine.ext.appstats import recording
   app = recording.appstats_wsgi_middleware(app)
   return app

In Java, appstats works by installing a filter. Again, the lowdown is here, but for a quickstart, add this to your web.xml:

           <param-value>Appstats available: /appstats/details?time={ID}</param-value>

This installs the Appstats event recorder, which is responsible for recording API calls made by your app so you can review them later. The recorder is pretty low overhead, so you can even use it on a live site, if you wish - though you may want to disable it once you no longer require it.

The other component of Appstats is the administrative interface. To install this on the Python runtime, you need to make a change to your app.yaml file. Add the following block inside the ‘handlers’ section of app.yaml:

- url: /stats.*
 script: $PYTHON_LIB/google/appengine/ext/appstats/

And for Java, add this to your web.xml:



The url here - ‘/stats’ - can be anything you like, as long as it ends with ‘/stats’, and is the URL that you can access the Appstats admin console over.

For additional ease-of-use, we can add the admin interface as a custom admin console page. In Python, add the following block to the end of app.yaml:

 - name: Appstats
   url: /stats

Similarly, you can do this in Java by adding this to appengine-web.xml:

 <page name="Appstats" url="/stats" />

After redeploying your app, you should now see a new option in your app’s admin console labelled ‘Appstats’. Clicking on this will show you the appstats admin console.

The main page of the admin console provides some high level information on RPC calls made and URL paths requested, and, down the bottom, a history of recent requests. Clicking the plus button on any of these will expand the entry to show more details about it, but for the really juicy details, click on an individual request in the Requests History section.

This page goes into detail on what happened in an individual request to our app. The timeline is of particular interest: Each row represents an individual RPC call that was made, with the start time, end time, and CPU time consumed all noted by means of a chart on the right. As you can see, it’s quite possible for the CPU time consumed by an RPC call to exceed the wall clock time - this typically occurs when multiple machines are involved in assembling a reply to your RPC request.

Let’s take a look at the code in question. Clicking on the datastore_v3.RunQuery RPC call to expand, we can get a complete stacktrace for the RPC call, and by clicking on a stack frame, Appstats shows us the source code in question! We see that the culprit looks something like this:

def get_questions(self):
 return models.Question.all()

This is a really frequent query - it’s executed every time anyone loads the front page - so it’s a prime candidate for the memcache API. Modifying the function to take advantage of it is straightforward:

def get_questions(self):
 questions = memcache.get("latest_questions")
 if not questions:
   questions = models.Question.all().fetch(50)
   memcache.set("latest_questions", questions, time=60)
 return questions

All we’re doing here is first, checking if the results are already available in memcache. If they’re not, we fetch them the regular way, by doing a datastore query, and store them in memcache for future reference, telling it to keep them around for 60 seconds.

60 seconds is relatively high for data that may change often, but we figure users won’t be bothered by this on our site. Much shorter timeouts - as low as just a few seconds - can still save huge amounts of resources, especially on popular sites. Few users will worry about a 5 second cache timeout, but if you’re getting 100 queries a second, you’ve just eliminated 99.8% of your query overhead!

If we repeat our request with the new code, we can take a look at the updated statistics and note the improvement on a ‘warm’ request that fetches data from memcache:

Much better! Faster by every metric: We’ve cut a chunk off the wallclock time, so our users get pages faster, and we’ve reduced CPU time and API CPU time as well! As you can imagine, even more dramatic improvements are possible for more complex applications.

Appstats can also help you with your wardrobe: participate in the Appstats contest for the best before/after screenshots of Application Stats. Post your screenshots online and link to them on Twitter copying @app_engine and using the hashtag #coolappstats, before May 2nd 2010. The coolest pair of screenshots will be used to create a Google App Engine T-shirt, and we will send that T-shirt, autographed by the App Engine team, to the winner.

1 comment:

What's up Drew! said...

Shows the line of code taking the most CPU. AWESOME. Highly recommended.