Thursday, January 26, 2012

My summer with the Google App Engine Team


Today’s post is contributed by our Summer 2011 team intern, Chris Bunch. Chris did some great work on our Logs and MapReduce APIs and is also the first “App Engine Triple Crown” winner for developing the Experimental Logs Reader API in Python, Java and Go simultaneously.

Four years ago, I was a brand-new Ph.D. student at the University of California, Santa Barbara. When our research group (the RACELab) heard about Google App Engine, we were intrigued. We thought it presented a new model that enabled apps to scale the right way without severely constricting the types of programs users could write.

But we wanted to experiment with the core functionality of App Engine: the APIs, the scheduler, etc., and so we built AppScale, an open-source implementation of the Google App Engine APIs that allows users to deploy applications written in Python, Java, and Go to the infrastructure of their choice.
Wherever possible, we implement support for the App Engine APIs with alternative open-source technologies. We’ve added support for nine different databases, database-agnostic transactions, a REST interface that users of any programming language can communicate with (via an App Engine app), and the ability to run high-performance computing programs on the same deployment and talk to them from your App Engine app. And here’s my favorite part: it all deploys automatically! You don’t need to tell it what block size you want for the distributed file system, or the size of the read buffers: we configure the necessary services automatically. Since AppScale is completely open source, if you don’t like the defaults, change them!

After creating our own system to run Google App Engine apps, I wanted to see how Google does it. Therefore, I decided to become an intern on the App Engine team and see if I could give them (and by extension, the App Engine community) something amazing over the summer. I started off with some work on the MapReduce API, making the sample app much easier to use and prettier all around. I also made a YouTube video showing how it all works and how easy it is to run MapReduce jobs over App Engine.

I then looked at a recurring question that App Engine users encounter: “How can I get at my application’s logging information to answer data analytics questions?” It was an excellent problem to tackle, as we have users who want to answer application-specific questions that Google Analytics or the Admin Console don’t cover. Currently, users have to use appcfg to download all of their application’s data to a remote machine and run some analysis script over it.

To solve this problem, I created the Logs API, which gives applications programmatic access to their logs from within App Engine itself. Applications can use it to query small numbers of logs within a single request, and they can use the Pipeline, MapReduce, or Backends APIs if they have lots of logs to analyze. Logs contain both request-level information (e.g., the URL accessed, the HTTP response code returned) and logging info generated by the application (via the logging module in Python, the Logger class in Java, and the logging methods that Go’s appengine package provides). The Logs API is available as of App Engine 1.6.1 to programmers using the Python, Java, or Go runtimes, in both the production environment and the local SDK.
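
To give a flavor of the kind of question this makes answerable, here is a minimal Python sketch. It does not call the Logs API itself: the `RequestLog` and `AppLogLine` classes, their field names, and the sample data are hypothetical stand-ins for the two kinds of information described above (request-level data plus application-generated log lines). In a real app the records would come from the Logs API rather than an in-memory list.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class AppLogLine:
    """One message logged by the application (hypothetical shape)."""
    level: str    # e.g. "INFO", "WARNING", "ERROR"
    message: str


@dataclass
class RequestLog:
    """One request-level record and its application log lines (hypothetical shape)."""
    resource: str  # URL path that was accessed
    status: int    # HTTP response code that was returned
    app_logs: List[AppLogLine] = field(default_factory=list)


def summarize(logs: Iterable[RequestLog]) -> dict:
    """Count requests per HTTP status and ERROR-level log lines per URL."""
    by_status = Counter()
    errors_by_resource = Counter()
    for record in logs:
        by_status[record.status] += 1
        errors = sum(1 for line in record.app_logs if line.level == "ERROR")
        if errors:
            errors_by_resource[record.resource] += errors
    return {
        "by_status": dict(by_status),
        "errors_by_resource": dict(errors_by_resource),
    }


# Made-up sample data, standing in for logs fetched within a single request.
SAMPLE_LOGS = [
    RequestLog("/", 200, [AppLogLine("INFO", "rendered home")]),
    RequestLog("/checkout", 500, [
        AppLogLine("ERROR", "payment timeout"),
        AppLogLine("ERROR", "retry failed"),
    ]),
    RequestLog("/checkout", 200),
    RequestLog("/missing", 404, [AppLogLine("WARNING", "no such page")]),
]

if __name__ == "__main__":
    print(summarize(SAMPLE_LOGS))
```

A summary like this is small enough to compute inside one request; for larger volumes of logs, the same per-record logic could be moved into a MapReduce or Pipeline job.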

I had a great time putting the Logs API together, and had a unique experience interning with the App Engine team. Programming in Python, Java, and Go on a daily basis was an exciting new challenge, and I loved it! 



Interested in interning with the App Engine team? Check out google.com/students for more information on internships.

3 comments:

Sarkis Dallakian said...

Great job! I don't use the Logs API personally, but it's nice to know that it's there. Also, this would look pretty good on your resume. Writing this API in 3 different languages is very impressive!!!

HRJ said...

I wasn't aware of AppScale before. It's nice to know that there are alternative implementations of the App Engine API, and kudos to Google for sharing info about it.

In fact, talking about the alternative could be beneficial to Google too, as it reduces the tied-in-to-a-proprietary-platform feeling and may get more apps on board.

Alessandro Aglietti said...

Very interesting! Good idea!

Last month I had the same idea and I implemented an Entity/Json de/serializer.

You can try and learn at: http://json-datastore.aqquadro.appspot.com/

The logic is very easy:

An Entity is a simple JSON object with two properties, "key" and "properties". I wrote a set of JsonDeserializer and JsonSerializer classes for the gson library, and it works!

See you soon,