At Google, we like to use our own projects for internal development. Following that philosophy, one of the first apps ever made for Google App Engine was the contest platform for Google Code Jam. The application was a good fit: a rich web interface, real-time user interaction, and a heavily parallel design. But we faced one major challenge, which was how to handle the huge scoreboard -- in the first round, we had over 11,000 contestants.
While App Engine's datastore allows you to sort entities by score, it doesn't have a built-in mechanism to answer the following two requests:
- Given a user, what is his or her rank on the scoreboard? In other words, for some arbitrary row in that scoreboard, what is its position amongst the 11,000 spots? This is useful for showing your own rank, as well as the ranks of your friends.
- Give me all users on the n-th page of the scoreboard. While it is possible to get all users in sorted order, datastore queries don't allow you to start at page n, and iterating over all of the users on the first n pages takes too much time.
To solve these issues, we came up with a library that maintains a data structure to efficiently support these use cases. We imagine that other applications will have similar problems (e.g., a high-score list in a game), and so we're happy to release our work as the "google-appengine-ranklist" library.
The library supports three different operations:
- Setting the score of a given user.
- Given a score, what's the rank of a user with this score? This is used to answer the "What's the rank of person U?" use case.
- Given a rank, what's the score S for this rank? This can be used to solve the paging problem, by constructing a query that returns the first few users with a score less than or equal to S, in sorted order