Friday, March 19, 2010

Faster Subversion Hosting for Project Hosting on Google Code

When we launched our first Subversion-on-Bigtable service in 2006, our goal was to scale to support hundreds of thousands of projects, with the idea that we could continue to improve the service over time. A year ago, however, we realized that we would have to rebuild our Subversion service to make dramatic improvements in performance. So we did what we had to do: we rebuilt our service from the ground up, focusing on speed and reliability.

We are now happy to announce that we have rolled out our new service to all our Subversion users. As a result, most common Subversion operations are about 3 times faster than they used to be.

One of the features of Subversion's HTTP-based protocol is that anyone can browse repositories through a normal web browser. Many open source projects hosted on Google Code use this feature to host websites for their project or post the latest versions of their software. We didn't anticipate how popular this would be when we designed our first Subversion service, but our new system has special optimizations for browser access. Latency for these pages is much lower, and international users will see a dramatic improvement. We also set the appropriate caching headers, which can be manually controlled with the google-cache-control Subversion property.

To improve our reliability, our new service now has a custom replication system based on the Paxos algorithm. Whenever you make a change to your repository, the new data is copied to several different data centers before our service reports that the commit has succeeded, so you can code in peace knowing that your data is stored safely in multiple locations.
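As a rough illustration of the "acknowledge only after replication" idea, here is a minimal sketch. It is not Paxos and not our actual implementation: it only shows a commit being reported as successful once a strict majority of replicas has stored it. The `Replica` class, the `commit` function, and the data-center names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A hypothetical data-center replica holding committed revisions."""
    name: str
    available: bool = True
    revisions: list = field(default_factory=list)

    def store(self, revision: str) -> bool:
        # An unavailable replica cannot acknowledge the write.
        if not self.available:
            return False
        self.revisions.append(revision)
        return True


def commit(revision: str, replicas: list) -> bool:
    """Report success only if a strict majority of replicas stored the revision.

    Simplification: a real consensus protocol such as Paxos also handles
    competing writers, retries, and cleanup of partially replicated data.
    """
    acks = sum(1 for replica in replicas if replica.store(revision))
    return acks > len(replicas) // 2


if __name__ == "__main__":
    # Hypothetical data centers; one of the three is down.
    replicas = [Replica("dc-a"), Replica("dc-b"), Replica("dc-c", available=False)]
    print(commit("r1", replicas))  # True: 2 of 3 replicas acknowledged
    replicas[1].available = False
    print(commit("r2", replicas))  # False: only 1 of 3 replicas acknowledged
```

The point of the sketch is the ordering: the client hears "success" only after enough copies exist, so losing a single location does not lose an acknowledged commit.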

If you haven’t already, we encourage you to try out our new Subversion service and let us know what you think.

22 comments:

  1. This is great and all, but ... who still uses subversion for large-scale projects?
    git is great for anything large-scale. bzr is top-notch for single-contributor projects and anything small-scale.
    Granted, one of svn's annoyances is speed, but its poor support for any true and healthy workflow is unnerving.

  2. There are a lot of successful projects that still use Subversion for various reasons. Chromium is the one example that comes to mind.

    DVCSs require users to have a different workflow for large projects, i.e. lots of smaller repos. Several projects still prefer to have one large project that contains all their source code and find that Subversion works fine for them. Having one large repo in Git, for example, is begging for trouble.

    People should be free to choose whichever VCS tool they like. Don't judge.

  3. I'm not judging! But subversion is starting to show its age and lack of features, while other versioning systems are getting healthier and healthier.
    As for workflows, I should point you to bazaar's workflow system: http://wiki.bazaar-vcs.org/Workflows

    By the way, I don't know if it has anything to do with the updates but I cannot checkout some google code projects.


    svn: Server sent unexpected return value (502 Bad Gateway) in response to OPTIONS request for 'http://protobuf-editor.googlecode.com/svn/trunk'

  4. Small addendum..

    You say people should be free to choose what VCS they like, and I agree to an extent. However, googlecode's policy has, FWIU, been to only support a minimal amount of backends to make life easier for the users who want to download the source (and once again, it's a healthy thing to do, I cannot fathom projects telling me I need to install mnt).
    But at that point, who decides the "free choice" of other users? Perhaps a good alternative would be git/bzr/... mirrors for read only branches.

  5. We already support Mercurial, which is comparable in features and performance to Git and has relatively wide adoption in some communities.

  6. I'm getting 502 errors when trying to commit to Google Code hosted SVN repositories.

  7. I'm getting a 502 error as well.

  8. me too on the 502s. Inconvenient!

  9. I'm also getting the 502 error.

  10. 502 Error for me, too. :(

  11. me too.

    svn: Server sent unexpected return value (502 Bad Gateway)

  12. Our apologies for the outage this afternoon. While it may appear to have been caused by the new service mentioned in this announcement, it was actually a coincidence and unrelated. We plan to post a more detailed postmortem soon.

  13. This comment has been removed by the author.

  14. I for one am happy they continue to support SVN, as for the small-scale to mid-scale projects (which make up the vast majority of all open source software) SVN works just fine for its purposes.

    Thanks again Google.

  15. Thanks for the update. I did find a dramatic speed boost after the svn service went back online.

    Thanks for the tremendous effort; I am looking forward to your postmortem.

  16. For the curious, I have just posted a postmortem of the outage to our Google Group.
