We upgraded Ngin from Ruby 1.8.7 (REE 2011.12) to Ruby 1.9.3 (MRI)! This was a huge undertaking considering that Ngin has 644 models and 83,152 lines of code, depends directly on 95 gems, and has been around since Rails 1.1. This is the first in a series of posts where we will share our findings including:
Ngin is a website platform that powers thousands of organization websites with an easy-to-use yet extremely powerful content management system. Inline page editing via page elements lies at the heart of Ngin. Admins can toggle into edit mode, then add, edit, and drag page elements as desired to create content on their site without any technical knowledge. Ngin's 68 page element types cover everything from simple text blocks and photos to tables, photo galleries, and integration with Facebook or Twitter.
Ngin's core business is focused on sports, from youth to the pros, from hockey to baseball, from one team to an entire league of teams. Ngin has brought professional level statistical packages and live scoring to all levels of sports. Check out the following great examples of what the Ngin platform can deliver:
Under the hood, Ngin runs on Engine Yard's Cloud, which in turn is powered by Amazon EC2 servers. All aspects of Ngin's infrastructure configuration are version controlled and managed by Chef recipes. Engine Yard's excellent Cloud product gives us the power and flexibility we need to customize our servers exactly how we want them. Engine Yard provides a set of base recipes that lay the foundation, on top of which we run an extensive set of recipes we have written to customize our server configuration.
The infrastructure stack underlying the Ngin platform starts with HAProxy as the load balancer, followed by Nginx with Passenger. Nginx serves static assets, handles SSL, and hands off dynamic requests to the Passenger Rails processes. Rails uses MySQL 5.5 Percona Server as the persistent data store and Memcached as the caching store. Delayed Job manages long-running tasks in the background.
While the Ngin platform is split into three separate Ruby on Rails applications, the bulk of the platform resides within a single application that we call Ngin:
Plans are in the works to gradually convert the Ngin platform into a number of smaller services in a way that will be transparent to the end user. Until then, Ngin remains an unusually large Rails application, which makes efforts such as a Ruby version upgrade a gigantic undertaking.
A Ruby Machine
Our motivation for upgrading to Ruby 1.9 was driven by two things: improved performance and a desire to stay up to date. New features such as the new hash syntax, block argument scope, and the lambda operator, while nice, were simply not driving factors. We were most excited about the new and improved Ruby virtual machine, including a new lazy sweep garbage collection algorithm. It is unfortunate that advances in the virtual machine are entangled with changes to the underlying Ruby language, such that if you upgrade one you must upgrade both.
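For readers who have not used them yet, the three language features mentioned above look roughly like this. It is a small illustrative sketch, not code from Ngin, and all variable names are made up for the example.

```ruby
# New hash syntax: symbol keys can be written as `key: value`.
old_style = { :sport => "hockey", :teams => 12 }
new_style = { sport: "hockey", teams: 12 }
same_hash = (old_style == new_style)   # both literals build the same hash

# Block argument scope: in 1.9 a block parameter is local to the block,
# so it no longer overwrites an outer variable with the same name.
# (Under 1.8 the outer `team` would end up as "Wild" after the loop.)
team = "Gophers"
["Stars", "Wild"].each { |team| team.upcase }
outer_team = team

# Lambda operator: `->` is shorthand for `lambda`.
double  = ->(n) { n * 2 }
doubled = double.call(21)
```

The block argument change is the one most likely to matter during an upgrade, since code that relied on the 1.8 behavior silently changes meaning.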
Getting our test suite to pass was a huge milestone in the upgrade effort. When we started, a large percentage of our tests were erroring or failing. Our test suite, while not comprehensive, does provide 54% code coverage, which was enough to catch a large number of issues. The day our entire test suite passed on 1.9.3 was a good day!
Instead of spending countless hours manually verifying all of Ngin's features, we relied heavily on our test suite and on something new for us: using emproxy to duplex our production traffic onto a separate staging environment with the same hardware running Ruby 1.9.3. Having real production traffic hit an environment running Ruby 1.9.3 allowed us to find bugs that our test suite had missed, simply by browsing the errors reported by New Relic. It also let us fine-tune the configuration of Ruby 1.9.3, including garbage collection settings, memory kill thresholds, the number of Passenger Rails processes, and more, before rolling it out to production.
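The idea behind duplexing can be sketched in a few lines of plain Ruby. This is a conceptual illustration only: it is not emproxy's API and not our production setup, and the `duplex` method and the two lambda "backends" are hypothetical stand-ins for real servers. Every request goes to both environments, only the production response is returned to the client, and a failure on staging never affects the caller.

```ruby
# Conceptual sketch of traffic duplexing (hypothetical names, not emproxy's API).
# `primary` and `shadow` are any objects that respond to #call(request).
def duplex(request, primary, shadow, shadow_errors = [])
  # Send a copy of the request to the shadow (staging) backend.
  # Its response is discarded; errors are only recorded for later review.
  # A real proxy would do this concurrently; it is sequential here for brevity.
  begin
    shadow.call(request.dup)
  rescue StandardError => e
    shadow_errors << "#{e.class}: #{e.message}"
  end

  # The client only ever sees the primary (production) response.
  primary.call(request)
end

# Stand-in backends: production succeeds, staging hits a simulated bug.
production = ->(req) { "200 OK #{req}" }
staging    = ->(req) { raise "simulated failure on staging" }

errors   = []
response = duplex("GET /teams", production, staging, errors)
# `response` is production's answer; `errors` now holds the staging failure.
```

In our case the role of the `errors` array was played by New Relic, which collected the exceptions raised on the staging environment.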
Deploy time finally came. We deployed without downtime, late at night during off hours. We upgraded Ruby in place on actively running application servers and triggered our rolling deploy via Capistrano, which restarts one application server at a time. The deploy went smoothly, as planned. The next morning our customers reported a handful of obscure bugs related to the 1.9 upgrade, which we quickly fixed and deployed. We were extremely happy with how few bugs this upgrade introduced, which we attribute directly to our test suite and the emproxy technique.
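A rolling deploy of this kind boils down to a simple loop. The sketch below is plain Ruby with hypothetical names (`rolling_restart`, the server names, and the callables passed in). It is not our Capistrano recipe, just an outline of the one-server-at-a-time approach, assuming each server can be restarted and health-checked independently.

```ruby
# Minimal sketch of a rolling restart (hypothetical helper, not a Capistrano API).
# `restart` and `healthy` are callables supplied by the caller; in a real
# deploy they would talk to the load balancer and the application server.
def rolling_restart(servers, restart, healthy)
  restarted = []
  servers.each do |server|
    restart.call(server)
    # Stop immediately if a server does not come back, so the remaining
    # servers keep serving traffic on the old code.
    unless healthy.call(server)
      raise "#{server} failed its health check; aborting deploy"
    end
    restarted << server
  end
  restarted
end

# Usage with stand-in callables that just record what happened.
log     = []
restart = ->(server) { log << "restart #{server}" }
healthy = ->(server) { true }

done = rolling_restart(%w[app1 app2 app3], restart, healthy)
```

Because only one server is out of rotation at any moment, the rest of the pool keeps serving requests, which is what makes a deploy without downtime possible.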
On the performance front, we saw only slight gains in raw speed, which was disappointing. We'll dive further into this in a future post. The huge win for us was the smaller memory footprint. Ngin running on 1.9.3 requires only 60% as much memory as it did on REE! This was significant enough to turn our EC2 High-CPU Extra Large instances from being memory bound to being CPU bound, allowing us to significantly reduce the number of Ngin application servers.
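To see why a 40% memory reduction can change the bottleneck, consider some back-of-the-envelope arithmetic. The per-process memory figure and the reserved headroom below are illustrative assumptions, not measurements from Ngin. The 7 GB of RAM is the published spec of the High-CPU Extra Large (c1.xlarge) instance type, and the 60% ratio comes from the paragraph above.

```ruby
# Back-of-the-envelope capacity estimate (illustrative numbers only).
INSTANCE_MEMORY_MB = 7 * 1024   # c1.xlarge: 7 GB of RAM
RESERVED_MB        = 1024       # assumed headroom for the OS, Nginx, etc.
REE_PROCESS_MB     = 500        # assumed size of one Rails process on REE
MEMORY_RATIO       = 0.6        # 1.9.3 needs 60% as much memory as REE

# How many Rails processes of a given size fit in the usable memory?
def processes_that_fit(process_mb)
  ((INSTANCE_MEMORY_MB - RESERVED_MB) / process_mb).floor
end

ree_processes = processes_that_fit(REE_PROCESS_MB)                  # 12
mri_processes = processes_that_fit(REE_PROCESS_MB * MEMORY_RATIO)   # 20
```

Whatever the real per-process size is, the same instance fits roughly 1/0.6, or about 1.67 times, as many processes. With that much extra room, the instance's CPU is more likely than its memory to be the first resource to run out.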
In the next few posts we will dive into the details of how we turn emproxy on and off seamlessly without interrupting our production environment, and how we optimized our Ruby 1.9.3 configuration for maximum performance. Along the way we'll provide code examples and point out key pieces of information that may help your application perform faster immediately!