Mixx’s Engine Room - Stoking the fires at Mixx


Performance and the Rails log

Posted by Joe on December 8th, 2008

One of the great strengths of Ruby on Rails is that it abstracts away database access so that you don’t have to worry about SQL when writing your application. Unfortunately, this hides database accesses from developers, which can lead to serious performance problems. The solution to this problem is no further away than the Rails log.

Consider this simple Mixx-ish application, which displays a list of stories:

The controller:
@stories = Story.find(:all, :conditions => "some condition")

The view:
<ul>
  <% @stories.each do |s| %>
    <li>"<%= s.title %>" by <%= s.submitter.display_name %>
     with <%= s.comments.count %> comments
    </li>
  <% end %>
</ul>

(I should note that Jason and Doug create much better markup than this. This is what you get when a backend guy like me writes code for the purpose of demonstrating something - we wouldn’t use anything this ugly in Mixx production code.)

Seems pretty reasonable, right? But as the number of stories on the list grows, performance quickly goes bad. Let’s find out why.

First, sample output from a run of this code:

  • “Obama wins!” by joe with 3 comments
  • “Storms rage everywhere” by julie with 0 comments
  • “Mixx launches” by chris with 3 comments
  • “Eggplant farming” by chris with 1 comments

Next, take a look at the Rails log when this list is generated. This can be found in the application directory in log/development.log:
Story Load (0.000911) SELECT stories.* FROM stories
Rendering stories/show
User Load (0.000601) SELECT * FROM `users` WHERE (`users`.`id` = 176)
Comment Load (0.000391) SELECT count(DISTINCT `comments`.id) AS count_all FROM `comments` WHERE (`comments`.thingy_id = 545)
User Load (0.000441) SELECT * FROM `users` WHERE (`users`.`id` = 188)
Comment Load (0.000364) SELECT count(DISTINCT `comments`.id) AS count_all FROM `comments` WHERE (`comments`.thingy_id = 99)
User Load (0.000424) SELECT * FROM `users` WHERE (`users`.`id` = 6)
Comment Load (0.000358) SELECT count(DISTINCT `comments`.id) AS count_all FROM `comments` WHERE (`comments`.thingy_id = 85)
User Load (0.000408) SELECT * FROM `users` WHERE (`users`.`id` = 6)
Comment Load (0.000269) SELECT count(DISTINCT `comments`.id) AS count_all FROM `comments` WHERE (`comments`.thingy_id = 18)

The first line is reasonable - that’s just getting the list of stories. But look what happens when rendering the view: for each story displayed, there are two queries sent to the database. It’s no wonder that generating a long list of stories takes a long time when you are issuing two queries for each story.

A quick look at the code tells us what is happening: the expression “s.submitter.display_name” gets the submitter of the story (declared in the Story model as a belongs_to association) and extracts the display_name from it. But that requires a database retrieval (the query against the users table). And the expression “s.comments.count” requires the query against the comments table.

Happily, there are a number of techniques that can be used to eliminate these extra queries. These include:

1. Use the “:include” option when fetching the data. For example, change the above controller code as follows:
@stories = Story.find(:all, :conditions => "some condition", :include => [:submitter])
and all of the individual users table queries above are replaced with a single query against the users table.

(Note: once upon a time, Rails could be inefficient in its implementation of :include. Happily, this was fixed in Rails 2.0, so there is no longer any reason to avoid it.)

2. Store frequently used data in the parent table. Admittedly, this is a violation of database normalization principles, under which we never want to duplicate data in the database. But in some cases, the performance gains we get from denormalization are worth the potential problems of data getting out of sync - especially when the data in question is not crucial.

In this case, we added a comment_count field to the stories table. We use after_create and before_destroy callbacks on the Comment model to keep the comment_count field updated - when we create a new comment, we increment the comment_count value for its parent Story. (We made a conscious decision that, while we could be clever and keep comment_count perfectly accurate, we can live with it being incorrect for some stories.) After that, we can use the comment_count field in Story instead of comments.count, thus avoiding the query required to get the count of comments for a story. The resulting view is as follows:
<ul>
  <% @stories.each do |s| %>
    <li><%= s.title %> by <%= s.submitter.display_name %>
      with <%= s.comment_count %> comments
    </li>
  <% end %>
</ul>
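The callback bookkeeping described above can be sketched in plain Ruby. (These are hypothetical stand-in classes; in the real app, Story and Comment are ActiveRecord models and the two hooks live in after_create and before_destroy callbacks on Comment.)

```ruby
# Plain-Ruby stand-ins for the real ActiveRecord models (hypothetical;
# in production the hooks below would be after_create/before_destroy
# callbacks on the Comment model).
class Story
  attr_accessor :comment_count

  def initialize
    @comment_count = 0
  end
end

class Comment
  def initialize(story)
    @story = story
    @story.comment_count += 1   # what the after_create callback does
  end

  def destroy
    @story.comment_count -= 1   # what the before_destroy callback does
  end
end

story  = Story.new
first  = Comment.new(story)
second = Comment.new(story)
second.destroy

puts story.comment_count   # => 1, with no query against the comments table
```

The view then reads story.comment_count, a plain column on the stories table, instead of issuing a count query per story.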

In conjunction with the controller change noted in #1, the query log now looks like this:
Story Load (0.006564) SELECT * FROM `stories` WHERE ( some condition )
User Load (0.000943) SELECT * FROM `users` WHERE (`users`.id IN ('127','193','6','249','216','239','91','240','196','37','93','176', '188','244','235','136'))
Rendering stories/show

What used to take nine queries (and could easily grow to more as the number of stories increased) now takes two.

3. In some cases, you can find the data you need in the model’s attributes. If you can thus avoid retrieving via a belongs_to or has_X relationship, then do so.

There isn’t an example of this in the sample code. But suppose we did not want to display the actual submitter of a story in the above example, but did want to display the story differently if it was submitted by the person viewing the page. (In our case, the User model of the viewer of the page is always contained in the @user variable.) We could use any of the following methods to make this test:

if @user == story.submitter # bad! Requires database retrieval of submitter
if @user.id == story.submitter.id # bad! Also requires retrieval.
if @user.id == story.submitter_id # good! No retrieval required.

Even if you use the :include directive to get the submitter, the first two methods require at least one database retrieval for the page.

Another common case involves a test for the existence of data related to a model via a belongs_to. For example, suppose we want to test whether a story has a submitter at all. Here are two ways of doing it:

if story.submitter # bad - retrieves submitter if present
if story.submitter_id # good - no retrieval required
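The difference can be illustrated with toy stand-ins. (These classes are hypothetical; a real ActiveRecord belongs_to behaves analogously, lazily issuing SQL the first time the association is read and caching the result afterward.)

```ruby
# Toy stand-ins for ActiveRecord behavior (hypothetical classes).
$db_queries = 0

class User
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def self.find(id)
    $db_queries += 1   # stands in for one SELECT against the users table
    new(id)
  end
end

class Story
  attr_reader :submitter_id

  def initialize(submitter_id)
    @submitter_id = submitter_id
  end

  # A lazy, belongs_to-style association: the first read triggers a "query",
  # after which the result is cached.
  def submitter
    @submitter ||= User.find(submitter_id)
  end
end

story  = Story.new(42)
viewer = User.find(42)            # pretend this is @user, already loaded

story.submitter_id == viewer.id   # the cheap test: reads a plain attribute
puts $db_queries                  # => 1 (only the viewer load above)

story.submitter.id == viewer.id   # the expensive test: loads the association
puts $db_queries                  # => 2
```

The id comparison touches only data already sitting in the story row, which is why it costs nothing extra.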

In summary, even though Rails allows you to abstract away database accesses, if you want your application to perform, you need to be aware of how it is using the database. An excellent way to stay aware is to review the Rails log while you are developing your application, paying particular attention to repeated instances of the same query (as in the example above, where the queries against the users and comments tables repeat). In fact, you should review the Rails log frequently whenever you are building an application that has to perform well.


OpenID for the Rest of the World

Posted by Jason on October 6th, 2008

Through the history of the Internet, there are inflection points where a new technology hits “the masses” and really takes off. Almost universally, the common thread in a quick expansion and adoption of a technology is a major breakthrough—not of pure technology—but a breakthrough in design. Specifically, a major breakthrough in ease of use.

Before Google created a dead simple search interface, the vast majority of people used browsing as their primary method of navigating the Web. Before Mosaic made visual browsing simple, text-based navigation through Archie and Veronica was the norm. Apple made consuming music online easy with iTunes and the iPod. AOL made “getting online” easy for Mom and Dad. Most recently, YouTube has made Internet video sharing easy for every high schooler in the world.

The history of the Internet is littered with technologies that were fabulous, but never quite made it mainstream. At Mixx, we’re so in love with the idea of OpenID that we want to do our part to make sure OpenID isn’t relegated to the history books in the “might have been” chapter.

For those new to OpenID, the idea is simple. After creating an account with an OpenID provider, you are given a unique identifier (a custom URL) that you in turn use to log in to and register with sites that support OpenID (“consumers,” as they call them). The goal of OpenID is two-fold. First, you, dear user, can create your OpenID with an account provider you trust and, in turn, extend that trust to sites and services of your choosing. No longer do you need to hand over an email address and a password to every new (potentially fly-by-night) web application that comes along. Second, since you typically have a single OpenID, you won’t have to try and remember which username and password you used on this site or that site. You simply remember that you used OpenID!

OpenID, since its creation, has made great inroads across the web. Some of the largest service providers out there (AOL, Yahoo!, etc.) have become OpenID providers, giving OpenIDs to millions and millions of people. Every week, at least one site launches with OpenID support or an existing site adds OpenID functionality to its login or registration process. The trouble, as we’ve observed, is that while OpenID login and registration is being rapidly added to sites, its presentation lacks the design necessary for Mom and Dad to grasp OpenID’s power.

Today, we’re proud to launch our take on OpenID registration and login. Swing by the Login or Registration pages and you’ll see something new. For existing Mixxers, the login screen will be contextual to your current method of login (either username/password or OpenID).

For new Mixxers, you can register with an AOL, Yahoo!, or Facebook account. Standard OpenID registration is also still available. If, by some twist of fate, you don’t have an account with any of the third-party services we currently support, you can still register with your email address. While Facebook isn’t OpenID per se, both AOL and Yahoo! registration and login utilize OpenID under the covers without asking users for their OpenID URL. In AOL’s case, we ask for your AOL or AIM account name and shuffle you off to the appropriate login page. Yahoo! login and registration is even simpler - click the big honkin’ “Login/Register with your Yahoo! ID” button and we take care of the rest.

We’ve gone one step further and added a great feature to login: we keep track of the last method of login you use and redesign the page based on that. So, if you log in with Facebook, you’ll be presented with a large Facebook icon and button. No need to hunt and click through our login options!

The last piece of the puzzle is managing your accounts. Navigate to your Account Settings and click the new “Accounts” tab. From there, you can add and remove your third-party accounts at will. By linking up your various accounts across the web with your Mixx account, you can use any of those methods to log in to Mixx. This is a great first step toward the goal of interoperability between Mixx and your favorite web sites and applications.

We put a good deal of work into the new registration and login experience and we hope you find the improvements useful and, above all else, easy. As always, we appreciate your feedback and look forward to hearing what you have to say!

~ Jason (and the rest of the Mixx team)


Hiding content, accessibility, and the onload problem

Posted by Jason on September 15th, 2008

Since I joined the Mixx team (a year and some change ago) and began cranking out the HTML, CSS, and JavaScript that you see and use every day, I made it a point to build features out with accessibility in mind. Mixx has a great variety of users comprising all races, colors, creeds, and capabilities. It was (and remains to this day) important to us that we provide a great experience for our users while not leaving anyone out in the cold.

As most of you know, there’s a ton of interactivity on Mixx—voting, reporting, submitting, interacting with YourMixx—the list goes on. Each of these elements involves a different interaction with the application and every one of them involves manipulating content on-screen using a combination of JavaScript and CSS.

In an effort to accommodate users (or, more directly, their browser of choice) who have JavaScript turned off, we take the approach of displaying all content by default and then using JavaScript to hide the appropriate elements. The easiest place to observe this is on the permalink pages. With the recent redesign of the permalinks, we added some new “tabs” (for lack of a better descriptor) below the entry information and above the comments: Activity, About this site, and Related.

When browsing to a permalink page, depending on how zippy your Internet connection is, you may notice that the three tabs are initially expanded and stacked on top of one another. After a brief pause, they’ll snap away, and they can then be accessed by clicking their respective buttons. This is, at its most basic, the situation described above. Page content loads. JavaScript does its job and hides the appropriate pieces.

In a perfect world, this would happen instantaneously and no one would notice. In the real world, Internet connections are variable, web servers hiccup, and external resources (like Google Analytics) have the side-effect of stalling the firing of local scripts. That last point is the one of concern to our discussion. Steve Souders, in his excellent book High Performance Web Sites, goes into great detail about why this stalling happens. Check out Chapter 6 on “Problems with Scripts.”

So what do we do? Let’s first take a look at how Mixx currently works.

As we’ve observed, our HTML and CSS style everything on the page and display elements by default. Once everything is loaded up, the Mixx JavaScript (using jQuery, a subject for another post) uses $().ready() to fire off our onload events. The appropriate bits hide away and we’re done. This is great as far as “best practices” and all that are concerned, but less-than-great from a perceptual viewpoint.

Robert Nyman, in his post How to hide and show initial content, depending on whether JavaScript support is available, outlines a technique where you add a <script> element to the <head> of your document that calls an anonymous function, which in turn adds a CSS file to your page. The CSS file contains a single line with a selector (in his case an ID-based selector) that tells an element not to display.

It’s a brilliant solution that overcomes the onload problem introduced by remote assets. The only amendment I would make to his example is to use a class-based rather than an ID-based selector. This way, you can set up a single class (“.alt”, for instance) and apply it to all elements you wish to hide.

Simple as that, really. Robert offers us an extensible solution to a problem that’s been troubling developers more and more. So why, then, you ask, has Mixx not implemented this? “Time”, mostly. We’re a small team here with a huge list of things to do! I promise you, though, that this will be implemented as we have time.

Have you found other effective solutions to this issue? If so, leave us a note in the comments!


Dr. Semantic or: How I Learned to Stop Worrying and Love the XHTML/CSS

Posted by Doug on August 22nd, 2008

5 keys to making the jump from visual designer to web designer

So first, let’s define some terms (my definitions when talking about the web):

A visual designer: Someone who designs graphics that make their way onto the web.

A web designer: Someone who has the visual design skills but also understands how to write clean, semantic, standards-based HTML/XHTML and CSS.

The latter understands the canvas that the ink is going on. The former knows how to mix the ink.

1. What type of learner are you?
The biggest mistake in my career was not answering this question before I attempted to learn “the other side” for the first time. I would have been writing markup years earlier if I had just picked up the right type of materials. Black type on white paper with an occasional graph or chart was my first method of self-training. Big-time fail. If anything, it scared me back into my Photoshop-only hole.

Five years later I picked up a book by Jeffrey Zeldman. If you are reading this blog, chances are you have heard of it: “Designing with Web Standards”. This book saved my career. It told of what could (and historically should) be done online. It opened doors that were once closed - heck, dead-bolted and Master Locked. Where the book affected me most was in taking the “scary factor” out of the word “development.” Instead of development getting in the way of design, semantic and standards-based design became the norm. The historical context also played a key role: it helped me understand what was right and what was wrong, and explained how technology limitations caused a lot of that wrong.

Looking back, this was the key player in getting me to the next step. Each book or other training aid purchased after this had this in mind. What Zeldman’s book brought me that others failed to was graphics and color. It spoke my language and conveyed the answers to the questions I had.

2. Are you really sure this is the life you want?
Some people just flat out aren’t capable of making the switch. I will never be a back-end developer and I am okay with that. My brain just doesn’t function that way. In this case, there is nothing wrong with only doing visual design. For me it was the next logical step in the direction I wanted to go.

3. Who do you know? Because they will help.
Where most books failed me, and classes never came close to delivering, the connections I made in the industry really helped me get over the hump. Attending conferences, being a part of the local scene, or just sending a random email to a “weblebrity” always brought back more return than any book read or class attended. So don’t be afraid to reach out.

4. You’re going to screw up. It’s okay - just make sure you have the right tools when you do.
You are going to make some mistakes; we all do. Something as silly as forgetting to close a paragraph tag will make you want to rip your hair out. Just calm down, make sure you have the necessary tools for the task, and keep on keeping on.

Recommended Tools

  1. Firefox - Get the latest version
    * Firebug extension for Firefox
    * Web Developer Toolbar for Firefox
  2. Parallels - Get the latest version
    * like it or not, IE6 is still one of the most widely used browsers out there. You Mac users will need this to test.
  3. Opera - Get the latest version
    * Dragonfly - Similar to Firebug, but for Opera (rumor is they named it such because dragonflies eat firebugs)
  4. HTML and CSS reference - HTML Dog
  5. A great reference site brought to us by Dan Cederholm

5. Make sure you are having fun.

Do what you love and you will love what you do. These guys agree:

“Choose a job you love, and you will never have to work a day in your life.” — Confucius

“I never did a day’s work in my life. It was all fun.” — Thomas A. Edison


Ruby on Rails vs Java vs C vs Assembler

Posted by Joe on August 10th, 2008

A big advantage of having spent a long time in an industry is that you see certain patterns repeat themselves over time.  One of these patterns had an important impact on our decision to use Ruby on Rails for Mixx.

One of my first jobs was at a defense contractor called Logicon.  Logicon had a number of contracts with various intelligence agencies, many of which involved processing news stories received via newswire, categorizing those stories, and delivering them to analysts who had registered an interest in a specific topic.  Thus, for example, an analyst specializing in Russia could subscribe to all stories that mentioned Vladimir Putin, Russia, and Georgia (but did not mention Atlanta).

This software had been written for IBM mainframes in the IBM assembly language.  It could only be used in that environment, and was not adaptable to other systems.  My job was to take the technology, turn it into an off-the-shelf product, and help package it for use by others.  The hope was that we could sell it to a wide range of customers, both in and out of government.

One of our big early challenges was to get permission to write the new product in the C programming language.  Our local Logicon vice president, the guy who controlled the green light, had been an assembly language programmer, having worked on the system that we were adapting.  He was suspicious that we could get the necessary performance out of C.  (High performance was the big selling point of this product, which was meant to process thousands of queries against several incoming stories per second, a big deal with the hardware of the time.)  He had little experience with C himself, and so he needed convincing.

We built a prototype that showed that, while the C language implementation was not as fast as the original assembler version, it was more than fast enough for our target market.  That, combined with the other advantages of C (greater portability and programmer productivity), was enough to convince him to okay the project.  And so I spent the next few years of my life building and maintaining the Logicon Message Dissemination System (LMDS), which ended up being used by a number of government agencies.  And which led directly to me being hired at AOL, who bought both LMDS and, eventually, me from Logicon.

(I’m not sure if LMDS is still in use at AOL.  But it was as of a couple of years ago, being used as part of the system that categorizes news stories for the AOL Feeds Factory.  Not bad, for a piece of software that was originally written twenty years ago.)

This was my first time encountering the debate between an old, established language and a newer language that, while not as efficient in terms of processor speed, was a lot more efficient in terms of programmer time.

Flash forward ten years.  C was firmly entrenched at AOL, as was I.  But a new language was gaining prominence - Java.  And the argument arose: should AOL make Java the language-of-choice for new applications?

Java had a lot of advantages over C.  It’s a much easier language to use, and it is not vulnerable to some of the truly painful bugs that you can create in C.  (I once saw a C project of a dozen people delayed for a week because of a misplaced semi-colon.)  All of this translates into greater programmer productivity.

But Java is never going to be as efficient as C in terms of machine resources.  Some of the features that make Java safe also cost processor cycles.  This is independent of the compilers involved: even with the best compiler in the world, a Java program will not out-perform an equivalent C program.

Take one example: array bounds checking.  In C, you can create an array of ten items, and then happily ask to access the eleventh item.  C will allow this - the guiding principle of C is that the programmer knows what he’s doing, so even if it seems stupid, just do what the programmer says, dammit!  This can cause all kinds of nasty effects in C.  (The one-week delay that I mentioned above resulted from just such a bug.)

But whenever you access an array element in Java, Java checks that you are not going out of bounds.  An attempt to access the eleventh item in a ten-item array will result in a Java error that pinpoints exactly where you went wrong.  The bug will be found in a matter of minutes, not days.

This saves lots of programmer time.  But doing those array bounds checks costs processor time - every time you access an array element in Java, it has to check to see that your access is in bounds.  Processing time is sacrificed for programmer time.

AOL ran some benchmarks and found that a Java program would take around twice as much CPU time to run as the equivalent C program.  But, especially given that the limiting factor on performance was usually I/O and not CPU, and given that programmer time had gotten more expensive while CPU time was constantly getting cheaper, this was a good trade-off.  And so AOL adopted the policy that new development, unless there was a good reason to do otherwise, should be done in Java.

Now we come to the present, the inevitable moment when there is a new kid on the block, a new language that is challenging Java.  It is easier to develop a web application in Ruby on Rails than in Java.  Yes, Ruby is not as efficient in terms of processing time, but it is far more efficient in terms of programmer time.  And processing time is not the limiting factor in most applications anyway.

In other words, it is exactly the same argument as we had ten years ago when deciding whether to stick with C or switch over to Java.  Which in turn was exactly the same argument as we had ten years before that in deciding whether to stay with assembler or switch over to C.  It is an argument that comes down to the question of what the limiting factor in development is: CPU time or programmer time.  And in most applications, the answer is going to be programmer time.

(Speaking as a programmer, may I just say that I am very happy that programmers are more expensive than CPUs.  I hope this continues far into the future, with CPUs getting cheaper while programmers get more expensive!)

Clearly, there are other factors involved.  We shouldn’t dive into the new technology just because it is new.  And it may prove that Ruby on Rails is not the wave of the future, in the way that C and Java once were.

But it is interesting to see the same old arguments being held over and over, and it’s instructive, when having these arguments, to remember how they turned out the last time around.


Ruby tidbit: When rescue doesn’t

Posted by Bill on July 31st, 2008

I learned a valuable lesson this morning, and I thought I’d share it with you. To deal with error conditions, Ruby includes, as other languages do, exception handling. This allows you to put all the code that might generate errors in one block and deal with any errors that do occur in another block. Far preferable to the days of old, when we had to check the return value of any call that might generate an error and deal with it on the spot, each in unique fashion. Exception handling is much cleaner and more manageable:

begin
# Do stuff that may fail here
rescue
# Deal with those failures here
end

Most articles dealing with Ruby exception handling will tell you that the way to catch exceptions is with the “rescue” line. But it turns out, that’s only part of the truth. An example done in Ruby’s irb console:

irb(main):001:0> begin
irb(main):002:1* raise "Haha, you missed me!"
irb(main):003:1> rescue
irb(main):004:1> puts "No I didn't!"
irb(main):005:1> end
No I didn't!

So far so good. But now watch this:

irb(main):006:0> begin
irb(main):007:1* raise Exception.new("Haha, you missed me!")
irb(main):008:1> rescue
irb(main):009:1> puts "No I didn't!"
irb(main):010:1> end
(irb):7:in `irb_binding': Haha, you missed me! (Exception)

Whoops! In that case, rescue really did miss it. But why? Doesn’t “rescue” mean rescue everything? I was surprised to find out that it doesn’t. One more example should clear things up:

irb(main):016:0> begin
irb(main):017:1* raise "Better luck next time!"
irb(main):018:1> rescue => e
irb(main):019:1> puts("I caught a " + e.class.to_s)
irb(main):020:1> end
I caught a RuntimeError
=> nil
irb(main):021:0> begin
irb(main):022:1* raise StandardError.new("Better luck next time!")
irb(main):023:1> rescue => e
irb(main):024:1> puts("I caught a " + e.class.to_s)
irb(main):025:1> end
I caught a StandardError
=> nil
irb(main):026:0> begin
irb(main):027:1* raise Exception.new("Better luck next time!")
irb(main):028:1> rescue => e
irb(main):029:1> puts("I caught a " + e.class.to_s)
irb(main):030:1> end
(irb):27:in `irb_binding': Better luck next time! (Exception)

Ah-ha! So while it seems that “rescue” would be the most generic way of dealing with exceptions of all kinds, it really isn’t. It will catch StandardError (of which RuntimeError is a subclass), but it misses Exception. The most generic exception handler turns out to be “rescue Exception”:

begin
# Do stuff here
rescue Exception
# I will catch everything, even stuff that "rescue" misses
end
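The class hierarchy behind all this is easy to verify for yourself in plain Ruby (no assumptions here beyond the standard library):

```ruby
# A bare `raise "msg"` wraps the message in a RuntimeError, and a bare
# `rescue` is shorthand for `rescue StandardError` -- which is why the
# first irb example above was caught while the Exception one escaped.
caught = begin
  raise "plain raise"
rescue => e      # equivalent to: rescue StandardError => e
  e
end

puts caught.class                                     # RuntimeError
puts RuntimeError.ancestors.include?(StandardError)   # true
puts StandardError.ancestors.include?(Exception)      # true
```

Since Exception sits above StandardError, anything raised as a plain Exception (or another non-StandardError subclass) sails right past a bare rescue.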

Lesson learned: unless you know the only thing you might have to deal with is StandardError or one of its subclasses, it’s better to use “rescue Exception” than just “rescue”. Never mind documents that suggest that “rescue” catches exceptions; that’s true, but misleading - it really only catches certain kinds of them.


Scaling Rails

Posted by Joe on July 28th, 2008

Mixx is built using the Ruby on Rails framework. (Rails is the framework. Ruby is the language.) If you pay much attention to the internet tech world, this is likely to raise one big question: doesn’t that mean scaling problems?

Unfortunately, Twitter’s various problems with scaling and reliability have thrown up a lot of FUD around Rails.  After all, Twitter is the best known large-scale Rails application on the web, and it has undeniably had problems with both scaling and stability. (Though having worked with the excellent engineers that Twitter just acquired with their purchase of Summize, I feel confident that their problems will soon be a thing of the past.)  A common meme is that Twitter’s problems are due to Rails, and that therefore it is impossible to build a stable scalable application using Rails.

Since I’m the guy who chose to use Rails for Mixx, I obviously disagree. Let me tell you why.

(In the notes that follow, I’ll talk about both performance and scalability. I realize that these are not the same.  However, they are closely related, and there’s FUD related to both around Rails, so I think it worthwhile to cover them both in this discussion.)

  1. 80-90% of end-user response time has nothing to do with anything on the server - it’s in the design and implementation of the page.

    This is the result of a study done by Yahoo!, as reported in Steve Souders’s excellent High Performance Web Sites (which I highly recommend - there should be a copy of this book in every web development shop). Most of performance is tied to issues like setting correct cache headers on images, reducing the number of objects on the page, and proper design of markup, CSS, and JavaScript. If you want your pages to load fast, look to your markup. (This is one of the many reasons that the first person I hired when I came on-board was Jason Garber, Mixx’s UI architect. And why I bought him a copy of the Souders book as soon as it was published.)

  2. Performance on the server is mostly a matter of design of the data stores, not of application code.

    A typical web application involves retrieving data from one or more data stores, manipulating it in some way, and rendering the page to contain that data.  Of these steps, the one that is most likely to cause performance problems is the retrieval from the data stores.  A poorly indexed database can cause major problems, and the biggest performance gains on a well-written application are going to involve faster data stores and a smart caching strategy. But these issues are not unique to Rails - they are common problems to all web applications, no matter what language is being used.

    (While I know next to nothing about the Twitter architecture, I’m willing to bet that its scaling problems have to do with data store design, not application code.  If Twitter were translated into another language without redesigning those stores, I expect that it would have the same problems.  Building systems that require putting together data from multiple collections, with each user requiring a different set of data, is a tricky data design problem, no matter what the language.)

  3. Rails supports scaling through duplication of application servers.

    It’s easy to fire up as many instances of application and web servers as you need in Rails.  Hardware scaling - throwing another box into the mix to partition load - is just as easy with Rails as it is with any other modern web application framework.

    The limiting factor to this kind of scaling is the size of your database host - but that’s going to be the limiting factor no matter what application framework you have.  And the same tricks used in other languages to overcome those limits work in Rails.
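The idea in miniature - a toy round-robin balancer in Ruby (the hostnames are made up, and in practice the web server or load balancer does this for you): requests simply cycle across however many identical app server instances you have running:

```ruby
# A toy round-robin balancer; the hostnames are hypothetical.
class RoundRobin
  def initialize(backends)
    @backends = backends
    @i = -1
  end

  # Each request goes to the next app server instance in the pool.
  def next_backend
    @i = (@i + 1) % @backends.size
    @backends[@i]
  end
end

pool = RoundRobin.new(["app1:8000", "app1:8001", "app2:8000"])
4.times { puts pool.next_backend }   # app1:8000, app1:8001, app2:8000, app1:8000
```

Because Rails app servers share no state between requests, adding capacity really is this mechanical: start another instance, add it to the pool.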

  4. Rails supports scaling through partitioning of the application.

    Another typical scaling approach involves partitioning the application into sub-applications, each of which runs on a separate set of servers.  This can be done in Rails as easily as in any other language.
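A sketch of the idea in Ruby (the URL prefixes and server pools here are entirely hypothetical): route each request to the pool of servers that owns that slice of the application:

```ruby
# Hypothetical sub-application pools; none of these hostnames are real.
POOLS = {
  "/api"    => ["api1:8000", "api2:8000"],
  "/images" => ["img1:8000"],
}
DEFAULT_POOL = ["web1:8000", "web2:8000"]

# Pick the server pool that owns this request's slice of the application.
def pool_for(path)
  prefix = POOLS.keys.find { |p| path.start_with?(p) }
  prefix ? POOLS[prefix] : DEFAULT_POOL
end

pool_for("/api/votes")   # => ["api1:8000", "api2:8000"]
pool_for("/stories/1")   # => ["web1:8000", "web2:8000"]
```

In a real deployment this routing lives in the front-end web server or load balancer config, but the partitioning logic is exactly this simple, and nothing about it is Rails-specific.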

  5. Rails optimizes for programmer time, not processor time - and that is the right choice.

    I’m not saying that Rails is going to perform as well as languages such as Java.  But Rails is optimized to make life easier for programmers, not computers.  This is a common theme in the history of programming languages - a new language comes along that is easier for programmers, but less efficient in terms of processor time.  There’s an entire blog post in that - and I promise to write it sometime soon.

  6. It is easy to write naive Rails code that performs poorly.  That’s why you shouldn’t write naive Rails code.

    Clearly, any language can be used to write bad, inefficient programs.  But Rails, which abstracts out the database to make database access transparent to the programmer, probably makes it easier to write inefficient code.  It is easy to slap together a quick Rails application without worrying too much about the database - that’s the great strength of Rails.  But to get performance, you need to pay attention to the details.  That isn’t too difficult, and I’ll blog sometime about how we’ve done it at Mixx.  But it is necessary.

    (I suspect that this is another cause of the bad reputation for performance that Rails has.  People throw together a quick Rails application, taking advantage of its ease in programming, and are then surprised when it doesn’t perform.  But while building an application can be easy, we aren’t yet at the point where building a complex and highly performant web application is easy.)
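The classic example of naive code is the N+1 query problem: render a list, and each item quietly triggers its own extra query. Here it is in miniature, with a fake database that just counts queries (toy code, not ActiveRecord):

```ruby
# A fake "database" that counts queries. (Toy code, not ActiveRecord.)
$queries = 0
USERS = { 1 => "joe", 2 => "julie", 3 => "chris" }

def find_user(id)      # one query per call
  $queries += 1
  USERS[id]
end

def find_users(ids)    # one query for the whole batch
  $queries += 1
  ids.map { |id| USERS[id] }
end

stories = [{ :title => "Obama wins!",   :user_id => 1 },
           { :title => "Storms rage",   :user_id => 2 },
           { :title => "Mixx launches", :user_id => 3 }]

# Naive: one extra query per story - this is the N+1 problem.
stories.each { |s| find_user(s[:user_id]) }
puts $queries   # prints 3

# Eager: one query no matter how many stories are on the page.
$queries = 0
find_users(stories.map { |s| s[:user_id] })
puts $queries   # prints 1
```

In real Rails code the batched version is eager loading - passing :include to find so the associated rows come back in one query instead of one per record - and the fix is usually that small once the log has shown you the problem.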

And there it is. At Mixx, we feel that Rails will serve us quite well as our application platform.  In fact, we have effectively bet the company’s future on this belief.  I do not expect to be proven wrong on this - while Rails is not a perfect platform in terms of performance, and there are a number of improvements that could help (another fertile ground for a blog post), for our application it performs just fine.


Welcome to the Engine Room

Posted by Joe on July 22nd, 2008

Welcome to the Mixx Engine Room, the blog of the Mixx engineering team. Over time, we’ll be writing posts describing various aspects of Mixx technology - how things are built, challenges that we’ve encountered in scaling, the reasons behind some of our technical decisions, and other details about our systems that we feel we can share with the world.

At Mixx, we’re strong believers in community. We’ve read The Cluetrain, and we believe in it. This blog will be the engineering team’s car on the Mixx Cluetrain.

But this blog is not like the movies. Here we roll the credits first. And so, without further ado, I give you the Mixx engineering team, listed by Mixx longevity:

- Raghu Somaraju is a backend developer here at Mixx, building the server code for several of our subsystems. When you first registered for Mixx, and the last time you logged in, you used Raghu’s code. In his pre-Mixx days, Raghu worked at Yahoo! in the News and Information group for three years, where he met Mixx CEO Chris McGill. After Yahoo!, Raghu worked at Juno Online in India and at an online ad agency start-up that, alas, did not make it.

- Joe Dzikiewicz (hey, that’s me!) wears many hats at Mixx, so it’s a good thing he has such a big head. As CTO, he manages the development and operations teams. As backend architect, he does overall architectural design of the backend systems. As a backend developer, he codes much of that design, including the initial work on the voting, submission, and categorization systems. Before Mixx, Joe was a systems architect at AOL for ten years, most of it spent in the Search and Community development teams. If you ever searched for something on AOL, chances are you used some of Joe’s code. When not writing for the Mixx blog, Joe blogs at www.drdzoe.com.

- Like most of the crew at Mixx, Jason Garber wears many hats, all of them stylish. His job as User Interface Architect implies that he has some sort of working knowledge of blueprints, AutoCAD, and spackle. This couldn’t be farther from the truth. Unless, of course, you think of HTML as a blueprint, CSS as spackling paste, and JavaScript as AutoCAD, but that’s a rather weak analogy. Poor analogies notwithstanding, Jason spends a good deal of time working with the rest of the Mixx team to ensure that everything that goes out the door is functional, attractive, and easy to use.

Prior to joining up with Mixx, Jason worked at a small design and marketing firm in Bethesda, MD, and did some time at AOL, most notably working on Ficlets, which won an award for Best CSS at this year’s South by Southwest conference. Between breaths, he is also involved in the local DC tech scene, organizing events, shooting photography, and playing in a rock and roll band.

- Nathaniel Collinsworth is Director of Operations for Mixx. His responsibility is the smooth running of all the technical infrastructure that the company needs. This umbrella covers a large number of disciplines: database, web server, mail, security, monitoring and alarming, business continuity, etc. It also includes the less challenging, but just as critical, needs of the office personnel. Nathaniel spends much of his days ensuring that Mixx can continue to scale to the needs of its growing user base. For the first half of his career, Nathaniel managed technology operations groups for non-technology companies, including non-profits, mining companies, and the US Federal Government. For the second half, he has focused on the scalability and distribution of community applications on the web for AOL and for Mixx. He enjoys working at Mixx because he gets to use his broad range of talents to serve both the Mixx employees and our users.

- Bill Kocik also joins us from AOL, where he spent several years working in both Systems Operations and Software Engineering disciplines. At Mixx, Bill spends most of his time writing code that runs on the servers. Of note, he designed and implemented the Mixx API, and is largely responsible for the Site Mail feature, as well as once-per-day email notifications and the Google AdSense revenue sharing functionality.

- Doug March is a Senior Web Developer at Mixx. He spends much of his time bringing product designs and ideas to life. On the side, he helps dream up new features and dabbles in Search Engine Optimization. Doug started his career in the world of government consulting. Later, he found his niche and honed his skills at Revolution Health. In his free time he is usually on the links trying to collect that elusive 5th hole-in-one. When not enlightening you via the engine room, Doug blogs at http://doug-march.com.

So that’s us. Follow this space to learn more about the way we think, and the way we do technology.


© 2009 Recommended Reading, Inc. User-generated content is licensed under a Creative Commons Public Domain license.