Using Redis as a Secondary Index for MySQL

Hey, did you notice, on the brand-spanking-new Yahoo homepage, right there on the side of the page, it’s photos from your Flickr contacts (or maybe your groups)! No? Go check it out, I’ll wait.

Ok, great, you’re back! What you should have seen, assuming you have Flickr contacts (or are a member of some groups), is photos from your most recently active contact (or group!). Something like… this:

Flickr on Yahoo.com

(thanks schill!)

What you see above is the 10 most recent photos from my contact who most recently uploaded any photos. The homepage retrieves this data by making a call to a specially tailored method from the Flickr API.

The Latency Problem

In order to ensure performance for Yahoo.com, this API method has very tight SLAs; we can’t slow down the page for the millions of homepage visitors to pull in Flickr content, after all. While the method can return different data sets depending on the most recent activity from your Flickr contacts and groups, for the sake of this post we’re going to focus exclusively on the contacts case. The first step to returning data is to get your most recently active contact, and to do that we need a list of all of your contacts sorted by how recently they’ve uploaded a photo. In an ideal world, the SQL query would be something along the lines of:

SELECT contact_id FROM Contacts WHERE user_id = ? ORDER BY date_upload DESC LIMIT 1;

Easy, right? Sadly, no. Due to the way our contact relationships are stored, the query we need to run is more complex than the above. Still, for the vast majority of Flickr users, the *actual* SQL query performed just fine, usually in less than 1ms. However, Flickr users come in all shapes and sizes. Some users have a few contacts, some have hundreds, and some… have tens of thousands. Unfortunately, the runtime for the query scaled proportionally to the number of contacts the user has. I won’t delve into the specifics of the query or how we store contact relationships, but suffice it to say we investigated possible changes to both the query and to the indexes in MySQL, but neither was sufficient to give us the performance we needed across all possible use cases. With that in mind, we had to look elsewhere for optimizations.

The First Attempt

With a pure MySQL solution off the table, our first thoughts for optimization avenues turned to the obvious: denormalization. If we stored the 10 most recent photos from your most recently active contact ahead of time, then getting that list when the homepage was rendered would be trivial regardless of how many contacts you have. At Flickr we’re big fans of Redis, so our thoughts immediately turned to using it as a store for the denormalized contact data.

In order to denormalize the data, we need to constantly process six different user actions:

  • Add a contact
  • Remove a contact
  • Change a contact relationship
  • User uploads a photo
  • User deletes a photo
  • User changes a photo’s privacy

Each one of these actions can impact what photos you should see on the Yahoo homepage, and therefore must update the denormalized data appropriately. Because the photo-related actions can potentially impact thousands of users, if not tens or hundreds of thousands (everyone who calls the photo owner a contact would need to have their denormalized data updated on upload), they must be processed by our offline task system to ensure site performance is not adversely impacted. Unfortunately, this is where we ran into a snag. The nature of Flickr’s offline task system is such that the order of processing is not guaranteed. In most cases this is not a problem; however, when it comes to maintaining an accurate list of denormalized data, not being able to predict the outcome of running multiple tasks is problematic.

Out of Order

Imagine the following scenario: you add a contact then quickly remove them. This results in two tasks being added, one to add and one to remove. If they’re processed in the order in which the actions were taken, everything is fine. However, if they’re processed out of order then when everything is said and done the denormalized data no longer accurately reflects your actual contact relationships. Maybe the next action that updates your data will correct this problem, or maybe it will introduce another problem. Over time, it’s likely that a not-insignificant number of users will have denormalized data that is out of sync with reality.


Out of order

Out of order by Foomandoonian

Ok, let’s try this another way

Looking back at our original problem, the bottleneck was not in generating the list of photos for your most recently active contact, it was just in finding who your most recently active contact was (specifically if you have thousands or tens of thousands of contacts). What if, instead of fully denormalizing, we just maintain a list of your recently active contacts? That would allow us to optimize the slow query, much like a native MySQL index would; instead of needing to look through a list of 20,000 contacts to see which one has uploaded a photo recently, we only need to look at your most recent 5 or 10 (regardless of your total contacts count)!

In the end, this is exactly what we ended up doing. We still process offline tasks to keep track of your contacts’ activity, but now the set of actions we need to track is smaller:

  • Add a new contact
  • Remove a contact
  • User uploads a photo

Each task maintains a per-user sorted set where the set member is the contact id, and the score is the timestamp of when that contact last uploaded a photo. So, for example, if a user (user id 12345) adds a new contact (user id 98765) we simply do:

ZADD user_12345 1363665432 98765

Removing a contact is the opposite, using ZREM as the yin to ZADD’s yang:

ZREM user_12345 98765

When a user uploads a new photo it’s once again a ZADD (though it’s a ZADD against the set for every user that calls the uploading user a contact). If the uploader is already in a given user’s set, this will simply update the score; if not, the uploader will be added to the set.
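For example (the ids and timestamp here are hypothetical), if user 98765 uploads a photo and users 12345 and 24680 both call them a contact, the offline task issues one ZADD per follower:

ZADD user_12345 1363670000 98765
ZADD user_24680 1363670000 98765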

As things stand right now, the set will grow without bound, which is obviously not much help if our goal is to limit the number of contacts we need to check against when querying the DB. The solution to this is to cap the set. After each ZADD we check to see if the size of the set has exceeded a threshold (generally we store an additional 20% on top of the data we absolutely need), and if so, we remove all of the extra records using ZREMRANGEBYRANK:

ZREMRANGEBYRANK user_12345 0 (($collection_size - $max_size) - 1)

Where $collection_size is the current number of members in the set and $max_size is the maximum number of members we want to store. Note that we’re removing from the head of the set. Redis stores data in sorted sets in ascending order, so the least recently active contacts are at the beginning of the set.
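For example (the sizes here are hypothetical), if the set has grown to 15 members and we only want to keep 12, then ($collection_size - $max_size) - 1 is 2, and we trim the three least recently active contacts with:

ZREMRANGEBYRANK user_12345 0 2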

Akin to how we must cap the set to keep it below a maximum size, we also have a threshold on the other end of the spectrum to keep the set above a minimum size. If a user happens to remove their 10 most recently active contacts then there would be no data in their set, and the Redis index would be of little value. With that in mind, any time the user removes a contact, we check to see if the size of the set has dropped under the minimum threshold, and if so we repopulate the data based on their remaining contacts. This is slightly more complex than a simple Redis command, so we’ll use actual PHP to explain:

//
// $key is the redis ZSET key, e.g. user_12345
//
$count = redis_zset_zcard($key);
if ($count < $MIN_SET_SIZE) {
    if ($count > 0) {
        $current_contacts = redis_zset_zrange($key, 0, -1);
    } else {
        $current_contacts = array();
    }

    //
    // In this call, the second parameter is the number of contacts
    // to return and the third parameter is a list of users to
    // exclude from the response. Since they’re already in the
    // set, there’s no need to add them again
    //
    $contacts = contacts_most_recently_uploaded_list($user, $MAX_SET_SIZE - $count,
        $current_contacts);

    foreach ($contacts as $contact) {
        redis_zset_zadd($key, $contact['last_upload'], $contact['id']);
    }
}

Remember, this is all being done outside of the context of a page request, so there’s no harm in spending a little bit of extra time to ensure when the API is called we have a reliable index that we can use to optimize the DB query.

The final action we need to take is to query the set to get the list of contacts so we can actually perform said DB query optimization. This is done with a standard ZREVRANGE:

ZREVRANGE user_12345 0 10

Similar to how we cap the set by removing members from the beginning because those are the least recently active, when we want the most recently active we use ZREVRANGE to get members at the end of the set.
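Tying it together, the ids returned by ZREVRANGE become a filter on the original query. As noted earlier, the real query is more complex, but conceptually it turns into something along the lines of (the contact ids below are just for illustration):

SELECT contact_id FROM Contacts WHERE user_id = ? AND contact_id IN (98765, 24680, 13579) ORDER BY date_upload DESC LIMIT 1;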

You’re probably wondering, what about the other three events that generated tasks for the fully denormalized solution? How are we able to get by without them? Well, because we’ll just be using the list of contacts to optimize a live DB query, we can take some liberties with data purity in Redis. Because we store multiple recently active contacts, it doesn’t matter if, for example, the most recent contact in your Redis set has deleted his most recent photos or made them all private. When we query the DB, we further restrict the list based on your current relationship with a contact and photo-visibility, so any issues with Redis being out of sync sort themselves out automatically.

Take the above scenario where a user adds and quickly removes a contact. If the remove is processed first and the contact remains in the user’s Redis set, when we go to query the live contacts DB, we’ll get no results for that relationship and move on to the next contact in the set—crisis averted. There is a slight problem with the reverse scenario wherein a user removes a contact and then adds them back. If the tasks are processed out of order, there’s a chance that the add may not end up being reflected in the Redis set. This is the only chance for corruption under this system (as opposed to the fully denormalized solution where many tasks could interact and corrupt one another), and it’s likely to be fixed the next time any of the other actions occur. Furthermore, the only downside of this corruption is a user seeing photos from their second most recently active contact; this is a condition we’re willing to live with given the overall gains provided by the solution as a whole.

Not All Wine and Roses

The dataset involved in this solution is one of the largest we’ve pushed into Redis so far, and it’s not without its pitfalls. Namely, as the size of the dataset increases, the amount of time spent doing RDB saves also increases. This can introduce latency into Redis commands while the save is in progress. From Redis Persistence:

RDB needs to fork() often in order to persist on disk using a child process. Fork() can be time consuming if the dataset is big, and may result in Redis to stop serving clients for some millisecond or even for one second if the dataset is very big and the CPU performance not great. AOF also needs to fork() but you can tune how often you want to rewrite your logs without any trade-off on durability.

This is a key factor to be aware of when optimizing Redis performance. The overall speed of the system can be adversely impacted by RDB saves, so taking steps to minimize the time spent saving is critical. We’ve solved this problem by isolating these writes to their own Redis instance, thereby limiting the size of the dataset to only keys related to contact activity. In the long term, as activity increases, it’s likely that we’ll need to further reduce the number of writes-per-instance by sharding this contact activity across a number of Redis instances. In some cases, multiple small Redis instances running on a single host can be preferable to one large Redis instance. As mentioned in the Redis Persistence guide, RDB saves can suffer with a slow CPU; if upgrading hardware (mostly faster CPUs to improve fork() performance) is possible, that’s certainly another option to investigate.

Flickr flamily floto

Like this post? Have a love of online photography? Want to work with us? Flickr is hiring engineers, designers and product managers in our San Francisco office. Find out more at flickr.com/jobs.

Redis Global Locks Redux

In my last post I described how we use Redis to manage a global lock that allows us to automatically failover to a backup process if there was a problem in the primary process. The method described allegedly allowed for any number of backup processes to work in conjunction to pick up on primary failures and take over processing.

Locks #1
Locks #1 by Christoph Kummer

Thanks to an astute reader, it was pointed out that the code in the blog wouldn’t actually work as advertised.

The Problem

Nolan correctly noticed that when a backup process attempts to acquire the lock via SETNX, the lock key will already exist from when it was acquired by the primary, and thus every subsequent attempt will end up stuck trying to acquire a lock that can never be acquired. As a reminder, here’s what we do when we check back on the status of a lock:

function checkLock(payload, lockIdentifier) {
    client.get(lockIdentifier, function(error, data) {
        // Error handling elided for brevity
        if (data !== DONE_VALUE) {
            acquireLock(payload, data + 1, lockCallback);
        } else {
            client.del(lockIdentifier);
        }
    });
}

And here’s the relevant bit from acquireLock that calls SETNX:

    client.setnx(lockIdentifier, attempt, function(error, data) {
        if (error) {
            logger.error("Error trying to acquire redis lock for: %s", lockIdentifier);
            return callback(error, dataForCallback(false));
        }

        return callback(null, dataForCallback(data === 1));
    });

So, you’re thinking, how could this vaunted failover process ever actually work? The answer is simple: the code from that post isn’t what we actually run. The actual production code has a single backup process, so it doesn’t try to re-acquire the lock in the event of failure; it just skips right to trying to send the message itself. In the previous post, I described a more general solution that would work for any number of backup processes, but I missed this one important detail.

That being said, with some relatively minor changes, it’s absolutely possible to support an arbitrary number of backup processes and still maintain the use of the global lock. The trivial solution is to simply have the backup process delete the key before trying to re-acquire the lock (or, technically, acquire it anew). However, the problem with that becomes apparent pretty quickly. If there are multiple backup processes all deleting the lock and trying to SETNX a new lock again, there’s a good chance that a race condition could arise wherein one of the backups deletes a lock that was acquired by another backup process, rather than the failed lock from the primary.

The Solution

Thankfully, Redis has a solution to help us out here: transactions. By using a combination of WATCH, MULTI, and EXEC, we can perform actions on the lock key and be confident that no one has modified it before our actions can complete. The process to acquire a lock remains the same: many processes will issue a SETNX and only one will win. The changes come into play when the processes that didn’t acquire the lock check back on its status. Whereas before, we simply checked the current value of the lock key, now we must go through the above described Redis transaction process. First we watch the key, then we do what amounts to a check and set (albeit with a few different actions to perform based on the outcome of the check):

function checkLock(payload, lockIdentifier, lastCount) {
    client.watch(lockIdentifier);
    client.multi()
        .get(lockIdentifier)
        .exec(function(error, replies) {
            if (!replies) {
                // Lock value changed while we were checking it, someone else got the lock
                client.get(lockIdentifier, function(error, newCount) {
                    setTimeout(checkLock, LOCK_EXPIRY, payload, lockIdentifier, newCount);
                });

                return;
            }

            var currentCount = replies[0];
            if (currentCount === null) {
                // No lock means someone else completed the work while we were checking on its status and the key has already been deleted
                return;
            } else if (currentCount === DONE_VALUE) {
                // Another process completed the work, let’s delete the lock key
                client.del(lockIdentifier);
            } else if (currentCount == lastCount) {
                // Key still exists, and no one has incremented the lock count, let’s try to reacquire the lock
                reacquireLock(payload, lockIdentifier, currentCount, doWork);
            } else {
                // Key still exists, but the value does not match what we expected, someone else has reacquired the lock, check back later to see how they fared
                setTimeout(checkLock, LOCK_EXPIRY, payload, lockIdentifier, currentCount);
            }
        });
}

As you can see, there are five basic cases we need to deal with after we get the value of the lock key:

  1. If we got a null reply back from Redis, that means that something else changed the value of our key, and our exec was aborted; i.e. someone else got the lock and changed its value before we could do anything. We just treat it as a failure to acquire the lock and check back again later.
  2. If we get back a reply from Redis, but the value for the key is null, that means that the work was actually completed and the key was deleted before we could do anything. In this case there’s nothing for us to do at all, so we can stop right away.
  3. If we get back a value for the lock key that is equal to our sentinel value, then someone else completed the work, but it’s up to us to clean up the lock key, so we issue a Redis DEL and call our job done.
  4. Here’s where things get interesting: if the key still exists, and its value (the number of attempts that have been made) is equal to our last attempt count, then we should try and reacquire the lock.
  5. The last scenario is where the key exists but its value (again, the number of attempts that have been made) does not equal our last attempt count. In this case, someone else has already tried to reacquire the lock and failed. We treat this as a failure to acquire the lock and schedule a timeout to check back later to see how whoever did acquire the lock got on. The appropriate action here is debatable. Depending on how long your underlying work takes, it may be better to actually try and reacquire the lock here as well, since whoever acquired the lock may have already failed. This can, however, lead to premature exhaustion of your attempt allotment, so to be safe, we just wait.

So, we’ve checked on our lock, and, since the previous process with the lock failed to complete its work, it’s time to actually try and reacquire the lock. The process in this case is similar to the above inasmuch as we must use Redis transactions to manage the reacquisition process; thankfully, however, the steps are (somewhat) simpler:

function reacquireLock(payload, lockIdentifier, attemptCount, callback) {
    client.watch(lockIdentifier);
    client.get(lockIdentifier, function(error, data) {
        if (!data) {
            // Lock is gone, someone else completed the work and deleted the lock, nothing to do here, stop watching and carry on
            client.unwatch();
            return;
        }

        var attempts = parseInt(data, 10) + 1;

        if (attempts > MAX_ATTEMPTS) {
            // Our allotment has been exceeded by another process, unwatch and expire the key
            client.unwatch();
            client.expire(lockIdentifier, ((LOCK_EXPIRY / 1000) * 2));
            return;
        }

        client.multi()
            .set(lockIdentifier, attempts)
            .exec(function(error, replies) {
                if (!replies) {
                    // The value changed out from under us, we didn't get the lock!
                    client.get(lockIdentifier, function(error, currentAttemptCount) {
                        setTimeout(checkLock, LOCK_EXPIRY, payload, lockIdentifier, currentAttemptCount);
                    });
                } else {
                    // Hooray, we acquired the lock!
                    callback(null, {
                        "acquired" : true,
                        "lockIdentifier" : lockIdentifier,
                        "payload" : payload
                    });
                }
            });
    });
}

As with checkLock we start out by watching the lock key, and proceed to do a (comparatively) simplified check and set. In this case, we’ve “only” got three scenarios to deal with:

  1. If we’ve already exceeded our allotment of attempts, it’s time to give up. In this case, the allotment was actually exceeded in another worker, so we can just stop right away. We make sure to unwatch the key, and set it to expire at some point far enough in the future that any remaining processes attempting to acquire locks will also see that it’s time to give up.

Assuming we’re still good to keep working, we try and update the lock key within a MULTI/EXEC block, where we have our remaining two scenarios:

  2. If we get no replies back, that again means that something changed the value of the lock key during our transaction and the EXEC was aborted. Since we failed to acquire the lock we just check back later to see what happened to whoever did acquire the lock.
  3. The last scenario is the one in which we managed to acquire the lock. In this case we just go ahead and do our work and hopefully complete it!

Bonus!

To make managing global locks even easier, I’ve gone ahead and generalized all the code mentioned in both this and the previous post on the subject into a tidy little event-based npm package: https://github.com/yahoo/redis-locking-worker. Here’s a quick snippet of how to implement global locks using this new package:

var RedisLockingWorker = require("redis-locking-worker");

var SUCCESS_CHANCE = 0.15;

var lock = new RedisLockingWorker({
    "lockKey" : "mylock",
    "statusLevel" : RedisLockingWorker.StatusLevels.Verbose,
    "lockTimeout" : 5000,
    "maxAttempts" : 5
});

lock.on("acquired", function(lastAttempt) {
    if (Math.random() <= SUCCESS_CHANCE) {
        console.log("Completed work successfully!", lastAttempt);
        lock.done(lastAttempt);
    } else {
        // oh no, we failed to do work!
        console.log("Failed to do work");
    }
});
lock.acquire();

There are also a few other events you can use to track the lock status:

lock.on("locked", function() {
    console.log("Did not acquire lock, someone beat us to it");
});

lock.on("error", function(error) {
    console.error("Error from lock: %j", error);
});

lock.on("status", function(message) {
    console.log("Status message from lock: %s", message);
});

More Bonus!

If you don’t need the added complexity of multiple backup processes, I also want to give credit to npm user pokehanai, who took the methodology described in the original post and created a generalized version of the two-worker solution: https://npmjs.org/package/redis-paired-worker.

Wrapping Up

So there you have it! Coordinating work on any number of processes across any number of hosts couldn’t be easier! If you have any questions or comments on this, please feel free to follow up on Twitter.

Flickr flamily floto

Like this post? Have a love of online photography? Want to work with us? Flickr is hiring engineers, designers and product managers in our San Francisco office. Find out more at flickr.com/jobs.

Highly Available Real Time Push Notifications and You

One of the goals of our recently launched (and awesome!) new Flickr iPhone app was to further increase user engagement on Flickr. One of the best ways to drive engagement is to make sure Flickr users know what’s happening on Flickr in as near-real time as possible. We already have email notifications, but email is no longer a good mechanism for real-time updates. Users may have many email accounts and may not check them frequently, causing timeliness to go right out the window. Clearly this called for… PUSH NOTIFICATIONS!

Motor bike racer getting a push start at the track, Brisbane
Motor bike racer getting a push start at the track, Brisbane by State Library of Queensland, Australia

I know, you’re thinking, “anyone can build push notifications, we’ve been doing it since 2009!” Which is, of course, absolutely true. The process for delivering push notifications is well-trodden territory by this point. So… let’s just skip all that boring stuff and focus on how we decided on the underlying architecture for our implementation. Our decisions focused on four major factors:

  1. Impact to normal page serving times should be minimal
  2. Delivery should be in near-real time
  3. Handle thousands of notifications per second
  4. The underlying services should be highly available

Baby Steps

Given these goals, we started by looking at systems we already have in place. Everyone loves not writing new code, right? Our thoughts immediately went to Flickr’s existing PuSH infrastructure. Our PuSH implementation is a great way to get an overview of relevant activity on Flickr, but it has limitations that made it unsuitable for powering mobile push notifications. The primary concern is that it’s less near-real-time than we’d like it to be. On average, activities occurring on Flickr will be delivered to a subscribed PuSH endpoint within one minute. That’s certainly better than waiting for an email to arrive or waiting until the next time you log in to the site and see your activity feed, but it’s not good enough for mobile notifications! This delay is due to some design decisions at the core of the PuSH system. PuSH is designed to aggregate activity and deliver a periodic digest and, because of this, it has a built-in window to allow multiple changes to the same photo to be accumulated. PuSH is also focused on guaranteed delivery, so it maintains an up-to-date list of all subscribers. These features, which make PuSH great for the purpose it was designed for, make it not-so-great for real-time notifications. So, repurposing the PuSH code for reuse in a more real-time fashion proved to be untenable.

Tentative Plans

So, what to do? In the end we wound up building a new lightweight event system that is broken up into three phases:

  1. Event Generation
  2. Event Targeting
  3. Message Delivery

Event Generation

The event generation phase happens while processing the response to a user request. As such, we wanted to ensure that there was little to no impact on the response times as a result. To ensure this was the case, all we do here is a lightweight write into a global Redis queue. We store the minimum amount of data possible, just a few identifiers, so we don’t have to make any extra DB calls and slow down the response just to (potentially) kick off a push notification. Everything after this initial Redis action is processed out of band by our deferred task system and has no impact on site performance.
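As a rough sketch (the key name and payload fields here are illustrative, not our actual schema), the write amounts to nothing more than pushing a small JSON blob onto a Redis list:

RPUSH push_notification_events '{"type":"fave","actor":12345,"photo":1234567890}'

The deferred task workers described next pop entries off this list and do the heavier lifting out of band.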

Event Targeting

Next in the process is the event targeting phase. Here we have many workers reading from the global Redis queue. When a worker receives an event from the queue it rehydrates the data and loads up any additional information necessary to act on the notification. This includes checking to see what users should be notified, whether those users have devices that are registered to receive notifications, if they’ve opted out of notifications of this type, and finally if they’ve muted activity for the object in question.

Message Delivery

Flickr’s web-serving stack is PHP, and, up until now, everything described has been processed by PHP. Unfortunately, one area where PHP does not excel is long-lived processes or network connections, both of which make delivering push notifications in real time much easier. Because of this we decided to build the final phase, message delivery, as a separate endpoint in Node.js.

So, the question arose: how do we get messages pending delivery from these PHP workers over to the Node.js endpoints that will actually deliver them? For this, we again turned to Redis, this time using its built in pub/sub functionality. The PHP workers simply publish a message to a Redis channel with the assumption that there’s a Node.js process subscribed to that channel eagerly awaiting some data on which it can act.

After that the Node process delivers the notification to Apple’s APNS push notification system. Communicating with APNS is a well-documented topic, and not one that’s particularly interesting. In fact, I can sum it up with a single link: https://github.com/argon/node-apn, a great npm package for talking to APNS.

The Real Challenge

There is, however, a much more interesting problem to discuss at this point: how do we ensure that delivery to APNS is both scalable and highly available? At first blush, this seems like it could be problematic. What if the Node.js worker has crashed? The message will just be lost to the ether! Solving this problem turned out to be the majority of the work involved in implementing push notifications.

Scalability

The first step to ensuring a service is scalable is to divide the workload. Since Node.js is single threaded, we would already be dividing the workload across individual Node.js processes anyway, so this works out well! When we publish messages to the Redis pub/sub channel, we simply publish to a sharded channel. Each Node.js process subscribes to some subset of those sharded channels, and so will only act on that subset of messages.

APNS, Redis Pub/Sub
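On the publishing side, that just means the PHP worker picks a shard for each message (say, by hashing an identifier from the message) and publishes to that shard’s channel. The channel naming below mirrors the subscribe code shown later in this post; the shard number and payload are purely illustrative:

PUBLISH notification_3 '{"identifier":"abc123","type":"fave","photo":1234567890}'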

Configuring our Node.js processes in this way makes it easy to scale horizontally. Whenever we need to add more processing power to the cluster, we can just add more servers and more shards. This also makes it easy to pull hosts out of rotation for maintenance without impacting message delivery: we simply reconfigure the remaining processes to subscribe to additional channels to pick up the slack.

Availability

Designing for high availability proved to be somewhat more challenging. We needed to ensure that we could lose individual Node processes, a whole server or even an entire data center without degrading our ability to deliver messages. And we wanted to avoid the need for a human in the loop — automatic failover.

We already knew that we’d have multiple hosts running in multiple data centers, so the main question was how to get them coordinating with each other so that we would not lose messages in the event of an outage while also ensuring we would not deliver the same message multiple times. Our first thought experiment along these lines was to implement a relatively complex message passing scheme, where two hosts would subscribe to a given channel, one as the primary and one as the backup. The primary would pass a message to the backup saying that it was starting to process a message, and another when it completed. The backup would wait a certain amount of time to receive the first and then the second message from the primary. If a message failed to arrive, it would assume something had gone wrong with the primary and attempt to complete delivery to Apple’s push notification gateway.

Initial Failover Plan

This plan had two major problems: hosts had to be aware of each other and increasing the number of hosts working in conjunction raised the complexity of ensuring reliable delivery.

We liked the idea of having one host serve as a backup for another, but we didn’t like having to coordinate the interaction between so many moving pieces. To solve this issue we went with a convention-based approach. Instead of each host having to maintain a list of its partners, we just use Redis to maintain a global lock. Easy enough, right? Perhaps some code is in order!

Finally, some code!

First we create our Redis clients. We need one client for regular Redis commands we use to maintain the lock, and a separate client for Redis pub/sub commands.

var redis = require("redis");
var client = redis.createClient(config.port, config.host);
var pubsubClient = redis.createClient(config.port, config.host);

Next, subscribe to the sharded channel and set up a message handler:

// We could be subscribing to multiple shards, but for the sake of simplicity we’ll just subscribe to one here
pubsubClient.subscribe("notification_" + shard);
pubsubClient.on("message", handleMessage);

Now, the interesting part. We have multiple Node.js processes subscribed to the same Redis pub/sub channel, and each process is in a different data center. Whenever any of them receive a message, they attempt to acquire a lock for that message:

function handleMessage(channel, message) {
    // Error handling elided for brevity
    var payload = JSON.parse(message);

    acquireLock(payload, 1, lockCallback);
}

Managing locks with Redis is made easy using the SETNX command. SETNX is a “set if not exists” primitive. From the Redis docs:

Set key to hold string value if key does not exist. In that case, it is equal to SET. When key already holds a value, no operation is performed.

If we have multiple processes calling SETNX on the same key, the command will only succeed for the process that first makes the call, and in that case the response from Redis will be 1. For subsequent SETNX commands, the key will already exist, and the response from Redis will be 0. The value we try to set with SETNX keeps track of how many attempts have been made to deliver the message; starting it at one allows us to retry failed messages a predefined number of times before giving up entirely.
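Run against a bare redis-cli it looks like this (the key suffix is a made-up identifier, following the "lock." + payload.identifier convention used below; the value is the attempt count):

SETNX lock.abc123 1
(integer) 1
SETNX lock.abc123 1
(integer) 0

The first caller gets a 1 back and wins; everyone else gets a 0 and knows to check back later.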

function acquireLock(payload, attempt, callback) {
    var lockIdentifier = "lock." + payload.identifier;

    function dataForCallback(acquired) {
        return {
            "acquired" : acquired,
            "lockIdentifier" : lockIdentifier,
            "payload" : payload,
            "attempt" : attempt
        };
    }

    // The value of the lock key indicates how many lock attempts have been made
    client.setnx(lockIdentifier, attempt, function(error, data) {
        if (error) {
            logger.error("Error trying to acquire redis lock for: %s", lockIdentifier);
            return callback(error, dataForCallback(false));
        }

        return callback(null, dataForCallback(data === 1));
    });
}

At this point our attempt to acquire the lock has either succeeded or failed, and our callback is invoked. What we do next depends on whether we managed to acquire the lock. If we did acquire the lock, we simply attempt to send the message. If we did not acquire the lock, then we will check back later to see if the message was sent successfully (more on this later):

function lockCallback(error, data) {
    // Again, error handling elided for brevity
    if (data && data.acquired) {
        return sendMessage(data.payload, data.lockIdentifier, data.attempt === MAX_ATTEMPTS);
    } else if (data && !data.acquired) {
        return setTimeout(checkLock, LOCK_EXPIRY, data.payload, data.lockIdentifier);
    }
}

Finally, it’s time to actually send the message! We do some work to process the payload into a form we can use to pass to APNS and send it off. If all goes well, we do one of two things:

  1. If this was our first attempt to send the message, we update the lock key in Redis to a sentinel value indicating we were successful. This is the value the backup processes will check for to determine whether or not sending succeeded.
  2. If this was our last attempt to send the message (i.e. the primary process failed to deliver and now a backup process is handling delivery), we simply delete the lock key.

function sendMessage(payload, lockIdentifier, lastAttempt) {
    // Does some work to process the payload and generate an APNS notification object
    var notification = generateApnsNotification(payload);

    if (notification) {
        // The APNS connection is defined/initialized elsewhere
        apnsConnection.sendNotification(notification);

        if (lastAttempt) {
            client.del(lockIdentifier);
        } else {
            client.set(lockIdentifier, DONE_VALUE);
        }
    }
}

There’s one final piece of the puzzle: checking the lock in the process that did not acquire it initially. Here we issue a Redis GET to retrieve the current value of the lock key. If the process that won the lock managed to send the message, this key should be set to a well known sentinel value. If so, we don’t have any work to do, and we can simply delete the lock. However, if this value is not set to that sentinel value, then something went wrong with delivery in the process that originally acquired the lock and we should step up and try to deliver the message from this backup process:

function checkLock(payload, lockIdentifier) {
    client.get(lockIdentifier, function(error, data) {
        // Error handling elided for brevity
        if (data !== DONE_VALUE) {
            acquireLock(payload, data + 1, lockCallback);
        } else {
            client.del(lockIdentifier);
        }
    });
}

Summing Up

So, there you have it in a nutshell. This method of coordinating between processes makes it very easy to adjust the number of processes subscribing to a given shard’s channels. There’s no need for any process subscribed to a channel to be aware of how many other processes are also subscribed. As long as we have at least two processes in separate data centers subscribing to each shard, we are protected from all of the following scenarios:

  • The crash of any individual Node.js process
  • The loss of a single host running the Node.js processes
  • The loss of an entire data center containing many hosts running the Node.js processes

Let’s go back over our initial goals and see how we fared:

  1. Impact to normal page serving times should be minimal

We accomplish this by minimizing the workload done as part of the normal browser-driven request/response processing. The deferred task system picks up from there, out of band.

  2. Delivery should be in near-real time

Processing stats from our implementation show that the time from the user action that triggers event generation to message delivery averages about 400ms, and the whole pipeline is completely event driven (no polling).

  3. Handle thousands of notifications per second

In stress tests of our system, we were able to process more than 2,000 notifications per second on a single host (8 Node.js workers, each subscribing to multiple shards).

  4. The underlying services should be highly available

The availability design is resilient to a variety of failure scenarios, and failover is automatic.

We hope you’re enjoying push notifications in the new Flickr iPhone app.

Addendum!

There was a minor problem with the code in this post when supporting more than two workers. For a full explanation of the problem and the solution, check out Redis Global Locks Redux.

Flickr flamily floto

Like this post? Have a love of online photography? Want to work with us? Flickr is hiring engineers, designers and product managers in our San Francisco office. Find out more at flickr.com/jobs.

Avoiding Dragons: A Practical Guide to Drag ’n’ Drop

You, the enterprising programmer, know about parsing EXIF from photos in the browser and even how and why to power this parsing with web workers. “But,” you ask yourself, “how do I get those photos into the browser in the first place?”

The oldest and most low-tech solution is the venerable <input type="file" name="foo">. This plops the old standby file button on your page and POSTs the file’s contents to your server upon form submission.

To address many of this simple control’s limitations we debuted a Flash-based file uploader in 2008. This workhorse has been providing per-file upload statuses, batch file selection, and robust error handling for the last four years through Flash’s file system APIs.

These days we can thankfully do this work without plugins. Not only can we use XHR to POST files and provide all the other fancy info we’ve long needed Flash for, but now we can pair this with something much better than an <input>: drag and drop. This allows people to drag files directly into a browser window from the iPhotos, Lightrooms, and Windows Explorers of the world.

Let’s take a look at how this works.

Foundations first

Workmen laying the cornerstone, construction of the McKim BuildingWorkmen laying the cornerstone, construction of the McKim Building by Boston Public Library

Let’s begin with our simple fallback, a – yes – <input type="file">.

HTML: <input type="file" multiple accept="image/*,video/*">
JS: Y.all('input[type=file]').on('change', handleBrowse);

Here we start with an <input> that accepts multiple files and knows it only accepts images and videos. Then, we bind an event handler to its change event. That handler can be very simple:

function handleBrowse(e) {
	// get the raw event from YUI
	var rawEvt = e._event;
	
	// pass the file list into the loadFiles function
	if (rawEvt.target && rawEvt.target.files) {
		loadFiles(rawEvt.target.files);
	}
}

A simple matter of handing the event object’s file array off to our universal function that adds files to our upload queue. Let’s take a look at this file loader:

function loadFiles(files) {
	updateQueueLength(files.length);
	
	for (var i = 0; i < files.length; i++) {
		var file = files[i];
		
		if (File && file instanceof File) {
			enqueueFileAddition(file);
		}
	}
}

Looks clear – it’s just going over the file list and adding them to a queue. “But wait,” you wonder, “why all this queue nonsense? Why not just kick off an XHR for the file right now?” Indeed, we’ve stuck in a layer of abstraction here that seems unnecessary. And for now it is. But suppose our pretty synchronous world were soon to become a whole lot less synchronous – that could get real fun in a hurry. For now, we’ll put that idea aside and take a look at these two queue functions themselves:

function updateQueueLength(quantity) {
	state.files.total += quantity;
}

function enqueueFileAddition(file) {
	state.files.handles.push(file);
	
	// If all the files we expect have shown up, then flush the queue.
	if (state.files.handles.length === state.files.total) {
		for (var i = 0, len = state.files.total; i < len; i++) {
			addFile(state.files.handles[i]);
		}
		
		// reset the state of the world
		state.files.handles = [];
		state.files.total = 0;
	}
}

Pretty straightforward. One function for leaving a note of how many files we expect, one function to add files and see if we have all the files we expect. If so, pass along everything we have to addFile() which sends the file into our whirlwind of XHRs heading off to the great pandas in the sky.
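addFile() itself is beyond the scope of this post, but for the curious, a bare-bones stand-in looks something like the following. The endpoint and field name are invented for illustration; this is not Flickr’s actual uploader code.

function addFile(file) {
	// '/upload' and 'photo' are placeholder names, not a real endpoint
	var formData = new FormData();
	formData.append('photo', file, file.name);

	var xhr = new XMLHttpRequest();
	xhr.open('POST', '/upload', true);

	// per-file progress reporting, one of the things we used to lean on Flash for
	xhr.upload.onprogress = function(e) {
		if (e.lengthComputable) {
			console.log(file.name + ': ' + Math.round((e.loaded / e.total) * 100) + '%');
		}
	};

	xhr.onload = function() {
		console.log(file.name + ' finished with HTTP status ' + xhr.status);
	};

	xhr.send(formData);
}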

Droppin’ dragons

Droppin’ dragonsDroppin’ dragons by Phil Dokas

While all of that is well and good, it was all for a ho-hum <input> element! Let’s hook a modern browser’s drag and drop events into this system:

document.addEventListener('drop', function(e) {
	if (e.dataTransfer && e.dataTransfer.files) {
		loadFiles(e.dataTransfer.files);
	}
});
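One caveat the snippet above glosses over: for the drop to ever reach our handler, the browser’s default drag handling has to be cancelled. In practice that means also listening for dragover and calling preventDefault() (and doing the same in the drop handler itself), otherwise most browsers will simply navigate away to the dropped file:

document.addEventListener('dragover', function(e) {
	// cancelling dragover is what makes the page a valid drop target
	e.preventDefault();
});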

The drag and drop API is a fairly complicated one, but it thankfully makes the task of reading files out of a drop event easy. Every drop will have a dataTransfer attribute and when there’s at least one file in the drag that member will itself have a files attribute.

In fact, when you’re only concerned about handling files dragged directly into the browser you could call it a day right here. The loadFiles() function we wrote earlier knows how to handle instances of the File class and that’s exactly what dataTransfer.files stores. Easy!

Put it up to eleven

While easy is a good thing, awesome is awesome. How could we make dragging files into a browser even better? Well, how about cutting down on the trouble of finding the folder with your photos somewhere on your desktop, opening it, and then dragging those files into the browser? What if we could just drag the folder in and call it a day?

goes to 11goes to 11 by Rick Kimpel

Try to drag a folder into the browser with the current state of our code; what happens? Our code tells the browser to treat all dropped file system objects as files. So what ultimately happens for folders is a very elaborate “nothing”. To fix this, we need to tell the browser how to handle directories. In our case, we want it to recursively walk every directory it sees and pick out the photos from each.

From here on out we’re going to be treading over tumultuous land, rife with rapidly changing specs and swiftly updating browsers. This becomes immediately apparent in how we begin to add support for directories. We need to update our drop event handler like this:

document.addEventListener('drop', function(e) {
	if (e.dataTransfer && e.dataTransfer.items) {
		loadFiles(e.dataTransfer.items);
	}
	else if (e.dataTransfer && e.dataTransfer.files) {
		loadFiles(e.dataTransfer.files);
	}
});

Items? Files? The difference is purely a matter of one being the newer interface where development happens and the other being the legacy interface. This is spelled out a bit in the spec, but the short of it is that the files member will be kept around for backwards compatibility while newer abilities will be built in the items namespace. Our code above prefers to use the items attribute if available, while falling back to files for compatibility. The real fun is what comes next.

You see, the items namespace deals with Items, not Files. Items can be thought of as pointers to things in the file system. Thankfully, that includes the directories we’re after. But unfortunately, this is the file system and the file system is slow. And JavaScript is single-threaded. These two facts together are a recipe for latency. The File System API tackles this problem with the same solution as Node.js: asynchronicity. Most of the functions in the API accept a callback that will be invoked when the disk gets around to providing the requested files. So we’ll have to update our code to do two new things: 1) translate items into files and 2) handle synchronous and asynchronous APIs.

So what do these changes look like? Let’s turn back to loadFiles() and teach it how to handle these new types of files. Taking a look at the spec for the Item class, there appears to be a getAsFile() function and that sounds perfect.

function loadFiles(files) {
	updateQueueLength(files.length);
	
	for (var i = 0; i < files.length; i++) {
		var file = files[i];
		
		if (typeof file.getAsFile === 'function') {
			enqueueFileAddition(file.getAsFile());
		}
		else if (File && file instanceof File) {
			enqueueFileAddition(file);
		}
	}
}

Easy – but there’s a problem. The getAsFile() function is very literal. It assumes the Item points to a file. But directories aren’t files and that means this method won’t meet our needs. Fortunately, there is a solution and that’s through yet another data type, the Entry. An Entry is much like a File, but it can also represent directories. As mentioned in this WHATWG wiki document, there is a proposed method, getAsEntry(), in the Item interface that allows you to grab an Entry for its file system object. It’s browser-prefixed for now, so let’s add that in as well.

function loadFiles(files) {
	updateQueueLength(files.length);
	
	for (var i = 0; i < files.length; i++) {
		var file = files[i];
		var entry;
		
		if (file.getAsEntry) {
			entry = file.getAsEntry();
		}
		else if (file.webkitGetAsEntry) {
			entry = file.webkitGetAsEntry();
		}
		else if (typeof file.getAsFile === 'function') {
			enqueueFileAddition(file.getAsFile());
		}
		else if (File && file instanceof File) {
			enqueueFileAddition(file);
		}
	}
}

So what we have now is a way of handling native files and a way of turning Items into Entries. Now we need to figure out if the Entry is a file or a directory and then handle that appropriately.

What we’ll do is queue up any File objects we run across and skip the loop ahead to the next object. But if we have an Item and successfully turn it into an Entry then we’ll try to resolve this down to a file or a directory.

function loadFiles(files) {
	updateQueueLength(files.length);
	
	for (var i = 0; i < files.length; i++) {
		var file = files[i];
		var entry, reader;
		
		if (file.getAsEntry) {
			entry = file.getAsEntry();
		}
		else if (file.webkitGetAsEntry) {
			entry = file.webkitGetAsEntry();
		}
		else if (typeof file.getAsFile === 'function') {
			enqueueFileAddition(file.getAsFile());
			continue;
		}
		else if (File && file instanceof File) {
			enqueueFileAddition(file);
			continue;
		}
		
		if (!entry) {
			updateQueueLength(-1);
		}
		else if (entry.isFile) {
			entry.file(function(file) {
				enqueueFileAddition(file);
			}, function(err) {
				console.warn(err);
			});
		}
		else if (entry.isDirectory) {
			reader = entry.createReader();
			
			reader.readEntries(function(entries) {
				loadFiles(entries);
				updateQueueLength(-1);
			}, function(err) {
				console.warn(err);
			});
		}
	}
}

The code is getting long, but we’re almost done. Let’s unpack this.

The first branch of our new Entry logic ensures that what was returned by webkitGetAsEntry()/getAsEntry() is something useful. When they error they return null and this will happen if an application provides data in the drop event that isn’t a file. To see this in action try dragging a few files in from Preview in Mac OS X – it’s odd behavior, but this adequately cleans it up.

Next we handle files. The Entry spec provides the brilliantly simple isFile and isDirectory attributes. These guarantee whether you have a FileEntry or a DirectoryEntry on your hands. These classes have useful – though as promised, asynchronous – methods and here we use FileEntry’s file() method and enqueue its returned file.

Finally, the unicorn we’re chasing – handling directories. This is a tad more complicated, but the idea is straightforward. We create a DirectoryReader, which lets us read the directory’s contents through its readEntries() method, which provides an array of Entries. And what do we do with these Entries? We recursively call our loadFiles() function with them! In this step we recursively walk a branch of the file system, rooting out every available image. Finally, we decrement the count of expected files by 1 to indicate that this was a directory and it has now been suitably handled.

But there is one more thing.

In that final directory reading step we recursively called loadFiles() with an array of Entries. As of right now, this function only expects to handle Files and Items. Let’s patch up this oversight, add a final bit of error handling, and call it a day.

function loadFiles(files) {
	updateQueueLength(files.length);
	
	for (var i = 0; i < files.length; i++) {
		var file = files[i];
		var entry, reader;
		
		if (file.isFile || file.isDirectory) {
			entry = file;
		}
		else if (file.getAsEntry) {
			entry = file.getAsEntry();
		}
		else if (file.webkitGetAsEntry) {
			entry = file.webkitGetAsEntry();
		}
		else if (typeof file.getAsFile === 'function') {
			enqueueFileAddition(file.getAsFile());
			continue;
		}
		else if (File && file instanceof File) {
			enqueueFileAddition(file);
			continue;
		}
		else {
			updateQueueLength(-1);
			continue;
		}
		
		if (!entry) {
			updateQueueLength(-1);
		}
		else if (entry.isFile) {
			entry.file(function(file) {
				enqueueFileAddition(file);
			}, function(err) {
				console.warn(err);
			});
		}
		else if (entry.isDirectory) {
			reader = entry.createReader();
			
			reader.readEntries(function(entries) {
				loadFiles(entries);
				updateQueueLength(-1);
			}, function(err) {
				console.warn(err);
			});
		}
	}
}

All we need to do to handle an Entry is to rely on the fact that Entries have those oh-so-helpful isFile and isDirectory attributes. If we see those we know we have an Entry of one type or another and we know how to work with them, so just skip on down to the FileEntry and DirectoryEntry handling code.

And that, finally, is it. There are many specs with very new data types at play here, but through this turmoil we can achieve some very nice results never before possible in browsers.

Further reading

Flickr flamily floto

Like this post? Have a love of online photography? Want to work with us? Flickr is hiring engineers, designers and product managers in our San Francisco office. Find out more at flickr.com/jobs.

Flickr at SF Web Performance

Wait! Did you say they all run Webkit?
Wait! Did you say they all run Webkit? by Schill

Thanks to everyone that came out to the SF Web Performance meet up last night! For those of you that missed it, JP and Aaron were kind enough to record the entire event on Ustream.

You can also view the slides and associated blog posts for each of the presentations:

  • Optimizing Touch Performance, by Stephen Woods: slides and blog post
  • Using Web Workers for fun and profit: Parsing Exif in the client, by Chris Berry: slides and blog post
  • The Grid: How we show 10,000 photos on a page without crashing your browser, by Scott Schiller: slides and blog post

Big thanks to JP and Aaron for setting it up and running the event so well!

Join the Flickr Frontend team tonight at the SF Web Performance meet up!

Team Tinfoil
Team Tinfoil by waferbaby

We will be hosting the SF Web Performance meet up tonight at 7pm at Citizen Space. Come join us for pizza, drinks, and these great talks:

Using Web Workers for fun and profit: Parsing Exif in the client, by Chris Berry

Exif, the exchangeable image file format, describes various sets of metadata stored in a photo. Really interesting metadata, like image titles, descriptions, lens focal lengths, camera types, image orientation, even GPS data! I’ll go over the methods for extracting this data on the front-end, in real time, using web workers.

The Grid: How we show 10,000 photos on a page without crashing your browser, by Scott Schiller

Flickr’s latest Web-based Uploadr interface uses HTML5 APIs to push bytes en masse. Its real power, however, is the UI which enables users to add and edit the metadata of hundreds of photos while they are uploading in the background.

Handling the selection, display and management of large numbers of photos in a browser UI meant that the Uploadr project needed to be designed for scalability from the ground up.

This talk will go into some of the details of the Uploadr “Grid” UI, technical notes and performance findings made during its development.

Optimizing Touch Performance, by Stephen Woods

Touch interfaces are amazing. Touch devices are amazingly slow. Stephen Woods will share hard-won advice for building responsive touch-based interfaces using HTML5, CSS, and JavaScript. He also reveals how Star Trek: The Next Generation predicted the need for instant user feedback in a touch-based UI and how TiVo’s slow UI was made bearable by a simple “bloop” sound.

See you there!

We saved you a step…

It seems when we launched version 2.0 of our Flickr shapes, we posted them with a flaw which made them useless to most popular geo applications.

Awwwww…

Luckily, Christopher Manning wrote a python script which makes them useful.

Yaaaayyyyy!

The least we can do is post an update which has already been christopher-manning-ified. So, we are very happy to announce version 2.0.1 of the Flickr shape files, which can be downloaded here:
http://www.flickr.com/services/shapefiles/2.0.1/

Look, it works:


Flickr Shapes 2.0.1 in TileMill

A very hearty THANKS! from your friends at Flickr, Christopher.

Designing an OSM Map Style

With the recent change to our map system, we introduced a new map style for our OSM tiles. Since 2008, we’ve used the default OSM styles, which produce map tiles like this:

This style is extremely good at putting a lot of information in front of you. OSM doesn’t know your intended purpose for the maps (navigation, orientation, exploration, city planning, disaster response, etc.), so they err on the side of lots of information. This is good, but with the introduction of TileMill, non-professional cartographers (like myself) can now easily change map styles to better suit our needs. Using TileMill, we decided to take a crack at designing a map that is better suited to Flickr.

On Flickr, we use maps for a very specific purpose: to provide context for a photo. This means there are a lot of map features that we can leave out entirely. We can choose to hide features that are primarily used for navigation (ferry and train routes, bus stops) or for demarcation (city and county boundaries). Roads are useful as orientation tools, but certain road features (like exit numbers on highways) aren’t needed. In the end, we can reduce the data that the map shows to a much smaller and more useful subset:

This is the style provided by MapBox’s excellent OSM Bright. As a starting point, this gets us a long way towards our goal of an unobtrusive yet still useful map. We made a few changes to OSM Bright and released them on GitHub as our Pandonia map style. Here are a few examples of the changes we made:

  • Toned down the road, land, and water colors, to allow greater contrast with the pink and blue dots that we use as markers
  • Reduced the density of road and highway names, as well as city, town and state names
  • Removed underground tram and rail line
  • Removed land use overlays for residential, commercial, and industrial zones, as well as parking lots
  • Removed state park overlays that overlapped the water

This is how it looks:

We tried a lot of different color combinations on the road to this style. Here is an animation of the different styles we tried, starting with OSM Bright.

Here it is zoomed in a bit more:

Over the next couple of weeks, we’ll be rolling out this style to all of the places where we use OSM tiles.

These maps are still a work in progress. The world is a big place, and creating a unified style that works well for every single location is challenging. If you notice problems with our new map styles, please let us know!

The great map update of 2012

Today we are announcing an update to the map tiles we use site-wide. The vast majority of the globe will be represented by Nokia’s clever-looking tiles.

Nokia map tile

We are not stopping there. As some of you may know, Flickr has been using OpenStreetMap (OSM) data to make map tiles for some places. We started with Beijing, and the list has grown to twenty-one additional places:

Mogadishu
Cairo
Algiers
Kiev
Tokyo
Tehran
Hanoi
Ho Chi Minh City
Manila
Davao
Cebu
Baghdad
Kabul
Accra
Hispaniola
Havana
Kinshasa
Harare
Nairobi
Buenos Aires
Santiago

It has been a while since we last updated our OSM tiles. Since 2009, the OSM community has advanced quite a bit, both in the tools it provides and in the quality of its data. I went into a little detail about this in a talk I gave last year.

Introducing Pandonia


Today we are launching Buenos Aires and Santiago in a new style, and we will be launching more cities in it in the near future. These tiles are built from more recent OSM data, and they use an entirely new style which we call Pandonia. It was designed in TileMill from the osm-bright template, both created by the rad team at MapBox. TileMill changes the game when it comes to styling map tiles: its interface lets you quickly iterate on style changes and see the results immediately. Ross Harmes will be writing a more detailed account of the work he did to create the Pandonia style. We appreciate the tips and guidance from Eric Gunderson, Tom MacWright, and the rest of the team at MapBox.

We are looking forward to updating all of our OSM places with the Pandonia style in the near future and growing to more places after that… Antarctica? Null Island? The Moon? Stay tuned and see…

Changing our JavaScript API

To host all of these new tiles we needed to find a flexible JavaScript API. CloudMade’s Leaflet is a simple, open source tile-serving JavaScript library. The events and methods map well to our previous JS API, which made upgrading simple for us. All of our existing map interfaces will stay the same with the addition of modern map tiles. They will also support touch screen devices better than ever. Leaflet’s layers mechanism will make it easier for us to blend different tile sources together seamlessly. We have a fork on GitHub which we plan to contribute to as time goes on. We’ll keep you posted.
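
As a rough illustration of how little glue this takes, here is a minimal Leaflet sketch that creates a map and points it at a custom tile layer. This is not our production code; the tile URL template, element id, and coordinates below are placeholders.

//
// A minimal Leaflet sketch: one map, one custom tile layer.
// The element id and URL template are placeholders.
//

var map = new L.Map('map-container');

var tiles = new L.TileLayer('http://{s}.tile.example.com/{z}/{x}/{y}.png', {
	maxZoom: 18,
	attribution: 'Map data © OpenStreetMap contributors'
});

map.addLayer(tiles);

// center on San Francisco at a city-level zoom
map.setView(new L.LatLng(37.7749, -122.4194), 12);

Blending different sources then mostly comes down to which tile layer gets added for a given place.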

Web workers and YUI

(Flickr is hiring! Check out our open job postings and what it’s like to work at Flickr.)

Web workers are awesome. They’ll change the way you think about JavaScript.

Factory Scenes : Consolidated/Convair Aircraft Factory San Diego

Chris posted an excellent writeup on how we do client-side Exif parsing in the new Uploader, which is how we can display thumbnails before uploading your photos to the Flickr servers. But parsing metadata from hundreds of files can be a little expensive.

In the old days, we’d attempt to divide our expensive JS into smaller parts, using setTimeout to yield to the UI thread, crossing our fingers, and hoping that the user could still scroll and click when they wanted to. If that didn’t work, then the feature was simply too fancy for the web.
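
If you never had the pleasure, the pattern looked something like this (a sketch, not our actual code; parseExif() and the chunk size are made up):

//
// The old way: process a few items, then setTimeout(0) to yield
// back to the UI thread before continuing. parseExif() and the
// chunk size are hypothetical.
//

function processQueue(files) {
	var CHUNK = 10,
		batch = files.splice(0, CHUNK),
		i;

	for (i = 0; i < batch.length; i++) {
		parseExif(batch[i]); // the expensive part
	}

	if (files.length) {
		// let the browser handle clicks, scrolling, and repaints
		setTimeout(function () {
			processQueue(files);
		}, 0);
	}
}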

Since then, a lot has happened. People started using better browsers. HTML got an orange logo. Web workers were discovered.

So now we can run JavaScript in separate threads (“parallel execution environments”), without interrupting the standard UI stuff the browser is always working on. We just need to put our job code in a separate file, and instantiate a web worker.

Without YUI

For simple, one-off tasks, you can just write some JavaScript in a new file and upload it to your server. Then create a worker like this:

var worker = new Worker('my_file.js');

worker.addEventListener('message', function (e) {
	// do something with the message from the worker
});

// pass some data into the worker
worker.postMessage({
	foo: 'bar' // any structured-clone-friendly value
});

Of course, the worker thread won’t have access to anything in the main thread. You can post messages containing anything that’s JSON compatible, but not functions, cyclical references, or special objects like File references.

That means any modules or helper functions you’ve defined in your main thread are out of bounds, unless you’ve also included them in your worker file. That can be a drag if you’re accustomed to working in a framework.
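
For example, trying to post a function across the boundary will throw in most browsers (typically a DataCloneError), since functions can’t be cloned:

// Illustrative only: structured cloning rejects functions.
try {
	worker.postMessage({
		onDone: function () {}
	});
} catch (err) {
	// post plain data instead, and keep the logic in the worker file
}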

With YUI

Practically speaking, a worker thread isn’t very different from the main thread. Workers can’t access the DOM, and they have a top-level self object instead of window. But plenty of our existing JavaScript modules and helper functions would be very useful in a worker thread.

Flickr is built on YUI. Its modular architecture is powerful and encourages clean, reusable code. We have a ton of small JS files—one per module—and the YUI Loader figures out how to put them all together into a single URL.

If we want to write our worker code like we write our normal code, our worker file can’t be just my_file.js. It needs to be a full combo URL, with YUI running inside it.

An aside for the brogrammers who have never seen modular JS in practice

The Loader dynamically loads script and CSS files for YUI modules as well as external modules. It includes the dependency information for the version of the library in use, and will automatically pull in dependencies for the modules requested.

In development, we have one JS file per module. Let’s say photo.js, kitten.js, and puppy.js.
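
For the uninitiated, each of those files is just a YUI.add() wrapper around the module’s code, something like this (the module body here is made up; only the wrapper is the real convention):

//
// kitten.js: a hypothetical module definition.
//

YUI.add('kitten', function (Y) {

	Y.namespace('Flickr').Kitten = {
		meow: function () {
			return 'meow';
		}
	};

}, '1.0.0', { requires: ['photo'] });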

A page full of kitten photos might require two of those modules. So we tell YUI that we want to use photo.js and kitten.js, and the YUI Loader appends a script node with a combo URL that looks something like this:

<script src="/combo.php?photo.js&kitten.js"></script>

On our server, combo.php finds the two files on disk and prints out the contents, which are immediately executed inside the script node.

C-c-c-combo

Of course, the main thread is already running YUI, which we can use to generate the combo URL required to create a worker.

That URL needs to return the following:

  1. YUI.add() statements for any required modules. (Don’t forget yui-base)
  2. YUI.add() statement for the primary module with the expensive code.
  3. YUI.add() statement to execute the primary module.
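
Stitched together, the response body is roughly a concatenation shaped like this (a sketch only; the real module definitions are far longer):

// (1) and (2): YUI.add() wrappers for yui-base, any other required
// modules, and finally the primary module, all concatenated by the
// combo handler:

YUI.add('my_worker_module', function (Y) {
	// ...the expensive code lives here...
}, '1.0.0');

// (3): a final chunk of code that waits for the page's YUI config,
// then instantiates YUI and uses the primary module. That's the
// message listener shown at the end of this post.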

Ok, so how do we generate this combo URL? Like so:

//
// Make a reference to our original YUI configuration object,
// with all of our module definitions and combo handler options.
//
// To make sure it's as clean as possible, we use a clone of the
// object from before we passed it into YUI.
//

var yconf = window.yconf; // global for demo purposes

//
// Y.Loader.resolve can be used to generate a combo URL with all
// the YUI modules needed within the web worker. (YUI 3.5 or later)
//
// The YUI Loader will bypass any required modules that have
// already been loaded in this instance, so in addition to the
// clean configuration object, we use a new YUI instance.
//

var Y2 = YUI(Y.merge(yconf));

var loader = new Y2.Loader({
	// comboBase must be on the same domain as the main thread
	comboBase: '/local/combo/path/',
	combine: true,
	ignoreRegistered: true,
	maxURLLength: 2048,
	require: ['my_worker_module']
});

var out = loader.resolve(true);

var combo_url = out.js[0];

Then, also in the main thread, we can start the worker instance:

//
// Use the combo URL to create a web worker.
// This is when the combo URL is downloaded, parsed, 
// and executed.
//

var worker = new window.Worker(combo_url);

To start using YUI, we need to pass our YUI config object into the worker thread. That could have been part of the combo URL, but our YUI config is pretty specific to the particular page you’re on, so we need to reuse the same object we started with in the main thread. So we use postMessage to pass it from the main thread to the worker:

//
// Post the YUI config into the worker.
// This is when the worker actually starts its work.
//

worker.postMessage({
	yconf: yconf
});

Now we’re almost done. We just need to write the worker code that waits for our YUI config before using the module. So, at the bottom of the combo response, in the worker thread:

self.addEventListener('message', function (e) {

	if (e.data.yconf) {

		//
		// make sure bootstrapping is disabled
		//
		
		e.data.yconf.bootstrap = false;

		//
		// instantiate YUI and use it to execute the callback
		//
		
		YUI(e.data.yconf).use('my_worker_module', function (Y) {

			// do some hard work!

		});

	}

}, false);

Yeah, I know the back-and-forth between the main thread and the worker makes that look complicated. But it’s actually just a few steps:

  1. Main thread generates a combo URL and instantiates a Web Worker.
  2. Worker thread parses and executes the JS returned by that URL.
  3. Main thread posts the page’s YUI config into the worker thread.
  4. Worker thread uses the config to instantiate YUI and “use” the worker module.
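
For completeness, here is a sketch of what the worker module itself might look like, including how it could hand results back to the main thread with self.postMessage (the module body and message shape are made up):

YUI.add('my_worker_module', function (Y) {

	Y.namespace('Flickr').WorkerJob = {

		run: function (data) {

			// ...do the expensive work on data here...

			// hand the results back to the main thread, where a
			// worker.addEventListener('message', ...) handler like the
			// one in the "Without YUI" example can pick them up
			self.postMessage({
				result: data
			});
		}
	};

}, '1.0.0');

The “do some hard work!” spot in the listener above is then where something like Y.Flickr.WorkerJob.run() would be called.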

That’s it. Now get to work!