Thursday, December 20, 2012

How our photo filters came into focus

The old adage “a picture is worth a thousand words” is very apt for Twitter: a single photo can express what otherwise might require many Tweets. Photos help capture whatever we’re up to: kids’ birthday parties, having fun with our friends, the world we see when we travel.

Like so many of you, lots of us here at Twitter really love sharing filtered photos in our tweets. As we got into doing it more often, we began to wonder if we could make that experience better, easier and faster. After all, the now-familiar process for tweeting a filtered photo has required a few steps:

1. Take the photo (with an app)
2. Filter the photo (probably another app)
3. Finally, tweet it!

Constantly needing to switch apps takes time, and results in frustration and wasted photo opportunities. So we challenged ourselves to make the experience as fast and simple as possible. We wanted everyone to be able to easily tweet photos that are beautiful, timeless, and meaningful.

With last week’s photo filters release, we think we accomplished that on the latest versions of Twitter for Android and Twitter for iPhone. Now we'd like to tell you a little more about what went on behind the scenes in order to develop this new photo filtering experience.

It’s all about the filters

Our guiding principle: to create filters that amplify what you want to express, and to help that expression stand the test of time. We began with research, user stories, and sketches. We designed and tested multiple iterations of the photo-taking experience, and relied heavily on user research to make decisions about everything from filter nomenclature and iconography to the overall flow. We refined and distilled until we felt we had the experience right.



We spent many hours poring over the design of the filters. Since every photo is different, we did our analyses across a wide range of photos including portraits, scenery, indoor, outdoor and low-light shots. We also calibrated details ranging from color shifts, saturation, and contrast, to the shape and blend of the vignettes before handing the specifications over to Aviary, a company specializing in photo editing. They applied their expertise to build the algorithms that matched our filter specs.



Make it fast!

Our new photo filtering system is a tight integration of Aviary's cross-platform GPU-accelerated photo filtering technology with our own user interface and visual specifications for filters. Implementing this new UI presented some unique engineering challenges. The main one was the need to create an experience that feels instant and seamless while working within the memory and processing constraints of the wide range of devices our apps support.

To make our new filtering experience work, our implementation keeps up to four full-screen photo contexts in memory at once: three full-screen versions of the image for when you’re swiping through photos (the one you’re currently looking at, plus the ones to its left and right), and a fourth containing nine small versions of the photo for the grid view. Every time you apply or remove a crop or magic enhance, we update the small images in the grid view to reflect those changes, so it’s always up to date.

Without those cached versions, you could see a lag when scrolling between photos. But mobile phones just don't have a lot of memory, and if we weren't careful about when and how we set up these chunks of memory, we could run out and crash the app. So we worked closely with Aviary's engineering team to strike a balance that works well across many use cases.

Test and test some more

As soon as engineering kicked off, we rolled out this new feature internally so that we could work out the kinks, sanding down the rough spots in the experience. At first the team tested it, and then we opened it up to all employees to get lots of feedback. We also engaged people outside the company for user research. All of this was vital to getting a good sense of which aspects of the UI would resonate and which wouldn’t.

After much testing and feedback, we designed an experience in which you can quickly and easily choose between different filtering options, displayed side by side and in a grid. Auto-enhancement and cropping are both a single tap away in an easy-to-use interface.



Finally, a collaborative team of engineers, designers and product managers was able to ship a set of filters wrapped in a seamless UI that anyone with our Android or iPhone app can enjoy. Over time, we want our filters to evolve so that sharing and connecting become even more delightful. It feels great to finally share this work with all of you.


Posted by @ryfar
Tweet Composer Team



Thursday, December 13, 2012

Class project: “Analyzing Big Data with Twitter”

Twitter partnered with UC Berkeley this past semester to teach Analyzing Big Data with Twitter, a class with Prof. Marti Hearst. In the first half of the semester, Twitter engineers went to UC Berkeley to talk about the technology behind Twitter: from the basics of scaling up a service to the algorithms behind user recommendations and search. These talks are available online, on the course website.

In the second half of the course, students applied their knowledge and creativity to build data-driven applications on top of Twitter. They came up with a range of products that included tracking bands or football teams, monitoring Tweets to find calls for help, and identifying communities on Twitter. Each project was mentored by one of our engineers.

Last week, 40 of the students came to Twitter HQ to demo their final projects in front of a group of our engineers, designers and engineering leadership team.

The students' enthusiasm and creativity inspired and impressed all of us who were involved. The entire experience was really fun, and we hope to work with Berkeley more in the future.

Many thanks to the volunteer Twitter engineers, to Prof. Hearst, and of course to our fantastic students!




Posted by Gilad Mishne - @gilad
Engineering Manager, Search

Tuesday, December 11, 2012

Blobstore: Twitter’s in-house photo storage system

Millions of people turn to Twitter to share and discover photos. To make it possible to upload a photo and attach it to your Tweet directly from Twitter, we partnered with Photobucket in 2011. As soon as photos became a more native part of the Twitter experience, more and more people began using this feature to share photos.

In order to introduce new features and functionality, such as filters, and continue to improve the photos experience, Twitter’s Core Storage team began building an in-house photo storage system. In September, we began to use this new system, called Blobstore.

What is Blobstore?

Blobstore is Twitter’s low-cost and scalable storage system built to store photos and other binary large objects, also known as blobs. When we set out to build Blobstore, we had three design goals in mind:

  • Low Cost: Reduce the amount of money and time Twitter spent on storing Tweets with photos.
  • High Performance: Serve images in the low tens of milliseconds, while maintaining a throughput of hundreds of thousands of requests per second.
  • Easy to Operate: Keep operational overhead low even as Twitter’s infrastructure continues to grow.

How does it work?

When a user tweets a photo, we send the photo off to one of a set of Blobstore front-end servers. The front-end understands where a given photo needs to be written, and forwards it on to the servers responsible for actually storing the data. These storage servers, which we call storage nodes, write the photo to disk and then inform the Metadata store that the image has been written, recording the information required to retrieve the photo later. This Metadata store, a non-relational key-value store cluster with automatic multi-DC synchronization capabilities, spans all of Twitter’s data centers and provides a consistent view of the data that is in Blobstore.
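
A rough, hypothetical sketch of that write path (Blobstore itself is not written in JavaScript; storageNodesFor is sketched in the next section, and the other names are made up):

// Hypothetical sketch of the Blobstore write path described above.
async function storePhoto(photoId, photoBytes) {
  // Front-end: find the storage nodes responsible for this photo and
  // forward the bytes to them.
  const nodes = storageNodesFor(photoId);

  for (const node of nodes) {
    // Storage node: append the photo to disk...
    const location = await node.writeToDisk(photoId, photoBytes);
    // ...then tell the Metadata store how to retrieve it later.
    await metadataStore.record(photoId, node.id, location);
  }
}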

The brain of Blobstore, the blob manager, runs alongside the front-ends, storage nodes, and index cluster. The blob manager acts as a central coordinator for the management of the cluster. It is the source of all of the front-ends’ knowledge of where files should be stored, and it is responsible for updating this mapping and coordinating data movement when storage nodes are added, or when they are removed due to failures.

Finally, we rely on Kestrel, Twitter’s existing asynchronous queue server, to handle tasks such as replicating images and ensuring data integrity across our data centers.

We guarantee that when an image is successfully uploaded to Twitter, it is immediately retrievable from the data center that initially received the image. Within a short period of time, the image is replicated to all of our other data centers, and is retrievable from those as well. Because we rely on a multi-data-center Metadata store for the central index of files within Blobstore, we are aware in a very short amount of time whether an image has been written to its original data center; we can route requests there until the Kestrel queues are able to replicate the data.

Blobstore Components

How is the data found?

When an image is requested from Blobstore, we need to determine its location in order to access the data. There are a few approaches to solving this problem, each with its own pros and cons. One such approach is to map or hash each image individually to a given server by some method. This method has a fairly major downside in that it makes managing the movement of images much more complicated. For example, if we were to add or remove a server from Blobstore, we would need to recompute a new location for each individual image affected by the change. This adds operational complexity, as it would necessitate a rather large amount of bookkeeping to perform the data movement.

We instead created a fixed-size container for individual blobs of data, called a “virtual bucket”. We map images to these containers, and then we map the containers to the individual storage nodes. We keep the total number of virtual buckets unchanged for the entire lifespan of our cluster. In order to determine which virtual bucket a given image is stored in, we perform a simple hash on the image’s unique ID. As long as the number of virtual buckets remains the same, this hashing will remain stable. The advantage of this stability is that we can reason about the movement of data at a much more coarse-grained level than the individual image.
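
In code, that lookup can be as simple as hashing the image ID modulo the fixed bucket count. A minimal, hypothetical sketch (the real bucket count and hash function are not given in this post):

// Hypothetical sketch of the virtual-bucket lookup described above.
const NUM_VIRTUAL_BUCKETS = 65536; // fixed for the cluster's lifetime (illustrative value)

// Map an image's unique (numeric) ID to a virtual bucket. Because the bucket
// count never changes, this mapping stays stable across topology changes.
function virtualBucketFor(imageId) {
  return imageId % NUM_VIRTUAL_BUCKETS;
}

// The bucket-to-node mapping, maintained by the blob manager, is the only
// piece that changes when storage nodes are added or removed.
let bucketToNodes = loadCurrentMapping(); // hypothetical; refreshed on topology changes

function storageNodesFor(imageId) {
  return bucketToNodes[virtualBucketFor(imageId)]; // array of replica nodes
}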

How do we place the data?

When mapping virtual buckets to physical storage nodes, we keep some rules in mind to make sure that we don’t lose data when we lose servers or hard drives. For example, if we were to put all copies of a given image on a single rack of servers, losing that rack would mean that particular image would be unavailable.

If we were to completely mirror the data on a given storage node on another storage node, it would be unlikely that we would ever have unavailable data, as the likelihood of losing both nodes at once is fairly low. However, whenever we were to lose a node, we would only have a single node to source from to re-replicate the data. We would have to recover slowly, so as to not impact the performance of the single remaining node.

If we were to take the opposite approach and allow any server in the cluster to share a range of data with all other servers, then we would avoid a bottleneck when recovering lost replicas, as we would essentially be able to read from the entire cluster in order to re-replicate data. However, we would also have a very high likelihood of data loss if we were to lose more nodes than the replication factor of the cluster (two) per data center, as the chance that any two nodes would share some piece of data would be high. So the optimal approach lies somewhere in the middle: for a given piece of data, there is a limited number of machines that could share the range of data of its replica, more than one but fewer than the entire cluster.

We took all of these things into account when we determined the mapping of data to our storage nodes. As a result, we built a library called “libcrunch” that understands the various data placement rules (such as rack awareness), knows how to replicate the data in a way that minimizes the risk of data loss while also maximizing the throughput of data recovery, and attempts to minimize the amount of data that needs to be moved upon any change in the cluster topology (such as when nodes are added or removed). It also gives us the power to fully map the network topology of our data center, so storage nodes have better data placement and we can take into account rack awareness and the placement of replicas across PDU zones and routers.

Keep an eye out for a blog post with more information on libcrunch.

How is the data stored?

Once we know where a given piece of data is located, we need to be able to efficiently store and retrieve it. Because of their relatively high storage density, we are using standard hard drives inside our storage nodes (3.5” 7200 RPM disks). Since this means that disk seeks are very expensive, we attempted to minimize the number of disk seeks per read and write.

We pre-allocate ‘fat’ files of around 256MB each on every storage node disk using fallocate(). We store each blob of data sequentially within a fat file, along with a small header. The offset and length of the data are then stored in the Metadata store, which uses SSDs internally, as the access pattern for index reads and writes is very well-suited to solid state media. Furthermore, splitting the index from the data saves us from needing to scale out memory on our storage nodes, because we don’t need to keep any local indexes in RAM for fast lookups. The only time we end up hitting disk on a storage node is once we already have the fat file location and byte offset for a given piece of data. This means that we can generally guarantee a single disk seek for that read.
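
Once the Metadata store has been consulted, a read boils down to a single positioned read from the right fat file. A hedged sketch in Node.js (the storage nodes are not actually written in JavaScript; the field names are illustrative):

// Hypothetical sketch of a storage-node read: the Metadata store has already
// told us which fat file holds the blob and at what offset and length, so a
// single positioned read (one disk seek) returns the data.
const fs = require('fs');

function readBlob(meta, callback) {
  // meta = { fatFilePath, offset, length } -- illustrative field names
  fs.open(meta.fatFilePath, 'r', (err, fd) => {
    if (err) return callback(err);
    const buf = Buffer.alloc(meta.length);
    fs.read(fd, buf, 0, meta.length, meta.offset, (readErr) => {
      fs.close(fd, () => callback(readErr, buf));
    });
  });
}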


Topology Management

As the number of disks and nodes increases, the rate of failure increases. Capacity needs to be added, disks and nodes need to be replaced after failures, and servers need to be moved. To make Blobstore operationally easy, we put a lot of time and effort into libcrunch and the tooling associated with making cluster changes.


When a storage node fails, data that was hosted on that node needs to be copied from a surviving replica to restore the correct replication factor. The failed node is marked as unavailable in the cluster topology, and so libcrunch computes a change in the mapping from the virtual buckets to the storage nodes. From this mapping change, the storage nodes are instructed to copy and migrate virtual buckets to new locations.

Zookeeper
Topology and placement rules are stored internally in one of our Zookeeper clusters. The blob manager handles this interaction, using the information stored in Zookeeper whenever an operator makes a change to the system. A topology change can consist of adjusting the replication factor; adding, failing, or removing nodes; or adjusting other input parameters for libcrunch.

Replication across Data centers

Kestrel is used for cross-data-center replication. Because Kestrel is a durable queue, we use it to asynchronously replicate our image data across data centers.

Data center-aware Routing

TFE (Twitter Frontend) is one of Twitter’s core components for routing. We wrote a custom plugin for TFE that extends the default routing rules. Our Metadata store spans multiple data centers, and because the metadata stored per blob is small (a few bytes), we typically replicate this information much faster than the blob data. If a user tries to access a blob that has not yet been replicated to the data center they are routed to, we look up the metadata and proxy the request to the nearest data center that has the blob stored. This gives us the property that if replication gets delayed, we can still route requests to the data center that stored the original blob, serving the user the image at the cost of slightly higher latency until it’s replicated to the closer data center.
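
In rough pseudocode, the routing decision the plugin makes might look like the following; the function names are hypothetical, and TFE's actual plugin API is not described in this post.

// Hypothetical sketch of the data center-aware routing described above.
async function routeBlobRequest(blobId, localDataCenter) {
  // The per-blob metadata replicates much faster than the blob itself, so it
  // tells us which data centers actually have the bytes right now.
  const meta = await metadataStore.get(blobId);

  if (meta.dataCenters.includes(localDataCenter)) {
    return serveFromLocalBlobstore(blobId); // common case
  }

  // Replication hasn't caught up yet: proxy to the nearest data center that
  // has the blob, at the cost of slightly higher latency.
  const remote = nearestDataCenterWith(meta.dataCenters, localDataCenter);
  return proxyTo(remote, blobId);
}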

Future work

We have shipped the first version of Blobstore internally. Although Blobstore started with photos, we are adding other features and use cases that require blob storage. We are also continuously iterating on it to make it more robust, scalable, and easier to maintain.

Acknowledgments

Blobstore was a group effort. The following folks have contributed to the project: Meher Anand (@meher_anand), Ed Ceaser (@asdf), Harish Doddi (@thinkingkiddo), Chris Goffinet (@lenn0x), Jack Gudenkauf (@_jg), and Sangjin Lee (@sjlee).

Posted by Armond Bigian @armondbigian
Engineering Director, Core Storage & Database Engineering

Friday, December 7, 2012

Implementing pushState for twitter.com

As part of our continuing effort to improve the performance of twitter.com, we've recently implemented pushState. With this change, users experience a perceptible decrease in latency when navigating between sections of twitter.com, in some cases near-zero latency, as we're now caching responses on the client.

This post provides an overview of the pushState API, a summary of our implementation, and details some of the pitfalls and gotchas we experienced along the way.

API Overview

pushState is part of the HTML 5 History API, a set of tools for managing state on the client. The pushState() method enables mapping of a state object to a URL. The address bar is updated to match the specified URL without actually loading the page.

history.pushState([page data], [page title], [page URL])

While the pushState() method is used when navigating forward from A to B, the History API also provides a "popstate" event, used to manage back/forward button navigation. The event's "state" property maps to the data passed as the first argument to pushState().

If the user presses the back button to return to the initial point from which they first navigated via pushState, the "state" property of the "popstate" event will be undefined. To set the state for the initial, full-page load, use the replaceState() method. It accepts the same arguments as the pushState() method.

history.replaceState([page data], [page title], [page URL])
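
Putting those pieces together, a minimal illustrative setup might look like this; render() stands in for whatever application-specific rendering you do:

// On the initial full-page load, seed the history entry so that navigating
// back to it later produces a populated event.state.
history.replaceState({ page: 'home' }, 'Home', '/home');

// Navigating forward: record the new state and URL without reloading the page.
function navigateTo(data, title, url) {
  history.pushState(data, title, url);
  render(data); // application-specific rendering
}

// Back/forward buttons: restore the view from the state saved earlier.
window.addEventListener('popstate', function (event) {
  if (event.state) {
    render(event.state);
  }
});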

The following diagram illustrates how usage of the History API comes together.


Diagram illustrating use of the HTML 5 History API

Progressive Enhancement

Our pushState implementation is a progressive enhancement on top of our previous work, and could be described as Hijax + server-side rendering. By maintaining view logic on the server, we keep the client light, and we maintain support, with the same URLs, for browsers that don't support pushState. This approach provides the additional benefit of enabling us to disable pushState at any time without jeopardizing any functionality.

On the Server

On the server, we configured each endpoint to return either full-page responses, or a JSON payload containing a partial, server-side rendered view, along with its corresponding JavaScript components. The decision of what response to send is determined by checking the Accept header and looking for "application/json."

The same views are used to render both types of requests; to support pushState the views format the pieces used for the full-page responses into JSON.

Here are two example responses for the Interactions page to illustrate the point:

pushState response

{
  // Server-rendered HTML for the view
  page: "<div>…</div>",
  // Path to the JavaScript module for the associated view
  module: "app/pages/connect/interactions",
  // Initialization data for the current view
  init_data: {…},
  title: "Twitter / Interactions"
}

Full page response

<html>
  <head>
    <title>{{title}}</title>
  </head>
  <body>
    <div id="page-container">{{page}}</div>
    <script>
      using({{module}}, function (pageModule) {
        pageModule({{init_data}});
      });
    </script>
  </body>
</html>
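
On a Node/Express-style stack, choosing between those two responses might look something like the following. Twitter's actual server code isn't shown in this post, so treat the route and view helpers below as a hedged sketch with made-up names:

// Hedged, Express-style sketch of the content negotiation described above.
const express = require('express');
const app = express();

app.get('/i/connect/interactions', function (req, res) {
  // The same server-side view is rendered regardless of response format.
  const view = renderInteractionsView(req); // hypothetical view helper

  if (req.accepts(['html', 'json']) === 'json') {
    // pushState navigation: return the rendered partial plus its metadata.
    res.json({
      page: view.html,
      module: 'app/pages/connect/interactions',
      init_data: view.initData,
      title: 'Twitter / Interactions'
    });
  } else {
    // Ordinary navigation: return the full page wrapping the same partial.
    res.send(renderFullPage(view)); // hypothetical full-page renderer
  }
});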

Client Architecture

Several aspects of our existing client architecture made it particularly easy to enhance twitter.com with pushState.

By contract, our components attach themselves to a single DOM node, listen for events via delegation, and fire events on the DOM; those events are broadcast to other components via DOM event bubbling. This allows our components to be even more loosely coupled: a component doesn't need a reference to another component in order to listen for its events.

Additionally, all of our components are defined using AMD, enabling the client to make decisions about which components to load.

With this client architecture, we implemented pushState by adding two components: one responsible for managing the UI, the other for managing data. Both are attached to the document, listen for events across the entire page, and broadcast events that are available to all components.
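
As a loose illustration of that contract (the component and event names below are made up, and modern DOM APIs are used for brevity):

// One component attaches to a single DOM node and listens via delegation.
function attachNavComponent(node) {
  node.addEventListener('click', function (e) {
    const link = e.target.closest('a.js-nav');
    if (!link) return;
    e.preventDefault();
    // Fire a bubbling DOM event; any component listening on an ancestor node
    // can react without holding a reference to this component.
    link.dispatchEvent(new CustomEvent('uiNavigate', {
      bubbles: true,
      detail: { href: link.getAttribute('href') }
    }));
  });
}

// Another component, attached to the document, picks the event up via bubbling.
document.addEventListener('uiNavigate', function (e) {
  console.log('navigation requested:', e.detail.href);
});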

UI Component

  • Manages the decision to pushState URLs by listening for document-wide clicks and keyboard shortcuts
  • Broadcasts an event to initiate pushState navigation
  • Updates the UI in response to events from the data component

Data Component

  • Only included if we're using pushState
  • Manages XHRs and caching of responses
  • Provides eventing around the HTML 5 History API to provide a single interface for UI components

Example pushState() Navigation Lifecycle

  1. The user clicks a link with a specialized class (we chose "js-nav"); the click is caught by the UI component, which prevents the default behavior and triggers a custom event to initiate pushState navigation.
  2. The data component listens for that event (a rough sketch follows this list) and…
    1. Writes the current view to cache and, only before the initial pushState navigation, calls replaceState() to set the state data for the view
    2. Fetches the JSON payload for the requested URL (either via XHR or from cache)
    3. Updates the cache for the URL
    4. Calls pushState() to update the URL
    5. Triggers an event indicating the UI should be updated
  3. The UI component resumes control by handling the event from the data component and…
    1. Tears down the JavaScript components for the current view (detaching event listeners and cleaning up associated state)
    2. Replaces the HTML for the current view with the new HTML
    3. Fetches, via the script loader, only those modules not already loaded
    4. Initializes the JavaScript components for the new view
    5. Triggers an event to alert all components that the view is rendered and initialized
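
Here is a condensed, hypothetical sketch of the data component's half of that lifecycle; the event names, cache policy, and helper functions are made up, and modern APIs (fetch, async/await) are used for brevity:

const pageCache = {};

document.addEventListener('uiNavigate', async function (e) {
  const url = e.detail.href;

  // Cache the current view and, before the first pushState navigation only,
  // seed the initial history entry via replaceState().
  pageCache[location.pathname] = currentViewPayload(); // hypothetical helper
  if (!history.state) {
    history.replaceState({}, document.title, location.pathname);
  }

  // Fetch the JSON payload for the requested URL (or reuse the cached copy),
  // then update the cache.
  const payload = pageCache[url] ||
    await fetch(url, { headers: { Accept: 'application/json' } }).then(r => r.json());
  pageCache[url] = payload;

  // Update the address bar without a reload, then hand control back to the
  // UI component to render the new view.
  history.pushState({}, payload.title, url);
  document.dispatchEvent(new CustomEvent('pageDataAvailable', { detail: payload }));
});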

Pitfalls, Gotchas, etc.

It'll come as no surprise to any experienced frontend engineers that the majority of the problems and annoyances with implementing pushState stem from either 1) inconsistencies in browser implementations of the HTML 5 History API, or 2) having to replicate behaviors or functionality you would otherwise get for free with full-page reloads.

Don't believe the API: title updates are manual

All browsers currently disregard the title argument passed to the pushState() and replaceState() methods. Any updates to the page title need to be made manually.
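
For example:

// The title argument is ignored, so set the document title explicitly after
// updating the URL.
history.pushState({}, '', '/i/connect/interactions');
document.title = 'Twitter / Interactions';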

popstate Event Inconsistencies

At the time of this writing, WebKit (and only WebKit) fires an extraneous popstate event after initial page load. This appears to be a known bug in WebKit, and is easy to work around by ignoring popstate events if the "state" property is undefined.

State Object Size Limits

Firefox imposes a 640KB limit on the serialized state object passed to pushState(), and will throw an exception if that limit is exceeded. We hit this limit in the early days of our implementation, and moved to storing state in memory. We limit the size of the serialized JSON we cache on the client per URL, and can adjust that number via a server-owned config.

It's worth noting that due to the aforementioned popstate bug in WebKit, we pass an empty object as the first argument to pushState() to distinguish WebKit's extraneous popstate events from those triggered in response to back/forward navigation.
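
Combined, the two workarounds look roughly like this (restoreViewFor is a hypothetical restore function):

// Pass a (truthy) empty object as the state for real navigations...
history.pushState({}, '', url);

// ...so WebKit's extraneous popstate on page load (state is null) can be
// ignored, while real back/forward navigations are still handled.
window.addEventListener('popstate', function (event) {
  if (!event.state) return; // extraneous WebKit event
  restoreViewFor(document.location.pathname); // hypothetical
});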

Thoughtful State Management Around Caching

The bulk of the work implementing pushState went into designing a simple client framework that would facilitate caching and provide the right events to enable components to both prepare themselves to be cached, and restore themselves from cache. This was solved through a few simple design decisions:

  1. All events that trigger navigation (clicks on links, keyboard shortcuts, and back/forward button presses) are abstracted by the pushState UI component, routed through the same path in the data component, and subsequently fire the same events. This allows the UI to be both cached and handle updates in a uniform way.
  2. The pushState UI component fires events around the rendering of updates: one before the DOM is updated, and another after the update is complete. The former enables UI components such as dialogs and menus to be collapsed in advance of the page being cached; the latter enables UI components like timelines to update their timestamps when rendered from cache.
  3. POST & DELETE operations bust the client-side cache.

Re-implementing Browser Functionality

As is often the case, changing the browser's default behavior in an effort to make the experience faster or simpler for the end-user typically requires more work on behalf of developers and designers. Here are some pieces of browser functionality that we had to re-implement:

  • Managing the position of the scrollbar as the user navigates forward and backward.
  • Preserving context menu functionality when preventing a link's default click behavior.
  • Accounting for especially fast, indecisive user clicks by ensuring the response you're rendering is in sync with the last requested URL.
  • Canceling outbound XHRs when the user requests a new page to avoid unnecessary UI updates.
  • Implementing the canonical AJAX spinner, so the user knows the page is loading.

Final Thoughts

Despite the usual browser inconsistencies and other gotchas, we're pretty happy with the HTML 5 History API. Our implementation has enabled us to deliver the fast initial page rendering times and robustness we associate with traditional, server-side rendered sites, and the lightning-quick in-app navigation and state changes associated with client-side rendered web applications.


—Todd Kloots, Engineer, Web Core team (@todd)

Tuesday, December 4, 2012

Twitter and SMS Spoofing

Over the past two days, a few articles have been published about a potential problem concerning the ability to post false updates to another user's SMS-enabled Twitter account, and it has been misreported that US-based Twitter users are currently vulnerable to this type of attack.

The general concern is that if a user has a Twitter account configured for SMS updates, and an attacker knows that user's phone number, it could be possible for the attacker to send a fake SMS message to Twitter that looks like it's coming from that user's phone number, which would result in a fake post to that user's timeline.

Most Twitter users interact over the SMS channel using a "shortcode." In the US, for instance, this shortcode is 40404. Because of the way that shortcodes work, it is not possible to send an SMS message with a fake source address to them, which eliminates the possibility of an SMS spoofing attack against those numbers.

However, in some countries a Twitter shortcode is not yet available, and in those cases Twitter users interact over the SMS channel using a "longcode." A longcode is basically just a normal-looking phone number. Given that it is possible to send an SMS message with a fake source address to these numbers, we have offered PIN protection to users who sign up with a longcode since 2007. As of August of this year, we have additionally disallowed posting through longcodes for users that have an available shortcode.

It has been misreported that US-based Twitter users are currently vulnerable to a spoofing attack because PIN protection is unavailable for them. Because US-based Twitter users interact through a shortcode, PIN protection isn't necessary for them: they are not vulnerable to SMS spoofing. We only provide the option for PIN protection in cases where a user could have registered with a longcode that is susceptible to SMS spoofing.

We work hard to protect our users from these kinds of threats and many others, and will continue to keep Twitter a site deserving of your trust. 

Posted by Moxie Marlinspike - @moxie
Engineering Manager, Product Security