Category Archives: Sync

Pertaining to Firefox Sync.

Syncing and storage on three platforms

As it’s Christmas, I thought I’d take a moment to write down my reflections on Firefox Sync’s iterations over the years. This post focuses on how they actually sync — not the UI, not the login and crypto parts, but how they decide that something has changed and what they do about it.

I’ve been working on Sync for more than five years now, on each of its three main client codebases: first desktop (JavaScript), then Android (built from scratch in Java), and now on iOS (in Swift).

Desktop’s overall syncing strategy is unchanged from its early life as Weave.

Partly as a result of Conway’s Law writ large — Sync shipped as an add-on, built by the Services team rather than the Firefox team, with essentially no changes to Firefox itself — and partly for good reasons, Sync was separate from Firefox’s storage components.

It uses Firefox’s observer notifications to observe changes, making a note of changed records in what it calls a Tracker.

This is convenient, but it has obvious downsides:

  • From an organizational perspective, it’s easy for developers to disregard changes that affect Sync, because the code that tracks changes is isolated. For example, desktop Sync still doesn’t behave correctly in the presence of fancy Firefox features like Clear Recent History, Clear Private Data, restoring bookmark backups, etc.
  • Sync doesn’t get observer notifications for all events. Most notably, bulk changes sometimes roll up or omit events, and it’s always possible for code to poke at databases directly, leaving Sync out of the loop. If a Places database is corrupt, or a user replaces it manually, Sync’s tracking will be wrong. This is almost inevitable when sync metadata doesn’t live with the data it tracks.
  • Sync doesn’t track actual changes; it tracks changed IDs. When a sync occurs, it goes to storage to get a current representation of the changed record. (If the record is missing, we assume it was deleted.) This makes it very difficult to do good conflict resolution.
  • In order to avoid cycles, Sync stops listening for events while it’s syncing. That means it misses any changes the user makes during a sync.
  • Similarly, it doesn’t see changes that happen before it registers its observers, e.g., during the first few seconds of using the browser.
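To make a couple of those downsides concrete, here is a minimal sketch of an ID-only tracker. It is not desktop Sync’s actual code: the `Tracker` class, the `sync` function, and the dictionary standing in for storage are all hypothetical, and only the behavior described in the list above is modeled.

```python
class Tracker:
    """Hypothetical ID-only change tracker fed by observer notifications."""

    def __init__(self):
        self.changed_ids = set()
        self.listening = True

    def on_item_changed(self, guid):
        # Only the ID is recorded; the old and new values are not kept.
        if self.listening:
            self.changed_ids.add(guid)


def sync(tracker, storage, during_sync=None):
    """Upload the current state of each tracked ID, ignoring events meanwhile."""
    tracker.listening = False          # avoid cycles from our own writes
    outgoing = {}
    for guid in sorted(tracker.changed_ids):
        # A missing record is assumed to have been deleted.
        outgoing[guid] = storage.get(guid, {"deleted": True})
    if during_sync:
        during_sync()                  # simulate the user acting mid-sync
    tracker.changed_ids.clear()
    tracker.listening = True
    return outgoing


storage = {"aaa": {"title": "Mozilla"}}
tracker = Tracker()
tracker.on_item_changed("aaa")
tracker.on_item_changed("bbb")         # changed, then removed before the sync


def user_edit():
    storage["ccc"] = {"title": "Added mid-sync"}
    tracker.on_item_changed("ccc")     # dropped: the tracker is not listening


uploaded = sync(tracker, storage, during_sync=user_edit)
```

After this runs, `ccc` exists in storage but appears neither in `uploaded` nor in the tracker, so nothing will ever prompt a later sync to upload it.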

Beyond the difficulties introduced by a reliance on observers, desktop Sync took some shortcuts [1]: it applies incoming records directly and non-transactionally to storage, so an interrupted sync leaves local storage in a partial state. That’s usually OK for unstructured data like history — it’ll try again on the next sync, and eventually catch up — but it’s a bad thing for something structured like bookmarks, and can still be surprising elsewhere (e.g., passwords that aren’t consistent across your various intranet pages, form fields that are mismatched so you get your current street address and your previous city and postal code).

During the last days of the Services team, Philipp, Greg, and I, among others, were rethinking how we performed syncs. We settled on a repository-centric approach: records were piped between repositories (remote or local), abstracting away the details of how a repository figured out what had changed, and giving us the leeway to move to a better internal structure.
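A rough sketch of what “piping records between repositories” might look like follows. Every name here (`MemoryRepository`, `store`, `fetch_since`, `pipe`) is an assumption made for illustration, as is the use of a simple version counter; the post does not describe the real interfaces.

```python
class MemoryRepository:
    """Hypothetical repository: hides how it works out what changed."""

    def __init__(self):
        self.records = {}      # guid -> (version, payload)
        self.version = 0

    def store(self, guid, payload):
        self.version += 1
        self.records[guid] = (self.version, payload)

    def fetch_since(self, version):
        """Yield (guid, payload) for records stored after `version`."""
        for guid, (v, payload) in self.records.items():
            if v > version:
                yield guid, payload


def pipe(source, sink, since=0):
    """Flow records changed in `source` after `since` into `sink`."""
    high_water = source.version
    for guid, payload in list(source.fetch_since(since)):
        sink.store(guid, payload)
    return high_water          # pass this as `since` next time


remote, local = MemoryRepository(), MemoryRepository()
remote.store("aaa", {"title": "Mozilla"})
remote.store("bbb", {"title": "Firefox"})
mark = pipe(remote, local)

remote.store("ccc", {"title": "Sync"})
pipe(remote, local, since=mark)
```

The point of the shape is that `pipe` never needs to know whether a repository is a server, a database, or a test double; swapping the change-detection strategy only touches the repository.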

That design never shipped on desktop, but it was the basis for our Sync implementation on Android.

Android presented some unique constraints. Again, Conway’s Law applied, albeit to a lesser extent, but also the structure of the running code had to abide by Android’s ContentProvider/SyncAdapter/Activity patterns.

Furthermore, Fennec was originally planning to support Android’s own internal bookmark and history storage, so its internal databases mirrored that schema. You can still see the fossilized remnants of that decision in the codebase today. When that plan was nixed, the schema was already starting to harden. The compromise we settled on was to use modification timestamps and deletion flags in Fennec’s content providers, and use those to extract changes for Sync in a repository model.

Using timestamps as the basis for tracking changes is a common error when developers hack together a synchronization system. They’re convenient, but client clocks are wrong surprisingly often, jump around, and lack granularity. Clocks from different devices shouldn’t be compared, but we do it anyway when reconciling conflicts. Still, it’s what we had to work with at the time.
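A small illustration of why comparing clocks across devices goes wrong. The record shape, the `reconcile_by_timestamp` function, and the ten-minute skew are all made up for the example; it is a sketch of last-write-wins in general, not of Fennec’s reconciling code.

```python
def reconcile_by_timestamp(local, remote):
    """Last-write-wins: keep whichever record claims the later modified time."""
    return local if local["modified"] > remote["modified"] else remote


# True order of events: the laptop edit happens first, the phone edit
# 300 seconds later. The phone's clock is assumed to run 10 minutes slow.
PHONE_SKEW = -600

laptop_edit = {"title": "Old title", "modified": 1_000_000}
phone_edit = {"title": "New title", "modified": 1_000_300 + PHONE_SKEW}

winner = reconcile_by_timestamp(phone_edit, laptop_edit)
```

Here `winner` is the laptop’s older edit: the phone’s more recent change loses purely because its clock was behind.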

The end result is over-engineered, fundamentally flawed, still directly applies records to storage, but works well enough. We have seen dramatically fewer bugs in Android Sync than we saw in desktop Sync between 2010 and 2012. I attribute some of that simply to the code having been written for production rather than being a Labs project (the desktop bookmark sync code was particularly flawed, and Philipp and I spent a lot of time making it better), some of it to lessons learned, and some of it to better languages and tooling — Java and Eclipse produce code with fewer silly bugs [2] than JavaScript and Vim.

On iOS we had the opportunity to learn from the weaknesses in the previous two implementations.

The same team built the frontend, storage, and Sync, so we put logic and state in the right places. We track Sync-related metadata directly in storage. We can tightly integrate with bulk-deletion operations like Clear Private Data, and change tracking doesn’t rely on timestamps: it’s an integral part of making the change itself.

We also record enough data to do proper three-way merges, which avoids a swath of quiet data loss bugs that have plagued Sync over the years (e.g., recent password changes being undone).
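For readers who have not met the term, a field-level three-way merge can be sketched as below. This is a generic illustration, not the iOS implementation: the `three_way_merge` function, the login fields, and the choice to favor the local value on a true conflict are all assumptions.

```python
def three_way_merge(base, local, remote):
    """Merge two records field by field against their shared parent.

    Returns (merged, conflicts). A field changed on only one side takes that
    side's value; a field changed differently on both sides is a conflict,
    resolved here (arbitrarily, for the sketch) in favor of local.
    """
    merged, conflicts = {}, []
    for field in sorted(set(base) | set(local) | set(remote)):
        b, l, r = base.get(field), local.get(field), remote.get(field)
        if l == r:
            value = l              # both sides agree
        elif l == b:
            value = r              # only remote changed
        elif r == b:
            value = l              # only local changed
        else:
            value = l              # both changed: a real conflict
            conflicts.append(field)
        if value is not None:
            merged[field] = value
    return merged, conflicts


base = {"hostname": "https://example.com", "username": "me", "password": "old"}
local = dict(base, password="new")        # password changed on this device
remote = dict(base, username="me@work")   # username changed elsewhere

merged, conflicts = three_way_merge(base, local, remote)
```

Without the shared parent, a two-way comparison only sees that the records differ and has to pick one whole record, which is exactly how a recent password change gets undone.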

We incrementally apply chunks of records, downloaded in batches, so we rarely need to re-download anything in the case of mid-sync failures.

And we buffer downloaded records where appropriate, so the scary part of syncing — actually changing the database — can be done locally with offline data, even within a single transaction.
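The buffering idea can be sketched with SQLite from Python. The table names, columns, and helper functions below are hypothetical and much simpler than any real schema; the sketch only shows batches landing in a buffer table and then being applied to the live table in one transaction that either fully succeeds or leaves the live table untouched.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE history (guid TEXT PRIMARY KEY, url TEXT NOT NULL)")
db.execute("CREATE TABLE history_buffer (guid TEXT PRIMARY KEY, url TEXT)")


def buffer_batch(db, batch):
    """Persist one downloaded batch; safe to repeat after a failure."""
    with db:
        db.executemany(
            "INSERT OR REPLACE INTO history_buffer (guid, url) VALUES (?, ?)",
            batch)


def apply_buffer(db):
    """Move buffered rows into the live table in a single transaction."""
    with db:   # commits on success, rolls back if anything raises
        db.execute("INSERT OR REPLACE INTO history (guid, url) "
                   "SELECT guid, url FROM history_buffer")
        db.execute("DELETE FROM history_buffer")


def count(db, table):
    return db.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


buffer_batch(db, [("aaa", "https://mozilla.org/")])
buffer_batch(db, [("bbb", None)])      # a malformed record arrives

try:
    apply_buffer(db)
    failed = False
except sqlite3.IntegrityError:
    failed = True                      # NOT NULL violated: nothing was applied

after_failure = (count(db, "history"), count(db, "history_buffer"))

# Drop the bad row and retry, entirely offline: no re-download is needed.
with db:
    db.execute("DELETE FROM history_buffer WHERE url IS NULL")
apply_buffer(db)
after_retry = (count(db, "history"), count(db, "history_buffer"))
```

The failed attempt leaves the live table empty and both buffered rows in place; the retry moves the one valid row across and clears the buffer.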

Storage on iOS is significantly more involved as a result: we have sync_status columns on each table, and typically have two tables per datatype to track the original shared parent of a row. Bookmark sync is shaping up to involve six tables. But the behavior of the system is dramatically more predictable; this is a case of modeling essential complexity, not over-complicating. So far the bug rate is low, and our visibility into the interactions between parts of the code is good — for example, it’s just not possible for Steph to implement bulk deletions of logins without having to go through the BrowserLogins protocol, which does all the right flipping of change flags.
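As an illustration of change tracking living inside storage, here is a minimal sketch in which every write goes through one API that also sets a `sync_status` value. The `Logins` class, the status constants, the `is_deleted` column, and the tombstone behavior are assumptions for the example; this is not the real BrowserLogins protocol or its schema.

```python
import sqlite3

SYNCED, CHANGED, NEW = 0, 1, 2   # hypothetical sync_status values


class Logins:
    """Hypothetical storage API: every write also updates sync metadata."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE logins (guid TEXT PRIMARY KEY, password TEXT, "
            "is_deleted INTEGER NOT NULL DEFAULT 0, "
            "sync_status INTEGER NOT NULL)")

    def add(self, guid, password):
        with self.db:
            self.db.execute(
                "INSERT INTO logins (guid, password, sync_status) "
                "VALUES (?, ?, ?)", (guid, password, NEW))

    def mark_synced(self, guid):
        with self.db:
            self.db.execute(
                "UPDATE logins SET sync_status = ? WHERE guid = ?",
                (SYNCED, guid))

    def remove_all(self):
        """Bulk delete: unsynced rows vanish, the rest become tombstones."""
        with self.db:
            self.db.execute(
                "DELETE FROM logins WHERE sync_status = ?", (NEW,))
            self.db.execute(
                "UPDATE logins SET password = NULL, is_deleted = 1, "
                "sync_status = ?", (CHANGED,))

    def pending_upload(self):
        rows = self.db.execute(
            "SELECT guid, is_deleted FROM logins WHERE sync_status != ? "
            "ORDER BY guid", (SYNCED,))
        return rows.fetchall()


logins = Logins()
logins.add("aaa", "hunter2")
logins.mark_synced("aaa")         # pretend this one was uploaded earlier
logins.add("bbb", "swordfish")    # never uploaded
logins.remove_all()
pending = logins.pending_upload()
```

Because the flag is flipped in the same transaction as the deletion, there is no window in which a change exists but Sync does not know about it. In this sketch `pending` ends up holding a single tombstone for `aaa`, while `bbb` disappears without a trace because the server never saw it.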

In the future we’re hoping to see some of the work around batching, use of in-storage tracking flags, and three-way merge make it back to Android and eventually to desktop. Mobile first!


  1. My feeling is that Weave was (at least from a practical standpoint) originally designed to sync two desktops with good network connections, using cheap servers that could die at any moment. That attitude doesn’t fit well with modern instant syncing between your phone, tablet, and laptop!
  2. For example, Sync’s tab record format, defined by the desktop code, includes a time last used. Sometimes this is a string, and sometimes it’s an integer. Hooray JavaScript!

PSA: Sync account changes in the pipeline

Until now, if you’ve had more than one channel of Firefox installed on your Android device, you’ve experienced some restrictions when using Sync. We’ve been working to improve this situation, and just landed the first change to this end (Bug 772645).
Users with accounts set up for Firefox Beta or the release version of Firefox on Android should be utterly unaffected, and can stop reading now!

Bug 772645 splits the existing “Firefox Sync” account type into three: one for Beta and Release, one for Aurora and Nightly, and one for developer builds. These divisions align with the different Android shared user IDs for each channel. (This means that Beta and Release will eventually share Sync accounts, as will Aurora and Nightly.)

As of this week you’ll be able to have one Nightly and one of Aurora/Beta/Release set up and syncing on your device at the same time.

When it merges to Aurora, you’ll be able to have either Nightly or Aurora, and either Beta or Release, because Aurora’s type will change to match Nightly.

Developer builds taken from mozilla-central should now be able to be installed and configured for Sync entirely separately from Aurora/Beta/Release/Nightly, which should make developers happy!

Here’s where you might have to take action.

If you currently have Nightly as your syncing Firefox, you will need to set up Sync again after upgrading. The account will disappear on upgrade.

If you wish to avoid this, you can install one of the other channels first, and it will “steal” the existing Sync account when you upgrade Nightly. There is no way for you to turn your previous settings into a Nightly account.

The exact same steps will apply when Aurora bumps to version 17 — your Aurora account type will go away, and you’ll need to pre-emptively switch to Beta or Release to keep it, or just set up Sync again.

Pardon our dust, but this turned out to be the sanest (and only?) road forward.

Props go to Nick Alexander for pushing hard on this; he did — as always — excellent and elegant work.

Any questions, please do let us know. There might well be hiccups as this bounces down the track, but we’ll do our best to help.

Cleaning up feels good

Most of the test failures were just down to Jenkins problems (though I appreciate the green upward trend of that graph!), but it’s nice to start killing warnings…


How to file a good Android Sync bug

So you’ve seen some behavior that looks like a bug — perhaps a crash, perhaps data being corrupted, perhaps some performance impact or a sync never finishing.

The first thing we need is a log from the device. There’s no about:sync-log any more, so you’ll need to fetch the logs from the device via USB or using an Android application.

If you have an Android SDK and a USB cable, you can run

adb logcat -v time

and capture logs during the event that reproduces the bug. There will be a lot of output.

If you don’t have an Android development environment set up, you can install aLogRec and email the ADB log to yourself. This reportedly won’t work with recent Android versions, so I encourage you to try installing the tools first.

Once you have the log in a file, you can attach it to a new Android Sync bug in Bugzilla. Please take the time to write:

  • An accurate summary of your issue. “Sync adds bookmarks from desktop Bookmarks Toolbar, but local changes do not appear on desktop” is way more useful than “Sync doesn’t work”.
  • A description that explains what you see, what you expect to see, and any other pertinent information. Let us know which other devices you have connected to your Sync account: “Firefox 23 on Mac, Firefox 24 on Windows 7”.
  • Steps to reproduce. If you can start with a clean profile everywhere, and take a reproducible sequence of actions to cause a bug to manifest, we will be really grateful, and the bug is much more likely to be fixed than one we have to guess at.
  • A description of your device and the version of Firefox you’re using. Something like “Samsung Galaxy S3, Android 4.1.4, Nightly build from 2013-08-24” is enough. You can find out your Android version in the Settings application.

Rapid development

We just shipped our second milestone for Android Sync, a native Java implementation of Firefox Sync to work with Native Fennec.

The first milestone was “preffed-off”, a code drop to ensure that building these two separate projects together would work. This milestone is the first time it’s available to users, which of course is both scary and exciting. By and large it seems to work, albeit with a bunch of known issues.

I thought I’d share some experiences from getting this done, with graphical fun from GitHub’s statistics view.

From the outset, I knew this was an impossible challenge. At the start of November my boss asked me “can we have something working in December?”. Given that “something working” involves building a brand-new client for the Sync protocol, the crypto layer, storage engines for a browser that’s still being built, and UI and system integration for an unfamiliar platform, my obvious response was “are you crazy?”.

That aside, I was actually fairly confident in the scope at the time. A small team (at that point just me, an intern, and a recent new-grad hire) can move quite quickly if the problem is well understood, and this was.

I knew in advance that this would require insane amounts of work: I worked until 4am on Christmas day, and that was hardly the exception. We landed our first milestone a week late, and upon reflection we added about a week of unplanned packaging work in the middle, so I call that a win.

This was the cost:


That’s the GitHub commit punchcard for the project, and that pretty directly reflects my working time. (Apparently I land a lot of code at 4pm on a Sunday.)

That wavy sliver of grey between 2am and 6am? Yeah, that’s when I was sleeping. The slight reduction of dots around noon and 7pm? Mealtimes. I think this means an average of a 14-hour day.

I started to burn out a little around two weeks ago, after a few weeks of 90+ hour weeks and extensive (including international) travel. You can see this in our commit graph:


The orange is me. That big dip on the right is Christmas, and thereafter I shifted from building stuff to fixing stuff, so my total contribution dropped. 359 commits over about 10 weeks is about 5 commits per day, with about 3,500 lines of changes (additions and deletions) per day.

We had to work with much less process than I like to get this done. Our test coverage is low (thanks for the Jenkins setup, gps!), but it does exist. The philikon on my shoulder poked me every time I landed something knowing it had inadequate test coverage. I only recently started absolutely mandating code review and bug numbers for each commit. My own work is still unreviewed, because we don’t have a module peer with enough experience to review my code. (But thank heavens for that philikon on my shoulder! It sometimes felt like I had a second brain doing real-time code review.)

Several times we relied on some heroes. Philipp and a contributor from Mozilla China stepped in and saved our J-PAKE code from going off the rails. Tony and Tracy did some late-night QA because we simply didn’t have enough padding to get them working builds beforehand. And Ally and Erin deserve a serious shout-out for taking occasional “can you make sure this Fennec bug gets landed?” pings and making sure the way was clear for this bull-headed engineer.

All of this omitted process has come back or is going to come back, of course, and we’re going to take a more measured approach going forward. On reflection, though, I’m actually pleased with how this development process worked. I think we made reasonable tradeoffs (with a few errors that I want to talk through with people) with a positive outcome.

In both the large and the small, software development is about horse-trading with risks. Spend an hour reviewing this code, and let some other feature go unimplemented, or trust that it works and risk a bug? Write that test or not? Go for the hack or the thorough solution? Ask for help or avoid the coordination overhead?

There are a lot of ways this project could have gone wrong, and some in which it did. Thanks to my team, our excellent project managers, and my boss for making sure it turned out OK!

Why does Firefox Sync use a key as well as a password?

A friend of mine, a software engineer, just asked me this.

Why do you force people to enter that enormous key just to protect their sync data? Passwords are sufficient for banking institutions and payroll facilities, arguably with more important data than your bookmarks. Why not make it optional extra security for those who want it, instead of making everyone pass around a 26 character string to every machine they want to sync from, and risk losing all of their information if it’s lost?

The answer is quite long-winded. Here’s a slightly edited version of my response.

Firstly, we try to make sure that people don’t have to enter it; we’re not blind to the additional complication involved. The sync key is generated for you during setup, so you don’t have to think up another password. It’s stored in Password Manager so you don’t have to remember it. When you set up a new laptop or Android phone you can usually use Easy Setup (the “forefront” UI in Firefox 4), which is much like Bluetooth pairing, so you don’t have to type it. The UI will only continue to hide the Sync key more deeply as we start to introduce better means of credentials exchange (such as QR codes for time-delayed J-PAKE)… in fact, soon enough we’re likely to rename it “Recovery Key”, because that’s what it’s for.

Secondly, that long string is an AES key, with all the joy it brings. We encrypt your data locally because we sync your entire history, bookmarks, and passwords, including access to banks, messages from revolutionary organizations, doctors’ health data (HIPAA!), and more… and we have over a million active users. A breach without strong local encryption would make the PSN intrusion look like 4chan trolling. We want to ensure that we can’t get your data, either deliberately or under the coercion of the FBI. Being able to recover a user’s data from our servers means we are required to give your browsing history to the FBI if they show up with a warrant. Ever visited thepiratebay?

In essence, we make the same promises as Dropbox, but we actually keep them. We really can’t betray your trust, and the sync key is why.

“But why not use the password for encryption?”. I’m glad you asked.

A password is inadequate for this purpose. We used to allow a user-entered passphrase in place of the sync key, but it had a lot of problems.

For one thing, users didn’t understand why they needed two passwords… and using just one is a terrible idea! Your account password goes over the wire for HTTP auth, and HTTPS is not always a defense — quite apart from the possibility of a compromised HTTP server (an attack vector against which we want to guard), I’ve personally helped out two users whose employers were running SSL MITM proxies, which allows them to snoop HTTPS traffic… including HTTP auth headers. We only detected it because the user’s employer had added their own root certificate to Windows’ cert store, but not Firefox’s, and Firefox threw a certificate error. Your HTTPS traffic is visible to your employer. That’s a terrible thing, but your Sync data is still secure, because we don’t just use your password as an encryption key.

The other issue with passwords is that there just isn’t enough entropy in a user-entered string to support our cryptographic guarantees, even with PBKDF2 as a bootstrap algorithm. Put it this way: is your password twenty-six base36 characters long (a solid 128-bit key), or is it eight to twelve letters with a couple of numbers? I thought so. Most people’s passwords aren’t even that strong.
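The comparison can be put in rough numbers. The helper below is a back-of-the-envelope sketch, not anything from the Sync codebase: `max_entropy_bits` is a hypothetical name, and it assumes every character is picked uniformly at random, which flatters human-chosen passwords considerably.

```python
import math


def max_entropy_bits(length, alphabet_size):
    """Upper bound on entropy, assuming uniformly random characters."""
    return length * math.log2(alphabet_size)


# Twenty-six characters drawn from a 36-symbol alphabet.
sync_key_bits = max_entropy_bits(26, 36)

# A generous password estimate: ten lowercase letters plus two digits.
password_bits = max_entropy_bits(10, 26) + max_entropy_bits(2, 10)
```

Under these assumptions the key comes out at roughly 134 bits and the password at about 54 bits, and a real password made of words and dates will be well below that upper bound.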

Most banks don’t just use username + password. Many non-US banks require the use of additional hardware to generate strong tokens per-login (i.e., you have to carry a small USB device around with you), or other login methods. HSBC USA makes users type a second long password on a damn JavaScript mouse-keyboard. Even Bank of America (a comparatively weak institution, in my experience) requires a username, a strong password, and a cookie credential that you can only get by providing your SSN and answering security questions to “authorize” the machine.

Payroll facilities… well, they don’t care, and in my experience they typically don’t understand technology too well. Just because ADP, or Sony, or Dropbox jump off a bridge doesn’t mean we’re going to throw our users off, too.

Speaking more broadly: it would be really convenient for Firefox Sync to not use encryption. We could let you see your bookmarks in a webpage (a common request), and the client (the code I maintain) would be much simpler! That’s how Chrome approaches this problem… Google wants to see your bookmarks. But that’s not really how Mozilla works; we try to err on the side of safety, freedom, and serving the users’ best interests, rather than opting for the expedient solution. The vast majority of users simply do not have the knowledge to correctly evaluate the decision you’re asking them to make. That’s why users put their bank URL (and credentials, apparently!) into delicious, and put their private keys on the web.

And that’s why we don’t let you upload the contents of your Firefox profile with weak or no encryption.