Different kinds of storage

I’ve been spending most of my time so far on Project Tofino thinking about how a user agent stores data.

A user agent is software that mediates your interaction with the world. A web browser is one particular kind of user agent: one that fetches parts of the web and shows them to you.

(As a side note: browsers are incredibly complicated, not just for the obvious reasons of document rendering and navigation, but also because parts of the web need to run code on your machine and parts of it are actively trying to attack and track you. One of a browser’s responsibilities is to keep you safe from the web.)

Chewing on Redux, separation of concerns, and Electron’s process model led to us drawing a distinction between a kind of ‘profile service’ and the front-end browser itself, with ‘profile’ defined as the data stored and used by a traditional browser window. You can see the guts of this distinction in some of our development docs.

The profile service stores full persistent history and data like it. The front-end, by contrast, has a pure Redux data model that’s much closer to what it needs to show UI — e.g., rather than all of the user’s starred pages, just a list of the user’s five most recent.

The front-end is responsible for fetching pages and showing the UI around them. The back-end service is responsible for storing data and answering questions about it from the front-end.

To build that persistent storage we opted for a mostly event-based model: simple, declarative statements about the user’s activity, stored in SQLite. SQLite gives us durability and known performance characteristics in an embedded database.

On top of this we can layer various views (materialized or not). The profile service takes commands as input and pushes out diffs, and the storage itself handles writes by logging events and answering queries through views. This is the CQRS concept applied to an embedded store: we use different representations for readers and writers, so we can think more clearly about the transformations between them.
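
To make that split concrete, here’s a minimal sketch in Python using the standard sqlite3 module. It is not Tofino’s actual schema: the events table, the starred view, the event kinds, and both function names are invented for illustration. Writers append events; readers only ever query a view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Write side: an append-only log of simple statements about user activity.
    CREATE TABLE events (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,
        kind TEXT NOT NULL,      -- hypothetical kinds: 'visit', 'star', 'unstar'
        url  TEXT NOT NULL,
        ts   INTEGER NOT NULL    -- caller-supplied timestamp
    );

    -- Read side: a non-materialized view answering "what is starred right now?"
    CREATE VIEW starred AS
        SELECT url, ts AS starred_at
        FROM events AS e
        WHERE kind = 'star'
          AND NOT EXISTS (
              SELECT 1 FROM events AS later
              WHERE later.url = e.url
                AND later.kind IN ('star', 'unstar')
                AND later.id > e.id
          );
""")

def record(kind, url, ts):
    """Command handler: writers only ever append events."""
    with conn:
        conn.execute(
            "INSERT INTO events (kind, url, ts) VALUES (?, ?, ?)", (kind, url, ts))

def recent_stars(limit=5):
    """Query handler: readers only ever look at the view."""
    rows = conn.execute(
        "SELECT url FROM starred ORDER BY starred_at DESC LIMIT ?", (limit,))
    return [url for (url,) in rows]

record("visit", "https://example.com/a", 1)
record("star", "https://example.com/a", 2)
record("star", "https://example.com/b", 3)
record("unstar", "https://example.com/a", 4)
print(recent_stars())
```

Only the second page is still starred at the end, because the view ignores any star that was followed by a later star or unstar event for the same URL. The event log itself never loses the fact that the first page was once starred.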

Where next?

One of the reasons we have a separate service is to acknowledge that it might stick around when there are no browser windows open, and that it might be doing work other than serving the immediate needs of a browser window. Perhaps the service is pre-fetching pages, or synchronizing your data in the background, or trying to figure out what you want to read next. Perhaps you can interact with the service from something other than a browser window!

Some of those things need different kinds of storage. Ad hoc integrations might be best served by a document store; recommendations might warrant some kind of graph database.

When we look through that lens we no longer have just a profile service wrapping profile storage. We have a more general user agent service, and one of the data sources it manages is your profile data.

Trivial SQL ORMs considered harmful

Our team has a little “things I learned this week” tradition in our team meetings, and it just blossomed onto our mailing list (async is better!).

In one such post, Michael pointed to sqldelight, a library to automatically generate Android SQL-handling code for a typed schema and a set of queries.

I wrote a little screed advising caution, which Margaret suggested would make a good blog post… so here it is, unedited.

Note that I have nothing against automated schema and query checking, nor against saving error-prone typing; my primary objection here is to the object mapping.

Michael notes:

It’s a Square library that allows you to define your tables & queries in a separate text file and it will auto-generate table creation and methods of querying. To do so, it creates Objects which represent the row of your DB.

and I reply:

At the risk of being a negative nelly: broadly speaking I find this kind of trivial ORM to be a terrible design anti-pattern, and I strongly discourage its use for anything but saving some typing before committing a v0. We implemented something like this on the iOS side of the house, and it was a huge pain in the ass to get rid of later.

If your system is simple enough that you’re putting whole objects in and getting whole objects out (that is, a simple ORM is a good fit), you should not be using SQLite at all. Serialize your objects to a flat file in JSON and keep them in memory. Up to about 100KB of data, it’s better in almost every way. (There are some exceptions, but they’re exceptions.)

For everyone else, your inputs and outputs will differ, or you’ll need more control, and so you should run screaming from sqldelight.

There are at least five reasons why I feel this way. I’ll stop at five to avoid writing an epic.

  1. Database tables really come into their own when you join them: bookmarks against favicons, hockey players against teams and games. If you join them (particularly with left/outer/etc. joins), your ORM needs to bulk up the generated model objects with optional fields; it has to, otherwise it can’t represent the result of the join.

    Those optional fields leak throughout your app — hey, is that favicon ID supposed to be set here? Does it need to be set to -1 sometimes? — and make your life unpleasant.

  2. SELECT * is an anti-pattern in database work. You might not need all of the fields, but requesting them all limits the indices that the storage layer can use. A smart storage engine can use compound indices to make some queries with limited projections very fast indeed. Or perhaps you want to get unique values.

    To take sqldelight’s example, you should not SELECT * FROM hockey_player; if you need that, slurp a JSON file instead! When populating a list view, you probably want SELECT name, id FROM hockey_player ORDER BY position. For a name picker you want SELECT name FROM hockey_player UNION SELECT name FROM hockey_officials. And so on.

  3. Migrations are a reality when dealing with data storage. sqldelight doesn’t seem to address this at all.

  4. Syncability (and backup, and export, and…) are also a reality. A sync system typically has a very different viewpoint on data storage than the frontend — not only does that mean you have a set of fields that only part of the application cares about (which screws up your ORM), it also often means that two parts of the system have utterly different conceptions of seemingly straightforward actions like “delete this thing”. ORMs are (almost by definition) one size fits none.

  5. Getting SQL-based storage — hell, getting any kind of storage — right is hard. Concurrency, performance, memory usage, and correctness all involve careful attention. Take a read of the Sqlite.jsm docs or some of Firefox for iOS’s database prep code if you want a hint of this. Libraries that generate data access code can slip past this attention, and that’s a bad thing.
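
Points 1 and 2 are easy to see in a few lines. This is a minimal sketch in Python with the standard sqlite3 module; the bookmarks and favicons tables are a made-up schema, not anything from sqldelight or Firefox. The query asks for exactly two columns, and the LEFT JOIN means one of them is legitimately absent for some rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE favicons  (id INTEGER PRIMARY KEY, url TEXT NOT NULL);
    CREATE TABLE bookmarks (
        id         INTEGER PRIMARY KEY,
        title      TEXT NOT NULL,
        favicon_id INTEGER REFERENCES favicons(id)
    );
    INSERT INTO favicons  VALUES (1, 'https://example.com/favicon.ico');
    INSERT INTO bookmarks VALUES (1, 'Example', 1), (2, 'No icon yet', NULL);
""")

# Ask for exactly the columns this screen needs rather than SELECT *.
rows = conn.execute("""
    SELECT b.title, f.url
    FROM bookmarks AS b
    LEFT JOIN favicons AS f ON f.id = b.favicon_id
    ORDER BY b.id
""").fetchall()

for title, icon_url in rows:
    # icon_url is None when the bookmark has no favicon. That optionality
    # belongs to this query's result, not to a shared "Bookmark" model object.
    print(title, icon_url)
```

A generated row class would have to carry that optional favicon field everywhere a bookmark goes. A hand-written query keeps the optionality local to the one place that joined the tables.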

Syncing and storage on three platforms

As it’s Christmas, I thought I’d take a moment to write down my reflections on Firefox Sync’s iterations over the years. This post focuses on how they actually sync — not the UI, not the login and crypto parts, but how they decide that something has changed and what they do about it.

I’ve been working on Sync for more than five years now, on each of its three main client codebases: first desktop (JavaScript), then Android (built from scratch in Java), and now on iOS (in Swift).

Desktop’s overall syncing strategy is unchanged from its early life as Weave.

Partly as a result of Conway’s Law writ large — Sync shipped as an add-on, built by the Services team rather than the Firefox team, with essentially no changes to Firefox itself — and partly for good reasons, Sync was separate from Firefox’s storage components.

It uses Firefox’s observer notifications to observe changes, making a note of changed records in what it calls a Tracker.

This is convenient, but it has obvious downsides:

  • From an organizational perspective, it’s easy for developers to disregard changes that affect Sync, because the code that tracks changes is isolated. For example, desktop Sync still doesn’t behave correctly in the presence of fancy Firefox features like Clear Recent History, Clear Private Data, restoring bookmark backups, etc.
  • Sync doesn’t get observer notifications for all events. Most notably, bulk changes sometimes roll up or omit events, and it’s always possible for code to poke at databases directly, leaving Sync out of the loop. If a Places database is corrupt, or a user replaces it manually, Sync’s tracking will be wrong. This is almost inevitable when sync metadata doesn’t live with the data it tracks.
  • Sync doesn’t track actual changes; it tracks changed IDs. When a sync occurs, it goes to storage to get a current representation of the changed record. (If the record is missing, we assume it was deleted.) This makes it very difficult to do good conflict resolution.
  • In order to avoid cycles, Sync stops listening for events while it’s syncing. That means it misses any changes the user makes during a sync.
  • Similarly, it doesn’t see changes that happen before it registers its observers, e.g., during the first few seconds of using the browser.
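
The last few downsides can be shown with a toy model. This is a sketch in Python, not desktop Sync’s code: the Tracker and Storage classes, the listening flag, and the record IDs are all invented here to illustrate observer-style tracking of changed IDs.

```python
class Tracker:
    """Observer-style change tracking: remembers which IDs changed, not how."""

    def __init__(self):
        self.changed_ids = set()
        self.listening = True

    def on_change(self, record_id):
        if self.listening:
            self.changed_ids.add(record_id)


class Storage:
    """Storage knows nothing about sync; it just fires a notification."""

    def __init__(self, tracker):
        self.records = {}
        self.tracker = tracker

    def put(self, record_id, value):
        self.records[record_id] = value
        self.tracker.on_change(record_id)


tracker = Tracker()
storage = Storage(tracker)

storage.put("bm1", "Mozilla")          # tracked
tracker.listening = False              # a sync starts; stop listening to avoid cycles
storage.put("bm2", "added mid-sync")   # the user makes a change during the sync
tracker.listening = True               # the sync finishes

print(sorted(tracker.changed_ids))
```

The second record is sitting in storage, but the tracker never heard about it, so nothing will prompt an upload until something else touches that record. And for the first record the tracker holds only an ID, so the sync code has no idea what the record looked like before the change.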

Beyond the difficulties introduced by a reliance on observers, desktop Sync took some shortcuts [1]: it applies incoming records directly and non-transactionally to storage, so an interrupted sync leaves local storage in a partial state. That’s usually OK for unstructured data like history, where it’ll try again on the next sync and eventually catch up, but it’s a bad thing for something structured like bookmarks, and can still be surprising elsewhere (e.g., passwords that aren’t consistent across your various intranet pages, form fields that are mismatched so you get your current street address and your previous city and postal code).

During the last days of the Services team, Philipp, Greg, and I, along with others, were rethinking how we performed syncs. We settled on a repository-centric approach: records were piped between repositories (remote or local), abstracting away the details of how a repository figured out what had changed, and giving us the leeway to move to a better internal structure.

That design never shipped on desktop, but it was the basis for our Sync implementation on Android.

Android presented some unique constraints. Again, Conway’s Law applied, albeit to a lesser extent, but also the structure of the running code had to abide by Android’s ContentProvider/SyncAdapter/Activity patterns.

Furthermore, Fennec was originally planning to support Android’s own internal bookmark and history storage, so its internal databases mirrored that schema. You can still see the fossilized remnants of that decision in the codebase today. When that plan was nixed, the schema was already starting to harden. The compromise we settled on was to use modification timestamps and deletion flags in Fennec’s content providers, and use those to extract changes for Sync in a repository model.

Using timestamps as the basis for tracking changes is a common error when developers hack together a synchronization system. They’re convenient, but client clocks are wrong surprisingly often, jump around, and lack granularity. Clocks from different devices shouldn’t be compared, but we do it anyway when reconciling conflicts. Still, it’s what we had to work with at the time.
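
Here’s a small illustration of why comparing clocks across devices goes wrong. It’s a sketch in Python with invented names (Edit, last_writer_wins) and invented numbers; it isn’t how Fennec reconciles records, just the naive timestamp comparison in its simplest form.

```python
from dataclasses import dataclass


@dataclass
class Edit:
    device: str
    value: str
    true_time: int      # seconds on an imaginary perfect clock
    clock_offset: int   # how far this device's clock is off, in seconds

    @property
    def local_timestamp(self) -> int:
        # The only time the device can actually record.
        return self.true_time + self.clock_offset


def last_writer_wins(a: Edit, b: Edit) -> Edit:
    """Naive reconciliation: keep whichever edit has the larger local timestamp."""
    return a if a.local_timestamp >= b.local_timestamp else b


# The phone edits first, but its clock runs ten minutes fast.
phone = Edit("phone", "old title", true_time=1000, clock_offset=600)
# The laptop edits a minute later, with an accurate clock.
laptop = Edit("laptop", "new title", true_time=1060, clock_offset=0)

winner = last_writer_wins(phone, laptop)
print(winner.device, winner.value)
```

The phone’s older edit wins, because its recorded timestamp is larger even though the laptop’s change really happened later. Nothing in the records themselves reveals the mistake.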

The end result is over-engineered and fundamentally flawed, and it still applies records directly to storage, but it works well enough. We have seen dramatically fewer bugs in Android Sync than we saw in desktop Sync between 2010 and 2012. I attribute some of that simply to the code having been written for production rather than being a Labs project (the desktop bookmark sync code was particularly flawed, and Philipp and I spent a lot of time making it better), some of it to lessons learned, and some of it to better languages and tooling: Java and Eclipse produce code with fewer silly bugs [2] than JavaScript and Vim.

On iOS we had the opportunity to learn from the weaknesses in the previous two implementations.

The same team built the frontend, storage, and Sync, so we put logic and state in the right places. We track Sync-related metadata directly in storage. We can tightly integrate with bulk-deletion operations like Clear Private Data, and change tracking doesn’t rely on timestamps: it’s an integral part of making the change itself.

We also record enough data to do proper three-way merges, which avoids a swath of quiet data loss bugs that have plagued Sync over the years (e.g., recent password changes being undone).
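
For readers who haven’t met three-way merging, here’s the idea at the level of a single record. This is a generic sketch in Python, not the Firefox for iOS implementation; the function name, the field names, and the choice to prefer the local value on a true conflict are all assumptions made for the example.

```python
def three_way_merge(base, local, remote):
    """Merge two edited copies of a record against their shared parent.

    Returns (merged, conflicts). A field is a conflict only when both sides
    changed it to different values; this sketch arbitrarily keeps the local
    value in that case. A missing field is treated the same as None.
    """
    merged, conflicts = {}, []
    for field in base.keys() | local.keys() | remote.keys():
        b, l, r = base.get(field), local.get(field), remote.get(field)
        if l == r:
            value = l            # both sides agree (or neither changed it)
        elif l == b:
            value = r            # only the remote side changed it
        elif r == b:
            value = l            # only the local side changed it
        else:
            conflicts.append(field)
            value = l            # both changed it differently: a policy decision
        if value is not None:
            merged[field] = value
    return merged, sorted(conflicts)


base   = {"hostname": "intranet.example", "username": "rn", "password": "hunter2"}
local  = {"hostname": "intranet.example", "username": "rn", "password": "correct horse"}
remote = {"hostname": "intranet.example", "username": "rnewman", "password": "hunter2"}

merged, conflicts = three_way_merge(base, local, remote)
print(merged, conflicts)
```

With the shared parent available, the local password change and the remote username change both survive and nothing conflicts. Without it, a two-way comparison sees two records that differ in two fields and has to pick one wholesale, which is exactly how a recent password change gets undone.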

We incrementally apply chunks of records, downloaded in batches, so we rarely need to re-download anything in the case of mid-sync failures.

And we buffer downloaded records where appropriate, so the scary part of syncing — actually changing the database — can be done locally with offline data, even within a single transaction.
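
A rough sketch of that buffer-then-apply shape, in Python with the standard sqlite3 module. The logins and logins_buffer tables and the three function names are hypothetical, and a real client does far more validation and merging; the point is only that downloading and applying are separate phases, and that applying is all-or-nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logins        (guid TEXT PRIMARY KEY, password TEXT NOT NULL);
    CREATE TABLE logins_buffer (guid TEXT PRIMARY KEY, password TEXT);
    INSERT INTO logins VALUES ('aaa', 'old-a'), ('bbb', 'old-b');
""")

def buffer_batch(records):
    """Network phase: stash each downloaded batch; live data is untouched."""
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO logins_buffer (guid, password) VALUES (?, ?)",
            records)

def apply_buffer():
    """Local phase: apply everything that was buffered in one transaction."""
    buffered = conn.execute(
        "SELECT guid, password FROM logins_buffer ORDER BY guid").fetchall()
    try:
        with conn:  # commits on success, rolls back if anything raises
            for guid, password in buffered:
                conn.execute(
                    "INSERT OR REPLACE INTO logins (guid, password) VALUES (?, ?)",
                    (guid, password))
            conn.execute("DELETE FROM logins_buffer")
        return True
    except sqlite3.IntegrityError:
        return False

def snapshot():
    return conn.execute(
        "SELECT guid, password FROM logins ORDER BY guid").fetchall()

buffer_batch([("aaa", "new-a")])
buffer_batch([("ccc", None)])       # a malformed record: password must not be NULL
first_try = apply_buffer()
after_failure = snapshot()          # still the original two rows

buffer_batch([("ccc", "new-c")])    # a corrected copy replaces the bad one
second_try = apply_buffer()
after_success = snapshot()
```

The first attempt fails partway through, after the first record has already been written inside the transaction, and the rollback puts the live table back exactly as it was. The buffered records are still there, so the retry needs no re-download.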

Storage on iOS is significantly more involved as a result: we have sync_status columns on each table, and typically have two tables per datatype to track the original shared parent of a row. Bookmark sync is shaping up to involve six tables. But the behavior of the system is dramatically more predictable; this is a case of modeling essential complexity, not over-complicating. So far the bug rate is low, and our visibility into the interactions between parts of the code is good — for example, it’s just not possible for Steph to implement bulk deletions of logins without having to go through the BrowserLogins protocol, which does all the right flipping of change flags.
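
To give a flavor of what “flipping of change flags” means, here’s a much-simplified sketch in Python. It is not the Swift BrowserLogins protocol: the LoginStore class, the status values, and the single-table schema are assumptions for illustration, and the second table that tracks the shared parent is left out entirely.

```python
import sqlite3

# Hypothetical status values; the post doesn't say what the real ones are.
SYNCED, CHANGED, NEW = 0, 1, 2


class LoginStore:
    """The only write path to the table, so change tracking can't be skipped."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("""
            CREATE TABLE logins (
                guid        TEXT PRIMARY KEY,
                password    TEXT,
                is_deleted  INTEGER NOT NULL DEFAULT 0,
                sync_status INTEGER NOT NULL
            )""")

    def add(self, guid, password):
        with self.db:
            self.db.execute(
                "INSERT INTO logins (guid, password, sync_status) VALUES (?, ?, ?)",
                (guid, password, NEW))

    def update(self, guid, password):
        # Rows never uploaded stay NEW; previously synced rows become CHANGED.
        with self.db:
            self.db.execute(
                """UPDATE logins SET password = ?,
                       sync_status = CASE sync_status WHEN ? THEN ? ELSE ? END
                   WHERE guid = ?""",
                (password, NEW, NEW, CHANGED, guid))

    def delete_all(self):
        with self.db:
            # Rows the server has never seen can simply vanish...
            self.db.execute("DELETE FROM logins WHERE sync_status = ?", (NEW,))
            # ...everything else becomes a tombstone that still needs uploading.
            self.db.execute(
                "UPDATE logins SET password = NULL, is_deleted = 1, sync_status = ?",
                (CHANGED,))

    def mark_synced(self, guid):
        """Called by the sync code after a successful upload."""
        with self.db:
            self.db.execute(
                "UPDATE logins SET sync_status = ? WHERE guid = ?", (SYNCED, guid))

    def pending(self):
        """Rows the next sync must upload, as (guid, is_deleted) pairs."""
        return self.db.execute(
            "SELECT guid, is_deleted FROM logins WHERE sync_status != ? ORDER BY guid",
            (SYNCED,)).fetchall()


store = LoginStore()
store.add("aaa", "one")
store.mark_synced("aaa")
store.add("bbb", "two")
store.delete_all()
print(store.pending())
```

After the bulk delete, the login that had been uploaded is still present as a tombstone waiting to sync, while the one the server never saw is simply gone. The tracking happens in the same transaction as the change, so there is no observer to miss it.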

In the future we’re hoping to see some of the work around batching, use of in-storage tracking flags, and three-way merge make it back to Android and eventually to desktop. Mobile first!


  1. My feeling is that Weave was (at least from a practical standpoint) originally designed to sync two desktops with good network connections, using cheap servers that could die at any moment. That attitude doesn’t fit well with modern instant syncing between your phone, tablet, and laptop!
  2. For example, Sync’s tab record format, defined by the desktop code, includes a time last used. Sometimes this is a string, and sometimes it’s an integer. Hooray JavaScript!

On soft martial arts and software engineers

I recently began studying tàijíquán (“tai chi”), the Chinese martial art.

[Photo: Richard, holding a sword. Caption: It always helps to have someone correct your form.]

Many years ago I spent a year or two pursuing shōtōkan karate. Shōtōkan, by most standards, is a “hard” martial art: it opposes force with force, using low, stable stances to deliver direct strikes.

Tàijíquán is an internal art, mixing hard with soft. To most observers (and most practitioners!) it’s entirely a soft, slow-moving exercise form. To quote Wikipedia:

The ability to use t’ai chi ch’uan as a form of self-defense in combat is the test of a student’s understanding of the art. T’ai chi ch’uan is the study of appropriate change in response to outside forces, the study of yielding and “sticking” to an incoming attack rather than attempting to meet it with opposing force. The use of t’ai chi ch’uan as a martial art is quite challenging and requires a great deal of training.

(Other martial arts are soft, but more immediately applicable: jujutsu, judo, and wing chun, for example.)

I see some parallels between the hard/soft characterization of martial arts and the ‘lifecycle’, if you will, of software engineers.

You might find it hard to believe (HTML needs a sarcasm tag, no?), but I was once a young, arrogant developer. I’d been hired at a startup in the US on the strength of a phone call, I was good at what I did, and there was an endless list of problems to solve. I like solving problems, and I liked that I could impress by doing so. And so I did.

I routinely worked 14-hour days. I’d get up at 7, shower, and head to the office. After work I’d go out for dinner with coworkers, then work until bed. I had no real hobbies apart from drinking with my coworkers, so my time was spent writing code. It’s so easy to solve problems when you can solve them yourself.

Eventually, after one too many solo victories over seemingly impossible deadlines, I was burned out.

Hard martial arts are very tempting, particularly to the young and able-bodied: they yield direct results. The better you get, the harder and faster you hit.

The problem with hard martial arts is that the world keeps making newer, tougher opponents, while time and each engagement are conspiring to strip away your own vigor. It takes a toll on your knees, your shoulders. Bruises take longer and longer to go away.

The software industry is like this, too. It will happily take as much time as you give it. Beating that last hard problem by burning a weekend will only win you a pat on the back and a new, bigger task to accomplish. Meanwhile your shoulders hunch, RSI kicks in, your vision worsens. You take your first week off work because the painkillers aren’t enough to let you type any more. You find out what an EKG is, what a sit-stand desk is, what physical therapy is like.

And while it looks like you’re winning — after all, you’re producing software that works — you’re accruing costs, too. You’re spending your future. Not only are you personally losing your motivation, your vitality, and a large part of your self, but you’re also building more software. Either you have to own it, or nobody really does. Maybe someone else should. Maybe it shouldn’t have been built at all. You think you’re winning, but you won’t know until later. And all along, your aggressive approach to building a solution alienates those around you.

A soft martial art tries to use your opponent’s strength and momentum against them. It yields and redirects. Ultimately, it asks whether you need to engage at all.

Hard martial arts eventually force you to confront your own fragility: “I can’t keep doing this”. So does software development, if you’re paying attention. You need to learn to ask the right questions, to draw on the rest of your team, to invest your time in learning and tools, in communication, and above all to invest in other people.

As the quote above suggests, this takes practice. But it works out best in the long run.

Language switching in Firefox for Android

Bug 917480 just landed in mozilla-central, and should show up in your next Nightly. This sizable chunk of work provides settings UI for selecting a locale within Firefox for Android.

[Animation: locale switching demonstration]

If all goes well in the intervening weeks, Firefox 32 will allow you to choose from our 49 supported languages without restarting your browser, and regardless of the locales supported by your Android device. (For more on this, see my earlier blog post.)

We’ve tested this on multiple Android versions, devices, and form factors (and every one is different!), and we’re quite confident that things will work for almost everyone. But if something doesn’t work for you, please file a bug and let me know.

If you want more details, have a read through some of my earlier posts on the topic.

Building and testing multi-locale Firefox for Android

This is a follow-up to my earlier post on locale switching.

By default, local Firefox builds are English-only. Including other locales involves merging in content from the l10n repositories. Building an APK that includes other locales, then, means performing the following steps. You only have to do them once.

In short:

  1. Get checkouts of the appropriate l10n repositories for the locales you care about.
  2. Put some incantations in your .mozconfig.
  3. Install compare-locales. (See Bug 940103 to remove this step.)

Then, each time you build, run a small script between ./mach build and ./mach package.

Getting checkouts for Fennec’s supported locales

mkdir -p "$L10NBASEDIR"
pushd "$L10NBASEDIR"
# $LOCALES names a file listing one locale code per line.
while read -r line; do hg clone "http://hg.mozilla.org/releases/l10n/mozilla-aurora/$line"; done < "$LOCALES"
popd

Augmenting your .mozconfig

Add the following lines:

# Make this match your checkouts.
mk_add_options 'export MOZ_CHROME_MULTILOCALE=en-US cs da de es-ES fi fr ja ko it nb-NO nl pl pt-BR pt-PT ru sk sv-SE zh-CN zh-TW'

# Use absolute paths.
mk_add_options 'export L10NBASEDIR=/Users/rnewman/moz/hg/l10n'
ac_add_options --with-l10n-base=/Users/rnewman/moz/hg/l10n

Install compare-locales

pip install compare-locales

Build and package

This step should be improved when we fix Bug 934196. Personally, I’ve just dumped the extra stuff in a locales.sh script and moved on with my life.

./mach build && \
pushd objdir-droid/mobile/android/locales && \
for loc in $(cat ../../../../mobile/android/locales/maemo-locales); do \
  LOCALE_MERGEDIR=$PWD/merge-$loc make merge-$loc LOCALE_MERGEDIR=$PWD/merge-$loc; \
  make LOCALE_MERGEDIR=$PWD/merge-$loc chrome-$loc LOCALE_MERGEDIR=$PWD/merge-$loc; \
done && \
popd && \
./mach package

Note that the new stuff is everything between ./mach build and ./mach package.

Once this completes (assuming no errors), you’ll have an APK that contains multiple locales. Install it on your device!

Updating your l10n checkouts

Every now and then, do something like this:

for loc in $(cat $MOZILLA_CENTRAL/mobile/android/locales/maemo-locales); do \
  pushd $loc && hg pull && hg up -C && popd; done

Testing locale switching

Until we ship a UI for this, you’ll need to use a trivial testing add-on. That add-on puts menu items in the Tools menu; pick one, and it’ll switch your app locale.

The code for this add-on is on GitHub. You can also install the XPI directly. Then you’ll see the Tools menu full of locales, like this:

[Screenshot: switching to es-ES at runtime]

Try it out… and whenever you make a change to UI code, use it to make sure you haven’t broken anything!

New locale-related work in Firefox for Android

I recently landed the first steps towards a new way of choosing your language and locale in Firefox for Android. This is of interest as a feature, of course, but it also means some new capabilities and obligations for Fennec front-end developers, so I thought I’d put pen to paper.


Right now, Firefox on Android — like most Android apps — displays its UI and web content in the locale you’ve selected in Android’s settings. In short: if your phone is set to use es_ES (Español [España]), then so is Firefox… and without digging around in about:config, you will also get web content in Spanish by default.

That’s not ideal for a number of reasons. Firstly, carriers tend to restrict the locales that you can select, sometimes to as few as four. If you only speak (or prefer to speak) a language that your carrier doesn’t let you choose, that’s a bad scene. Secondly, the Mozilla community extends beyond the locales that Android itself supports. The only way to address these two issues is to decouple Firefox’s locale selection from Android.

The work that just landed to do so is Bug 936756, upon which we will build two selection UIs — Bug 917480 for the app locale, and Bug 881510 for choosing which locales you wish to use when browsing the web.

What does this mean for users?

Quite simply: once a UI has been layered on top, you’ll be able to switch between each of Firefox for Android’s supported locales without restarting your browser, and maintain that selection independently of the Android OS locale. If you’re happy continuing to use Android’s settings to make that choice, that will continue to work, too.

How does it work?

It works by persisting a selected language in SharedPreferences and manipulating both the Gecko locale prefs and the Android/Java Locale and Resources frameworks to impose that language. To do so we hook into onConfigurationChanged events. For more details, read the bug!

(This is also a small step toward supporting l20n on Android. More on that at a later date.)

What does this mean for developers?

Historically, there’s been a sizable rift between day-to-day development and the final localized builds that users see. We front-end developers build en_US-only builds for testing, while our localization communities work with Aurora, 6-12 weeks later. Partly that’s because locale switching itself is cumbersome (watch Android restart everything under the sun!). Partly it’s because building a multi-locale APK has been difficult.

Unfortunately, that results in a failure to detect even obvious l10n issues during development. Take, for example, Bug 933272. Between Firefox 23 and 28, we displayed all plugin-related text in English, regardless of your selected locale. This is the kind of thing that’s easy to find by testing a local build in a non-English locale.

With switchable locales, two things are true:

  • It’s now easy to switch locales, so you can (and should!) routinely test with multiple locales, just as we test with tablets and phones, with screens rotated, etc.
  • You must test locale switching if you’re working on front-end code that includes strings, so that your new or changed feature doesn’t break locale switching!

I hope that’s a fair trade. Next post: how to build a multi-locale APK, without too much disruption to your existing toolchain.

PSA: Sync account changes in the pipeline

Until now, if you’ve had more than one channel of Firefox installed on your Android device, you’ve experienced some restrictions when using Sync. We’ve been working to improve this situation, and just landed the first change to this end (Bug 772645).
Users with accounts set up for Firefox Beta or the release version of Firefox on Android should be utterly unaffected, and can stop reading now!

Bug 772645 splits the existing “Firefox Sync” account type into three: one for Beta and Release, one for Aurora and Nightly, and one for developer builds. These divisions align with the different Android shared user IDs for each channel. (This means that Beta and Release will eventually share Sync accounts, as will Aurora and Nightly.)

As of this week you’ll be able to have one Nightly and one of Aurora/Beta/Release set up and syncing on your device at the same time.

When it merges to Aurora, you’ll be able to have either Nightly or Aurora, and either Beta or Release, because Aurora’s type will change to match Nightly.

Developer builds taken from mozilla-central can now be installed and configured for Sync entirely separately from Aurora/Beta/Release/Nightly, which should make developers happy!

Here’s where you might have to take action.

If you currently have Nightly as your syncing Firefox, you will need to set up Sync again after upgrading. The account will disappear on upgrade.

If you wish to avoid this, you can install one of the other channels first, and it will “steal” the existing Sync account when you upgrade Nightly. There is no way for you to turn your previous settings into a Nightly account.

The exact same steps will apply when Aurora bumps to version 17 — your Aurora account type will go away, and you’ll need to pre-emptively switch to Beta or Release to keep it, or just set up Sync again.

Pardon our dust, but this turned out to be the sanest (and only?) road forward.

Props go to Nick Alexander for pushing hard on this; he did — as always — excellent and elegant work.

Any questions, please do let us know. There might well be hiccups as this bounces down the track, but we’ll do our best to help.

Cleaning up feels good

Most of the test failures were just down to Jenkins problems (though I appreciate the green upward trend of that graph!), but it’s nice to start killing warnings…