How to file a good Android Sync bug

So you’ve seen some behavior that looks like a bug — perhaps a crash, perhaps data being corrupted, perhaps some performance impact or a sync never finishing.

The first thing we need is a log from the device. There’s no about:sync-log any more, so you’ll need to fetch the logs from the device via USB or using an Android application.

If you have an Android SDK and a USB cable, you can run

adb logcat -v time

and capture logs during the event that reproduces the bug. There will be a lot of output.
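Since there will be a lot of output, it can help to skim a trimmed copy before you attach the full log. Here is a minimal sketch, assuming you saved the capture with something like `adb logcat -v time > sync.log` and that lines follow the usual `-v time` layout (`MM-DD HH:MM:SS.mmm P/Tag(pid): message`). The function name and the keyword list are my own guesses for illustration, not an official set of Sync log tags, and the time window does not handle captures that cross midnight.

```python
import re

# Matches the usual `adb logcat -v time` layout, e.g.
# "08-24 14:03:12.345 I/SomeTag( 1234): message"
LINE_RE = re.compile(
    r"^(?P<date>\d\d-\d\d) (?P<time>\d\d:\d\d:\d\d\.\d{3}) "
    r"(?P<prio>[VDIWEFS])/(?P<tag>[^(]*)\(\s*(?P<pid>\d+)\): (?P<msg>.*)$"
)


def filter_log(lines, start, end, keywords=("sync", "fennec")):
    """Yield log lines stamped within [start, end] that mention a keyword.

    `start` and `end` are "HH:MM:SS" strings. `keywords` is a hypothetical
    list; adjust it to whatever shows up in your own capture.
    """
    for line in lines:
        line = line.rstrip("\n")
        match = LINE_RE.match(line)
        if not match:
            continue  # skip separators and lines in another format
        stamp = match.group("time")[:8]  # drop the milliseconds
        if not (start <= stamp <= end):
            continue
        haystack = (match.group("tag") + " " + match.group("msg")).lower()
        if any(keyword in haystack for keyword in keywords):
            yield line
```

Passing an open file, as in `filter_log(open("sync.log"), "14:00:00", "14:30:00")`, yields only the matching lines from that half hour. Please still attach the unfiltered log to the bug: the interesting line is often one the filter would have dropped.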

If you don’t have an Android development environment set up, you can install aLogRec and email the ADB log to yourself. This reportedly won’t work with recent Android versions, so I encourage you to try installing the tools first.

Once you have the log in a file, you can attach it to a new Android Sync bug in Bugzilla. Please take the time to write:

  • An accurate summary of your issue. “Sync adds bookmarks from desktop Bookmarks Toolbar, but local changes do not appear on desktop” is way more useful than “Sync doesn’t work”.
  • A description that explains what you see, what you expect to see, and any other pertinent information. Let us know which other devices you have connected to your Sync account: “Firefox 23 on Mac, Firefox 24 on Windows 7”.
  • Steps to reproduce. If you can start with a clean profile everywhere, and take a reproducible sequence of actions to cause a bug to manifest, we will be really grateful, and the bug is much more likely to be fixed than one we have to guess at.
  • A description of your device and the version of Firefox you’re using. Something like “Samsung Galaxy S3, Android 4.1.4, Nightly build from 2013-08-24” is enough. You can find out your Android version in the Settings application.

Rapid development

We just shipped our second milestone for Android Sync, a native Java implementation of Firefox Sync to work with Native Fennec.

The first milestone was “preffed-off”, a code drop to ensure that building these two separate projects together would work. This milestone is the first time it’s available to users, which of course is both scary and exciting. By and large it seems to work, albeit with a bunch of known issues.

I thought I’d share some experiences from getting this done, with graphical fun from GitHub’s statistics view.

From the outset, I knew this was an impossible challenge. At the start of November my boss asked me “can we have something working in December?”. Given that “something working” involves building a brand-new client for the Sync protocol, the crypto layer, storage engines for a browser that’s still being built, and UI and system integration for an unfamiliar platform, my obvious response was “are you crazy?”.

That aside, I was actually fairly confident in the scope at the time. A small team (at that point just me, an intern, and a recent new-grad hire) can move quite quickly if the problem is well understood, and this was.

I knew in advance that this would require insane amounts of work: I worked until 4am on Christmas day, and that was hardly the exception. We landed our first milestone a week late, and upon reflection we added about a week of unplanned packaging work in the middle, so I call that a win.

This was the cost:

Image: GitHub commit punchcard for android-sync

That’s the GitHub commit punchcard for the project, and that pretty directly reflects my working time. (Apparently I land a lot of code at 4pm on a Sunday.)

That wavy sliver of grey between 2am and 6am? Yeah, that’s when I was sleeping. The slight reduction of dots around noon and 7pm? Mealtimes. I think this means an average of a 14-hour day.

I started to burn out a little around two weeks ago, after several 90+ hour weeks and extensive (including international) travel. You can see this in our commit graph:

Image: commit graph for android-sync, January 2012

The orange is me. That big dip on the right is Christmas, and thereafter I shifted from building stuff to fixing stuff, so my total contribution dropped. 359 commits over about 10 weeks is about 5 commits per day, with about 3,500 lines of changes (additions and deletions) per day.

We had to work with much less process than I like to get this done. Our test coverage is low (thanks for the Jenkins setup, gps!), but it does exist. The philikon on my shoulder poked me every time I landed something knowing it had inadequate test coverage. I only recently started absolutely mandating code review and bug numbers for each commit. My own work is still unreviewed, because we don’t have a module peer with enough experience to review my code. (But thank heavens for that philikon on my shoulder! It sometimes felt like I had a second brain doing real-time code review.)

Several times we relied on some heroes. Philipp and a contributor from Mozilla China stepped in and saved our J-PAKE code from going off the rails. Tony and Tracy did some late-night QA because we simply didn’t have enough padding to get them working builds beforehand. And Ally and Erin deserve a serious shout-out for taking occasional “can you make sure this Fennec bug gets landed?” pings and making sure the way was clear for this bull-headed engineer.

All of this omitted process has come back or is going to come back, of course, and we’re going to take a more measured approach going forward. On reflection, though, I’m actually pleased with how this development process worked. I think we made reasonable tradeoffs (with a few errors that I want to talk through with people) with a positive outcome.

In both the large and the small, software development is about horse-trading with risks. Spend an hour reviewing this code, and let some other feature go unimplemented, or trust that it works and risk a bug? Write that test or not? Go for the hack or the thorough solution? Ask for help or avoid the coordination overhead?

There are a lot of ways this project could have gone wrong, and some in which it did. Thanks to my team, our excellent project managers, and my boss for making sure it turned out OK!

Humane code review

In the year I’ve been at Mozilla I’ve submitted almost two hundred patches for review. My email archives suggest that I’ve had 26 reviews not granted, and 164 granted.

I’ve also been reviewing patches for Sync since January of 2011. And in the past few months our team — and thus the number of people whose code I review, or see being reviewed — has grown significantly.

This has given me a lot of food for thought on the topic of code review and motivation, so here’s a Saturday afternoon blog post!

From one valid (if extreme) perspective, we don’t own our code, and we have no investment in it. All code is bad, and a positive review brings only grim satisfaction. A negative review is a joyous opportunity to learn, not a personal insult. We toil beneath our hooded robes.

Very few people can be this detached from their output. (I try!)

A “review granted” or “review not granted” email brings most of us a burst of emotion; even having reached a very low-ego place with respect to my code (it’s hard to make it through two hundred code reviews and still derive my self-worth from a patch!), I still feel good when I get r+, and a mild shot of frustration and disappointment when I get r-. Despite the fantastic opportunity to learn inherent in each review, we need to fight through our emotions to get to the point of learning. Marko Kloos writes about a similar process in his discussion of writers’ rejection letters.

If I still sometimes feel these things, I thought, then what about people who aren’t like me? Perhaps they’re volunteer contributors with other demands on their time. Perhaps they’re just less motivated, or have more self-worth riding on each patch. Perhaps they are junior engineers who feel that they have to prove themselves. A blunt, factual, “this is totally wrong” review, as so many of us tend to give, can be a shattering blow to a person’s motivation. Even people with practice dissociating their ego from their work can take a motivation hit; it’s not just the personal involvement, but also the frustration at a setback.

I can already hear the comments: “but they shouldn’t feel like that! It’s just a code review!”. My rebuttal is “but people do, because most people are not like you, and not accounting for that can be damaging in all kinds of ways”.

For a lot of people, how you phrase your code review will dictate whether the patch ever lands at all, or at least how soon it lands.

I’ve experienced a few approaches to code review that I think might help.

The absolute first thing is to provide a useful review. Just like an email, if your review doesn’t tell the author what they need to do to get the patch approved, then they need to double their effort: they not only have to fix the patch, but they have to put in the time to find out how first! The more clearly you can state what you want, and ideally how to get there (“try looking at this test to see how to do it”), the more likely it is that the author will be able to keep on rolling.

The second is an elaboration on the first: engage. Don’t just throw an r-; walk over (or call, or IRC) and talk through it, perhaps pairing to get to a patch that’ll pass review. Rather than just “no”, spend a little time to make sure that the contributor is actually moving forward and motivated. That might mean you spend some time trying to solve someone else’s problem. That’s fine; it’s an investment in the code, just like code review, but it’s also an investment in the author!

The third is to be careful with your wording. Follow the same kinds of tips that you read in magazines about having a constructive argument: avoiding finger-pointing, focusing on solutions or improvements rather than problems, etc. This is a little bit of a hack, but people don’t stop being people when they open Bugzilla. There is a difference between “you forgot to clear session state here, so the following test exercises the completely wrong thing (you idiot)” and “we should make sure that the following test still works when you clear session state at the end of this one, like this…”. It’s small, but a few of those per review, and a few reviews a month, will eventually leave a sour taste in a contributor’s mouth. Do you want people to be apprehensive about asking you for review?

On a related note, be careful with your flags. Marco Bonardo once reviewed one of my patches. I was new to that component, and he gave me very detailed feedback on everything from style to direction. Something that stuck in my mind was that he cleared the review flag and set feedback+, rather than review-. The patch wasn’t anywhere near good enough to land, but rather than coming across as a rejection it was presented as “great start, keep it up!”. This put me in a positive mood and encouraged me… and I’m pretty unfazed about negative reviews! That technique went straight into my reviewing toolbox. (This is the same kind of diplomatic response that normal human beings use all the time in social situations: “do you like my new jacket?” “it really complements your jeans!”.)

Similarly, as a patch author I sometimes find a flaw in my own code and clear my review? flag; as a reviewer one can do the same, which has a smaller ego hit than that big minus sign.

Does anyone else have any good tips?

Why does Firefox Sync use a key as well as a password?

A friend of mine, a software engineer, just asked me this.

Why do you force people to enter that enormous key just to protect their sync data? Passwords are sufficient for banking institutions and payroll facilities, arguably with more important data than your bookmarks. Why not make it optional extra security for those who want it, instead of making everyone pass around a 26 character string to every machine they want to sync from, and risk losing all of their information if it’s lost?

The answer is quite long-winded. Here’s a slightly edited version of my response.

Firstly, we try to make sure that people don’t have to enter it; we’re not blind to the additional complication involved. The sync key is generated for you during setup, so you don’t have to think up another password. It’s stored in Password Manager so you don’t have to remember it. When you set up a new laptop or Android phone you can usually use Easy Setup (the “forefront” UI in Firefox 4), which is much like Bluetooth pairing, so you don’t have to type it. The UI will only continue to hide the Sync key more deeply as we start to introduce better means of credentials exchange (such as QR codes for time-delayed J-PAKE)… in fact, soon enough we’re likely to rename it “Recovery Key”, because that’s what it’s for.

Secondly, that long string is an AES key, with all the joy it brings. We encrypt your data locally because we sync your entire history, bookmarks, and passwords, including access to banks, messages from revolutionary organizations, doctors’ health data (HIPAA!), and more… and we have over a million active users. A breach without strong local encryption would make the PSN intrusion look like 4chan trolling. We want to ensure that we can’t get your data, either deliberately or under the coercion of the FBI. Being able to recover a user’s data from our servers means we are required to give your browsing history to the FBI if they show up with a warrant. Ever visited thepiratebay?

In essence, we make the same promises as Dropbox, but we actually keep them. We really can’t betray your trust, and the sync key is why.

“But why not use the password for encryption?”. I’m glad you asked.

A password is inadequate for this purpose. We used to allow a user-entered passphrase in place of the sync key, but it had a lot of problems.

For one thing, users didn’t understand why they needed two passwords… and using just one is a terrible idea! Your account password goes over the wire for HTTP auth, and HTTPS is not always a defense — quite apart from the possibility of a compromised HTTP server (an attack vector against which we want to guard), I’ve personally helped out two users whose employers were running SSL MITM proxies, which allows them to snoop HTTPS traffic… including HTTP auth headers. We only detected it because the user’s employer had added their own root certificate to Windows’ cert store, but not Firefox’s, and Firefox threw a certificate error. Your HTTPS traffic is visible to your employer. That’s a terrible thing, but your Sync data is still secure, because we don’t just use your password as an encryption key.

The other issue with passwords is that there just isn’t enough entropy in a user-entered string to support our cryptographic guarantees, even with PBKDF2 as a bootstrap algorithm. Put it this way: is your password twenty-six base36 characters long (a solid 128-bit key), or is it eight to twelve letters with a couple of numbers? I thought so. Most people’s passwords aren’t even that strong.
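To put rough numbers on that comparison, here is a back-of-the-envelope sketch. It assumes every character is picked uniformly at random from a 36-symbol alphabet (letters plus digits, ignoring case), which is far more generous than how people really choose passwords. The function name is mine, and the PBKDF2 salt, hash, and iteration count are purely illustrative, not the parameters Sync uses.

```python
import hashlib
import math


def max_entropy_bits(length, alphabet_size):
    """Upper bound on the entropy of `length` symbols drawn uniformly
    at random from an alphabet of `alphabet_size` symbols."""
    return length * math.log2(alphabet_size)


sync_key_bits = max_entropy_bits(26, 36)  # twenty-six base36 characters
password_bits = max_entropy_bits(10, 36)  # ten letters and digits, best case

print(f"26 base36 characters: at most {sync_key_bits:.1f} bits")
print(f"10 base36 characters: at most {password_bits:.1f} bits")

# PBKDF2 makes each guess more expensive, but it cannot add entropy:
# the derived key is 128 bits long, yet an attacker only has to search
# the much smaller space of plausible passwords.
derived = hashlib.pbkdf2_hmac(
    "sha256", b"hunter2", b"illustrative-salt", 10_000, dklen=16
)
print(f"{len(derived) * 8}-bit key derived from a weak password")
```

Under those assumptions the key comes out at roughly 134 bits against about 52 for the password. Ten thousand PBKDF2 iterations multiply the attacker’s work by ten thousand, which is worth only about 13 more bits, so the gap stays enormous.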

Most banks don’t just use username + password. Many non-US banks require the use of additional hardware to generate strong tokens per-login (i.e., you have to carry a small USB device around with you), or other login methods. HSBC USA makes users type a second long password on a damn JavaScript mouse-keyboard. Even Bank of America (a comparatively weak institution, in my experience) requires a username, a strong password, and a cookie credential that you can only get by providing your SSN and answering security questions to “authorize” the machine.

Payroll facilities… well, they don’t care, and in my experience they typically don’t understand technology too well. Just because ADP, or Sony, or Dropbox jump off a bridge doesn’t mean we’re going to throw our users off, too.

Speaking more broadly: it would be really convenient for Firefox Sync to not use encryption. We could let you see your bookmarks in a webpage (a common request), and the client (the code I maintain) would be much simpler! That’s how Chrome approaches this problem… Google wants to see your bookmarks. But that’s not really how Mozilla works; we try to err on the side of safety, freedom, and serving the users’ best interests, rather than opting for the expedient solution. The vast majority of users simply do not have the knowledge to correctly evaluate the decision you’re asking them to make. That’s why users put their bank URL (and credentials, apparently!) into delicious, and put their private keys on the web.

And that’s why we don’t let you upload the contents of your Firefox profile with weak or no encryption.

Idle musing

An autonomous car has been built, and has driven 140,000 miles without accident.

Apparently mass production is intended within ten years.

Wearing my software engineering hat, at least, I see parallels with projects that use unrealistic test data, then blow up catastrophically in the real world.

We developers often use inadequate test harnesses, data sets that are too small, and clean test inputs. We get a false sense of confidence in our code, and this is only deflated by thorough exercise during betas. At a higher level, software companies seem to have a tendency to target our own insular demographics: we build startups that target 25–35 year olds living in the Bay Area, for example, and wonder why they fail to get broad appeal.

The equivalent for the autonomous car?

Photo

I doubt very much that Google’s autonomous car has been tested in a Northwest blizzard, dragging its thin all-weather tires through 8 inches of snow. In the Bay Area, that’s not a problem: you might have to scrape a windshield free of ice one day each year. Outside the Bay Area, one has to cope with snowplows, whiteouts, drifts, spray, cars sliding across the road in front of you, and frequent loss of traction… then sucking mud, laddered gravel roads, and dust storms (with the consequent problems with both visibility and sensor clogging).

The parking camera on my truck, which is at chest height on a tall man, usually ends up completely iced over and useless by the end of a journey. The front of my truck looked like the inside of an old freezer after today’s 3-hour drive in a blizzard. How will the sensor array on a small autonomous car fare?

Color me skeptical.

Working from home

The past few days have given me additional perspective on my life as a remote worker.

I’ve been acquainting myself with a new codebase: in JavaScript, which means I have very little ability to explore, trace, poke at a live instance, etc. when compared to my usual tools. I also didn’t have the advantage of constantly badgering the guy two desks down, which is the typical substitute for understanding!

This, and a curious inability to concentrate, left me quite frustrated yesterday; I had to simply force myself to keep battering my mind against every possible approach until I had gathered enough understanding to make some progress. I’m usually quite imperturbable, but the early days prior to understanding can really rock the boat.

That inability to concentrate is the spark that caused this post. In an office environment, I would have wandered around, perhaps having short conversations with teammates that might have unlocked a door in my work. I also wouldn’t have felt bad about not making measurable progress: being visible and interacting with others is a reasonable substitute, but a remote worker doesn’t have that option. At home, I found myself constantly snapped out of the early stages of The Zone by minor distractions, and I had no outlet for my frustration.

With some consideration I conclude that I would be no more productive or able to concentrate in an office — open-plan is hardly the dictionary definition of monastic seclusion — but I would feel less negative about failure. Commiseration is a powerful thing.

On the other hand, I do get to step out my front door and go vole hunting with my dog, enjoying the beautiful sky and the rolling fields. I don’t think that I would trade this for an office job — even in an office as awesome as Mozilla’s.

First day

I’m pleased to announce that today is my first day at Mozilla Corporation. It’s both disorienting and exhilarating to be so surrounded by a whirl of smart people doing novel work on so many fronts; I look forward to getting more of a handle on things, and starting to get traction.

This, I’m sure, will lead to yet more posts here… more positive ones, I hope!

Squeaky doors

I work (remotely) with a senior development manager. His office door squeaks.

I know that he’s been in that office for several years, yet every time we have a video conference — multiple times each week — I’m greeted by a series of loud creaks as each physical participant enters the room.

I’m pretty sure he no longer notices the squeaky office door; perhaps it’s even a charming, reassuring quirk of his environment. Maybe he jokes “oh yes, you get used to it”, or “haha, one day I’ll bring in some oil”.

I notice the squeaky door. In fact, I’m a little surprised that anyone could ignore it for so long.

The squeaky door is, of course, a parable about bad tools or environments. (It’s also absolutely real: if I wasn’t a thousand miles away, I would have oiled that door a long time ago!)

When I moved into my shared office as a new PhD student, many years ago, the door squeaked. When I went to get coffee it would squeak. Running down the hall for a printout? Squeak, squeak. The very next day I brought in a small bottle of oil, and two minutes later the door was swinging silently.

It pays dividends to spend a little time working on removing the frictions we encounter every day: from oiling hinges, to moving furniture, all the way to switching build tools or deciding to work remotely. (I’d call a frustrating commute a big friction!)

Tools

I’ve always had a little bit of an obsession with tools. Building, buying, using, admiring.

All kinds of tools. I have an attachment to the shiny ones — pocket knives, clicking open and fitting the hand; watches; firearms, steel reciprocating and rotating under huge stresses with minuscule tolerances. The dirty, heavy ones, too; these were part of my youth. Large metalworking lathes, striking off big curls of swarf; old drill presses with the chuck key attached by a chain, carefully positioned so that the safety guard would interfere if the chuck key were accidentally left in place.

Photo

These all have some things in common: they’re built with singularity of purpose; they have been refined over years (sometimes millennia); and, whilst they demand respect from the user (these are not toys), their workings and affordances are obvious to their intended audience. There is no mollycoddling, no wizards, no DRM.

Having an ample collection of tools is a joy. With the right tool, any job is easy. Without it, one struggles, messes up, gets dirty, gets frustrated.

Yesterday I changed a couple of wheels on my truck. Each wheel weighs perhaps 70lb with the tire mounted.

My jack stands were too large for the restricted space around each jacking point, so I had to do one wheel at a time, using just my shop jack. The lever on the jack was too short, so I had to stretch under the rear of the (elevated!) truck to work it. Lifting the wheels into position on my own was a challenge — they are 33″ across and at least a foot deep. My torque wrench was at the limits of its range, and still needed help from my breaker bar extension. Even my heavy-duty impact wrench had difficulty breaking the factory lug nuts loose.

A job that would have been easy on a small car turned out to be a sweaty, dirty mess. Inadequate tools. Lesson learned.

Making tools as you need them is an important skill, and an important attitude. Machinists, mechanics, woodworkers… these people are all used to building jigs and tools to make their jobs possible, or to make difficult tasks repeatable. I am often surprised at the number of software developers who only use tools, never making their own. They will perform tiresome tasks by hand because Visual Studio doesn’t have a plugin for it, or because their shell isn’t up to the task. They will allow their existing tools to dictate their technology selection, because it never occurs to them to make their own. Perhaps they aren’t lazy enough (laziness being a chief virtue of a programmer), or maybe they have a high tolerance for frustration. Me, I get annoyed every time I have to use Remote Desktop to manually install a test environment. “You mean I have to use the mouse?!”

Programming tools are one of the few pieces of software for which you, the developer, are also the end user: scratching an itch. It should be second nature to build them; even to spend half or more of your time building tools to do your work for you. (From one perspective, this is what macros are all about.)

I don’t just mean making new tools, either: we should constantly look for existing processes, tools, systems, and components which are inadequate, no longer fit, or could be expanded to make our lives better, and replace them. These are the levers by which we move the world: shouldn’t we constantly look for longer ones?

One of the most frustrating work situations I’ve encountered is when my two previous points collided: having bad tools, and entrenched organizational forces that prevented the introduction of new, better ways of doing things. Few within the organization were aware of the alternatives, and the cost of changing was high, so hundreds or thousands of people plodded on each day, doing the software equivalent of building pyramids out of high-tech bricks, but moving them with rolling logs and frayed rope.

There’s no good way out of this situation: nobody can justify the disruption of a major shift in tools to upper management — the old way works, so it’s hard to even suggest it. If you’re lucky, some brave soul will do their best to incrementally improve the old technology: low-profile rolling logs, sharper stone axes. “Crappy Tool v2.0, Now With Slightly Less Suck”. There is no leader to drive the change past the entrenched resistance, because nobody who can justify the expense actually suffers from the bad tools.

The only solution, I think, is to avoid stagnation at all costs, because stagnation eventually becomes impossible to escape. Just as with engineering debt, if you build up too much it can be very difficult to shift. Be mindful of over-attachment to old tools, and constantly strive to use the best. Re-evaluate, re-implement, and replace.

And don’t ever hire anyone who doesn’t build their own tools.

Time for a new technical blog

The barrier to entry of my extremely clunky old blog was preventing me from writing… so here’s a new one. If only all things in life were so easy. Over time I might scavenge and re-import good stuff from the old into the new, but don’t count on it.

What’s in a name? As those who’ve worked with me are aware, I’m fond of the old saw that once you’re done with the first 80% of making something (particularly software), you’ve still got the last 80% to go. I aim to cover the whole 160% here.

Expect neat little code snippets, thoughts on user experience, and grand commentary on the distortion of Alexander’s architectural pattern language into the horror that is the modern software patterns movement. Or just stuff that’s too long for Twitter.