Monthly Archives: January 2012

How to file a good Android Sync bug

So you’ve seen some behavior that looks like a bug — perhaps a crash, perhaps data being corrupted, perhaps some performance impact or a sync never finishing.

The first thing we need is a log from the device. There’s no about:sync-log any more, so you’ll need to fetch the logs from the device via USB or using an Android application.

If you have an Android SDK and a USB cable, you can run

adb logcat -v time

and capture logs during the event that reproduces the bug. There will be a lot of output.
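If you would rather save the log straight to a file than copy it out of a scrolling terminal, a small shell helper along these lines may be handy. This is only a sketch: the function name `capture_device_log` and the default filename are made up for illustration, and it assumes `adb` is on your PATH with exactly one device attached. The `-d` flag makes `adb logcat` dump the current log buffer and exit instead of streaming.

```shell
# Hypothetical helper (not part of the Android SDK): dump the device's
# current log buffer, with timestamps, into a file and print the file name.
capture_device_log() {
    # Use the first argument as the output file, or a timestamped default.
    logfile="${1:-android-sync-$(date +%Y%m%d-%H%M%S).log}"
    # -d: dump the buffer and exit; -v time: prefix each line with a timestamp.
    adb logcat -d -v time > "$logfile" || return 1
    echo "$logfile"
}
```

Run `capture_device_log` right after reproducing the bug, because the device's log buffer is limited in size and older lines are overwritten.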

If you don’t have an Android development environment set up, you can install aLogRec and email the ADB log to yourself. This reportedly won’t work with recent Android versions, so I encourage you to try installing the tools first.

Once you have the log in a file, you can attach it to a new Android Sync bug in Bugzilla. Please take the time to write:

  • An accurate summary of your issue. “Sync adds bookmarks from desktop Bookmarks Toolbar, but local changes do not appear on desktop” is way more useful than “Sync doesn’t work”.
  • A description that explains what you see, what you expect to see, and any other pertinent information. Let us know which other devices you have connected to your Sync account: “Firefox 23 on Mac, Firefox 24 on Windows 7”.
  • Steps to reproduce. If you can start with a clean profile everywhere, and take a reproducible sequence of actions to cause a bug to manifest, we will be really grateful, and the bug is much more likely to be fixed than one we have to guess at.
  • A description of your device and the version of Firefox you’re using. Something like “Samsung Galaxy S3, Android 4.1.4, Nightly build from 2013-08-24” is enough. You can find out your Android version in the Settings application.

Rapid development

We just shipped our second milestone for Android Sync, a native Java implementation of Firefox Sync to work with Native Fennec.

The first milestone was “preffed-off”, a code drop to ensure that building these two separate projects together would work. This milestone is the first time it’s available to users, which of course is both scary and exciting. By and large it seems to work, albeit with a bunch of known issues.

I thought I’d share some experiences from getting this done, with graphical fun from GitHub’s statistics view.

From the outset, I knew this was an impossible challenge. At the start of November my boss asked me, “can we have something working in December?” Given that “something working” involved building a brand-new client for the Sync protocol, the crypto layer, storage engines for a browser that was still being built, and UI and system integration for an unfamiliar platform, my obvious response was “are you crazy?”

That aside, I was actually fairly confident in the scope at the time. A small team (at that point just me, an intern, and a recent new-grad hire) can move quite quickly if the problem is well understood, and this was.

I knew in advance that this would require insane amounts of work: I worked until 4am on Christmas day, and that was hardly the exception. We landed our first milestone a week late, and upon reflection we added about a week of unplanned packaging work in the middle, so I call that a win.

This was the cost:

[Image: GitHub commit punchcard for Android Sync, January 2012]

That’s the GitHub commit punchcard for the project, and that pretty directly reflects my working time. (Apparently I land a lot of code at 4pm on a Sunday.)

That wavy sliver of grey between 2am and 6am? Yeah, that’s when I was sleeping. The slight reduction of dots around noon and 7pm? Mealtimes. I think this works out to a 14-hour day on average.

I started to burn out a little around two weeks ago, after several 90+ hour weeks in a row and extensive (including international) travel. You can see this in our commit graph:

[Image: GitHub commit graph for Android Sync, January 2012]

The orange is me. That big dip on the right is Christmas, and thereafter I shifted from building stuff to fixing stuff, so my total contribution dropped. 359 commits over about 10 weeks is about 5 commits per day, with about 3,500 lines of changes (additions and deletions) per day.

To get this done, we had to work with much less process than I like. Our test coverage is low (thanks for the Jenkins setup, gps!), but it does exist. The philikon on my shoulder poked me every time I landed something knowing it had inadequate test coverage. I only recently started absolutely mandating code review and bug numbers for each commit. My own work is still unreviewed, because we don’t have a module peer with enough experience to review my code. (But thank heavens for that philikon on my shoulder! It sometimes felt like I had a second brain doing real-time code review.)

Several times we relied on some heroes. Philipp and a contributor from Mozilla China stepped in and saved our J-PAKE code from going off the rails. Tony and Tracy did some late-night QA because we simply didn’t have enough padding to get them working builds beforehand. And Ally and Erin deserve a serious shout-out for taking occasional “can you make sure this Fennec bug gets landed?” pings and making sure the way was clear for this bull-headed engineer.

All of this omitted process has come back or is going to come back, of course, and we’re going to take a more measured approach going forward. On reflection, though, I’m actually pleased with how this development process worked. I think we made reasonable tradeoffs (with a few errors that I want to talk through with people), and the outcome was positive.

In both the large and the small, software development is about horse-trading with risks. Spend an hour reviewing this code, and let some other feature go unimplemented, or trust that it works and risk a bug? Write that test or not? Go for the hack or the thorough solution? Ask for help or avoid the coordination overhead?

There are a lot of ways this project could have gone wrong, and some in which it did. Thanks to my team, our excellent project managers, and my boss for making sure it turned out OK!