The Quick Fix
I had a situation recently. There was a crashing bug that was dominating our crash reporting. It was effecting just over 1% of a decent size install base, but with many customer support channels it was dominating attention from anything else.
I keep a close eye on our crash reporting with each release. I was the first to notice the problem and was deep into investigation by the time it caught others attention. This app goes through a rigorous QA process so when something like this comes up its very hard to figure out reproduction steps.
Obviously it was an issue, but it wasn’t until the iTunes App Store met its requirements to publish reviews for a new release that we got solid reproduction steps. There’s a good tangent here about the iTunes App Store and the Apple Wall between users and developers but I’ll leave that for another day. I’ll just say that it was the best 1 star review I had gotten in a long time.
With reproduction steps in hand I zeroed in on not only the problem but where we introduced it into the code. The code seemed innocuous but the nasty bugs usually are.
My plan was then to identify not only the root of the issue but anything that might be effected by any code change. This was not a quick process. This bug was part of a very sensitive portion of the user’s overall experience. Without knowing all the aspects of the issue there was a good chance that any fix would impact users in some other way. It was essential to understand the problem to its root.
Once I figured out the root of the problem I did what I would hope others devs would do with such a sensitive part of the users experience – I detailed it out, along with my plan for a fix, to a trusted colleague. As smart as you might think you are, you are wrong from time to time. Allowing someone else to review your thoughts and architecture shouldn’t be seen as a weakness but a strength. As devs we should always be humble about our code.
Implementing the plan meant some refactoring. To me it wasn’t dangerous refactoring, but some people see risk in lines of code changed versus moving away from code that could have more hidden issues. I wasn’t served well in that I missed an even deeper root issue that QA found with regression on my changes. It was a simple fix because I knew the root problem, but it was a blip in the regression testing for a new release.
As my changes were moving to final QA regression I was challenged that I did too much and was asked to look for a smaller impact code change/”fix” that would deal with the superficial problem. The root of the issue would be addressed in a later release.
I understand this way of thinking. Stop the crashes for now then fix later. What I don’t get is the allocation of resources.
I had figured out, documented and resolved the issue. Passing that fix down the line to reset and find a “quick fix” meant not only readjusting my mindset around the problem from a proper fix to a bad one, but also refocusing QA resources to test both. Instead of being done with testing over the course of a day, we would be adding another week of testing down the road to test both the “temp” fix and the real fix.
In the end my fix passed QA and went into production. We’re back to our next release without this hanging over our heads cause it was fixed right the first time.
What’s your point?
This all comes down to the idea that there are quick fixes in software development.
Software is complicated. As an apps life lives on its complexity grows faster then its utility. As software engineers its our duty to solve the root issues instead of “throwing code at the problem”.
When you don’t you’ll end up in the long run trying to put out fires and crashes instead of adding value to your app. Its much easier on yourself and your team and your customers to fix things right the first time.
I want “cool sailing with some waves”.