mozregression – new way for handling merges

I am currently investigating how to make mozregression smarter about handling merges, and this post explains the approach.


Problem

While bisecting builds with mozregression on mozilla-central, we often end up with a merge commit. These commits often incorporate many individual changes; consider, for example, this url for a merge commit. A regression is hard to find inside such a large range of commits.


How mozregression currently works

Once we reach a one-day range by bisecting mozilla-central or a release branch, we keep the most recent commit tested and use it as the end of a new range to bisect on mozilla-inbound (or another integration branch, depending on the application). The beginning of that mozilla-inbound range is a commit found 4 days before the date on which that commit was pushed to mozilla-central, to be sure we won’t miss any commit that is in mozilla-central.

But there are multiple problems. First, the offending commit does not always come from m-i; it could come from any other integration branch (fx-team, b2g-inbound, etc.). Second, bisecting over a 4-day range in mozilla-inbound may involve testing a lot of builds, some of which are useless to test.
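
To make the current range selection concrete, here is a minimal sketch of the date arithmetic only. The function name and the `lookback_days` parameter are hypothetical, not mozregression's API, and the push date is simply taken from the example later in this post.

```python
from datetime import date, timedelta


def integration_range(push_date, lookback_days=4):
    """Return the (start, end) dates bisected on the integration branch.

    Hypothetical helper: the end is the m-c push date of the most recent
    tested commit, the start is `lookback_days` days earlier.
    """
    return push_date - timedelta(days=lookback_days), push_date


start, end = integration_range(date(2015, 10, 2))
print(start, end)
```

Every integration-branch build inside that window is a candidate, including builds for commits that were already covered while bisecting mozilla-central.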


Another approach

How can we improve this? As just stated, there are two points that can be improved:

  • do not automatically bisect on mozilla-inbound when we finish a mozilla-central or release branch bisection. Merges can come from fx-team or another integration branch, and this is not really application dependent.
  • try to avoid going back 4 days before the merge when switching to the integration branch. This costs productivity, since we are likely to test commits that we have already tested.

So, how can this be achieved? Here is my current approach (technical):

  1. Once we are done with the nightlies (one build per day) from a bisection on m-c or any release branch, switch to taskcluster to download the builds available in between. This way we reduce the range to two pushes (one good, one bad) instead of a full day. Since we tested them both, only the commits in the most recent push may contain the regression.
  2. Read the commit message of the topmost commit in the most recent push. If it does not look like a merge commit, then we can’t go any further (it is probably not a merge, so we are done).
  3. Otherwise we have a merge push, so we try to find the exact surrounding commits on the branch where the merged commits come from.
  4. Bisect this new push range using the changesets and the branch found above, reduce that range, and go to 2.
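
The loop in steps 1 to 4 can be sketched in Python as follows. This is only an illustration under stated assumptions, not mozregression's actual code: `Push`, `narrow`, `bisect_with_merges` and the `pushes_for_merge` callback are hypothetical names, and the toy branch histories stand in for real pushlog and taskcluster lookups.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Push:
    """Hypothetical stand-in for a push: its changesets, topmost last."""
    changesets: Tuple[str, ...]
    merged_from: Optional[str] = None  # source branch if the tip is a merge


def narrow(pushes, is_good):
    """Binary search between a known-good first push and a known-bad last push.

    Returns the adjacent (good, bad) pair, as in steps 1 and 4.
    """
    lo, hi = 0, len(pushes) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(pushes[mid]):
            lo = mid
        else:
            hi = mid
    return pushes[lo], pushes[hi]


def bisect_with_merges(branch, pushes, is_good, pushes_for_merge):
    """Follow merge pushes into their source branch until a non-merge remains."""
    while True:
        _good, bad = narrow(pushes, is_good)
        if bad.merged_from is None:             # step 2: not a merge, done
            return branch, bad
        branch = bad.merged_from                # step 3: go to the source branch
        pushes = pushes_for_merge(branch, bad)  # step 4: bisect those pushes


# Toy data (made up): m-c merges three m-i pushes; changeset "c" is the culprit.
BRANCHES = {
    "mozilla-central": [
        Push(("a",)),
        Push(("b", "c", "d", "merge"), merged_from="mozilla-inbound"),
    ],
    "mozilla-inbound": [Push(("a",)), Push(("b",)), Push(("c",)), Push(("d",))],
}
BAD_TIPS = {"c", "d", "merge"}  # builds of these tips show the regression


def is_good(push):
    """Stands in for the user answering good/bad for a build of this push."""
    return push.changesets[-1] not in BAD_TIPS


def pushes_for_merge(branch, merge_push):
    """Source-branch pushes whose tip was merged, plus the push just before."""
    history = BRANCHES[branch]
    merged = set(merge_push.changesets)
    hits = [i for i, p in enumerate(history) if p.changesets[-1] in merged]
    return history[max(hits[0] - 1, 0): hits[-1] + 1]


branch, culprit = bisect_with_merges(
    "mozilla-central", BRANCHES["mozilla-central"], is_good, pushes_for_merge
)
print(branch, culprit.changesets)
```

In this sketch the good/bad verdict and the pushlog lookup are passed in as callables, so the loop itself stays independent of where the builds and pushes come from.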

Let’s take an example:

mozregression -g 2015-09-20 -b 2015-10-10

We are bisecting firefox on mozilla-central. Let’s say we end up with the range 2015-10-01 to 2015-10-02. This is what the pushlog will look like at the end: 4 pushes and more than 250 changesets.

Now mozregression will automatically reduce the range (still on mozilla-central) by asking you good/bad for those remaining pushes. We would end up with two pushes: one we know is good because we tested its topmost commit, and one we know is bad for the same reason. Look at the following pushlog, showing what is still untested (except for the merge commit itself): 96 commits, coming from m-i.

Then mozregression will detect that it is a merge push from m-i, and it will automatically let you bisect the corresponding range of pushes on m-i. That is, our 96 changesets from m-c are now converted to testable pushes in m-i. We will end up with a smaller range, for example this one, where it will be easy to find our regression because it is a single push without any merge.


Comparison

Note that both methods would have worked for the example above, mainly because we end up in commits that originated from m-i. I tried another bisection, this time looking for a commit in fx-team. In that case the current mozregression simply cannot find it, but the new method handled it well.

Also, with the current method, the example above would have required around 7 steps after reducing to the one-day range. The new approach can achieve the same with around 5 steps.

Last but not least, this new flow is much cleaner:

  1. start bisecting from a given branch. Reduce the range to one push on that branch.
  2. if we found a merge, find the branch and the new pushes, and go to 1 to bisect some more with this new data. Else we are done.

Is this applicable?

Well, it relies on two things. The first one (and we already rely on it a bit currently) is that a merged commit can be found in the branch it comes from, using the changeset. I still have to ask VCS gurus whether that is reliable, but in my tests it works well.

The second thing is that we need to detect a merge commit, and which branch its commits come from. Thanks to the consistency of the sheriffs in their commit messages, this is easy.
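
As an illustration of that detection, here is a minimal sketch that parses the topmost commit message. It assumes merge messages look like "Merge mozilla-inbound to mozilla-central"; that shape, the regular expression, the `merge_source` name and the alias table are my assumptions for the example, not a documented convention or mozregression's code.

```python
import re

# Assumed shape: "Merge <source> to <target> ...", case-insensitive.
MERGE_RE = re.compile(
    r"^merge\s+(?P<source>[\w-]+)\s+(?:in)?to\s+(?P<target>[\w-]+)",
    re.IGNORECASE,
)

# Short names used in this post, mapped to full branch names.
ALIASES = {"m-i": "mozilla-inbound", "m-c": "mozilla-central"}


def merge_source(message):
    """Return the branch a merge commit comes from, or None if not a merge."""
    match = MERGE_RE.match(message.strip())
    if match is None:
        return None
    name = match.group("source").lower()
    return ALIASES.get(name, name)


for msg in (
    "Merge mozilla-inbound to mozilla-central. a=merge",
    "merge fx-team to m-c a=merge",
    "Bug 123 - Fix something unrelated",
):
    print(repr(msg), "->", merge_source(msg))
```

A message that does not match is treated as a regular commit, which corresponds to the "we are done" case of the flow.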

Even if it is not applicable everywhere for some reason, it appears to work often. Using this technique would result in a more accurate and helpful bisection, with a speed gain and better chances of finding the root cause of a regression.

This needs some more thinking and testing to determine the limits (what if this doesn’t work? Should we, or can we, use the old method in that case?), but this is definitely something I will explore further to improve the usefulness of mozregression.

