Finding Problems with Git Bisect

Git-bisect is an amazing tool for tracking down problems. It uses a binary search algorithm to find the offending commit so you can fix whatever problem was introduced.

Binary Search Algorithm

It’s not important to get too deep in the weeds on this, but I do want to explore exactly what git-bisect is doing and why a binary search algorithm is a good approach to identifying the problem with the repository.

A binary search algorithm is used to quickly find a target in a sequence (it could be an array of data or a collection like the commits in Git). In our case we are looking for the commit that introduced a bug.

Let’s say we have six commits in our sequence.

r1  r2  r3  r4  r5  r6

We know that r1 is a good commit (no bug apparent) and r6 is a bad commit (shows the bug). We need to find where that bug was introduced.

The algorithm splits the sequence of commits (determined by the boundaries we set by telling git-bisect the last good commit and a bad commit) in half and checks out the middle commit.

r1  r2  r3  (r4)  r5  r6

If the checked-out commit doesn’t have the bug, the algorithm moves later in the sequence, setting the new boundaries to r4..r6. We know that r4 and everything before it don’t have the bug, so those commits are tossed out and no longer evaluated.

r1  r2  r3  r4  (r5)  r6

Now git-bisect checks out r5 and we check whether it is good or bad. Does it have the bug? Indeed it does. Since r4 is good and r5 is bad, r5 is the commit that introduced the bug. Instead of checking out and testing every commit, we only had to check two. This doesn’t look impressive with a small sequence, but each check halves the remaining range, so imagine how quickly this works on a large project with hundreds of commits.
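The walkthrough above can be sketched as a small shell script. This is only a toy model of the idea, not how Git implements it: the commits are plain numbers 1 through 6, and `is_bad` is a hypothetical stand-in for checking out a commit and testing it by hand. It assumes the bug first appears in r5, as in the example.

```shell
#!/bin/sh
# Toy model of bisection over commits r1..r6 (numbered 1..6, oldest first).

# Hypothetical stand-in for "check out the commit and test it".
# We pretend the bug was introduced in r5, so r5 and r6 are bad.
is_bad() {
  [ "$1" -ge 5 ]
}

good=1    # r1 is known to be good
bad=6     # r6 is known to be bad
steps=0   # how many commits we had to test

# Keep halving until the good and bad boundaries sit next to each other.
while [ $(( bad - good )) -gt 1 ]; do
  mid=$(( (good + bad + 1) / 2 ))   # rounded up, so r4 is tested first as in the walkthrough
  steps=$(( steps + 1 ))
  if is_bad "$mid"; then
    bad=$mid      # bug already present: the culprit is here or earlier
  else
    good=$mid     # bug not present yet: the culprit is later
  fi
done

first_bad="r$bad"
echo "first bad commit: $first_bad (found in $steps checks)"
```

Run as written, this tests r4 and then r5 and reports r5 after two checks. Real git-bisect may pick a different midpoint, but the halving principle is the same.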

Using Git Bisect

Using git-bisect looks like this:

$ git bisect start

$ git bisect good 69e8f1923

$ git bisect bad e58f8776

Git-bisect will then split your commit range in half and check out a commit for you to test to see whether the bug exists in that commit. You can think of this as “divide and conquer.”

If the bug still exists you reply to Git with:

$ git bisect bad

And then git-bisect will split the commits in half again and check one out for you to test. If you are at a commit that doesn’t have the bug present, you reply to Git with:

$ git bisect good

Do this until you (with the help of git-bisect) have narrowed down the offending commit. Git will then come back to you identifying it:

97797cb24118068e0a28fedef19fabb5dd129b4d is the first bad commit

We now know where the bug was introduced and can review the code to plan a fix.

To get the repository back to the state it was in before we started, we need to finish up our git-bisect session.

$ git bisect reset
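To try the whole session end to end without touching a real project, here is a rough sketch that builds a throwaway repository and walks through the same commands. Everything specific in it is made up for the demo: the temporary directory, the `app.txt` file, the `BUG` marker line, and the commit messages r1 through r6. The loop simply answers `good` or `bad` the way you would by hand.

```shell
#!/bin/sh
# Throwaway repository; the path, file name, and commit messages are invented for this demo.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Six commits r1..r6; the "bug" (a line reading BUG) sneaks in with r5.
for i in 1 2 3 4 5 6; do
  echo "change $i" >> app.txt
  if [ "$i" -eq 5 ]; then echo "BUG" >> app.txt; fi
  git add app.txt
  git commit -q -m "r$i"
done

first=$(git rev-list --max-parents=0 HEAD)    # r1, the known good commit
git bisect start >/dev/null
git bisect bad HEAD >/dev/null                # r6, the known bad commit
out=$(git bisect good "$first") || exit 1     # git now checks out a midpoint

# Answer good/bad for each commit git checks out, until it names the culprit.
until echo "$out" | grep -q "is the first bad commit"; do
  if grep -q BUG app.txt; then
    out=$(git bisect bad) || exit 1
  else
    out=$(git bisect good) || exit 1
  fi
done

# While the session is still open, refs/bisect/bad points at the first bad commit.
subject=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad commit: $subject"

git bisect reset >/dev/null 2>&1              # back to where we started
```

In a real project the `grep -q BUG app.txt` line is where your own manual check (or test command) would go.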

Changing Terms in Git Bisect

Now that we know how to use git-bisect, let’s talk about how we can make the terminology a little easier to understand.

Previously we used “good” and “bad” as a way to describe the state of our project as we hunted for the introduction of a problem or bug.

But those terms don’t always make sense.

Maybe we are trying to find where a file called needle.rb was introduced. Sure, this is a contrived example, but stick with me. So instead of saying “good” or “bad” we want to say the repository has “needle” or “no-needle.”

To do this we need to run git-bisect with our terms included as options.

$ git bisect start --term-good=no-needle --term-bad=needle

When the checked-out version of the project doesn’t have the needle.rb file, we can mark it as “no-needle” or, in the original language of git-bisect, “good.” If the needle.rb file is present, we can tell git-bisect that that version of the project has the needle (or is “bad” because we don’t want the needle in our project).
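With custom terms in place, the replies become `git bisect needle` and `git bisect no-needle` instead of `git bisect bad` and `git bisect good`. The sketch below plays this out in a throwaway repository. The temporary directory, the `hay.txt` file, and the commit messages are invented for illustration, and it assumes needle.rb first shows up in the fourth commit.

```shell
#!/bin/sh
# Throwaway repository; the path, file names, and commit messages are invented for this demo.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Six commits r1..r6; needle.rb first appears in r4.
for i in 1 2 3 4 5 6; do
  echo "change $i" >> hay.txt
  if [ "$i" -eq 4 ]; then echo "puts 'needle'" > needle.rb; fi
  git add .
  git commit -q -m "r$i"
done

first=$(git rev-list --max-parents=0 HEAD)         # r1, has no needle.rb
git bisect start --term-good=no-needle --term-bad=needle >/dev/null
git bisect needle HEAD >/dev/null                  # newest commit has the file
out=$(git bisect no-needle "$first") || exit 1     # oldest commit does not

# Reply with our own terms until git names the commit that added the file.
until echo "$out" | grep -q "is the first"; do
  if [ -e needle.rb ]; then
    out=$(git bisect needle) || exit 1
  else
    out=$(git bisect no-needle) || exit 1
  fi
done

# The "bad" side of the search is tracked under our custom term.
found=$(git log -1 --format=%s refs/bisect/needle)
echo "needle.rb was introduced in: $found"

git bisect reset >/dev/null 2>&1
```

The custom terms only last for the current session. After `git bisect reset`, the next `git bisect start` goes back to good and bad unless you pass the options again.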