4 billion to 1 in 32 steps: binary search

8658266567_fcba6a18d7_z © 2013 Maxi Walton and used via Creative Commons licensing

If haystacks were sorted lists, you could actually find that needle that's disguised as dry grass, and not reach the end of your lifespan inspecting each blade. Yeah yeah yeah — this proposes a ridiculous scenario hinging on a tired cliche, but given that, it's an example of how an algorithm can work well.

To define my stupid scenario, a sorted list is what it says it is: A list of things in some sorted order, like numerical or alphabetical. Getting that done with a haystack is not my problem, nor is finding a costume for the needle. But once you've got that setup, say with 4 billion blades of dry grass (including one that's actually a needle), you could find that needle in 32 guesses using binary search. That's the whole point (and beauty) of algorithms: A set of instructions to get something done. For a given task, choosing the right algorithm to use will depend on things like speed, accuracy and memory consumption.

Before last year, any programming I learned was mostly for specific tasks, and I knew I was only scratching the surface enough to make some minimal helpful stuff happen. I hadn't bothered with trying to learn algorithms; it smacked of more math than I might need at the moment, and I got flashbacks of self-learning C in my early 20s and then abruptly stopping when the book brought up the quadratic formula. Or maybe I confused algorithm with logarithm, and high school math. I wasn't even bad at math then; it was just something that I didn't need too much in my line of work, so that muscle atrophied to its limited exercise needs.

But going down the path to learning full-on, I absorbed whatever starter materials I could. From what I'd read, I understood that going to a coding bootcamp sometimes might leave blind spots in theory that a computer science grad might get. But Code Fellows' advanced Python course does cover a swath of algorithms — data structures, searching and sorting methods, time complexity, and when and how to choose the right tool.

image from images-na.ssl-images-amazon.com

In preparing for the course, I'd heard Aditya Bhargava talking about his book Grokking Algorithms. When he explained the 4 billion part, my mind was blown. The book uses Python to illustrate algorithms in code, so I made a point of reading it going into the course. Enough of the material came up that it was indispensable to my understanding of the what, when, how and why of different algorithms.

The list being sorted is crucial to binary search. In searching, each guess is the middle element. If it's too high, the searching range for the next go-round is adjusted downward; the new high end is the item in the list before that last guess — the item that was the old midpoint. If the guess is too low, adjust on the low end: make the starting point for the next search the list item after the current midpoint. On each step, the range to search is cut in half.

Try this guessing game with a friend using the numbers 1 to 10. There's a good chance the first guess will be 5, as a lot of people automatically will go for the middle of the range. (If it's not, I'd bet the guess also happens to be their favorite number.) If the guesses follow this pattern, it'll take three tries to get to the number. That's binary search.

In this situation, 3 plays a mathematical role: it's the logarithm of 10, in base 2. It's base 2 because we're splitting our guessing range in half every time —  the binary part of 'binary search'.

The logarithm part means the inverse of the exponent. Bhargava explains this well:

You may not remember what logarithms are, but you probably know what exponentials are. log10 100 is like asking, “How many 10s do we multiply together to get 100?” The answer is 2: 10 × 10. So log10 100 = 2. Logs are the flip of exponentials.

image from upload.wikimedia.org
Only some programmers will mean an old anime series when they talk about Big O.

For guessing 1-10, noting that finding the answer takes log210 tries isn't just a mathematical parlor trick. It's used as a gauge for time complexity in programming, or how long it'll take your program to run. If we were to inspect each blade of grass in our haystack, it'd take until past death to pick up and look at 4 billion items. With binary search, we narrow each guess by half. If you divide 4 billion again and again and again, after 32 times you get to 1.

Big O notation is used to classify algorithms according to their running time. While a mathematician would call these two searches we've done as log210 and log24e9, programmers drop the subscripted 2, assuming we're talking about base 2. To generalize the classification, they also drop the number of items in the list to any number n, with n being the number of items in the list. So the Big O notation for binary search is simply 'O(log n)'. In the case of inspecting each item in the list, it's 'O(n)'.

Before starting the course, I ported Adit's binary search function to Ruby, and then extended both that and his to run from the command line with user input for numbers. They make the list and pick a random number, and then after performing the iterations to search, report back how many guesses it took, as well as the actual calculation of log(n) so you can compare how close the program was to the expected number of tries. That code is here. Try it, it's usually only a couple few guesses off.

It could also use some tests to complete it. I welcome any pull requests, but it's on the back burner for me until after the course is over.


To hear Bhargava explain binary search and other algorithms covered in his book, listen to this episode of the Talk Python to Me podcast. 

Thanks to Flickr user Maxi Walton for the photo. Wish I knew how to put a link in a Typepad caption.