by Sven Woltmann – May 14, 2021

We developers are often faced with determining the position of a particular element in a sorted array (or in a list). The most straightforward approach would be to traverse the array from left to right, matching each element with the element we are looking for. This is called a "linear search".

"Binary search" is much faster. In this article, you will learn:

- How does binary search work?
- How to implement binary search in Java (recursive and iterative)?
- Which binary search functions does the JDK provide?
- How fast is binary search compared to linear search?
- When does it make sense to run a binary search in a
`LinkedList`

?

You can find the source code for the article in this GitHub repository.

In the past, if we wanted to translate an unknown word, we didn't have an app for that. We had to look it up in a dictionary. In theory, we could search every page from the top left to the bottom right for the specific word, from front to back.

If we were lucky, we would find the word on the first pages of the book. If we're unlucky, we won't find it until near the end of the book – or not at all (we wouldn't find that out until the very last page). Even with words that are relatively far in front (such as "binary search"), we would have to search for quite a while this way.

This approach is called "linear search". The following image shows a simplified example with numbers instead of words. We want to find the position of the number 61 in the array shown.

In this simplified example, we need six steps to find the 61.

Of course, no one would look in a dictionary in this way. Instead, we open the book in the middle and see whether the word comes alphabetically before or after it. We thus know in which half of the book the word is located and can ignore the other half. After that, we search the middle again and narrow the search area to half once more (i.e., a quarter in total). With each additional search step, we halve the number of remaining pages. This way, we get to the target page – and on the target page to the word we are looking for – in relatively few steps.

We call this a "binary search". The following image clearly shows that the search leads to the result much faster than the linear search:

With binary search, we only need three steps:

- In the first step, we compare the searched value 61 with the middle element 36. 61 is larger, so it must be to the right of 36.
- In the second step, we compare 61 with the middle element of the right subarray, 79. The value we are looking for is smaller, so it must be to the left of 79.
- There is only one element between 36 and 79. We have to compare this element with the searched element again. In this example, we have found the searched element 61. However, there could have been another number between 36 and 79. This would have meant that the array does not contain 61 at all.

Of course, binary search only makes sense if the words in the dictionary are sorted (like the numbers in the example). If the words were printed in random order, we would have no choice but to search word by word – that is, linearly.

In the following pseudocode, we denote the element we are looking for by "key".

- Determine the middle position of the array range to be searched.
- Read the element at the middle position.
- Compare the key with the middle element:
- If the key is
*equal to*the middle element, then we have reached our goal. Return the middle position as result. - If the key is smaller than the middle element, perform a binary search in the subarray to the left of the middle position. However, if this subarray has a length of 0, the search ends without a result.
- If the key is greater than the middle element, perform a binary search in the subarray to the right of the middle position. However, if this subarray has a length of 0, the search ends without a result.

- If the key is

We can implement binary search recursively or iteratively.

The pseudocode for binary search from the previous chapter suggests a recursive implementation.

The recursive implementation in Java for an array of `int`

primitives looks like this:

```
public static int binarySearchRecursively(int[] array, int key) {
return binarySearchRecursively(array, 0, array.length, key);
}
public static int binarySearchRecursively(
int[] array, int fromIndex, int toIndex, int key) {
if (toIndex <= fromIndex) return -1;
int mid = (fromIndex + toIndex) >>> 1;
int midVal = array[mid];
if (key == midVal) {
return mid;
} else if (key < midVal) {
return binarySearchRecursively(array, fromIndex, mid, key);
} else {
return binarySearchRecursively(array, mid + 1, toIndex, key);
}
}
```

Code language: Java (java)

You can find the code in the GitHub repository in the class BinarySearch starting at line 12. Corresponding unit tests are provided in BinarySearchTest.

It is important to calculate the middle position `mid`

with an "unsigned right shift":

`int mid = (fromIndex + toIndex) >>> 1 `

And not as follows:

`int mid = (fromIndex + toIndex) / 2`

In case the sum is greater than `Integer.MAX_VALUE`

, the second variant would lead to an overflow or a "roll over", and the result would be a negative number.

Without the `>>>`

operator, the following method would also be correct:

`int mid = fromIndex + (toIndex - fromIndex) / 2;`

But that is nowhere near as cool ;-)

Recursion requires additional CPU cycles and additional memory on the heap. Therefore, iterative implementations are usually preferable.

The corresponding iterative Java implementation for an `int`

array looks like this:

```
public static int binarySearchIteratively(int[] array, int key) {
return binarySearchIteratively(array, 0, array.length, key);
}
public static int binarySearchIteratively(
int[] array, int fromIndex, int toIndex, int key) {
int low = fromIndex;
int high = toIndex;
while (low < high) {
int mid = (low + high) >>> 1;
int midVal = array[mid];
if (key == midVal) {
return mid;
} else if (key < midVal) {
high = mid;
} else {
low = mid + 1;
}
}
return -1;
}
```

Code language: Java (java)

The variables `low`

and `high`

are not absolutely necessary here. You could also change `fromIndex`

and `toIndex`

within the `while`

loop. However, reassigning method parameters is usually considered unclean design.

You can also find this code in the BinarySearch class starting at line 52 and the unit tests in BinarySearchTest starting at line 64.

Of course, we do not have to implement binary search in arrays ourselves. The JDK provides appropriate methods for arrays of all primitive data types and for object arrays in the `java.util.Arrays`

class. It also provides a method for binary search in lists in the `java.util.Collections`

class.

For example, in an `int`

array we can search as follows:

```
int[] array = new int[] {10, 19, 23, 25, 36, 61, 79, 81, 99};
int posOf36 = Arrays.binarySearch(array, 36);
```

Code language: Java (java)

In a corresponding `ArrayList`

of `Integer`

objects we can search as follows:

```
List<Integer> list = new ArrayList<>(List.of(10, 19, 23, 25, 36, 61, 79, 81, 99));
int posOf36 = Collections.binarySearch(list, 36);
```

Code language: Java (java)

Note: The `Collections.binarySearch()`

method can be invoked for any class that implements the `List`

interface. Thus, for example, also for `LinkedList`

.

In a linked list, however, a specific element cannot be accessed directly, but only by iteration. That brings us (almost) back to linear search. More about this – and why binary search on a `LinkedList`

can still be useful – you'll find out in the next chapter.

In binary search, we halve the number of entries left to search with each search step. Or the other way around: if the number of entries doubles, we only need one more search step.

This corresponds to logarithmic effort, i.e., *O(log n)*.

You can learn more about big O notation here: Big O Notation and Time Complexity – Easily Explained

We can verify the theoretically derived time complexity with the program BinarySearchRuntime from the GitHub repository. The program generates random arrays with 10,000 to 200,000,000 elements and searches them for a randomly selected element.

Since the times are in the nanosecond range, each measurement consists of searches for 100 different keys. The measurement is repeated 100 times for each array size; then, the median is printed. The following graph shows the average runtime in relation to the array size:

The logarithmic progression can be seen very well.

With linear search, the best case is finding the element we are looking for in the first step. In the worst case, we have to search the entire array. In the average case, half of the entries. With *n* entries, that is *n/2* search steps. The duration of the search increases linearly with the number of entries. We say:

The time complexity of the linear search is *O(n)*.

We can measure the runtime of linear search with the LinearSearchRuntime program. The following image shows the comparison of the runtimes of binary and linear search. I had to reduce the range to 100,000 elements to be able to recognize at least a minimal increase of the measured values for the binary search:

We can see the linear time of the linear search very nicely. It is also apparent that the binary search is orders of magnitude faster than the linear search.

Due to the higher complexity of the binary search code, linear search can be faster for small arrays. The following diagram shows a section of the comparison of run times for up to 500 elements. Each measurement point is the median of 100 measurements with 10,000 repetitions each.

That confirms the assumption. For arrays up to a maximum of about 230 elements, linear search is faster than binary search. Of course, this is not a general statement but applies only to my laptop and the JDK I currently use.

You can once again nicely see the linear time – *O(n)* – compared to the logarithmic time – *O(log n)*.

In the chapter Binary Search in the JDK, I mentioned that the `Collections.binarySearch()`

method can also be applied to a `LinkedList`

. `Collections.binarySearch()`

distinguishes internally between lists that implement the `RandomAccess`

interface, such as `ArrayList`

, and other lists. For lists with "random access", a regular binary search is performed.

To access the middle element in lists without random access, we would have to follow the elements from the beginning to the middle, element by element. From there, we would again reach the center of the left or right half by following the list, element by element. The following diagram should illustrate this:

For example, to find the position of 19, we would first have to follow the orange arrows to the center, then the blue arrows back to 23, and finally the yellow arrow to 19.

That works only with a doubly linked list. For iterating left in a singly linked list, you would have to jump back to the beginning and, from there, follow the arrows to the right again.

No matter if singly or doubly linked – in any case, we have to iterate over more elements than with linear search. While we have an average of *n/2* search steps in the linear search in total, we already iterate over *n/2* elements to reach the middle in the first step of the binary search. In the second step, we iterate over *n/4* elements; in the third step, we iterate over *n/8* elements, and so on.

So at first glance, binary search makes absolutely no sense in a `LinkedList`

.

Nevertheless, binary search in a `LinkedList`

can be faster than linear search. Although we have to *iterate* over more elements (as shown in the previous section) – the number of *comparisons* remains in the order of *O(log n)*!

Depending on the cost of the comparison function – which can be significantly higher for an object than for a primitive data type – this can make a considerable difference. So if you ever need to search in a `LinkedList`

, it's worth trying binary search with `Collections.binarySearch()`

and comparing it to linear search.

This article has shown the principle of binary search and its advantages over linear search for sorted arrays and lists. I demonstrated the theoretically derived time complexity on an example. I also showed that binary search could be useful for a doubly linked list.

A very similar technique is the search in a binary search tree.

If you liked the article, feel free to leave me a comment or share it using one of the share buttons at the end. If you want to be informed when the next articles are published, sign up for my newsletter using the form below.

About the author

I'm a freelance software developer with more than two decades of experience in scalable Java enterprise applications. My focus is on optimizing complex algorithms and on advanced topics such as concurrency, the Java memory model, and garbage collection.
Here on HappyCoders.eu, I want to help you become a better Java programmer. Read more about me here.