The lunch selection problem has two characteristic features: (1) the number of possibilities to be considered grows very quickly, and (2) known algorithms work only if they examine all or most of these possibilities. Problems like these are called intractable because algorithms take too long to be of practical use for all but very simple cases.
But this doesn’t stop us from ordering lunches, planning vacations, and making other selections we are often quite happy with. We employ approximation algorithms, which are efficient algorithms that do not necessarily compute an exact solution for the problem but a solution that is just good enough. For example, a very simple approximation for the lunch problem is to split the total budget among all members of the lunch party and have each person find a selection that fits.
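To make this concrete, here is a small Python sketch of the budget-splitting approximation (the menu prices and the greedy, most-expensive-first way each person fills their share are illustrative assumptions of mine, not part of the original problem):

```python
# Budget-splitting approximation for the lunch problem: give everyone an
# equal share and let each person fill it independently. Filling greedily,
# most expensive item first, is just one simple way to "find a selection
# that fits"; it is fast but not guaranteed to be optimal for the group.

def greedy_personal_selection(menu_prices, share):
    """Pick menu items (repeats allowed) whose total price fits the share."""
    selection, remaining = [], share
    for price in sorted(menu_prices, reverse=True):
        while price <= remaining:
            selection.append(price)
            remaining -= price
    return selection

def split_budget_lunch(menu_prices, total_budget, num_people):
    share = total_budget / num_people
    return [greedy_personal_selection(menu_prices, share) for _ in range(num_people)]

# Example: three people, a $30 budget, and a small made-up menu.
print(split_budget_lunch([4.50, 7.25, 9.00], total_budget=30, num_people=3))
```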
The difficulty of finding optimal selections can also be exploited. Insurance companies, for example, can offer overwhelmingly many choices, making it difficult for customers to find the optimal one and thereby earning the company more money.
7
Mission Intractable
The algorithms discussed in previous chapters exhibited a range of different runtimes. For example, finding the smallest element in an unsorted list takes linear time, while it takes only constant time in a sorted list. Similarly, it takes linear time to find a specific element in an unsorted list, whereas this can be done in logarithmic time in a sorted array or a balanced binary search tree. In both cases the precomputed sorted data structure makes a difference. But different algorithms can also have different runtimes for the same input. For example, selection sort is a quadratic algorithm, whereas mergesort has linearithmic runtime.
An algorithm with quadratic runtime might be too slow to be useful in practice. Consider the task of sorting the names of all 300 million residents of the United States. Executed on a computer that can perform a billion operations per second, selection sort would take about 90 million seconds, or about 2 years and 10 months, which is rather impractical. In contrast, the linearithmic mergesort finishes the same task in less than 10 seconds. But if we are dealing with inputs of moderate size, maybe we don’t have to worry too much about runtime complexity, especially since computers get faster every year.
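These estimates can be reproduced with a quick back-of-the-envelope calculation, counting roughly n² steps for selection sort and n·log₂ n steps for mergesort (the usual simplifications rather than exact instruction counts):

```python
import math

SPEED = 1_000_000_000   # steps per second
n = 300_000_000         # names to sort

selection_sort = n**2 / SPEED          # about 90 million seconds
mergesort = n * math.log2(n) / SPEED   # about 8.5 seconds

print(f"selection sort: {selection_sort:,.0f} s "
      f"(~{selection_sort / (365 * 24 * 3600):.1f} years)")
print(f"mergesort:      {mergesort:.1f} s")
```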
As an analogy consider the range of transportation solutions to different traveling problems. For example, to go to work, you may be able to ride a bike, take the bus, or drive a car. To cross the Atlantic, none of these solutions will do, and you have to take a cruise ship or an airplane. You could in principle also cross the Atlantic in a kayak, but the time (and other resources) it would take to actually do that makes it close to impossible in practice.
Similarly, there are computational problems that have solutions in principle but take too long to compute in practice. This chapter discusses examples of that sort. I present some problems that can be solved (so far) only by algorithms that have exponential runtime, that is, algorithms whose runtimes grow exponentially with the size of the input. Algorithms with exponential runtime are unusable in practice for all but very small inputs, which makes the question of lower bounds and whether faster algorithms exist particularly important. This question is at the core of the P = NP problem, a prominent problem in computer science that is still open.
As in the case of sorting, a result about the limits of computer science that looks at first like a disappointment doesn’t have to be. Even if no efficient algorithms can be developed for a particular problem, that doesn’t mean we have to give up on such problems altogether. In particular, we can devise approximation algorithms that compute solutions that are not exact but good enough. Moreover, the fact that a particular problem cannot be solved in practice can sometimes be exploited for solutions to other problems.
While large inputs reveal the difference between a quadratic or a linearithmic algorithm, a quadratic algorithm may work fine for small inputs. For example, sorting a list of 10,000 elements with selection sort takes about one-tenth of a second on a computer that can perform a billion operations per second. While the runtime is barely noticeable in this case, an increase in the list’s size by a factor of ten means an increase of the algorithm’s runtime by a factor of one hundred. Thus, the runtime for sorting lists of 100,000 elements increases to about 10 seconds. This may not be acceptable anymore, in particular, in situations where users are interacting with a system and expect instant responses. However, the next generation of computers may remedy the situation, and the algorithm may again become usable. Technological advances may suffice to push the limit of a quadratic algorithm. Unfortunately, we cannot rely on this effect to make an algorithm with exponential runtime usable.
Tipping the Scale
At the beginning of Raiders of the Lost Ark, Indiana Jones explores a cave in search of a precious golden idol, the target of his expedition. The idol rests on a scale that is designed to trigger a series of deadly traps in case the idol is removed. To circumvent the protective mechanism, Indiana Jones replaces the idol by a bag of sand that he hopes approximates the idol’s weight. Alas, the bag is too heavy and triggers the traps, which leads to a spectacular escape from the deadly cave.
Had Indiana Jones known the exact weight of the idol, he could have filled the bag with the precise amount of sand, and his exit from the cave would have been much less dramatic. But since he was probably not carrying around a scale, he needed to gauge the weight in some other way. Fortunately, it is not too difficult to build a balance scale. Basically all you need is a stick; attach the sand bag to one end and the exact weight at the other, and then fill the sand bag with sand until the stick is balanced. Assuming that Indiana Jones doesn’t have an object that has the exact same weight as the idol, he has to approximate the weight with a collection of objects. This doesn’t seem like such a difficult problem. If the idol has an estimated weight of, say, 42 ounces (which is about 2.6 pounds or 1.2 kg) and Indiana Jones has six objects that weigh 5, 8, 9, 11, 13, and 15 ounces, respectively, then after trying several combinations he will find that the 5, 9, 13, and 15 ounce objects add up exactly to 42 ounces.
But how exactly does the approach of trying several combinations work, and how long does it take? In the example, the heaviest object weighs less than half of the idol, which means we need at least three objects to get to the idol’s weight. However, it is not immediately clear which three objects to pick and whether we need a fourth or fifth object as well. Moreover, an algorithm must work for any possible situation and thus be able to handle as input different numbers of objects of different weights as well as arbitrary target weights.
A straightforward method for solving the weighing problem is to systematically form all combinations of objects and check for each combination whether the total weight is equal to the target weight. This strategy is also called generate-and-test, since it consists of the repeated execution of the following two steps: (1) generate a potential solution, and (2) test whether the potential solution actually is a solution. In this case, the generating step produces a combination of objects, and the testing step adds the weights of the objects and compares the sum with the target weight. It is important that the generating step be repeated systematically and cover all cases because otherwise the algorithm could miss the solution.
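As a small Python sketch, generate-and-test for the weighing problem might look as follows; the weights and target come from the Indiana Jones example, and enumerating combinations by size with itertools is just one systematic order that guarantees nothing is missed:

```python
from itertools import combinations

# Generate-and-test for the weighing problem: systematically generate every
# combination of objects and test whether its total weight equals the target.

def find_exact_weight(weights, target):
    for size in range(len(weights) + 1):          # generate ...
        for combo in combinations(weights, size):
            if sum(combo) == target:              # ... and test
                return combo
    return None  # no combination adds up to the target

# The example from the text: six objects and a 42-ounce idol.
print(find_exact_weight([5, 8, 9, 11, 13, 15], 42))   # (5, 9, 13, 15)
```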
The runtime of the generate-and-test method depends on how many combinations of objects can be formed. To understand how many combinations exist, let us consider how any one particular combination is formed from all the available objects. To this end, take an arbitrary combination such as 5, 9, and 11. For every available object we can ask whether it is contained in this particular combination. For example, 5 is contained in it, but 8 is not; 9 and 11 are elements of the combination, but the elements 13 and 15 are not. In other words, any possible combination is given by a decision for each element to either include or exclude it, and all these decisions are independent of each other.
We can compare the process of forming a combination with that of filling out a questionnaire that contains the available objects, each with a check box next to it. One combination corresponds to a questionnaire in which the boxes of the selected objects are checked. The number of combinations is then identical to the number of different ways the questionnaire can be filled out. And here we see that each box can be checked or not, independently of all the other boxes.
Therefore, the number of possible combinations is given by the product of the available choices, which is 2 for selecting or not selecting an object (or equivalently, checking or not checking a box). Since Indiana Jones has six objects to choose from, the number of combinations is 2 multiplied by itself six times, that is, 2 × 2 × 2 × 2 × 2 × 2 = 2⁶ = 64.¹ Since the generate-and-test algorithm has to test each combination, its runtime grows at least at the same rate. In fact, it is even more time consuming, since testing one combination requires adding all the weights and comparing the sum with the target weight.
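The checkbox argument is easy to verify: two independent choices per object give 2⁶ subsets for six objects, which is exactly the number of combinations a systematic enumeration produces:

```python
from itertools import combinations

weights = [5, 8, 9, 11, 13, 15]
print(2**len(weights))   # 64: two choices (in or out) per object
print(sum(1 for size in range(len(weights) + 1)
            for _ in combinations(weights, size)))   # also 64
```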
When Runtime Explodes
While 64 combinations don’t seem too bad, the important aspect about the algorithm’s runtime is how fast it grows with larger input. As explained in chapter 2, the runtime of algorithms is measured as a growth rate rather than in absolute time, since this provides a more general characterization that is independent of specific problem examples and computer performance characteristics.
The latter aspect is sometimes given as an excuse for using algorithms with bad runtimes. The argument goes like this: “Yes, I know the algorithm takes several minutes to complete, but wait until we get the new, faster computer. That will resolve this issue.” This argument has some validity to it. Moore’s law says that the speed of computers doubles about every 18 months.²
When the speed of a computer doubles, a quadratic algorithm can handle an input that is larger by a factor of about 1.4.³ Now consider what happens with the runtime of an exponential algorithm in such a case. How much can the input be increased so that the runtime does not increase by more than a factor of two? Since the runtime of the algorithm doubles if the input is increased by 1, this means the algorithm can deal with only one additional element. In other words, we need to double the speed of a computer to be able to process input with only one more element.
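The arithmetic behind these two claims, under the usual simplification that the runtimes are exactly n² and 2ⁿ steps, can be checked in a few lines:

```python
from math import sqrt

n = 1000                      # some current input size
budget = 2 * n**2             # a machine twice as fast affords twice the steps
m = sqrt(budget)              # largest input the quadratic algorithm now handles
print(m / n)                  # 1.414...: input grows by a factor of about 1.4

# For 2**n steps, doubling the budget gives 2 * 2**n == 2**(n + 1):
# the exponential algorithm can handle exactly one additional element.
print(2 * 2**n == 2**(n + 1))   # True
```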
Table 7.1 Approximate runtimes for different sizes of input on a computer that can perform a billion steps per second.

Input Size     Linear          Linearithmic    Quadratic      Exponential
20                                                            0.001 second
50                                                            13 days
100                                                           too long
1,000                                                         too long
10,000                                         0.1 second     too long
1 million      0.001 second    0.002 second    16 minutes     too long
1 billion      1 second        30 seconds      32 years       too long

Note: Blank cells indicate runtimes of less than 1 millisecond, too small to be meaningful for human perception of time, and thus of no practical significance. Cells marked “too long” indicate runtimes too immense to comprehend.
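The flavor of table 7.1 can be reproduced with a small script that charges n, n·log₂ n, n², and 2ⁿ steps for the four kinds of algorithms and divides by a billion steps per second; the rounding and printing thresholds below are my own, so the formatting differs slightly from the table:

```python
import math

SPEED = 1e9   # steps per second

def pretty(steps):
    seconds = steps / SPEED
    if seconds < 0.001:
        return "-"                                  # blank cell: below a millisecond
    if seconds < 60:
        return f"{seconds:.3f} s"
    if seconds < 3600:
        return f"{int(seconds // 60)} min"
    if seconds < 86400:
        return f"{seconds / 3600:.0f} hours"
    if seconds < 365 * 86400:
        return f"{seconds / 86400:.0f} days"
    if seconds < 1000 * 365 * 86400:
        return f"{seconds / (365 * 86400):.0f} years"
    return "too long"                               # too immense to comprehend

for n in [20, 50, 100, 1_000, 10_000, 1_000_000, 1_000_000_000]:
    exponential = 2.0**n if n <= 1000 else float("inf")   # avoid float overflow
    row = [n, n * math.log2(n), float(n)**2, exponential]
    print(f"{n:>13,}  " + "  ".join(pretty(steps) for steps in row))
```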
Since the runtime of the algorithm doubles so fast, technological improvements that increase computing power by some fixed factor are simply not sufficient to scale an exponential algorithm to cope with significantly larger inputs. Note that a larger factor for increasing the speed, say ten, does not make much of a difference. While this would allow a quadratic algorithm to handle input that is three times as large, an exponential algorithm would be able to deal with input that is increased by only 3, since this would increase the runtime by a factor of 2 × 2 × 2 = 2³ = 8.
Table 7.1 illustrates the huge gap between algorithms with nonexponential and exponential runtimes in handling different input sizes: Runtimes for nonexponential algorithms start to be noticeable only for inputs larger than 1,000, whereas exponential algorithms handle inputs of 20 and below quite well. But for inputs of 100, exponential algorithms take more than 400 billion centuries to run, which is 2,900 times the age of our universe.
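The 400-billion-centuries figure can be checked directly:

```python
seconds = 2**100 / 1e9                       # 2**100 steps at a billion per second
centuries = seconds / (100 * 365 * 24 * 3600)
print(f"{centuries:.2e} centuries")          # about 4e11, i.e., 400 billion centuries
print(centuries / (13.8e9 / 100))            # roughly 2,900 times the age of the universe
```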
The overwhelming effect of exponential growth is analogous to the explosion of a nuclear bomb, whose effect is due to minuscule amounts of energy that are released when atoms are split.⁴ The devastating destruction is caused by the fact that many of these fissions occur in a very short period of time—every split atom leads to two (or more) fissions in short succession—an exponential growth of fissions and correspondingly released energy.
Or consider the legend about the peasant who is said to have invented the game of chess. The king was so pleased that he granted the peasant any wish. The peasant asked for one grain of rice to be put on the first square, two on the second, four on the third, and so on, until all the squares were covered. Not realizing the nature of exponential growth, the king thought this wish was easy to fulfill and promised to grant it. Of course, he couldn’t keep his promise, since the number of rice grains needed to cover the board is larger than 18 billion billion, which is more than 500 times the rice production of the entire world in 2014.
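A two-line check confirms the size of the king’s promise: doubling from a single grain across 64 squares adds up to 2⁶⁴ − 1 grains.

```python
grains = sum(2**square for square in range(64))   # 1 + 2 + 4 + ... for 64 squares
print(f"{grains:,}")            # 18,446,744,073,709,551,615
print(grains == 2**64 - 1)      # True: about 18 billion billion grains
```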
The immense gap between the size of inputs that exponential and nonexponential algorithms can cope with justifies distinguishing between practical algorithms (those that have less than exponential runtime) and impractical algorithms (those that have exponential runtime or worse). Algorithms with exponential runtime cannot be considered practical solutions to problems because they take too long to compute results for all but small inputs.
Shared Destiny
The generate-and-test algorithm for solving the weighing problem works only for relatively small inputs (less than 30 or so), which is fine for the specific problem Indiana Jones faces. But because of the exponential growth of its runtime, the algorithm will never be able to deal with inputs of size 100 or larger. This particular algorithm’s exponential runtime does not mean that there can’t be other, more efficient algorithms with nonexponential runtimes. Currently, however, no such algorithm is known.
A problem that can only be solved by algorithms with exponential (or worse) runtime is called intractable. Since we only know algorithms with exponential runtime for the weighing problem, it may seem that it is intractable, but we don’t really know for sure. Maybe there is a nonexponential algorithm that just hasn’t been discovered yet. If we could prove that no such nonexponential algorithm can exist for the problem, then we would know the problem is intractable. A lower bound could provide us with certainty.
But why should we care about this? Maybe computer scientists should just let go of this problem and move on to investigate other questions. However, the weighing problem is very similar to a range of other problems that all share the following two intriguing properties. First, the only known algorithms for solving the problem have exponential runtime, and second, if a nonexponential algorithm were found for any one of the problems, it would immediately lead to nonexponential algorithms for all the others. Any problem that is a member of this exclusive club is called NP-complete.⁵ The significance of NP-complete problems comes from the fact that many practical problems are indeed NP-complete, and they all share the fate of potentially being intractable.
Toward the end of the adventure The Kingdom of the Crystal Skull, Indiana Jones’s companion Mac collects treasure items in the temple of the crystal skull. He tries to carry as many items as he possibly can while maximizing their total value. This problem, the weighing problem, and the group lunch problem are all instances of the knapsack problem, which is named for the task of filling a knapsack of limited capacity with as many items as possible while maximizing some value or utility of the packed items. In the weighing problem the knapsack is the scale, its limit is the weight to be measured, the packing of items amounts to placing objects on the scale, and the optimization objective is to get as close as possible to the target weight. For Mac’s selection problem the limit is what he can carry, packing means to select items, and the optimization is maximizing the items’ total value. In the lunch problem the limit is the total amount of cash available to purchase items, the packing is the selection of items, and the optimization is maximizing the value of the menu selection.
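The knapsack framing can be written down with the same generate-and-test idea as before: form every combination of items, discard those that exceed the capacity, and keep the most valuable of the rest. The treasure weights and values below are made up for illustration; as with the weighing problem, the number of combinations doubles with every additional item.

```python
from itertools import combinations

def brute_force_knapsack(items, capacity):
    """items: list of (weight, value) pairs; returns (best value, best combination)."""
    best_value, best_combo = 0, ()
    for size in range(len(items) + 1):                     # generate every combination ...
        for combo in combinations(items, size):
            weight = sum(w for w, v in combo)
            value = sum(v for w, v in combo)
            if weight <= capacity and value > best_value:  # ... and test it
                best_value, best_combo = value, combo
    return best_value, best_combo

# Mac's problem in miniature: made-up treasures, and 20 pounds he can carry.
treasures = [(12, 40), (7, 30), (5, 25), (9, 20)]      # (weight, value)
print(brute_force_knapsack(treasures, capacity=20))    # (70, ((12, 40), (7, 30)))
```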
Incidentally, Mac and Indiana Jones both fail: Mac spends too much time selecting the items and dies in the crumbling temple, and Indiana Jones’s sandbag triggers the deadly traps, although he ultimately manages to escape them. It’s unclear whether their failure is due to the NP-completeness of the problems they try to solve, but it is a nice reminder of intractability.
The knapsack problem with its applications is only one of many NP-complete problems. Another well-known example is the traveling salesman problem, which asks for a round trip connecting a number of cities that minimizes the length of the trip. A simple method is to generate all possible round trips and find the shortest one. The complexity of this algorithm is also exponential, but it is much worse than that of the algorithm for the weighing problem. For example, while the weighing algorithm can complete a problem of size 20 in 1 millisecond, computing a round trip for 20 cities would take 77 years. Finding round trips is not just useful for traveling salesmen but has many other applications, ranging from planning school bus routes to scheduling tours of cruise ships. Many other optimization problems are also NP-complete.
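The generate-all-round-trips method mentioned above can be sketched for a handful of cities with made-up distances; fixing the starting city avoids generating mere rotations of the same trip, but the number of trips still grows factorially with the number of cities:

```python
from itertools import permutations

distances = {                                 # made-up pairwise distances
    ("A", "B"): 3, ("A", "C"): 7, ("A", "D"): 4,
    ("B", "C"): 2, ("B", "D"): 6, ("C", "D"): 5,
}

def dist(x, y):
    return distances.get((x, y)) or distances[(y, x)]

def shortest_round_trip(cities):
    start, rest = cities[0], cities[1:]
    best_trip, best_length = None, float("inf")
    for order in permutations(rest):                            # generate every trip ...
        trip = (start,) + order + (start,)
        length = sum(dist(a, b) for a, b in zip(trip, trip[1:]))
        if length < best_length:                                # ... keep the shortest
            best_trip, best_length = trip, length
    return best_trip, best_length

print(shortest_round_trip(["A", "B", "C", "D"]))   # (('A', 'B', 'C', 'D', 'A'), 14)
```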
The intriguing property of NP-complete problems is that either all of them are intractable or all of them can be solved by algorithms with nonexponential runtimes. One way of showing that a problem is NP-complete is to demonstrate that a solution for the problem could be transformed (in nonexponential time) into a solution for another problem that is already known to be NP-complete. Such a transformation of a solution is called a reduction. Reductions are clever ways of problem solving by transforming a solution for one problem into a solution for another. Reductions occur quite frequently, and sometimes implicitly, as when Indiana Jones reduces the problem of finding safe tiles to the spelling of the name Iehova, or when Sherlock Holmes reduces the task of maintaining information about suspects to operations on a dictionary.