Algorithms For Dummies. John Paul Mueller
Чтение книги онлайн.
Читать онлайн книгу Algorithms For Dummies - John Paul Mueller страница 20
1 Count each simple operation (arithmetic ones) as a time step.
2 Break complex operations into simple arithmetic operations and count time steps as defined in Step 1.
3 Count every data access from memory as one time step.
To perform this accounting, you write a pseudocode version of your algorithm (as mentioned in Chapter 1) and perform these steps using paper and pencil. In the end, it’s a simple approach based on a basic idea of how computers work, a useful approximation that you can use to compare solutions regardless of the power and speed of your hardware or the programming language you use.
Using a simulation is different from running the algorithm on a computer because you use a standard and predefined input. Real computer measurements require that you run the code and verify the time required to run it. Running code on a computer is actually a benchmark, another form of efficiency measurement, in which you also account for the application environment (such as the type of hardware used and the software implementation). A benchmark is useful but lacks generalization. Consider, for instance, how newer hardware can quickly execute an algorithm that took ages on your previous computer.
Getting even more abstract
If you thought things were abstract before, this section makes those previous sections seem concrete, but grit your teeth and move on because you really are up to the task! Measuring a series of steps devised to achieve a solution to a problem poses quite a few challenges. The previous section discusses counting time steps (number of operations), but sometimes you also need to compute space (such as the memory an algorithm consumes). You consider space when your problem is greedy for resources. Depending on the problem, you may consider an algorithm to be better when it works efficiently with regard to one of these resource consumption aspects:
Running time
Computer memory requirements
Hard-disk usage
Power consumption
Data-transmission speed in a network
Some of these aspects relate to others in an inverse manner, so if, for instance, you want speedier execution time, you can sometimes increase memory or power consumption to get it. Not only can you have different efficiency configurations when running an algorithm, you can also change the hardware characteristics and software implementation to accomplish your goals. In terms of hardware, using a supercomputer or a general-purpose computer does matter, and the software, or language used to write the algorithm, is definitely a game changer. In addition, the quantity and kind of data you feed the algorithm could result in better or worse performance measurements.
RAM simulations count time because when you can employ a solution in so many environments, and its resource usage depends on so many factors, you have to find a way to simplify comparisons so that they become standard. Otherwise, you can’t compare possible alternatives. The solution is, as so often happens with many other problems, to use a single measure and say that one size fits all. In this case, the measure is time, which you make equal to the number of operations, that is, the complexity of the algorithm.
A RAM simulation places the algorithm in a situation that’s both language and machine-agnostic (it’s independent of programming language and computer type). However, explaining how a RAM simulation works to others requires quite an effort. The analysis of algorithms proposes to use the number of operations you get from a RAM simulation and turn them into a mathematical function expressing how your algorithm behaves in terms of time, which is a quantification of the steps or operations required when the number of data inputs grows. For instance, if your algorithm sorts objects, you can express complexity using a function that reports how many operations it needs depending on the number of objects it receives.
Working with functions
A function in mathematics is simply a way to map some inputs to a response. Expressed in a different way, a function is a transformation (based on math operations) that transforms (maps) your input to an answer. For certain values of input (usually denoted by the letters x or n), you have a corresponding answer using the math that defines the function. For instance, a function like f(n) = 2n
tells you that when your input is a number n, your answer is the number n multiplied by 2.
A function describing how an algorithm relates its solution to the quantity of data it receives is something you can analyze without specific hardware or software support. It's also easy to compare with other solutions, given the size of your problem. Analysis of algorithms is really a mind-blowing concept because it reduces a complex series of steps into a mathematical formula.
In most cases, an analysis of algorithms isn’t interested in defining the function exactly. What you really need is a comparison of a target function with one or more other functions. These comparison functions appear within a set of proposed functions that perform poorly when contrasted to the target function. In this way, you don’t have to plug numbers into functions of greater or lesser complexity; instead, you deal with simple, premade, and well-known functions. It’s more effective and is similar to classifying the performance of algorithms into categories, rather than obtaining an exact performance measurement.
The set of generalized functions is called Big O notation, and in this book, you often encounter this small set of functions (put into parentheses and preceded by a capital O) used to represent the performance of algorithms. Figure 2-1 shows the analysis of an algorithm. A Cartesian coordinate system can represent its function as measured by RAM simulation, where the abscissa (the x coordinate) is the size of the input and the ordinate (the y coordinate) is its resulting number of operations. You can see three curves represented. Input size matters. However, quality also matters (for instance, when ordering problems, it’s faster to order an input that’s already almost ordered). Consequently, the analysis shows a worst case, f1(n)
, an average case, f2(n)
, and a best case, f3(n)
. Even though the average case might give you a general idea, what you really care about is the worst case, because problems may arise when your algorithm struggles to reach a solution. The Big O function is the one that, after a certain n0
value (the threshold for considering an input big), always results in a larger number of operations given the same input than the worst-case function f1
. Thus, the Big O function is even more pessimistic than the one representing your algorithm, so no matter the quality of input, you can be sure that things can’t get worse than that.
FIGURE 2-1: Complexity of an algorithm in case of best, average, and worst input case.
Many possible functions can result in worse results, but the choice of functions offered by the Big O notation that you can use is restricted because its purpose is to simplify complexity measurement by proposing a standard. This section contains just the few functions that are part of the Big O notation. The following list describes them in growing order of complexity:
Constant