
Chapter 2 Algorithm Analysis

2.1 Mathematical Background


Definition 2.1.

T(N) = O(f(N)) if there are positive constants c and n₀ such that T(N) ≤ c·f(N) when N ≥ n₀.



Definition 2.2.

T(N) = Ω(g(N)) if there are positive constants c and n₀ such that T(N) ≥ c·g(N) when N ≥ n₀.



Definition 2.3.

T(N) = Θ(h(N)) if and only if T(N) = O(h(N)) and T(N) = Ω(h(N)).



Definition 2.4.

T(N) = o(p(N)) if, for all positive constants c, there exists an n₀ such that T(N) < c·p(N) when N > n₀. Less formally, T(N) = o(p(N)) if T(N) = O(p(N)) and T(N) ≠ Θ(p(N)).


When we say that T(N) = O(f(N)), we are guaranteeing that the function T(N) grows at a rate no faster than f(N); thus f(N) is an upper bound on T(N). Since this implies that f(N) = Ω(T(N)), we say that T(N) is a lower bound on f(N).

Function    Name
c           Constant
log N       Logarithmic
log² N      Log-squared
N           Linear
N²          Quadratic
N³          Cubic
2^N         Exponential
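To make the ordering in the table above concrete, here is a small sketch (the class and method names are ours, not from the text) that evaluates each growth function at a sample N:

```java
// Sketch: evaluate the standard growth-rate functions at N = 16.
// At this size the ordering c < log N <= log^2 N <= N < N^2 < N^3 < 2^N
// is already visible.
public class GrowthRates {
    public static double log2(double n) {
        return Math.log(n) / Math.log(2);
    }

    public static void main(String[] args) {
        int n = 16;
        System.out.printf("c       = %d%n", 1);
        System.out.printf("log N   = %.0f%n", log2(n));
        System.out.printf("log^2 N = %.0f%n", log2(n) * log2(n));
        System.out.printf("N       = %d%n", n);
        System.out.printf("N^2     = %d%n", n * n);
        System.out.printf("N^3     = %d%n", n * n * n);
        System.out.printf("2^N     = %d%n", 1 << n);
    }
}
```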

Rule 1.

If T₁(N) = O(f(N)) and T₂(N) = O(g(N)), then

  • T₁(N) + T₂(N) = O(f(N) + g(N)) (intuitively and less formally it is O(max(f(N), g(N))))
  • T₁(N) · T₂(N) = O(f(N) · g(N))


Rule 2.

If T(N) is a polynomial of degree k, then T(N) = Θ(N^k).



Rule 3.

log^k N = O(N) for any constant k. This tells us that logarithms grow very slowly.


Several points are in order. First, it is very bad style to include constants or low-order terms inside a Big-Oh. Do not say T(N) = O(2N²) or T(N) = O(N² + N). In both cases, the correct form is T(N) = O(N²). Second, we can always determine the relative growth rates of two functions f(N) and g(N) by computing lim_{N→∞} f(N)/g(N), using L'Hôpital's rule if necessary. The limit can have four possible values:

  1. The limit is 0: This means that f(N) = o(g(N)).
  2. The limit is c ≠ 0: This means that f(N) = Θ(g(N)).
  3. The limit is ∞: This means that g(N) = o(f(N)).
  4. The limit does not exist: There is no relation.
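As a rough numerical illustration (not a substitute for computing the limit), the ratio f(N)/g(N) for f(N) = N and g(N) = N log N shrinks as N grows, consistent with N = o(N log N):

```java
// Numerically sample f(N)/g(N) for f(N) = N, g(N) = N log N.
// A ratio that keeps shrinking is consistent with f(N) = o(g(N)).
public class RatioDemo {
    public static double ratio(double n) {
        return n / (n * (Math.log(n) / Math.log(2)));
    }

    public static void main(String[] args) {
        for (int k = 10; k <= 30; k += 10)
            System.out.printf("N = 2^%d  ratio = %.6f%n",
                              k, ratio(Math.pow(2, k)));
    }
}
```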

One stylistic note: It is bad to say f(N) ≤ O(g(N)), because the inequality is implied by the definition. It is wrong to write f(N) ≥ O(g(N)), which does not make sense.

2.3 What to Analyze

We define two functions, T_avg(N) and T_worst(N), as the average and worst-case running time, respectively, used by an algorithm on input of size N. Clearly, T_avg(N) ≤ T_worst(N). If there is more than one input, these functions may have more than one argument.

2.4 Running Time Calculations

2.4.2 General Rules


Rule 1 – for loops.

The running time of a for loop is at most the running time of the statements inside the for loop (including tests) times the number of iterations.



Rule 2 – Nested loops.

Analyze these inside out. The total running time of a statement inside a group of nested loops is the running time of the statement multiplied by the product of the sizes of all the loops.
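As a quick sketch of Rule 2 (the class and inputs are ours, chosen for illustration), the statement inside this doubly nested loop executes exactly N · M times, so the fragment is O(N · M):

```java
// Count how many times the innermost statement of a nested loop runs.
public class NestedLoops {
    public static int count(int n, int m) {
        int k = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < m; j++)
                k++;            // the innermost statement
        return k;
    }

    public static void main(String[] args) {
        System.out.println(count(5, 4)); // 5 * 4 = 20
    }
}
```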



Rule 3 – Consecutive Statements.

These just add.



Rule 4 – if/else.

The running time of an if/else statement is never more than the running time of the test plus the larger of the running times of S1 and S2.


Other rules are obvious, but a basic strategy of analyzing from the inside (or deepest part) out works. If there are method calls, these must be analyzed first. If there are recursive methods, there are several options. If the recursion is really just a thinly veiled for loop, the analysis is usually trivial. For instance, the following method is really just a simple loop and is O(N):

public static long factorial(int n) {
    if(n <= 1)
        return 1;
    else
        return n * factorial(n - 1);
}

When recursion is properly used, it is difficult to convert the recursion into a simple loop structure; in that case, the analysis involves a recurrence relation that needs to be solved. Consider the following recursive computation of the Fibonacci numbers:

public static long fib(int n) {
    if(n <= 1)
        return 1;
    else
        return fib(n - 1) + fib(n - 2);
}

We can say that T(0) = T(1) = 1, because constants do not matter. For N ≥ 2, we have the following formula for the running time of fib(n):

T(N) = T(N − 1) + T(N − 2) + 2

Since fib(N) = fib(N − 1) + fib(N − 2), it is easy to show by induction that T(N) ≥ fib(N), and that (for N > 4) fib(N) ≥ (3/2)^N. So the running time of this program grows exponentially.
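For contrast, a simple iterative version (a sketch of ours, not from the text) computes the same values in O(N) time, which is one way to see how wasteful the recursive version's repeated subproblems are:

```java
// Iterative Fibonacci matching the recursive version's convention
// fib(0) = fib(1) = 1; runs in O(N) time and O(1) extra space.
public class FibIter {
    public static long fib(int n) {
        long prev = 1, cur = 1;
        for (int i = 2; i <= n; i++) {
            long next = prev + cur;
            prev = cur;
            cur = next;
        }
        return cur;
    }

    public static void main(String[] args) {
        System.out.println(fib(10)); // 89 under this convention
    }
}
```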

2.4.3 Solutions for the Maximum Subsequence Sum Problem

Algorithm 1

public static int maxSubSum1(int[] a) {
    int maxSum = 0;
    for(int i = 0; i < a.length; i++)
        for(int j = i; j < a.length; j++) {
            int thisSum = 0;
            for(int k = i; k <= j; k++)
                thisSum += a[k];
            if(thisSum > maxSum)
                maxSum = thisSum;
        }
    return maxSum;
}

The total is O(N³). It turns out that a more precise analysis, taking into account the actual size of these loops, shows that the answer is Θ(N³) and that our estimate above was a factor of 6 too high. The precise analysis is obtained from the sum Σ_{i=0}^{N−1} Σ_{j=i}^{N−1} Σ_{k=i}^{j} 1, which counts the number of times the innermost statement is executed.

First we have:

Σ_{k=i}^{j} 1 = j − i + 1

Next we evaluate:

Σ_{j=i}^{N−1} (j − i + 1) = (N − i + 1)(N − i)/2

This sum is computed by observing that it is just the sum of the first N − i integers. To complete the calculation, we evaluate:

Σ_{i=0}^{N−1} (N − i + 1)(N − i)/2 = (N³ + 3N² + 2N)/6 = Θ(N³)
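A small sketch can confirm the closed form (N³ + 3N² + 2N)/6 by brute-force counting the innermost executions of Algorithm 1's three loops:

```java
// Count executions of the innermost statement of the cubic algorithm
// and compare against the closed form (N^3 + 3N^2 + 2N) / 6.
public class TripleLoopCount {
    public static long count(int n) {
        long c = 0;
        for (int i = 0; i < n; i++)
            for (int j = i; j < n; j++)
                for (int k = i; k <= j; k++)
                    c++;        // stands in for thisSum += a[k]
        return c;
    }

    public static long closedForm(long n) {
        return (n * n * n + 3 * n * n + 2 * n) / 6;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 20; n++)
            if (count(n) != closedForm(n))
                throw new AssertionError("mismatch at N = " + n);
        System.out.println("closed form verified for N = 1..20");
    }
}
```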

Algorithm 2

public static int maxSubSum2(int[] a) {
    int maxSum = 0;
    for(int i = 0; i < a.length; i++) {
        int thisSum = 0;
        for(int j = i; j < a.length; j++) {
            thisSum += a[j];

            if(thisSum > maxSum)
                maxSum = thisSum;
        }
    }
    return maxSum;
}

The inefficiency that the improved algorithm corrects can be seen by noticing that Σ_{k=i}^{j} a[k] = a[j] + Σ_{k=i}^{j−1} a[k], so the innermost loop of the cubic algorithm recomputes a sum that is already known.

Algorithm 3

private static int max3(int a, int b, int c) {
    return Math.max(a, Math.max(b, c));
}

private static int maxSumRec(int[] a, int left, int right) {
    if(left == right)
        if(a[left] > 0)
            return a[left];
        else
            return 0;

    int center = (left + right) / 2;
    int maxLeftSum = maxSumRec(a, left, center);
    int maxRightSum = maxSumRec(a, center + 1, right);

    int maxLeftBorderSum = 0, leftBorderSum = 0;
    for(int i = center; i >= left; i--) {
        leftBorderSum += a[i];
        if(leftBorderSum > maxLeftBorderSum)
            maxLeftBorderSum = leftBorderSum;
    }

    int maxRightBorderSum = 0, rightBorderSum = 0;
    for(int i = center + 1; i <= right; i++) {
        rightBorderSum += a[i];
        if(rightBorderSum > maxRightBorderSum)
            maxRightBorderSum = rightBorderSum;
    }

    return max3(maxLeftSum, maxRightSum, maxLeftBorderSum + maxRightBorderSum);
}

public static int maxSubSum3(int[] a) {
    return maxSumRec(a, 0, a.length - 1);
}

In our case, the maximum subsequence sum can be in one of three places. Either it occurs entirely in the left half of the input, or entirely in the right half, or it crosses the middle and is in both halves.

First half          Second half
4  −3  5  −2   |   −1  2  6  −2

The maximum subsequence sum for the first half is 6 (elements A₁ through A₃) and for the second half is 8 (elements A₆ through A₇); the maximum sum that spans both halves and goes through the middle is 11 (elements A₁ through A₇).

The running time is analyzed in much the same way as for the program that computes the Fibonacci numbers. Let T(N) be the time to solve a maximum subsequence sum problem of size N; the two recursive calls plus the linear work across the middle give T(1) = 1 and T(N) = 2T(N/2) + N.

Then T(2) = 4 = 2·2, T(4) = 12 = 4·3, T(8) = 32 = 8·4, and T(16) = 80 = 16·5. The pattern that is evident, and can be derived, is that if N = 2^k, then T(N) = N(k + 1) = N log N + N = O(N log N).

This analysis assumes N is even, since otherwise N/2 is not defined. By the recursive nature of the analysis, it is really valid only when N is a power of 2, since otherwise we eventually get a subproblem that is not an even size, and the equation is invalid. When N is not a power of 2, a somewhat more complicated analysis is required, but the Big-Oh result remains unchanged.
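The pattern T(2^k) = 2^k · (k + 1) can be checked directly by unrolling the recurrence (a quick sketch):

```java
// Evaluate T(1) = 1, T(N) = 2*T(N/2) + N for powers of two and
// compare against the pattern T(2^k) = 2^k * (k + 1).
public class DivideConquerRecurrence {
    public static long t(long n) {
        return (n == 1) ? 1 : 2 * t(n / 2) + n;
    }

    public static void main(String[] args) {
        for (int k = 0; k <= 10; k++) {
            long n = 1L << k;
            System.out.printf("T(%d) = %d (pattern: %d)%n",
                              n, t(n), n * (k + 1));
        }
    }
}
```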

Algorithm 4

public static int maxSubSum4(int[] a) {
    int maxSum = 0, thisSum = 0;
    for(int j = 0; j < a.length; j++) {
        thisSum += a[j];

        if(thisSum > maxSum)
            maxSum = thisSum;
        else if(thisSum < 0)
            thisSum = 0;
    }
    return maxSum;
}
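As a sanity check, running the linear-time algorithm on the eight-element example array used in the divide-and-conquer discussion gives the same answer, 11. The driver below simply repeats maxSubSum4 so it can run standalone:

```java
// Sanity check: the O(N) algorithm on the example array
// {4, -3, 5, -2, -1, 2, 6, -2} should report 11.
public class MaxSubSumCheck {
    public static int maxSubSum4(int[] a) {
        int maxSum = 0, thisSum = 0;
        for (int j = 0; j < a.length; j++) {
            thisSum += a[j];
            if (thisSum > maxSum)
                maxSum = thisSum;
            else if (thisSum < 0)
                thisSum = 0;   // a negative prefix can never help
        }
        return maxSum;
    }

    public static void main(String[] args) {
        System.out.println(maxSubSum4(
            new int[]{4, -3, 5, -2, -1, 2, 6, -2})); // 11
    }
}
```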

2.4.4 Logarithms in the Running Time

Clearly, all the work done inside the loop takes O(1) per iteration, so the analysis requires determining the number of times around the loop. The loop starts with high − low = N − 1 and finishes with high − low ≥ −1. Every time through the loop the value high − low must be at least halved from its previous value; thus, the number of times around the loop is at most ⌈log(N − 1)⌉ + 2, so the running time is O(log N).

public static <T extends Comparable<? super T>> int binarySearch(T[] a, T x) {
    int low = 0, high = a.length - 1;
    while(low <= high) {
        int mid = (low + high) / 2;

        if(a[mid].compareTo(x) < 0)
            low = mid + 1;
        else if(a[mid].compareTo(x) > 0)
            high = mid - 1;
        else
            return mid;
    }
    return -1;
}

Binary search can be viewed as our first data structure implementation. It supports the contains operation in O(log N) time, but all other operations (in particular insert) require O(N) time.
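A brief usage sketch: searching a sorted array of Integers returns the index of a match, or −1 if the item is absent. The search routine below repeats the one above so the example is self-contained:

```java
// Usage sketch for the generic binary search.
public class BinarySearchDemo {
    public static <T extends Comparable<? super T>> int binarySearch(T[] a, T x) {
        int low = 0, high = a.length - 1;
        while (low <= high) {
            int mid = (low + high) / 2;
            if (a[mid].compareTo(x) < 0)
                low = mid + 1;
            else if (a[mid].compareTo(x) > 0)
                high = mid - 1;
            else
                return mid;    // found
        }
        return -1;             // not found
    }

    public static void main(String[] args) {
        Integer[] a = {1, 3, 5, 7, 9, 11};
        System.out.println(binarySearch(a, 7)); // 3
        System.out.println(binarySearch(a, 4)); // -1
    }
}
```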

Euclid’s Algorithm

public static long gcd(long m, long n) {
    while (n != 0) {
        long rem = m % n;
        m = n;
        n = rem;
    }
    return m;
}

As before, estimating the entire running time of the algorithm depends on determining how long the sequence of remainders is. Although log N seems like a good answer, it is not at all obvious that the value of the remainder has to decrease by a constant factor. Indeed, the remainder does not decrease by a constant factor in one iteration. However, we can prove that after two iterations, the remainder is at most half of its original value. This would show that the number of iterations is at most 2 log N = O(log N) and establish the running time. This proof is easy, so we include it here. It follows directly from the following theorem.


Theorem 2.1.

If M > N, then M mod N < M/2.

Proof.

There are two cases. If N ≤ M/2, then since the remainder is smaller than N, the theorem is true for this case. The other case is N > M/2. But then N goes into M once with a remainder M − N < M/2, proving the theorem.


The average-case performance of Euclid's algorithm requires pages and pages of highly sophisticated mathematical analysis, and it turns out that the average number of iterations is about (12 ln 2 / π²) ln N.
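A quick check of the algorithm, using sample inputs of our choosing, also records the remainder sequence so the at-most-halving-every-two-iterations behavior from Theorem 2.1 can be observed directly:

```java
import java.util.ArrayList;
import java.util.List;

// Run Euclid's algorithm and record the sequence of remainders.
public class GcdDemo {
    public static long gcd(long m, long n) {
        while (n != 0) {
            long rem = m % n;
            m = n;
            n = rem;
        }
        return m;
    }

    public static List<Long> remainders(long m, long n) {
        List<Long> seq = new ArrayList<>();
        while (n != 0) {
            long rem = m % n;
            seq.add(rem);
            m = n;
            n = rem;
        }
        return seq;
    }

    public static void main(String[] args) {
        System.out.println(gcd(1989, 1590));        // 3
        System.out.println(remainders(1989, 1590)); // [399, 393, 6, 3, 0]
    }
}
```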

Exponentiation

private static boolean isEven(int i) {
    return (i % 2) == 0;
}

public static long pow(long x, int n) {
    if( n == 0 )
        return 1;
    if(n == 1)
        return x;
    if(isEven(n))
        return pow(x * x, n / 2);
    else
        return pow(x * x, n / 2) * x;
}

The obvious algorithm to compute x^N uses N − 1 multiplications. A recursive algorithm can do better. N ≤ 1 is the base case of the recursion. Otherwise, if N is even, we have x^N = (x²)^{N/2}, and if N is odd, x^N = (x²)^{⌊N/2⌋} · x.
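A sketch that instruments the recursive pow with a multiplication counter (the counter is an addition of ours, not part of the original code) shows the count staying near 2 log N rather than N − 1:

```java
// Recursive exponentiation with a multiplication counter.
public class PowDemo {
    public static int mults = 0;   // counts multiplications performed

    public static long pow(long x, int n) {
        if (n == 0)
            return 1;
        if (n == 1)
            return x;
        mults++;                   // the x * x below
        if (n % 2 == 0)
            return pow(x * x, n / 2);
        mults++;                   // the extra * x below
        return pow(x * x, n / 2) * x;
    }

    public static void main(String[] args) {
        mults = 0;
        System.out.println(pow(2, 16)); // 65536
        System.out.println(mults);      // 4 multiplications, vs. 15 naively
    }
}
```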