Data Structures and Algorithms Week 1: Big O Problem
Niklaus Wirth, the father of Pascal, won the Turing Award in 1984.
Problem Solving
- Some programming tasks are routine.
- Some are more challenging: you may want to find a good solution, or even the best solution, to a problem.
- Sometimes you may not even be sure whether a solution to a problem exists!
Knowledge of algorithm-design techniques and efficient data structures gives you a useful toolbox to draw on, as a programmer.
for (int i = 0; i < 10000000; i++) {
    for (int j = 0; j < 10; j++) { }
}
// Total time: 3 ms

for (int i = 0; i < 10; i++) {
    for (int j = 0; j < 10000000; j++) { }
}
// Total time: 5 ms

double[] a = new double[10000000];
for (int j = 0; j < 10000000; j++) a[j] = j;
double sum = 0;
for (int i = 0; i < 10000000; i++) sum += a[i];
// Total time: 42 ms

double[] a = new double[10000000];
for (int j = 0; j < 10000000; j++) a[j] = j;
double sum = 0, sum1 = 0, sum2 = 0, sum3 = 0, sum4 = 0;
for (int i = 0; i < 10000000; i += 4) {
    sum1 += a[i]; sum2 += a[i+1]; sum3 += a[i+2]; sum4 += a[i+3];
}
sum = sum1 + sum2 + sum3 + sum4;
// Total time: 33 ms
Comparing algorithms & data structures
- How do we know if an algorithm or a data structure is any good?
- How do we know which one to choose?
There are many different:
- ways to organize and access data
- sorting algorithms
- data compression algorithms
- ways to solve optimization problems
Two common measures: Time & Space
Time:
How fast does the algorithm run?
Space: How much storage space on the computer is needed to run the algorithm?
Efficient: We say that an algorithm is efficient if it runs quickly (not merely because it “works well”).
Space-efficient:
We say that an algorithm is space-efficient if it takes up little space.
Comparing Algorithm Efficiency
- Suppose we have two algorithms for solving the same problem. How do we know which one is more efficient?
- We could implement both algorithms and perform a comparison on the time (or space) it takes to run each program.
- However, there are problems with this approach:
- We would be comparing implementations, not the algorithms themselves.
- The input data chosen might favor one implementation over another, giving a false impression.
Order-of-Magnitude Analysis
- To analyze algorithms independently of specific implementations, computers or data, we use a measure in terms of the size of the input to the problem. It is usual to use a variable n to describe the size of the input.
- For example:
- Sorting algorithms are usually analyzed in terms of the number of items to be sorted
- Searching through a file could be analyzed in terms of the file size – the number of characters in a file
- We will use a notation known as Big ‘O’ notation to describe the order of magnitude of the time and space complexity of an algorithm.
Complexity and Big ‘O’ Notation
- The definition of how Big ‘O’ works is complicated and tricky at first glance, so we will first look at an example to understand how to measure an algorithm or a piece of code…
- One important point to note:
- When measuring complexity, you should specify what it is that you are counting, for example:
- lines of code executed
- number of array accesses
- number of comparisons
Find the maximum value in an array
int[] a = {57, 6, 25, -9, 45, 46, …, 5, -99};
assert a.length > 0;
int n = a.length;
int j = 1;
int maxValue = a[0];
while (j < n) {
    if (a[j] > maxValue) maxValue = a[j];
    j++;
}
This examines at least 1 element of a and at most n.
Beginners guide to Big O Notation by Rob Bell
https://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/
Complexity and Big ‘O’ Notation (2)
- To state the time efficiency (time complexity) of an algorithm:
- We give an upper bound on how long the algorithm takes (to within an order of magnitude);
- It is expressed in terms of the size of the input.
- For example:
- An algorithm taking exactly n-1 steps can be described as O(n) (“order n”).
- An algorithm taking n²-4 steps can be described as O(n²) (pronounced “order n squared”)
Big O examples
- The big O notation
(a) gives an order of magnitude rather than a precise number of steps, so it is better to say O(n) rather than O(2n-1)
(b) gives an upper bound (think of this as “no worse than”)
1, 5, 30000, 56 are all O(1)
n, 4n, 678n, 10000n + 2, are all O(n)
n², 5n² + 6n + 1, n² + log n, 3n²-1000 are all O(n²)
log n, 4 log n, log(n²) + 5 are all O(log n)
Reminder: Log Table (for base 2)
| log₂ n | n |
| 0 | 1 |
| 1 | 2 |
| 2 | 4 |
| 3 | 8 |
| 4 | 16 |
| 5 | 32 |
| … | … |
| 10 | 1024 |
| 20 | 1048576 |
Function behaviour (complexity vs elements)

Hierarchy: O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(n³) < O(2ⁿ)

Names for orders
- O(1) constant (does not depend on n)
- O(n) linear (depends directly on n)
- O(n²) quadratic /kwɑːˈdrætɪk/ (squared)
- O(nᵏ) polynomial /ˌpɑːliˈnoʊmiəl/
- O(log n) logarithmic /ˌlɔːɡəˈrɪðmɪk/
- O(2ⁿ) exponential /ˌekspəˈnenʃl/
Example Growth Rates
Table of common growth-rate functions (approximately):
| n | log₂ n (~) | n log₂ n (~) | n² | 2ⁿ (~) |
| 10 | 3 | 33 | 100 | 10³ |
| 100 | 7 | 664 | 10⁴ | 10³⁰ |
| 1000 | 10 | 9966 | 10⁶ | 10³⁰¹ |
O(log n) algorithms are much quicker than O(n) algorithms, which are in turn much quicker than O(n²)
Some constant-time algorithms
- O(1) algorithms (constant time complexity) always take the same number of steps, no matter how much input data there is
- Examples of O(1) algorithms (usually):
- finding out whether a list is empty or not
- finding out the value of the first element of a list
- Some tasks must take longer, e.g.
- finding the maximum element in an unsorted list requires going through the whole list, so can’t take O(1) time – it will need to be O(n).
Linear Search
- Linear search uses a linear data structure such as an array or list (for example an ArrayList object).
- Linear search is O(n) comparisons
- This is the case whether
- using an array or list
- the element sought-for is found or not
- you are considering the “worst case” or the “average case”
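The points above can be made concrete with a minimal Java sketch of linear search over an array (the class and method names here are ours, not from the slides):

```java
// Hypothetical helper: returns the index of the first occurrence of x in a,
// or -1 if x is absent. At most a.length comparisons are made, hence O(n).
public class LinearSearchDemo {
    public static int indexOf(int[] a, int x) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == x) return i;  // found: stop early
        }
        return -1;  // worst case: every element was compared
    }
}
```

In the worst case (x absent, or x in the last position) all n elements are examined; on average about n/2 — still O(n).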
Bubble Sort
void bubbleSort(int[] a) {
    int n = a.length;
    for (int j = n - 1; j > 0; j--)
        for (int i = 0; i < j; i++)
            if (a[i] > a[i+1]) {
                int t = a[i]; a[i] = a[i+1]; a[i+1] = t;  // swap a[i] and a[i+1]
            }
}
Complexity of bubbleSort?
(Count the nested loops)
O(n²)
Insertion Sort
void insertionSort(int[] a) {
    int i = 1;
    while (i < a.length) {
        int x = a[i];
        int j = i;
        while (j != 0 && a[j-1] > x) {
            a[j] = a[j-1];
            j--;
        }
        // j == 0 || a[j-1] <= x
        a[j] = x;
        i++;
    }
}
Complexity of insertionSort?
O(n²)
More sorting algorithms
- Naïve sorting algorithms are O(n²),
- e.g. bubble sort, insertion sort, selection sort
- We can do better than O(n²) for sorting!
Quick Sort
void quickSort(int low, int high, int[] numbers) {
    // first do partitioning
    int i = low;
    int j = high;
    int pivot = numbers[low];
    while (i <= j) {
        while (numbers[i] < pivot) i++;
        while (numbers[j] > pivot) j--;
        if (i <= j) {
            int t = numbers[i]; numbers[i] = numbers[j]; numbers[j] = t;  // swap
            i++;
            j--;
        }
    }
    // then recurse on each partition
    if (low < j) quickSort(low, j, numbers);
    if (i < high) quickSort(i, high, numbers);
}
Complexity of quickSort?
average: O(n log(n))
worst: O(n²)
Binary Search
// returns true if x is in a
public static boolean search(int[] a, int x) {
    int i, j, m;
    i = -1;
    j = a.length;
    while (j != i + 1) {
        m = (i + j) / 2;
        if (a[m] <= x) i = m;
        else j = m;
    }
    // -1 <= i && j == i + 1 && j <= a.length; conceptually a[i] <= x < a[j]
    return i >= 0 && i < a.length && a[i] == x;
}
Complexity of binary search?
Note the halving.
Average: O(log n)
Worst: O(log n)
Common Data Structure Operations

Array Sorting Algorithms

⚠️ Note: operations on long strings may hide O(n) time complexity even when no traversal is written explicitly. For example, a = "abcdefg"; a += "h"; copies the whole string.
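To illustrate, a small sketch (class and method names are ours): building an n-character string with += copies the string on every step, about O(n²) work in total, whereas StringBuilder appends in amortized constant time:

```java
public class ConcatDemo {
    // String is immutable, so each += allocates a new string and copies: O(n) per append
    public static String slowBuild(int n) {
        String s = "";
        for (int i = 0; i < n; i++) s += "x";
        return s;
    }

    // StringBuilder mutates an internal buffer: amortized O(1) per append
    public static String fastBuild(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) sb.append('x');
        return sb.toString();
    }
}
```

Both produce the same string; only the time complexity differs.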
Complexity: Further reading
- Goodrich & Tamassia, Data Structures and Algorithms in Java (5th ed), Chapter 4
- Weiss, Data Structures and Problem Solving Using Java, Chapter 2
- Cormen, Leiserson & Rivest, Introduction to Algorithms, Chapter 2
- https://www.bigocheatsheet.com/
Summary
- What you want to be familiar with by the end of the week:
- big O notation
- complexity of simple algorithms
- how to get expressions for T(n) from algorithms
- performance of some sorting algorithms
Linear Search
Sets and Sequences
- A set is a collection of values (all of the same type).
- There is no concept of order.
- a value can only occur in a set once.
- A sequence is an ordered collection of values (all of the same type).
- Items can be selected by index.
- The same value can occur more than once in a sequence.
Representing sets and sequences
- In this module we will look at various data structures for representing sets and sequences and relationships.

Arrays
- An array is a data structure that can store elements (values) (of the same type).
- Elements can be accessed by means of an integer index.
- Most programming languages provide arrays.
- Most recently devised programming languages index the elements starting at zero (‘zero-based’).
Arrays in Java

Arrays in Java: example

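The example image from this slide is not reproduced; as a minimal stand-in sketch (class and method names are ours):

```java
public class ArrayDemo {
    // builds an array holding the first n squares
    public static int[] squares(int n) {
        int[] a = new int[n];          // capacity fixed at creation; int elements default to 0
        for (int i = 0; i < a.length; i++) {
            a[i] = i * i;              // zero-based indexing: valid indices are 0 .. n-1
        }
        return a;
    }
}
```

For example, squares(5) is {0, 1, 4, 9, 16} and squares(5).length is 5.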
Accessing arrays
- We access an element of an array by indexing.
- a[i]
- valid i must satisfy the condition
- 0 <= i < capacity
Java implementations check this condition and will throw an exception (ArrayIndexOutOfBoundsException: array access out of bounds) if the condition does not hold.
Using an array to represent a set
- We can represent a set (of fixed maximum size) using an array.
- For example:
- static final int maxSize = 8;
- int[ ] values = new int[maxSize];
- int size = 0; // 0 <= size <= maxSize
- Usually we use an integer variable size (or whatever name) to indicate how many values there are currently in the set.
Is the set empty?
- The set is empty if size is zero:
- empty ➔ size == 0;
- What type is variable empty?
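The emptiness test size == 0 is a boolean expression, so empty has type boolean. A sketch of the array-backed set so far (class and method names are ours):

```java
public class IntSet {
    public static final int maxSize = 8;
    public int[] values = new int[maxSize];
    public int size = 0;            // invariant: 0 <= size <= maxSize

    public boolean isEmpty() {
        return size == 0;           // the comparison itself is the boolean result
    }
}
```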
Include a value in the set
- We include a value x in the set by using the next free position.
- This is indicated by the value of size:
a[size] = x; // size is initially zero
Since a value can appear in a set only once, ensure first that the set does not already contain x.
- We then need to increase size by one:
size++; // indicate that one more element was included
Ensure beforehand that there is room: size < maxSize (equivalently, size != maxSize).
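Putting the pieces together, a sketch of an add operation (class and method names are ours) that enforces both conditions — no duplicates, and room left in the array:

```java
public class IntSetAdd {
    public static final int maxSize = 8;
    public int[] values = new int[maxSize];
    public int size = 0;

    public boolean contains(int x) {         // linear search over the used part
        for (int i = 0; i < size; i++)
            if (values[i] == x) return true;
        return false;
    }

    public void add(int x) {
        if (size == maxSize || contains(x)) return;  // no room, or already present
        values[size] = x;   // use the next free position
        size++;             // one more element in the set
    }
}
```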
Does the set contain a particular value?
- We can determine whether a value x appears in a set by using a linear search.
- We start at the beginning of the array and keep advancing until:
- either we have gone off the end
- or else we have found a value x
Linear search
- This suggests a while loop:
set current to first
while (what?) {
    advance current
}
// gone off end or else current is x
What? should be
!(gone off end or else current is x)  ==  !(p || q)  ==  !p && !q
i.e. !(gone off end) and then !(current is x)
Linear search as program
int i;
i = 0;
while (i != capacity && a[i] != x) {  // see next slide
    i = i + 1;
}
// i == capacity || a[i] == x
After the loop: either i == capacity, so there is no x in a, or the first x in a is at index i.
Order is important
- Please note that the order of terms is important:
- We’d better not write:
int i;
i = 0;
while (a[i] != x && i != capacity) {  // erroneous
    i = i + 1;
}
Why is this commented as erroneous?
because we might ask about a[i] before checking that i is in range
Conditional Boolean operators not commutative
- As we have seen, the conditional operators && (conditional and) and || (conditional or) are, in effect, not commutative in programming.
- So, in general:
- p && q ≠ q && p
- p || q ≠ q || p
Furthermore, they do not distribute:
(p && q) || r ≠ (p || r) && (q || r)
Avoiding conditional operators
- If these properties of conditional Boolean operators trouble us, or
- If the condition is something more complex than just a[i] == x,
- Then we can apply the ‘bounded linear search theorem’.
bounded linear search theorem
int i = 0;
boolean found = false;
while (!found && i != capacity) {  // order not important
    found = a[i] == x;
    i = i + 1;
}
i == capacity means there is no x in a;
found means the first x was found at position i-1.
Assigning value of Boolean expression

So no need for if statement here.
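For instance (a sketch with hypothetical method names), both versions below compute the same boolean, and the second avoids the if statement entirely:

```java
public class BooleanAssignDemo {
    // verbose: branch just to pick true or false
    public static boolean foundWithIf(int[] a, int i, int x) {
        boolean found;
        if (a[i] == x) found = true;
        else found = false;
        return found;
    }

    // preferred: assign the value of the comparison directly
    public static boolean foundDirect(int[] a, int i, int x) {
        return a[i] == x;
    }
}
```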
Excluding from a set
- Find index of element. Use linear search.
- If not found
- nothing to do
- If x found at j,
a[j] ← a[size – 1] // move the last element into position j
size ← size - 1
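The steps above can be sketched in Java (class and method names are ours; the initial contents are sample data):

```java
public class IntSetRemove {
    public int[] values = {7, 3, 9, 4};   // sample set contents
    public int size = 4;

    public void remove(int x) {
        int j = 0;
        while (j != size && values[j] != x) j++;  // linear search for x
        if (j == size) return;                    // not found: nothing to do
        values[j] = values[size - 1];             // move the last element into slot j
        size--;                                   // one fewer element
    }
}
```

Because a set is unordered, overwriting the removed slot with the last element is enough; no shifting is needed.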
Representing a sequence
Similar to a set except:
- No need to check for duplicates before appending
- Need to preserve order when deleting…
Deleting from sequence
- Use index of element to be deleted (there can be many x’s in a).
- Let’s call that index value i:
Precondition: 0 <= i < size
j ← i
while (j < size - 1) {
    a[j] ← a[j + 1]  // copy the next element back one place
    j ← j + 1
}
size ← size – 1  // one fewer element

Dijkstra reference
- Professor Edsger Dijkstra writes about the linear search theorem in:
- EWD 1009 ‘On a somewhat disappointing correspondence’
- https://www.cs.utexas.edu/users/EWD/transcriptions/EWD10xx/EWD1009.html
More-sophisticated structures
- An array-list can be extended at run time, and provides built-in methods such as contains and indexOf.
- Array-lists are implemented in a similar way to the arrays we have seen here.
- We will be looking at more-efficient structures in this module, such as sorted sequences, hash-tables and trees.