Efficiency of Algorithms
Efficiency
A measure of how well an algorithm uses resources, such as time and memory, to solve a problem.
We will focus on time efficiency, which is often the most critical factor when choosing an algorithm.
- Big O notation is a mathematical way to describe how the running time or memory use of an algorithm grows as the input size increases.
- It does not measure the exact time in seconds.
- Instead, it focuses on the rate of growth of operations as the input ($n$) becomes larger.
Why Do We Use Big O?
- To compare algorithms independent of hardware or programming language.
- To predict scalability: how the algorithm behaves as $n$ grows.
- To choose efficient algorithms for large datasets.
How Big O Works
We look at the number of steps an algorithm performs in relation to input size $n$:
| Big O | Name | Example | Description |
|---|---|---|---|
| $O(1)$ | Constant | Accessing an array element | Time does not depend on input size. |
| $O(\log n)$ | Logarithmic | Binary search | Each step halves the problem size. |
| $O(n)$ | Linear | Summing a list | Work grows directly with input size. |
| $O(n \log n)$ | Log-linear | MergeSort, QuickSort | Efficient sorting algorithms. |
| $O(n^2)$ | Quadratic | Nested loops over $n$ | Work grows with the square of input size. |
| $O(2^n)$ | Exponential | Recursive subset generation | Work doubles with each extra input. |
| $O(n!)$ | Factorial | Traveling Salesman brute force | Extremely inefficient for large $n$. |
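To make the logarithmic row concrete, here is a minimal sketch of binary search on a sorted list (the function name binary_search is ours, not part of the table above). Each iteration halves the remaining search range, so at most about $\log_2 n$ comparisons are needed:

```python
def binary_search(lst, target):
    # Assumes lst is sorted in ascending order.
    low, high = 0, len(lst) - 1
    while low <= high:
        mid = (low + high) // 2  # Middle of the current range
        if lst[mid] == target:
            return mid           # Found: return its index
        elif lst[mid] < target:
            low = mid + 1        # Discard the lower half
        else:
            high = mid - 1       # Discard the upper half
    return -1                    # Target is not in the list

binary_search([1, 3, 5, 7, 9, 11], 7)  # Returns 3
```

For $n$ around a million, this takes at most about 20 comparisons, which is why $O(\log n)$ algorithms scale so well.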
Why Efficiency Matters
- Efficient algorithms can handle larger datasets and solve problems faster.
- Inefficient algorithms may become impractical as the size of the input grows.
How to Count in Big O
To deduce the efficiency of an algorithm, consider the following steps:
- Identify the Input Size: Determine the variable that represents the size of the input (e.g., $n$ for a list of length $n$).
- Count the Operations: Estimate the number of operations the algorithm performs as the input size grows.
- Determine the Growth Rate: Use Big O notation to describe how the number of operations scales with the input size.
- When analysing an algorithm, focus on the operations that dominate its running time as the input size grows.
- These are the operations that determine its efficiency.
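As a quick illustration of these steps (a small example of our own, separate from the worked examples later in these notes), the operation counts for a short function can be annotated directly:

```python
def count_evens(lst):
    count = 0             # Runs once
    for num in lst:       # Loop body runs n times
        if num % 2 == 0:  # One comparison per element
            count += 1    # At most one addition per element
    return count          # Runs once

# Input size: n = len(lst).
# Operations: at most roughly 2n + 2, dominated by the loop.
# Growth rate: drop constants and lower-order terms, giving O(n).
```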
Single Loop vs Nested Loops
- Loops are a common source of repeated steps in algorithms.
- The number of iterations depends on the loop structure and the input data.
- A for loop typically iterates a fixed number of times, determined by its range.
- Consider the following Python code:
for i in range(n):
    print(i)

- The loop runs from $0$ to $n-1$, so the print(i) statement is executed $n$ times.
- A while loop continues until a condition is no longer true.
- The number of iterations depends on how the condition changes.
- Consider this Python code:
i = 1
while i <= n:
    print(i)
    i = i + 1

- The loop starts with $i = 1$ and increments $i$ by 1 each time.
- It stops when $i > n$, so the print(i) statement is executed $n$ times.
Single Loop: Iterates through a list once.
Finding the maximum value in a list
def find_max(lst):
    max_val = lst[0]
    for num in lst:  # Only one loop
        if num > max_val:
            max_val = num
    return max_val

The single loop solution has a time complexity of $O(n)$, where $n$ is the size of the list.
- A common mistake is to assume that a while loop always runs $n$ times.
- If the condition or increment is incorrect, the loop may run more or fewer times.
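To illustrate this pitfall (an example of our own, not from the notes above), a while loop whose counter doubles rather than increments runs only about $\log_2 n$ times, not $n$ times:

```python
n = 100
i = 1
count = 0
while i <= n:
    count += 1
    i = i * 2  # Doubling, not adding 1, so far fewer iterations

print(count)  # Prints 7 (since 2**7 = 128 > 100), not 100
```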
Nested Loops: Iterate through a list multiple times, often for comparisons.
for i in range(1, n+1):
    for j in range(1, m+1):
        print(i, j)
In this case:
- The outer loop runs $n$ times.
- Inner loop runs $m$ times per outer iteration.
- Total executions: $n \times m$, hence the complexity is $O(n \cdot m)$.
- Students often forget to multiply the iterations of nested loops.
- Always consider how many times the inner loop runs for each iteration of the outer loop.
- Don't assume that an algorithm with nested loops is always inefficient.
- Sometimes, the inner loop may run only a few times, resulting in a lower time complexity than expected.
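For instance (a sketch of our own, assuming the inner bound is a fixed constant rather than $n$), the nested loops below still run in linear time, because the inner loop performs exactly 3 iterations per outer iteration, for $3n$ steps in total:

```python
n = 4  # Example input size

for i in range(n):
    for j in range(3):  # Constant inner bound, independent of n
        print(i, j)     # Executed 3 * n times in total: O(n), not O(n^2)
```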
Bubble Sort
def bubble_sort(lst):
    n = len(lst)
    for i in range(n):  # Outer loop
        for j in range(0, n-i-1):  # Inner loop
            if lst[j] > lst[j+1]:
                lst[j], lst[j+1] = lst[j+1], lst[j]
    return lst

bubble_sort([5, 4, 3, 1, 2])

- In the first pass, Bubble Sort compares the first element with the second, the second with the third, and so on, up to comparing the $(n-1)^\text{th}$ element with the $n^\text{th}$.
- That’s $n-1$ comparisons.
- In the second pass, the largest element is already in place, so we do $n-2$ comparisons.
- In the third pass, we do $n-3$ comparisons, and so on.
- The total number of comparisons is: $(n-1) + (n-2) + (n-3) + \dots + 1$
- This is the sum of the first $n-1$ positive integers: $\frac{(n-1) \cdot n}{2}$. For example, $n = 5$ gives $4 + 3 + 2 + 1 = 10 = \frac{4 \cdot 5}{2}$.
- In Big O notation, we drop constants and lower-order terms.
- Hence, the nested loops solution has a time complexity of $O(n^2)$, which grows much faster as $n$ increases.
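Spelling out that last step (our own working, consistent with the formula above):

$$\frac{(n-1) \cdot n}{2} = \frac{1}{2}n^2 - \frac{1}{2}n \;\Longrightarrow\; O(n^2)$$

since the constant factor $\frac{1}{2}$ and the lower-order term $\frac{1}{2}n$ are dropped.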
- Some algorithms can exit early when a condition is met, improving efficiency.
- For instance, linear search with early exit:

def linear_search(lst, target):
    for num in lst:
        if num == target:
            return True
    return False

linear_search([1, 4, 3], 4)

- If the target is found early in the list, the algorithm stops, saving time.
- However, in the worst case, it still iterates through the entire list.
- Hence, such a solution still has a time complexity of $O(n)$.
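In the same spirit, a common variant of Bubble Sort (a sketch of our own, not the bubble_sort shown earlier) uses a swapped flag to stop as soon as a full pass makes no swaps; the worst case is still $O(n^2)$, but an already sorted list is handled in a single $O(n)$ pass:

```python
def bubble_sort_early_exit(lst):
    n = len(lst)
    for i in range(n):
        swapped = False
        for j in range(0, n - i - 1):
            if lst[j] > lst[j + 1]:
                lst[j], lst[j + 1] = lst[j + 1], lst[j]
                swapped = True
        if not swapped:  # No swaps: the list is already sorted
            break        # Early exit skips the remaining passes
    return lst

bubble_sort_early_exit([1, 2, 3, 4, 5])  # Sorted input: one pass, then stop
```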
Conditional Statements
- Conditional statements like if and else do not directly affect the number of iterations but determine which steps are executed.
- When analysing an algorithm, consider all possible paths.
Consider this Python code:
for i in range(1, n+1):
    if i % 2 == 0:
        print(i)

- The loop runs $n$ times, but print(i) is executed only for even values of $i$.
- If $n$ is even, print(i) is executed $n/2$ times.
- If $n$ is odd, print(i) is executed $(n-1)/2$ times.
- Hence, the time complexity is $O(n)$.
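Different branches can also do different amounts of work (a sketch of our own; both paths must be considered, and the worst one sets the bound):

```python
def total_or_length(lst, want_total):
    if want_total:
        return sum(lst)  # This path visits every element: O(n)
    else:
        return len(lst)  # This path is constant time: O(1)

# The worst-case path dominates, so the function is O(n) overall.
```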
Finding the Sum of a List
def find_sum(lst):
    total = 0
    for num in lst:
        total += num
    return total

find_sum([1, 2, 3, 4])

- Input Size: $n$ (length of the list).
- Operations: One addition per element.
- Growth Rate: $O(n)$ (linear time).
Checking for Duplicates in a List
def has_duplicates(lst):
    n = len(lst)
    for i in range(n):
        for j in range(i+1, n):
            if lst[i] == lst[j]:
                return True
    return False

print(has_duplicates([1, 2, 3, 4]))
print(has_duplicates([3, 2, 3, 4]))

- Input Size: $n$ (length of the list).
- Operations: Nested loops result in approximately $n^2/2$ comparisons.
- Growth Rate: $O(n^2)$ (quadratic time).
Review Questions
- Can you determine the number of iterations for a given loop?
- How does the input size affect the iteration count?
- What happens to the iteration count if the loop structure changes?
- What is the time complexity of a single loop that iterates through a list of length $t$?
- Why is it important to consider efficiency when designing algorithms?