8.1. Complexity

8.1.4. Cyclomatic Complexity

• Measures the number of linearly independent paths through a program's source code; equivalently, the minimum number of test cases needed to cover every branch.
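As an illustration (a hypothetical `shipping_cost` function, not taken from the source): each decision point adds one to the complexity, so a function with two `if` statements has a cyclomatic complexity of 3 and needs at least three test cases to cover its independent paths.

```python
def shipping_cost(weight: float, express: bool) -> float:
    """Cyclomatic complexity: 3 (two decision points + 1)."""
    cost = 5.0               # base rate
    if weight > 10:          # decision point 1
        cost += 2.5
    if express:              # decision point 2
        cost *= 2
    return cost


# Three linearly independent paths -> at least three test cases
assert shipping_cost(5, express=False) == 5.0    # neither branch taken
assert shipping_cost(15, express=False) == 7.5   # first branch taken
assert shipping_cost(15, express=True) == 15.0   # both branches taken
```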

8.1.5. Big O notation

Most common:

• O(sqrt(n))

• O(log n)

• O(log2 n)

• O(1)

• O(n)

• O(n log2 n)

• O(n^2)

• O(2^n)

• O(n!)

Table 8.1. Table of common time complexities [1]

| Running time T(n) | Name | Example algorithms |
|---|---|---|
| O(1) | constant time | Finding the median value in a sorted array of numbers; calculating (−1)^n |
| O(α(n)) | inverse Ackermann time | Amortized time per operation using a disjoint set |
| O(log* n) | iterated logarithmic time | Distributed coloring of cycles |
| O(log log n) | log-logarithmic time | Amortized time per operation using a bounded priority queue |
| O(log n) | logarithmic time | Binary search |
| poly(log n) | polylogarithmic time | |
| O(n^c) where 0 < c < 1 | fractional power | Searching in a k-d tree |
| O(n) | linear time | Finding the smallest or largest item in an unsorted array; Kadane's algorithm; linear search |
| O(n log* n) | n log-star n time | Seidel's polygon triangulation algorithm |
| O(n log n) | linearithmic time | Fastest possible comparison sort; fast Fourier transform |
| n poly(log n) | quasilinear time | |
| O(n^2) | quadratic time | Bubble sort; insertion sort; direct convolution |
| O(n^3) | cubic time | Naive multiplication of two n×n matrices; calculating partial correlation |
| 2^O(log n) = poly(n) | polynomial time | Karmarkar's algorithm for linear programming; AKS primality test |
| 2^poly(log n) | quasi-polynomial time | Best-known O(log^2 n)-approximation algorithm for the directed Steiner tree problem |
| O(2^(n^ε)) for all ε > 0 | sub-exponential time | Best-known algorithm for integer factorization; formerly best algorithm for graph isomorphism |
| 2^o(n) | sub-exponential time | Best-known algorithm for integer factorization; formerly best algorithm for graph isomorphism |
| 2^O(n) | exponential time (with linear exponent) | Solving the traveling salesman problem using dynamic programming |
| 2^poly(n) | exponential time | Solving matrix chain multiplication via brute-force search |
| O(n!) | factorial time | Solving the traveling salesman problem via brute-force search |
| 2^2^poly(n) | double exponential time | Deciding the truth of a given statement in Presburger arithmetic |
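To make the difference between these classes tangible, a small sketch (illustrative only; absolute timings depend on the machine) comparing O(n) list membership with average-case O(1) set membership:

```python
import timeit

data_list = list(range(100_000))
data_set = set(data_list)

# Worst case for the list: the element we look for is at the end,
# so every membership test scans the whole list
t_list = timeit.timeit(lambda: 99_999 in data_list, number=1000)  # O(n) scan
t_set = timeit.timeit(lambda: 99_999 in data_set, number=1000)    # O(1) hash lookup

print(f'list: {t_list:.4f}s, set: {t_set:.4f}s')
# the set lookup is typically orders of magnitude faster
```

The same input and the same `in` operator, but a different data structure changes the complexity class, which is usually a far bigger win than micro-optimizing the code itself.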

8.1.6. References

[1] https://en.wikipedia.org/wiki/Big_O_notation

8.1.7. Assignments

"""
* Assignment: Performance Complexity Segmentation
* Required: yes
* Complexity: easy
* Lines of code: 10 lines
* Time: 8 min

English:
1. Count occurrences of each group
2. Define groups:
   a. small - numbers in range [0-3)
   b. medium - numbers in range [3-7)
   c. large - numbers in range [7-10)
3. Print result: dict[str, int]:
   a. key - group
   b. value - number of occurrences
4. Run doctests - all must succeed

Polish:
1. Policz wystąpienia każdej z grup
2. Zdefiniuj grupy:
   a. small - liczby z przedziału <0-3)
   b. medium - liczby z przedziału <3-7)
   c. large - liczby z przedziału <7-10)
3. Wypisz result: dict[str, int]:
   a. klucz - grupa
   b. wartość - liczba wystąpień
4. Uruchom doctesty - wszystkie muszą się powieść

Tests:
>>> import sys; sys.tracebacklimit = 0

>>> type(result)
<class 'dict'>

>>> assert all(type(x) is str for x in result.keys())
>>> assert all(type(x) is int for x in result.values())

>>> result
{'small': 16, 'medium': 19, 'large': 15}
"""

DATA = [1, 4, 6, 7, 4, 4, 4, 5, 1, 7, 0,
        0, 6, 5, 0, 0, 9, 7, 0, 4, 4, 8,
        2, 4, 0, 0, 1, 9, 1, 7, 8, 8, 9,
        1, 3, 5, 6, 8, 2, 8, 1, 3, 9, 5,
        4, 8, 1, 9, 6, 3]

# dict[str,int] number of digit occurrences in segments
result = {'small': 0, 'medium': 0, 'large': 0}
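A possible solution (one approach of many, not the book's reference solution): a single O(n) pass over the data, bucketing each number with chained comparisons. Shown self-contained with the DATA from above:

```python
DATA = [1, 4, 6, 7, 4, 4, 4, 5, 1, 7, 0,
        0, 6, 5, 0, 0, 9, 7, 0, 4, 4, 8,
        2, 4, 0, 0, 1, 9, 1, 7, 8, 8, 9,
        1, 3, 5, 6, 8, 2, 8, 1, 3, 9, 5,
        4, 8, 1, 9, 6, 3]

result = {'small': 0, 'medium': 0, 'large': 0}

for number in DATA:
    if number < 3:             # range [0, 3)
        result['small'] += 1
    elif number < 7:           # range [3, 7)
        result['medium'] += 1
    else:                      # range [7, 10)
        result['large'] += 1

print(result)  # {'small': 16, 'medium': 19, 'large': 15}
```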


"""
* Assignment: Performance Complexity UniqueKeys
* Required: yes
* Complexity: easy
* Lines of code: 3 lines
* Time: 8 min

English:
1. Collect unique keys from all rows in one sequence result
2. Run doctests - all must succeed

Polish:
1. Zbierz unikalne klucze z wszystkich wierszy w jednej sekwencji result
2. Uruchom doctesty - wszystkie muszą się powieść

Hints:
* row.keys()
* Compare solutions with :ref:Micro-benchmarking use case

Tests:
>>> import sys; sys.tracebacklimit = 0

>>> result is not Ellipsis
True
>>> type(result) in (set, list, tuple, frozenset)
True
>>> sorted(result)
['Petal length', 'Petal width', 'Sepal length', 'Sepal width', 'Species']
"""

DATA = [
    {'Sepal length': 5.1, 'Sepal width': 3.5, 'Species': 'setosa'},
    {'Petal length': 4.1, 'Petal width': 1.3, 'Species': 'versicolor'},
    {'Sepal length': 6.3, 'Petal width': 1.8, 'Species': 'virginica'},
    {'Sepal length': 5.0, 'Petal width': 0.2, 'Species': 'setosa'},
    {'Sepal width': 2.8, 'Petal length': 4.1, 'Species': 'versicolor'},
    {'Sepal width': 2.9, 'Petal width': 1.8, 'Species': 'virginica'},
]

# Unique keys from DATA dicts
# type: set[str]
result = ...
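A possible solution sketch (one of several; not the book's reference solution): a set deduplicates keys automatically, and `set.update()` accepts any iterable, so collecting unique keys is a short loop. Shown self-contained with the DATA from above:

```python
DATA = [
    {'Sepal length': 5.1, 'Sepal width': 3.5, 'Species': 'setosa'},
    {'Petal length': 4.1, 'Petal width': 1.3, 'Species': 'versicolor'},
    {'Sepal length': 6.3, 'Petal width': 1.8, 'Species': 'virginica'},
    {'Sepal length': 5.0, 'Petal width': 0.2, 'Species': 'setosa'},
    {'Sepal width': 2.8, 'Petal length': 4.1, 'Species': 'versicolor'},
    {'Sepal width': 2.9, 'Petal width': 1.8, 'Species': 'virginica'},
]

# set membership checks and inserts are O(1) on average,
# so the whole collection pass is roughly O(total number of keys)
result = set()
for row in DATA:
    result.update(row.keys())

print(sorted(result))
# ['Petal length', 'Petal width', 'Sepal length', 'Sepal width', 'Species']
```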


"""
* Assignment: Performance Complexity SplitTrainTest
* Required: no
* Complexity: easy
* Lines of code: 9 lines
* Time: 13 min

English:
1. Using List Comprehension split DATA into:
   a. features_train: list[tuple] - 60% of first features in DATA
   b. features_test: list[tuple] - 40% of last features in DATA
   c. labels_train: list[str] - 60% of first labels in DATA
   d. labels_test: list[str] - 40% of last labels in DATA
2. In order to do so, calculate the pivot point:
   a. length of DATA times given percent (60% = 0.6)
   b. remember that slice indices must be int, not float
   c. for example: if the dataset has 10 rows, then 6 rows will be for
      training, and 4 rows for test
3. Run doctests - all must succeed

Polish:
1. Używając List Comprehension podziel DATA na:
   a. features_train: list[tuple] - 60% pierwszych features w DATA
   b. features_test: list[tuple] - 40% ostatnich features w DATA
   c. labels_train: list[str] - 60% pierwszych labels w DATA
   d. labels_test: list[str] - 40% ostatnich labels w DATA
2. Aby to zrobić, wylicz punkt podziału:
   a. długość DATA razy zadany procent (60% = 0.6)
   b. pamiętaj, że indeksy slice muszą być int a nie float
   c. na przykład: jeśli zbiór danych ma 10 wierszy, to 6 wierszy będzie
      do treningu, a 4 do testów
3. Uruchom doctesty - wszystkie muszą się powieść

Tests:
>>> import sys; sys.tracebacklimit = 0

>>> assert type(features_train) is list, \
...     'make sure features_train is a list'

>>> assert type(features_test) is list, \
...     'make sure features_test is a list'

>>> assert type(labels_train) is list, \
...     'make sure labels_train is a list'

>>> assert type(labels_test) is list, \
...     'make sure labels_test is a list'

>>> assert all(type(x) is tuple for x in features_train), \
...     'all elements in features_train should be tuple'

>>> assert all(type(x) is tuple for x in features_test), \
...     'all elements in features_test should be tuple'

>>> assert all(type(x) is str for x in labels_train), \
...     'all elements in labels_train should be str'

>>> assert all(type(x) is str for x in labels_test), \
...     'all elements in labels_test should be str'

>>> features_train  # doctest: +NORMALIZE_WHITESPACE
[(5.8, 2.7, 5.1, 1.9),
(5.1, 3.5, 1.4, 0.2),
(5.7, 2.8, 4.1, 1.3),
(6.3, 2.9, 5.6, 1.8),
(6.4, 3.2, 4.5, 1.5),
(4.7, 3.2, 1.3, 0.2)]

>>> features_test  # doctest: +NORMALIZE_WHITESPACE
[(7.0, 3.2, 4.7, 1.4),
(7.6, 3.0, 6.6, 2.1),
(4.9, 3.0, 1.4, 0.2),
(4.9, 2.5, 4.5, 1.7)]

>>> labels_train
['virginica', 'setosa', 'versicolor', 'virginica', 'versicolor', 'setosa']

>>> labels_test
['versicolor', 'virginica', 'setosa', 'virginica']
"""

DATA = [
    ('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
    (6.3, 2.9, 5.6, 1.8, 'virginica'),
    (6.4, 3.2, 4.5, 1.5, 'versicolor'),
    (4.7, 3.2, 1.3, 0.2, 'setosa'),
    (7.0, 3.2, 4.7, 1.4, 'versicolor'),
    (7.6, 3.0, 6.6, 2.1, 'virginica'),
    (4.9, 3.0, 1.4, 0.2, 'setosa'),
    (4.9, 2.5, 4.5, 1.7, 'virginica')]

ratio = 0.6
header, *rows = DATA
split = int(len(rows) * ratio)

features_train = ...
features_test = ...
labels_train = ...
labels_test = ...
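A possible solution sketch (one approach, not the book's reference solution): each comprehension slices the rows around the pivot and picks either the feature columns (`row[:-1]`) or the label column (`row[-1]`). Shown self-contained with the DATA from above:

```python
DATA = [
    ('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
    (6.3, 2.9, 5.6, 1.8, 'virginica'),
    (6.4, 3.2, 4.5, 1.5, 'versicolor'),
    (4.7, 3.2, 1.3, 0.2, 'setosa'),
    (7.0, 3.2, 4.7, 1.4, 'versicolor'),
    (7.6, 3.0, 6.6, 2.1, 'virginica'),
    (4.9, 3.0, 1.4, 0.2, 'setosa'),
    (4.9, 2.5, 4.5, 1.7, 'virginica')]

ratio = 0.6
header, *rows = DATA
split = int(len(rows) * ratio)   # 10 rows * 0.6 -> pivot at index 6

# Features are every column but the last; the label is the last column
features_train = [row[:-1] for row in rows[:split]]
features_test = [row[:-1] for row in rows[split:]]
labels_train = [row[-1] for row in rows[:split]]
labels_test = [row[-1] for row in rows[split:]]
```

Note that `int()` truncation keeps the slice index an integer, which is what the assignment's pivot-point hint is about.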