# 4.4. Sequence Set

## 4.4.1. Rationale

• Only unique values

• Mutable - elements can be added and removed

• Can store elements of any hashable type

• Unordered data structure - does not record element position or insertion order

• Does not support getitem and slice
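The last two points can be checked directly. The snippet below is a minimal sketch with illustrative variable names: indexing a set raises TypeError, so when a position is needed the set is first turned into a sorted list.

```python
data = {'a', 'b', 'c'}

# Sets have no positions, so indexing (getitem) is rejected
try:
    data[0]
except TypeError as error:
    message = str(error)

# When an order is needed, build a sorted list from the set
ordered = sorted(data)
first = ordered[0]
```

The same TypeError is raised for a slice such as data[0:2].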

Hashable (Immutable):

• int

• float

• bool

• NoneType

• str

• tuple

Non-hashable (Mutable):

• list

• set

• dict

"Hashable types are also immutable" holds for built-in types, but it is not a universal rule. Note that a tuple is hashable only if all of its elements are hashable.
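A short sketch of why the rule is not universal. The Point class is hypothetical and exists only for this illustration. Instances of ordinary user-defined classes are hashable by default, yet their attributes can still be changed.

```python
class Point:
    """Hypothetical user-defined class, used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


point = Point(1, 2)

# Instances of plain classes are hashable by default,
# so they can be stored in a set
data = {point}

# ...and yet they are mutable: attributes can be changed in place
point.x = 100

# The default hash does not depend on attribute values,
# so the object is still found in the set
still_member = point in data
```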

## 4.4.2. Syntax

An empty set can be defined only with set(). There is no short syntax for it:

>>> data = set()
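The reason there is no short syntax is that empty braces are already taken by dict. A quick check, with illustrative variable names:

```python
empty_braces = {}
empty_set = set()

# Empty braces create a dict, not a set
braces_type = type(empty_braces)

# set() is the only way to get an empty set
set_type = type(empty_set)
```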

A trailing comma after the last element is optional, even in a one-element set. The braces are required:

>>> data = {1}
>>> data = {1, 2, 3}
>>> data = {1.1, 2.2, 3.3}
>>> data = {True, False}
>>> data = {'a', 'b', 'c'}
>>> data = {'a', 1, 2.2, True, None}

Stores only unique values:

>>> {1, 2, 1}
{1, 2}

Elements are compared by value, not by type. 1 and 1.0 count as the same element, and the one inserted first is kept:

>>> {1}
{1}
>>> {1.0}
{1.0}
>>> {1, 1.0}
{1}
>>> {1.0, 1}
{1.0}
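The behavior above follows from how sets identify elements: two objects are the same element when they are equal and have equal hashes. A minimal sketch, which also shows that True behaves like 1:

```python
# Equal values have equal hashes, regardless of type
same_hash = hash(1) == hash(1.0) == hash(True)

# Therefore the three literals below collapse into a single element
data = {1, 1.0, True}
size = len(data)

# The element that was inserted first is the one that stays
kept_type = type(next(iter(data)))
```

This is also why a set such as {'a', 1, 2.2, True, None} ends up with four elements, not five.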

Can store elements of any hashable type:

>>> data = {1, 2, 'a'}
>>> data = {1, 2, (3, 4)}
>>>
>>> data = {1, 2, [3, 4]}
Traceback (most recent call last):
TypeError: unhashable type: 'list'
>>>
>>> data = {1, 2, {3, 4}}
Traceback (most recent call last):
TypeError: unhashable type: 'set'

## 4.4.3. Type Casting

• set() converts any iterable argument to a set

>>> data = 'abcd'
>>> set(data) == {'a', 'b', 'c', 'd'}
True
>>> data = ['a', 'b', 'c', 'd']
>>> set(data) == {'a', 'b', 'c', 'd'}
True
>>> data = ('a', 'b', 'c', 'd')
>>> set(data) == {'a', 'b', 'c', 'd'}
True
>>> data = {'a', 'b', 'c', 'd'}
>>> set(data) == {'a', 'b', 'c', 'd'}
True
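The conversion is not limited to the four types shown above. As a hedged sketch: any iterable works, iterating a dict yields its keys, and a non-iterable argument is rejected.

```python
# Any iterable can be converted, not only str, list, tuple and set
from_range = set(range(3))           # elements 0, 1, 2
from_dict = set({'a': 1, 'b': 2})    # iterating a dict yields its keys

# A non-iterable argument is rejected
try:
    set(1)
except TypeError as error:
    message = str(error)
```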

## 4.4.4. Deduplicate

Converting to a set removes duplicates. Works with str, list, and tuple:

>>> data = [1, 2, 3, 1, 1, 2, 4]
>>> set(data)
{1, 2, 3, 4}

Converting to a set deduplicates items:

>>> data = ['Twardowski',
...         'Twardowski',
...         'Watney',
...         'Twardowski']
...
>>> set(data) == {'Twardowski', 'Watney'}
True
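Deduplicating through a set does not preserve the original order. The sketch below uses made-up sample data and shows two common workarounds. The dict.fromkeys() variant is a different technique, not a set operation. It relies on dict keys being unique and keeping insertion order (Python 3.7+).

```python
# Sample data, for illustration only
data = ['Watney', 'Twardowski', 'Watney', 'Lewis', 'Twardowski']

# set() removes duplicates, but the resulting order is not guaranteed
unique = set(data)

# sorted() gives a deterministic (alphabetical) order
unique_sorted = sorted(set(data))

# dict keys are unique and keep insertion order, so this keeps
# the first occurrence of each name in the original order
unique_ordered = list(dict.fromkeys(data))
```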

## 4.4.5. Add

>>> data = {1, 2}
>>>
>>> data.add(3)
>>> data == {1, 2, 3}
True
>>>
>>> data.add(3)
>>> data == {1, 2, 3}
True
>>>
>>> data.add(4)
>>> data == {1, 2, 3, 4}
True
>>> data = {1, 2}
>>> data.add([3, 4])
Traceback (most recent call last):
TypeError: unhashable type: 'list'
>>> data = {1, 2}
>>> data.add((3, 4))
>>> data == {1, 2, (3, 4)}
True
>>> data = {1, 2}
>>> data.add({3, 4})
Traceback (most recent call last):
TypeError: unhashable type: 'set'
## 4.4.6. Update

>>> data = {1, 2}
>>> data.update({3, 4})
>>> data == {1, 2, 3, 4}
True
>>> data.update([5, 6])
>>> data == {1, 2, 3, 4, 5, 6}
True
>>> data.update((7, 8))
>>> data == {1, 2, 3, 4, 5, 6, 7, 8}
True
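Two further details about update(), shown as a short sketch: it accepts several iterables in one call, and it modifies the set in place and returns None. The |= operator is the in-place equivalent, but it requires a set on the right-hand side.

```python
data = {1, 2}

# Several iterables of different types in a single call
returned = data.update([3, 4], (5, 6), {7, 8})

# In-place union operator; the right-hand side must be a set
data |= {9}
```

Because update() returns None, writing data = data.update(...) would lose the set.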

## 4.4.7. Pop

Removes and returns an arbitrary element:

>>> data = {1, 2, 3}
>>> value = data.pop()
>>> value in [1, 2, 3]
True
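A small sketch of what happens around pop(): the returned element is no longer in the set, and calling pop() on an empty set raises KeyError.

```python
data = {1, 2, 3}

# pop() removes one arbitrary element and returns it
value = data.pop()
remaining = len(data)
still_present = value in data

# Popping from an empty set raises KeyError
empty = set()
try:
    empty.pop()
except KeyError as error:
    message = str(error)
```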

## 4.4.8. Membership

Is Disjoint?:

• True - if data and x have no common elements

• False - if any element of x is in data

>>> data = {1,2}
>>>
>>> data.isdisjoint({1,2})
False
>>> data.isdisjoint({1,3})
False
>>> data.isdisjoint({3,4})
True

Is Subset?:

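• True - if x contains all elements of data

• False - if x is missing any element of data

• data <= x is equivalent to issubset; data < x tests for a proper subset (subset, but not equal)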

>>> data = {1,2}
>>>
>>> data.issubset({1})
False
>>> data.issubset({1,2})
True
>>> data.issubset({1,2,3})
True
>>> data.issubset({1,3,4})
False
>>> {1,2} < {3,4}
False
>>> {1,2} < {1,2}
False
>>> {1,2} < {1,2,3}
True
>>> {1,2,3} < {1,2}
False
>>> {1,2} <= {3,4}
False
>>> {1,2} <= {1,2}
True
>>> {1,2} <= {1,2,3}
True
>>> {1,2,3} <= {1,2}
False

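Is Superset?:

• True - if data contains all elements of x

• False - if data is missing any element of x

• data >= x is equivalent to issuperset; data > x tests for a proper superset (superset, but not equal)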

>>> data = {1,2}
>>>
>>> data.issuperset({1})
True
>>> data.issuperset({1,2})
True
>>> data.issuperset({1,2,3})
False
>>> data.issuperset({1,3})
False
>>> data.issuperset({2,1})
True
>>> {1,2} > {1,2}
False
>>> {1,2} > {1,2,3}
False
>>> {1,2,3} > {1,2}
True
>>> {1,2} >= {1,2}
True
>>> {1,2} >= {1,2,3}
False
>>> {1,2,3} >= {1,2}
True
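The methods and the operators differ in one practical way. In the sketch below, the named methods accept any iterable, while the comparison operators require both sides to be sets. The in operator, also shown, tests membership of a single element.

```python
data = {1, 2}

# Membership of a single element
has_one = 1 in data
has_five = 5 in data

# Methods accept any iterable, here a list
method_result = data.issubset([1, 2, 3])

# Operators work only between sets
try:
    data <= [1, 2, 3]
except TypeError as error:
    message = str(error)
```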

## 4.4.9. Basic Operations

Union (returns all elements that are in data, in x, or in both):

>>> data = {1,2}
>>>
>>> data.union({1,2})
{1, 2}
>>> data.union({1,2,3})
{1, 2, 3}
>>> data.union({1,2,4})
{1, 2, 4}
>>> data.union({1,3}, {2,4})
{1, 2, 3, 4}
>>> {1,2} | {1,2}
{1, 2}
>>> {1,2,3} | {1,2}
{1, 2, 3}
>>> {1,2,3} | {1,2,4}
{1, 2, 3, 4}
>>> {1,2} | {1,3} | {2,4}
{1, 2, 3, 4}

Difference (returns elements from data which are not in x):

>>> data = {1,2}
>>>
>>> data.difference({1,2})
set()
>>> data.difference({1,2,3})
set()
>>> data.difference({1,4})
{2}
>>> data.difference({1,3}, {2,4})
set()
>>> data.difference({3,4})
{1, 2}
>>> {1,2} - {2,3}
{1}
>>> {1,2} - {2,3} - {3}
{1}
>>> {1,2} - {1,2,3}
set()

Symmetric Difference (returns elements that are in either data or x, but not in both):

>>> data = {1,2}
>>>
>>> data.symmetric_difference({1,2})
set()
>>> data.symmetric_difference({1,2,3})
{3}
>>> data.symmetric_difference({1,4})
{2, 4}
>>> data.symmetric_difference({1,3}, {2,4})
Traceback (most recent call last):
TypeError: symmetric_difference() takes exactly one argument (2 given)
>>> data.symmetric_difference({3,4})
{1, 2, 3, 4}
>>> {1,2} ^ {1,2}
set()
>>> {1,2} ^ {2,3}
{1, 3}
>>> {1,2} ^ {1,3}
{2, 3}

Intersection (returns elements common to data and x):

>>> data = {1,2}
>>>
>>> data.intersection({1,2})
{1, 2}
>>> data.intersection({1,2,3})
{1, 2}
>>> data.intersection({1,4})
{1}
>>> data.intersection({1,3}, {2,4})
set()
>>> data.intersection({1,3}, {1,4})
{1}
>>> data.intersection({3,4})
set()
>>> {1,2} & {2,3}
{2}
>>> {1,2} & {2,3} & {2,4}
{2}
>>> {1,2} & {2,3} & {3}
set()
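All four operations above return a new set and leave the operands unchanged. A minimal sketch with illustrative variable names, contrasted with the augmented-assignment form that modifies the set in place:

```python
data = {1, 2}
other = {2, 3}

# Each operation builds a new set
union = data | other
common = data & other
only_data = data - other
either = data ^ other

# The original set is untouched
unchanged = data == {1, 2}

# Augmented assignment changes the set in place
# (same effect as data.difference_update(other))
data -= other
```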

## 4.4.10. Cardinality

>>> data = {1, 2, 3}
>>> len(data)
3
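Because a set stores only unique values, len() counts distinct elements. One use of this, sketched below with made-up sample data, is checking whether a sequence contains duplicates.

```python
# Sample data, for illustration only
word = 'banana'
numbers = [1, 2, 3, 1]

# Number of distinct characters: 'b', 'a', 'n'
distinct_letters = len(set(word))

# A sequence has duplicates when the set is shorter than the sequence
has_duplicates = len(set(numbers)) != len(numbers)
```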

## 4.4.11. Assignments

Code 4.7. Solution
"""
* Assignment: Sequence Set Create
* Required: yes
* Complexity: easy
* Lines of code: 5 lines
* Time: 5 min

English:
1. Create sets:
a. result_a with elements: 1, 2, 3
b. result_b with elements: 1.1, 2.2, 3.3
c. result_c with elements: 'a', 'b', 'c'
d. result_d with elements: True, False
e. result_e with elements: 1, 2.2, True, 'a'
2. Run doctests - all must succeed

Polish:
1. Stwórz sety:
a. result_a z elementami: 1, 2, 3
b. result_b z elementami: 1.1, 2.2, 3.3
c. result_c z elementami: 'a', 'b', 'c'
d. result_d z elementami: True, False, True
e. result_e z elementami: 1, 2.2, True, 'a'
2. Uruchom doctesty - wszystkie muszą się powieść

Tests:
>>> import sys; sys.tracebacklimit = 0

>>> assert result_a is not Ellipsis, \
'Assign result to variable: result_a'
>>> assert result_b is not Ellipsis, \
'Assign result to variable: result_b'
>>> assert result_c is not Ellipsis, \
'Assign result to variable: result_c'
>>> assert result_d is not Ellipsis, \
'Assign result to variable: result_d'
>>> assert result_e is not Ellipsis, \
'Assign result to variable: result_e'

>>> assert type(result_a) is set, \
'Variable result_a has invalid type, should be set'
>>> assert type(result_b) is set, \
'Variable result_b has invalid type, should be set'
>>> assert type(result_c) is set, \
'Variable result_c has invalid type, should be set'
>>> assert type(result_d) is set, \
'Variable result_d has invalid type, should be set'
>>> assert type(result_e) is set, \
'Variable result_e has invalid type, should be set'

>>> assert result_a == {1, 2, 3}, \
'Variable result_a has invalid value, should be {1, 2, 3}'
>>> assert result_b == {1.1, 2.2, 3.3}, \
'Variable result_b has invalid value, should be {1.1, 2.2, 3.3}'
>>> assert result_c == {'a', 'b', 'c'}, \
'Variable result_c has invalid value, should be {"a", "b", "c"}'
>>> assert result_d == {True, False}, \
'Variable result_d has invalid value, should be {True, False}'
>>> assert result_e == {1, 2.2, True, 'a'}, \
'Variable result_e has invalid value, should be {1, 2.2, True, "a"}'
"""

# set[int]: with elements: 1, 2, 3
result_a = ...

# set[float]: with elements: 1.1, 2.2, 3.3
result_b = ...

# set[str]: with elements: 'a', 'b', 'c'
result_c = ...

# set[bool]: with elements: True, False
result_d = ...

# set[int|float|bool|str]: with elements: 1, 2.2, True, 'a'
result_e = ...

Code 4.8. Solution
"""
* Assignment: Sequence Set Many
* Required: yes
* Complexity: easy
* Lines of code: 9 lines
* Time: 8 min

English:
1. Non-functional requirements:
a. Assignment verifies creation of set() and usage of the .add() and
.update() methods
b. For simplicity, type numerical values as floats, not as str
c. Example: instead of '5.8' just type 5.8
d. Do not use str.split(), slice, getitem, for, while or
any other control-flow statement
2. Create set result representing row with index 1
3. Add values from row at index 2 to result using .add() (five calls)
4. From row at index 3 create set and add it to result using
.update() (one call)
5. From row at index 4 create tuple and add it to result using .update()
(one call)
6. From row at index 5 create list and add it to result using .update()
(one call)
7. Run doctests - all must succeed

Polish:
1. Wymagania niefunkcjonalne:
a. Zadanie sprawdza tworzenie set() oraz użycie metod .add() i
.update()
b. Dla uproszczenia wartości numeryczne wypisuj jako float,
a nie str
c. Przykład: zamiast '5.8' zapisz 5.8
d. Nie używaj str.split(), slice, getitem, for, while lub
jakiejkolwiek innej instrukcji sterującej
2. Stwórz zbiór result reprezentujący wiersz o indeksie 1
3. Wartości z wiersza o indeksie 2 dodawaj do result używając .add()
(pięć wywołań)
4. Na podstawie wiersza o indeksie 3 stwórz set i dodaj go do result
używając .update() (jedno wywołanie)
5. Na podstawie wiersza o indeksie 4 stwórz tuple i dodaj go do
result używając .update() (jedno wywołanie)
6. Na podstawie wiersza o indeksie 5 stwórz list i dodaj go do
result używając .update() (jedno wywołanie)
7. Uruchom doctesty - wszystkie muszą się powieść

Tests:
>>> import sys; sys.tracebacklimit = 0

>>> assert result is not Ellipsis, \
'Assign result to variable: result'

>>> assert type(result) is set, \
'Variable result has invalid type, should be set'

>>> assert len(result) == 22, \
'Variable result length should be 22'

>>> assert ('sepal_length' not in result
...     and 'sepal_width' not in result
...     and 'petal_length' not in result
...     and 'petal_width' not in result
...     and 'species' not in result)

>>> assert result >= {5.8, 2.7, 5.1, 1.9, 'virginica'}
>>> assert result >= {5.1, 3.5, 1.4, 0.2, 'setosa'}
>>> assert result >= {5.7, 2.8, 4.1, 1.3, 'versicolor'}
>>> assert result >= {6.3, 2.9, 5.6, 1.8, 'virginica'}
>>> assert result >= {6.4, 3.2, 4.5, 1.5, 'versicolor'}
"""

DATA = ['sepal_length,sepal_width,petal_length,petal_width,species',
'5.8,2.7,5.1,1.9,virginica',
'5.1,3.5,1.4,0.2,setosa',
'5.7,2.8,4.1,1.3,versicolor',
'6.3,2.9,5.6,1.8,virginica',
'6.4,3.2,4.5,1.5,versicolor']

# set[float|str]: with row at DATA[1] (manually converted to float and str)
result = ...

# add to result float 5.1
...

# add to result float 3.5
...

# add to result float 1.4
...

# add to result float 0.2
...

# add to result str setosa
...

# update result with set 5.7, 2.8, 4.1, 1.3, 'versicolor'
...

# update result with tuple 6.3, 2.9, 5.6, 1.8, 'virginica'
...

# update result with list 6.4, 3.2, 4.5, 1.5, 'versicolor'
...