In this article I cover data structures in Python. Python offers quite a few of them; let's start with the built-ins.
List: A list holds an ordered, mutable collection of items. Let's create a list of fruits in a basket and access its elements.
# Create a list
basket = ['apple', 'banana', 'mango']

# Access its elements
for fruit in basket:
    print(fruit)

# Output
apple
banana
mango
Some of the common operations we can perform on a list are:
append() – Adds an element to the end of the list

basket.append('guava')
print(basket)

['apple', 'banana', 'mango', 'guava']

extend() – Appends all elements of another list (or any other iterable) to the list

another_basket = ['watermelon', 'papaya', 'pineapple']
basket.extend(another_basket)
print(basket)

['apple', 'banana', 'mango', 'guava', 'watermelon', 'papaya', 'pineapple']

pop() – Removes and returns the last element of the list

lastfruit = basket.pop()
print(lastfruit)

Output: pineapple

remove() – Removes the first occurrence of an element from the list

basket.remove('mango')
print(basket)

['apple', 'banana', 'guava', 'watermelon', 'papaya'] # mango removed from the list

index() – Returns the index of the first occurrence of an element in the list

print(basket)
print(basket.index('watermelon'))

['apple', 'banana', 'guava', 'watermelon', 'papaya']
3 # indexing starts at 0, so watermelon is at index 3
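
The list operations above have a few edge cases worth knowing. The sketch below uses a hypothetical list named items (not part of the basket example) to show that remove() deletes only the first match, that pop() also accepts an index, and that removing a missing element raises ValueError.

```python
# Hypothetical list used only for illustration
items = ['apple', 'banana', 'apple', 'mango']

# remove() deletes only the first matching element
items.remove('apple')
print(items)  # ['banana', 'apple', 'mango']

# pop() accepts an optional index; pop(0) removes and returns the first element
first = items.pop(0)
print(first)  # banana

# remove() raises ValueError when the element is absent
try:
    items.remove('guava')
except ValueError:
    print('guava is not in the list')
```

Checking with the in operator before calling remove() or index() is a simple way to avoid the exception.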

Tuple: Another data structure is the tuple, which is also an ordered collection of elements. The difference is that a tuple is immutable: once created, it cannot be modified, i.e. we cannot add, remove, or replace its elements.

# Create a tuple
base_colors = ('red', 'green', 'blue')
print(base_colors)

('red', 'green', 'blue')
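
To see what immutability means in practice, here is a minimal sketch that tries to replace an element of the tuple. The names error_message and more_colors are introduced only for this illustration.

```python
base_colors = ('red', 'green', 'blue')

# Item assignment is not allowed on a tuple
try:
    base_colors[0] = 'yellow'
except TypeError as err:
    error_message = str(err)
    print(error_message)

# To "change" a tuple, build a new one instead
more_colors = base_colors + ('yellow',)
print(more_colors)  # ('red', 'green', 'blue', 'yellow')
```

The original tuple is left untouched; concatenation creates a new tuple object.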

Common operations on a tuple:
count(): Counts the number of times an element appears in the tuple

(5, 5, 6, 7, 7, 5, 9).count(5)

Output: 3 # element 5 appears 3 times in the tuple

index(): Returns the index of the first occurrence of an element in the tuple

(7, 5, 3, 4, 5, 3, 2).index(5)

Output: 1 # element 5 first appears at index 1

Tuple unpacking: Unpacking extracts the elements of a tuple and assigns them to variables

x, y = (0, 1)

print(x)
0

print(y)
1
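
Unpacking is also handy beyond simple assignment. As a small sketch, the snippet below swaps two variables and uses starred unpacking to collect the remaining elements; the names first and rest are only illustrative.

```python
# Swap two variables without a temporary
x, y = 0, 1
x, y = y, x
print(x, y)  # 1 0

# Starred unpacking collects the remaining elements into a list
first, *rest = (7, 5, 3, 4)
print(first)  # 7
print(rest)   # [5, 3, 4]
```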

Dictionary: My favorite. A dictionary stores data as key-value pairs. Keys must be unique (that's why they are keys).

# Create a dictionary
favorites = {'day': 'Sunday',
             'number': 9,
             'season': 'spring'
             }
print(favorites)

{'day': 'Sunday', 'number': 9, 'season': 'spring'}

# Extract the value of a key
print(favorites['day'])

Sunday

# Add an element to the dictionary
favorites['movie'] = 'Sholay'
print(favorites)

{'day': 'Sunday', 'number': 9, 'season': 'spring', 'movie': 'Sholay'}
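
Because keys are unique, assigning to an existing key replaces its value rather than creating a second entry. The sketch below illustrates that on a shortened version of the favorites dictionary, along with get() for keys that may be missing; the default value 'n/a' is just a placeholder.

```python
favorites = {'day': 'Sunday', 'number': 9}

# Assigning to an existing key replaces its value; no duplicate key is created
favorites['number'] = 7
print(favorites)  # {'day': 'Sunday', 'number': 7}

# favorites['movie'] would raise KeyError; get() returns a default instead
print(favorites.get('movie'))         # None
print(favorites.get('movie', 'n/a'))  # n/a

# Iterate over key-value pairs
for key, value in favorites.items():
    print(key, value)
```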

Set: A set is an unordered collection of unique elements; it holds no duplicates.

# Create a set
BRIC = {'Brazil', 'Russia'}
print(BRIC)

{'Brazil', 'Russia'} # the printed order may vary, since sets are unordered

# Add elements to a set
BRIC.add('India')
BRIC.add('China')
print(BRIC)

{'Brazil', 'Russia', 'India', 'China'}
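
The no-duplicates property makes a set a convenient way to deduplicate a list. Here is a minimal sketch using a hypothetical list named votes:

```python
# Hypothetical list with repeated entries
votes = ['India', 'Brazil', 'India', 'China', 'Brazil']

# Converting to a set drops the duplicates
unique_votes = set(votes)
print(len(unique_votes))  # 3

# Adding an element that already exists leaves the set unchanged
unique_votes.add('India')
print(len(unique_votes))  # 3

# Membership test
print('China' in unique_votes)  # True
```

Note that converting to a set discards the original order of the list.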

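Set operations:

Union: elements present in either set

{2, 3, 4, 5}.union({4, 5, 6})

Output: {2, 3, 4, 5, 6}

Intersection: elements present in both sets

{2, 3, 4, 5}.intersection({4, 5, 6})

Output: {4, 5}

Difference: elements present in the first set but not in the second

{2, 3, 4, 5}.difference({4, 5, 6})

Output: {2, 3}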

NumPy array:

A NumPy array is a collection of homogeneous elements, i.e. all elements share a single data type.

import numpy as np

# Create a NumPy array
a = np.array([1, 2, 3])
print(a)

[1 2 3]

# Number of dimensions
print(a.ndim)
1

# Create a NumPy array using np.arange()
a = np.arange(11, 23)
print(a)

[11 12 13 14 15 16 17 18 19 20 21 22]

# Convert a 1-dimensional array to a 2-dimensional array using reshape()
a = a.reshape((3, 4))
print(a)

[[11 12 13 14]
 [15 16 17 18]
 [19 20 21 22]]
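
To make "homogeneous" concrete, the sketch below (assuming NumPy is installed) shows that mixing integers and floats produces an array with one common data type. The names ints and mixed are only illustrative.

```python
import numpy as np

# An array of integers gets an integer dtype (the exact width depends on the platform)
ints = np.array([1, 2, 3])
print(ints.dtype.kind)  # i

# Mixing ints and floats upcasts every element to float
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64

# Arithmetic applies element-wise
print(ints * 2)  # [2 4 6]
```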

Pandas Series: A pandas Series is a one-dimensional labeled array. The axis labels are collectively called the index. A Series is like a column in an Excel sheet.

import pandas as pd

# Create a Series
s = pd.Series(['Ind', 'Aus', 'NZ'])
print(s)

0    Ind
1    Aus
2     NZ
# the default index is numeric, starting at 0

# Let's create a Series with custom index values
s = pd.Series(data=['Rohit', 'Dhoni'], index=['Mumbai Indians', 'Chennai Super Kings'])
print(s)

Mumbai Indians         Rohit
Chennai Super Kings    Dhoni

# Access an element by its index label
print(s['Mumbai Indians'])

Rohit

Pandas DataFrame: A pandas DataFrame is a tabular data structure, analogous to the tables we create in databases.

# Let's create a DataFrame from a dictionary
df = pd.DataFrame({'Team': ['MI', 'CSK', 'DD', 'KKR'],
                   'Captain': ['Rohit', 'Mahendra', 'Ravindra', 'Gautam']})
print(df)

  Team   Captain
0   MI     Rohit
1  CSK  Mahendra
2   DD  Ravindra
3  KKR    Gautam
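
As a small taste of how a DataFrame relates to a Series, the sketch below (assuming pandas is installed) selects one column and filters rows, much like a simple query on a database table. The names captains and csk are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Team': ['MI', 'CSK', 'DD', 'KKR'],
                   'Captain': ['Rohit', 'Mahendra', 'Ravindra', 'Gautam']})

# Selecting a single column returns a Series
captains = df['Captain']
print(type(captains).__name__)  # Series

# Filter rows with a boolean condition
csk = df[df['Team'] == 'CSK']
print(csk['Captain'].iloc[0])  # Mahendra

# (rows, columns)
print(df.shape)  # (4, 2)
```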

The DataFrame itself deserves a full-length post, to be covered later.
