Python. You might be thinking, "I know what a list is, I have used it numerous times." But there is a difference between using a list and using a list efficiently. So I will show you what I have learned as a software engineer.
We can create a list in basically two ways:
- using the list() constructor
- using [] (square brackets), either as a literal or as a list comprehension
There is no difference in output when we use these methods. Let's see!
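As a quick check of that claim, here is a minimal sketch that builds the same values with the constructor, with a square-bracket literal, and with a comprehension. The variable names are my own, chosen only for illustration.

```python
# Three ways to build the same list of numbers.
from_constructor = list(range(5))                # list() constructor over an iterable
from_literal = [0, 1, 2, 3, 4]                   # square-bracket literal
from_comprehension = [num for num in range(5)]   # list comprehension

# All three compare equal, element by element.
print(from_constructor == from_literal == from_comprehension)  # True
```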
Add/create a list using list comprehension []
def list_comprehension() -> list:
    return [num for num in range(10)]

data = list_comprehension()
print(data)
"""OUTPUT:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
"""
Add/create a list using a for loop
def list_for() -> list:
    data_list = []
    for num in range(10):
        data_list.append(num)
    return data_list

data = list_for()
print(data)
"""OUTPUT:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
"""
As I said, there is no difference in the output. The difference shows up when we add a huge amount of data. Let's see!
Time taken to add data with a list comprehension
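import datetime
import functools
import time

def time_process(func):
    """Measure and print how long func takes to run."""
    @functools.wraps(func)
    def inner(*args, **kwargs):
        start_time = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed_time = time.perf_counter() - start_time
        # timedelta prints as H:MM:SS, matching the outputs shown in this post
        print('Execution time:', datetime.timedelta(seconds=int(elapsed_time)))
        return result  # pass the wrapped function's return value through
    return inner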
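@time_process
def list_comprehension() -> list:
    """Return a list built with a list comprehension."""
    # Note: a list of 1,000,000,000 ints needs tens of GB of RAM;
    # reduce the range to try this on a typical machine.
    return [num*2 for num in range(1000000000)]

list_comprehension()
"""OUTPUT:
Execution time: 0:00:39
"""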
Time taken to add data with the for loop
@time_process
def list_for() -> list:
    data = []
    for num in range(1000000000):
        data.append(num*2)
    return data

list_for()
"""OUTPUT:
Execution time: 0:00:52
"""
You might say this is not that big of a difference. But it is! The comprehension took 39 seconds against 52 seconds for the loop, about 25% less time, and in real-world workloads that adds up. When you don't know how much data you will receive from a DB or an API, a comprehension is usually the better option to create a list.
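If you want to reproduce the comparison without waiting a minute per run, here is a minimal sketch using the standard library's timeit module. The size N and the function names are assumptions of mine, far smaller than the 1,000,000,000 items used above, so the absolute numbers will differ from the ones in this post.

```python
import timeit

N = 100_000  # hypothetical size, chosen so the sketch runs in well under a second

def with_comprehension() -> list:
    return [num * 2 for num in range(N)]

def with_for_loop() -> list:
    data = []
    for num in range(N):
        data.append(num * 2)
    return data

# Both approaches must produce identical lists.
assert with_comprehension() == with_for_loop()

# timeit.timeit calls each function `number` times and returns total seconds.
comprehension_s = timeit.timeit(with_comprehension, number=20)
for_loop_s = timeit.timeit(with_for_loop, number=20)
print(f"comprehension: {comprehension_s:.3f} s, for loop: {for_loop_s:.3f} s")
```

Timings vary by machine and Python version, so treat the printed numbers as a rough indication rather than a benchmark.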
You might say, what if we have a condition? If you only need an if, then comprehension has got your back!
Comprehension with an if condition
@time_process
def list_comprehension() -> list:
    return [num for num in range(1000000000) if num%2 == 0]

list_comprehension()
"""OUTPUT:
Execution time: 0:00:35
"""
Append with a for loop
@time_process
def list_for() -> list:
    data = []
    for num in range(1000000000):
        if num%2 == 0:
            data.append(num)
    return data

list_for()
"""OUTPUT:
Execution time: 0:00:42
"""
That covers writing. What about reading, you might say! To read data efficiently, instead of returning a list we can use generators. Why? You might ask!
@time_process
def read_from_generator():
    """Return a generator expression"""
    generator_data = (num for num in range(1000))
    return generator_data

"""OUTPUT:
AVG time took to read: 0.03886222839 ms
"""
@time_process
def read_from_list():
    """Return a list built from a generator expression"""
    list_data = list(num for num in range(1000))
    return list_data

"""OUTPUT:
AVG time took to read: 0.04118919373 ms
"""
To read 1000000000 items from a generator, it took 23 sec, whereas it took 53 sec from a list. But for the time being, let's ignore the time they took to read the data.
Let me give you an inside secret: if you don't need to re-read the data, always use generators. Why? You might ask!
The reason to use generators is that they tend to be very memory efficient. Generators return data only when it is needed or asked for. Let's see with an example.
# read 3 values from generator
# (the output below assumes read_from_generator() yields only 3 items, i.e. range(3))
from_generator = read_from_generator()
for i in from_generator:
    print(i, end=' ')
print('\nCompleted first read!')
for i in from_generator:
    print(i, end=' ')
print('\nCompleted last read!')
"""OUTPUT:
0 1 2
Completed first read!
Completed last read!
"""
# read 3 values from list
# (the output below assumes read_from_list() returns only 3 items, i.e. range(3))
from_list = read_from_list()
for i in from_list:
    print(i, end=' ')
print('\nCompleted first read!')
for i in from_list:
    print(i, end=' ')
print('\nCompleted last read!')
"""OUTPUT:
0 1 2
Completed first read!
0 1 2
Completed last read!
"""
In the above code block, the first pass over the data from read_from_generator and the first pass over the data from read_from_list look exactly the same. But if I iterate through both of them a second time, the generator yields nothing.
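This one-shot behaviour can be checked directly. The sketch below uses a hypothetical helper, make_numbers, which is not part of the code in this post; it only stands in for any function that returns a generator.

```python
def make_numbers(limit: int):
    """Hypothetical helper: return a fresh generator on every call."""
    return (num for num in range(limit))

numbers = make_numbers(3)
first_pass = list(numbers)   # consumes the generator
second_pass = list(numbers)  # nothing left to yield

print(first_pass)   # [0, 1, 2]
print(second_pass)  # []

# A new call produces a new, unconsumed generator.
print(list(make_numbers(3)))  # [0, 1, 2]
```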
If you want the data again, you need to call read_from_generator again. You might say, why the hell would I want that? The reason is MEMORY EFFICIENCY!
Generators don't keep all the data in memory all the time, unlike lists. Once you read the data, it is gone. So, if you know you will iterate through the data only once, always choose generators.
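To get a rough feel for the memory difference, here is a small sketch using sys.getsizeof from the standard library. The size N is an assumption of mine for illustration, and the exact byte counts depend on your Python version and platform.

```python
import sys

N = 100_000  # hypothetical size chosen for illustration

as_list = [num for num in range(N)]       # all N items exist in memory now
as_generator = (num for num in range(N))  # nothing has been produced yet

# sys.getsizeof reports the container's own size only, not the int objects
# the list refers to, so the real gap is larger than what is printed here.
list_bytes = sys.getsizeof(as_list)
generator_bytes = sys.getsizeof(as_generator)
print(f"list: {list_bytes} bytes, generator: {generator_bytes} bytes")

# Values are computed one at a time, only when requested.
print(next(as_generator))  # 0
print(next(as_generator))  # 1
```

The generator's size stays the same no matter how large N is, because it only stores the state needed to produce the next value.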
I ran every function discussed in this blog 25 times, and the table below shows the results (times in H:MM:SS).
List comprehension | Without list comprehension | List comprehension and if | Without list comprehension and if | Read from generator | Read from list
0:00:39 | 0:00:52 | 0:00:35 | 0:00:42 | 0:00:23 | 0:00:56 |
0:00:39 | 0:00:52 | 0:00:35 | 0:00:42 | 0:00:23 | 0:00:57 |
0:00:38 | 0:00:52 | 0:00:35 | 0:00:41 | 0:00:23 | 0:01:01 |
0:00:38 | 0:00:52 | 0:00:34 | 0:00:42 | 0:00:23 | 0:01:01 |
0:00:40 | 0:00:52 | 0:00:35 | 0:00:42 | 0:00:23 | 0:00:53 |
0:00:39 | 0:00:52 | 0:00:35 | 0:00:42 | 0:00:23 | 0:00:53 |
0:00:38 | 0:00:52 | 0:00:35 | 0:00:42 | 0:00:23 | 0:00:52 |
0:00:39 | 0:00:52 | 0:00:35 | 0:00:42 | 0:00:23 | 0:00:52 |
0:00:38 | 0:00:52 | 0:00:35 | 0:00:42 | 0:00:23 | 0:00:52 |
0:00:38 | 0:00:52 | 0:00:35 | 0:00:42 | 0:00:23 | 0:00:51 |
0:00:38 | 0:00:52 | 0:00:35 | 0:00:42 | 0:00:23 | 0:00:52 |
0:00:38 | 0:00:52 | 0:00:35 | 0:00:42 | 0:00:23 | 0:00:52 |
0:00:38 | 0:00:52 | 0:00:35 | 0:00:42 | 0:00:23 | 0:00:52 |
0:00:38 | 0:00:52 | 0:00:35 | 0:00:42 | 0:00:23 | 0:00:52 |
0:00:38 | 0:00:52 | 0:00:35 | 0:00:41 | 0:00:23 | 0:00:51 |
0:00:38 | 0:00:52 | 0:00:35 | 0:00:42 | 0:00:23 | 0:00:53 |
0:00:38 | 0:00:52 | 0:00:35 | 0:00:42 | 0:00:23 | 0:00:52 |
0:00:39 | 0:00:52 | 0:00:35 | 0:00:42 | 0:00:23 | 0:00:52 |
0:00:39 | 0:00:52 | 0:00:35 | 0:00:42 | 0:00:23 | 0:00:52 |
0:00:38 | 0:00:52 | 0:00:34 | 0:00:41 | 0:00:23 | 0:00:52 |
0:00:38 | 0:00:52 | 0:00:35 | 0:00:41 | 0:00:23 | 0:00:53 |
0:00:39 | 0:00:52 | 0:00:35 | 0:00:41 | 0:00:23 | 0:00:52 |
0:00:38 | 0:00:52 | 0:00:35 | 0:00:42 | 0:00:23 | 0:00:53 |
0:00:38 | 0:00:52 | 0:00:35 | 0:00:41 | 0:00:23 | 0:00:53 |
0:00:39 | 0:00:52 | 0:00:35 | 0:00:41 | 0:00:23 | 0:00:52 |
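To turn those timings into relative savings, here is a minimal sketch that converts the H:MM:SS strings to seconds and compares each pair. It uses only the first row of the table; the helper name to_seconds and the labels are my own.

```python
def to_seconds(stamp: str) -> int:
    """Convert an 'H:MM:SS' string from the table into seconds."""
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds

# First row of the table: (faster approach, slower approach).
pairs = {
    "comprehension vs for loop": ("0:00:39", "0:00:52"),
    "comprehension + if vs for loop + if": ("0:00:35", "0:00:42"),
    "read from generator vs read from list": ("0:00:23", "0:00:56"),
}

for label, (fast, slow) in pairs.items():
    # Fraction of the slower time that the faster approach saves.
    saved = 1 - to_seconds(fast) / to_seconds(slow)
    print(f"{label}: {saved:.0%} less time")
```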
Conclusion
Use list comprehensions and generators whenever possible: comprehensions to build lists, and generators when the data is read only once.