Edit: 1.11.2020 - Add scenarios where list comprehensions replace map and filter functions.
Previously, we saw if-statements expressed on one line, for example:
y = []
# Falsy
print("Truthy") if y else print("Falsy")
We can also write for-loops on one line, and that's one way to think about list comprehensions.
# traditional for-loop; [0, 2, 4]
num = []
for x in range(5):
    if x % 2 == 0:
        num.append(x)
num # call num
# list comprehension, provides the same thing
# [0, 2, 4]
[x for x in range(5) if x % 2 == 0]
Here are some examples from Data Science from Scratch:
# [0, 2, 4]
even_numbers = [x for x in range(5) if x % 2 == 0]
# [0, 1, 4, 9, 16]
squares = [x * x for x in range(5)]
# [0, 4, 16]
even_squares = [x * x for x in even_numbers]
Dan Bader provides a helpful way to conceptualize list comprehensions:
(values) = [ (expression) for (item) in (collections) ]
A good way to understand a list comprehension is to deconstruct it back into a regular for-loop:
# recreation of even_numbers; [0, 2, 4]
even_bracket = []
for x in range(5):
    if x % 2 == 0:
        even_bracket.append(x)
# recreation of squares; [0, 1, 4, 9, 16]
square_bracket = []
for x in range(5):
    square_bracket.append(x * x)
# recreation of even_squares; [0, 4, 16]
square_even_bracket = []
for x in even_bracket:
    square_even_bracket.append(x * x)
List comprehensions also allow filtering with conditions. Again, we can understand this through a brief comparison with the for-loop.
# traditional for-loop
filtered_bracket = []
for x in range(10):
    if x > 5:
        filtered_bracket.append(x * x)
# list comprehension
filtered_comprehension = [x * x
for x in range(10)
if x > 5]
The key takeaway here is that list comprehensions follow a pattern. Knowing this allows us to better understand how they work.
values = [expression
for item in collection
if condition]
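To make the pattern concrete, here is a small sketch that labels each part. The list `words` and the name `long_words_upper` are made up for illustration and do not come from the book.

```python
# hypothetical sample data, invented for this example
words = ["map", "filter", "reduce", "partial"]

# expression: word.upper()
# item:       word
# collection: words
# condition:  len(word) > 3
long_words_upper = [word.upper()
                    for word in words
                    if len(word) > 3]

print(long_words_upper)  # ['FILTER', 'REDUCE', 'PARTIAL']
```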
Python also supports dictionary and set comprehensions, although we'll have to revisit this post as to why we would want to use them in a data wrangling, transformation or analysis context.
# {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
square_dict = {x: x * x for x in range(5)}
# {1}
square_set = {x * x for x in [1,-1]}
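As one possible data-wrangling use, here is a rough sketch with made-up records. The `records` list, its field names and its values are all assumptions for illustration. A dictionary comprehension builds a lookup table keyed by name, and a set comprehension collects the distinct values of a field.

```python
# hypothetical records, invented for this example
records = [
    {"name": "Russel", "group": "A", "score": 10},
    {"name": "Kareem", "group": "B", "score": 20},
    {"name": "James", "group": "B", "score": 30},
]

# dictionary comprehension: look up a score by name
score_by_name = {r["name"]: r["score"] for r in records}

# set comprehension: distinct groups, duplicates dropped
groups = {r["group"] for r in records}

print(score_by_name)   # {'Russel': 10, 'Kareem': 20, 'James': 30}
print(sorted(groups))  # ['A', 'B']
```

Sets are unordered, so the example sorts `groups` before printing to get a stable result.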
Finally, comprehensions can include nested for-loops:
pairs = [(x,y)
for x in range(10)
for y in range(10)]
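Deconstructing the nested version back into for-loops works the same way as before. The sketch below uses a smaller range so the output is easy to read. The names `pairs_loop`, `pairs_comprehension` and `increasing_pairs` are my own.

```python
# nested for-loop version, with range(3) instead of range(10)
pairs_loop = []
for x in range(3):
    for y in range(3):
        pairs_loop.append((x, y))

# the equivalent comprehension; the for-clauses read in the same order
pairs_comprehension = [(x, y)
                       for x in range(3)
                       for y in range(3)]

print(pairs_loop == pairs_comprehension)  # True

# a later for-clause can use the variable from an earlier one
increasing_pairs = [(x, y)
                    for x in range(3)
                    for y in range(x + 1, 3)]

print(increasing_pairs)  # [(0, 1), (0, 2), (1, 2)]
```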
We expect to use list comprehensions often, so we'll revisit this section as we see more applications in context.
Map, Filter, Reduce, Partial
In the first edition of this book the author introduced these functions, but he has since reached enlightenment. He states:
"On my journey toward enlightenment I have realized that these functions (i.e., map, filter, reduce, partial) are best avoided, and their uses in the book have been replaced with list comprehensions, for loops and other, more Pythonic constructs." (p.36)
He's being facetious, but I was intrigued anyways. So here's an example replacing map with list comprehensions.
# create list of names
names = ['Russel', 'Kareem', 'Jordan', 'James']
# use map function to loop over names and apply an anonymous function
greeted = map(lambda x: 'Hi ' + x, names)
# map returns an iterator (see also lazy evaluation)
print(greeted) # <map object at 0x7fc667c81f40>
# because of lazy evaluation, nothing is computed until we iterate over it
for name in greeted:
    print(name)
#Hi Russel
#Hi Kareem
#Hi Jordan
#Hi James
## List Comprehension way to do this operation
greeted2 = ['Hi ' + name for name in names]
# non-lazy evaluation (or eager)
print(greeted2) # ['Hi Russel', 'Hi Kareem', 'Hi Jordan', 'Hi James']
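One practical consequence of lazy evaluation is that a map object can only be consumed once, while a list can be read as often as we like. A minimal sketch (the names `first_pass` and `second_pass` are my own):

```python
names = ['Russel', 'Kareem', 'Jordan', 'James']

greeted = map(lambda x: 'Hi ' + x, names)
first_pass = list(greeted)   # consumes the iterator
second_pass = list(greeted)  # nothing left to yield

print(first_pass)   # ['Hi Russel', 'Hi Kareem', 'Hi Jordan', 'Hi James']
print(second_pass)  # []

# the list built by a comprehension can be read repeatedly
greeted2 = ['Hi ' + name for name in names]
print(list(greeted2) == list(greeted2))  # True
```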
Here's another example replacing filter with list comprehensions:
# create list of integers
numbers = [13, 4, 18, 35]
# filter creates an iterator
div_by_5 = filter(lambda num: num % 5 == 0, numbers)
print(div_by_5) # <filter object at 0x7fc667c9ad30>
print(list(div_by_5)) # must convert iterator into a list - [35]
# using list comprehension to achieve the same thing
another_div_by_5 = [num for num in numbers if num % 5 == 0]
# lists do not use lazy evaluation, so it will print out immediately
print(another_div_by_5) # [35]
In both cases, list comprehensions seem more Pythonic and more concise: there is no lambda to write and no iterator to convert into a list.
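The quote also mentions reduce and partial, which this post does not cover. As a rough sketch of what "other, more Pythonic constructs" might look like for those two, a plain for-loop can stand in for reduce and an ordinary function can stand in for partial. The helper names `multiply`, `double` and `double_partial` are my own, not the book's.

```python
from functools import reduce, partial

numbers = [13, 4, 18, 35]

# reduce folds the list into a single value
total_reduce = reduce(lambda acc, num: acc + num, numbers)

# the same fold written as a for-loop
total_loop = 0
for num in numbers:
    total_loop += num

print(total_reduce, total_loop)  # 70 70

def multiply(a, b):
    return a * b

# partial fixes the first argument of multiply
double_partial = partial(multiply, 2)

# an ordinary function does the same job
def double(x):
    return multiply(2, x)

print(double_partial(21), double(21))  # 42 42
```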