Lecture 23: Sorting

July 29th, 2021


Today: lambda output, lambda/def, custom sorting with lambda, wordcount sorting, introduction to modules

Lambda - You The Power Hacker

Lambda is a powerful feature, letting you express a lot of computation in very little space. As a result, it looks weird at first, but when it clicks, you should feel like a Power Hacker when you wield it.

Lambda problems are well suited to little in-class exercises, since each is just one line long. Not easy, but they are short!

Lambda - Code as a Parameter

Countless times, you have called a function and passed in some data for it to use. The function name is the verb, and the parameters are extra nouns to guide the computation:

e.g. "draw_line" is the verb, with these int coords

canvas.draw_line(0, 0, 100, 50, color='red')

With lambda, we open up a new category, passing in code as the parameter for the function to use, e.g. with map():

map(lambda s: s.upper() + '!',
    ['pass', 'code']) ->
 ['PASS!', 'CODE!']

Having an easy way to pass code between functions can be very handy.
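
A minimal runnable sketch of the map() call above. One detail to keep in mind: in Python 3, map() returns a lazy iterator rather than a list, so this sketch wraps it in list() to see the results. The variable names "words" and "shouted" are just for illustration.

```python
# map() calls the lambda once for each element in the list.
# In Python 3, map() returns an iterator, so list() is used
# here to collect the results into a list we can print.
words = ['pass', 'code']
shouted = list(map(lambda s: s.upper() + '!', words))
print(shouted)  # ['PASS!', 'CODE!']
```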


Recall: Lambda Steps

1. The word "lambda"

2. What type of element? - choose a good name for the parameter: n:, s:, ...

3. Write expression to produce, no "return"

Map With Type-Change

> lambda1 section

The output list does not need to have the same element type as the input list. The lambda can output any type it likes, and that will make the output list. See examples: super_tuple() and lens()

Example: lens(strs)

> lens()

lens(strs): Given a list of strings, return a list of their int lengths.

Solution

def lens(strs):
    return map(lambda s: len(s), strs)

Lambda vs. Def

Lambda and def are similar:

def double(n):
    return n * 2

Equivalent lambda

lambda n: n * 2
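
To see that the two forms are interchangeable, here is a small sketch that passes each one to map(). The list "nums" and the result names are made up for this example, and list() is used to collect the map() results.

```python
def double(n):
    return n * 2


nums = [1, 2, 3]

# Pass the def to map() by its name (no parentheses after double)
by_def = list(map(double, nums))

# Pass the equivalent lambda inline
by_lambda = list(map(lambda n: n * 2, nums))

print(by_def)     # [2, 4, 6]
print(by_lambda)  # [2, 4, 6]
```

The def has a name and can hold multiple lines; the lambda is a single expression written right where it is used.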

Def Features

Def vs. Lambda

map/def Example - map_parens()

> map_parens()

In lambda1, see the map_parens() problem.

['xx(hi)xx', 'abc(there)xyz', 'fish'] ->
  ['hi', 'there', 'fish']

Solution Code. map() works with "parens" by name

def parens(s):
    left = s.find('(')
    right = s.find(')', left)
    
    if left == -1 or right == -1:
        return s
    return s[left + 1:right]


def map_parens(strs):
    return map(parens, strs)
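
A quick way to try this out, sketched as a self-contained script. It repeats the parens() helper from above and wraps map() in list() to show the output; the extra input 'a(b' is an assumption added here to exercise the case where the closing paren is missing.

```python
def parens(s):
    # Find the first '(' and the first ')' after it
    left = s.find('(')
    right = s.find(')', left)
    # If either is missing, return the string unchanged
    if left == -1 or right == -1:
        return s
    return s[left + 1:right]


strs = ['xx(hi)xx', 'abc(there)xyz', 'fish', 'a(b']
result = list(map(parens, strs))
print(result)  # ['hi', 'there', 'fish', 'a(b']
```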


Lambda-2 Examples

> lambda-2 section

The first of these work on a list of (x, y) tuples. These are a little more complicated, but pack even more power into the one line.

xy_sum()

> xy_sum()

xy_sum(points): Given a list of len-2 (x, y) tuples. Return a list of the sums of each tuple. Shows that the result-list does not need to hold the same type as the input list. Solve with a map/lambda.

[(4, 2), (1, 2), (2, 3)]  ->  [6, 3, 5]

The input is a list of points, that is a list of (x, y) tuples. Q: What type is the param to the lambda? A: one "point" (x, y) tuple

Solution

def xy_sum(points):
    return map(lambda point: point[0] + point[1], points)

xs()

> xs()

Given a points list of len-2 (x, y) tuples. Return a list of just the x value of each tuple. Solve with a map/lambda.

Solution

def xs(points):
    return map(lambda point: point[0], points)

min_x()

> min_x()

Given a non-empty list of len-2 (x, y) tuples. What is the leftmost x among the tuples? Return the smallest x value among all the tuples, e.g. [(4, 2), (1, 2), (2, 3)] returns the value 1. Solve with a map/lambda and the builtin min(). Recall: min([4, 1, 2]) returns 1

[(4, 2), (1, 2), (2, 3)]  -> 1

Solution

def min_x(points):
    return min(map(lambda point: point[0], points))
    # Use map/lambda to form a list of
    # just the x coords. Feed that into min()
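
The three point problems can be tried side by side on the sample input. This is only a sketch: the variable names are invented here, and list() is used where we want to print the map() results. Note that min() can consume the map() output directly, with no list() needed.

```python
points = [(4, 2), (1, 2), (2, 3)]

# xy_sum: each lambda call gets one (x, y) tuple
sums = list(map(lambda point: point[0] + point[1], points))

# xs: project out just the x value
x_values = list(map(lambda point: point[0], points))

# min_x: feed the x values straight into min()
leftmost = min(map(lambda point: point[0], points))

print(sums)      # [6, 3, 5]
print(x_values)  # [4, 1, 2]
print(leftmost)  # 1
```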

Custom Sort - Power Feature

Python Custom Sort - Food Examples

We'll try these food examples in the interpreter.

Default sorted()

By default, sorted() works on a list of tuples: it compares [0] first, then [1], and so on.

>>> foods = [('radish', 2, 8), ('donut', 10, 1), ('apple', 7, 9), ('broccoli', 6, 10)]
>>> 
>>> # By default, sorts food tuples by [0]
>>> sorted(foods)
[('apple', 7, 9), ('broccoli', 6, 10), ('donut', 10, 1), ('radish', 2, 8)]
>>> 

Sort By Tastiness

alt: circle tastiness for sorting

Project Out Sort-By Values

alt: project out tasty values per food

Project Out With Lambda

alt: lambda food: food[1]

Custom Sort Lambda - Plan

Q: What is the parameter to the lambda?

A: One elem from the list (similar to map() function)
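
One way to picture the plan, as a rough sketch: the lambda is called once per element to project out a sort-by value, and sorted() orders the original elements by those values. The intermediate list "tasty_values" is only for illustration; sorted() does not require you to build it.

```python
foods = [('radish', 2, 8), ('donut', 10, 1), ('apple', 7, 9), ('broccoli', 6, 10)]

# Conceptually: the lambda takes one food tuple and
# projects out the value to sort by.
tasty_values = list(map(lambda food: food[1], foods))
print(tasty_values)  # [2, 10, 7, 6]

# sorted() uses the same lambda as its key, and returns
# the original tuples ordered by those projected values.
by_tasty = sorted(foods, key=lambda food: food[1])
print(by_tasty[0])   # ('radish', 2, 8)
print(by_tasty[-1])  # ('donut', 10, 1)
```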

Sort By Tasty

>>> foods = [('radish', 2, 8), ('donut', 10, 1), ('apple', 7, 9), ('broccoli', 6, 10)]
>>> 
>>> sorted(foods, key=lambda food: food[1])
[('radish', 2, 8), ('broccoli', 6, 10), ('apple', 7, 9), ('donut', 10, 1)]

Most Tasty (reverse=True)

>>> sorted(foods, key=lambda food: food[1], reverse=True)  # most tasty
[('donut', 10, 1), ('apple', 7, 9), ('broccoli', 6, 10), ('radish', 2, 8)]

Most Healthy

>>> sorted(foods, key=lambda food: food[2], reverse=True)  # most healthy
[('broccoli', 6, 10), ('apple', 7, 9), ('radish', 2, 8), ('donut', 10, 1)]

Most tasty * healthy

Not limited to just projecting out existing values. We can project out a computed value. Here we compute tasty * healthy and sort on that. So apple is first, 7 * 9 = 63, broccoli is second with 6 * 10 = 60. Donut is last :(

>>> sorted(foods, key=lambda food: food[1] * food[2], reverse=True)
[('apple', 7, 9), ('broccoli', 6, 10), ('radish', 2, 8), ('donut', 10, 1)]
>>>

Sorted vs. Min Max

>>> foods = [('radish', 2, 8), ('donut', 10, 1), ('apple', 7, 9), ('broccoli', 6, 10)]
>>> max(foods)     # uses [0] by default - tragic!
('radish', 2, 8)
>>> 
>>> sorted(foods, key=lambda food: food[1])
[('radish', 2, 8), ('broccoli', 6, 10), ('apple', 7, 9), ('donut', 10, 1)]
>>> 
>>> max(foods, key=lambda food: food[1])  # most tasty
('donut', 10, 1)
>>> min(foods, key=lambda food: food[1])  # least tasty
('radish', 2, 8)

Key performance point: max() and min() make a single pass over the n elements, which is much faster than sorting all n elements just to take the first or last one.
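
A small sketch comparing the two approaches. The helper name tastiness() is hypothetical, introduced here to show that a def works as the key just like a lambda. Both approaches give the same answer on this data; max() and min() simply do less work to get it.

```python
foods = [('radish', 2, 8), ('donut', 10, 1), ('apple', 7, 9), ('broccoli', 6, 10)]


def tastiness(food):
    # Hypothetical helper: project out the [1] tastiness value
    return food[1]


# One pass over the list
most_tasty = max(foods, key=tastiness)
least_tasty = min(foods, key=tastiness)

# Same answers via a full sort, which does more work
ordered = sorted(foods, key=tastiness)

print(most_tasty)   # ('donut', 10, 1)
print(least_tasty)  # ('radish', 2, 8)
```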

Python Custom Sort String Examples

>>> # The default sorting is not good with upper/lower case
>>> strs = ['coffee', 'Donut', 'Zebra', 'apple', 'Banana']
>>> sorted(strs)
['Banana', 'Donut', 'Zebra', 'apple', 'coffee']

String Sort Lambda

>>> strs = ['coffee', 'Donut', 'Zebra', 'apple', 'Banana']
>>> 
>>> sorted(strs, key=lambda s: s.lower())    # not case sensitive
['apple', 'Banana', 'coffee', 'Donut', 'Zebra']
>>> 
>>> sorted(strs, key=lambda s: s[len(s)-1])  # by last char
['Zebra', 'Banana', 'coffee', 'apple', 'Donut']
>>> 
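
The key does not have to be a lambda. As a sketch, any function that takes one element works, including a builtin such as len passed by name. Sorting by length here is an extra example, not one of the lecture exercises.

```python
strs = ['coffee', 'Donut', 'Zebra', 'apple', 'Banana']

# Pass the builtin len by name: sort by string length.
# Strings of equal length keep their original relative order.
by_length = sorted(strs, key=len)
print(by_length)  # ['Donut', 'Zebra', 'apple', 'coffee', 'Banana']
```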

Movie Examples

Given a list of movie tuples, (name, score, date-score), e.g.

[('alien', 8, 1), ('titanic', 6, 9), ('parasite', 10, 6), ('caddyshack', 4, 5)]

sort_score(movies)

> sort_score()

Given a list of movie tuples, (name, score, date-score), where score is a rating 1-10, and date-score 1-10 is a rating as a "date" movie. Return a list sorted in increasing order by score.

sort_date(movies)

> sort_date()

Given a list of movie tuples, (name, score, date-score), where score is a rating 1-10, and date-score 1-10 is a rating as a "date" movie. Return the list sorted in decreasing order by date-score.
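
One possible approach to these two problems, sketched under the assumption that score is at index [1] and date-score is at index [2], as in the example tuples above. Try the problems yourself first; this is not necessarily the reference solution.

```python
def sort_score(movies):
    # Increasing order by score, which is at index 1
    return sorted(movies, key=lambda movie: movie[1])


def sort_date(movies):
    # Decreasing order by date-score, which is at index 2
    return sorted(movies, key=lambda movie: movie[2], reverse=True)


movies = [('alien', 8, 1), ('titanic', 6, 9), ('parasite', 10, 6), ('caddyshack', 4, 5)]
print(sort_score(movies))
print(sort_date(movies))
```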


Put It All Together - WordCount + Sorted

Look at wordcount project, apply custom sorting to the output stage.

Sorted vs. Dict Count Items

>>> items = [('z', 1), ('a', 3), ('e', 11), ('b', 3), ('c', 2)]
>>> 
>>> # sort by [0]=word is the default
>>> sorted(items)
[('a', 3), ('b', 3), ('c', 2), ('e', 11), ('z', 1)]
>>> 
>>> sorted(items, key=lambda pair: pair[1])   # sort by count
[('z', 1), ('c', 2), ('a', 3), ('b', 3), ('e', 11)]
>>> 
>>> sorted(items, key=lambda pair: pair[1], reverse=True)
[('e', 11), ('a', 3), ('b', 3), ('c', 2), ('z', 1)]
>>> 
>>> max(items, key=lambda pair: pair[1])      # largest count
('e', 11)

Wordcount - Top-Count - Lambda

Here is the WordCount project we had before. This time look at the print_counts() and print_top() functions.

> wordcount.zip

print_counts() - Alphabetic Output

Here is the output of the regular print_counts() function, which prints out in alphabetic order. Output looks like:

$ python3 wordcount.py poem.txt 
are 2
blue 2
red 2
roses 1
violets 1
$

print_counts() Solution

This is the standard dict-output sorted loop.

def print_counts(counts):
    """
    Given counts dict, print out each word and count
    one per line in alphabetical order, like this
    aardvark 1
    apple 13
    ...
    """
    for word in sorted(counts.keys()):
        print(word, counts[word])
    # Alternately use .items() to access all the key/value data
    # for key, value in sorted(counts.items()):
    #    print(key, value)

print_top()

The print_top(counts, n) function - print the n most common words in decreasing order by count.

$ python3 wordcount-solution.py -top 10 alice-book.txt 
the 1639
and 866
to 725
a 631
she 541
it 530
of 511
said 462
i 410
alice 386

print_top() Solution

def print_top(counts, n):
    """
    Given counts dict and int N, print the N most common words
    in decreasing order of count
    the 1045
    a 672
    ...
    """
    items = counts.items()
    # Could print the items in raw form, just to see what we have
    # print(items)
    # The solution is 3 lines long, but it's dense!
    # Sort the items with a lambda so the most common words are first.
    # Then print just the first N word,count pairs with a slice
    # 1. Sort largest count first
    items = sorted(items, key=lambda pair: pair[1], reverse=True)
    # 2. Slice to grab first N
    for word, count in items[:n]:
        print(word, count)
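
To experiment with the sort-then-slice idea without the whole project, here is a sketch that returns the top pairs instead of printing them. The function name top_items() and the small counts dict are made up for this example; they are not part of wordcount.py.

```python
def top_items(counts, n):
    """Hypothetical helper: return the n most common (word, count) pairs."""
    # 1. Sort largest count first
    items = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)
    # 2. Slice to grab the first n (fine even if n is larger than the list)
    return items[:n]


counts = {'roses': 1, 'are': 2, 'red': 2, 'violets': 1, 'blue': 2}
for word, count in top_items(counts, 3):
    print(word, count)
```

Words with the same count stay in their original dict order, since Python's sort is stable.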

Modules and Modern Coding

May or may not get to this

Standard Modules - import math

>>> import math
>>> math.sqrt(2)  # call sqrt() fn
1.4142135623730951
>>> math.sqrt
<built-in function sqrt>
>>> 
>>> math.log(10)
2.302585092994046
>>> math.pi       # constants in module too
3.141592653589793

Quit and restart the interpreter without the import, see common error:

>>> # quit and restart interpreter
>>> math.sqrt(2)  # OOPS forgot the import
Traceback (most recent call last):
NameError: name 'math' is not defined
>>>
>>> import math
>>> math.sqrt(2)  # now it works
1.4142135623730951
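
In a .py file the pattern is the same: import at the top, then use module.name in the code. A minimal sketch follows; the distance() function is a made-up example, not from the lecture.

```python
import math


def distance(x1, y1, x2, y2):
    # Hypothetical helper: straight-line distance between two points,
    # using math.sqrt() from the imported module
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx * dx + dy * dy)


print(distance(0, 0, 3, 4))  # 5.0
```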

Random Module Exercise

Try the "random" module. Import it and call its randrange(n) function, which returns a random int in the range 0..n-1, e.g. randrange(20) or randrange(4).

>>> import random
>>> 
>>> random.randrange(4)
3
>>>
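
The same exercise as a short script, as a sketch. The list name "rolls" is just for illustration, and the numbers printed will differ from run to run.

```python
import random

# randrange(20) returns a random int in the range 0..19
rolls = []
for i in range(5):
    rolls.append(random.randrange(20))

print(rolls)  # five ints, each between 0 and 19; varies per run
```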