Python Extras
Slide 1
Announcements
- The final exam information has been posted: https://web.stanford.edu/class/archive/cs/cs106a/cs106a.1228/assessments/final-exam/
- Tomorrow's lecture is cancelled. We will have lecture on Wednesday and Thursday.
- Chris needs to move his office hours this week. I will have office hours during the following times:
- Wednesday 11am-12pm
- Thursday 10am-12pm
- For Wednesday's lecture (the last one before our final lecture on Thursday), I'm taking suggestions for things to cover. We already have one suggestion: git and github. If you have other suggestions, which can be Python-related or not, I'll entertain the suggestions.
Slide 2
Today's Topics
- Reading and writing binary data files
- Dictionary Comprehensions
- Sets
- Queues
- Multithreading
- try/except
- for/while else
- enumerate
Slide 3
Reading and writing binary data files
- Let's start by talking about binary files, or files that are not strictly text (e.g., images, binary programs, .zip files, etc.)
- First, let's discuss writing to a regular text file. You open the file with the "w" parameter, and then use the .write() function. E.g.,
>>> with open("myfile.txt", "w") as f:
...     f.write("Here is some text\n")
...     f.write("We must manually put newlines because 'write' does not do it automatically.\n")
...
18
76
>>>
- The 18 and 76 are the return values from f.write and indicate how many characters were written. myfile.txt now looks like this:
Here is some text
We must manually put newlines because 'write' does not do it automatically.
- What if we wanted to open and use a file like a .jpg file? We can't simply do it with the regular 1-argument open:
>>> with open("EarthFromSpace.jpg") as f:
...     data = f.read()
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
>>>
- Instead, we have to open it in binary mode:
>>> with open("EarthFromSpace.jpg", "rb") as f:
...     data = f.read()
...
>>> type(data)
<class 'bytes'>
>>>
- Reading binary files lets you read any file (even text files), but you get back a bytes type. It is like a string, but each byte is considered a simple value between 0 and 255 (an "8-bit" value, because 2**8 == 256).
- Let's look at the first three bytes of the file we read:
>>> data[:3]
b'\xff\xd8\xff'
>>>
- The values are numbers in hexadecimal, or base-16. If we go look up the JPEG format (or just Google b'\xff\xd8\xff'), we find that these first three bytes indicate that the binary file we have read is a JPEG file (and it is).
- We can actually convert the bytes to an image with our PIL library (and with some help from the io library):
>>> import PIL.Image as Image
>>> import io
>>> image = Image.open(io.BytesIO(data))
>>> image.show()
>>>
- If we wanted to write the raw bytes to another file, we can do that:
>>> with open("newimage.jpg", "wb") as f:
...     f.write(data)
...
124524
>>> len(data)
124524
- So, we see that we wrote out all the bytes into the new image file.
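As a small follow-up to the JPEG example, here is a minimal sketch of checking a file's first three bytes in binary mode. The helper name looks_like_jpeg and the two stand-in files are made up for illustration (the "fake" file only carries the JPEG header bytes, it is not a real image), so treat this as a sketch rather than a robust file-type detector.

```python
import os
import tempfile

JPEG_MAGIC = b"\xff\xd8\xff"  # the three leading bytes we saw in data[:3]


def looks_like_jpeg(path):
    """Hypothetical helper: True if the file starts with the JPEG bytes."""
    with open(path, "rb") as f:
        header = f.read(3)  # read only the first three bytes
    return header == JPEG_MAGIC


# Build two small stand-in files so the sketch runs without a real image.
tmp_dir = tempfile.mkdtemp()
fake_jpeg = os.path.join(tmp_dir, "fake.jpg")
text_file = os.path.join(tmp_dir, "notes.txt")

with open(fake_jpeg, "wb") as f:
    # bytes([...]) builds a bytes value from integers between 0 and 255
    f.write(JPEG_MAGIC + bytes([0, 1, 2, 255]))

with open(text_file, "w") as f:
    f.write("Here is some text\n")

print(looks_like_jpeg(fake_jpeg))  # True
print(looks_like_jpeg(text_file))  # False
print(list(JPEG_MAGIC))            # [255, 216, 255]
```

Converting the bytes value to a list shows the same three values as plain integers, which is what "a simple value between 0 and 255" means in practice.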
Slide 4
Dictionary Comprehensions
- We have covered list comprehensions already, but Python also has dictionary comprehensions, which, as you can imagine, perform a similar function on dictionaries: they process a dict and return a dict.
- Here is a simple example:
>>> numbers = {'one': 1, 'two': 2, 'three': 3, 'four': 4}
>>> square_numbers = {k:v**2 for (k,v) in numbers.items()}
>>> print(square_numbers)
{'one': 1, 'two': 4, 'three': 9, 'four': 16}
>>>
- So, what is going on here? The general form is:
dict_variable = {key:value for (key,value) in dictionary.items()}
- In other words, the comprehension loops through all the elements in the dict and pulls out the key and value for each element, and then uses them to build each entry of a new dict.
- Just like with list comprehensions, you can filter, as well:
>>> state_details = {'state': 'CA', 'flower': 'poppy', 'bird': 'quail', 'marine mammal': 'California Gray Whale', 'motto': 'Eureka!'}
>>> just_upper = {k:v for (k,v) in state_details.items() if v[0].isupper()}
>>> print(just_upper)
{'state': 'CA', 'marine mammal': 'California Gray Whale', 'motto': 'Eureka!'}
>>>
- Dictionary comprehensions can be useful if you want to make changes to a dictionary, or to filter it. I don't use them often in my own code, but they are there if you need them.
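To make the general form a bit more concrete, here is a short sketch that builds the same squared dict twice, once with an ordinary loop and once with a comprehension, and then changes the key expression to swap keys and values. The variable names squares_loop, squares_comp, and by_value are just illustrative.

```python
numbers = {'one': 1, 'two': 2, 'three': 3, 'four': 4}

# Loop version: build the new dict one entry at a time.
squares_loop = {}
for k, v in numbers.items():
    squares_loop[k] = v ** 2

# Comprehension version: the same result in a single expression.
squares_comp = {k: v ** 2 for k, v in numbers.items()}

# The key expression can change too; here the keys and values swap places.
by_value = {v: k for k, v in numbers.items()}

print(squares_loop == squares_comp)  # True
print(by_value)  # {1: 'one', 2: 'two', 3: 'three', 4: 'four'}
```

Swapping keys and values like this only works cleanly when the values are unique and hashable, which happens to be true for this example.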
Slide 5
Sets
- Python has a built-in data type called a set. Sets are a common way to store unique elements, and they are fast, too. Python sets can also be used to get intersections, unions, and other math set-like output. Here is an example:
>>> s1 = set()
>>> s1.add(1)
>>> s1.add(5)
>>> s1.add(9)
>>> s1.add(5)
>>> print(s1)
{1, 5, 9}
- As you can see, when you add elements to a set, if the element is already in the set, it is ignored.
- You can create a set all at once using curly braces, e.g., s2 = {5, 8, 2}.
- Sets are not ordered in Python (they are ordered in many other languages, though):
>>> s2 = {5, 8, 2}
>>> s2
{8, 2, 5}
- You can perform some neat operations using sets:
>>> s1
{1, 5, 9}
>>> s2
{8, 2, 5}
>>> s1.union(s2)
{1, 2, 5, 8, 9}
>>> s1.intersection(s2)
{5}
>>> s1.isdisjoint(s2)
False
>>> s1.isdisjoint({2, 4, 8})
True
>>> s1.issubset({1, 2, 3, 4, 5, 6, 7, 8, 9})
True
- If you do want a sorted list of elements from a set, use sorted, which accepts any iterable (so there is no need to convert to a list first) and returns a new list:
>>> s2
{8, 2, 5}
>>> sorted(s2)
[2, 5, 8]
- If you want to find all the unique elements in a list, just convert to a set (but remember, it isn't sorted!):
>>> set([1, 2, 20, 6, 210, 1, 5, 9, 8, 1, 3])
{1, 2, 3, 5, 6, 8, 9, 210, 20}
>>>
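The method calls above also have operator forms, which you will often see in other people's code. Here is a brief sketch using the same two sets; the results are passed through sorted only so that the printed order is predictable.

```python
s1 = {1, 5, 9}
s2 = {5, 8, 2}

union = s1 | s2            # same as s1.union(s2)
common = s1 & s2           # same as s1.intersection(s2)
only_in_s1 = s1 - s2       # same as s1.difference(s2)
in_exactly_one = s1 ^ s2   # same as s1.symmetric_difference(s2)

print(sorted(union))           # [1, 2, 5, 8, 9]
print(sorted(common))          # [5]
print(sorted(only_in_s1))      # [1, 9]
print(sorted(in_exactly_one))  # [1, 2, 8, 9]
print(5 in s1)                 # True
```

The last line is the membership test, which is the operation sets are particularly fast at.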
Slide 6
Queues
- A queue is a data structure that allows first-in-first-out access to elements. You can think of a queue as a line in a supermarket – the first person in the line is the first person handled, and the last person in line is the last person handled. If someone comes into the line, they go to the back of the line, and have to wait for all other people to be served first.
- Python has a queue library (meaning that you need to import queue to use it). Here is an example:
>>> import queue
>>> supermarket_line = queue.Queue()
>>> supermarket_line.put(4)
>>> supermarket_line.put(8)
>>> supermarket_line.put(12)
>>> supermarket_line
<queue.Queue object at 0x7f8632931d90>
>>> supermarket_line.get()
4
>>> supermarket_line.get()
8
>>> supermarket_line.get()
12
>>> supermarket_line.get()
(hangs!)
- You can only get the next element in a queue – unlike a list, it has no random access.
- In the example above, the .put() function is used to put a new element into the queue. When supermarket_line.get() is called, the first element in the queue is returned. Then, the next element in line is returned on the next supermarket_line.get().
- It turns out that if you have an empty queue and try to get() from it, your program hangs (freezes)!
- This is because queues are usually used in multithreading programs – see below for an example of this. If another part of your program puts something into the queue, then your get() will return that value.
- If you know that your queue might be empty, you should check for that before calling get():
>>> supermarket_line = queue.Queue()
>>> for n in (1, 3, 5, 7, 9):
...     supermarket_line.put(n)
...
>>> while not supermarket_line.empty():
...     print(supermarket_line.get())
...
1
3
5
7
9
>>>
- You can see the size of a queue using supermarket_line.qsize() (you cannot use len(supermarket_line)), but you should generally use the while not supermarket_line.empty() approach above if you are removing elements in a loop.
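Another way to avoid the hang on an empty queue is to ask for an element without waiting. Here is a minimal sketch using get_nowait() and the timeout argument of get(), both of which raise queue.Empty when nothing is available. The helper name get_or_default is made up for this example.

```python
import queue


def get_or_default(q, default=None):
    """Hypothetical helper: return the next element, or default if empty."""
    try:
        return q.get_nowait()  # never blocks; raises queue.Empty if empty
    except queue.Empty:
        return default


supermarket_line = queue.Queue()
supermarket_line.put(4)

print(get_or_default(supermarket_line))           # 4
print(get_or_default(supermarket_line, "empty"))  # empty

# get() can also give up after a short wait instead of hanging forever.
try:
    supermarket_line.get(timeout=0.1)
except queue.Empty:
    print("Still empty after waiting 0.1 seconds.")
```

The timeout form is handy in threaded programs, where another thread might add an element while you wait.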
Slide 7
Multithreading
- Multithreading is the ability for your program to seem like two or more different parts of the program are running simultaneously. Python comes with a threading library for this purpose. The way the standard Python interpreter is built, two threads cannot literally be running Python code at the same time (many other languages, like C and C++, do allow this), but it suits most purposes (e.g., it is used extensively for graphics in tkinter). If you take CS111, you will cover multithreading in detail.
- Here is an example:
- The count_up function will be run in its own thread, and every second it will add the next higher number to the count_queue.
- The pull_from_queue function pulls numbers from the queue until the value it gets reaches MAX_VAL.
- If there aren't any numbers in the queue, the function counts as fast as it can.
- Then it gets the next number and prints it.
- main() starts count_up in a thread, and then asks the user to hit <return> when ready.
import time
import threading
import queue

MAX_VAL = 10

def count_up(count_queue):
    for i in range(MAX_VAL + 1):
        time.sleep(1)  # sleep for 1 second
        count_queue.put(i)

def pull_from_queue(count_queue):
    print(f'Current queue size: {count_queue.qsize()}')
    while True:
        next_value = count_queue.get()
        print(next_value)
        if next_value == MAX_VAL:
            break  # stop the loop
        if count_queue.qsize() == 0:
            print("This thread can do stuff at the same time as the other thread.")
            local_count = 0
            while count_queue.qsize() == 0:
                local_count += 1
            print(f"I counted to {local_count} while waiting. Now I'll get the next value.")

def main():
    count_queue = queue.Queue()
    count_thread = threading.Thread(target=count_up, args=[count_queue])
    count_thread.start()  # this runs count_up(count_queue) in its own thread
    input("When you are ready to begin, press <return>")
    pull_from_queue(count_queue)
    count_thread.join()  # this cleans up the thread we set up

if __name__ == "__main__":
    main()
Let's assume the user waited about three seconds before hitting <return>. This might be the output:
% python3 thread_ex.py
When you are ready to begin, press <return>
Current queue size: 3
0
1
2
This thread can do stuff at the same time as the other thread.
I counted to 930072 while waiting. Now I'll get the next value.
3
This thread can do stuff at the same time as the other thread.
I counted to 1608413 while waiting. Now I'll get the next value.
4
This thread can do stuff at the same time as the other thread.
I counted to 1691159 while waiting. Now I'll get the next value.
5
This thread can do stuff at the same time as the other thread.
I counted to 1684888 while waiting. Now I'll get the next value.
6
This thread can do stuff at the same time as the other thread.
I counted to 1705034 while waiting. Now I'll get the next value.
7
This thread can do stuff at the same time as the other thread.
I counted to 1656617 while waiting. Now I'll get the next value.
8
This thread can do stuff at the same time as the other thread.
I counted to 1613523 while waiting. Now I'll get the next value.
9
This thread can do stuff at the same time as the other thread.
I counted to 1639027 while waiting. Now I'll get the next value.
10
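The program above stops when it sees MAX_VAL. A common variation, sketched here, is for the producing thread to put a special "done" marker into the queue when it is finished. The names produce, consume, and DONE are made up for this sketch, and using None as the marker assumes None is never a real value in the queue.

```python
import queue
import threading

DONE = None  # assumed marker meaning "no more values are coming"


def produce(out_queue, count):
    """Runs in its own thread: put count numbers, then the marker."""
    for i in range(count):
        out_queue.put(i)
    out_queue.put(DONE)


def consume(in_queue):
    """Collect values until the marker arrives."""
    received = []
    while True:
        item = in_queue.get()  # blocks until the other thread puts something
        if item is DONE:
            break
        received.append(item)
    return received


work_queue = queue.Queue()
producer = threading.Thread(target=produce, args=(work_queue, 5))
producer.start()
results = consume(work_queue)
producer.join()  # wait for the producer thread to finish

print(results)  # [0, 1, 2, 3, 4]
```

Because get() blocks until something is available, the consuming side never needs to check qsize() or spin in a loop while it waits.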
Slide 8
try/except
- Python has the ability to catch errors that happen so your program doesn't crash. This is useful in many situations (though not all – sometimes you actually want your program to crash if the logic has gone awry).
- For example, let's say you asked for your user to type a decimal number. You might have something like this:
num = float(input("Please type a decimal number: "))
print(f"Your number: {num}")
% python3 ask_for_num.py
Please type a decimal number: 4.3
Your number: 4.3
But, what if the user typed something that wasn't a number?
% python3 ask_for_num.py
Please type a decimal number: abc
Traceback (most recent call last):
  File "/Users/tofer/GoogleDriveCG/cs106a-summer-2021/website/lectures/25-python-extras/ask_for_num.py", line 1, in <module>
    num = float(input("Please type a decimal number: "))
ValueError: could not convert string to float: 'abc'
Your program will crash! We use the try/except functionality to catch the problem when it happens:
try:
    num = float(input("Please type a decimal number: "))
    print(f"Your number: {num}")
except ValueError:
    print("You didn't type a number!")
% python3 ask_for_num.py
Please type a decimal number: abc
You didn't type a number!
- Here, we knew we might get a ValueError, so we had our program try a code block, and if that code block produces a ValueError, then the except block runs, and our program doesn't crash.
- You can see a list of the exceptions that Python defines by default in the "Built-in Exceptions" section of the Python documentation.
- You can also create your own exception types, but that is rarely necessary.
- You can, if absolutely necessary, have an except without any specific exception (it would just be except:), but you want to avoid that as you won't be able to tell what caused the error.
- Here is another example:
def read_file(filename):
    try:
        with open(filename) as f:
            lines = f.readlines()
        for line in lines:
            print(line)
    except FileNotFoundError:
        print(f"The file '{filename}' was not found.")

def main():
    read_file("somefile.txt")

if __name__ == "__main__":
    main()
% python3 ask_for_num.py
The file 'somefile.txt' was not found.
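A common use of try/except is to keep going after bad input instead of stopping. Here is a minimal sketch of that idea. To keep it runnable without typing, the list of strings stands in for what a user might type at successive prompts, and the function name first_valid_number is made up for this example.

```python
def first_valid_number(attempts):
    """Return the first string in attempts that converts to a float.

    Returns None if none of them do.
    """
    for text in attempts:
        try:
            return float(text)
        except ValueError:
            # float() raises ValueError for text like 'abc' or ''
            print(f"'{text}' is not a number, trying the next one.")
    return None


print(first_valid_number(["abc", "", "4.3"]))  # 4.3
print(first_valid_number(["abc"]))             # None
```

Note that only ValueError is caught, so any other kind of error would still show up as a normal traceback, which is usually what you want.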
Slide 9
for/while else
- The next Python feature we are going to look at is another one that I have rarely used, but that can make looping code easier.
- Python is the only language I know of that has an else clause for both the for loop and the while loop. It is used if you want to do something if your loop exits normally (e.g., when the top-line condition causes the loop to stop). Here is an example:
>>> a = 0
>>> while a < 5:
...     print(a)
...     a += 1
... else:
...     print("The loop made it without breaking out")
...
0
1
2
3
4
The loop made it without breaking out
>>>
Here is a converse example:
>>> a = 0
>>> while a < 5:
...     print(a)
...     a += 1
...     if a == 2:
...         break
... else:
...     print("The loop made it without breaking out")
...
0
1
>>>
- You can also use a similar construct with a for loop, and it is somewhat more useful. For example, let's say you had a list and wanted to loop through it until you got to a particular value, but stop once you reach that value. If the value isn't in the list, you want to do something else. E.g.,
lst = [1, 3, 5, 7, 9]
found_val = False
for val in lst:
    if val == 5:
        print("Found 5!")
        found_val = True
        break
if not found_val:
    print("Did not find 5 :(")
- This necessitates a boolean found_val, which is a bit ugly.
- Instead, you could do the following:
lst = [1, 3, 5, 7, 9]
for val in lst:
    if val == 5:
        print("Found 5!")
        break
else:
    print("Did not find 5 :(")
- No more need for the boolean. If the loop exited normally (by going through all the values in lst), then the else block runs.
- Note that the choice of the word else was probably a bad one (and the creator of Python has admitted as much). Once you understand it, it is useful, but seeing it in someone else's code (no pun intended) is often somewhat jarring.
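Here is one more sketch of for/else, this time inside a function that searches for a factor of a number. The function name smallest_factor is made up for this example, and it assumes n is an integer of at least 2.

```python
def smallest_factor(n):
    """Return the smallest factor of n between 2 and n - 1, or None.

    None means no such factor exists, i.e., n is prime (assumes n >= 2).
    """
    for candidate in range(2, n):
        if n % candidate == 0:
            result = candidate
            break  # leaving with break skips the else block
    else:
        # Runs only if the loop finished without hitting break.
        result = None
    return result


print(smallest_factor(15))  # 3
print(smallest_factor(13))  # None
print(smallest_factor(2))   # None
```

The last call shows a detail that is easy to miss: range(2, 2) is empty, so the loop body never runs, and the else block still runs because the loop did not break.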
One more thing: enumerate
- The last topic we are going to cover today is one that I actually use quite frequently in my own code. Sometimes, you want to loop through some list or other iterable, but you want both the elements from the list and you want the index of the element you are on. We've often done this the following way – we've looped over a range and then extracted the value, e.g., for a string:
s = "hello"
for i in range(len(s)):
    c = s[i]
    print(i, c)
Output:
0 h
1 e
2 l
3 l
4 o
- This is great, but there is an easier way, using enumerate:
s = "hello"
for i, c in enumerate(s):
    print(i, c)
- This accomplishes the same thing, and we don't have to manually extract the character using the index (we also don't need a range).
- We can also use enumerate on other data structures that don't index directly, like sets. Could we do this?
my_set = {'hello', 'goodbye', 'seeya', 'toodleoo'}
for i in range(len(my_set)):
    # get the string associated with i?
    s = my_set[i] # error!
- Nope! There is no way to index into a set. But, if we cared to count along while we extracted elements from the set, we could use enumerate (but remember, sets are not ordered in Python!):
my_set = {'hello', 'goodbye', 'seeya', 'toodleoo'}
for i, s in enumerate(my_set):
    print(i, s)
Output:
0 seeya
1 goodbye
2 hello
3 toodleoo
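Two small additions to the set example, sketched below: enumerate takes an optional start argument if you would rather count from 1, and wrapping the set in sorted gives a predictable order. The variable names position, word, and numbered are just illustrative.

```python
my_set = {'hello', 'goodbye', 'seeya', 'toodleoo'}

# sorted() gives a predictable (alphabetical) order; start=1 counts from 1.
for position, word in enumerate(sorted(my_set), start=1):
    print(position, word)
# 1 goodbye
# 2 hello
# 3 seeya
# 4 toodleoo

# enumerate produces (index, element) pairs, which we can collect in a list.
numbered = list(enumerate(sorted(my_set), start=1))
print(numbered[0])  # (1, 'goodbye')
```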