Today: Debugging 1-2-3, debug printing, string upper/lower, string case-sensitive, reversed, movie example, grid, grid testing, demo HW3
Is this course going to get harder and harder each week until nobody is left? Mercifully, no! We're actually going to settle down a little from here on out.
Is "Run Button Fever" real? If you are stuck on part of the homework, re-read the words for that section of the handout carefully. Often the handout talks about common pitfall cases.
How do you know if an index number is valid in a string? In other words, is this index number too big? This can be written with < len(s):
if 3 < len(s):
    # 3 is valid in s
    ...
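As a small sketch of that check, here is a helper that reports whether an index is valid. The name is_valid_index is hypothetical (not from the course code), and it only considers non-negative index numbers.

```python
def is_valid_index(s, i):
    """Return True if i is a valid non-negative index into s."""
    return 0 <= i < len(s)


print(is_valid_index('Hello', 3))  # True, 3 < 5
print(is_valid_index('Hello', 5))  # False, last valid index is 4
```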
Recall the "winnings1" function we did, where in one case the result is 10x the score, and in the other case 12x. Here is a solution to it. What is the essential structure of this code? There's a test depending on the score, and from that it does one return or the other.
def winnings1(score):
    if score < 5:
        return score * 10
    else:
        return score * 12
I mentioned in the if/else discussion that I find "else" seems to attract bugs a little out of proportion. Therefore, I have a slight preference to write solutions without an else when that is possible.
Here is a version of the function above without the else. It still has the essential structure of one test selecting between two returns, but it does not use "else".
def winnings1(score):
    if score < 5:
        return score * 10
    return score * 12
The key here is the return. Since the return exits the function immediately, as soon as an if-test is True, the function skips all the other cases below. If an if-test is False, the run carries on to the next case below, which is at the same indentation.
You can generalize this to a series of pick-offs, like the below. It's like a vertical to-do list of tests, and the code will just go through them in order until one is True. The cases form a simple, vertical series to try in that order.
def pick_offs(x):
    if *case1*:
        return *answer1*
    if *case2*:
        return *answer2*
    if *case3*:
        return *answer3*
    return *nocaseworked*
This is a good code pattern, and it can be used in several places in homework 3. Essentially it's a vertical list of cases to try, and the run will go through them from top to bottom until one works.
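Here is one concrete instance of the pick-off pattern, as a sketch. The function letter_grade and its cutoffs are made up for illustration and are not part of the homework. Note that the second test does not need to check score < 90, since any score that reached it already failed the first test.

```python
def letter_grade(score):
    """Return a letter for a score, using a series of pick-offs."""
    if score >= 90:
        return 'A'
    if score >= 80:   # only reached when score < 90
        return 'B'
    if score >= 70:   # only reached when score < 80
        return 'C'
    return 'F'        # no case above worked


print(letter_grade(95))  # A
print(letter_grade(85))  # B
print(letter_grade(12))  # F
```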
We're going to weave together two things below - show some string algorithms, but also show some debugging techniques.
To write code is to see a lot of bugs. We'll mention the 3 debug techniques here, and do some concrete examples of two of these below.
For more details, see the Python Guide Debugging chapter
An "exception" in Python represents an error during the run that halts the program. Read the exception message and line number. Read the exception text (which can look quite cryptic) from the bottom up. Start at the bottom of the error report, where the error message is, and look upward for the first line of code that is your code. The error message can be quite helpful. Go look at that line of your code. Many bugs can be fixed right there, just knowing the error message and which line caused it.
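As a small illustration, the sketch below has a deliberate off-by-one bug that raises an exception. The function last_char is hypothetical, and the try/except is used here only to capture the message text so it can be printed. Run without the try/except, Python would halt and show the same message on the last line of the error report.

```python
def last_char(s):
    # Bug: the last valid index is len(s) - 1, so s[len(s)] is out of range
    return s[len(s)]


try:
    last_char('Hello')
except IndexError as err:
    message = str(err)
    print('IndexError:', message)  # IndexError: string index out of range
```

The message names the problem (an index out of range), and the line number in the full report would point at the return line, which is enough to spot the missing - 1.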
Don't ask "why is this not working?". Ask "why is the code producing this output?".
The code and the output are not shifting around - they are crisp and repeatable, just sitting there. Look at the first part of the output which is wrong. What line of the code produced that?
This can work well with Doctests which can show you what you need with one click. Run the Doctest and you have the code, the input, and the output all to work with. It can also be handy to write a small Doctest.
We talked about writing a simple, obvious test case as a first Doctest, e.g. for the alpha_only() function that returns the alphabetic chars from a string, an input like '@Ax4**y', looking for output 'Axy'. That's fine. For debugging, sometimes it's nice to add a tiny test that still shows the bug, maybe '@A' - the loops and everything run so few times, there's less chaos to see through.
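The two test cases above might be written as Doctests like this. The body of alpha_only() shown here is an assumption, written from the one-line description, and may differ from the version done in class.

```python
def alpha_only(s):
    """
    Return a string made of the alphabetic chars in s.
    >>> alpha_only('@Ax4**y')
    'Axy'
    >>> alpha_only('@A')
    'A'
    """
    result = ''
    for i in range(len(s)):
        if s[i].isalpha():
            result += s[i]
    return result
```

If this is saved in a file, the Doctests can also be run from the command line with python3 -m doctest followed by the file name. No output means all the tests passed.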
Sometimes looking at the code to see how it produced the output is too hard! In that case try print() below.
This is a more rarely used technique. Instead of tracking the code in your head, add print() to see the state of the variables. It's nice to just have the computer show the state of the variables in the loop or whatever. This works on the experimental server and in PyCharm - demo below.
Note that return is the formal way for a function to produce a result. The print() function does not change that. Print() is a sort of side-channel of text, alongside the formal result. We'll study this in more detail when we do files.
The experimental server shows the function result first, then the print output below. This can also work in a Doctest. Be sure to remove the print() lines when done - they are temporary scaffolding while building.
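A minimal sketch of the difference between return and print(), using a hypothetical function double_it. The print() text appears on the screen, but only the returned value lands in the variable.

```python
def double_it(n):
    print('n is', n)  # side-channel text, temporary debugging aid
    return n * 2      # the formal result of the function


result = double_it(5)  # prints: n is 5
print(result)          # 10
```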
We're going to think about the little details of characters. Some characters have uppercase and lowercase versions, e.g. 'A' vs. 'a', and some chars just have the one form like '@'.
'a'  # Lowercase
'b'  # Lowercase
'A'  # Uppercase relative of 'a'
'@'  # Doesn't have case
s.upper()
s.lower()
s.isupper()
s.islower()
>>> # Return with all chars converted to upper form
>>> 'Kitten123'.upper()
'KITTEN123'
>>> 'Kitten123'.lower()
'kitten123'
>>>
>>> 'a'.upper()
'A'
>>> 'A'.upper()
'A'
>>> '@'.upper()
'@'
>>>
>>> 'A'.isupper()
True
>>> 'a'.isupper()
False
>>> 'a'.islower()
True
>>> '@'.islower()
False
>>> 'ab@'.islower()  # '@' ignored by .islower()
True
>>>
Recall - strings are "immutable", which means that once created they are not changed — no changing individual characters in the string and no adding or removing characters.
Suppose we have a variable storing a string: s = 'Hello'
Calling a function like s.upper() returns a new answer string, but the original string is always left unchanged. This is a very common point of confusion. The .upper() function is being called, so it's easy to get the impression that the string is changed.
We can write code in the interpreter to see this immutable vs. function call in action.
>>> s = 'Hello'
>>> s.upper()   # Returns uppercase form of s
'HELLO'
>>>
>>> s           # Original s unchanged
'Hello'
>>>
>>> s + '!!!'   # Returns + form
'Hello!!!'
>>>
>>> s           # Original s unchanged
'Hello'
>>>
x = change(x)
So how do you change a string variable? Each time we call a function to compute a changed string, use = to change the variable, say s, to point to the new string.
Say we have a string s, and want to change it to be uppercase and have '!!!' at its end. Here is code that works to change s, using the x = change(x) pattern.
>>> s = 'Kitten'
>>> s = s.upper()   # Compute upper, assign back to s
>>> s
'KITTEN'
>>> s = s + '!!!'
>>> s
'KITTEN!!!'
>>>
Mnemonic: x = change(x)
'A' vs. 'a'
The chars 'A' and 'a' are two different characters. This is called "case sensitive" and is the default behavior in the computer.
>>> 'A' == 'a'
False
>>> s = 'red'
>>> s == 'red'
True
>>> s == 'Red'   # Must match exactly
False
If we ask you to write some string code, and don't say anything about upper/lower case, assume it should be case-sensitive.
Computing something not case-sensitive means the logic treats uppercase and lowercase versions of a char as being equal. For example, if you are looking at a web page and search for the word 'dog', you would expect 'Dog' and 'DOG' to count as matches. That's "not case-sensitive" logic, and it's what regular people expect.
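One common way to get not case-sensitive logic is to convert both sides to lowercase before comparing. A minimal sketch, where the function name same_word is hypothetical:

```python
def same_word(a, b):
    """Return True if a and b are equal, ignoring upper/lower case."""
    return a.lower() == b.lower()


print('Dog' == 'dog')           # False, == is case-sensitive
print(same_word('Dog', 'dog'))  # True
print(same_word('DOG', 'dog'))  # True
print(same_word('cat', 'dog'))  # False
```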
only_ab(): Given string s. Return a string made of the 'a' and 'b' chars in s. The char comparisons should not be case sensitive, so 'a' and 'A' and 'b' and 'B' all count. Use the string .lower() function.
'aABBccc' -> 'aABB'
Strategy: write the code using s[i].lower() to look at the lowercase form of each char in s.
Here is the case-sensitive approach using boolean or, which detects only the chars 'a' and 'b'. Use this as a starting point.
def only_ab(s):
    result = ''
    for i in range(len(s)):
        if s[i] == 'a' or s[i] == 'b':
            result += s[i]
    return result
Here's an opportunity to demonstrate Debug technique-2 — look at the "got" output and the code that produced it.
In this case the output is:
only_ab('aaABBbccc') -> 'aab'
Expected output: 'aaABBb'
Look at the "got" output. We can see the code that produced it at the same time.
Key question: Where does the output first go wrong vs. the expected? In this case it fails to grab the first 'A'. Look at the code. Why is the 'A' missing?
In reality, when the code is not working, your thoughts are like "why is this stupid thing not working?" But as a practical matter, looking at the got output and the code that produced it is the path to fixing the code.
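One possible fix, sketched here as an assumption about where the exercise is heading: the 'A' is missing because the test compares s[i] itself against 'a' and 'b'. Comparing the lowercase form of each char instead lets both cases match, while the original char is what gets added to the result.

```python
def only_ab(s):
    result = ''
    for i in range(len(s)):
        low = s[i].lower()  # lowercase form, used only for the test
        if low == 'a' or low == 'b':
            result += s[i]  # add the original char, case preserved
    return result


print(only_ab('aaABBbccc'))  # aaABBb
```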
Here's another exercise involving upper/lower logic.
'12abc$z' -> 'ABCZ'
Given string s. Return a string made of all the alphabetic chars in s, converted to uppercase form.
Use string functions .isalpha() and .upper()
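A sketch of one way to write this exercise. The notes do not give the function a name, so alpha_upper below is a made-up name, and the loop follows the same result-building pattern as only_ab().

```python
def alpha_upper(s):
    """Return the alphabetic chars of s, converted to uppercase."""
    result = ''
    for i in range(len(s)):
        if s[i].isalpha():
            result += s[i].upper()
    return result


print(alpha_upper('12abc$z'))  # ABCZ
```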
reversed() Function
The Python built-in reversed() function returns a reversed form of a sequence, such as one from range().
Here is how reversed() alters the output of range():
range(5) -> 0, 1, 2, 3, 4
reversed(range(5)) -> 4, 3, 2, 1, 0
This fits into the regular for/i/range idiom to go through the same index numbers but in reverse order:
for i in reversed(range(5)):
    # i in here: 4, 3, 2, 1, 0
    ...
For more detail, see the guide Python range()
The reversed() function appears in part of homework-3.
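To see the numbers that reversed() produces, one option is to wrap it in list(). This is just a viewing aid for experimenting in the interpreter, not something the loop idiom requires.

```python
indexes = list(reversed(range(5)))
print(indexes)  # [4, 3, 2, 1, 0]

# Same idea with the index numbers of a string
s = 'Hello'
last_to_first = list(reversed(range(len(s))))
print(last_to_first)  # [4, 3, 2, 1, 0]
```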
Say we want to compute the reversed form of a string:
'Hello' -> 'olleH'
There are many ways to do this, and we might make a study of it later. Here is a plan for today:
1. Start with the regular double_char() code, but change it to add a single s[i] per iteration, so it makes a plain copy of the input string.
def reverse2(s):
    result = ''
    for i in range(len(s)):
        result += s[i]
    return result
2. Add reversed() to the loop: reversed(range(len(s)))
i goes through: 4, 3, 2, 1, 0
3. We have result += s[i] in the loop, and i is going through the indexes last to first. This adds the last char 'o', then the next to last char 'l', and so on until it gets to 'H'. So in effect, it builds a reversed version of the string.
Write the code with that plan, then see next section.
There's actually a whole section of reverse string problems we may play with later - trying out various techniques.
At this spot, we can look at Debug-3 technique — add print() inside the code temporarily to get visibility into what the code is doing. This works on the experimental server and in Doctests. The printing will make the Doctests fail, so it should only be in there temporarily.
Here's the reverse2() code with print() added:
def reverse2(s):
    result = ''
    for i in reversed(range(len(s))):
        result += s[i]
        print(i, s[i], result)
    return result
Here's what the output looks like in the experimental server - it shows the formal result first, and the print() output below that. This is kind of beautiful, revealing what's going on inside the loop:
'olleH'
4 o o
3 l ol
2 l oll
1 e olle
0 H olleH
This is a more rarely used technique, but it can be very powerful.
Suppose you have a bug and the code is not computing what it's supposed to. First you just look at the output and try to just see what the bug is. Sometimes that is enough. In your mind, you are thinking about what s[i] is going to be for each loop - a thought experiment.
However if you are staring at the code and cannot figure out the bug, you could put some print() calls in there and it will show you exactly what s[i] is for each run of the loop. This can be a very clarifying technique if you are not spotting the bug at first. Instead of using your brain to think what's going on with s[i], just let the computer show you.
This works with Doctests too - the printed output appears in the Doctest window. Unfortunately, the printed output interferes with the Doctest success/fail logic, causing it to always fail, even if the code is correct. So you can print() temporarily to see what's going on, but you need to remove it when you are done.
Today we will use this "movie" example and exercise: movie.zip
The movie-starter.py file is the code with bugs, and movie.py is a copy of that to work on, and movie-solution.py has the correct code.
"Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin."
-John von Neumann (early CS giant)
Computing random numbers with a computer turns out to be a real problem.
A computer program is "deterministic" - each time you run the lines with the same input, they do exactly the same thing.
>>> x = 6
>>>
>>> x = x + 1
>>> x
7
>>>
Every time the code runs, the answer is the same.
Repeatable - on a related note, our black-box functions are "repeatable" - calling a function with the same input returns the same output every time. e.g. 'hello'.upper() -> 'HELLO'
Creating random numbers with deterministic code and inputs is impossible, so we settle for pseudorandom numbers. These are numbers which are statistically random looking, but in fact are generated by a deterministic algorithm producing each "random" number in turn. Running the algorithm with the same inputs will yield the same "random" series of numbers again.
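The "same inputs, same series" behavior can be seen with Python's random module by setting the starting seed with random.seed(). In this sketch the helper five_numbers and the seed value 42 are arbitrary choices for illustration.

```python
import random


def five_numbers(seed):
    """Return 5 pseudorandom numbers produced after seeding with seed."""
    random.seed(seed)
    nums = []
    for _ in range(5):
        nums.append(random.randrange(10))
    return nums


first = five_numbers(42)
second = five_numbers(42)
print(first == second)  # True: same seed, same "random" series
```

When no seed is set by the program, Python picks a starting seed itself, which is why the series normally differs from run to run.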
Aside: it is possible to create true random numbers by measuring a random physical process - getting the randomness from outside the determinism of the computer. Someday I would like to run a seminar where we build such a device as a project.
>>>
For more details, see the Python Guide Interpreter chapter
Ways to get an interpreter (apart from the experimental server)
1. With an open PyCharm project, click the "Python Console" button at the bottom .. that's an interpreter.
2. In the command line for your computer, type "python3" ("py" on Windows), and that runs the interpreter directly. Use ctrl-d (ctrl-z on windows) to exit.
Keep in mind that there are two different places where you type commands - your computer command line where you type commands like "date" or "pwd". Then there's the Python interpreter with the >>> prompt where you type Python expressions.
Try the random module in an interpreter: the "Python Console" tab at the lower-left of your PyCharm window gives you one. This won't work right in the experimental server interpreter, so try PyCharm.
>>> import random    # hw3 starter code has this already
>>>
>>> random.randrange(10)
1
>>> random.randrange(10)
3
>>> random.randrange(10)
9
>>> random.randrange(10)
1
>>> random.randrange(10)
8
>>> random.choice('doofus')
'o'
>>> random.choice('doofus')
'u'
>>> random.choice('doofus')
'o'
>>> random.choice('doofus')
'o'
>>> random.choice('doofus')
's'
>>> random.choice('doofus')
's'
>>> random.choice('doofus')
'o'
>>> random.choice('doofus')
's'
The code for this one is provided to fill in letters at the right edge, so we'll just look at it. Demonstrates some grid code for the movie problem. We're not testing this one - testing random behavior is a pain, although it is possible.
def random_right(grid):
    """
    Set the right edge of the grid to some
    random letters from 'doofus'.
    (provided)
    """
    for y in range(grid.height):
        if random.randrange(10) == 0:  # 10% of the time
            ch = random.choice('doofus')
            grid.set(grid.width - 1, y, ch)
    return grid
Think about scroll_left()
def scroll_left(grid):
    """
    Implement scroll_left as in lecture notes.
    """
    # v1 - has bugs
    for y in range(grid.height):
        for x in range(grid.width):
            # Move char at x,y leftwards
            ch = grid.get(x, y)
            if ch != None and grid.in_bounds(x - 1, y):
                grid.set(x - 1, y, ch)
    return grid
Need concrete cases to write Doctest. They can be small! An input grid, and the expected output grid. That's what makes one test case - an input and expected. We could also call these "before" and "after" pictures.
Doctest input grid (before)
[['a', 'b', 'c'], ['d', None, None]]
Doctest expected grid (after)
[['b', 'c', None], [None, None, None]]
If the got differs in any little way from the expected, the test fails, e.g. having an extra space, or using " instead of the expected '.
# Don't write the expected output like this - will fail
[["b", "c", None], [None, None, None]]
# Write it exactly syntactically as the function returns it
[['b', 'c', None], [None, None, None]]
Run the Doctest to debug the code.
def scroll_left(grid):
    """
    Implement scroll_left as in lecture notes.
    >>> grid = Grid.build([['a', 'b', 'c'], ['d', None, None]])
    >>> scroll_left(grid)
    [['b', 'c', None], [None, None, None]]
    """
Here is the failed Doctest, compare output to expected:
Expected:
    [['b', 'c', None], [None, None, None]]
Got:
    [['b', 'c', 'c'], ['d', None, None]]
See that v1 fails to erase where the 'c' moved from.
How do you debug a function? Run its small, frozen, visible Doctests, look at the output, expected and the code - all of which the Doctest makes visible.
Here is the code with bugs fixed and the Doctest now passes.
def scroll_left(grid):
    """
    Implement scroll_left as in lecture notes.
    >>> grid = Grid.build([['a', 'b', 'c'], ['d', None, None]])
    >>> scroll_left(grid)
    [['b', 'c', None], [None, None, None]]
    """
    for y in range(grid.height):
        for x in range(grid.width):
            # Move letter at x,y leftwards
            ch = grid.get(x, y)
            if ch != None and grid.in_bounds(x - 1, y):
                grid.set(x - 1, y, ch)
            grid.set(x, y, None)
    return grid
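To try this function outside the course project, here is a sketch using a minimal stand-in for Grid. The Grid class below is an assumption written only for this example, supporting just the calls used above (build, get, set, in_bounds, width, height); the real course Grid may differ. Note that the erasing grid.set(x, y, None) sits outside the if, so every square is cleared after it is looked at, including squares in the leftmost column that have nowhere to move.

```python
class Grid:
    """Minimal stand-in for the course Grid class (an assumption)."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]
        self.height = len(rows)
        self.width = len(rows[0])

    @staticmethod
    def build(rows):
        return Grid(rows)

    def in_bounds(self, x, y):
        return 0 <= x < self.width and 0 <= y < self.height

    def get(self, x, y):
        return self.rows[y][x]

    def set(self, x, y, val):
        self.rows[y][x] = val

    def __repr__(self):
        return repr(self.rows)


def scroll_left(grid):
    for y in range(grid.height):
        for x in range(grid.width):
            ch = grid.get(x, y)
            if ch != None and grid.in_bounds(x - 1, y):
                grid.set(x - 1, y, ch)
            grid.set(x, y, None)  # erase the square just processed
    return grid


grid = Grid.build([['a', 'b', 'c'], ['d', None, None]])
print(scroll_left(grid))  # [['b', 'c', None], [None, None, None]]
```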
$ python3 movie.py
$
$ python3 movie.py 80 40   # bigger window
To debug, we want output which is: small, frozen, and visible
The Doctest gives us exactly this.
Looking at the failing Doctest, we have the got output, the expected output, and the code - looking at these three is a good step for debugging.
The failing Doctest is like a to-do item — what is the first bit of the got that is wrong? What line produced that?
Note also that the data for the Doctest case is small and made visible by the system. It's not moving around. We can take our time. Contrast this to watching the animation.
Here is an additional Movie example for more practice.
This example function sets all the squares on the left edge to 'a', and also all the squares on the right edge.
Implement set_edges(), then write Doctests for it. We're not doing this in lecture, but it's an example.
def set_edges(grid):
    """
    Set all the squares along the left edge (x=0) to 'a'.
    Do the same for the right edge.
    Return the changed grid.
    """
    pass
Solution code:
    ...
    for y in range(grid.height):
        grid.set(0, y, 'a')               # left edge
        grid.set(grid.width - 1, y, 'a')  # right edge
    return grid
Q: How can we tell if that code works? With our image examples, at least you could look at the output, although that was not a perfect solution either. Really we want to be able to write a test for a small case with visible data.
Here's a visualization - before and after - of grid and how set_edges() modifies it.
Here are the key 3 lines added to set_edges() that make the Doctest: (1) build a "before" grid, (2) call fn with it, (3) write out the expected result of the function call
    ...
    >>> grid = Grid.build([['b', 'b', 'b'], ['x', 'x', 'x']])
    >>> set_edges(grid)
    [['a', 'b', 'a'], ['a', 'x', 'a']]
    ...