Today: loose ends, better/shorter code, flex-arrow example
#
Use comments to fill in information that is useful or interesting and not clear from the code itself. These are not needed too often, since frequently the code is readable enough on its own with good function and variable names.
1. Avoid repeating what the code says. Don't do this:
i = i + 1 # Add 1 to i
2. Could mention what a line accomplishes - not so obvious sometimes - describing the higher-level goal of the code:
# Increase i to the next multiple of 10
i = 10 * (i // 10 + 1)
3. Could mention the goal of a group of lines, framing what goal the next 8 lines accomplish. Can also use blank lines to separate logical stages from each other.
# Advance i to end of email address
i = at + 1
while i < len(s) and ....
    ...
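As a small sketch of point 2, here is the next-multiple-of-10 line wrapped in a function so it can be tried on a few values. The function name next_multiple_of_10 is made up for this illustration. Notice that the comment states the goal, which is not obvious from the arithmetic alone.

```python
def next_multiple_of_10(i):
    """Hypothetical helper: return the smallest multiple of 10 greater than i."""
    # i // 10 counts the complete tens in i; one more ten is the next multiple
    return 10 * (i // 10 + 1)


print(next_multiple_of_10(7))    # 10
print(next_multiple_of_10(23))   # 30
print(next_multiple_of_10(30))   # 40 - an exact multiple still advances
```

The last case is the sort of detail a goal-level comment can call out: a value already on a multiple of 10 moves up to the next one.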
is, Not Same as ==
is None
1. Say we have a word variable. The following if-statement will work perfectly for CS106A, and you probably wrote it that way before, and it's fine and we will never mark off for it.
if word == None:   # Works fine, not PEP8
    print('Nada')
2. However, there is a very old rule in PEP8 that comparisons to the value None should be written with the is operator. This is an awkward rule, but we are stuck with it. You may have noticed PyCharm complaining about the above form, so you can write it as follows and it works correctly and is PEP8:
if word is None:       # Works fine, PEP8
    print('Nada')

if word is not None:   # "is not" variant
    print('Not Nada')
Very important limitation:
3. The is operator is similar to ==, but actually does something different for most data types, like strings and ints and lists. For the value None, the is operator is reliable. Therefore, the rule is:
Never use the is operator for values other than None.
Never, never, never, never.
Only use is with None as above. If you use it with other values, it will lead to horrific, weekend-ruining bugs.
There's a longer explanation about this awkward "is" rule in the guide page above.
The short story is that the is operator computes whether two values occupy the same bytes in memory. It is extremely rare for an algorithm to need to compute that, so it is unfortunate that PEP8 forces the is operator into normal looking code.
>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> # b looks like a, but is separate
>>> a
[1, 2, 3]
>>> b
[1, 2, 3]
>>>
>>> a == b   # == returns True
True
>>>
>>> a is b   # "is" False - not same mem
False
>>>
>>> c = a    # c - same memory as a
>>> a is c   # "is" True case
True
>>>
See Python copy/is chapter for the details.
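Here is a minimal sketch of the rule in action, using strings instead of lists. The describe() function is a made-up helper for this illustration. The result of the "is" line is deliberately left unstated: it depends on how the Python implementation happens to store the two strings, which is exactly why we avoid it.

```python
def describe(word):
    """Hypothetical helper: report on word, which may be None."""
    if word is None:      # the one PEP8-approved use of "is"
        return 'Nada'
    if word == '':        # ordinary values are compared with ==
        return 'Empty'
    return 'Not Nada'


a = 'abcd'
b = ''.join(['ab', 'cd'])   # same chars as a, but built at run-time
print(a == b)               # True - the chars match
print(a is b)               # depends on the Python implementation - do not rely on it
print(describe(None), describe(''), describe('hi'))
```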
Not showing any new python, but showing moves and patterns you may not have thought of to shorten and clean up your code.
Many of these techniques involve leveraging variables to clean up the code.
I'm going to show you a v1 form of the code that is suboptimal and then how to write it better. You could go down this path by accident easily enough, putting in the code for case 1, case 2, .. not seeing how to generalize them.
first_n(s, n): Given a string s and an int n. The n parameter will be one of 1, 2, 3, 4. Return a string of the first n chars of s. If s is not long enough to provide n chars, return None.
'Python', 1 -> 'P'
'Python', 2 -> 'Py'
'Python', 3 -> 'Pyt'
'Python', 4 -> 'Pyth'
'Py', 3 -> None
Here is v1 which works perfectly, but is not the best.
def first_n(s, n):
    if len(s) < n:
        return None

    if n == 1:
        return s[:1]
    if n == 2:
        return s[:2]
    if n == 3:
        return s[:3]
    if n == 4:
        return s[:4]
Note: we don't need an if/elif here, since the "return" exits the function when an if-test succeeds.
When n is 1, we want a 1 in the slice. When n is 2, we want a 2 in the slice. How can we get that effect?
The variable n points to one of the values 1, 2, 3, 4 when this function runs.
Therefore, in your code, if there is a spot where you need a value that appears in a variable or parameter, you can just literally write that variable there.
def first_n(s, n):
    if len(s) < n:
        return None
    return s[:n]
If n is 1, it uses 1 in the slice. If n is 2, it uses 2, and so on. Use the variable itself, knowing that at run-time, Python will evaluate the variable, pasting in whatever value it points to.
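A quick sketch to try the shorter version. One side benefit worth noticing: since n is used directly, the code is no longer tied to the values 1 through 4. The loop below also tries 6 and 7, which go beyond the problem statement and are included only as an experiment.

```python
def first_n(s, n):
    if len(s) < n:
        return None
    return s[:n]


# 6 and 7 are outside the stated 1..4 range, tried here as an experiment
for n in [1, 2, 3, 4, 6, 7]:
    print(n, first_n('Python', n))
```

The v1 code would fall off the end of the function for n of 6, while the unified version returns 'Python', and None for 7 since the string is too short.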
We mentioned this strategy before. It's a common CS106A technique.
Your algorithm has a few values flowing through it. Pick a value out and put it in a well-named variable. Use the variable on the later lines.
1. Helps readability - the named variable helps all the later lines read out what they do.
2. In the style of divide-and-conquer, computing a value that is part of the solution and storing it in a variable - you have picked off and solved a smaller part of the problem, and then put that behind you to work on the next thing.
Q: Given a list lst, what is the index of the first element in the list?
A: 0
Q: Given a list lst, what is the index of the last element in the list?
A: len(lst) - 1
For example, if the length is 10, the first element is at 0, the last element is at 9. This is just zero-based indexing showing up again.
Essentially:
nums = list of numbers
nums[0] -> low bound
nums[len(nums) - 1] -> high bound
Adjust numbers outside of low..high range to be in range.
[0, 7, -2, 12, 3, -3, 5] -> [0, 5, 0, 5, 3, 0, 5]
Given nums, a list of 1 or more numbers. We'll say the first number in the list is the desired lower bound (low), and the last number is the desired upper bound (aka high). Change the list so any number less than low is changed to low, and likewise for any number greater than high. Return the changed list. Use for/i/range to loop over the elements in the list. Use an add-var strategy to introduce variables for the bounds.
Here is a v1 solution that works, but I would never write it this way.
def low_high(nums):
    for i in range(len(nums)):
        if nums[i] < nums[0]:
            nums[i] = nums[0]
        if nums[i] > nums[len(nums) - 1]:
            nums[i] = nums[len(nums) - 1]
    return nums
This code is not especially readable. The many details on each line make it hard to see what is going on.
Introduce variables low and high to store those intermediate values, using the variables on the later lines. This is much more readable.
def low_high(nums):
    low = nums[0]
    high = nums[len(nums) - 1]
    for i in range(len(nums)):
        if nums[i] < low:
            nums[i] = low
        if nums[i] > high:
            nums[i] = high
    return nums
Which is easier to write without bugs, knowing that you might sometimes forget a -1 or get < and > backwards?
v1:
if nums[i] > nums[len(nums) - 1]:
    nums[i] = nums[len(nums) - 1]
v2:
if nums[i] > high:
    nums[i] = high
The version building on the variable is (a) readable and (b) easier to write correctly the first time. Readability is not just about reading, it's about writing it with fewer bugs as you are typing.
On our lecture examples we do this constantly - pulling some intermediate value into a well-named variable to use on later lines.
Your computation has natural partial results. Storing these in variables with good names makes better code.
Say your code is not working, and you are trying to fix it, and you are running out of ideas. There is some long line of code, and you are not sure if it is right. Try pulling a part of that computation out, looking at just that bit carefully, and storing it in a variable.
What was a long horizontal line, you stretch out vertically into an increased number of shorter lines, better able to concentrate on one part at a time.
Notice that v1 is short but wide. In contrast, v2 is longer because of the 2 added lines to set the variables, but more narrow since the later lines have less on them.
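Once low and high are in variables, other ways of writing the loop body become easy to see. The sketch below is one alternative, not the lecture's version: it swaps the two if-statements for Python's built-in min() and max(). It assumes low is not greater than high, and the results can differ from the if-statement version if that assumption does not hold.

```python
def low_high(nums):
    low = nums[0]
    high = nums[len(nums) - 1]
    for i in range(len(nums)):
        # Assumes low <= high: min() caps the value at high,
        # then max() lifts it to at least low
        nums[i] = max(low, min(high, nums[i]))
    return nums


print(low_high([0, 7, -2, 12, 3, -3, 5]))
```

With the example list from the problem statement, this prints [0, 5, 0, 5, 3, 0, 5]. Writing this form without the low and high variables would be much harder to read.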
Say we want to set alarm differently for weekends, something like this:
if not is_weekend:
    alarm = '9:00 am'
else:
    alarm = 'off'
The above code is fine. Here I will propose a slightly shorter way that we will use often. (1) Initialize (set) the variable to its common, default value first. (2) Then an if-statement detects the case where the var should be set to a different value.
alarm = '9:00 am'
if is_weekend:
    alarm = 'off'
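To see that the two forms compute the same thing, here is a small sketch that wraps each in a function and tries both values of is_weekend. The function names alarm_v1 and alarm_v2 are made up for this comparison.

```python
def alarm_v1(is_weekend):
    # if/else form: each branch sets the variable
    if not is_weekend:
        alarm = '9:00 am'
    else:
        alarm = 'off'
    return alarm


def alarm_v2(is_weekend):
    # default-first form: set the common value, then override
    alarm = '9:00 am'
    if is_weekend:
        alarm = 'off'
    return alarm


for is_weekend in [False, True]:
    print(is_weekend, alarm_v1(is_weekend), alarm_v2(is_weekend))
```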
if case-1:
    lines-a
    ...
    ...
if case-2:
    lines-b
    ...
    ...
grounded(minutes, is_birthday): Given that you came home minutes late, how many days are you grounded? If minutes is an hour or less, grounding is 5, otherwise 10. Unless it is your birthday, then 30 extra minutes are allowed. Challenge: change this code to be shorter, not have so much duplicated code.
The code below works correctly. You can see there is one set of lines for the birthday case, and another set of similar lines for the not-birthday case. What exactly is the difference between these two sets of lines?
def grounded(minutes, is_birthday):
    if not is_birthday:
        if minutes <= 60:
            return 5
        return 10

    # is birthday
    if minutes <= 90:
        return 5
    return 10
1. Set limit first.
2. Then unified lines below use limit, work for all cases.
def grounded(minutes, is_birthday):
    limit = 60
    if is_birthday:
        limit = 90

    if minutes <= limit:
        return 5
    return 10
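When shortening code like this, it is reassuring to check that the new version behaves the same as the old one. Below is a minimal sketch of such a check. The names grounded_v1 and grounded_v2 and the choice of testing 0 through 120 minutes are made up for this illustration.

```python
def grounded_v1(minutes, is_birthday):
    if not is_birthday:
        if minutes <= 60:
            return 5
        return 10

    # is birthday
    if minutes <= 90:
        return 5
    return 10


def grounded_v2(minutes, is_birthday):
    limit = 60
    if is_birthday:
        limit = 90

    if minutes <= limit:
        return 5
    return 10


# Collect every input where the two versions disagree
mismatches = []
for is_birthday in [False, True]:
    for minutes in range(121):
        if grounded_v1(minutes, is_birthday) != grounded_v2(minutes, is_birthday):
            mismatches.append((minutes, is_birthday))

print('mismatches:', mismatches)
```

An empty mismatches list means the two versions agree on every input tried, including the boundary values 60 and 90.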
ncopies: word='bleh' n=4 suffix='@@' -> 'bleh@@bleh@@bleh@@bleh@@'
ncopies: word='bleh' n=4 suffix='' -> 'bleh!bleh!bleh!bleh!'
ncopies(word, n, suffix): Given word string, int n, suffix string, return n copies of word + suffix. If suffix is the empty string, use '!' as the suffix. Challenge: change this code to be shorter, not have so many distinct paths.
The problem starts with the version below as a starting point. Look at lines that are similar - make a unified version of those lines using a variable, as above.
Before:
def ncopies(word, n, suffix):
    result = ''

    if suffix == '':
        for i in range(n):
            result += word + '!'
    else:
        for i in range(n):
            result += word + suffix
    return result
Solution: use logic to set an ending variable to hold what goes on the end for all cases. Later, unified code uses that variable vs. separate if-stmt for each case. Alternately, could use the suffix parameter as the variable, changing it to '!' if it's the empty string.
def ncopies(word, n, suffix):
    result = ''
    ending = suffix
    if ending == '':
        ending = '!'

    for i in range(n):
        result += word + ending
    return result
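With the ending variable in place, there is only one path left to maintain, so further shortening is possible. The sketch below is one optional variation, not the lecture's solution: it replaces the loop with Python's string * operator, which repeats a string n times.

```python
def ncopies(word, n, suffix):
    ending = suffix
    if ending == '':
        ending = '!'

    # Variation on the loop: str * int repeats the string n times
    return (word + ending) * n


print(ncopies('bleh', 4, '@@'))
print(ncopies('bleh', 4, ''))
```

This variation only works because the if-logic was unified first. With two separate loops, there would be two places to change.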
> match()
match(a, b): Given two strings a and b. Compare the chars of the strings at index 0, index 1 and so on. Return a string of all the chars where the strings have the same char at the same position. So for 'abcd' and 'adddd' return 'ad'. The strings may be of any length. Use a for/i/range loop. The starter code works correctly. Re-write the code to be shorter.
match():
 'abcd'
 'adddd' -> 'ad'
  01234
Code before unify:
def match(a, b):
    result = ''
    if len(a) < len(b):
        for i in range(len(a)):
            if a[i] == b[i]:
                result += a[i]
    else:
        for i in range(len(b)):
            if a[i] == b[i]:
                result += a[i]
    return result
def match(a, b):
    result = ''
    # Set length to whichever is shorter
    length = len(a)
    if len(b) < len(a):
        length = len(b)

    for i in range(length):
        if a[i] == b[i]:
            result += a[i]
    return result
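As an optional variation on the unified version, the three lines that set length can be written with Python's built-in min() function, which returns the smaller of its arguments. This is a sketch of an alternative, not the lecture's solution. The loop is unchanged.

```python
def match(a, b):
    result = ''
    # min() picks whichever length is shorter
    length = min(len(a), len(b))

    for i in range(length):
        if a[i] == b[i]:
            result += a[i]
    return result


print(match('abcd', 'adddd'))   # ad
```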
If we have time, a fun bit of drawing code.
Download the flex-arrow.zip to work this fun little drawing example.
Recall: floating point values passed into draw_line() etc. work fine.
Ultimately we want to produce this output:
The "flex" parameter is 0..1.0: the fraction of the arrow's length used for the arrow heads. The arms of the arrow will go at a 45-degree angle away from the horizontal.
Specify flex on the command line so you can see how it works, here calling the version with the code completed. Close the window to exit the program. You can also specify larger canvas sizes.
$ python3 flex-arrow-solution.py -arrows 0.25
$ python3 flex-arrow-solution.py -arrows 0.15
$ python3 flex-arrow-solution.py -arrows 0.1 1200 600
Look at the draw_arrow() function. It is given x,y of the left endpoint of the arrow and the horizontal length of the arrow in pixels. The "flex" number is between 0 .. 1.0, giving the head_len - the horizontal extent of the arrow head - called "h" in the diagram. Main() calls draw_arrow() twice, drawing two arrows in the window.
This starter code has the first half of the drawing done.
def draw_arrow(canvas, x, y, length, flex):
    """
    Draw a horizontal line with arrow heads at both ends.
    Its left endpoint at x,y, extending for length pixels.
    "flex" is 0.0 .. 1.0, the fraction of length that the arrow
    heads should extend horizontally.
    """
    # Compute where the line ends, draw it
    x_right = x + length - 1
    canvas.draw_line(x, y, x_right, y)

    # Draw 2 arrowhead lines, up and down from left endpoint
    head_len = flex * length
    # what goes here?
With head_len computed - what are the two lines to draw the left arrow head? This is a nice visual algorithmic problem.
Code to draw left arrow head:
# Draw 2 arrowhead lines, up and down from left endpoint
head_len = flex * length
canvas.draw_line(x, y, x + head_len, y - head_len)  # up
canvas.draw_line(x, y, x + head_len, y + head_len)  # down
Here is the diagram again.
Add the code to draw the head on the right endpoint of the arrow. The head_len variable is "h" in the drawing. This is a solid, CS106A applied-math exercise. In the function, the variable x_right is the x coordinate of the right endpoint of the line: x_right = x + length - 1
When that code is working, this should draw both arrows (or use the flex-arrow-solution.py):
$ python3 flex-arrow.py -arrows 0.1 1200 600
def draw_arrow(canvas, x, y, length, flex):
    """
    Draw a horizontal line with arrow heads at both ends.
    Its left endpoint at x,y, extending for length pixels.
    "flex" is 0.0 .. 1.0, the fraction of length that the arrow
    heads should extend horizontally.
    """
    # Compute where the line ends, draw it
    x_right = x + length - 1
    canvas.draw_line(x, y, x_right, y)

    # Draw 2 arrowhead lines, up and down from left endpoint
    head_len = flex * length
    canvas.draw_line(x, y, x + head_len, y - head_len)  # up
    canvas.draw_line(x, y, x + head_len, y + head_len)  # down

    # Draw 2 arrowhead lines from the right endpoint
    canvas.draw_line(x_right, y, x_right - head_len, y - head_len)  # up
    canvas.draw_line(x_right, y, x_right - head_len, y + head_len)  # down
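To check the coordinate math without opening a window, here is a sketch that computes the same five lines as plain numbers. The arrow_lines() function is a made-up helper for this illustration and is not part of flex-arrow.py. Each tuple holds the four numbers that would be passed to draw_line().

```python
def arrow_lines(x, y, length, flex):
    """
    Hypothetical helper: return the 5 lines of the arrow as
    (x1, y1, x2, y2) tuples, same math as draw_arrow().
    """
    x_right = x + length - 1
    head_len = flex * length
    return [
        (x, y, x_right, y),                              # main line
        (x, y, x + head_len, y - head_len),              # left head, up
        (x, y, x + head_len, y + head_len),              # left head, down
        (x_right, y, x_right - head_len, y - head_len),  # right head, up
        (x_right, y, x_right - head_len, y + head_len),  # right head, down
    ]


for line in arrow_lines(0, 100, 100, 0.25):
    print(line)
```

Notice the symmetry in the numbers: the left head adds head_len to x, the right head subtracts it from x_right, and both use y - head_len for up and y + head_len for down.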
Now we're going to show you something a little beyond the regular CS106A level, and it's a teeny bit mind warping.
Run the code with -trick like this, see what you get.
$ python3 flex-arrow-solution.py -trick 0.1 1000 600