Today: ethics (privacy), program style (readability and decomposition), bits and bytes

Ethics - Privacy

By "privacy" here we're referring to individuals vs. the government. (Thanks to Ethics Fellow Wanheng Hu for feedback.)

Encryption Technology - SMS vs. E2E

1. SMS is a traditional setup. Alice and Bob have a key, but Verizon also has the key. The message is encrypted in transit, but Verizon has a copy. (The keys may be per-hop, but the essential feature is that Verizon has a copy.)

2. E2E. Alice and Bob both have a key, and Verizon does not. Thus Verizon only sees the ciphertext. Essential point: if the government asks for the plaintext, Verizon does not have it.
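
As a rough illustration of the E2E idea, here is a toy sketch that uses a simple XOR with a random key in place of real encryption. The names xor_bytes and shared_key are made up for this example, and real messaging apps use far more sophisticated schemes. The point is only that whoever relays the ciphertext cannot read it without the key.

```python
import secrets

def xor_bytes(data, key):
    """Toy cipher (not real-world crypto): XOR each data byte with a key byte."""
    return bytes(d ^ k for d, k in zip(data, key))

message = b'meet at noon'

# Alice and Bob share this key. In the E2E setup the carrier never has it.
shared_key = secrets.token_bytes(len(message))

# Alice encrypts. The carrier relays (and could store) only the ciphertext.
ciphertext = xor_bytes(message, shared_key)

# Bob, holding the same key, recovers the plaintext.
recovered = xor_bytes(ciphertext, shared_key)
print(recovered)
```

In the SMS-style setup, the carrier would also hold a key and so could produce the plaintext on request.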

Encryption Technology - Phones, Hard Drives

The files on phones are typically encrypted, only unlocked by the owner's PIN or fingerprint/face-id unlock. Likewise, an external hard drive can be encrypted with a user's password. Such encryption is effective, even in the face of law-enforcement efforts.

Ethic: Have Some Privacy

Allowing people some privacy is good for society.

Short answer: tolerance. Giving people some privacy helps give them some individual freedom, even in the face of intolerance. In computer terms, privacy could be described as a "hack" which helps get a sort of tolerance.

Privacy is not a black-and-white issue. You do not want 0% or 100% privacy. This gets back to the dual-use pattern — we have both sympathetic and unsympathetic users of privacy, so we end up with a compromise of "some", but not 100%, privacy.

Privacy - Sympathetic examples

The terror group ISIS was very hostile to gay people. That a gay person living under ISIS can keep their phone and messages encrypted, out of ISIS's reach, seems good. Note also the de-facto tolerance angle. Another example: a dissident smuggling their memoirs out of an authoritarian regime.

If we just look at these examples, privacy looks great. But unfortunately there are just as many unsympathetic examples.

Privacy - Unsympathetic examples

Criminals are well aware that encryption can protect their chats, data, etc. One man, an alleged pedophile, refused to unlock his encrypted hard drive. The courts kept him in jail for years, and eventually he was released. The legality of this situation is currently debated in the US. Does the 5th amendment right against self-incrimination apply to one's phone?

The Nth Room case of blackmail and cybersex trafficking on (encrypted) Telegram.

Aside - Crazy "Phone For Criminals" Story

There was a privacy-focussed phone, marketed to criminals. It turned out to be an FBI front, which used the information for convictions. For an entertaining hour, check out the Search Engine podcast episode "Best Phone For Crimes". Evidence was primarily not used against US citizens, I suspect because its collection violated the limited-government need for a warrant, which is perhaps an example of the system working as intended.

Compromise - Some Privacy

So we end up with a compromise where individuals have some, but not absolute, privacy.

US History - limited government. Includes limitations on the government spying on citizens. Compromise: the government needs a warrant, based on probable cause, to get info.

Edward Snowden and PRISM: his leaks revealed that the US was spying on citizens to some degree.

In contrast, in the crime-phone story above, they did not pursue US citizens with the info. Here the limited-government rules seemed to be followed.

E2E vs. Warrant

Note that the technology of end-to-end encryption short-circuits the warrant system. Verizon does not have the data to give.

Law Enforcement Back-Door Requests

Law enforcement has lobbied for a "back door" to be added to encryption, where trusted parts of the government can, say, decrypt anyone's phone. Apple/Google argue convincingly that any such backdoor will then be used by ISIS, Russia etc. etc. The current state in the US is that there is no back door.

History Note - Democracy vs. Authoritarian

Democracy was increasing 1945-2000, but now Authoritarianism seems to be on the rise. I suspect this is temporary, and Democracy will again increase. But who knows, perhaps this is my own wishful thinking? This will be an interesting arc of history that coincides with your adult life, see what happens.

Note China, North Korea, Iran: WhatsApp is illegal in all these countries. Authoritarian governments do not like to extend privacy to their citizens. I think citizens flourish more in democracies, and that's where I want to live.

Big Picture - The Truth About Software

We'll start at the highest level, seeing the truisms that guide software building. There's software in everything, so you should know the lay of the land.

Goal #1 - Code That Computes the Correct Answer

The main thing we want from code. If code produces the wrong answer, do we really care how fast it runs?

Problem: Natural State of Code = Broken

"Broken" is the natural state of code. It's easy to type in some code, and have it not work. We need a plan to work in this environment. Code can work so nicely, but we should keep in mind that it can fail to work even more easily.

Can You Judge Code By Looking at It?

Can you judge code correctness by looking at it? The surprising answer is - no. To really judge, you need to simulate what the loops and if-statements will do with various inputs. In effect, you need to run the code to see what it does.

How To Judge - Run Tests

We need to run the code against a few inputs, checking the output for each case. If the code works against a few cases, that suggests it is probably correct. It is not a 100% proof, which is surprisingly difficult or impossible to obtain, but tests are very good in practice.
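
Here is a minimal sketch of that idea. The function count_vowels and its test cases are made up for this example. Each case pairs an input with the output we expect, and the loop checks them all.

```python
def count_vowels(s):
    """Return the number of vowels in s, ignoring case."""
    count = 0
    for ch in s.lower():
        if ch in 'aeiou':
            count += 1
    return count

# Each case: (input, expected output). Include easy and edge cases.
cases = [
    ('', 0),
    ('xyz', 0),
    ('Apple', 2),
    ('banana', 3),
]

for text, expected in cases:
    assert count_vowels(text) == expected

print('all cases passed')
```

Passing these cases does not prove the function correct for every input, but a handful of varied cases catches most everyday mistakes.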

Corollary: Code Not Run is Probably Buggy

Code that the computer has never run likely has bugs in it.

This can happen if an if-test is always false in a program. This happened with the AT&T phone network, where there was some code in the phone-switching system like this.

if rare_error_condition:
    code to
    route around     # un-noticed bug here
    error condition

The error handling code within the if-statement had a simple bug in it, but those lines had never run, so nobody noticed. Until one day the if-statement was true and the code ran (for the first time) and crashed, taking out a part of the US phone system for a while.

Code tests can help with this. There are modern "code coverage" tools that look at all the tests, making sure that every line has been run in some test or other.
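
A small sketch of the same pattern in Python. The function pick_route is hypothetical, invented for this example, and is not the actual phone-switching code. Its rarely-run branch contains a typo that Python only notices when that line actually runs.

```python
def pick_route(main_line_ok):
    """Hypothetical routing function with a bug in its rarely-run branch."""
    backup_line = 'backup'
    if not main_line_ok:
        return backup_lin   # typo: this name does not exist
    return 'main'

# The common case works fine, so the bug goes unnoticed.
assert pick_route(True) == 'main'

# A test that forces the rare branch to run exposes the bug right away.
try:
    pick_route(False)
    found_bug = False
except NameError:
    found_bug = True

print(found_bug)
```

A coverage tool would flag that the branch never ran under the first test alone, which is the hint to write the second one.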

Goal #2 - Clean Code

Clean code with good style. This helps reduce bugs in the first place, and it's easier to fix and add features to code that is already clean. Stanford has always put an emphasis on writing clean code with good style.

Goal #3 - Run Fast

If the code works correctly and looks good, we might also want to tune it to run fast or use less memory. For some bits of code, speed is crucial. However, the best strategy is generally getting the code working first before messing with it for maximum performance.

Program Design Strategy

Why is code written the way it is? Today we tell the outside, strategic story, driving what forms of code work best.

For lecture, we'll look over the three Python guide chapters on style, readability, and decomposition.

Python Guide: PEP8 Tactics (mostly did this one in an earlier lecture)

Python Guide: Readable Code

Python Guide: Decomposition

Extra topic for fun if we have time.

Bits and Bytes

At the smallest scale in the computer, information is stored as bits and bytes. In this section, we'll look at how that works.

Bit

A "bit" is like an atom: the smallest unit of storage
A bit stores just a 0 or 1
"In the computer it's all 0's and 1's" ... bits
Anything with two separate states can store 1 bit
Nick's tennis racket example
In a chip: electric charge = 0/1
In a hard drive: spots of North/South magnetism = 0/1
A bit is too small to be much use
Group 8 bits together to make 1 byte

Byte

One byte = grouping of 8 bits
e.g. 0 1 0 1 1 0 1 0
One byte can store one roman character, e.g. 'A' or 'x' or '$'

How Many Patterns With N Bits?

How many different patterns can be made with 1, 2, or 3 bits?

Number of bits	Different Patterns
1	0 1
2	00 01 10 11
3	000 001 010 011 100 101 110 111

Compare 3 bits vs. 2 bits
Consider just the leftmost bit
It can only be 0 or 1
Leftmost bit is 0, then append 2-bit patterns
Leftmost bit is 1, then append 2-bit patterns again
Result ... 3-bits has twice as many patterns as 2-bits
Every row - double the number of patterns of previous row
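
The doubling argument can be sketched directly in code. The function name bit_patterns is made up for this example. Each pass through the loop takes the existing patterns and builds the next row by putting a 0, then a 1, on the left of each one.

```python
def bit_patterns(n):
    """Return a list of all n-bit patterns as strings, for n >= 1."""
    patterns = ['0', '1']
    for _ in range(n - 1):
        # Leftmost bit 0 + every existing pattern,
        # then leftmost bit 1 + every existing pattern again.
        patterns = ['0' + p for p in patterns] + ['1' + p for p in patterns]
    return patterns

for n in [1, 2, 3]:
    patterns = bit_patterns(n)
    print(n, len(patterns), ' '.join(patterns))
```

Each added bit runs the loop once more, so the list doubles in length each time.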

In general: add 1 bit, double the number of patterns
1 bit -> 2 patterns
2 bits -> 4 patterns
3 bits -> 8 patterns
n bits -> 2ⁿ patterns (2 to the nth power)
The number of patterns is exponential in the number of bits
Few things in life are exponential!
Compound interest (note: try to put something in 401k when young)
Spread of novel pathogen in population
Exponential growth is so fast, it is unintuitive

Number of bits	Number of Patterns
1	2
2	4
3	8
4	16
5	32
6	64
7	128
8	256

One Byte - 256 Patterns

1 byte is a group of 8 bits
8 bits can make 256 different patterns
How to store an int number in 1 byte?
Each number gets its own pattern
e.g. binary pattern 110 is the int 6, but we're not doing the details of that today
Imagine assigning each number its own pattern, starting with 0:
0 = pattern 1
1 = pattern 2
2 = pattern 3
...
254 = pattern 255
255 = pattern 256
There are 256 possible patterns, so 255 is the max int stored in one byte
pixel.red takes in a number 0..255, why?
The red/green/blue numbers of a pixel are each stored in one byte
That's why it's 0..255
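
For the curious, a short sketch of the one-byte range in Python. It relies on Python's standard binary conversion, which goes a little beyond what the notes cover today. The variable names are made up for this example.

```python
# format(n, '08b') shows the int n as an 8-digit binary pattern.
for n in [0, 1, 2, 254, 255]:
    print(n, format(n, '08b'))

# 8 bits give 2 ** 8 patterns. Counting starts at 0, so the max is one less.
num_patterns = 2 ** 8
max_value = num_patterns - 1
print(num_patterns, max_value)

# Python's bytes type holds one-byte values, so 256 is rejected.
try:
    bytes([256])
    fits = True
except ValueError:
    fits = False

print(fits)
```

The bytes type rejects 256 for the same reason pixel.red stops at 255: every pattern in the byte is already used for 0..255.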

"HDR" Image

HDR = High Dynamic Range - more than 256 values
HDR uses 10 bits per color
How many more colors is that than 8 bits? 258 colors?
No it's exponential, doubling with each bit
8 bits = 256 colors
9 bits = 512 colors
10 bits = 1024 colors - HDR
10 bit HDR is maybe close to the human perceptual limit anyway
Uses a little more space, looks better
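
The arithmetic here is quick to check in Python. This sketch assumes three color channels per pixel (red, green, blue), as with the pixels earlier. The variable names are made up for this example.

```python
# Values available per color channel at each bit depth
for bits in [8, 9, 10]:
    levels = 2 ** bits
    print(bits, 'bits ->', levels, 'values per color')

# Going from 8 to 10 bits adds 2 bits, so it doubles twice.
ratio = 2 ** 10 // 2 ** 8
print(ratio)

# Total red/green/blue combinations, assuming 3 channels per pixel
total_8bit = (2 ** 8) ** 3
total_10bit = (2 ** 10) ** 3
print(total_8bit, total_10bit)
```

So two extra bits per channel give 4 times as many values per color, not 2 more.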

Future Image Format: AVIF

JPEG is 8 bits per color
And its compression is not the best by modern standards
Next gen format: AVIF
Free/open standard, like JPEG
Compresses much better
10, 12 bit HDR supported
Alternative format: HEIF, heavily patented (Apple)
Data formats where you need a patent license to be permitted to look at the data are probably not a good idea
HTML, HTTP, JPEG, PNG ... free/open standards, super successful