MS&E 130/231: Information
Systems, Autumn 2005-06
Instructor: Ashish Goel
Handout 13 – practice problems for the final
A)
Final exam from last year
Closed book. All questions worth ten points. Total time: 120 minutes
Please be warned – I
am planning to make this year’s exam a little harder.
- What are the technical design features of KaZaA
that make it (at least till now) legal whereas much of Napster’s
functionality was deemed illegal?
- A wants to send a file to B. A and B do not trust
each other, and would like the file to pass through an intermediary C who
will maintain a record of the transaction. A and B do not want C to be
able to recover the file
(since the information in the file is private). However, in case of
future litigation, either A or B must be able to give the original file to
C and get C to verify that this was indeed the file that was transferred
from A to B. Suggest a simple protocol for implementing this scheme.
Ignore network intruders and the like – assume a fully secure
channel between any pair.
- Prove that public key cryptography is at least as
powerful as shared key cryptography.
- Consider the following relational schema
representing the collection of CDs which have been recorded by a
collection of artists:
CompactDiscs(artistID,
artistName, title, year);
Artists(artistID,
artistName, artistAddress);
Here
you may assume that artistID uniquely identifies an artist. Explain why the
above schema is not a well-designed one. How would you fix the problem? What
are the appropriate primary and foreign key constraints for your new schema?
- A sink component of the World Wide Web is a set
of web pages such that none of the web pages in the set points to any of
the web pages outside this set. Explain why sink components are a problem
for the naďve version of PageRank discussed in class. Explain how PageRank
addresses this problem.
- Which of the three is the largest (in total size)
out of
a)
A corpus (i.e.
collection) of documents,
b) A set of indices built on each document in the corpus,
c)
A set of reverse indices
built on each word that occurs somewhere in the corpus.
Which
is the smallest? Explain your answer. We are not interested in small
differences in size, but in big “orders-of-magnitude” differences.
B) Some additional
problems
1.
Suppose
p = 7, q = 5, and n = pq = 35
a.
What
is f(n)?
b.
Suppose
e = 11. By trial and error, find d in the range 1…f(n)-1
such that ed = 1 (mod f(n)).
c.
Suppose
the RSA public key is (n,e) and the RS private key is (n.d). Apply the public
key to the number 6.
2.
Suppose
you have a basic AIMD (additive increase, multiplicative decrease)
implementation of TCP – each time all the packets in a window get
successfully acked, the window size increase by one. When there is a packet
drop, the window size is halved. If a TCP connection suffers a packet loss
every 10th packet, what is the rate at which it transmits? Assume a
fixed RTT of 100ms, a packet size of 1KB, and assume that no ACKs get lost.
3.
You
are given a database with the following four tables:
Class(instructorSSN,
year, quarter, classNumber);
EnrolledIn(studentSSN,
enrolledYear, enrolledQuarter, enrolledClassNumber);
Instructors(name,
SSN);
Students(name,
SSN);
The
first table stores information for all the classes in the history of a
University. The second has enrollment data for all the classes. The third
stores the name and the SSN of every past and present instructor, and the last
does the same for students.
- What are the appropriate key constraints for
each of these tables?
- Write an SQL query (forget about efficiency
etc.) that lists the names of all the students who took a class with
someone named “Ashish Goel”. No name should be listed twice.
a.
Write
an SQL query that lists the name of the instructor with SSN 123456789. What
index should you build to efficiently support this query?
4.
You
are given N + M web pages, N of which are maintained by men and the other M by
women. Every web page has a hyperlink to every web page maintained by someone
of the opposite sex. What is the PageRank of a web page maintained by a man?
Assume that the reset probability is 0.
- Consider our running
example database:
movie(title,
year, studioName);
movieStar(name,
address, gender, birthDate);
starsIn(starName,
movieTitle, movieYear);
Write SQL statements for the
following two queries:
a.
Who were the male stars
in the movie titled “Serendipity”?
b.
Who were the stars
(names only) in the movies produced by MGM in 1995? Each name should be output
at most once.
- How would you write 5(a) using nested SELECTs to
guarantee that the query is executed efficiently? Why would this help?
Also, what index, if any, should you build to further speed up queries
which are the same as 5(a) but with different movie names?