From Languages to Information is offered online!
What this means:
|Week||Date||Homework||In-class||Video Lectures and Readings|
|1||Jan 7 and 9||-||
Basic Text Processing [slides pptx] [slides pdf]
Edit Distance [slides pptx] [slides pdf]
|2||Jan 14 and 16||
Homework 1: Spamlord
Due Fri Jan 17, 5:00pm
Language Modeling [slides pptx] [slides pdf] (skip the video/slides on Good Turing Smoothing)
Spelling Correction and the Noisy Channel [slides pptx] [slides pdf]
|3||Jan 21 and 23||
Homework 2: AutoCorrect!
Due Fri Jan 24, 5:00pm
Naïve Bayes and Text Classification [slides pptx] [slides pdf]
Sentiment Analysis [slides pptx] [slides pdf]
|4||Jan 28 and 30||
Homework 3: Thumbs up!
Due Fri Jan 31, 5:00pm
Information Retrieval (I) [slides pptx] [slides pdf]
Information Retrieval (II) [slides pptx] [slides pdf]
|5||Feb 4 and 6||
Homework 4: Search!
Due Fri Feb 7, 5:00pm
Tuesday: Group Work on Information Retrieval and Answer Key
Thursday Feb 6: Guest Lecturer*: Jennifer Chu-Carroll, IBM T. J. Watson Research Center [questions]
Relation Extraction [slides pptx] [slides pdf]
Question Answering [slides pptx] [slides pdf]
|6||Feb 11 and 13||
Homework 5: Jeopardy!
Due Fri Feb 14, 5:00pm
Machine Translation 1 [slides pptx] [slides pdf]
Machine Translation 2 [slides pptx] [slides pdf]
|7||Feb 18 and 20||Tuesday: Group Work on Machine Translation||
Word Meaning and Word Similarity [slides pptx] [slides pdf]
|8||Feb 25 and 27||
Homework 6: Translate!
Due Fri Feb 28, 5:00pm
Tuesday: Dan Lecture (POS Tagging), Group Work on PA 6|
Thursday: Group Work on PA 6
|9||Mar 4 and 6||
Homework 6 Peer Grading
Due Fri Mar 7, 5:00pm
Web graphs, Links, and PageRank [slides pptx] [slides pdf]
|10||Mar 11 and 13||-||
Tuesday: Peter Norvig, Google*|
Thursday: Course Review, Discussion of Practice Final and its Solutions
Social Networks [slides pptx] [slides pdf]
You can take it either of these two days (but not both):
Tuesday and Thursday 3:15-4:30pm in 420-040
If you have a question that is not confidential or personal, post it on the Piazza forum: responses tend to be quicker and reach a wider audience. To contact the teaching staff directly, we strongly encourage you to come to office hours. If that is not possible, you can also email the course staff list (non-technical questions only) at firstname.lastname@example.org. We cannot reply to email sent to individual staff members. If you have a matter to discuss privately, please come to office hours, or use email@example.com to make an appointment. For grading questions, please talk to us after class or during office hours.
We use the mailing list generated by Axess to convey messages to the class. We will assume that all students read these messages.
Since we occasionally reuse homeworks from previous years, we expect students not to copy, refer to, or look at the solutions in preparing their answers. It is an honor code violation to intentionally refer to a previous year's solutions. This applies both to the official solutions and to solutions that you or someone else may have written up in a previous year. It is also an honor code violation to find some way to look at the test set, to interfere in any way with programming assignment scoring, or to tamper with the submit script.
Readings from MR+S are required, but the book is available online *HERE*.
Extracting meaning, information, and structure from human language text, speech, web pages, genome sequences, social networks, or any less structured information. Methods include string algorithms, edit distance, language modeling, naive Bayes, inverted indices, and vector semantics. Applications include information retrieval, question answering, text classification, social network models, machine translation, genomic sequence alignment, and word meaning extraction.
CS 103, CS 107 and CS 109.
Each week, we will ask you to watch a set of video lectures (2 to 2.5 hours total). The videos will have some in-video questions embedded in them, which you should answer. You are required to watch the videos, but the embedded quizzes are not counted toward the final grade.
After watching a week's video lectures, we will ask you to answer an open-notes, open-book review quiz (about 5 questions) on the content that you just learned. Each review quiz may be attempted several times, with a wait of 10 minutes between attempts. The questions, as well as the options for each question, are randomly selected from a larger pool each time you take a quiz. We will take the highest score over all attempts for each quiz. The first two attempts will not be penalized; subsequent attempts will incur a cumulative 20% penalty (e.g., the maximum score possible is 80% on the 3rd attempt and 60% on the 4th attempt). Review quizzes for each week are due at 11:59pm on Tuesday of the following week. There are no late days for review quizzes.
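The attempt penalty can be sketched in a few lines of Python. This is only an illustration, not the actual grading code: the function names `penalty_factor` and `quiz_score` are hypothetical, and it assumes the penalty scales the raw score on each attempt (so a perfect 3rd attempt yields 80%) before the highest adjusted score is taken.

```python
def penalty_factor(attempt_number: int) -> float:
    """Maximum fraction of credit available on a 1-indexed attempt.

    Assumption: attempts 1 and 2 are free, and every attempt after
    that loses a further 20 percentage points (80% on the 3rd, 60% on the 4th).
    """
    if attempt_number < 1:
        raise ValueError("attempt numbers start at 1")
    penalized_attempts = max(0, attempt_number - 2)
    # Work in whole percent to avoid floating-point drift.
    return max(0, 100 - 20 * penalized_attempts) / 100


def quiz_score(raw_scores: list[float]) -> float:
    """Highest penalty-adjusted score over all attempts, in attempt order.

    raw_scores holds the unpenalized fraction correct (0.0 to 1.0) per attempt.
    """
    if not raw_scores:
        return 0.0
    return max(
        raw * penalty_factor(attempt)
        for attempt, raw in enumerate(raw_scores, start=1)
    )


if __name__ == "__main__":
    # Perfect on the 3rd try: capped at 80%, which still beats the earlier tries.
    print(quiz_score([0.6, 0.7, 1.0]))
```

Under these assumptions a strong early attempt is never hurt by later ones, since only the best adjusted score counts.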
Since lectures are online, the in-class sessions Tuesday and Thursday mornings will be used for problem-solving, reviews, discussions, guest speakers from industry, and presentation of state-of-the-art research. Attendance at the guest lectures as well as the first lecture, my lecture on networks, and possibly one other in-person lecture is required (this is the 5% class participation part of your grade). You can get extra credit for class participation by answering questions on the class forum and asking good questions of the invited speakers.
6 programming assignments (in Java or Python, your choice). Each assignment is due at 5:00pm on the Friday listed in the schedule.
Programming Assignment Collaboration: You may talk to anybody you want about the assignments and bounce ideas off each other. But you must write the actual programs yourself.
You have 4 free late (calendar) days to use on the first 5 programming assignments (HW 6 is a peer-graded assignment, and late days may not be applied to it). Once these are exhausted, any PA turned in late will be penalized 20% per late day. Each 24 hours or part thereof that a homework is late uses up one full late day. However, no assignment will be accepted more than four days after its due date.
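The late-day rules for the first five programming assignments can be sketched as follows. This is an illustrative reading of the policy, not the staff's grading script: the names `late_days_used` and `apply_late_policy` are hypothetical, and it assumes the 20% penalty is subtracted from full credit for each late day not covered by a free late day.

```python
import math

FREE_LATE_DAYS = 4      # shared across the first 5 programming assignments
PENALTY_PERCENT = 20    # per late day once free late days are exhausted
MAX_DAYS_LATE = 4       # later submissions are not accepted


def late_days_used(hours_late: float) -> int:
    """Each 24 hours or part thereof counts as one full late day."""
    return max(0, math.ceil(hours_late / 24))


def apply_late_policy(hours_late: float, free_days_left: int):
    """Return (credit multiplier, free late days remaining).

    The multiplier is None when the submission is too late to be accepted.
    Assumption: free late days are spent first, then each remaining
    late day subtracts 20 percentage points from full credit.
    """
    days = late_days_used(hours_late)
    if days > MAX_DAYS_LATE:
        return None, free_days_left
    free_used = min(days, free_days_left)
    penalized_days = days - free_used
    multiplier = max(0, 100 - PENALTY_PERCENT * penalized_days) / 100
    return multiplier, free_days_left - free_used


if __name__ == "__main__":
    # 30 hours late with one free day left: 2 late days, 1 free, 1 penalized.
    print(apply_late_policy(30, 1))
```

Note that even a submission a few minutes past the deadline rounds up to one full late day under this reading.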
We will expect you to do a significant amount of textbook reading in this course.
Tuesday Mar 18, 12:15pm-3:15pm Location: Hewlett 201
Thursday Mar 20, 12:15pm-3:15pm Location: CUBAUD (Cubberley Auditorium)