From Languages to Information is offered online, adopting the format used by CS145 and CS229A!
What this means:
|Week||Date||Homework||In-class||Video Lectures and Readings|
|1||Jan 8 and 10||-||
Basic Text Processing [slides pptx] [slides pdf]
Edit Distance [slides pptx] [slides pdf]
|2||Jan 15 and 17||
Homework 1: Spamlord
Due Fri Jan 18, 5:00pm
Language Modeling [slides pptx] [slides pdf]
Spelling Correction and the Noisy Channel [slides pptx] [slides pdf]
|3||Jan 22 and 24||
Homework 2: AutoCorrect!
Due Fri Jan 25, 5:00pm
Naïve Bayes and Text Classification [slides pptx] [slides pdf]
Sentiment Analysis [slides pptx] [slides pdf]
|4||Jan 29 and 31||
Homework 3: Thumbs up!
Due Fri Feb 1, 5:00pm
MaxEnt Classifiers [slides pptx] [slides pdf]
MEMM Sequence Models and Named Entity Tagging [slides pptx] [slides pdf]
|5||Feb 5 and 7||
Homework 4: Extract!
Due Fri Feb 8, 5:00pm
|Named Entity Classification [slides pdf] [Starter code]||
Information Retrieval (I) [slides pptx] [slides pdf]
Information Retrieval (II) [slides pptx] [slides pdf]
|6||Feb 12 and 14||
Homework 5: Search!
Due Fri Feb 15, 5:00pm
|Information Retrieval [slides pdf]||
Relation Extraction [slides pptx] [slides pdf]
XML: accessing structured information [slides pptx] [slides pdf]
To get these, go to library.stanford.edu/ezproxy/, choose Safari Tech Books, and search for XML in a Nutshell.
|7||Feb 19 and 21||-||-||
Word Meaning and Word Similarity [slides pptx] [slides pdf]
Question Answering [slides pptx] [slides pdf]
|8||Feb 26 and 28||
Homework 6: Jeopardy!
Due Fri Mar 1, 5:00pm
|XML, Relation Extraction & QA [starter code]||
Machine Translation 1 [slides pptx] [slides pdf]
Machine Translation 2 [slides pptx] [slides pdf]
|9||Mar 5 and 7||
Homework 7: Translate!
Due Fri Mar 8, 5:00pm
|Machine Translation [slides]||
Web graphs, Links, and PageRank [slides pptx] [slides pdf]
|10||Mar 12 and 14||-||-||
Social Networks [slides pptx] [slides pdf]
Friday March 22, 12:15pm-3:15pm, Location: Cubberley Auditorium
(Alternate) Tuesday Mar 19, 12:15pm-3:15pm, Location: Annenberg Auditorium
Leon Lin (Head TA), Mason Chua, Thomas Dimson, Milind Ganjoo, Kevin Nguyen and Rukmani Ravisundaram
Locations change, and will be updated on Piazza.
Tuesday and Thursday 9:30-10:45am in 260-113
Mail non-technical questions only to email@example.com. We will not reply to email sent to individual staff members. If you have a matter to be discussed privately, please come to office hours, or use firstname.lastname@example.org to make an appointment.
We prefer that most questions be posted on the Piazza forum, where responses tend to be quicker and reach a wider audience.
We use the mailing list generated by Axess to convey messages to the class. We will assume that all students read these messages.
Readings from MR+S are required, but the readings are available here (the published book).
Extracting meaning, information, and structure from human language text, web pages, social networks, genome sequences, or any less structured information. Methods include: string algorithms, edit distance, naive Bayes and MaxEnt classifiers, language modeling, XML processing. Applications such as information retrieval, question answering, text classification, social network models, machine translation, genomic sequence alignment, word meaning extraction.
CS 103, CS 107 and CS 109.
Each week, we will ask you to watch a set of video lectures (2 to 2.5 hours total). The videos will have some in-video questions embedded in them, which you should answer. You are required to watch the videos, but the embedded quizzes are not counted toward the final grade.
After you watch a week's video lectures, we will ask you to take an open-notes, open-book review quiz (about 5 questions) on the content you just learned. Each review quiz may be attempted several times, with a wait of 10 minutes between attempts. The questions, as well as the options for each question, are randomly selected from a larger pool each time you take a quiz. We will take the highest score over all attempts for each quiz. The first two attempts are not penalized; each subsequent attempt incurs a cumulative 20% penalty (e.g., the maximum possible score is 80% on the 3rd attempt and 60% on the 4th attempt). Review quizzes for each week are due at 11:59pm on Tuesday of the following week. There are no late days for review quizzes.
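To make the attempt penalty concrete, here is a minimal Python sketch of how a quiz grade could be computed from a list of attempts. It assumes the penalty scales the raw score (so a perfect 3rd attempt yields 80%); the policy above only states the maximum possible score per attempt, so the exact formula is an assumption. The function names `effective_score` and `quiz_grade` are hypothetical and are not part of any course software.

```python
def effective_score(raw_percent, attempt_number, free_attempts=2, penalty_step=20):
    """Scale a raw quiz score (0-100) by the cap for this attempt.

    Assumption: the cap drops by `penalty_step` percentage points for
    every attempt after the first `free_attempts`, and never below 0.
    """
    extra_attempts = max(0, attempt_number - free_attempts)
    cap = max(0, 100 - penalty_step * extra_attempts)
    return raw_percent * cap / 100


def quiz_grade(raw_percents):
    """Return the highest effective score over all attempts, in order taken."""
    return max(
        (effective_score(raw, n) for n, raw in enumerate(raw_percents, start=1)),
        default=0.0,
    )


# Four attempts: 50%, 60%, then two perfect scores on attempts 3 and 4.
print(quiz_grade([50, 60, 100, 100]))
```

Under these assumptions, a student who scores 90% on the first attempt gains nothing from a third attempt, since the best it could yield is 80%.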
Since lectures are online, the in-class sessions on Tuesday and Thursday mornings will be used for problem-solving, reviews, discussions, guest speakers from industry, and presentations of state-of-the-art research. You can get extra credit for class participation by answering questions on the class forum.
7 programming assignments (in Java or Python, your choice). Each assignment is due at 5:00pm on the Friday listed for it in the schedule.
Homework Collaboration: You may talk to anybody you want about the assignments and bounce ideas off each other. But you must write the actual homeworks and programs yourself.
You have 4 free late (calendar) days to use on the homeworks. Once these are exhausted, any homework turned in late will be penalized 20% per late day. Each 24 hours or part thereof that a homework is late uses up one full late day.
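The late-day rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the staff's grading code: the names `late_days` and `apply_late_policy` are hypothetical, the penalty is assumed to be capped at 100%, and the year in the usage example is chosen only for illustration.

```python
import math
from datetime import datetime

FREE_LATE_DAYS = 4      # free late days for the whole quarter (from the policy)
PENALTY_PER_DAY = 20    # percent per late day once free days are exhausted


def late_days(deadline, submitted):
    """Each 24 hours or part thereof past the deadline counts as one full day."""
    seconds_late = (submitted - deadline).total_seconds()
    if seconds_late <= 0:
        return 0
    return math.ceil(seconds_late / 86400)


def apply_late_policy(days_late, free_days_left):
    """Return (penalty_percent, free_days_remaining) for one homework.

    Assumption: free days are consumed first, and the penalty on the
    remaining days is capped at 100%.
    """
    free_used = min(days_late, free_days_left)
    penalized_days = days_late - free_used
    penalty = min(100, PENALTY_PER_DAY * penalized_days)
    return penalty, free_days_left - free_used


# Usage: a homework due Friday at 5:00pm, submitted one minute late.
deadline = datetime(2013, 1, 18, 17, 0)   # year is illustrative only
used = late_days(deadline, datetime(2013, 1, 18, 17, 1))
print(used, apply_late_policy(used, FREE_LATE_DAYS))
```

Note that a submission even one minute past 5:00pm consumes a full late day, which follows directly from the "or part thereof" wording.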
We will expect you to do a significant amount of textbook reading in this course.
Friday Mar 22, 12:15pm-3:15pm in Cubberley Auditorium
(Alternate) Tuesday Mar 19, 12:15pm-3:15pm in Annenberg Auditorium