|
Required |
Manning, C. and Schuetze, H. 2000. Foundations of Satistical Natural Language Processing. Charniak, E. 1998. Statistical Language Learning. MIT Press. Readings to be distributed online |
|---|---|
|
Web-site |
http://www.rohan.sdsu.edu/~gawron/stat | Course Description |
This course explores some basic ideas in statistical Natural language processing that underpin models for statistical parsing. Topics covered include Markov chains and Hidden Markov Models, statistical estimators for n-gram models, part-of-speech tagging, probabilistic context-free grammars, and lexicalized grammars. Assignments will be short and targeted on assessing whether concepts essential for continuing are getting across. It is thus to everyone's advantage to turn them in a timely fashion. Late assignments will be accepted up to 2 weeks after the due date, with a deuction of 15% a weekmof lateness. Assignments will be graded on a check/no-check basis, so anyone who turns in all the assignments on time will get full-credit for the 20% of the grade apportioned to assignments. Extra credit problems are harder and will be graded by quality of the work. In all cases the computations or reasoning leading to an answer must be spelled out. Assignments that do not show their work will be treated as incomplete. The midterm will be graded by quality of answer. Again, computations and reasoning must be spelled out. The programming project needs to be selected by the 10th week. Before settling on a project discuss it with the instructor. For those not at home with a programming language there are several projects possible that use computational tools and corpora already available, for example an evaluation of discounting methods using The CMU toolkit. |
|
Grading |
Assignments(20%) |
|
Week 1: |
Sep. 4 Chapter 1: Manning and Schuetze. The "empirical" outlook. Zipf's Law. |
|---|---|
|
Sep. 6 Chapter 2.1: Manning and Schuetze. Review/introduction to Probability theory. Bayes' theorem. Random variables. Joint and conditional distribitions. |
|
|
Week 2: |
Sep. 11 Chapter 2.1: Manning and Schuetze. Expectation (expected value). Bayesian Decision. |
|
Sep. 13 Chapter 2.2.1-2.2.2: Manning and Schuetze. Introduction to Information Theory. Entropy. Exercise 1 handed out. |
|
|
Week 3: |
Sep. 18 Chapter 6.1,6.2.1: Manning and Schuetze. N-gram models. Maximum Likelihood Estimation (MLE) for n-gram models. Data sparseness. Some practical issues for N-gram models. Exploiting Zipf's Law. |
|
Sep. 20 Chapter 6.2.2-6.2.6: Manning and Schuetze. Discounting methods. Exercise 1 due. |
|
|
Week 4: |
Sep. 25 Chapters 2.2.6, 6.2.3: Manning and Schuetze. Testing Language models. Training and test sets. Held-out training data. Development test and final test data. Cross-entropy and perplexity. |
|
Sep. 27 Introduction to CMU Statistical Toolkit. |
|
|
Week 5: |
Oct 2 Language-modeling practicum. Looking at the code for perplexity measurement. Exercise 2 handed out: language model comparison on a large corpus. |
|
Oct 4 Chapter 6.3.1-6.4: Manning and Schuetze. Linear interpolation. Backoff. |
|
|
Week 6: |
Oct 9 Charniak, Chap 2. Overview of language models and model evaluation. Markov Chains. Intuitive cross-entropy. Exercise 2 due. |
|
Oct 11 Charniak, Chap 3. Ngram models as Markov chains. Linear interpolation with Markov chains. Hidden Markov models (HMMs). |
|
|
Week 7: |
Oct 16 Charniak, Chap 3. Manning and Schuetze 10.1-10.3. Part-of-speech tagging. Application of HMMs. |
|
Oct 18 Charniak, Chap 4.1. Viterbi search. |
|
|
Week 8: |
Oct 23 HMMs in speech recognition. Differences between the tagging problem and the speech recognition problem. |
|
Oct 25 Charniak, Chap 4.2-4.4. HMM training. Forward-backward algorithm. Exercise 3. Viterbi and HMM exercises passed out. |
|
|
Week 9: |
Oct 30 Charniak Chap. 1. The standard model. Grammars. |
|
Nov 1 Charniak Chap. 5. Extending probabilistic language models to deal with structure. Probabilistic context-free grammars (PCFGs). Exercise 3 due. Midterm passed out. |
|
|
Week 10: |
Nov 6 Charniak Chap. 6. Assigning probabilities to trees with PCFGs. Programming project proposals should all be in by now. |
|
Nov 8 Charniak Chap. 7. Training PCFGs. The inside-outside algorithm. Midterm due. |
|
|
Week 11 |
Nov 13 Charniak Chap. 8. Leaning PCFGs. Problems with PCFGs. |
|
Nov 15 Collins ACL 97. Lexicalized grammars. Back to bigrams. |
|
|
Week 12 |
Nov 20 Collins continued. |
|
Nov 22 Thanksgiving. No class. |
|
|
Week 13 |
Nov 27 Parsing algorithms for charts. Earley algorithm. |
|
Nov 29 Cubic time parsing for CFGs. |
|
|
Week 14 |
Dec 4 Eisner Cubic time paper. Eisner's dependency model. |
|
Dec 6 Eisner Cubic time paper. Eisner's cubic time parsing algorithm. |
|
|
Week 15 |
Dec 10 Programming project practicum. |
|
Dec 17 Programming project practicum. |