This is a introductory course in computational linguistics at an advanced level. No pre-requisites for graduate students, we will learn the rudiments of programming and the theoretical underpinnings of grammar systems from scratch. Reference Textbook Optional: Speech and Language Processing 2nd edition, by D. Jurafsky and J.H. Martin, Prentice-Hall 2008. Or 3rd edition PDF. Software
All software used in this class will be freely available.
|
Instructor: Sandiway Fong sandiway@arizona.edu
Office: Douglass 311 (send email for an appointment or
take a chance and drop by before/after class)
Location | Psychology, Rm 204 |
Time | Tuesdays/Thursdays 3:30PM - 4:45PM |
Date | Lecture Notes | Number of Slides |
Panopto | Topic | |
---|---|---|---|---|---|
Powerpoint | |||||
8/22 | lecture1.pdf | lecture1.pptx | 36 | Viewer | Administrivia and Introduction.
Homework 1: read PDF of chapter 1 of textbook for in-class quiz next time.here. Homework 2: Install Perl and Python. |
8/24 | lecture2.pdf | lecture2.pptx | 32 | Language and computers: assitive technologies and
limits. Recursive nature of language. Introduction to natural
language analysis: parser demos.
Homework 3. A quick quiz. |
|
8/29 | lecture3.pdf | lecture3.pptx | 37 | Zoom | Homework 3 Review. Language Modeling. ChatGPT. English subject verb agreement. Homework 4. |
8/31 | lecture4.pdf | lecture4.pptx | 24 | Viewer |
Text summarization: Open Text Summarizer (OTS), macOS Summarize
Service and ChatGPT. Beginning programming with Perl: focusing on
proper use of quotes. QWERTY keyboard history.
world.perl / world.py |
Date | Lecture Notes | Number of Slides |
Panopto | Topic | |
---|---|---|---|---|---|
Powerpoint | |||||
9/5 | lecture5.pdf | lecture5.pptx | 31 | Viewer | Homework 4 review. A bit more on quoting. perlintro: scalars and arrays. |
9/7 | lecture6.pdf | lecture6.pptx | 32 | Viewer | perlintro contd. Numeric and string
equality. Coercion. Repetition. General looping: while, for,
foreach. List range. Useful string functions, including chomp and
split. tr. String length: bytes vs. characters.
File I/O: open and <filehandle>. Files: falconheavylaunch.txt Homework 5. HW files: 2letters.txt / 3letters.txt / 4letters.txt / 5letters.txt |
9/12 | lecture7.pdf | lecture7.pptx | 27 | Viewer | Using WSL2: directory structures. Homework 5 review. Scrabble word
length statistics. Worked file
I/O example. Split and summing the words. Word frequency table using
hash tables. Sorting in Perl, Python and on the command line.
Files: falconheavylaunch.txt Terminal log: terminal7.txt |
9/14 | lecture8.pdf | lecture8.pptx | 29 | Viewer | Hash and dict in Perl and Python, respectively. Anonymous arrays
in Perl. Part of Speech dict example. Homework 6 on spelling rules +
disemvoweling.
4:50pm: slides corrected. |
9/19 | lecture9.pdf | lecture9.pptx | 37 | Viewer | Homework 6 review. Perl references. Perl Modules:
cpan. Date::Calc. Python library timedate.
Ungraded homework: install Lingua::EN::CMUDict, the CMU Pronouncing Dictionary. File: dow.perl |
9/21 | lecture10.pdf | lecture10.pptx | 37 | Viewer | Digital advertising. Clickbait. Homework 7. Perl regex. |
9/26 | lecture11.pdf | lecture11.pptx | 22 | Viewer | A note on Homework 7. ChatGPT and clickbait.
"The lines of code that changed everything" Getting deeper into Perl regex. Capture, backreferences, shortest vs. greedy matching, nondeterminism (backtracking). Note: Homework 8 was previewed also during this lecture. Slides for that are in Lecture 12. |
9/28 | lecture12.pdf | lecture12.pptx | 27 | Viewer | Homework 8: regex.
More on Perl regex: Perl code inside a replacement string in s/pattern/replacement/! Character and word frequency counting Zipf's Law: Brown Corpus case study File: 12thnight.txt File (UTF-8): pandora.txt Slides corrected: 4:50pm |
Date | Lecture Notes | Number of Slides |
Panopto | Topic | |
---|---|---|---|---|---|
Powerpoint | |||||
10/3 | lecture13.pdf | lecture13.pptx | 38 | Viewer | Homewwork 8 review. Perl regex: lookahead, lookbehind. Predicate argument structure. Framenet. Propbank. Stanford CoreNLP. Universal dependencies. |
10/5 | lecture14.pdf | lecture14.pptx | 26 | Viewer | Stanford CoreNLP. Universal dependencies
continued. ChatGPT. Homework 9. Slides corrected: 4:25pm |
10/10 | lecture15.pdf | lecture15.pptx | 35 | Viewer | Homework 9 review. Ungraded regex exercises. Regex recursion. |
10/12 | lecture16.pdf | lecture16.pptx | 34 | Viewer | Regex recursion contd. Prime number testing using Perl regex. Finite State Automata (FSA). |
10/17 | lecture17.pdf | lecture17.pptx | 24 | Viewer |
regex example: xkcd simplewriter.
FSA contd.: formal definition, Perl and Python implementations, e-transitions, and single vs. multiple start states. File: fsa.perl |
10/19 | lecture18.pdf | lecture18.pptx | 10 | Viewer | Non-deterministic FSA (NDFSA). NDFSA to DFSA conversion. Three
worked examples.
Homework 10. Important: Homework 10 slides updated after class! |
10/24 | lecture19.pdf | lecture19.pptx | 21 | Viewer | Homework 10 review.
Regular Languages and closure properties. Turing Machines, a useful digression. |
10/26 | lecture20.pdf | lecture20.pptx | 16 | Viewer | Example of a machine is (perhaps) easier to build than a regex.
The state bypass method: converting a FSA into a regex algorithmically Homework 11 4:45pm homework slides clarified. |
10/31 | lecture21.pdf | lecture21.pptx | 3 | No recording | Dr. Shawn E. Nordell, UA Graduate Career Services.
Resources for finding jobs, interships and resume support. A note on Homework 11. |
Date | Lecture Notes | Number of Slides |
Panopto | Topic | |
---|---|---|---|---|---|
Powerpoint | |||||
11/2 | lecture22.pdf | lecture22.pptx | 17 | Viewer | Homework 11 review.
Beyond regular languages: {anbn | n ≥ 1} and {1n | n is prime}. A formal tool: the Pumping Lemma for regular languages. |
11/7 | lecture23.pdf | lecture23.pptx | 26 | Viewer | Homework 12. Install SWI-Prolog.
A quick introduction. Recursive definitions. Non-determinism and backtracking. Language enumeration. Control of backtracking using fail and cut. Built-in predicates: length/2 and findall/3. set_prolog_flag for answer reporting: answer_write_options. Some example Prolog programs: test.prolog N!: factorial.prolog N!: factorial2.prolog Σ*: sigmastar.prolog Terminal log: terminal23.txt |
11/9 | lecture24.pdf | lecture24.pptx | 29 | Viewer |
Homework 13.
The Chomsky hierarchy: (type-3) regular grammars. The definite clause grammar (DCG) formalism. Regular grammars in Prolog. Tree representation. Prolog's default computation rule. Top-down and bottom-up derivations. The correspondence between right recursive regular grammars and FSA. Example type-3 DCG: apbp.prolog bb+ grammar: bbp.prolog Grammar: bab.prolog Terminal log: terminal24.txt |
11/14 | lecture25.pdf | lecture25.pptx | 26 | Viewer | Homework 13 Review.
Beyond regular languages: left and right recursive rules. Using an extra argument to return a parse tree. Grammar: anbn.prolog Grammar: anbn2.prolog A brief note on expressive power and extra arguments. Left recursion, set enumerate and infinite loops. Grammar: lrrg.prolog |
11/16 | lecture26.pdf | lecture26.pptx | 36 | Viewer |
538 Homework
Extra arguments and context sensitivity, the case of {anbncn | n ≥ 1}. abc.prolog abc2.prolog A natural language grammar. Using stanza.run. File: g26.prolog Terminal log: terminal26.txt Slides updated: 5pm |
11/21 | lecture27.pdf | lecture27.pptx | 14 | Viewer | Homework 14: last homework assigned!
538 Presentations. File: g26.prolog |
11/23 | No class: Thanksgiving Break | ||||
11/28 | lecture28.pdf | lecture28.pptx | 6 | Viewer | 538 Presentations: Part 1. |
11/30 | lecture29.pdf | lecture29.pptx | 15 | Viewer | Homework 14 review.
538 Presentations: Part 2 Slides corrected: 5:10pm |
Date | Lecture Notes | Number of Slides |
Panopto | Topic | |
---|---|---|---|---|---|
Powerpoint | |||||
12/5 | lecture30.pdf | lecture30.pptx | 6 | Viewer | 538 Presentations: Part 3 |