This is an introductory course in computational linguistics at an advanced level. There are no prerequisites for graduate students: we will learn the rudiments of programming and the theoretical underpinnings of grammar systems from scratch.
Reference Textbook (Optional): Speech and Language Processing, 2nd edition, by D. Jurafsky and J.H. Martin, Prentice-Hall, 2008, or the 3rd edition PDF.
Software: All software used in this class will be freely available.
Instructor: Sandiway Fong sandiway@arizona.edu
Office: Douglass 311 (send email for an appointment or
take a chance and drop by before/after class)
| Location | Psychology, Rm 206 |
| Time | Tuesdays/Thursdays 9:30AM - 10:45AM |
| Date | Lecture Notes (PDF) | Lecture Notes (PowerPoint) | Number of Slides | Panopto | Topic |
|---|---|---|---|---|---|
| 8/26 | lecture1.pdf | lecture1.pptx | 36 | Viewer | Administrivia and Introduction. Syllabus.
Homework 1: read the PDF of chapter 1 of the textbook in preparation for Homework 3 next time. Homework 2: Install Perl and Python. A note on programming languages. AI and coding. |
| 8/28 | lecture2.pdf | lecture2.pptx | 32 | Viewer | Language and computers: openai/whisper, assistive technologies and
limits. Recursive nature of language. Introduction to natural
language analysis: syntactic structure. Parser demos, ChatGPT.
Homework 3 |
| Date | Lecture Notes (PDF) | Lecture Notes (PowerPoint) | Number of Slides | Panopto | Topic |
|---|---|---|---|---|---|
| 9/2 | lecture3.pdf | lecture3.pptx | 25 | Viewer | Homework 3 Review. What is a Language Model: a look at text
completion. English subject-verb agreement. Quotes. ChatGPT-2 to 5.
Homework 4: using GPT-2 on HuggingFace. Stanza and the Berkeley Neural Parser. Slides updated: 11am 9/2 |
| 9/4 | lecture4.pdf | lecture4.pptx | 19 | Viewer |
Beginning programming with Perl: focusing on
proper use of quotes. QWERTY keyboard history. ChatGPT help.
world.perl / world.py |
| 9/9 | lecture5.pdf | lecture5.pptx | 35 | Viewer |
Homework 4 review and extensions.
A bit more on quoting: GNU Linux documentation. Installing WSL2 / Ubuntu on Windows 11. perlintro: scalars and arrays. Files: myprog.perl / myprog.py [NOTE: didn't finish the slides today.] Terminal log: terminal5.txt |
| 9/11 | lecture6.pdf | lecture6.pptx | 42 | Viewer |
A note on Lecture 5.
perlintro: contd. Arrays. Numeric and string equality. Coercion. Repetition. General looping: while, for, foreach. List range. Useful string functions, including chomp and split. tr. String length: bytes vs. characters. File I/O: open and <filehandle>. Files: myprog.perl / myprog.py. Example text file: falconheavylaunch.txt. Homework 5. HW files: 3letters.txt / 4letters.txt / 5letters.txt / 6letters.txt. Terminal log: terminal6.txt |
| 9/16 | lecture7.pdf | lecture7.pptx | 25 | Viewer | Homework 5 review. Scrabble word
length statistics.
Worked Perl file I/O example. Split and summing the words. Word frequency table using hash tables. Sorting in Perl, Python and on the command line. Files: falconheavylaunch.txt Terminal log: terminal7.txt |
| 9/18 | lecture8.pdf | lecture8.pptx | 37 | Viewer | Finish up from last lecture: sorting with Bash and Python.
Hash and dict in Perl and Python, respectively. Anonymous arrays in Perl. Part of Speech dict example. Homework 6 on spelling rules + disemvoweling. Files: hw6template.perl |
| 9/23 | lecture9.pdf | lecture9.pptx | 39 | Viewer |
Synsalon talk tomorrow @ 4pm! Strong Minimalist Thesis (SMT).
Homework 6 review. Perl references. Perl Modules: cpan. Date::Calc. Python library datetime. Ungraded homework: install Lingua::EN::CMUDict, the CMU Pronouncing Dictionary. File: dow.perl |
| 9/25 | lecture10.pdf | lecture10.pptx | 32 | Viewer |
Ungraded Homework Review.
Digital advertising. Clickbait. Homework 7. Introduction to Perl regex. 11:15am: slides updated |
| 9/30 | lecture11.pdf | lecture11.pptx | 29 | Viewer | Upcoming talk on Generative Linguistics and Generative AI. Oct 10th.
A note on Homework 7. ChatGPT 5 and clickbait. "Does Clickbait Actually Attract More Clicks? Three Clickbait Studies You Must Read". "The lines of code that changed everything". Getting deeper into Perl regex: word boundaries, capture, backreferences, shortest vs. greedy matching, nondeterminism (backtracking). |
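
The word frequency table built with hash tables in the 9/16 and 9/18 lectures can be sketched in Python with a plain dict. This is a minimal illustration, not the code from the lecture slides: the function names `word_frequencies` and `by_frequency` and the sample sentence are made up for this example, and the tokenizer (lowercase letters and apostrophes only) is a simplifying assumption.

```python
import re


def word_frequencies(text):
    """Count how often each word occurs, using a dict as the hash table."""
    counts = {}
    # Assumption: a "word" is a run of lowercase letters or apostrophes.
    for word in re.findall(r"[a-z']+", text.lower()):
        counts[word] = counts.get(word, 0) + 1
    return counts


def by_frequency(counts):
    """Sort (word, count) pairs by descending count, then alphabetically."""
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical sample text (not taken from falconheavylaunch.txt).
sample = "the rocket left the pad and the crowd cheered"
table = by_frequency(word_frequencies(sample))
for word, count in table[:3]:
    print(word, count)
```

Sorting on the pair `(-count, word)` keeps ties in alphabetical order, so the ordering does not depend on dict insertion order.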
| Date | Lecture Notes (PDF) | Lecture Notes (PowerPoint) | Number of Slides | Panopto | Topic |
|---|---|---|---|---|---|
| 10/2 | lecture12.pdf | lecture12.pptx | 27 | Viewer | Homework 8: regex and the Pandora Papers.
More on Perl regex: Perl code inside a replacement string in s/pattern/replacement/! Character and word frequency counting. Zipf's Law: Brown Corpus case study. File: 12thnight.txt. Homework file (UTF-8): pandora.txt |
| 10/7 | lecture13.pdf | lecture13.pptx | 35 | Viewer |
Homework 8 review.
Perl regex: lookahead, lookbehind. Predicate argument structure. FrameNet. PropBank. |
| 10/9 | lecture14.pdf | lecture14.pptx | 30 | Viewer | Stanford CoreNLP. Stanford dependencies and Universal Dependencies
(UD). ChatGPT and syntax.
Homework 9. Slides updated @ 11am. Available for homework consultation after the linguistics colloquium Friday afternoon (3-4:30pm; Comm 311) |
| 10/14 | lecture15.pdf | lecture15.pptx | 46 | Viewer | Homework 9 review.
Extra: ChatGPT 5 on the center-embedding cases. Ungraded regex exercises. Regex recursion. 11:30am: slides corrected |
| 10/16 | lecture16.pdf | lecture16.pptx | 30 | Viewer | Regex recursion contd.
Prime number testing using Perl regex. Introduction to Finite State Automata (FSA). Slides updated: 11am |
| 10/21 | lecture17.pdf | lecture17.pptx | 17 | Viewer |
regex example: xkcd simplewriter.
Disjunctive list of words in JavaScript. FSA contd.: formal definition, Perl and Python implementations revisited. e-transitions, and single vs. multiple start states. File: fsa.perl |
| 10/23 | lecture18.pdf | lecture18.pptx | 16 | Viewer | Non-deterministic FSA (NDFSA). NDFSA to DFSA conversion. Three
worked examples.
Markov Chains. Case of Eugene Onegin. Homework 10. Note: Q5 is not optional. Slides updated: 12pm |
| 10/28 | lecture19.pdf | lecture19.pptx | 22 | Viewer | Homework 10 review.
Regular Languages and closure properties. Turing Machines (TMs), a useful digression. Programming a TM: +1, +2, doubling. |
| 10/30 | lecture20.pdf | lecture20.pptx | 16 | Viewer | Talk announcement.
Worked example where a machine is (perhaps) easier to build than a regex. The state bypass method: converting an FSA into a regex algorithmically. Homework 11 |
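
The prime number test from the 10/16 lecture relies on backreferences, and the idea carries over to Python's `re` module. The sketch below is a Python rendering of the well-known unary trick rather than the Perl code from the slides; the names `NON_PRIME` and `is_prime` are chosen here for illustration.

```python
import re

# Matches a string of 1s whose length is NOT prime:
#   1?         length 0 or 1
#   (11+?)\1+  a block of two or more 1s, repeated at least twice
NON_PRIME = re.compile(r"1?|(11+?)\1+")


def is_prime(n):
    """Write n in unary and check that the non-prime pattern fails to match."""
    return NON_PRIME.fullmatch("1" * n) is None


print([n for n in range(30) if is_prime(n)])
# prints [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The backreference `\1` forces the same block to repeat, so a match amounts to finding a divisor by backtracking. This goes beyond what a finite state automaton can do, which is why the pattern is not a regular expression in the formal sense.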
| Date | Lecture Notes (PDF) | Lecture Notes (PowerPoint) | Number of Slides | Panopto | Topic |
|---|---|---|---|---|---|
| 11/4 | lecture21.pdf | lecture21.pptx | 35 | Viewer |
Announcement: class schedule for the remainder of the semester
538 Presentations: announcement. Homework 11 review. Beyond regular languages: {aⁿbⁿ | n ≥ 1} and {1ⁿ | n is prime}. A formal tool: the Pumping Lemma for regular languages. |
| 11/6 | lecture22.pdf | lecture22.pptx | 26 | Viewer |
Homework 12. Install SWI-Prolog.
A quick introduction. Recursive definitions. Non-determinism and backtracking. Language enumeration. Control of backtracking using fail and cut. Built-in predicates: length/2 and findall/3. set_prolog_flag for answer reporting: answer_write_options. Some example Prolog programs: test.prolog; N!: factorial.prolog; N!: factorial2.prolog; Σ*: sigmastar.prolog. Terminal log: terminal22.txt |
| 11/11 | | | | Viewer | Veterans Day - no classes |
| 11/13 | lecture23.pdf | lecture23.pptx | 29 | Viewer |
Homework 13.
The Chomsky hierarchy: (type-3) regular grammars. The definite clause grammar (DCG) formalism. Regular grammars in Prolog. Tree representation. Prolog's default computation rule. Top-down and bottom-up derivations. The correspondence between right recursive regular grammars and FSA. Example type-3 DCG: apbp.prolog. bb+ grammar: bbp.prolog. Grammar: bab.prolog. Terminal log: terminal23.txt |
| 11/18 | lecture24.pdf | lecture24.pptx | 26 | Viewer |
Homework 13 Review.
Beyond regular languages: left and right recursive rules. Using an extra argument to return a parse tree. Grammar: anbn.prolog. Grammar: anbn2.prolog. A brief note on expressive power and extra arguments. Left recursion, set enumeration and infinite loops. Grammar: lrrg.prolog |
| 11/20 | lecture25.pdf | lecture25.pptx | | Viewer | 538 Presentations: part 1 |
| 11/25 | lecture26.pdf | lecture26.pptx | | Viewer | 538 Presentations: part 2 |
| 11/27 | | | | Viewer | Thanksgiving - no classes |
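
The 11/18 topic of returning a parse tree for a language beyond the regular ones can be illustrated outside Prolog as well. The following is a rough Python analogue of a grammar for {aⁿbⁿ | n ≥ 1}, assuming the rules S → a S b and S → a b. It is not a transcription of anbn.prolog, and the names `parse_s` and `parse_anbn` are hypothetical.

```python
def parse_s(tokens, i=0):
    """Try to parse one S starting at position i.

    Returns (tree, next_position) on success, or None on failure.
    """
    if i < len(tokens) and tokens[i] == "a":
        # Rule S -> a S b: parse a nested S, then require a closing b.
        inner = parse_s(tokens, i + 1)
        if inner is not None:
            subtree, j = inner
            if j < len(tokens) and tokens[j] == "b":
                return ("s", "a", subtree, "b"), j + 1
        # Rule S -> a b: the base case.
        if i + 1 < len(tokens) and tokens[i + 1] == "b":
            return ("s", "a", "b"), i + 2
    return None


def parse_anbn(string):
    """Return a parse tree if the whole string is in the language, else None."""
    result = parse_s(list(string))
    if result is not None and result[1] == len(string):
        return result[0]
    return None


print(parse_anbn("aabb"))  # prints ('s', 'a', ('s', 'a', 'b'), 'b')
print(parse_anbn("aab"))   # prints None
```

The returned tuple plays the role of the extra argument in a DCG rule: it records which rule applied at each step. Both rules begin with a terminal, so the recursion always consumes input and cannot loop the way a left recursive rule can.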
| Date | Lecture Notes (PDF) | Lecture Notes (PowerPoint) | Number of Slides | Panopto | Topic |
|---|---|---|---|---|---|
| 12/2 | | | | Viewer | Away - no classes |
| 12/4 | | | | Viewer | Away - no classes |
| 12/9 | | | | Viewer | Away - no classes |