To my linguistics homepage

LING/C SC/PSYC 438/538
Computational Linguistics
Fall 2023

This is a introductory course in computational linguistics at an advanced level. No pre-requisites for graduate students, we will learn the rudiments of programming and the theoretical underpinnings of grammar systems from scratch.

Reference Textbook

Optional: Speech and Language Processing 2nd edition, by D. Jurafsky and J.H. Martin, Prentice-Hall 2008. Or 3rd edition PDF.

Software

All software used in this class will be freely available.
We will use Python, Perl and SWI-Prolog as programming languages.
The instructor reserves the right to ask students to install additional software necessary to do some homework exercises.

Instructor: Sandiway Fong sandiway@arizona.edu
Office: Douglass 311 (send email for an appointment or take a chance and drop by before/after class)

Administrivia

Location Psychology, Rm 204
Time Tuesdays/Thursdays 3:30PM - 4:45PM

Syllabus

See lecture1 slides.

Lecture Notes

Available in Adobe PDF and Microsoft Powerpoint .pptx formats.

August

Date Lecture Notes Number
of Slides
Panopto Topic
PDF Powerpoint
8/22 lecture1.pdf lecture1.pptx 36 Viewer Administrivia and Introduction.
Homework 1: read PDF of chapter 1 of textbook for in-class quiz next time.here.
Homework 2: Install Perl and Python.
8/24 lecture2.pdf lecture2.pptx 32 Language and computers: assitive technologies and limits. Recursive nature of language. Introduction to natural language analysis: parser demos.
Homework 3. A quick quiz.
8/29 lecture3.pdf lecture3.pptx 37 Zoom Homework 3 Review. Language Modeling. ChatGPT. English subject verb agreement. Homework 4.
8/31 lecture4.pdf lecture4.pptx 24 Viewer Text summarization: Open Text Summarizer (OTS), macOS Summarize Service and ChatGPT. Beginning programming with Perl: focusing on proper use of quotes. QWERTY keyboard history.
world.perl / world.py

September

Date Lecture Notes Number
of Slides
Panopto Topic
PDF Powerpoint
9/5 lecture5.pdf lecture5.pptx 31 Viewer Homework 4 review. A bit more on quoting. perlintro: scalars and arrays.
9/7 lecture6.pdf lecture6.pptx 32 Viewer perlintro contd. Numeric and string equality. Coercion. Repetition. General looping: while, for, foreach. List range. Useful string functions, including chomp and split. tr. String length: bytes vs. characters.
File I/O: open and <filehandle>.
Files: falconheavylaunch.txt
Homework 5.
HW files: 2letters.txt / 3letters.txt / 4letters.txt / 5letters.txt
9/12 lecture7.pdf lecture7.pptx 27 Viewer Using WSL2: directory structures. Homework 5 review. Scrabble word length statistics. Worked file I/O example. Split and summing the words. Word frequency table using hash tables. Sorting in Perl, Python and on the command line.
Files: falconheavylaunch.txt
Terminal log: terminal7.txt
9/14 lecture8.pdf lecture8.pptx 29 Viewer Hash and dict in Perl and Python, respectively. Anonymous arrays in Perl. Part of Speech dict example. Homework 6 on spelling rules + disemvoweling.
4:50pm: slides corrected.
9/19 lecture9.pdf lecture9.pptx 37 Viewer Homework 6 review. Perl references. Perl Modules: cpan. Date::Calc. Python library timedate.
Ungraded homework: install Lingua::EN::CMUDict, the CMU Pronouncing Dictionary.
File: dow.perl
9/21 lecture10.pdf lecture10.pptx 37 Viewer Digital advertising. Clickbait. Homework 7. Perl regex.
9/26 lecture11.pdf lecture11.pptx 22 Viewer A note on Homework 7. ChatGPT and clickbait.
"The lines of code that changed everything"
Getting deeper into Perl regex. Capture, backreferences, shortest vs. greedy matching, nondeterminism (backtracking).
Note: Homework 8 was previewed also during this lecture. Slides for that are in Lecture 12.
9/28 lecture12.pdf lecture12.pptx 27 Viewer Homework 8: regex.
More on Perl regex: Perl code inside a replacement string in s/pattern/replacement/!
Character and word frequency counting
Zipf's Law: Brown Corpus case study
File: 12thnight.txt
File (UTF-8): pandora.txt
Slides corrected: 4:50pm

October

Date Lecture Notes Number
of Slides
Panopto Topic
PDF Powerpoint
10/3 lecture13.pdf lecture13.pptx 38 Viewer Homewwork 8 review. Perl regex: lookahead, lookbehind. Predicate argument structure. Framenet. Propbank. Stanford CoreNLP. Universal dependencies.
10/5 lecture14.pdf lecture14.pptx 26 Viewer Stanford CoreNLP. Universal dependencies continued. ChatGPT. Homework 9.
Slides corrected: 4:25pm
10/10 lecture15.pdf lecture15.pptx 35 Viewer Homework 9 review. Ungraded regex exercises. Regex recursion.
10/12 lecture16.pdf lecture16.pptx 34 Viewer Regex recursion contd. Prime number testing using Perl regex. Finite State Automata (FSA).
10/17 lecture17.pdf lecture17.pptx 24 Viewer regex example: xkcd simplewriter.
FSA contd.: formal definition, Perl and Python implementations, e-transitions, and single vs. multiple start states.
File: fsa.perl
10/19 lecture18.pdf lecture18.pptx 10 Viewer Non-deterministic FSA (NDFSA). NDFSA to DFSA conversion. Three worked examples.
Homework 10.
Important: Homework 10 slides updated after class!
10/24 lecture19.pdf lecture19.pptx 21 Viewer Homework 10 review.
Regular Languages and closure properties.
Turing Machines, a useful digression.
10/26 lecture20.pdf lecture20.pptx 16 Viewer Example of a machine is (perhaps) easier to build than a regex.
The state bypass method: converting a FSA into a regex algorithmically
Homework 11
4:45pm homework slides clarified.
10/31 lecture21.pdf lecture21.pptx 3 No recording Dr. Shawn E. Nordell, UA Graduate Career Services.
Resources for finding jobs, interships and resume support.
A note on Homework 11.

November

Date Lecture Notes Number
of Slides
Panopto Topic
PDF Powerpoint
11/2 lecture22.pdf lecture22.pptx 17 Viewer Homework 11 review.
Beyond regular languages: {anbn | n ≥ 1} and {1n | n is prime}.
A formal tool: the Pumping Lemma for regular languages.
11/7 lecture23.pdf lecture23.pptx 26 Viewer Homework 12. Install SWI-Prolog.
A quick introduction.
Recursive definitions. Non-determinism and backtracking. Language enumeration. Control of backtracking using fail and cut.
Built-in predicates: length/2 and findall/3.
set_prolog_flag for answer reporting: answer_write_options.
Some example Prolog programs:
test.prolog
N!: factorial.prolog
N!: factorial2.prolog
Σ*: sigmastar.prolog
Terminal log: terminal23.txt
11/9 lecture24.pdf lecture24.pptx 29 Viewer Homework 13.
The Chomsky hierarchy: (type-3) regular grammars. The definite clause grammar (DCG) formalism.
Regular grammars in Prolog. Tree representation. Prolog's default computation rule. Top-down and bottom-up derivations.
The correspondence between right recursive regular grammars and FSA.
Example type-3 DCG: apbp.prolog
bb+ grammar: bbp.prolog
Grammar: bab.prolog
Terminal log: terminal24.txt
11/14 lecture25.pdf lecture25.pptx 26 Viewer Homework 13 Review.
Beyond regular languages: left and right recursive rules.
Using an extra argument to return a parse tree.
Grammar: anbn.prolog
Grammar: anbn2.prolog
A brief note on expressive power and extra arguments.
Left recursion, set enumerate and infinite loops.
Grammar: lrrg.prolog
11/16 lecture26.pdf lecture26.pptx 36 Viewer 538 Homework
Extra arguments and context sensitivity, the case of {anbncn | n ≥ 1}.
abc.prolog
abc2.prolog
A natural language grammar. Using stanza.run.
File: g26.prolog
Terminal log: terminal26.txt
Slides updated: 5pm
11/21 lecture27.pdf lecture27.pptx 14 Viewer Homework 14: last homework assigned!
538 Presentations.
File: g26.prolog
11/23 No class: Thanksgiving Break
11/28 lecture28.pdf lecture28.pptx 6 Viewer 538 Presentations: Part 1.
11/30 lecture29.pdf lecture29.pptx 15 Viewer Homework 14 review.
538 Presentations: Part 2
Slides corrected: 5:10pm

December

Date Lecture Notes Number
of Slides
Panopto Topic
PDF Powerpoint
12/5 lecture30.pdf lecture30.pptx 6 Viewer 538 Presentations: Part 3


To my linguistics homepage
Last modified: Tue Dec 5 16:25:43 MST 2023