To my linguistics homepage

LING/C SC/PSYC 438/538
Computational Linguistics
Fall 2025

This is an introductory course in computational linguistics at an advanced level. There are no prerequisites for graduate students: we will learn the rudiments of programming and the theoretical underpinnings of grammar systems from scratch.

Reference Textbook

Optional: Speech and Language Processing, 2nd edition, by D. Jurafsky and J.H. Martin, Prentice-Hall, 2008, or the 3rd edition PDF.

Software

All software used in this class will be freely available.
We will use Python, Perl and SWI-Prolog as programming languages.
The instructor reserves the right to ask students to install additional software necessary to do some homework exercises.

Instructor: Sandiway Fong sandiway@arizona.edu
Office: Douglass 311 (send email for an appointment or take a chance and drop by before/after class)

Administrivia

Location: Psychology, Rm 206
Time: Tuesdays/Thursdays 9:30AM - 10:45AM

Syllabus

See lecture1 slides.

Lecture Notes

Available in Adobe PDF and Microsoft PowerPoint (.pptx) formats.

August

Date  Lecture Notes (PDF, PowerPoint)  Number of Slides  Panopto  Topic
8/26 lecture1.pdf lecture1.pptx 36 Viewer Administrivia and Introduction. Syllabus.
Homework 1: read the PDF of chapter 1 of the textbook (here) in preparation for Homework 3 next time.
Homework 2: Install Perl and Python.
A note on programming languages. AI and coding.
8/28 lecture2.pdf lecture2.pptx 32 Viewer Language and computers: openai/whisper, assistive technologies and limits. Recursive nature of language. Introduction to natural language analysis: syntactic structure. Parser demos, ChatGPT.
Homework 3

September

Date  Lecture Notes (PDF, PowerPoint)  Number of Slides  Panopto  Topic
9/2 lecture3.pdf lecture3.pptx 25 Viewer Homework 3 Review. What is a Language Model: a look at text completion. English subject-verb agreement. Quotes. ChatGPT-2 to 5.
Homework 4: using GPT-2 on HuggingFace. Stanza and the Berkeley Neural Parser.
Slides updated: 11am 9/2
9/4 lecture4.pdf lecture4.pptx 19 Viewer Beginning programming with Perl: focusing on proper use of quotes. QWERTY keyboard history. ChatGPT help.
world.perl / world.py
9/9 lecture5.pdf lecture5.pptx 35 Viewer Homework 4 review and extensions.
A bit more on quoting: GNU Linux documentation.
Installing WSL2 / Ubuntu on Windows 11.
perlintro: scalars and arrays.
Files: myprog.perl / myprog.py
[NOTE: didn't finish the slides today.]
Terminal log: terminal5.txt
9/11 lecture6.pdf lecture6.pptx 42 Viewer A note on Lecture 5.
perlintro: contd. Arrays. Numeric and string equality. Coercion. Repetition. General looping: while, for, foreach. List range.
Useful string functions, including chomp and split. tr.
String length: bytes vs. characters.
File I/O: open and <filehandle>.
Files: myprog.perl / myprog.py
Example text file:
falconheavylaunch.txt
Homework 5.
HW files: 3letters.txt / 4letters.txt / 5letters.txt / 6letters.txt
Terminal log: terminal6.txt
9/16 lecture7.pdf lecture7.pptx 25 Viewer Homework 5 review. Scrabble word length statistics.
Worked Perl file I/O example. Split and summing the words. Word frequency table using hash tables.
Sorting in Perl, Python and on the command line.
Files: falconheavylaunch.txt
Terminal log: terminal7.txt
9/18 lecture8.pdf lecture8.pptx 37 Viewer Finish up from last lecture: sorting with Bash and Python.
Hash and dict in Perl and Python, respectively. Anonymous arrays in Perl. Part of Speech dict example.
Homework 6 on spelling rules + disemvoweling.
Files: hw6template.perl
9/23 lecture9.pdf lecture9.pptx 39 Viewer Synsalon talk tomorrow @ 4pm! Strong Minimalist Thesis (SMT).
Homework 6 review. Perl references. Perl Modules: cpan. Date::Calc. Python library datetime.
Ungraded homework: install Lingua::EN::CMUDict, the CMU Pronouncing Dictionary.
File: dow.perl
9/25 lecture10.pdf lecture10.pptx 32 Viewer Ungraded Homework Review.
Digital advertising. Clickbait.
Homework 7.
Introduction to Perl regex.
11:15am: slides updated
9/30 lecture11.pdf lecture11.pptx 29 Viewer Upcoming talk on Generative Linguistics and Generative AI. Oct 10th.
A note on Homework 7. ChatGPT 5 and clickbait.
Does Clickbait Actually Attract More Clicks? Three Clickbait Studies You Must Read
"The lines of code that changed everything"
Getting deeper into Perl regex:
Word boundaries, capture, backreferences, shortest vs. greedy matching, nondeterminism (backtracking).
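
To make the 9/30 regex topics concrete, here is a minimal Python sketch of word boundaries, capture with a backreference, and greedy vs. shortest matching. The lecture itself uses Perl; the sample sentence and variable names below are made up for illustration and are not taken from the slides.

```python
import re

# Illustrative sample text (an assumption, not from the lecture slides).
text = "This is is a test of <b>bold</b> and <i>italic</i> text."

# Word boundary \b, a capture group, and the backreference \1:
# finds a word that is immediately repeated, such as "is is".
# The leading \b stops the "is" inside "This" from matching.
doubled = re.findall(r"\b(\w+)\s+\1\b", text)

# Greedy .* extends to the last ">" on the line;
# non-greedy .*? stops at the first ">" it can.
greedy = re.search(r"<.*>", text).group(0)
shortest = re.search(r"<.*?>", text).group(0)

print(doubled)
print(greedy)
print(shortest)
```

Perl's syntax for these constructs is essentially the same (`\b`, `(...)`, `\1`, `.*?`), so the patterns should carry over with little change.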

October

Date  Lecture Notes (PDF, PowerPoint)  Number of Slides  Panopto  Topic
10/2 lecture12.pdf lecture12.pptx 27 Viewer Homework 8: regex and the Pandora Papers.
More on Perl regex: Perl code inside a replacement string in s/pattern/replacement/!
Character and word frequency counting
Zipf's Law: Brown Corpus case study
File: 12thnight.txt
Homework file (UTF-8): pandora.txt
10/7 lecture13.pdf lecture13.pptx 35 Viewer Homework 8 review.
Perl regex: lookahead, lookbehind.
Predicate argument structure. Framenet. Propbank.
10/9 lecture14.pdf lecture14.pptx 30 Viewer Stanford CoreNLP. Stanford dependencies and Universal Dependencies (UD). ChatGPT and syntax.
Homework 9
Slides updated @ 11am. Instructor available for homework consultation after the linguistics colloquium Friday afternoon (3-4:30pm; Comm 311).
10/14 lecture15.pdf lecture15.pptx 46 Viewer Homework 9 review.
Extra: ChatGPT 5 on the center-embedding cases
Ungraded regex exercises.
Regex recursion.
11:30am: slides corrected
10/16 lecture16.pdf lecture16.pptx 30 Viewer Regex recursion contd.
Prime number testing using Perl regex.
Introduction to Finite State Automata (FSA).
Slides updated: 11am
10/21 lecture17.pdf lecture17.pptx 17 Viewer regex example: xkcd simplewriter.
Disjunctive list of words in JavaScript.
FSA contd.: formal definition, Perl and Python implementations revisited.
ε-transitions, and single vs. multiple start states.
File: fsa.perl
10/23 lecture18.pdf lecture18.pptx 16 Viewer Non-deterministic FSA (NDFSA). NDFSA to DFSA conversion. Three worked examples.
Markov Chains. Case of Eugene Onegin.
Homework 10. Note: Q5 is not optional.
Slides updated: 12pm
10/28 lecture19.pdf lecture19.pptx 22 Viewer Homework 10 review.
Regular Languages and closure properties.
Turing Machines (TMs), a useful digression.
Programming a TM: +1, +2, doubling.
10/30 lecture20.pdf lecture20.pptx 16 Viewer Talk announcement.
Worked example where a machine is (perhaps) easier to build than a regex.
The state bypass method: converting an FSA into a regex algorithmically.
Homework 11
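
For the NDFSA to DFSA conversion covered on 10/23, the following is a minimal Python sketch of the subset construction, assuming an NDFSA without ε-transitions and with a single start state. The function names `nfa_to_dfa` and `accepts` and the example machine (strings over {a, b} ending in "ab") are hypothetical and are not taken from the lecture's worked examples or from fsa.perl.

```python
from collections import deque

def nfa_to_dfa(delta, start, finals, alphabet):
    """Subset construction. delta maps (state, symbol) -> set of states."""
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for sym in alphabet:
            # The DFSA target is the union of all NDFSA moves on sym.
            target = frozenset(
                q for s in current for q in delta.get((s, sym), set())
            )
            dfa_delta[(current, sym)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A DFSA state is final if it contains any NDFSA final state.
    dfa_finals = {S for S in seen if S & finals}
    return dfa_delta, start_set, dfa_finals

def accepts(dfa_delta, start_set, dfa_finals, string):
    """Run the DFSA; symbols must come from the alphabet used above."""
    state = start_set
    for ch in string:
        state = dfa_delta[(state, ch)]
    return state in dfa_finals

# Hypothetical NDFSA: strings over {a, b} that end in "ab".
nfa = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
}
dfa_delta, dfa_start, dfa_finals = nfa_to_dfa(nfa, 0, {2}, "ab")

for s in ["ab", "aab", "abb", "ba"]:
    print(s, accepts(dfa_delta, dfa_start, dfa_finals, s))
```

Each DFSA state is a set of NDFSA states, so only the subsets actually reachable from the start state are built.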

November

Date  Lecture Notes (PDF, PowerPoint)  Number of Slides  Panopto  Topic
11/4 lecture21.pdf lecture21.pptx 35 Viewer Announcement: class schedule for the remainder of the semester
538 Presentations: announcement
Homework 11 review.
Beyond regular languages: {aⁿbⁿ | n ≥ 1} and {1ⁿ | n is prime}.
A formal tool: the Pumping Lemma for regular languages.
11/6 lecture22.pdf lecture22.pptx 26 Viewer Homework 12. Install SWI-Prolog.
A quick introduction.
Recursive definitions. Non-determinism and backtracking. Language enumeration. Control of backtracking using fail and cut.
Built-in predicates: length/2 and findall/3.
set_prolog_flag for answer reporting: answer_write_options.
Some example Prolog programs:
test.prolog
N!: factorial.prolog
N!: factorial2.prolog
Σ*: sigmastar.prolog
Terminal log: terminal22.txt
11/11 Viewer Veterans Day - no classes
11/13 lecture23.pdf lecture23.pptx 29 Viewer Homework 13.
The Chomsky hierarchy: (type-3) regular grammars. The definite clause grammar (DCG) formalism.
Regular grammars in Prolog. Tree representation. Prolog's default computation rule. Top-down and bottom-up derivations.
The correspondence between right recursive regular grammars and FSA.
Example type-3 DCG: apbp.prolog
bb+ grammar: bbp.prolog
Grammar: bab.prolog
Terminal log: terminal23.txt
11/18 lecture24.pdf lecture24.pptx 26 Viewer Homework 13 Review.
Beyond regular languages: left and right recursive rules.
Using an extra argument to return a parse tree.
Grammar: anbn.prolog
Grammar: anbn2.prolog
A brief note on expressive power and extra arguments.
Left recursion, set enumeration and infinite loops.
Grammar: lrrg.prolog
11/20 lecture25.pdf lecture25.pptx Viewer 538 Presentations: part 1
11/25 lecture26.pdf lecture26.pptx Viewer 538 Presentations: part 2
11/27 Viewer Thanksgiving - no classes
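
As a companion to the 11/18 topics (a grammar for aⁿbⁿ and using an extra argument to return a parse tree), here is a minimal Python sketch of a recursive recognizer for the rules S → a S b and S → a b. It is an analogue written for illustration, assuming those two rules; it is not a translation of anbn.prolog, and the names `parse_s` and `parse` are hypothetical. The returned tuple plays the role of the extra parse-tree argument.

```python
def parse_s(tokens, i=0):
    """Try S -> a S b, then S -> a b, starting at position i.

    Returns (tree, next_position) on success, or None on failure.
    """
    if i < len(tokens) and tokens[i] == "a":
        # First rule: S -> a S b
        inner = parse_s(tokens, i + 1)
        if inner is not None:
            subtree, j = inner
            if j < len(tokens) and tokens[j] == "b":
                return ("s", "a", subtree, "b"), j + 1
        # Second rule: S -> a b
        if i + 1 < len(tokens) and tokens[i + 1] == "b":
            return ("s", "a", "b"), i + 2
    return None

def parse(string):
    """Return a parse tree if the whole string is in the language, else None."""
    result = parse_s(list(string))
    if result is not None and result[1] == len(string):
        return result[0]
    return None

print(parse("aabb"))
print(parse("aab"))
```

Both rules are right- or center-recursive, so the recursion always consumes an `a` before calling itself and terminates; a left recursive rule written this way would loop forever, which is the issue raised in the same lecture.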

December

Date  Lecture Notes (PDF, PowerPoint)  Number of Slides  Panopto  Topic
12/2 Viewer Away - no classes
12/4 Viewer Away - no classes
12/9 Viewer Away - no classes


Last modified: Thu Oct 30 10:46:51 MST 2025