
LING/C SC/PSYC 408/508
Computational Techniques for Linguists
Fall 2023

This is an introductory course on computers and general programming useful for linguists (and non-engineers). Topics include Linux and the Terminal (shell usage and programming), Python, and web technologies such as HTML, CSS, Javascript and Apache2. A term project is required.

Textbook

No textbook is required. All reading material will be made available online.

Software

All software used in this class will be freely available.
Students will be expected to install virtualized Ubuntu (Linux) on their computers for programming use.

Instructor: Sandiway Fong sandiway@arizona.edu
Office: Douglass 311 (send email for an appointment or take a chance and drop by before/after class)

Administrivia

Location: C E Chavez Bldg, Rm 405
Time: Tuesdays/Thursdays 12:30PM - 1:45PM

Syllabus

See lecture1 slides.

Lecture Notes

Available in Adobe PDF and Microsoft Powerpoint .pptx formats.

August

Date  Lecture Notes (PDF, Powerpoint)  Number of Slides  Panopto  Topic
8/22 lecture1.pdf lecture1.pptx 25 Administrivia and Introduction.
Computers and storage. Architecture and Machine Language.
8/24 lecture2.pdf lecture2.pptx 24 Binary and hexadecimal. Computer vs. Human Brain. Machine Language. Parallelism and supercomputers. Integers and 2's complement arithmetic. Binary Coded Decimal (BCD). Floating point numbers.
Homework 1
8/29 lecture3.pdf lecture3.pptx 26 Viewer Homework 1 review. Character representation: ASCII and Unicode. HTML5 and Unicode.
pokemon.html / pokemon2.html / pokemon3.html
8/31 lecture4.pdf lecture4.pptx 42 Viewer Homework 2. Installing Ubuntu as a guest operating system on your computer.
  • Install Windows Subsystem for Linux 2 (WSL2)
  • Install VirtualBox for Intel/AMD x86 architectures
  • Install Multipass

September

Date  Lecture Notes (PDF, Powerpoint)  Number of Slides  Panopto  Topic
9/5 lecture5.pdf lecture5.pptx 18 Viewer Terminal commands. The Bash shell. Control-D, what it means. Control-G, ringing the bell. bc calculator. Shell variables. Shell commands. Directories and files. Shell arithmetic using expr and ((...)). PS1 customization. .bashrc editing using nano.
A first shell program using for-do-done.
Terminal log: terminal5.txt
9/7 lecture6.pdf lecture6.pptx 18 Viewer cat command. Bash shell programming. Asking for Shell input using read. chmod. if-then-else/elif-fi.
Homework 3: Bash shell exercises. Useful word commands: wc, tr, sort, and uniq.
File: text.txt
9/12 lecture7.pdf lecture7.pptx 28 Viewer Homework 3 review. The spirit of Unix. tail. awk. termgraph: ASCII graphics. A note on file permissions: chmod.
Slides updated: 1:50pm
9/14 lecture8.pdf lecture8.pptx 21 Viewer bc revisited: scale, obase, arbitrary precision pi and e. Bash command substitution: two ways. Positional parameters $n. If-test: [...] and [[...]]. Homework 4.
File: test.sh / test2.sh / test3.sh
Terminal log: terminal8.txt
9/19 lecture9.pdf lecture9.pptx 18 Viewer Homework 4 Review.
Bash shell: suffix and prefix deletion, string manipulation, loops, positional parameters and globbing.
Example programs:
line30.sh
cmd.sh
rmext.sh
Terminal log: terminal9.txt
9/21 lecture10.pdf lecture10.pptx 28 Viewer Final lecture on bash.
Example exercises: (1) file deletion, (2) double-spacing, (3) non-blank lines only, and (4) find and sed.
Example programs:
rm.sh
doublespace.sh
doublespace2.sh
nonblank.sh
Terminal log: terminal10.txt
9/26 lecture11.pdf lecture11.pptx 30 Viewer Introduction to HTML5: HTML, CSS and Javascript.
What is hypertext? Client-side vs. server-side. HTML tags. URL format. IMG. Embedded images using base64. Text element tags. Preformatted blocks. Turning on debugging in the browser: Safari, Google Chrome and Firefox.
Ungraded Homework Exercise.
Slides updated: 1:45pm
9/28 lecture12.pdf lecture12.pptx 18 Viewer More HTML tags. X11 colors. UTF-8 and the meta tag. Introduction to CSS: inline style. Tabs.
Homework 5
Example HTML file used in class: test.html

October

Date  Lecture Notes (PDF, Powerpoint)  Number of Slides  Panopto  Topic
10/3 lecture13.pdf lecture13.pptx 25 Viewer Javascript. Console in browser.
document.write(), document.getElementById().innerHTML.
Tic-tac-toe example.
Javascript variables, and numbers. Random number generation.
File: sample.html
File: sample2.html
File: sample3.html
10/5 lecture14.pdf lecture14.pptx 26 Viewer Javascript strings, operators, if-else, switch-case, for/while loops.
Tic-tac-toe example contd.
File: sample3.html
File: sample4.html
File: sample5.html
File: sample6.html
File: sample7.html
Homework 6: making the game playable, step-by-step!
Slides corrected: 1:42pm
10/10 lecture15.pdf lecture15.pptx 22 Viewer Some notes on Homework 6: setTimeout().
Some Term Project Ideas.
Forms and Javascript: File: inputtext.html
File: inputcheckbox.html
File: select.html
File: radio.html
Example: BMI Gauge animated display (using Javascript)
File: bmi-gauge.html
File: gaugeSVG.js
10/12 lecture16.pdf lecture16.pptx 22 Viewer Example: BMI Gauge animated display (using Javascript) contd.
File: bmi-gauge.html
File: gaugeSVG.js
Gage:
File: counter.html
Homework 7
10/17 lecture17.pdf lecture17.pptx 44 Viewer xkcd: simplewriter
Installing and configuring the Apache2 webserver on macOS and Ubuntu: start/stop service, DocumentRoot, and UserDir.
IP lookup. IPv6 and IPv4. Geolocation.

File: sample-index.html
Terminal log: terminal17.txt
10/19 lecture18.pdf lecture18.pptx 26 Viewer Homework 8
cgi-bin (Common Gateway Interface) setup on macOS and Ubuntu.
An Example using HTML/Javascript: diskspace.cgi.
Command: df -g.
Use of awk to extract information from df.
File: test.cgi (Download linked file as...)
File: diskspace.cgi (Download linked file as...)
File: canvasjs.min.js (Javascript widget)

Terminal log: terminal18.txt
Slides corrected: 1:35pm
10/24 lecture19.pdf lecture19.pptx 29 Viewer Running a cgi-bin program from inside your home directory.
Worked example: CMU pronouncing dictionary run in Perl.
File: cmudict.cgi (Download linked file as...)
Slides corrected for macOS setup: 11:15am Oct 26
10/26 lecture20.pdf lecture20.pptx 32 Viewer Get method and POST method for forms.
Worked example: adding and deleting names from a database.
Files:
cumdictform.html
cmudict2.cgi
form-post.html
read.cgi
addnames.html
get2.cgi
addnames3.html
get3.cgi
Terminal log: terminal20.txt
10/31 lecture21.pdf lecture21.pptx 31 Viewer Why Python? Examples of what you can do with Python + nltk.
Python numbers.
Homework 9: install Python 3.
Terminal log: terminal21.txt
Slides updated: 2:20pm

November

Date  Lecture Notes (PDF, Powerpoint)  Number of Slides  Panopto  Topic
11/2 lecture22.pdf lecture22.pptx 26 Viewer Homework 10: Install nltk and nltk data.
More on Python: range() and use in calculating compound interest.
Type coercion: converting numbers and strings.
Function def. yield and return.
Formatted output, e.g. print('format string'.format()).
Strings and lists: indexability and mutability.
Command line arguments: sys.argv list.
File: futval.py
File: futval2.py
File: futval3.py
Terminal log: terminal22.txt
11/7 lecture23.pdf lecture23.pptx 20 Viewer Python and text. Exercises. for-loop and list comprehension. Functions sum() and mean().
nltk.corpus.gutenberg.words(). nltk.FreqDist() and .plot().
set()
Files: open(), .read(), .readline(), .readlines().
Slides corrected: 1:55pm
Terminal log: terminal23.txt
11/9 lecture24.pdf lecture24.pptx 22 Viewer .word_tokenize(), .pos_tag().
Penn Treebank POS tagset.
treebank.parsed_sents(), .draw().
.chunk.ne_chunk()
.concordance(), .similar(), .common_contexts()
Homework 11.
Terminal log: terminal24.txt
11/14 lecture25.pdf lecture25.pptx 28 Viewer def lex_diversity(), eval().
Importing your own corpus: an example using Project Gutenberg.
text.findall(r"... pattern ...")
String methods: word.startswith(string), word.endswith(string), word.istitle().
Terminal log: terminal25.txt
11/16 lecture26.pdf lecture26.pptx 19 Viewer Solving an encoding mystery: Latin-1 vs. UTF-8.
Sentence tokenization: nltk.sent_tokenize(string). Literary Style: Stream of consciousness and nltk. Mrs. Dalloway vs. Brown Fiction. matplotlib bar charting.
11/21 lecture27.pdf lecture27.pptx 32 Viewer Term Project. No more homework assignments.
Berkeley Neural Parser and Stanford Stanza on the famous buffalo sentence.
CFG parsing of the buffalo sentence.
nltk CFG: buffalo.txt
11/23 No class: Thanksgiving Break
11/28 lecture28.pdf lecture28.pptx 19 Viewer Bigrams and Conditional Frequency Distributions.
nltk.ConditionalFreqDist()
Generating random text. random.choice().
File: nwords.py
File: oliver_twist0-53.txt
Terminal log: terminal28.txt
11/30 lecture29.pdf lecture29.pptx 30 Viewer nltk.ConditionalFreqDist() and other corpora from nltk.
Brown corpus and modals per genre.
Reuters news: reuters
Presidential Inaugural addresses: inaugural and startswith america, citizen. Howard Taft's 1909 address.
Universal Declaration of Human Rights: udhr and word length for Latin1 encoded languages.
udhr and histogram of length of declaration in words.
Wordlist corpus: unusual words in Emma.
Stopword list.
Male and female name lists. Names ending in 'a' and 'e' are female?

Slides corrected: 1:30pm

December

Date  Lecture Notes (PDF, Powerpoint)  Number of Slides  Panopto  Topic
12/5 lecture30.pdf lecture30.pptx 26 Viewer Reminder: Term Project writeup due end of week.
Stylometry: two approaches.
100 most frequent words, Who wrote Wuthering Heights?
word length: mendenhall1887.pdf


Last modified: Sun Jun 2 14:01:19 JST 2024