Course Staff

Chenhao Tan
Instructor
Qirun Dai
Teaching Assistant
Dang Nguyen
Teaching Assistant
Darin Keng
Teaching Assistant

Logistics

Content

What is this course about?

This course introduces fundamental concepts in natural language processing (NLP). It covers the basics of enabling computers to understand and generate language, including word embeddings, language modeling, transformers, and an overview of large language models. It also covers connections with other disciplines, such as linguistics and the social sciences.
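As a small taste of the course's first topic, here is a minimal tokenization sketch (illustrative only, not course code): it splits text on whitespace and maps each token to an integer id, a far simpler scheme than the subword tokenizers used in practice.

```python
# Illustrative sketch of tokenization: mapping text to integer ids.
# Real NLP systems use subword tokenizers; this is only a teaser.

def build_vocab(texts):
    """Assign an integer id to each distinct whitespace-separated token."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def encode(text, vocab):
    """Map a text to its token ids; unknown tokens get -1."""
    return [vocab.get(token, -1) for token in text.lower().split()]

corpus = ["NLP is fun", "language models generate language"]
vocab = build_vocab(corpus)
ids = encode("NLP models generate fun", vocab)  # -> [0, 4, 5, 2]
```

Handling tokens outside the vocabulary (here crudely marked as -1) is one of the problems that motivates the subword methods discussed in lecture.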

Prerequisites


Coursework

Assignments

Project

Exams

Compute

Modal has generously offered compute to each student. See details on Ed.

Textbook

There are a lot of resources online for related content. We will provide readings and pointers throughout the course. A recommended textbook is Speech and Language Processing by Dan Jurafsky and James H. Martin.

Honor Code

We expect students not to look up solutions or implementations online. Like all other classes at UChicago, we take academic honesty very seriously. Please make sure to read the UChicago Academic Honesty page.

Collaboration policy

For individual assignments, collaboration with fellow students is encouraged as long as it is properly disclosed in each submission. However, you should not share any written work or code for your assignments. After discussing a problem with others, you should write up the solution by yourself. For the final project, you are expected to work in groups of 2-3.

AI tools policy

Using generative AI tools such as Claude Code and ChatGPT is allowed as long as their use is properly disclosed in each submission. For individual assignments, we encourage you to implement the solutions on your own to maximize your learning, but learning the content with AI tools is acceptable. In fact, we encourage creative use of these tools, treating them as collaborators in the learning process.

Additional course policies can be found on Canvas.

Submitting Coursework

Late Days


Preliminary Schedule

1. Tues Jan 6: Introduction and tokenization
   Materials: lecture, notebook (html version)
   Deadlines: Assignment 1 out

2. Thurs Jan 8: Word vectors
   Materials: lecture, notebook (html version)
   Readings: Efficient Estimation of Word Representations in Vector Space by Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean; SLP Chapters 2 and 5

3. Tues Jan 13: Text Classification
   Materials: lecture, notebook
   Readings: Deep Unordered Composition Rivals Syntactic Methods for Text Classification by Mohit Iyyer, Varun Manjunatha, Jordan Boyd-Graber, and Hal Daumé III; SLP Chapter 4
   Deadlines: Assignment 1 due (Tuesday night); Assignment 2 out

4. Thurs Jan 15: N-gram Language Modeling, Neural Language Models
   Materials: lecture, notebooks
   Readings: A Neural Probabilistic Language Model by Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Jauvin; SLP Chapters 3 and 13

5. Tues Jan 20: The NLP Recipe
   Materials: lecture, notebooks

6. Thurs Jan 22: Attention
   Materials: lecture, notebooks
   Readings: SLP Chapter 8; Attention Is All You Need by Vaswani et al.
   Deadlines: Assignment 2 due (Friday night); Assignment 3 out (Friday)

7. Tues Jan 27: Transformers
   Materials: lecture, notebooks
   Readings: SLP Chapter 8; Attention Is All You Need by Vaswani et al.

8. Thurs Jan 29: Attention and transformer demo
   Materials: lecture, notebooks
   Readings: SLP Chapter 8

9. Tues Feb 3: Pretraining and Fine-tuning
   Materials: lecture, notebooks
   Readings: SLP Chapter 8; Language Models are Unsupervised Multitask Learners by Radford et al.; Scaling Laws for Neural Language Models by Kaplan et al.

10. Thurs Feb 5: Midterm
    Deadlines: Assignment 3 due (Friday night); Project Proposal due (Sunday night); Assignment 4 out (Friday)

11. Tues Feb 10: Decoding LLMs and Prompting
    Materials: lecture
    Readings: SLP Chapter 7; Chain-of-Thought Prompting Elicits Reasoning in Large Language Models by Wei et al.; Large Language Models are Zero-Shot Reasoners by Kojima et al.; Fast Inference from Transformers via Speculative Decoding by Leviathan et al.

12. Thurs Feb 12: Benchmarking and evaluation
    Materials: lecture, notebooks
    Readings: Task-Completion Time Horizons of Frontier AI Models

13. Tues Feb 17: Post-training
    Materials: lecture, notebooks
    Readings: Training language models to follow instructions with human feedback by Ouyang et al.; DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning by DeepSeek-AI

14. Thurs Feb 19: Reward Optimization Continued
    Materials: lecture, notebooks
    Deadlines: Assignment 4 due (Friday night); Blog Entry 1 due

15. Tues Feb 24: Guest Lecture: Hypothesis Generation with Large Language Models

16. Thurs Feb 26: Selected advanced topics in LLMs
    Deadlines: Blog Entry 2 due

17. Tues Mar 3: Guest Lecture: Multimodal NLP

18. Thurs Mar 5: Final Project Presentation

19. TBD: Final Exam

Acknowledgments

This course website is adapted from the Stanford CS336 course website. This course is built on prior offerings by Mina Lee and Chenhao Tan.