Content
What is this course about?
This course will introduce fundamental concepts in natural language processing (NLP). It
will cover the basics of enabling computers to understand and generate language, including
word embeddings, language modeling, transformers, and an overview of large language models. It
will also cover connections with other disciplines, such as linguistics and other
social sciences.
Prerequisites
- Proficiency in Python
You should have a strong foundation in Python and familiarity with Git and Jupyter Notebook. In addition, you must be able to quickly learn and adapt to new frameworks. The course will involve using the PyTorch framework, the HuggingFace library, and several APIs (e.g., OpenAI and TogetherAI). Although optional tutorial sessions will be provided to introduce the basics, you are expected to be able to learn from documentation and troubleshoot effectively.
- Calculus, Linear Algebra and Probability
You should be comfortable taking (multivariable) derivatives, understanding
matrix/vector notation and operations, and the basics of probability.
- Machine Learning
You should have taken one of CMSC 25300, CMSC 25400, or CMSC 25025.
Coursework
Assignments
- Assignment 1: TBD
  - Due: Tuesday, January 13, 2026
- Assignment 2: TBD
  - Due: Friday, January 23, 2026
- Assignment 3: TBD
  - Due: Friday, February 6, 2026
- Assignment 4: TBD
  - Due: Friday, February 20, 2026
Project
- Project Proposal: Due Week 5
- Weekly Blog Entries: Weeks 7, 8, 9
- Final Project Presentation: Week 9
Exams
- Midterm: Week 5
- Final Exam: Week 10
Compute
Modal has generously offered compute to each student. See details on Ed.
Textbook
Many resources on related content are available online. We will provide readings and pointers
throughout the course. A recommended textbook is Speech and Language Processing by Dan
Jurafsky and James H. Martin.
Honor Code
We expect students not to look at solutions or implementations online. As in all other
classes at UChicago, we take academic honesty very seriously. Please make sure to read the
UChicago Academic Honesty page.
Collaboration policy
For individual assignments, collaboration with fellow students is encouraged as long as it
is properly disclosed for each submission. However, you should not share any written work
or code for your assignments. After discussing a problem with others, you should write the
solution by yourself. For final projects, you are expected to work in groups of 2-3.
AI tools policy
Using generative AI tools such as Claude Code and ChatGPT is allowed as long as their use is
properly disclosed for each submission. For individual assignments, we encourage you to
implement the solutions on your own to maximize your learning, but learning the content with
AI tools is acceptable. In fact, we encourage creative use of these
tools, treating them as collaborators in the learning process.
Additional course policies can be found on Canvas.
Submitting Coursework
- All coursework should be submitted via Gradescope by the deadline.
Late Days
- Each student has 6 late days to use throughout the quarter.
- Each assignment can use at most 3 late days.
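The two limits above interact: a plan can respect the per-assignment cap and still exceed the quarterly total. The following is a minimal sketch of that check, not an official tool. The function name `late_day_plan_is_valid`, the assignment labels, and the assumption that late days are counted in whole days are all hypothetical; only the numbers 6 and 3 come from the policy.

```python
# Limits taken from the policy above; everything else is an illustrative assumption.
TOTAL_LATE_DAYS = 6        # per student, for the whole quarter
MAX_PER_ASSIGNMENT = 3     # cap on any single assignment


def late_day_plan_is_valid(days_used):
    """Return True if a hypothetical late-day plan respects both limits.

    days_used maps an assignment label to the whole late days spent on it.
    """
    # Each assignment must use between 0 and MAX_PER_ASSIGNMENT late days.
    if any(d < 0 or d > MAX_PER_ASSIGNMENT for d in days_used.values()):
        return False
    # The sum across all assignments must fit within the quarterly budget.
    return sum(days_used.values()) <= TOTAL_LATE_DAYS


print(late_day_plan_is_valid({"A1": 3, "A2": 3}))           # True
print(late_day_plan_is_valid({"A1": 4}))                    # False: over the per-assignment cap
print(late_day_plan_is_valid({"A1": 3, "A2": 3, "A3": 1}))  # False: 7 days in total
```

How partial days are counted and what happens once late days run out are not specified here; see the additional course policies on Canvas.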