Chenhao Tan, University of Chicago
@ChenhaoTan, @chenhaotan.bsky.social
chenhao@uchicago.edu
Detecting Pretraining Data from Large Language Models. Shi et al. (2024)
Proving Test Set Contamination in Black Box Language Models. Oren et al. (2024)
Ouyang et al. (2022); Taori et al. (2023); Wang et al. (2023)
Ouyang et al. (2022); Touvron et al. (2023)
DeepSeek-AI (2025)