Assistant Professor
Computer Science and Engineering
University of California, Santa Cruz
Office: E2-341A
About Me
I’m an Assistant Professor of Computer Science and Engineering at UC Santa Cruz. My research interests are data-centric machine learning and trustworthy machine learning. The central question of my work is how to learn from dynamic, biased, and noisy data.
Previously, I was a postdoctoral fellow at Harvard University. I have a Ph.D. from the University of Michigan, Ann Arbor, and a B.Sc. from Shanghai Jiao Tong University, China.
My research is generously supported by the National Science Foundation (through its CORE, FAI, CAREER, and TRIPODS programs), the Office of Naval Research (Basic AI Research), Amazon, UC Santa Cruz, and CROSS. I was partially supported by the DARPA SCORE program.
[REAL@UCSC] Our group’s research results are disseminated at
[Opensource] is online. This is a library to help you understand and curate your data.
[Datasets] is online. This is the CIFAR data accompanied by noisy human annotations that we collected from MTurk.
Recent News
- [2024.10] We released a preprint outlining our perspective on the problem of large language model unlearning.
- [2024.08] We received a new grant from the NSF SLES program to study “Foundation of Safe Learning Under Distribution Shift”.
- [2023.04] Invited to give an IJCAI 2023 Early Career Spotlight talk.
- [2023.04] We will deliver a hands-on tutorial on learning with noisy labels at IJCAI 2023, primarily using Jupyter notebook examples. Stay tuned!
- [2023.04] We will organize the Data-centric Machine Learning Research (DMLR) workshop at ICML 2023. In parallel, we will launch a new journal, DMLR. Stay tuned!
Recent papers
[2024.10] We have 8 papers accepted to NeurIPS 2024.
[2024.07] Our work Predicting the replicability of social and behavioural science claims in COVID-19 preprints has been accepted to Nature Human Behaviour!
[2024.02] We have 5 papers accepted to ICLR 2024, including two spotlight selections!
[2023.10] We have released a preprint on Large Language Model Unlearning. In this paper, we propose a solution for teaching a large language model to “forget” certain undesired training data, including data that represents harmful concepts and biases, copyright-protected content, and violations of user privacy or other policies.
[2023.10] We have released a preprint on Trustworthy Large Language Models. In this paper, we identify the major dimensions to consider when building a trustworthy LLM.
Recent awards
[2023.05] Our preprint Long-Term Fairness with Unknown Dynamics provides a reinforcement learning solution for achieving long-term fairness when the user-model interaction dynamics are unknown. This paper was selected as a highlight paper and the Best paper runner-up at the ICLR 2023 Workshop on Trustworthy and Reliable Large-Scale Machine Learning Models.
[2022.07] Our paper Model Transferability with Responsive Decision Subjects won the Best paper award at the ICML 2022 Workshop on New Frontiers in Adversarial Machine Learning.
[2022.07] Our paper Unfairness Despite Awareness: Group-Fair Classification with Strategic Agents received the Best paper award at the AAMAS 2022 Workshop on Learning with Strategic Agents (LSA).
[2022.02] I have received the NSF CAREER award. Details: here. Thank you, NSF!
Invited talks
- [2024.05] Large Language Model Unlearning @Silicon Valley Chapter of IEEE Computer Society.
- [2023.11] Unmasking and Improving Data Credibility: A Study with Datasets for Training Harmless Language Models @RIKEN workshop on Weakly Supervised Learning.
- [2022.11] Agency Bias in Machine Learning @USC ML Seminar.
- [2022.11] Agency Bias in Machine Learning @UW ECE Colloquium.