Contact

Assistant Professor

Computer Science and Engineering

University of California, Santa Cruz

Office: E2-341A

About Me

I’m an Assistant Professor of Computer Science and Engineering at UC Santa Cruz. My research interests are data-centric machine learning and trustworthy machine learning. The central question driving my work is how to learn from dynamic, biased, and noisy data.

Previously, I was a postdoctoral fellow at Harvard University. I received my Ph.D. from the University of Michigan, Ann Arbor, and my B.Sc. from Shanghai Jiao Tong University, China.

My research is generously supported by the National Science Foundation (through its CORE, FAI, CAREER, and TRIPODS programs), the Office of Naval Research (Basic AI Research), Amazon, UC Santa Cruz, and CROSS. I was partially supported by the DARPA SCORE program.


Recent News

  • [2024.10] We released a preprint outlining our perspective on the problem of large language model unlearning.
  • [2024.08] We received a new grant from the NSF SLES program to study “Foundation of Safe Learning Under Distribution Shift”.
  • [2023.04] Invited to give an IJCAI 2023 Early Career Spotlight talk.
  • [2023.04] We will deliver a hands-on tutorial (primarily using Jupyter notebook examples) on learning with noisy labels at IJCAI 2023. Stay tuned!
  • [2023.04] We will be organizing the Data-centric Machine Learning Research (DMLR) workshop at ICML 2023. In parallel, we will launch a new journal, DMLR. Stay tuned!

Recent papers

  • [2024.10] We have 8 papers accepted to NeurIPS 2024.

  • [2024.07] Our work “Predicting the replicability of social and behavioural science claims in COVID-19 preprints” has been accepted to Nature Human Behaviour!

  • [2024.02] We have 5 papers accepted to ICLR 2024, including two spotlight selections!

  • [2023.10] We have released a preprint on Large Language Model Unlearning. In this paper, we propose an approach for teaching a large language model to “forget” undesired training data, including data representing harmful concepts and biases, copyright-protected content, and data that violates user privacy or other policies.

  • [2023.10] We have released a preprint on Trustworthy Large Language Model. In this paper, we identify the major dimensions to consider when building a trustworthy LLM.


Invited talks

  • [2024.05] Large Language Model Unlearning @ Silicon Valley Chapter of IEEE Computer Society.

  • [2023.11] Unmasking and Improving Data Credibility: A Study with Datasets for Training Harmless Language Models @ RIKEN Workshop on Weakly Supervised Learning.

  • [2022.11] Agency Bias in Machine Learning @ USC ML Seminar.

  • [2022.11] Agency Bias in Machine Learning @ UW ECE Colloquium.