Yuanzhi Li

Cited by

	All	Since 2019
Citations	24735	23821
h-index	52	50
i10-index	93	92

Citations per year
Year	Citations
2016	78
2017	176
2018	456
2019	1056
2020	1477
2021	1818
2022	2192
2023	5672
2024	11470

Public access (based on funding mandates)

36 articles available
0 articles not available

Yuanzhi Li

Assistant Professor at CMU

Verified email at andrew.cmu.edu

Machine Learning


Title	Cited by	Year
LoRA: Low-rank adaptation of large language models. EJ Hu, Y Shen, P Wallis, Z Allen-Zhu, Y Li, S Wang, L Wang, W Chen. arXiv preprint arXiv:2106.09685, 2021	8224	2021
Sparks of artificial general intelligence: Early experiments with GPT-4. S Bubeck, V Chandrasekaran, R Eldan, J Gehrke, E Horvitz, E Kamar, ... arXiv preprint arXiv:2303.12712, 2023	3217	2023
A convergence theory for deep learning via over-parameterization. Z Allen-Zhu, Y Li, Z Song. International conference on machine learning, 242-252, 2019	1617	2019
Learning and generalization in overparameterized neural networks, going beyond two layers. Z Allen-Zhu, Y Li, Y Liang. Advances in neural information processing systems 32, 2019	880	2019
Convergence analysis of two-layer neural networks with ReLU activation. Y Li, Y Yuan. Advances in neural information processing systems 30, 2017	775	2017
Learning overparameterized neural networks via stochastic gradient descent on structured data. Y Li, Y Liang. Advances in neural information processing systems 31, 2018	729	2018
A theoretical analysis of NDCG type ranking measures. Y Wang, L Wang, Y Li, D He, TY Liu. Conference on learning theory, 25-54, 2013	722	2013
A latent variable model approach to PMI-based word embeddings. S Arora, Y Li, Y Liang, T Ma, A Risteski. Transactions of the Association for Computational Linguistics 4, 385-399, 2016	664*	2016
Textbooks are all you need. S Gunasekar, Y Zhang, J Aneja, CCT Mendes, A Del Giorno, S Gopi, ... arXiv preprint arXiv:2306.11644, 2023	436	2023
Towards understanding ensemble, knowledge distillation and self-distillation in deep learning. Z Allen-Zhu, Y Li. arXiv preprint arXiv:2012.09816, 2020	405	2020
An alternative view: When does SGD escape local minima? B Kleinberg, Y Li, Y Yuan. International conference on machine learning, 2698-2707, 2018	365	2018
Algorithmic regularization in over-parameterized matrix sensing and neural networks with quadratic activations. Y Li, T Ma, H Zhang. Conference On Learning Theory, 2-47, 2018	361	2018
Phi-3 technical report: A highly capable language model locally on your phone. M Abdin, SA Jacobs, AA Awan, J Aneja, A Awadallah, H Awadalla, ... arXiv preprint arXiv:2404.14219, 2024	353	2024
Towards explaining the regularization effect of initial large learning rate in training neural networks. Y Li, C Wei, T Ma. Advances in neural information processing systems 32, 2019	352	2019
Textbooks are all you need II: phi-1.5 technical report. Y Li, S Bubeck, R Eldan, A Del Giorno, S Gunasekar, YT Lee. arXiv preprint arXiv:2309.05463, 2023	302	2023
Linear algebraic structure of word senses, with applications to polysemy. S Arora, Y Li, Y Liang, T Ma, A Risteski. Transactions of the Association for Computational Linguistics 6, 483-495, 2018	273	2018
Algorithmic framework for model-based deep reinforcement learning with theoretical guarantees. Y Luo, H Xu, Y Li, Y Tian, T Darrell, T Ma. arXiv preprint arXiv:1807.03858, 2018	260	2018
Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions. S Chen, S Chewi, J Li, Y Li, A Salim, AR Zhang. arXiv preprint arXiv:2209.11215, 2022	241	2022
Gradient descent on neural networks typically occurs at the edge of stability. JM Cohen, S Kaur, Y Li, JZ Kolter, A Talwalkar. arXiv preprint arXiv:2103.00065, 2021	237	2021
What can ResNet learn efficiently, going beyond kernels? Z Allen-Zhu, Y Li. Advances in Neural Information Processing Systems 32, 2019	224	2019
