Ruiqi Zhang
Title
Cited by
Year
Trained transformers learn linear models in-context
R Zhang, S Frei, PL Bartlett
Journal of Machine Learning Research 25 (49), 1-55, 2023
Cited by: 199; Year: 2023
Negative preference optimization: From catastrophic collapse to effective unlearning
R Zhang, L Lin, Y Bai, S Mei
The First Conference of Language Models (COLM) in 2024, 2024
Cited by: 68; Year: 2024
AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question Decomposition
Z Chen, Z Zhao, Z Zhu, R Zhang, X Li, B Raj, H Yao
NAACL 2024, 2024
Cited by: 20; Year: 2024
Off-policy fitted q-evaluation with differentiable function approximators: Z-estimation and inference theory
R Zhang, X Zhang, C Ni, M Wang
International Conference on Machine Learning, 26713-26749, 2022
Cited by: 20; Year: 2022
In-context learning of a linear Transformer block: benefits of the MLP component and one-step GD initialization
R Zhang, J Wu, PL Bartlett
The 38th Annual Conference on Neural Information Processing Systems …, 2023
Cited by: 14; Year: 2023
Fast Best-of-N Decoding via Speculative Rejection
H Sun, M Haider, R Zhang, H Yang, J Qiu, M Yin, M Wang, P Bartlett, ...
NIPS 2024, 2024
Cited by: 8*; Year: 2024
Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data
R Zhang, A Zanette
Advances in Neural Information Processing Systems, 2024, 2023
Cited by: 8; Year: 2023
Optimal estimation of policy gradient via double fitted iteration
C Ni, R Zhang, X Ji, X Zhang, M Wang
International Conference on Machine Learning, 16724-16783, 2022
Cited by: 6*; Year: 2022
Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning
C Fan, J Liu, L Lin, J Jia, R Zhang, S Mei, S Liu
arXiv preprint arXiv:2410.07163, 2024
Year: 2024
Is Offline Decision Making Possible with Only Few Samples? Reliable Decisions in Data-Starved Bandits via Trust Region Enhancement
R Zhang, Y Zhai, A Zanette
arXiv preprint arXiv:2402.15703, 2024
Year: 2024
Choose Your Anchor Wisely: Effective Unlearning Diffusion Models via Concept Reconditioning
J Zhu, R Zhang, L Lin, S Mei
NeurIPS Safe Generative AI Workshop 2024