Xiang Ji
Bootstrapping fitted q-evaluation for off-policy inference
B Hao, X Ji, Y Duan, H Lu, C Szepesvári, M Wang
International Conference on Machine Learning, 4074-4084, 2021
Sample complexity of nonparametric off-policy evaluation on low-dimensional manifolds using deep networks
X Ji, M Chen, M Wang, T Zhao
The Eleventh International Conference on Learning Representations, 2023
Bootstrapping Statistical Inference for Off-Policy Evaluation
B Hao, X Ji, Y Duan, H Lu, C Szepesvári, M Wang
arXiv preprint arXiv:2102.03607, 2021
Urban bike lane planning with bike trajectories: Models, algorithms, and a real-world case study
S Liu, ZJM Shen, X Ji
Manufacturing & Service Operations Management 24 (5), 2500-2515, 2022
Provable benefits of policy learning from human preferences in contextual bandit problems
X Ji, H Wang, M Chen, T Zhao, M Wang
arXiv preprint arXiv:2307.12975, 2023
Optimal estimation of policy gradient via double fitted iteration
C Ni, R Zhang, X Ji, X Zhang, M Wang
International Conference on Machine Learning, 16724-16783, 2022
Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time
X Ji, G Li
Conference on Neural Information Processing Systems, 2023
Policy Evaluation for Reinforcement Learning from Human Feedback: A Sample Complexity Analysis
Z Li, X Ji, M Chen, M Wang
International Conference on Artificial Intelligence and Statistics, 2737-2745, 2024
Sample Complexity of Preference-Based Nonparametric Off-Policy Evaluation with Deep Networks
Z Li, X Ji, M Chen, M Wang
arXiv preprint arXiv:2310.10556, 2023
Sample Complexity of Neural Policy Mirror Descent for Policy Optimization on Low-Dimensional Manifolds
Z Xu, X Ji, M Chen, M Wang, T Zhao
arXiv preprint arXiv:2309.13915, 2023