Stephen McAleer
Verified email at openai.com
Highly accurate machine fault diagnosis using deep transfer learning
S Shao, S McAleer, R Yan, P Baldi
IEEE Transactions on Industrial Informatics 15 (4), 2446-2455, 2018
Solving the Rubik’s cube with deep reinforcement learning and search
F Agostinelli*, S McAleer*, A Shmakov*, P Baldi
Nature Machine Intelligence 1 (8), 356-363, 2019
Language Models can Solve Computer Tasks
G Kim, P Baldi, S McAleer
Neural Information Processing Systems (NeurIPS), 2023
Mastering the game of Stratego with model-free multiagent reinforcement learning
J Perolat, B De Vylder, D Hennes, E Tarassov, F Strub, V de Boer, ...
Science 378 (6623), 990-996, 2022
Llemma: An Open Language Model for Mathematics
Z Azerbayev, H Schoelkopf, K Paster, M Dos Santos, S McAleer, AQ Jiang, ...
International Conference on Learning Representations (ICLR), 2023
AI Alignment: A Comprehensive Survey
J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang, Y Duan, Z He, J Zhou, ...
arXiv preprint arXiv:2310.19852, 2023
Solving the Rubik's Cube with Approximate Policy Iteration
S McAleer*, F Agostinelli*, A Shmakov*, P Baldi
International Conference on Learning Representations (ICLR), 2018
Pipeline PSRO: A scalable approach for finding approximate nash equilibria in large games
S McAleer*, J Lanier*, R Fox, P Baldi
34th Conference on Neural Information Processing Systems (NeurIPS), 2020
Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning
Y Chen, Y Yang, T Wu, S Wang, X Feng, J Jiang, SM McAleer, H Dong, ...
36th Conference on Neural Information Processing Systems (NeurIPS 2022 …, 2022
Evolutionary reinforcement learning for sample-efficient multiagent coordination
S Majumdar, S Khadka, S Miret, S McAleer, K Tumer
International Conference on Machine Learning (ICML), 2020
XDO: A double oracle algorithm for extensive-form games
S McAleer, J Lanier, P Baldi, R Fox
Advances in Neural Information Processing Systems (NeurIPS), 2021
Independent Natural Policy Gradient Always Converges in Markov Potential Games
R Fox, S McAleer, W Overman, I Panageas
AISTATS 2022, 2021
Neural auto-curricula in two-player zero-sum games
X Feng, O Slumbers, Z Wan, B Liu, S McAleer, Y Wen, J Wang, Y Yang
Advances in Neural Information Processing Systems (NeurIPS), 2021
Alphazero-like tree-search can guide large language model decoding and training
Z Wan, X Feng, M Wen, SM McAleer, Y Wen, W Zhang, J Wang
Forty-first International Conference on Machine Learning, 2024
Online Double Oracle
LC Dinh, Y Yang, S McAleer, NP Nieves, O Slumbers, Z Tian, DH Mguni, ...
Transactions on Machine Learning Research, 2021
White Paper: ARIANNA-200 high energy neutrino telescope
A Anker, P Baldi, SW Barwick, D Bergman, H Bernhoff, DZ Besson, ...
arXiv preprint arXiv:2004.09841, 2020
Deep-learning-based reconstruction of the neutrino direction and energy for in-ice radio detectors
C Glaser, S McAleer, S Stjärnholm, P Baldi, SW Barwick
Astroparticle Physics 145, 102781, 2023
Curiosity-Driven Multi-Criteria Hindsight Experience Replay
J Lanier, S McAleer, P Baldi
NeurIPS 2019 Deep RL Workshop, 2019
Toward Optimal Policy Population Growth in Two-Player Zero-Sum Games
S McAleer, JB Lanier, K Wang, P Baldi, R Fox, T Sandholm
International Conference on Learning Representations (ICLR), 2022
Reducing variance in temporal-difference value estimation via ensemble of deep networks
L Liang, Y Xu, S McAleer, D Hu, A Ihler, P Abbeel, R Fox
International Conference on Machine Learning (ICML), 2022