Marek Petrik
Verified email at cs.unh.edu
Title
Cited by
Year
An approximate solution method for large risk-averse Markov decision processes
M Petrik, D Subramanian
arXiv preprint arXiv:1210.4901, 2012
Cited by: 187 | Year: 2012
Finite-sample analysis of proximal gradient TD algorithms
B Liu, J Liu, M Ghavamzadeh, S Mahadevan, M Petrik
arXiv preprint arXiv:2006.14364, 2020
Cited by: 175 | Year: 2020
Safe policy improvement by minimizing robust baseline regret
M Ghavamzadeh, M Petrik, Y Chow
Advances in Neural Information Processing Systems 29, 2016
Cited by: 153 | Year: 2016
Feature selection using regularization in approximate linear programs for Markov decision processes
M Petrik, G Taylor, R Parr, S Zilberstein
arXiv preprint arXiv:1005.1860, 2010
Cited by: 91 | Year: 2010
An Analysis of Laplacian Methods for Value Function Approximation in MDPs.
M Petrik
IJCAI, 2574-2579, 2007
Cited by: 91 | Year: 2007
Biasing approximate dynamic programming with a lower discount factor
M Petrik, B Scherrer
Advances in Neural Information Processing Systems 21, 2008
Cited by: 70 | Year: 2008
Fast Bellman updates for robust MDPs
CP Ho, M Petrik, W Wiesemann
International Conference on Machine Learning, 1979-1988, 2018
Cited by: 68 | Year: 2018
Beyond confidence regions: Tight Bayesian ambiguity sets for robust MDPs
M Petrik, RH Russel
Advances in Neural Information Processing Systems 32, 2019
Cited by: 67 | Year: 2019
A practical method for solving contextual bandit problems using decision trees
AN Elmachtoub, R McNellis, S Oh, M Petrik
arXiv preprint arXiv:1706.04687, 2017
Cited by: 61 | Year: 2017
Learning parallel portfolios of algorithms
M Petrik, S Zilberstein
Annals of Mathematics and Artificial Intelligence 48, 85-106, 2006
Cited by: 61 | Year: 2006
Partial policy iteration for L1-robust Markov decision processes
CP Ho, M Petrik, W Wiesemann
Journal of Machine Learning Research 22 (275), 1-46, 2021
Cited by: 55 | Year: 2021
Constraint relaxation in approximate linear programs
M Petrik, S Zilberstein
Proceedings of the 26th Annual International Conference on Machine Learning …, 2009
Cited by: 46 | Year: 2009
Tight approximations of dynamic risk measures
DA Iancu, M Petrik, D Subramanian
Mathematics of Operations Research 40 (3), 655-682, 2015
Cited by: 45 | Year: 2015
A bilinear programming approach for multiagent planning
M Petrik, S Zilberstein
Journal of Artificial Intelligence Research 35, 235-274, 2009
Cited by: 45 | Year: 2009
RAAM: The benefits of robustness in approximating aggregated MDPs in reinforcement learning
M Petrik, D Subramanian
Advances in Neural Information Processing Systems 27, 2014
Cited by: 44 | Year: 2014
Average-Reward Decentralized Markov Decision Processes.
M Petrik, S Zilberstein
IJCAI, 1997-2002, 2007
Cited by: 38 | Year: 2007
Bayesian robust optimization for imitation learning
D Brown, S Niekum, M Petrik
Advances in Neural Information Processing Systems 33, 2479-2491, 2020
Cited by: 37 | Year: 2020
Social media and customer behavior analytics for personalized customer engagements
S Buckley, M Ettl, P Jain, R Luss, M Petrik, RK Ravi, C Venkatramani
IBM Journal of Research and Development 58 (5/6), 7:1-7:12, 2014
Cited by: 33 | Year: 2014
Proximal Gradient Temporal Difference Learning Algorithms.
B Liu, J Liu, M Ghavamzadeh, S Mahadevan, M Petrik
IJCAI, 4195-4199, 2016
Cited by: 32 | Year: 2016
Anytime coordination using separable bilinear programs
M Petrik, S Zilberstein
Proceedings of the National Conference on Artificial Intelligence 22 (1), 750, 2007
Cited by: 32 | Year: 2007