Odalric-Ambrym Maillard
(Former)
Research output
- 2019
- Published
Regret Bounds for Learning State Representations in Reinforcement Learning
Ortner, R., Pirotta, M., Lazaric, A., Fruit, R. & Maillard, O-A., Dec 2019. Research output: Contribution to conference › Poster › Research › peer-review
- E-pub ahead of print
Regret Bounds for Learning State Representations in Reinforcement Learning
Ortner, R., Pirotta, M., Lazaric, A., Fruit, R. & Maillard, O-A., 2019, (E-pub ahead of print) Advances in Neural Information Processing Systems. Vol. 32. p. 12717-12727. Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
- 2014
- Published
Selecting Near-Optimal Approximate State Representations in Reinforcement Learning
Ortner, R., Maillard, O-A. & Ryabko, D., 2014, Algorithmic Learning Theory - 25th International Conference, ALT 2014, Bled, October 8-10, 2014. p. 140-154. Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
- 2013
- Published
Competing with an Infinite Set of Models in Reinforcement Learning
Nguyen, P., Maillard, O-A., Ryabko, D. & Ortner, R., 2013, JMLR Workshop and Conference Proceedings Volume 31: Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics. p. 463-471. Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
- Published
Linear regression with random projections.
Maillard, O-A., 2013, In: Journal of Machine Learning Research (JMLR). 13, p. 1-1. Research output: Contribution to journal › Article › Research › peer-review
- Published
Optimal regret bounds for selecting the state representation in reinforcement learning.
Maillard, O-A., Nguyen, P., Ortner, R. & Ryabko, D., 2013, JMLR Workshop and Conference Proceedings Volume 28: Proceedings of The 30th International Conference on Machine Learning. p. 543-551. Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
- 2011
- Published
Adaptive bandits: Towards the best history-dependent strategy
Maillard, O-A., 2011, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. p. 570-578. Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
- Published
Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences
Maillard, O-A., 2011, Proceedings of the 24th Annual Conference on Learning Theory. p. 497-514. Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
- Published
Selecting the State-Representation in Reinforcement Learning
Maillard, O-A., 2011, Advances in Neural Information Processing Systems 24. p. 2627-2635. Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
- Published
Sparse recovery with Brownian sensing
Maillard, O-A., 2011, Advances in Neural Information Processing Systems 24. p. 1782-1790. Research output: Chapter in Book/Report/Conference proceeding › Conference contribution