Fair model-based reinforcement learning comparisons with explicit and consistent update frequency
Joint work with Abdelhakim Benechehab, Giuseppe Paolo and Balázs Kégl.
Implicit update frequencies can introduce ambiguity in the interpretation of model-based reinforcement learning benchmarks, obscuring the real objective of the evaluation. While the update frequency can sometimes be optimized to improve performance, real-world applications often impose constraints, allowing updates only between deployments on the actual system. This blog post emphasizes the need for evaluations using consistent update frequencies across different algorithms to provide researchers and practitioners with clearer comparisons under realistic constraints.
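The constraint described above can be made concrete with a minimal sketch of a model-based RL training loop in which the update frequency is an explicit, fixed parameter rather than an implicit implementation detail. All names here (`run`, the placeholder policy and dynamics) are hypothetical illustrations, not the interface of any specific benchmark:

```python
# Minimal sketch: a model-based RL loop where the model may only be
# (re)trained every `update_freq` environment steps, mimicking a
# real-world setting where updates happen between deployments.
# All names and the toy dynamics are hypothetical.

import random

def run(total_steps=1000, update_freq=200, seed=0):
    rng = random.Random(seed)
    buffer = []            # transitions collected since the last update
    model_updates = 0      # number of times the model was (re)trained

    state = 0.0            # toy scalar environment state
    for step in range(1, total_steps + 1):
        action = rng.uniform(-1, 1)   # placeholder policy
        next_state = state + action   # placeholder dynamics
        buffer.append((state, action, next_state))
        state = next_state

        # The update frequency is explicit and consistent: the model is
        # retrained only at deployment boundaries, never opportunistically.
        if step % update_freq == 0:
            model_updates += 1        # a real agent would train on `buffer` here
            buffer.clear()

    return model_updates

print(run())  # 1000 steps with update_freq=200 -> 5 model updates
```

Holding `update_freq` fixed across the algorithms being compared is what makes the benchmark interpretation unambiguous: every method sees the same deployment schedule.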
Published at the Third Blogpost Track at ICLR 2024. Check the blog post here.