A partially-observable Markov decision process for dealing with dynamically changing environments
Date Issued
2014
DOI
10.1007/978-3-662-44654-6_11
Abstract
This paper offers a solution to the non-stationary POMDP problem using methods and concepts from Bayesian non-parametrics, specifically dynamic hierarchical Dirichlet process priors. We combine block Gibbs sampling and importance sampling to perform inference. We evaluate the method on several benchmark policy learning tasks.

