Type:
Conference Paper › Invited and refereed articles in conference proceedings
Authored by:
Baras, John S.; Borkar, V. S.
Conference date:
December 2000
Conference:
Proceedings of the IEEE Conference on Decision and Control, pp. 3351-3356
Abstract:
We propose a simulation-based algorithm that uses aggregated states to learn good policies for a Markov decision process with an unknown transition law. The state aggregation itself can be adapted on a slower time scale by an auxiliary learning algorithm. Rigorous justifications are provided for both algorithms.