Local-global interval MDPs for efficient motion planning with learnable uncertainty

J. Jiang, Y. Zhao, S. Coogan
American Control Conference, 2024

Abstract

We study the problem of computationally efficient control synthesis for Interval Markov Decision Processes (IMDPs), that is, MDPs with interval uncertainty on the transition probabilities, against tasks specified in linear temporal logic. To address the scalability challenge of synthesizing a control policy on the full model at once, we propose decomposing the monolithic global IMDP into a collection of interconnected local IMDPs. We focus on the problem of robotic motion planning. Specifically, we assume a setting in which the transition probabilities can be learned, and their interval uncertainty reduced, by observing the dynamics of the system at runtime. This introduces an exploration objective: the system must explore enough to ensure that the planning task can be completed with sufficient probability of success. We perform decoupled exploration and learning on the local IMDPs and then combine the local control policies to guarantee global task satisfaction. In a simulation-based case study, we show that, compared to existing approaches, our proposed decomposition leads to faster learning and faster satisfaction of the planning task, and that it provides a feasible controller in cases where other methods are infeasible.
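
To make the two ingredients named in the abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes a plain reachability objective in place of a full linear temporal logic task, and it does not model the local-global decomposition. The names `hoeffding_interval`, `worst_case_expectation`, `robust_reachability`, and the three-state example are hypothetical and introduced only for illustration. The Hoeffding bound is one common way to shrink an interval from runtime observations; the abstract does not state which estimator the paper uses.

```python
import math
from typing import Dict, Set, Tuple

Interval = Tuple[float, float]
# imdp[state][action][successor] = (lower, upper) transition probability bounds
IMDP = Dict[str, Dict[str, Dict[str, Interval]]]


def hoeffding_interval(successes: int, trials: int, delta: float) -> Interval:
    """Interval for one transition probability from observed counts.

    Assumption: a two-sided Hoeffding bound with confidence 1 - delta.
    """
    p_hat = successes / trials
    eps = math.sqrt(math.log(2.0 / delta) / (2.0 * trials))
    return (max(0.0, p_hat - eps), min(1.0, p_hat + eps))


def worst_case_expectation(bounds: Dict[str, Interval],
                           values: Dict[str, float]) -> float:
    """Minimum expected value over all distributions inside the intervals."""
    # Start every successor at its lower bound.
    probs = {s: lo for s, (lo, _) in bounds.items()}
    budget = 1.0 - sum(probs.values())
    # Give the remaining mass to the lowest-value successors first.
    for s in sorted(bounds, key=lambda t: values[t]):
        lo, hi = bounds[s]
        extra = min(hi - lo, budget)
        probs[s] += extra
        budget -= extra
    return sum(probs[s] * values[s] for s in bounds)


def robust_reachability(imdp: IMDP, goal: Set[str],
                        iterations: int = 100) -> Dict[str, float]:
    """Lower bound on the best achievable probability of reaching `goal`."""
    values = {s: (1.0 if s in goal else 0.0) for s in imdp}
    for _ in range(iterations):
        updated = {}
        for s, actions in imdp.items():
            if s in goal or not actions:
                updated[s] = values[s]  # goal and absorbing states are fixed
            else:
                updated[s] = max(worst_case_expectation(b, values)
                                 for b in actions.values())
        values = updated
    return values


# Hypothetical three-state example: "goal" and "trap" are absorbing.
example: IMDP = {
    "s0": {
        "a": {"goal": (0.6, 0.9), "trap": (0.1, 0.4)},
        "b": {"goal": (0.5, 0.95), "trap": (0.05, 0.5)},
    },
    "goal": {},
    "trap": {},
}
result = robust_reachability(example, {"goal"})
```

In this sketch, more observations give a narrower interval from `hoeffding_interval`, which in turn raises the worst-case satisfaction probability computed by `robust_reachability`. That is the sense in which exploration can make a task feasible that the initial, wider intervals could not certify.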