Multi-Objective Decision Making. Diederik M. Roijers

Multi-Objective Decision Making - Diederik M. Roijers Synthesis Lectures on Artificial Intelligence and Machine Learning

theory, at a graduate or undergraduate level. In order to remain accessible to a wide range of readers, we provide intuitive explanations and examples of key concepts before formalizing them. In some cases, we omit detailed proofs of theorems in order to better focus on the intuition behind and implications of these theorems. In such cases, we provide references to the detailed proofs.

      Outline. This book is structured as follows. In Chapter 1, we motivate multi-objective decision making by providing examples of multi-objective decision problems and scenarios that require explicitly multi-objective solution methods. In Chapter 2, we introduce two popular classes of decision problems that we use throughout the book to illustrate specific algorithms and general theoretical results. In Chapter 3, we present a taxonomy of solution concepts for multi-objective decision problems. Using this taxonomy, we discuss different solution methods. First, we assume that the model of the environment is known to the agents, leading to a planning setting. In Chapters 4 and 5, we discuss two different approaches for finding a coverage set using planning algorithms. In Chapter 6, we remove the assumption that the agents are given a model of the environment, and consider cases where they must learn about the environment through interaction. Finally, we discuss several illustrative applications in Chapter 7, followed by conclusions and future work in Chapter 8.

      Diederik M. Roijers and Shimon Whiteson

      April 2017

       Acknowledgments

      This book is based on our research on multi-objective decision making over the years. During this research, we collaborated with people whose input has been essential to our understanding of the field. We would like to thank several of them explicitly.

      Together with Peter Vamplew and Richard Dazeley, we wrote our 2013 survey article on multi-objective sequential decision making. The discussions we had about the nature of multi-objective decision problems were vital in shaping our ideas about this field, and laid the foundation for how we view multi-objective decision problems.

      In the past few years, one of our main collaborators (and Diederik’s other PhD supervisor) has been Frans A. Oliehoek. Together, we developed many algorithms for multi-objective decision making, including the CMOVE and OLS algorithms that we discuss in Chapters 4 and 5. Frans’s vast expertise in partially observable decision problems and limitless capacity for generating new ideas have been invaluable to our work in the field of multi-objective decision making.

      Together with Joris Scharpff, Matthijs Spaan, and Mathijs de Weerdt, we worked on the traffic network maintenance planning problem (which we discuss in Section 7.3), and in this context improved upon the original OLS algorithm (Chapter 5). We enjoyed this productive collaboration.

      We would also like to thank our other past and present co-authors and collaborators with whom we have worked on multi-objective decision making problems: Alexander Ihler, João Messias, Maarten van Someren, Chiel Kooijman, Maarten Inja, Maarten de Waard, Luisa Zintgraf, Timon Kanters, Philipp Beau, Richard Pronk, Carla Groenland, Elise van der Pol, Joost van Doorn, Daan Odijk, Maarten de Rijke, Ayumi Igarashi, Hossam Mossalam, and Yannis Assael.

      Finally, we would like to thank several people with whom we had interesting discussions about multi-objective decision making over the years: Ann Nowé, Kristof van Moffaert, Tim Brys, Abdel-Illah Mouaddib, Paul Weng, Grégory Bonnet, Rina Dechter, Radu Marinescu, Shlomo Zilberstein, Kyle Wray, Patrice Perny, Paolo Viappiani, Pascal Poupart, Max Welling, Karl Tuyls, Francesco Delle Fave, Joris Mooij, Reyhan Aydoğan, and many others.

      Diederik M. Roijers and Shimon Whiteson

      April 2017

       Table of Abbreviations

Abbreviation | Full Name | Location
AOLS | approximate optimistic linear support | Algorithm 5.10, Section 5.5
CCS | convex coverage set | Definition 3.7, Section 3.2.2
CH | convex hull | Definition 3.6, Section 3.2.2
CHVI | convex hull value iteration | Section 4.3.2
CLS | Cheng’s linear support | Section 5.3
CMOVE | convex multi-objective variable elimination | Section 4.2.3
CoG | coordination graph | Definition 2.4, Section 2.2.1
CS | coverage set | Definition 3.5, Section 3.2
f | scalarization function | Definition 1.1, Section 1.1
MDP | Markov decision process | Definition 2.6, Section 2.3.1
MO-CoG | multi-objective coordination graph | Definition 2.5, Section 2.2.2
MODP | multi-objective decision problem | Definition 2.2, Section 2.1
MOMDP | multi-objective Markov decision process | Definition 2.8, Section 2.3.2
MORL | multi-objective reinforcement learning |
