Publications
A complete list can also be found on my Google Scholar page.
2023
- [LIC. THESIS] Learning in the Loop: On Neural Network-based Model Predictive Control and Cooperative System Identification. Rebecka Winqvist. KTH Royal Institute of Technology, 2023
In the context of control systems, the integration of machine learning mechanisms has emerged as a key approach for improving performance and adaptability. Notable progress has been made across several aspects of the control loop, including learning-based techniques for system identification and estimation, filtering and denoising, and controller design. This thesis delves into the rapidly expanding domain of learning in control, with a particular focus placed on learning-based controllers and learning-based identification methods. The first part of this thesis is devoted to the investigation of Neural Network approximations of Model Predictive Control (MPC). Model-agnostic neural network structures are compared to networks employing MPC-specific information, and evaluated in terms of two performance metrics. The main novel aspect lies in the incorporation of gradient data in the training process, which is shown to enhance the accuracy of the network-generated control inputs. Furthermore, experimental results reveal that MPC-informed networks outperform the agnostic counterparts in scenarios where training data is limited. In acknowledgement of the crucial role accurate system models play in the control loop, the second part of this thesis lends its focus to learning-based identification methods. This line of work addresses the important task of characterizing and modeling dynamical systems, by introducing cooperative system identification techniques to enhance estimation performance. Specifically, it presents a novel and generalized formulation of the Correctional Learning framework, leveraging tools from Optimal Transport. The correctional learning framework centers around a teacher-student model, where an expert agent (teacher) modifies the sampled data used by the learner agent (student), to improve the student’s estimation process.
By formulating correctional learning as an optimal transport problem, a more adaptable framework is achieved, better suited for estimating complex system characteristics and accommodating alternative intervention strategies.
- [CDC] Optimal Transport for Correctional Learning. In 2023 62nd IEEE Conference on Decision and Control (CDC), 2023
The contribution of this paper is a generalized formulation of correctional learning using optimal transport, the problem of moving one mass distribution onto another at minimal cost. Correctional learning is a framework developed to enhance the accuracy of parameter estimation processes by means of a teacher-student approach. In this framework, an expert agent, referred to as the teacher, modifies the data used by a learning agent, known as the student, to improve its estimation process. The objective of the teacher is to alter the data such that the student’s estimation error is minimized, subject to a fixed intervention budget. Compared to existing formulations of correctional learning, our novel optimal transport approach provides several benefits. It allows for the estimation of more complex characteristics as well as the consideration of multiple intervention policies for the teacher. We evaluate our approach on two theoretical examples, and on a human-robot interaction application in which the teacher’s role is to improve the robot’s performance in an inverse reinforcement learning setting.
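As a rough illustration of what "transporting one mass distribution to another" means, the sketch below solves a tiny one-dimensional transport problem. It is not the formulation from the paper: the function name `monotone_transport`, the cost |x - y|, and the example numbers are all assumptions chosen for illustration. It relies on the standard fact that, on the real line with this cost, pairing mass in sorted order gives an optimal plan.

```python
def monotone_transport(xs, p, ys, q):
    """Optimal plan between two 1-D discrete distributions.

    xs, ys are sorted support points; p, q are their masses (equal totals).
    For the cost |x - y| on the line, moving mass in sorted order
    (the monotone coupling) is optimal, so no LP solver is needed.
    """
    p, q = list(p), list(q)  # work on copies
    plan = []                # entries (i, j, mass moved from xs[i] to ys[j])
    i = j = 0
    while i < len(xs) and j < len(ys):
        m = min(p[i], q[j])
        if m > 0:
            plan.append((i, j, m))
        p[i] -= m
        q[j] -= m
        # Advance whichever side has been used up.
        if p[i] <= 1e-12:
            i += 1
        else:
            j += 1
    cost = sum(m * abs(xs[i] - ys[j]) for i, j, m in plan)
    return plan, cost


# Hypothetical example: move mass (0.5, 0.5) at points (0, 1)
# onto mass (0.25, 0.75) at points (0, 2).
plan, cost = monotone_transport([0.0, 1.0], [0.5, 0.5], [0.0, 2.0], [0.25, 0.75])
```

In the correctional learning setting, the teacher's intervention budget would limit how much transport cost may be spent; that constraint is not modeled in this sketch.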
2022
- [CDC] A Teacher-Student Markov Decision Process-based Framework for Online Correctional Learning. In 2022 IEEE 61st Conference on Decision and Control (CDC), 2022
A classical learning setting typically concerns an agent/student who collects data, or observations, from a system in order to estimate a certain property of interest. Correctional learning is a type of cooperative teacher-student framework where a teacher, who has partial knowledge about the system, has the ability to observe and alter (correct) the observations received by the student in order to improve the accuracy of its estimate. In this paper, we show how the variance of the estimate of the student can be reduced with the help of the teacher. We formulate the corresponding online problem – where the teacher has to decide, at each time instant, whether or not to change the observations due to a limited budget – as a Markov decision process, from which the optimal policy is derived using dynamic programming. We validate the framework in numerical experiments, and compare the optimal online policy with the one from the batch setting.
2021
- [IFAC SYSID] Learning Models of Model Predictive Controllers using Gradient Data. Rebecka Winqvist, Arun Venkitaraman, and Bo Wahlberg. IFAC-PapersOnLine, 2021. 19th IFAC Symposium on System Identification SYSID 2021
This paper investigates the problem of controller identification given the data from a linear quadratic Model Predictive Controller (MPC) with constraints. We propose an approach for learning MPC that explicitly uses the gradient information in the training process. This is motivated by the observation that recent differentiable convex optimization MPC solvers can provide both the optimal feedback law from the state to control input as well as the corresponding gradient. As a proof of concept, we apply this approach to explicit MPC (eMPC), for which the feedback law is a piece-wise affine function of the state, but the number of pieces grows rapidly with the state dimension. Controller identification can here be used to find a low-complexity functional approximation of the controller. The eMPC is modelled using a Neural Network (NN) with Rectified Linear Units (ReLUs), since such NNs can represent any piece-wise affine function. A key motivation is to replace on-line solvers with neural networks to implement MPC and to simplify the evaluation of the function in larger input dimensions. We also study experimental design and model evaluation in this framework, and propose a hit-and-run sampling algorithm for input design. The proposed algorithms are illustrated and numerically evaluated on a second order MPC problem.
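To make the remark about ReLU networks and piece-wise affine functions concrete, here is a minimal sketch of the simplest such control law, a saturated linear feedback, written exactly with two ReLU units. The scalar gain `k` and bound `u_max` are illustrative assumptions and are not taken from the paper.

```python
def relu(z):
    """Rectified linear unit."""
    return max(z, 0.0)


def saturated_feedback(x, k=-1.0, u_max=1.0):
    """u = clip(k * x, -u_max, u_max), expressed with two ReLU units.

    The three affine pieces (lower bound, linear region, upper bound)
    come from the identity clip(z, -c, c) = relu(z + c) - relu(z - c) - c.
    """
    z = k * x
    return relu(z + u_max) - relu(z - u_max) - u_max


# Evaluate the law on a few hypothetical states.
states = [-3.0, -1.0, -0.3, 0.0, 0.3, 1.0, 3.0]
inputs = [saturated_feedback(x) for x in states]
```

An explicit MPC law over a higher-dimensional state has many more pieces, which is where a learned ReLU network becomes attractive as a compact approximation.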
2020
- [arXiv] On Training and Evaluation of Neural Network Approaches for Model Predictive Control. Rebecka Winqvist, Arun Venkitaraman, and Bo Wahlberg. 2020
The contribution of this paper is a framework for training and evaluation of Model Predictive Control (MPC) implemented using constrained neural networks. Recent studies have proposed to use neural networks with differentiable convex optimization layers to implement model predictive controllers. The motivation is to replace real-time optimization in safety critical feedback control systems with learnt mappings in the form of neural networks with optimization layers. Such mappings take as the input the state vector and predict the control law as the output. The learning takes place using training data generated from off-line MPC simulations. However, a general framework for characterization of learning approaches in terms of both model validation and efficient training data generation is lacking in the literature. In this paper, we take the first steps towards developing such a coherent framework. We discuss how the learning problem has similarities with system identification, in particular input design, model structure selection and model validation. We consider the study of neural network architectures in PyTorch with the explicit MPC constraints implemented as a differentiable optimization layer using CVXPY. We propose an efficient approach for generating MPC input samples subject to the MPC model constraints using a hit-and-run sampler. The corresponding true outputs are generated by solving the MPC offline using OSQP. We propose different metrics to validate the resulting approaches. Our study further aims to explore the advantages of incorporating domain knowledge into the network structure from a training and evaluation perspective. Different model structures are numerically tested using the proposed framework in order to obtain more insight into the properties of constrained neural network-based MPC.
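For readers unfamiliar with hit-and-run sampling, the following is a minimal generic sketch for drawing points from a bounded polytope {x : A x <= b}. It is not the implementation used in the paper; the function name `hit_and_run`, the box-shaped constraint set, and the sample count are assumptions made for illustration.

```python
import math
import random


def hit_and_run(A, b, x0, n_samples, seed=0):
    """Sample points from the bounded polytope {x : A x <= b}.

    A is a list of constraint rows, b the right-hand sides, and x0 a
    strictly feasible starting point.
    """
    rng = random.Random(seed)
    x = list(x0)
    samples = []
    for _ in range(n_samples):
        # Draw a direction uniformly on the unit sphere.
        d = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(di * di for di in d))
        d = [di / norm for di in d]

        # Find the feasible segment x + t * d with t in [t_lo, t_hi].
        t_lo, t_hi = -math.inf, math.inf
        for row, bi in zip(A, b):
            ad = sum(r * di for r, di in zip(row, d))
            slack = bi - sum(r * xi for r, xi in zip(row, x))
            if ad > 1e-12:
                t_hi = min(t_hi, slack / ad)
            elif ad < -1e-12:
                t_lo = max(t_lo, slack / ad)

        # Jump to a uniformly chosen point on that segment.
        t = rng.uniform(t_lo, t_hi)
        x = [xi + t * di for xi, di in zip(x, d)]
        samples.append(list(x))
    return samples


# Hypothetical state constraints -1 <= x1, x2 <= 1 written as A x <= b.
A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b = [1.0, 1.0, 1.0, 1.0]
points = hit_and_run(A, b, x0=[0.0, 0.0], n_samples=500)
```

Each sampled state would then be passed to an offline MPC solver to obtain the matching control input for the training set.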
2019
- [TAROS] instruMentor: An Interactive Robot for Musical Instrument Tutoring. Rebecka Winqvist*, Benedikt Maurer*, Tom Miller*, and 5 more authors. In Towards Autonomous Robotic Systems, 2019
Musical instrument education has typically faced challenges in providing students with a cost-efficient and long-term solution for personalised tutoring. To address these challenges, we propose a musical instrument tutor robot for students learning the recorder, called instruMentor. Equipped with robotic hands and a multimodal interface, the robot interacts with users by playing the recorder and demonstrating in real time the proper handling of the instrument. A pilot study was conducted to investigate the effectiveness of a robot tutor for instrument learning. Experimental results suggest that instruMentor is successful at teaching the recorder and is positively appreciated by users, showing promise for the future coupling of music tutoring and social robots.