In this thesis we evaluate the PILCO algorithm, a control-law search method, on a third-order unstable nonlinear system whose output signal is quantized in amplitude.
We first present the algorithm's theoretical foundations, Gaussian processes and reinforcement learning, followed by a description of the algorithm and a short example of its operation. Next, the mathematical model of the system is described. The system is a hydraulic plant in which a pump changes the size of a bubble inside a float, causing the float to ascend or descend in the water. Instead of experimenting on the physical plant, all experiments were run as computer simulations using the plant's model. The controller's objective was to stabilize the system and to bring the float to a preselected position and hold it there. We then describe the experimental runs with the different controllers. First, a PID controller was used to determine an appropriate sampling time for the system. Two experimental runs followed in which the PILCO algorithm was used as the controller: in the first, the amplitude of the plant's output signal was left continuous; in the second, it was quantized.
Finally, we examine the results of the experiments and conclude that the PILCO algorithm performs better with continuous signals, where it successfully meets the objective, and somewhat worse with quantized signals, where it still meets the objective but needs many more iterations.