Trained a lunar lander in a simulated environment with reinforcement learning. The agent was implemented with the Expected Sarsa algorithm and used a neural network to approximate action values. The algorithm performed planning steps with experience replay and learned a policy for landing the agent. Throughout the specialization I implemented several projects, some of which are listed below.

* Solved a Gridworld city with dynamic programming to find an optimal policy.
* Implemented the Dyna-Q and Dyna-Q+ algorithms in a changing maze environment to assess the performance of planning methods in RL.
* Implemented an Average Reward Softmax Actor-Critic algorithm using tile coding to solve the continuous Pendulum Swing-Up problem.
* Solved the Mountain Car and Lunar Lander problems.
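
For readers unfamiliar with Expected Sarsa, here is a minimal sketch of the bootstrapped target it uses for a single transition. It is not the project's actual code. The function names, the softmax policy with temperature `tau`, and the default `gamma` are all assumptions made for illustration. In the lander agent the next-state action values would come from the neural network, and the transitions would be sampled from the replay buffer.

```python
import math


def softmax(action_values, tau=1.0):
    """Turn action values into action probabilities (assumed policy)."""
    scaled = [q / tau for q in action_values]
    max_scaled = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - max_scaled) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def expected_sarsa_target(reward, next_action_values, terminal,
                          gamma=0.99, tau=1.0):
    """Expected Sarsa target: r + gamma * sum_a pi(a|s') * Q(s', a).

    On a terminal transition there is no next state, so the target
    is just the reward.
    """
    if terminal:
        return reward
    probs = softmax(next_action_values, tau)
    expected_next = sum(p * q for p, q in zip(probs, next_action_values))
    return reward + gamma * expected_next


# Hypothetical transition: reward 1.0, two actions with equal value 1.0.
target = expected_sarsa_target(1.0, [1.0, 1.0], terminal=False, gamma=0.5)
print(target)  # 1.5
```

The difference from Q-learning is in the bootstrap term: Expected Sarsa averages the next-state action values under the policy instead of taking their maximum, which removes the variance that comes from sampling the next action.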