![Certificate Practical Reinforcement Learning](https://kzhu.ai/wp-content/uploads/2021/07/Coursera-BLWHHBG7RNVH-1024x791.jpg)
My #61 course certificate (with Honors) from Coursera
Practical Reinforcement Learning, Higher School of Economics. I am very proud that I survived and completed this thorny but...
![Regret of Policy, Boltzmann strategy, Hoeffding inequality](https://kzhu.ai/wp-content/uploads/2021/07/Practical-Reinforcement-Learning-26-724x1024.jpg)
Exploration and Planning in Reinforcement Learning
Exploration is needed to find unknown actions which lead to very large rewards. Most of the reinforcement learning...
![Policy gradient methods](https://kzhu.ai/wp-content/uploads/2021/07/Practical-Reinforcement-Learning-23-724x1024.jpg)
Reinforcement Learning: Policy Gradient Methods
The problems of value-based methods: The idea behind value-based reinforcement learning (say, Q-learning) is to find an optimal...
![Deep Q-network](https://kzhu.ai/wp-content/uploads/2021/06/Practical-Reinforcement-Learning-19-724x1024.jpg)
Deep Q-Network in Reinforcement Learning
Deep Q-Network (DQN) is the first successful application of learning, both directly from raw visual inputs as humans...
![Function approximation in Reinforcement Learning](https://kzhu.ai/wp-content/uploads/2021/06/Practical-Reinforcement-Learning-14-724x1024.jpg)
Supervised Learning in Reinforcement Learning
Reduction to a supervised learning problem: In the tabular method, each Q(s, a) can be seen as a parameter. There...
![Monte Carlo, Temporal difference.](https://kzhu.ai/wp-content/uploads/2021/05/Practical-Reinforcement-Learning-11-724x1024.jpg)
Model-free Reinforcement Learning
Value Iteration in the real world: In the real world, we don’t have the state transition probability distribution or the...
![Dynamic programming, issues, discounting.](https://kzhu.ai/wp-content/uploads/2021/05/Dynamic-programming-in-RL-5-724x1024.jpg)
Dynamic Programming in RL
Reward: That all of what we mean by goals and purposes can be well thought of as maximization...