
Reinforcement Learning: Policy Gradient Methods
The problems of value-based methods: The idea behind value-based reinforcement learning (say, Q-learning) is to find an optimal...

My #59 course certificate from Coursera
Introduction to Google SEO, University of California, Davis. This is a wonderful course about marketing by means of search...

Google Search Engine Optimization
What is SEO? SEO stands for Search Engine Optimization, which is the practice of improving the visibility of a...

Sales Contracts: UCC Article 2
UCC stands for Uniform Commercial Code. UCC Article 2 pertains to sales of goods and provides additional rules when...

Multi-Period Binomial Model
Multi-period binomial model: The multi-period binomial model is really just a series of one-period models spliced together. When pricing...

Deep Q-Network in Reinforcement Learning
Deep Q-Network (DQN) is the first successful application of learning, both directly from raw visual inputs as humans...

Basics of Contracts
A contract is nothing more than an enforceable agreement. The person who makes an offer is the offeror, whose...

Supervised Learning in Reinforcement Learning
Reduction to a supervised learning problem: In the tabular method, each Q(s, a) can be seen as a parameter. There...

Computing the Tax
Income Tax Formula: This is an important formula that you’ll use throughout the course and beyond. The US tax...

Model-free Reinforcement Learning
Value Iteration in the real world: In the real world, we don’t have the state transition probability distribution or the...

Quadrotors: Energetic and System Design
Spinning all rotors of a quad-rotor in the same direction will cause the robot to constantly rotate. The...

My #56 course certificate from Coursera
What is Compliance, University of Pennsylvania. The world today operates in a highly regulated environment. Failure to comply usually...

Compliance and Risk Management
Basics: The simple fact is that the world today operates in a highly regulated environment. So when it comes...

Derivative Securities: Swaps, Futures and Options
Swaps: Why do companies or entities construct swaps? Because they want to change the nature of cash flows,...

Dynamic Programming in RL
Reward: That all of what we mean by goals and purposes can be well thought of as maximization...