"The mountain car problem is commonly applied because it requires a reinforcement learning agent to learn on two continuous variables: position and velocity. For any given state (position and velocity) of the car, the agent is given the possibility of driving left, driving right, or not using the engine at all. In the standard version of the problem, the agent receives a negative reward at every time step when the goal is not reached; the agent has no information about the goal until an initial success."
"QLearning is a model free reinforcement learning technique that can be used to find the optimal action selection policy using Q function without requiring a model of the environment. Q-learning eventually finds an optimal policy"
Refs:
Q-learning: https://en.wikipedia.org/wiki/Q-learning
Mountain Car Problem: https://en.wikipedia.org/wiki/Mountain_car_problem
Mountain Car OpenAI Gym: https://gym.openai.com/envs/MountainCar-v0/
Mountain Car Gym Git: https://github.com/openai/gym/wiki/MountainCar-v0
OpenAI Gym: https://gym.openai.com/docs/
More Refs: https://github.com/llSourcell/Q_Learning_Explained