A really simple demo of reinforcement learning

Ever since AlphaGo Zero taught itself the Chinese game of Go and went on to outplay the programs that had beaten Go world champions, there has been a new awakening in the machine learning space. Innovative companies are investing in ML initiatives, hoping to master the technology for big gains in the future.

It takes about 60-70 minutes for the agent (a 9-eyed worm) to learn. The best approach is to watch for a few minutes, go do something else for about an hour, and then come back. Make sure you keep this browser window open for that long.

A demo can be a nice way to quickly grasp the essence of a concept.

This was originally developed by Andrej Karpathy, former researcher and AI director at Tesla.
I have modified it slightly to make it less mathematical and changed the settings so that the agent learns faster.
You may also use these controls for a better understanding.

Brief explanation

In this demo, the agent (a worm) with 9 eyes has to find apples (the red circles) and eat them. It needs to learn to avoid eating poison (the green circles). The worm can move at 5 different angles and is constrained by the walls (the grey lines). It also has to learn to find the apples faster over time.
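To make the setup a little more concrete, here is a minimal Python sketch of how the rewards and the five moves could be written down. It is not the demo's actual code: the angle values, the reward numbers, the small per-step cost and the names `reward` and `choose_action` are all my own assumptions, picked only for illustration.

```python
import random

# All names and numbers below are hypothetical, chosen for illustration only.
ACTIONS = [-0.5, -0.25, 0.0, 0.25, 0.5]  # five steering angles in radians (assumed)
APPLE_REWARD = 1.0    # assumed reward for eating a red apple
POISON_REWARD = -1.0  # assumed penalty for eating green poison
STEP_COST = -0.01     # assumed small cost per step, so faster apple finding pays off


def reward(ate):
    """Return the reward for one time step, given what the worm ate (or None)."""
    if ate == "apple":
        return APPLE_REWARD + STEP_COST
    if ate == "poison":
        return POISON_REWARD + STEP_COST
    return STEP_COST


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy choice over one estimated value per action.

    With probability epsilon pick a random action (explore),
    otherwise pick the action with the highest estimated value (exploit).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])
```

For example, `choose_action([0.0, 0.2, 0.9, 0.1, 0.0], epsilon=0.0)` always picks the middle action, while a larger `epsilon` makes the worm try random moves more often.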

You can see in the demo that gradually, over about 60-70 minutes, the agent learns to find apples faster and eat them. It also learns to steer clear of the poisonous green circles.
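Why does the behaviour improve gradually rather than all at once? This post does not go into the learning algorithm the demo uses, so as a stand-in here is a rough sketch of tabular Q-learning, a standard textbook method. The state names, the learning rate `alpha` and the discount `gamma` are assumptions of mine. Each experience nudges the estimated value of an action a little towards what was actually observed, so good estimates only emerge after many repetitions.

```python
def q_update(q, state, action, r, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (r + gamma * best_next - q[state][action])
    return q[state][action]


# Toy table: three made-up states, five actions each, all values start at zero.
q = {
    "sees_apple": [0.0] * 5,
    "sees_poison": [0.0] * 5,
    "empty": [0.0] * 5,  # never updated here, so its values stay at zero
}

STRAIGHT = 2  # index of the "go straight" action (assumed)

for _ in range(50):
    # Going straight towards an apple is rewarded ...
    q_update(q, "sees_apple", STRAIGHT, 1.0, "empty")
    # ... and going straight into poison is punished.
    q_update(q, "sees_poison", STRAIGHT, -1.0, "empty")
```

After these repeated experiences, the value of going straight is close to +1 when an apple is in view and close to -1 when poison is in view, so a worm that picks the highest-valued action will head for apples and turn away from poison.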

If you are impatient like me and cannot wait that long, you may select the pre-trained agent to see how it behaves once it has learnt and mastered the tricks over a period of about an hour.


Shesh - Blogs


Friday, September 14, 2018
