Side Projects

Big 2 is a card game that is very popular in East and Southeast Asia. There are many variations of the rules, but my friends and I have developed a particular set that we play whenever we go away on holiday. This app began as a side project that I started on my own and then continued for a transferable skills module called "Practical Applications of Computational Techniques". The initial single-player version was written in plain JavaScript, while the full multiplayer app is being developed with Node.js. The code can be found here (with some screenshots), although it is unfinished and there is no live demo currently available.


Inspired largely by the recent work of DeepMind, who have used reinforcement learning to beat Atari games from raw pixel input alone, as well as the more famous AlphaGo, which beat the world champion Lee Sedol at the game of Go (and, even more recently, AlphaGo Zero, which has surpassed it using only self-play), I have spent quite a lot of time in the past year learning about machine learning and reinforcement learning. Having worked through Reinforcement Learning: An Introduction by Sutton and Barto and Deep Learning by Goodfellow et al., as well as many online tutorials, my aim for this project was to get some practical experience implementing reinforcement learning algorithms and to try to create an AI that can play the card game Big 2.

This is an interesting challenge because, unlike most of the games in which deep reinforcement learning has had success, it is both a four-player game (rather than a two-player one) and a game of imperfect information (i.e. no player knows the hands held by the others). On top of this, it has a fairly complicated action space: you are initially dealt a hand of 13 cards and can, in certain situations, play 1-card hands, 2-card hands (pairs), 3-card hands (three of a kind), 4-card hands (two pairs, four of a kind) and 5-card hands (straights, flushes, full houses) (see this blog post).

Surprisingly, simply by implementing the "Proximal Policy Optimization" algorithm and training purely through self-play (starting from randomly initialised neural networks and having the agents play against themselves), I have been able to train an AI that can now regularly beat me at the game. I've just finished working on a web app where you can try playing against it for yourself. Be prepared to lose! Rules for how to play can be found here.
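For anyone unfamiliar with Proximal Policy Optimization, the heart of the algorithm is a "clipped" objective that stops any single update from moving the policy too far from the one that generated the training games. The snippet below is a minimal sketch of that objective in plain Python, not the code used for this project: the function name `ppo_clip_objective`, the clip range of 0.2 and the example numbers are all illustrative assumptions.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (to be maximised) over a batch of actions.

    new_log_probs: log-probabilities of the sampled actions under the current policy
    old_log_probs: log-probabilities of the same actions under the policy that played the games
    advantages:    estimates of how much better each action was than expected
    clip_eps:      clip range (0.2 is an assumed, commonly used value)
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the current policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Hypothetical two-action batch: the first action became more likely and had a
# positive advantage, the second became less likely and had a negative one.
old = [math.log(0.5), math.log(0.5)]
new = [math.log(0.75), math.log(0.25)]
adv = [1.0, -1.0]
objective = ppo_clip_objective(new, old, adv)
print(objective)  # approximately 0.2
```

In both example actions the policy has already moved further than the clip range allows, so the clipped term is the one that counts and pushing the probabilities further earns no extra reward. A full implementation would typically compute this on batches of tensors with automatic differentiation and combine it with a value-function loss and an entropy bonus.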