Learning how to write a blog about learning: 2014

Friday, May 23, 2014

Making yet another Flappy bird clone with Pygame

There are literally thousands of Flappy bird clones everywhere in the appstores and on the web. That's why I thought it would be a great idea to make my own clone. I was learning how to use the python game library Pygame at the time, so I decided to make this game in Pygame to learn the library better.

The code is hosted at github, at https://github.com/liu916/Flappy

-------------------------------------------------------------

The first step to making the clone was getting the pictures and sprites for the bird, pipes, etc. Since the sprites aren't released in creative commons, I'll give you an accurate representation of the sprites drawn by Mr. Lamonae.

(Credit: Lemonae)

After loading the images into python, the next step was to implement the "bird" pygame.sprite.Sprite class. There really wasn't much to the class besides loading the bird image and implementing the movement of the bird.

In all the flappy bird clones, the bird you play as is always moves constantly rightward and accelerates downward if the jump button isn't pressed.

The bird may look like its traveling rightward, but the game only implements the pipes moving leftward to give the illusion of the bird moving right. So, the bird only needed to be changed in its vertical position.

Implementing its vertical position over time was easy enough:
y_velocity = y_velocity + y_acceleration
y_position = y_position + y_velocity
where the y_acceleration is some negative falling value.

When the user presses the jump button, the bird's y_velocity is to be changed to the bird's jump speed regardless of its value. That wouldn't work in real life, but it makes the game easier to play.

Next up was implementing the pipes. Making sure there was enough room between the pipes vertically and horizontally was tricky. The vertical height had to be just be just above the bird's total jump height so it wasn't impossible but wasn't too easy either.

The pipes also had to be generated at random heights, also adjusted so the bird is not flying too close to the ground.

Then, I was faced with an interesting problem: how should I make an infinite number of pipes coming at the bird (as long as the bird doesn't crash) ? To solve that problem, I made three pipes that acted as all pipes. Once one pipe went off the screen, I moved it back behind the other two pipes and adjusted its height.

Tying all the pieces together, the game needed to detect when the bird touched anything, which
For all sprite in sprites:
if bird.rect.coliderect(sprite):
game_over
worked pretty well.

Last of all, I made the score integer increment when the bird passed a pipe.

-------------------------------------------------------------------------------------

The Pygame Flappy bird clone is still vastly unfinished. The score prints to the terminal, not screen, the game_over() function does nothing but exit Pygame, and there is no background to the game.

Even though its only a working prototype, I learned a lot about the Pygame library. In programming, the only way to truly learn about something is to do it yourself.

Questions, comments? Post them below.

Wednesday, April 23, 2014

Test of the GTO brain teaser #1 solution

http://farm4.staticflickr.com/3010/2325449341_b05e02613e_b.jpg

(Source: Ben Pollard)

We all know the classic game of rock paper scissors. The game is perfectly even for both players and is pretty much the same as flipping a coin to see which side faces up.

To twist that a bit, consider facing an opponent who has a disadvantage. The opponent must flip a coin without showing it to you. If it is heads, he must play rock. (The opponent must play rock 50 % of the time). Otherwise, he is free to play whatever he wants.

If you win, you gain $100. If you lose, you give $100 to the opponent.

The question is, what strategy do you use to get the most money, given the opponent will adapt to his optimal strategy for each strategy you take?

The brain teaser is given in GTORangeBuilder's blog.

The solution is less straightforward you might think. The first strategy that might come to mind is using paper all the time. After all, the opponent has to use rock half the time, so you have to gain money that way, right?

Unfortunately, not quite. The opponent will start to start scissor all the time to counter your paper. So the end result is just 50/50.

The solution turns out to be very interesting. I learned quite a lot about the Nash equilibrium and indifference criteria, which are used to solve the problem.

It turns out the best strategy is to use paper 2/3 of the time, rock 1/3 of the time, and never play scissors. The opponent, countering this, will play paper 1/3 of the time when he has the choice, scissor 2/3 of the time when he has the choice, and never play rock when he has the choice.

So, how much money will you expect to make per game, or the EV (expected value) of the the game?

It turns out if you use this strategy, you will make $16.66 per game, pretty decent wage.

I decided to test this myself. I wrote a small javascript application to simulated the strategy. You can try it yourself on the right of my blog page. A 1000 trials of repeated trials yields very close to $16.66 dollars ($13-20 average).

You may find that repeatedly using paper will give you about a $30 average gain. That's because the the my written cpu cannot adopt to any strategies you use. You can toggle the cpu's strategy with the 3 fields I provided. You can also toggle your own strategy if you want to automate the playing.

You can see the source code here:
https://gist.github.com/liu916/11239336

Questions, comments? Post them below.

Sunday, March 30, 2014

Gradient Descent

Earlier, I had a post on predicting Iris Flower types from their flower sizes with the gradient descent algorithm.

The algorithm is actually quite beautiful in its simplicity. I'll try to explain the what the algorithm does and why it works in this post. Hopefully, this will give you a better picture of how machines can "learn" from data.

Note: I'll be splitting my explanation into two parts. This is part 1. Part 2 will be coming out soon.

--------------------------------------

In its most basic implementation, the gradient descent tries to fit a line to a set of points.

So, given a set of an m number points on the xy coordinate system below,

x | y
1 | -1
3 | 1
...
...
x[m] | y[m]


My excellent plotting skills

The algorithm would want to find a line like the green one I drew.

Because our data set is linear (follows a sort of straight line), we can represent the line as a linear function in the form f(x) = mx + b.

Following the naming convention, the line is called: h(x) = θ[0] + θ[1]x , but essentially their the same.

So, ready? The algorithm is as follows:

Repeat until convergence {
For θ[j] in every θ {

update θ[j] to be :

}
} where a is the learning rate
and m is the number of training points

For our example, the algorithm would essentially be this:

Repeat until the values of θ don't change anymore {
   for θ[j] in the θ we have (θ[0] and θ[1]) {
          update θ[j] to be :
                 itself + the learning rate * the summation with all points i of ( (the actual value - the predicted value) * the x of that point)
     }
}

------------------------------------

You may be wondering where these formulas come from and why they work. To understand that, you'll first have to know what it means to have a "better fitting" line.

First off the cost function is a function that measures a "bad" a line fits a data set. If the cost function is high, the line doesn't fit the data set very well. The goal of gradient descent is to minimize the cost function.

The cost function used here is:

where m is the number of training points.

If you look closely, you'll find that the cost function is actually a total of the distances between the predicted line and the actual points.

------------------------------

This is the part 1. Any questions or comments? Post them below.

Sunday, March 16, 2014

An Introduction to command line interfaces

If you've ever seen a hacker on TV or something, chances were that they were working on command line or shell of some sorts. Well, the huge black screen with white bitmap font isn't just for hackers, the interface is commonly used by developers and users just wanting to get a little more power over their computer.

zshell (Source: Wikimedia commons)

Despite its intimidating interface, using command line is actually quite intuitive. It was quite easy for me to learn. In essence, the user first inputs a command (hence the name "command line"), and then the computer executes the command and shows the output.

Not all command lines are the same, however. For my examples, I'll be using the classic "bash shell". So, on my bash shell, the command "echo hello world"

~/ $ echo hello world!

outputs

hello world!

on the computer. Although there are many command lines, their commands share pretty much the same properties. In their basic format, all commands in the command line are made of two parts:

1. The executable/program
2. The arguments/options/inputs

In my example above, I called the executable "echo" and used the string "hello world!" as the argument. The echo command just types back what you put in as the argument.

In addition to executing commands, the command line navigates around the filesystem of the computer. You can think of the command line as something like a person in a house, where the house is the computer itself. You can do stuff in your house to manipulate or interact with the house. Similarly, you can move to a different room in the house. Anytime you're on the command line, you're are somewhere in the filesystem of the computer.

The command "pwd" shows where I am in the command line.

~/ $ pwd

outputs for me

/home/Documents/

So, if I wanted to see whats in the current "room", or file "path", I would use the command "ls"

~/ $ ls

could show

foo.txt helloreader.cpp anyrandomfile anyrandomfolder

folderofhomework

You're now probably wondering "How do I change my location?". Well, dear reader, you use the command "cd". The argument for "cd" must be a folder of some sort.

~/ $ cd folderofhomework

The command does not output anything, but if you use "pwd" again, you can see that you've moved locations.

~/ $ pwd
/home/Documents/folderofhomework

Those are the basics of any command line interface. Depending on what shell or prompt you have available, the commands may be different.

One of the equivalents of the commands I used above for the "command prompt" on windows machines is "dir" to "ls". The command "cd" is the same. However, on the command prompt, there is no direct equivalent for "pwd". You have to use the command "echo %cd%"

Any questions, comments? Post them below.

Commands for unix derived command lines (bash, mac terminal, cshell, etc.)
Commands for the command prompt on windows
How to get to the Command prompt on Windows
How to get to the "terminal" app on the Mac

Friday, February 28, 2014

Vibrato on the violin

Vibrato is a musical technique which involves oscillating a pitch up and down quickly. When you hear the opera singers singing, they use lots and lots of vibrato. Vibrato enhances the tone of the music and can increase the musicality of some pieces by quite a lot. Vibrato on instruments is merely an imitation of vibrato in singing, which is produced by tremors in the larynx or diaphragm.

On classical string instruments, vibrato can be produced by rolling the ball of the finger on the left hand back and forth on the fingerboard toward the bridge and back toward the scroll while continuously playing the note with the bow. In another sense, the finger rolls back and forth on the string, varying the length of the string, varying the pitch of the sound.
Violin Vibrato

(Source: Wikimedia Commons)

Describing vibrato in words is quite a handful. The movement required to perform vibrato is just as unnatural as it sounds, which makes learning vibrato a difficult task. The repeated "vibrating" motion of the hand/arm is an exceedingly unnatural motion which requires constant practice and repetition.

(Source: Flickr, Protographer23)

Vibrato on the violin (or viola) presents an even more challenging task than on the cello or the bass. On the cello or bass, the the hand moves in an up and down motion, toward the ground and away from the ground. The task of vibrating the hand on the cello is a task of pushing the hand down and pulling the hand up by rotating the elbow of the left hand. However, on the violin, the motion of the hand is toward the player and away from the player made by rotating the elbow or the wrist. This motion can be even more unnatural than the motion on the cello.

There are generally two types of vibrato on the violin, "arm" vibrato and "wrist" vibrato. As their names indicate, "arm" vibrato is made by vibrating the arm, and "wrist" vibrato is made by vibrating the wrist. The choice between the two vibratos is much debated. In my opinion, wrist vibrato can produce a quicker vibrating sound and can sound more "virtuous" than the arm vibrato, but it is harder to learn. Arm vibrato can produce a "deeper", larger spanned vibrato, but it is more limited than wrist vibrato in the overall movement of the arm and speed of the vibrato.

Personally, I use wrist vibrato when I play the violin. Questions? Suggestions? Post your comment below.

Friday, February 14, 2014

Exploration and Exploitation: a direct tradeoff

You're playing Skyrim and you need to defeat all the computer enemies without dying. If you're playing "hard" mode, obviously, you can't just run at them and slash or shoot the enemies down like some superhero. You need to learn strategy that kills all the enemies in a calculated manner. Of course, when you first try the game, you might just die immediately, because your running path was terrible. You'll probably find some strategy to defeat the enemies easily each time eventually. Perhaps the mage charges right at you if you stand behind a pillar. You can then just cut him down when you're nearby. Using that strategy, you can kill the enemies each time you play, because the computer opponents never adapt.

What if you had a time limit or a death count on the game? You would need to learn the best strategy in the quickest time. However, the more you search and test that strategy the more you die. On the other hand, the more you search the better chance you find the best strategy which reaps the maximum reward with minimum risk. This is the basic premise of the Multi-armed bandit problem.

Slots (Source: Wikimedia Commons)

In the problem, a gambler is presented with an array of slot machines. Each slot machine yields a different reward at different chances. Not knowing which is which, the gambler much choose which slots to try and execute while maximizing profit.

Games today aren't so simple in presenting the exploration and exploitation tradeoff. Most games change scenarios often so you can't really exploit the same strategy over and over again, because the game changes levels. However, many games award the player for "exploring" more. A classic example is the super Mario brothers underground level. If the player explores the level and breaks the brick "ceiling", he can actually run on the ceiling through the level without meeting any hazards. To reward the player even more, the player is given the option to skip further levels if he runs across the ceiling.

The exploration and exploration tradeoffs can be tough decisions for game makers. If it is too easy to exploit the game, the player might get bored. There is also the question of how to reward the player for exploring and exploiting strategies. A big problem with many of the realistic games today like GTA, Call of duty, skyrim, Modern warfare, etc. is that their artificial intelligence systems always perform the same action and never adapt. It is too easy for the player to find a loophole and easily defeat the enemies.

How do you think a player should be awarded for exploring and exploiting strategies in games? How will the development of game artificial intelligence effect the tradeoff? Post your comments below.

I was inspired by this video to make this blog post. The maker discusses exploration and exploitation in the making of a skyrim ai mod.

Friday, January 31, 2014

Teaching a program to predict the Iris Flower's type

The Iris flower data set is a famous data set often used for teaching machine learning and data mining techniques. For practice, I'll try and write a program in Python that predicts the flower type from just its sepal and petal sizes.

The goal of the program is to be able to input the flower's sepal length, sepal width, petal length, and petal width, and have it output the correct flower type.

The iris data comes in the following format:
5.1,3.5,1.4,0.2,Iris-setosa
6.1,2.9,4.7,1.4,Iris-versicolor
4.7,3.2,1.3,0.2,Iris-setosa
5.8,2.7,5.1,1.9,Iris-virginica
5.0,3.6,1.4,0.2,Iris-setosa
...
where the commas are indicating
sepal length (cm), sepal width (cm), petal length (cm), petal width (cm), flower type

There are three different flowers in the data set, setosa, versicolor, and virginica. They look pretty similar, right?

Iris-setosa (Source: Wikimedia commons)

Iris-versolor (Source: Wikimedia commons)

Iris-virginica (Source: Wikimedia commons)

Even though they they look so similar, their sepal and petal sizes can differentiate between them. Here's a neat scatter plot on wikipedia of the data:

(Source: Wikimedia commons by user Indon)

----------------------------------
Algorithm Implementation

For the actual predicting, I'll be implementing logistic regression by batch gradient descent from scratch to predict Iris flower types from their sepal and petal sizes. It sounds like a handful, but the algorithm is actual quite elegant.

Given a column vector of targets y and a vector of training targets x the algorithm can be written as
1. Initialize random θ as the set of parameters
2. Until θ converges{
θ[j] := θ[j] + sum( ( y - h(x)) * x[j]) )
}
where h(x) is the sigmoid function; h(x) = 1 / ( 1 + e^(x * θ))

I don't have enough room to explain why the algorithm works, but that's a blog post for later.

Once vector θ is trained, h(x) = 1 / ( 1 + e ^ (x*θ)) will output the probability that y is the positive case.

The problem with the algorithm is that y can only be 1 or 0, positive or negative, not 3 flower types here. To solve this problem, three parameters θ are trained for each of the flowers using the one vs. all method.

In python, I used the popular numerical processing library numpy so I could do vector operations much easier.

To set up the batch gradient descent, I made some smaller functions (numpy imported as np):

For the batch gradient descent:

-------------------------------------------
Results

The original data set of the the iris flowers provides 150 examples, so I used 120 for training the classifier and the other 30 for testing.

Unfortunately, the trained classifier actually did very poorly. When I tested it, it would either get 60 - 80% correct, or 0-20%, which is peculiar. I think it had something to do with the convergence testing which I must have messed. Of course, I only used linear features, which is also a problem. After all the debugging I had to do in the program, I think I'll call it finished.

The code is hosted on github at https://github.com/liu916/IrisTrain

Learning how to write a blog about learning