I am a management consultant exploring the world of artificial intelligence.

Action Space for the OpenAI Retro Gym game Airstriker-Genesis

I’ve recently been playing around with the OpenAI Retro Gym, a simulator for old Atari, NES and other console games that lets artificial intelligence agents play them. One of the standard off-the-shelf games is the old game ‘Airstriker Genesis’.

 
Woohoo!


It’s apparently a very easy game, with just the functions ‘left’, ‘right’ and ‘fire’, which makes it a good place to start. The problem is that the so-called Action Space of this game is somewhat complicated and not well documented. I searched the web for quite a while to find out how it works. Only an obscure Japanese website had a text that finally helped me figure it out.

The Action Space of a Retro Gym game is basically the set of functions you can use out of all the buttons on the original console’s controller. Again, there are just three of them here, so that should be easy. But how do you actively use an action in the game? All the code you find online only uses sample actions, which just pick random values from the action space:

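import retro

def main():
    env = retro.make(game='Airstriker-Genesis')
    obs = env.reset()
    try:
        while True:
            # Pick a random action and advance the game by one step
            obs, rew, done, info = env.step(env.action_space.sample())
            env.render()
            if done:
                obs = env.reset()
    finally:
        # Runs when the loop is interrupted (e.g. Ctrl+C), so the emulator is shut down cleanly
        env.close()


if __name__ == "__main__":
    main()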

This is really all I could find. If you print out the sample actions like this

num_steps = 4  # number of random actions to print
for step in range(num_steps):
    act = env.action_space.sample()
    print(act)

you get this:

[0 0 0 1 0 1 0 1 1 0 0 0]
[0 0 1 1 1 0 0 1 0 1 0 0]
[1 0 0 0 0 0 0 0 0 1 0 0]
[1 1 0 0 0 0 1 0 1 0 1 1]

If you look into the type of the Action Space, you get a clue what this might be:

env = retro.make(game='Airstriker-Genesis')
print(env.action_space)

This gives out the following:

MultiBinary(12)

… which tells you that each action is composed of 12 bits that can each be switched on or off (hence the arrays of zeros and ones you see as output of the sample action). But how does this match the 3 possible actions (‘left’, ‘right’, ‘fire’)? This is where the Japanese website finally came in. If I understand it correctly, the old SEGA controller has 12 buttons ('B', 'A', 'MODE', 'START', 'UP', 'DOWN', 'LEFT', 'RIGHT', 'C', 'Y', 'X', 'Z'), hence the 12 bits. Only 3 of those are used as actions in this game. The Japanese page tells you how to construct a discrete action space out of this. If you only want to know how to operate the spaceship, here’s that code for you:

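import numpy as np

# Order of the 12 bits in the action array
buttons = ['B', 'A', 'MODE', 'START', 'UP', 'DOWN', 'LEFT', 'RIGHT', 'C', 'Y', 'X', 'Z']
# The three actions we care about, each given as a list of buttons to press
actions = [['LEFT'], ['RIGHT'], ['B']]
actions_ag = []
for action in actions:
    # Start with all 12 buttons released ...
    arr = np.array([False] * 12)
    for button in action:
        # ... and switch on the bit at the position of each pressed button
        arr[buttons.index(button)] = True
    actions_ag.append(arr)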

You can then access an action like this:

for step in range(num_steps):
    act = actions_ag[0]  # or 1 or 2
    print(act)  # this should give you e.g. [False False ... True ... False]
    obs, rew, done, info = env.step(act)
    env.render()
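
If you want to double-check which buttons a given array stands for, you can also go the other way and translate an array back into button names. Here is a minimal sketch in plain Python. The helper name `pressed_buttons` is made up for this example and is not part of retro, and it assumes the button order listed above:

```python
# Button order of the 12-entry action array, as listed above.
BUTTONS = ['B', 'A', 'MODE', 'START', 'UP', 'DOWN', 'LEFT', 'RIGHT', 'C', 'Y', 'X', 'Z']

def pressed_buttons(action):
    """Return the names of all buttons that are switched on in a 12-entry action.

    Hypothetical helper for illustration; works on lists of 0/1 or bools.
    """
    if len(action) != len(BUTTONS):
        raise ValueError("expected %d entries, got %d" % (len(BUTTONS), len(action)))
    return [name for name, bit in zip(BUTTONS, action) if bit]

# First random sample from the printout earlier in this post
sample = [0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0]
print(pressed_buttons(sample))  # ['START', 'DOWN', 'RIGHT', 'C']
```

So that random sample pressed START, DOWN, RIGHT and C all at once, which explains why the random agent looks so erratic.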

That means the available action space for Airstriker-Genesis in the OpenAI Retro Gym version is this:

[False False False False False False  True False False False False False] Left
[False False False False False False False  True False False False False] Right
[ True False False False False False False False False False False False] Fire ('B')
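
Since each action is just a mask over the 12 buttons, it should also be possible to switch on more than one entry, for example to move left while firing. I am assuming here that the game accepts simultaneous button presses; the random samples above, which usually have several entries set, suggest that it does. A small sketch in plain Python, where `make_action` is a made-up helper name and not part of retro:

```python
# Button order of the 12-entry action array, as listed above.
BUTTONS = ['B', 'A', 'MODE', 'START', 'UP', 'DOWN', 'LEFT', 'RIGHT', 'C', 'Y', 'X', 'Z']

def make_action(*names):
    """Build a 12-entry list of bools with the named buttons switched on.

    Hypothetical helper for illustration; unknown button names raise ValueError.
    """
    action = [False] * len(BUTTONS)
    for name in names:
        action[BUTTONS.index(name)] = True
    return action

left_and_fire = make_action('LEFT', 'B')    # assumed combination: move left while firing
right_and_fire = make_action('RIGHT', 'B')  # assumed combination: move right while firing
do_nothing = make_action()                  # all buttons released
print(left_and_fire)
```

To use one of these in the game, I would convert it with np.array(left_and_fire) first, so that it has the same form as the arrays in actions_ag, and then pass it to env.step.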

And that’s it! Took me some time to figure it out - thank you, anonymous Japanese website owner! :)
