Bipedal Walker#

This environment is part of the Box2D environments. Please read that page first for general information.

Action Space: Box(-1.0, 1.0, (4,), float32)

Observation Shape: (24,)

Observation High: [3.14 5. 5. 5. 3.14 5. 3.14 5. 5. 3.14 5. 3.14 5. 5. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. ]

Observation Low: [-3.14 -5. -5. -5. -3.14 -5. -3.14 -5. -0. -3.14 -5. -3.14 -5. -0. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. ]

Import: gym.make("BipedalWalker-v3")

Description#

This is a simple 4-joint walker robot environment. There are two versions:

  • Normal, with slightly uneven terrain.

  • Hardcore, with ladders, stumps, pitfalls.

To solve the normal version, you need to get 300 points in 1600 time steps. To solve the hardcore version, you need 300 points in 2000 time steps.
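The solve criteria above can be written as a small check. This is only a sketch: the function name `is_solved` is hypothetical and not part of Gym, and it assumes "300 points" means a total reward of at least 300 reached within the step limit.

```python
def is_solved(total_reward: float, steps: int, hardcore: bool = False) -> bool:
    """Return True if an episode meets the solve criteria described above.

    Assumption: the threshold is inclusive (total_reward >= 300) and the
    step limit is 1600 for the normal version, 2000 for hardcore.
    """
    step_limit = 2000 if hardcore else 1600
    return total_reward >= 300 and steps <= step_limit
```

For example, `is_solved(305.0, 1500)` holds for the normal version, while an episode that needs 1800 steps only counts in hardcore mode.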

A heuristic is provided for testing. It is also useful for generating demonstrations to learn from. To run the heuristic:

python gym/envs/box2d/bipedal_walker.py

Action Space#

Actions are motor speed values in the [-1, 1] range for each of the 4 joints at both hips and knees.
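A policy may produce values outside the [-1, 1] range, so it can help to clip them before passing them to the environment. The helper below is a minimal sketch; `clip_action` is a hypothetical name, not a Gym function.

```python
from typing import List, Sequence


def clip_action(action: Sequence[float]) -> List[float]:
    """Clip a 4-element action to the [-1, 1] range used by the action space."""
    values = [float(a) for a in action]
    if len(values) != 4:
        raise ValueError(f"expected 4 joint values, got {len(values)}")
    # Clamp each motor speed value independently.
    return [max(-1.0, min(1.0, v)) for v in values]
```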

Observation Space#

The state consists of the hull angle, hull angular velocity, horizontal speed, vertical speed, joint positions and joint angular speeds, leg contact with the ground, and 10 lidar rangefinder measurements. There are no coordinates in the state vector.
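To make the 24-element vector easier to work with, it can be split into named groups. The index layout below is an assumption inferred from the observation bounds and the order of the description (4 hull values, 5 values per leg ending in a ground-contact flag, then 10 lidar readings); `split_observation` is a hypothetical helper, not part of Gym.

```python
from typing import Dict, List, Sequence


def split_observation(obs: Sequence[float]) -> Dict[str, List[float]]:
    """Split a 24-element observation into named groups (assumed layout)."""
    if len(obs) != 24:
        raise ValueError(f"expected 24 values, got {len(obs)}")
    values = [float(v) for v in obs]
    return {
        # Hull angle, angular velocity, horizontal speed, vertical speed.
        "hull": values[0:4],
        # Assumed per leg: hip angle, hip speed, knee angle, knee speed, contact.
        "leg_1": values[4:9],
        "leg_2": values[9:14],
        # 10 lidar rangefinder measurements.
        "lidar": values[14:24],
    }
```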

Rewards#

Reward is given for moving forward, totaling 300+ points up to the far end. If the robot falls, it gets -100. Applying motor torque costs a small number of points, so an agent that moves more efficiently gets a better score.

Starting State#

The walker starts standing at the left end of the terrain with the hull horizontal, and both legs in the same position with a slight knee angle.

Episode Termination#

The episode terminates if the hull comes into contact with the ground or if the walker moves past the right end of the terrain.

Arguments#

To use the hardcore environment, specify the hardcore=True argument, as shown below:

import gym
env = gym.make("BipedalWalker-v3", hardcore=True)
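Once an environment is created, an episode can be run with a loop like the one sketched here. `run_episode` is a hypothetical helper, not part of Gym. It accepts any object with `reset()` and `step()` methods and tolerates both return shapes that `step()` may have depending on the Gym version (a 4-tuple with `done`, or a 5-tuple with `terminated` and `truncated`).

```python
def run_episode(env, policy, max_steps=1600):
    """Run one episode and return (total_reward, steps_taken).

    `policy` is any callable mapping an observation to a 4-element action.
    """
    reset_out = env.reset()
    # Newer Gym versions return (observation, info) from reset().
    obs = reset_out[0] if isinstance(reset_out, tuple) else reset_out
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        result = env.step(policy(obs))
        if len(result) == 5:
            obs, reward, terminated, truncated, _info = result
            done = terminated or truncated
        else:
            obs, reward, done, _info = result
        total_reward += float(reward)
        steps += 1
        if done:
            break
    return total_reward, steps
```

With the environment created above, a random policy could be run as `run_episode(env, lambda obs: env.action_space.sample(), max_steps=2000)`.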

Version History#

  • v3: Returns closest lidar trace instead of furthest; faster video recording

  • v2: Count energy spent

  • v1: Legs now report contact with ground; motors have higher torque and speed; ground has higher friction; lidar rendered less nervously.

  • v0: Initial version

Credits#

Created by Oleg Klimov