Vector#

gym.vector.make(id: str, num_envs: int = 1, asynchronous: bool = True, wrappers: callable | List[callable] | None = None, disable_env_checker: bool | None = None, **kwargs) VectorEnv#

Create a vectorized environment consisting of multiple copies of an environment, given its id.

Example:

>>> import gym
>>> env = gym.vector.make('CartPole-v1', num_envs=3)
>>> env.reset()
(array([[-0.04456399,  0.04653909,  0.01326909, -0.02099827],
       [ 0.03073904,  0.00145001, -0.03088818, -0.03131252],
       [ 0.03468829,  0.01500225,  0.01230312,  0.01825218]],
      dtype=float32), {})
Parameters:
  • id – The environment ID. This must be a valid ID from the registry.

  • num_envs – Number of copies of the environment.

  • asynchronous – If True, wraps the environments in an AsyncVectorEnv (which uses multiprocessing to run the environments in parallel). If False, wraps the environments in a SyncVectorEnv.

  • wrappers – If not None, then apply the wrappers to each internal environment during creation (see the example below).

  • disable_env_checker – Whether to run the env checker on the first environment only. If None, defaults to the environment spec's disable_env_checker parameter (False by default); otherwise the checker runs according to this argument (True = do not run, False = run).

  • **kwargs – Keyword arguments passed to gym.make when creating each environment copy.

Returns:

The vectorized environment.
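
For example, per-copy wrappers can be supplied at creation time. A minimal sketch, assuming the standard CartPole-v1 registration and the gym.wrappers.TimeLimit wrapper:

>>> import gym
>>> from gym.wrappers import TimeLimit
>>> envs = gym.vector.make(
...     "CartPole-v1",
...     num_envs=3,
...     asynchronous=False,  # use a SyncVectorEnv, avoiding subprocesses
...     wrappers=lambda env: TimeLimit(env, max_episode_steps=100),  # wraps every copy
... )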

VectorEnv#

gym.vector.VectorEnv.action_space#

The (batched) action space. The input actions of step must be valid elements of action_space:

>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> envs.action_space
MultiDiscrete([2 2 2])
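
Because a batched space is itself a Space, sampling from it yields one valid action per sub-environment (sampled values will vary):

>>> envs.action_space.sample()
array([0, 1, 0])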
gym.vector.VectorEnv.observation_space#

The (batched) observation space. The observations returned by reset and step are valid elements of observation_space:

>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> envs.observation_space
Box([[-4.8 ...]], [[4.8 ...]], (3, 4), float32)
gym.vector.VectorEnv.single_action_space#

The action space of an environment copy:

>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> envs.single_action_space
Discrete(2)
gym.vector.VectorEnv.single_observation_space#

The observation space of an environment copy:

>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> envs.single_observation_space
Box([-4.8 ...], [4.8 ...], (4,), float32)
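
The batched spaces stack num_envs copies of the single spaces, so the batched observation shape is (num_envs,) plus the single shape ((3, 4) versus (4,) above). A short sketch of how the single spaces are typically used to size a model:

>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> obs_dim = envs.single_observation_space.shape[0]  # per-copy feature size, here 4
>>> n_actions = envs.single_action_space.n            # per-copy action count, here 2
>>> (obs_dim, n_actions)
(4, 2)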

Reset#

VectorEnv.reset(*, seed: int | List[int] | None = None, options: dict | None = None)#

Reset all parallel environments and return a batch of initial observations.

Parameters:
  • seed – The reset seed(s) for the environments; an int is expanded to one seed per copy, while a list supplies each environment's seed directly.

  • options – The reset options passed on to each environment's reset.

Returns:

A batch of observations from the vectorized environment.

>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> envs.reset()
(array([[-0.02240574, -0.03439831, -0.03904812,  0.02810693],
       [ 0.01586068,  0.01929009,  0.02394426,  0.04016077],
       [-0.01314174,  0.03893502, -0.02400815,  0.0038326 ]],
      dtype=float32), {})
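
reset also accepts a seed to make the batch of initial states reproducible; an int seed is expanded to one seed per environment copy. The observation values vary by seed, but the batched shape is deterministic:

>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> observations, infos = envs.reset(seed=42)
>>> observations.shape
(3, 4)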

Step#

VectorEnv.step(actions)#

Take an action for each parallel environment.

Parameters:

actions – Batch of actions; must be a valid element of action_space.

Returns:

Batch of (observations, rewards, terminations, truncations, infos), or (observations, rewards, dones, infos) under the old step API.

>>> import numpy as np
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> observations, infos = envs.reset()
>>> actions = np.array([1, 0, 1])
>>> observations, rewards, terminations, truncations, infos = envs.step(actions)

>>> observations
array([[ 0.00122802,  0.16228443,  0.02521779, -0.23700266],
       [ 0.00788269, -0.17490888,  0.03393489,  0.31735462],
       [ 0.04918966,  0.19421194,  0.02938497, -0.29495203]],
      dtype=float32)
>>> rewards
array([1., 1., 1.])
>>> terminations
array([False, False, False])
>>> truncations
array([False, False, False])
>>> infos
{}
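
Putting the pieces together, a minimal rollout sketch with a random policy. Sub-environments reset automatically when their episodes end, so reset is only called once up front (accumulated returns will vary with the sampled actions):

>>> import gym
>>> import numpy as np
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> observations, infos = envs.reset(seed=42)
>>> episode_returns = np.zeros(3)
>>> for _ in range(200):
...     actions = envs.action_space.sample()  # one random action per copy
...     observations, rewards, terminations, truncations, infos = envs.step(actions)
...     episode_returns += rewards
>>> envs.close()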