Spaces

Space

  • class rl_coach.spaces.Space(shape: Union[int, tuple, list, numpy.ndarray], low: Union[None, int, float, numpy.ndarray] = -inf, high: Union[None, int, float, numpy.ndarray] = inf)
  • A space defines a set of valid values.

    • Parameters
      • shape – the shape of the space

      • low – the lowest values possible in the space. Can be an array defining the lowest value per point, or a single value defining the lowest value for all points

      • high – the highest values possible in the space. Can be an array defining the highest value per point, or a single value defining the highest value for all points

    • contains(val: Union[int, float, numpy.ndarray]) → bool

    • Checks if value is contained by this space. The shape must match and all of the values must be within the low and high bounds.

      • Parameters
      • val – a value to check

      • Returns

      • True if val matches the space definition, False otherwise
    • is_valid_index(index: numpy.ndarray) → bool

    • Checks if a given multidimensional index is within the bounds of the shape of the space

      • Parameters
      • index – a multidimensional index

      • Returns

      • True if the index is within the shape of the space. False otherwise
    • sample() → numpy.ndarray

    • Samples the defined space: uniformly if the space bounds are defined, or from a normal distribution if no bounds are defined

      • Returns
      • A numpy array sampled from the space
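
The containment check, index check, and sampling behaviour described above can be illustrated without importing the library. The following is a minimal NumPy sketch of the documented semantics, not the library's implementation. The three functions are standalone helpers written for this example, and using a standard normal for the unbounded case is an assumption (the text only says "normal").

```python
import numpy as np

def contains(val, shape, low, high):
    """Shape must match and every element must lie within [low, high]."""
    val = np.asarray(val)
    if val.shape != tuple(shape):
        return False
    return bool(np.all(val >= low) and np.all(val <= high))

def is_valid_index(index, shape):
    """A multidimensional index is valid if each coordinate is inside the shape."""
    index = np.asarray(index)
    if index.shape != (len(shape),):
        return False
    return bool(np.all(index >= 0) and np.all(index < np.asarray(shape)))

def sample(shape, low, high, rng):
    """Uniform when both bounds are finite; otherwise standard normal (assumed)."""
    if np.all(np.isfinite(low)) and np.all(np.isfinite(high)):
        return rng.uniform(low, high, size=shape)
    return rng.standard_normal(size=shape)

rng = np.random.default_rng(0)
bounded = sample((3,), -1.0, 1.0, rng)          # uniform in [-1, 1)
unbounded = sample((3,), -np.inf, np.inf, rng)  # normally distributed
```

A value sampled from a bounded space always passes the containment check for that same space, which is a handy sanity test when defining custom bounds.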

Observation Spaces

  • class rl_coach.spaces.ObservationSpace(shape: Union[int, numpy.ndarray], low: Union[None, int, float, numpy.ndarray] = -inf, high: Union[None, int, float, numpy.ndarray] = inf)
    • contains(val: Union[int, float, numpy.ndarray]) → bool
    • Checks if value is contained by this space. The shape must match and all of the values must be within the low and high bounds.

      • Parameters
      • val – a value to check

      • Returns

      • True if val matches the space definition, False otherwise
    • is_valid_index(index: numpy.ndarray) → bool

    • Checks if a given multidimensional index is within the bounds of the shape of the space

      • Parameters
      • index – a multidimensional index

      • Returns

      • True if the index is within the shape of the space. False otherwise
    • sample() → numpy.ndarray

    • Samples the defined space: uniformly if the space bounds are defined, or from a normal distribution if no bounds are defined

      • Returns
      • A numpy array sampled from the space

VectorObservationSpace

  • class rl_coach.spaces.VectorObservationSpace(shape: int, low: Union[None, int, float, numpy.ndarray] = -inf, high: Union[None, int, float, numpy.ndarray] = inf, measurements_names: List[str] = None)
  • An observation space which is defined as a vector of elements. This can be particularly useful for environments which return measurements, such as in robotic environments.

PlanarMapsObservationSpace

  • class rl_coach.spaces.PlanarMapsObservationSpace(shape: numpy.ndarray, low: int, high: int, channels_axis: int = -1)
  • An observation space which defines a stack of 2D observations. For example, an environment which returns a stack of segmentation maps like in Starcraft.
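
To make the role of `channels_axis` concrete, here is a small NumPy sketch of a channels-last stack of 2D maps. It does not use the library; the array sizes and the segment id are made up, and it assumes only that `-1` refers to the last axis, as it does in NumPy.

```python
import numpy as np

# A hypothetical stack of 3 segmentation maps, each 4x5, stored channels-last.
height, width, num_maps = 4, 5, 3
stack = np.zeros((height, width, num_maps), dtype=np.int64)
stack[..., 1] = 7  # fill the second map with a made-up segment id

channels_axis = -1  # the documented default; -1 is the last axis in NumPy
# Move the channel axis to the front to iterate over the individual 2D maps.
maps = np.moveaxis(stack, channels_axis, 0)
```

After the move, `maps[i]` is the i-th 2D map regardless of where the channel axis was stored originally.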

ImageObservationSpace

  • class rl_coach.spaces.ImageObservationSpace(shape: numpy.ndarray, high: int, channels_axis: int = -1)
  • An observation space which is a special case of the PlanarMapsObservationSpace, where the stack of 2D observations represents an RGB image or a grayscale image.

Action Spaces

  • class rl_coach.spaces.ActionSpace(shape: Union[int, numpy.ndarray], low: Union[None, int, float, numpy.ndarray] = -inf, high: Union[None, int, float, numpy.ndarray] = inf, descriptions: Union[None, List, Dict] = None, default_action: Union[int, float, numpy.ndarray, List] = None)
    • clip_action_to_space(action: Union[int, float, numpy.ndarray, List]) → Union[int, float, numpy.ndarray, List]
    • Given an action, clip its values to fit to the action space ranges

      • Parameters
      • action – a given action

      • Returns

      • the clipped action
    • contains(val: Union[int, float, numpy.ndarray]) → bool

    • Checks if value is contained by this space. The shape must match and all of the values must be within the low and high bounds.

      • Parameters
      • val – a value to check

      • Returns

      • True if val matches the space definition, False otherwise
    • is_valid_index(index: numpy.ndarray) → bool

    • Checks if a given multidimensional index is within the bounds of the shape of the space

      • Parameters
      • index – a multidimensional index

      • Returns

      • True if the index is within the shape of the space. False otherwise
    • sample() → numpy.ndarray

    • Samples the defined space: uniformly if the space bounds are defined, or from a normal distribution if no bounds are defined

      • Returns
      • A numpy array sampled from the space
    • sample_with_info() → rl_coach.core_types.ActionInfo

    • Get a random action with additional “fake” info

      • Returns
      • An action info instance
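
Clipping an action to the space ranges, as described above, amounts to an element-wise clamp. The sketch below shows that idea with `numpy.clip`. The helper `clip_to_bounds` is a hypothetical standalone function for illustration, not the library's method, and the bounds are made up.

```python
import numpy as np

def clip_to_bounds(action, low, high):
    """Clip each element of the action to [low, high] (scalar or per-element bounds)."""
    return np.clip(np.asarray(action, dtype=float), low, high)

# Made-up per-element bounds for a 2-dimensional action.
low = np.array([-1.0, 0.0])
high = np.array([1.0, 5.0])

# The first element exceeds its upper bound, the second is below its lower bound.
clipped = clip_to_bounds([2.5, -3.0], low, high)
```

Actions that are already inside the bounds pass through unchanged.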

AttentionActionSpace

  • class rl_coach.spaces.AttentionActionSpace(shape: int, low: Union[None, int, float, numpy.ndarray] = -inf, high: Union[None, int, float, numpy.ndarray] = inf, descriptions: Union[None, List, Dict] = None, default_action: numpy.ndarray = None, forced_attention_size: Union[None, int, float, numpy.ndarray] = None)
  • A box selection continuous action space, meaning that the actions are defined as selecting a multidimensional box from a given range. The actions will be in the form: [[low_x, low_y, …], [high_x, high_y, …]]
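
As an illustration of the box form above, the following sketch builds a 2D action of the shape `[[low_x, low_y], [high_x, high_y]]` and uses it to crop a region from an array. The observation, the box coordinates, and the mapping of x to the row index are all arbitrary choices for this example, not library behaviour.

```python
import numpy as np

# A hypothetical 2D attention action: [[low_x, low_y], [high_x, high_y]].
action = np.array([[1, 2], [4, 6]])
low_corner, high_corner = action

# Use the box to crop a region from a made-up 8x8 observation
# (treating x as the row index here, an arbitrary choice for the sketch).
observation = np.arange(64).reshape(8, 8)
crop = observation[low_corner[0]:high_corner[0], low_corner[1]:high_corner[1]]
```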

BoxActionSpace

  • class rl_coach.spaces.BoxActionSpace(shape: Union[int, numpy.ndarray], low: Union[None, int, float, numpy.ndarray] = -inf, high: Union[None, int, float, numpy.ndarray] = inf, descriptions: Union[None, List, Dict] = None, default_action: numpy.ndarray = None)
  • A multidimensional bounded or unbounded continuous action space.

DiscreteActionSpace

  • class rl_coach.spaces.DiscreteActionSpace(num_actions: int, descriptions: Union[None, List, Dict] = None, default_action: numpy.ndarray = None)
  • A discrete action space with action indices as actions.

MultiSelectActionSpace

  • class rl_coach.spaces.MultiSelectActionSpace(size: int, max_simultaneous_selected_actions: int = 1, descriptions: Union[None, List, Dict] = None, default_action: numpy.ndarray = None, allow_no_action_to_be_selected=True)
  • A discrete action space where multiple actions can be selected at once. The actions are encoded as multi-hot vectors.
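
The multi-hot encoding mentioned above can be sketched by enumerating every vector with a limited number of ones. The function `multi_hot_actions` is a hypothetical helper. Its `max_selected` and `allow_none` arguments stand in for the `max_simultaneous_selected_actions` and `allow_no_action_to_be_selected` parameters, and whether the library enumerates actions this way is an assumption.

```python
from itertools import combinations

import numpy as np

def multi_hot_actions(size, max_selected, allow_none=True):
    """Enumerate all multi-hot vectors with at most max_selected ones."""
    start = 0 if allow_none else 1
    actions = []
    for k in range(start, max_selected + 1):
        for chosen in combinations(range(size), k):
            vec = np.zeros(size, dtype=int)
            for i in chosen:
                vec[i] = 1  # mark each selected action
            actions.append(vec)
    return actions

# 3 selectable actions, at most 2 active at once, "no action" allowed.
actions = multi_hot_actions(size=3, max_selected=2)
```

With these settings the enumeration contains the all-zeros vector, the three one-hot vectors, and the three vectors with two ones.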

CompoundActionSpace

  • class rl_coach.spaces.CompoundActionSpace(sub_spaces: List[rl_coach.spaces.ActionSpace])
  • An action space which consists of multiple sub-action spaces. For example, in Starcraft the agent should choose an action identifier from ~550 options (Discrete(550)), but it also needs to choose 13 different arguments for the selected action identifier, where each argument is by itself an action space. In Starcraft, the arguments are Discrete action spaces as well, but this is not mandatory.
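
One way to picture a compound action is as one sub-action per sub-space. The sketch below models only discrete sub-spaces by their sizes. The sizes other than 550, the list representation, and the `is_valid` helper are assumptions made for illustration and do not reflect how the library stores compound actions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sub-spaces: an action identifier plus two arguments.
# Each entry is the number of discrete choices in that sub-space.
sub_space_sizes = [550, 4, 64]

# A compound action is modeled here as one sub-action per sub-space.
compound_action = [int(rng.integers(n)) for n in sub_space_sizes]

def is_valid(action, sizes):
    """Each sub-action must be a valid index into its own sub-space."""
    return len(action) == len(sizes) and all(0 <= a < n for a, n in zip(action, sizes))
```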

Goal Spaces

  • class rl_coach.spaces.GoalsSpace(goal_name: str, reward_type: rl_coach.spaces.GoalToRewardConversion, distance_metric: Union[rl_coach.spaces.GoalsSpace.DistanceMetric, Callable])
  • A multidimensional space with a goal type definition. It also behaves as an action space, so that hierarchical agents can use it as an output action space. The class acts as a wrapper to the target space, so after setting the target space, all the values of the class will match the values of the target space (the shape, low, high, etc.).

    • Parameters
      • goal_name – the name of the observation space to use as the achieved goal.

      • reward_type – the reward type to use for converting distances from goal to rewards

      • distance_metric – the distance metric to use. Can be either one of the distances in the DistanceMetric enum, or a custom function that takes two vectors as input and returns the distance between them

    • class DistanceMetric

    • An enumeration.

    • clip_action_to_space(action: Union[int, float, numpy.ndarray, List]) → Union[int, float, numpy.ndarray, List]

    • Given an action, clip its values to fit to the action space ranges

      • Parameters
      • action – a given action

      • Returns

      • the clipped action
    • contains(val: Union[int, float, numpy.ndarray]) → bool

    • Checks if value is contained by this space. The shape must match and all of the values must be within the low and high bounds.

      • Parameters
      • val – a value to check

      • Returns

      • True if val matches the space definition, False otherwise
    • distance_from_goal(goal: numpy.ndarray, state: dict) → float

    • Given a state, check its distance from the goal

      • Parameters
        • goal – a numpy array representing the goal

        • state – a dict representing the state

      • Returns

      • the distance from the goal
    • get_reward_for_goal_and_state(goal: numpy.ndarray, state: dict) → Tuple[float, bool]

    • Given a state, check if the goal was reached and return a reward accordingly

      • Parameters
        • goal – a numpy array representing the goal

        • state – a dict representing the state

      • Returns

      • the reward for the current goal and state pair, and a boolean representing whether the goal was reached
    • goal_from_state(state: Dict)

    • Given a state, extract an observation according to the goal_name

      • Parameters
      • state – a dictionary of observations

      • Returns

      • the observation corresponding to the goal_name
    • is_valid_index(index: numpy.ndarray) → bool

    • Checks if a given multidimensional index is within the bounds of the shape of the space

      • Parameters
      • index – a multidimensional index

      • Returns

      • True if the index is within the shape of the space. False otherwise
    • sample() → numpy.ndarray

    • Samples the defined space: uniformly if the space bounds are defined, or from a normal distribution if no bounds are defined

      • Returns
      • A numpy array sampled from the space
    • sample_with_info() → rl_coach.core_types.ActionInfo

    • Get a random action with additional “fake” info

      • Returns
      • An action info instance
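
The goal-related methods above combine three steps: extract the achieved goal from the state by name, measure its distance from the target goal, and convert that distance into a reward and a "reached" flag. The sketch below walks through those steps with a custom distance function of the kind the distance metric parameter accepts (two vectors in, a distance out). The threshold-based reward rule, the reward values, and the state key are hypothetical and stand in for whatever the configured reward type actually does.

```python
import numpy as np

def euclidean_distance(goal, achieved):
    """A custom distance metric: takes two vectors, returns a float."""
    return float(np.linalg.norm(np.asarray(goal) - np.asarray(achieved)))

def reward_for_goal(goal, state, goal_name, threshold=0.1):
    """Hypothetical conversion: 0 reward when within threshold, -1 otherwise."""
    achieved = state[goal_name]  # extract the observation named goal_name
    distance = euclidean_distance(goal, achieved)
    reached = distance <= threshold
    return (0.0 if reached else -1.0), reached

# A made-up state dict holding the achieved goal under a made-up key.
state = {"achieved_goal": np.array([1.0, 2.0])}
near = reward_for_goal(np.array([1.0, 2.05]), state, "achieved_goal")
far = reward_for_goal(np.array([4.0, 6.0]), state, "achieved_goal")
```

Here `near` is a reached goal with zero reward, while `far` is five units away and receives the penalty.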