motornet.environment
- class motornet.environment.Environment(effector, q_init=None, name: str = 'Env', differentiable: bool = True, max_ep_duration: float = 1.0, action_noise: float = 0.0, obs_noise: float | list = 0.0, action_frame_stacking: int = 0, proprioception_delay: float = None, vision_delay: float = None, proprioception_noise: float = 0.0, vision_noise: float = 0.0, **kwargs)
Bases: Env, Module
Base class for environments.
- Parameters:
effector – motornet.effector.Effector object class or subclass. This is the effector that will evolve in the environment.
q_init – Tensor or numpy.ndarray, the desired initial joint states for the environment, if a single set of pre-defined initial joint states is desired. If None, random initial joint states are drawn at each call of reset(). This parameter will be ignored on reset() calls where a joint_state is provided as an option.
name – String, the name of the environment object instance.
differentiable – Boolean, whether the environment will be differentiable or not. Setting this to False is usually useful for reinforcement learning, where differentiability is not needed.
max_ep_duration – Float, the maximum duration of an episode, in seconds.
action_noise – Float, the standard deviation of the Gaussian noise added to the action input at each step of the simulation.
obs_noise – Float or list, the standard deviation of the Gaussian noise added to the observation vector at each step of the simulation. If this is a list, it should have as many elements as the observation vector and will indicate the standard deviation of each observation element independently.
action_frame_stacking – Integer, the number of past action steps to add to the observation vector.
proprioception_delay – Float, the delay in seconds for the proprioceptive feedback to be added to the observation vector. If None, no delay will occur.
vision_delay – Float, the delay in seconds for the visual feedback to be added to the observation vector. If None, no delay will occur.
proprioception_noise – Float, the standard deviation of the Gaussian noise added to the proprioceptive feedback at each step of the simulation.
vision_noise – Float, the standard deviation of the Gaussian noise added to the visual feedback at each step of the simulation.
**kwargs – This is passed as-is to the torch.nn.Module parent class.
- apply_noise(loc: Tensor, noise: float | list) → Tensor
Applies element-wise Gaussian noise to the input loc.
- Parameters:
loc – Tensor, the input on which the Gaussian noise is applied, which in probabilistic terms makes it the mean of the Gaussian distribution.
noise – Float or list, the standard deviation (spread or “width”) of the distribution. Must be non-negative. If this is a list, it must contain as many elements as the second axis of loc, and the Gaussian distribution for each column of loc will have a different standard deviation. Note that the elements within each column of loc will still be independent and identically distributed (i.i.d.).
- Returns:
A noisy version of loc as a tensor.
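The scalar-versus-list behaviour of noise can be illustrated with a minimal pure-Python sketch. This is not motornet's implementation (which operates on tensors); the function name apply_noise_sketch and the nested-list representation of loc are assumptions made for illustration only.

```python
import random


def apply_noise_sketch(loc, noise, rng=None):
    """Add Gaussian noise to a (batch_size, n_features) nested list.

    `noise` is either one standard deviation shared by every column,
    or a list holding one standard deviation per column.
    """
    if rng is None:
        rng = random.Random()
    n_cols = len(loc[0])
    # Broadcast a scalar standard deviation to one value per column.
    stds = list(noise) if isinstance(noise, (list, tuple)) else [noise] * n_cols
    if len(stds) != n_cols:
        raise ValueError("noise list must match the second axis of loc")
    if any(s < 0 for s in stds):
        raise ValueError("standard deviations must be non-negative")
    # Each element is drawn independently, centred on its own value in loc.
    return [[rng.gauss(x, s) for x, s in zip(row, stds)] for row in loc]


loc = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
# Only the middle column is perturbed; a zero std leaves a column unchanged.
noisy = apply_noise_sketch(loc, [0.0, 0.1, 0.0], rng=random.Random(0))
```

Columns given a standard deviation of zero come back unchanged, which makes it easy to check which features are actually being perturbed.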
- detach(x: Any) → Any
Converts a tensor to a numpy.ndarray on the CPU, or returns x unchanged if it is not a tensor.
- Parameters:
x – The value to detach.
- Returns:
A numpy.ndarray if x is a tensor, otherwise x unchanged.
- get_attributes() → tuple[list[str], list]
Gets all non-callable attributes declared in the object instance, excluding gym.spaces.Space attributes, the effector, muscle, and skeleton attributes.
- Returns:
A list of attribute names as string elements.
A list of attribute values.
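The filtering described here can be pictured with a short sketch. It is only an approximation under stated assumptions: get_attributes_sketch, the Dummy class, and the name-based exclusion list are hypothetical, and the real method also excludes gym.spaces.Space attributes, which this stand-in does not model.

```python
def get_attributes_sketch(obj, excluded=("effector", "muscle", "skeleton")):
    """Return (names, values) for the non-callable instance attributes of obj."""
    names, values = [], []
    for name, value in vars(obj).items():
        # Skip methods/callbacks and the attributes excluded by name.
        if callable(value) or name in excluded:
            continue
        names.append(name)
        values.append(value)
    return names, values


class Dummy:
    """Hypothetical object carrying a mix of attribute kinds."""

    def __init__(self):
        self.name = "Env"
        self.max_ep_duration = 1.0
        self.effector = object()   # excluded by name
        self.callback = print      # excluded because it is callable


names, values = get_attributes_sketch(Dummy())
```

Zipping the two returned lists into a dictionary gives the kind of mapping that a configuration-saving method could return.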
- get_obs(action: Tensor | None = None, deterministic: bool = False) → Tensor | ndarray
Returns a (batch_size, n_features) tensor containing the (potentially time-delayed) observations. By default, this is the task goal, followed by the output of the get_proprioception() method, the output of the get_vision() method, and finally the last action_frame_stacking action sets, if a non-zero action_frame_stacking keyword argument was passed at initialization of this class instance. Independent and identically distributed (i.i.d.) Gaussian noise is added to each element in the tensor, using the obs_noise attribute.
- Parameters:
action – Tensor or None, the action taken at the current step. Used to update the action frame buffer when action_frame_stacking is non-zero. Default: None.
deterministic – Boolean, if True, observation noise is not applied. Default: False.
- Returns:
The observation vector as tensor or numpy.ndarray, depending on whether the Environment is set as differentiable or not.
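The default ordering of the observation vector can be sketched as a plain concatenation. The function assemble_obs and the feature sizes used below (a 2D goal, two muscles, a 2D fingertip position, one stacked action) are assumptions chosen for illustration; they are not taken from the library.

```python
def assemble_obs(goal, proprioception, vision, past_actions):
    """Concatenate per-sample features in the documented default order:
    goal, proprioception, vision, then the stacked past actions."""
    obs = []
    for i in range(len(goal)):
        row = list(goal[i]) + list(proprioception[i]) + list(vision[i])
        for action in past_actions:  # empty when action_frame_stacking == 0
            row += list(action[i])
        obs.append(row)
    return obs


# batch_size = 1, with hypothetical feature sizes.
goal = [[0.2, 0.4]]                       # target position
proprioception = [[1.0, 1.1, 0.0, 0.0]]   # muscle lengths, then velocities
vision = [[0.1, 0.3]]                     # fingertip position
past_actions = [[[0.5, 0.5]]]             # one stacked action frame

obs = assemble_obs(goal, proprioception, vision, past_actions)
n_features = len(obs[0])
```

With these sizes, each row holds 2 + 4 + 2 + 2 = 10 features, and an obs_noise list would need the same number of elements.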
- get_proprioception() → Tensor
Returns a (batch_size, n_features) tensor containing the instantaneous (non-delayed) proprioceptive feedback. By default, this is the normalized muscle length for each muscle, followed by the normalized muscle velocity for each muscle. Independent and identically distributed (i.i.d.) Gaussian noise is added to each element in the tensor, using the proprioception_noise attribute.
- get_save_config() → dict
Gets the environment object’s configuration as a dictionary.
- Returns:
A dictionary containing the parameters of the environment’s configuration. All parameters held as non-callable attributes by the object instance will be included in the dictionary, excluding gym.spaces.Space attributes, the effector, muscle, and skeleton attributes.
- get_vision() → Tensor
Returns a (batch_size, n_features) tensor containing the instantaneous (non-delayed) visual feedback. By default, this is the Cartesian position of the end-point effector, that is, the fingertip. Independent and identically distributed (i.i.d.) Gaussian noise is added to each element in the tensor, using the vision_noise attribute.
- joint2cartesian(joint_states: Tensor) → Tensor
Shortcut to the motornet.effector.Effector.joint2cartesian() method.
- print_attributes() → None
Prints all non-callable attributes declared in the object instance, excluding gym.spaces.Space attributes, the effector, muscle, and skeleton attributes.
- reset(*, seed: int | None = None, options: dict[str, Any] | None = None) → tuple[Tensor | ndarray, dict[str, Any]]
Initialize the task goal and effector states for a (batch of) simulation episode(s). The effector states (joint, cartesian, muscle, geometry) are initialized to be biomechanically compatible with each other. This method is likely to be overridden by any subclass to implement user-defined computations, such as defining a custom initial goal or initial states.
- Parameters:
seed – Integer, the seed that is used to initialize the environment’s PRNG (np_random). If the environment does not already have a PRNG and seed=None (the default option) is passed, a seed will be chosen from some source of entropy (e.g. timestamp or /dev/urandom). However, if the environment already has a PRNG and seed=None is passed, the PRNG will not be reset. If you pass an integer, the PRNG will be reset even if it already exists. Usually, you want to pass an integer right after the environment has been initialized and then never again.
options – Dictionary, optional kwargs specific to motornet environments. This is mainly useful to pass batch_size, joint_state, and deterministic kwargs if desired, as described below.
- Options:
batch_size: Integer, the desired batch size. Default: 1.
joint_state: The joint state from which the other state values are inferred. If None, the q_init value declared during the class instantiation will be used. If q_init is also None, random initial joint states are drawn, from which the other state values are inferred. Default: None.
deterministic: Boolean, if True, observation, proprioception, and vision noise are not applied. Default: False.
- Returns:
The observation vector as tensor or numpy.ndarray, if the Environment is set as differentiable or not, respectively. It has dimensionality (batch_size, n_features).
A dictionary containing the initial step’s information.
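The seeding rules and the joint_state precedence described for reset() can be summarised in a small stand-in class. This is a sketch under assumptions, not the library's code: ResetSketch is hypothetical, it uses Python's random.Random in place of the environment's np_random, and the two-joint state drawn uniformly in [0, 1] is an arbitrary placeholder.

```python
import random


class ResetSketch:
    """Hypothetical stand-in mimicking the documented reset() rules."""

    def __init__(self, q_init=None):
        self.q_init = q_init
        self.np_random = None  # no PRNG exists before the first reset

    def reset(self, *, seed=None, options=None):
        if seed is not None:
            # An integer seed always (re)creates the PRNG.
            self.np_random = random.Random(seed)
        elif self.np_random is None:
            # No PRNG yet and no seed given: seed from OS entropy.
            self.np_random = random.Random()
        # Otherwise the existing PRNG is left untouched.

        options = options or {}
        batch_size = options.get("batch_size", 1)
        # Precedence: explicit joint_state, then q_init, then a random draw.
        joint_state = options.get("joint_state")
        if joint_state is None:
            joint_state = self.q_init
        if joint_state is None:
            joint_state = [
                [self.np_random.uniform(0.0, 1.0) for _ in range(2)]
                for _ in range(batch_size)
            ]
        return joint_state


env = ResetSketch()
a = env.reset(seed=7, options={"batch_size": 3})
b = env.reset(options={"batch_size": 3})          # continues the same stream
c = env.reset(seed=7, options={"batch_size": 3})  # stream restarts, c == a

fixed = ResetSketch(q_init=[[0.5, 1.0]]).reset()
override = ResetSketch(q_init=[[0.5, 1.0]]).reset(
    options={"joint_state": [[0.1, 0.2]]}
)
```

Passing the same integer seed reproduces the same initial states, while a reset without a seed keeps drawing from the existing stream.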
- step(action: Tensor | ndarray, deterministic: bool = False, **kwargs) → tuple[Tensor | ndarray, ndarray | None, bool, bool, dict[str, Any]]
Perform one simulation step. This method is likely to be overwritten by any subclass to implement user-defined computations, such as reward value calculation for reinforcement learning, custom truncation or termination conditions, or time-varying goals.
- Parameters:
action – Tensor or numpy.ndarray, the input drive to the actuators.
deterministic – Boolean, if True, observation, action, proprioception, and vision noise are not applied. Default: False.
**kwargs – Passed as-is to the motornet.effector.Effector.step() call. Mainly useful to pass endpoint_load or joint_load kwargs.
- Returns:
The observation vector as tensor or numpy.ndarray, if the Environment is set as differentiable or not, respectively. It has dimensionality (batch_size, n_features).
A numpy.ndarray with the reward information for the step, with dimensionality (batch_size, 1). This is None if the Environment is set as differentiable. In the base Environment class, the reward is always 0.
A boolean indicating if the simulation has been terminated or truncated. If the Environment is set as differentiable, this returns True when the simulation time reaches the max_ep_duration provided at initialization.
A boolean indicating if the simulation has been truncated early or not. This always returns False if the Environment is set as differentiable.
A dictionary containing this step’s information.
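The five-element return value lends itself to a simple episode loop. The sketch below uses a hypothetical ToyEnv stand-in that only mimics the return signature described above for the differentiable case; the timestep dt and the contents of the observation and info dictionary are assumptions, and no effector is simulated.

```python
class ToyEnv:
    """Hypothetical stand-in that reproduces only the shape of the API."""

    def __init__(self, dt=0.01, max_ep_duration=1.0):
        self.dt = dt
        # Count steps rather than accumulating floats to avoid rounding drift.
        self.max_steps = round(max_ep_duration / dt)
        self.n_steps = 0

    def reset(self, *, seed=None, options=None):
        self.n_steps = 0
        return [[0.0]], {}

    def step(self, action, deterministic=False):
        self.n_steps += 1
        obs = [[self.n_steps * self.dt]]
        reward = None       # differentiable mode: no reward is returned
        terminated = self.n_steps >= self.max_steps
        truncated = False   # always False in differentiable mode
        info = {"action": action}
        return obs, reward, terminated, truncated, info


env = ToyEnv(dt=0.01, max_ep_duration=0.05)
obs, info = env.reset()
terminated, n = False, 0
while not terminated:
    obs, reward, terminated, truncated, info = env.step(action=[[0.0]])
    n += 1
```

With a real environment, the action passed at each iteration would typically come from a policy network evaluated on the previous observation.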
- to(*args, **kwargs)
Move and/or cast the parameters and buffers.
This can be called as
- to(device=None, dtype=None, non_blocking=False)
- to(dtype, non_blocking=False)
- to(tensor, non_blocking=False)
- to(memory_format=torch.channels_last)
Its signature is similar to torch.Tensor.to(), but only accepts floating point or complex dtypes. In addition, this method will only cast the floating point or complex parameters and buffers to dtype (if given). The integral parameters and buffers will be moved to device, if that is given, but with dtypes unchanged. When non_blocking is set, it tries to convert/move asynchronously with respect to the host if possible, e.g., moving CPU Tensors with pinned memory to CUDA devices.
See below for examples.
Note
This method modifies the module in-place.
- Parameters:
device (torch.device) – the desired device of the parameters and buffers in this module
dtype (torch.dtype) – the desired floating point or complex dtype of the parameters and buffers in this module
tensor (torch.Tensor) – Tensor whose dtype and device are the desired dtype and device for all parameters and buffers in this module
memory_format (torch.memory_format) – the desired memory format for 4D parameters and buffers in this module (keyword only argument)
- Returns:
self
- Return type:
Module
Examples:
>>> # xdoctest: +IGNORE_WANT("non-deterministic")
>>> linear = nn.Linear(2, 2)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]])
>>> linear.to(torch.double)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]], dtype=torch.float64)
>>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_CUDA1)
>>> gpu1 = torch.device("cuda:1")
>>> linear.to(gpu1, dtype=torch.half, non_blocking=True)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16, device='cuda:1')
>>> cpu = torch.device("cpu")
>>> linear.to(cpu)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16)
>>> linear = nn.Linear(2, 2, bias=None).to(torch.cdouble)
>>> linear.weight
Parameter containing:
tensor([[ 0.3741+0.j,  0.2382+0.j],
        [ 0.5593+0.j, -0.4443+0.j]], dtype=torch.complex128)
>>> linear(torch.ones(3, 2, dtype=torch.cdouble))
tensor([[0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j]], dtype=torch.complex128)
- update_obs_buffer(action: Tensor | None = None) → None
Shifts each observation buffer by one step and appends the current observation.
- Parameters:
action – Tensor or None, the most recent action, used to update the action frame buffer when action_frame_stacking is non-zero. Default: None.
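The shift-and-append behaviour, together with the proprioception_delay and vision_delay parameters, can be pictured as a fixed-length queue. The following is a minimal sketch rather than motornet's implementation: the DelayBuffer class, the timestep dt, the conversion of a delay in seconds to a number of steps, and the string observations are all hypothetical.

```python
from collections import deque


class DelayBuffer:
    """Fixed-length FIFO that returns the observation from delay_steps ago."""

    def __init__(self, initial_obs, delay_steps):
        # Pre-fill the queue so a delayed read is defined from the first step.
        self._buf = deque(
            [initial_obs] * (delay_steps + 1), maxlen=delay_steps + 1
        )

    def update(self, obs):
        # Appending to a full deque drops the oldest entry (the "shift").
        self._buf.append(obs)

    def delayed(self):
        # The oldest entry is the observation seen delay_steps updates ago.
        return self._buf[0]


dt = 0.01            # assumed simulation timestep, in seconds
vision_delay = 0.03  # delay, in seconds
delay_steps = round(vision_delay / dt)

buf = DelayBuffer(initial_obs="obs_0", delay_steps=delay_steps)
for t in range(1, 6):
    buf.update(f"obs_{t}")
delayed_obs = buf.delayed()
```

After five updates with a three-step delay, the delayed read returns the observation from step two, that is, three steps behind the most recent one.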
- class motornet.environment.RandomTargetReach(*args, **kwargs)
Bases: Environment
A reach-to-target environment in which both the starting position and the target are drawn uniformly at random from the reachable workspace at each episode reset.
- Parameters:
*args – Positional arguments passed as-is to the parent Environment class.
**kwargs – Keyword arguments passed as-is to the parent Environment class.
- reset(*, seed: int | None = None, options: dict[str, Any] | None = None) → tuple[Any, dict[str, Any]]
Overrides Environment.reset() to draw a random fingertip target position as the goal, sampled uniformly from the full reachable workspace.
- Parameters:
seed – Integer, the seed that is used to initialize the environment’s PRNG. See Environment.reset() for full details.
options – Dictionary, optional kwargs. Accepts the same keys as Environment.reset().
- Returns:
The observation vector as tensor or numpy.ndarray.
A dictionary containing the initial step’s information.
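One way to obtain a fingertip target that is guaranteed to be reachable is to draw joint angles within their limits and map them to Cartesian space. The sketch below does this for a hypothetical two-link planar arm. Everything in it is an assumption for illustration: this page does not state whether sampling happens in joint space or Cartesian space, and the segment lengths, joint limits, and function names are invented.

```python
import math
import random


def fingertip(q, l1=0.3, l2=0.3):
    """Forward kinematics of a two-link planar arm (hypothetical lengths, in m)."""
    shoulder, elbow = q
    x = l1 * math.cos(shoulder) + l2 * math.cos(shoulder + elbow)
    y = l1 * math.sin(shoulder) + l2 * math.sin(shoulder + elbow)
    return (x, y)


def draw_random_target(rng, lower=(0.0, 0.0), upper=(2.0, 2.5)):
    """Draw joint angles uniformly within assumed limits (in rad), then
    convert them to a fingertip position, so the target is always reachable."""
    q = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
    return fingertip(q)


rng = random.Random(0)
targets = [draw_random_target(rng) for _ in range(4)]
```

Note that sampling uniformly in joint space does not produce a uniform distribution over Cartesian positions, so this sketch should be read as an illustration of reachability rather than of the exact sampling distribution.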