pytupli.benchmark.TupliEnvWrapper
- class TupliEnvWrapper(env: Env, storage: TupliStorage, benchmark_id: str | None = None, metadata_callback: EpisodeMetadataCallback | None = None, rl_tuple_cls: type[RLTuple] = RLTuple)
Bases: Wrapper
A wrapper for Gymnasium environments that enables serialization and deserialization with the goal of creating reproducible benchmarks from environments. It handles the interface to the storage backend, including storing, loading, and publishing benchmarks. It also lets users record interactions with Gymnasium environments to the storage so that they can be used as datasets for offline RL.
- Parameters:
env (Env) – The Gymnasium environment to wrap
storage (TupliStorage) – Storage backend for saving benchmark and episode data
benchmark_id (str | None) – Identifier for the benchmark. Defaults to None
metadata_callback (EpisodeMetadataCallback | None) – Callback for generating episode metadata. Defaults to None
rl_tuple_cls (type[RLTuple]) – Class to use for creating RL tuples. Defaults to RLTuple
Wraps an environment to allow a modular transformation of the step() and reset() methods.
- Parameters:
env – The environment to wrap
Methods
activate_recording – Activates the recording of environment interactions.
class_name – Returns the class name of the wrapper.
close – Closes the wrapper and env.
deactivate_recording – Deactivates the recording of environment interactions.
delete – Deletes the benchmark and optionally its related data from storage.
deserialize_env – Deserializes a JSON string back into a Gymnasium environment.
get_wrapper_attr – Gets an attribute from the wrapper and lower environments if name doesn't exist in this object.
has_wrapper_attr – Checks if the given attribute is within the wrapper or its environment.
load – Loads a benchmark from storage.
publish – Publishes the benchmark, making it available for other users depending on their access rights.
render – Uses the render() of the env that can be overwritten to change the returned data.
reset – Resets the environment and returns the initial observation.
serialize_env – Serializes a Gymnasium environment to a JSON string.
set_wrapper_attr – Sets an attribute on this wrapper or lower environment if name is already defined.
step – Takes a step in the environment and optionally records the interaction.
store – Stores the benchmark in the storage backend.
wrapper_spec – Generates a WrapperSpec for the wrappers.
Attributes
action_space – Return the Env action_space unless overwritten then the wrapper action_space is used.
metadata – Returns the Env metadata.
np_random – Returns the Env np_random attribute.
np_random_seed – Returns the base environment's np_random_seed.
observation_space – Return the Env observation_space unless overwritten then the wrapper observation_space is used.
render_mode – Returns the Env render_mode.
spec – Returns the Env spec attribute with the WrapperSpec if the wrapper inherits from EzPickle.
unwrapped – Returns the base environment of the wrapper.
- classmethod _deserialize(env: Env, storage: TupliStorage) -> Env
Internal method for environment deserialization.
This method is meant to be overridden by subclasses to implement custom deserialization behavior, e.g., for artifacts such as csv files or trained models.
- Parameters:
env (Env) – The environment to deserialize
storage (TupliStorage) – Storage backend for loading artifacts
- Returns:
Env – The deserialized environment
- _get_hash(obj: Any) -> str
Generates a hash for a given object using JSON serialization.
- Parameters:
obj (Any) – The object to hash
- Returns:
str – SHA-256 hash of the serialized object
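A minimal sketch of hashing an object through its JSON representation, using only the standard library. The helper name `json_sha256` and the choice of `sort_keys=True` and `default=str` are assumptions made for illustration; the options pytupli actually passes to the serializer are not documented here.

```python
import hashlib
import json
from typing import Any


def json_sha256(obj: Any) -> str:
    """Return the SHA-256 hex digest of obj's JSON representation."""
    # Assumption: sorting keys makes the digest independent of dict insertion order.
    # Assumption: default=str is a fallback for values JSON cannot encode natively.
    payload = json.dumps(obj, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


first = json_sha256({"name": "demo", "version": "1"})
second = json_sha256({"version": "1", "name": "demo"})
```

With sorted keys, `first` and `second` are identical even though the dictionaries were built in a different order, which is the property a content-based benchmark hash needs.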
- _prepare_storage(metadata: BenchmarkMetadata) -> tuple[str, str]
Prepares benchmark data for storage.
- Parameters:
metadata (BenchmarkMetadata) – Metadata for the benchmark
- Returns:
tuple[str, str] – Tuple containing the benchmark hash and serialized environment
- _serialize(env: Env) -> tuple[Env, list]
Internal method for environment serialization.
This method is meant to be overridden by subclasses to implement custom serialization behavior, e.g., for artifacts such as csv files or trained models.
- Parameters:
env (Env) – The environment to serialize
- Returns:
tuple[Env, list] – Tuple containing the processed environment and list of related artifacts
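The override pattern for `_serialize` and `_deserialize` might look roughly like the sketch below. Everything in it is a stand-in: `BaseHooks` mimics only the documented signatures, the environment is a `SimpleNamespace` with a hypothetical `table` attribute, artifacts are plain `(name, data)` tuples, and the storage is a `dict`. The real pytupli artifact and storage types are not shown.

```python
import copy
from types import SimpleNamespace
from typing import Any


class BaseHooks:
    """Hypothetical stand-in exposing the two documented hook signatures."""

    def _serialize(self, env: Any) -> tuple[Any, list]:
        # Default behavior in this sketch: nothing to strip, no artifacts.
        return env, []

    @classmethod
    def _deserialize(cls, env: Any, storage: Any) -> Any:
        return env


class TableAwareHooks(BaseHooks):
    """Moves a large in-memory table out of the env before serialization."""

    def _serialize(self, env: Any) -> tuple[Any, list]:
        slim = copy.copy(env)            # leave the caller's env untouched
        artifact = ("table.csv", env.table)
        slim.table = artifact[0]         # keep only a reference in the env
        return slim, [artifact]

    @classmethod
    def _deserialize(cls, env: Any, storage: Any) -> Any:
        env.table = storage[env.table]   # resolve the reference again
        return env


env = SimpleNamespace(table=[[0, 1.5], [1, 2.5]])
slim_env, artifacts = TableAwareHooks()._serialize(env)
storage = dict(artifacts)                # dict used as a toy storage backend
restored = TableAwareHooks._deserialize(slim_env, storage)
```

The design idea is that the serialized environment carries only a lightweight reference, while the bulky data travels as a separate artifact and is reattached during deserialization.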
- activate_recording()
Activates the recording of environment interactions.
When active, the wrapper will record tuples of (state, action, reward, etc.) and store them as episodes.
- deactivate_recording()
Deactivates the recording of environment interactions.
When deactivated, the wrapper will not record or store any environment interactions.
- delete(delete_artifacts: bool = False, delete_episodes: bool = True)
Deletes the benchmark and optionally its related data from storage.
- Parameters:
delete_artifacts (bool, optional) – Whether to delete related artifacts. Defaults to False
delete_episodes (bool, optional) – Whether to delete related episodes. Defaults to True
- Raises:
Exception – If deletion of benchmark, episodes, or artifacts fails
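Because `delete` raises a plain `Exception` on failure, callers may want to guard it. The helper below is a sketch under that assumption; `delete_benchmark` is a hypothetical name, and `wrapper` is any object exposing the documented `delete` method.

```python
import logging

logger = logging.getLogger(__name__)


def delete_benchmark(wrapper, *, delete_artifacts: bool = False,
                     delete_episodes: bool = True) -> bool:
    """Call wrapper.delete and report success instead of propagating errors."""
    try:
        wrapper.delete(delete_artifacts=delete_artifacts,
                       delete_episodes=delete_episodes)
    except Exception:
        # The documentation only promises a generic Exception, so catch broadly.
        logger.exception("Benchmark deletion failed")
        return False
    return True
```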
- classmethod deserialize_env(serialized_env: str, storage: TupliStorage) -> Env
Deserializes a JSON string back into a Gymnasium environment.
- Parameters:
serialized_env (str) – The JSON string representation of the environment
storage (TupliStorage) – Storage backend for loading related artifacts
- Returns:
Env – The deserialized Gymnasium environment
- classmethod load(storage: TupliStorage, benchmark_id: str | None = None, metadata_callback: EpisodeMetadataCallback | None = None, rl_tuple_cls: type[RLTuple] = RLTuple) -> TupliEnvWrapper
Loads a benchmark from storage.
- Parameters:
storage (TupliStorage) – Storage backend to load from
benchmark_id (str | None) – ID of the benchmark to load. Defaults to None
metadata_callback (EpisodeMetadataCallback | None) – Callback for generating episode metadata. Defaults to None
rl_tuple_cls (type[RLTuple]) – Class to use for creating RL tuples. Defaults to RLTuple
- Returns:
TupliEnvWrapper – A new wrapper instance with the loaded benchmark
- publish() -> None
Publishes the benchmark, making it available for other users depending on their access rights.
This method should be called after storing the benchmark when it’s ready to be used by others.
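The store-then-publish order described above can be captured in a small helper. This is a sketch: `store_and_publish` is a hypothetical name, and it relies only on the documented `store` and `publish` methods of the wrapper passed in.

```python
def store_and_publish(wrapper, name: str, description: str = "") -> str:
    """Store the benchmark first, then publish it, and return its ID."""
    benchmark_id = wrapper.store(name=name, description=description)
    wrapper.publish()  # only meaningful once the benchmark has been stored
    return benchmark_id
```

Keeping the two calls in one function prevents publishing a benchmark that was never stored.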
- reset(*, seed: int | None = None, options: dict[str, Any] | None = None) -> tuple[Any, dict[str, Any]]
Resets the environment and returns the initial observation.
- Parameters:
seed (int | None) – Random seed for environment reset. Defaults to None
options (dict[str, Any] | None) – Additional options for reset. Defaults to None
- Returns:
tuple[Any, dict[str, Any]] – Initial observation and info dictionary
- serialize_env(env: Env) -> str
Serializes a Gymnasium environment to a JSON string.
This method handles the serialization of the environment and any related artifacts.
- Parameters:
env (Env) – The environment to serialize
- Returns:
str – JSON string representation of the environment
- step(action: Any) -> tuple[Any, SupportsFloat, bool, bool, dict[str, Any]]
Takes a step in the environment and optionally records the interaction.
If recording is active, stores the interaction in the tuple buffer and creates an episode when the episode terminates.
- Parameters:
action (Any) – The action to take in the environment
- Returns:
tuple[Any, SupportsFloat, bool, bool, dict[str, Any]] – Tuple containing:
observation: The environment observation
reward: The reward for the action
terminated: Whether the episode terminated naturally
truncated: Whether the episode was artificially terminated
info: Additional information from the environment
- store(name: str, description: str = '', difficulty: str | None = None, version: str | None = None, metadata: dict[str, Any] = {}) -> str
Stores the benchmark in the storage backend.
- Parameters:
name (str) – Name of the benchmark
description (str, optional) – Description of the benchmark. Defaults to ''
difficulty (str | None, optional) – Difficulty level of the benchmark. Defaults to None
version (str | None, optional) – Version of the benchmark. Defaults to None
metadata (dict[str, Any], optional) – Additional metadata. Defaults to {}
- Returns:
str – The ID of the stored benchmark