pytupli.benchmark.TupliEnvWrapper

class TupliEnvWrapper(env: gymnasium.core.Env, storage: pytupli.storage.TupliStorage, benchmark_id: str | None = None, metadata_callback: pytupli.schema.EpisodeMetadataCallback | None = None, rl_tuple_cls: type[pytupli.schema.RLTuple] = pytupli.schema.RLTuple)

Bases: Wrapper

A wrapper for Gymnasium environments that serializes and deserializes them so that reproducible benchmarks can be created from environments. It handles the interface to the storage backend, including storing, loading, and publishing benchmarks. It also lets users record interactions with Gymnasium environments to the storage so that the recordings can be used as datasets for offline RL.

Parameters:
  • env (Env) – The Gymnasium environment to wrap

  • storage (TupliStorage) – Storage backend for saving benchmark and episode data

  • benchmark_id (str | None) – Identifier for the benchmark. Defaults to None

  • metadata_callback (EpisodeMetadataCallback | None) – Callback for generating episode metadata. Defaults to None

  • rl_tuple_cls (type[RLTuple]) – Class to use for creating RL tuples. Defaults to RLTuple

Wraps an environment to allow a modular transformation of the step() and reset() methods.


Methods

activate_recording

Activates the recording of environment interactions.

class_name

Returns the class name of the wrapper.

close

Closes the wrapper and env.

deactivate_recording

Deactivates the recording of environment interactions.

delete

Deletes the benchmark and optionally its related data from storage.

deserialize_env

Deserializes a JSON string back into a Gymnasium environment.

get_wrapper_attr

Gets an attribute from the wrapper and lower environments if name doesn't exist in this object.

has_wrapper_attr

Checks if the given attribute is within the wrapper or its environment.

load

Loads a benchmark from storage.

publish

Publishes the benchmark, making it available for other users depending on their access rights.

render

Uses the render() of the env that can be overwritten to change the returned data.

reset

Resets the environment and returns the initial observation.

serialize_env

Serializes a Gymnasium environment to a JSON string.

set_wrapper_attr

Sets an attribute on this wrapper or lower environment if name is already defined.

step

Takes a step in the environment and optionally records the interaction.

store

Stores the benchmark in the storage backend.

wrapper_spec

Generates a WrapperSpec for the wrappers.

Attributes

action_space

Returns the Env action_space unless it has been overwritten, in which case the wrapper's action_space is used.

metadata

Returns the Env metadata.

np_random

Returns the Env np_random attribute.

np_random_seed

Returns the base environment's np_random_seed.

observation_space

Returns the Env observation_space unless it has been overwritten, in which case the wrapper's observation_space is used.

render_mode

Returns the Env render_mode.

spec

Returns the Env spec attribute with the WrapperSpec if the wrapper inherits from EzPickle.

unwrapped

Returns the base environment of the wrapper.

classmethod _deserialize(env: Env, storage: TupliStorage) -> Env

Internal method for environment deserialization.

This method is meant to be overridden by subclasses to implement custom deserialization behavior, e.g., for artifacts such as csv files or trained models.

Parameters:
  • env (Env) – The environment to deserialize

  • storage (TupliStorage) – Storage backend for loading artifacts

Returns:

Env – The deserialized environment

_get_hash(obj: Any) -> str

Generates a hash for a given object using JSON serialization.

Parameters:

obj (Any) – The object to hash

Returns:

str – SHA-256 hash of the serialized object
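
As a rough illustration of what hashing an object through JSON serialization can look like, the sketch below uses only the standard library. The helper name `json_sha256` and the use of `sort_keys=True` are assumptions made for this example; the actual `_get_hash` implementation may serialize objects differently.

```python
import hashlib
import json
from typing import Any


def json_sha256(obj: Any) -> str:
    """Hypothetical stand-in for _get_hash: SHA-256 over a JSON dump."""
    # sort_keys makes the dump independent of dict insertion order.
    # This is an assumption for the sketch, not confirmed pytupli behaviour.
    serialized = json.dumps(obj, sort_keys=True)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


digest = json_sha256({"name": "demo", "version": "1"})
```

Hashing the serialized form means two objects with the same content receive the same hex digest, which is what makes such a hash usable as a content-based identifier.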

_prepare_storage(metadata: BenchmarkMetadata) -> tuple[str, str]

Prepares benchmark data for storage.

Parameters:

metadata (BenchmarkMetadata) – Metadata for the benchmark

Returns:

tuple[str, str] – Tuple containing the benchmark hash and serialized environment

_serialize(env: Env) -> tuple[Env, list]

Internal method for environment serialization.

This method is meant to be overridden by subclasses to implement custom serialization behavior, e.g., for artifacts such as csv files or trained models.

Parameters:

env (Env) – The environment to serialize

Returns:

tuple[Env, list] – Tuple containing the processed environment and list of related artifacts
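
The override pattern described for `_serialize` and `_deserialize` can be pictured with a self-contained toy. Everything here is hypothetical: `ToyEnv`, `toy_serialize`, `toy_deserialize`, the `(id, bytes)` artifact format, and the plain `dict` standing in for a storage backend are assumptions for illustration. The real methods operate on Gymnasium environments and a `TupliStorage`.

```python
import copy
import hashlib
import json


class ToyEnv:
    """Hypothetical environment that carries a bulky data table."""

    def __init__(self, table):
        self.table = table


def toy_serialize(env):
    """Mimics the documented return shape: (processed env, list of artifacts)."""
    processed = copy.copy(env)  # shallow copy so the caller's env is untouched
    payload = json.dumps(env.table).encode("utf-8")
    artifact_id = hashlib.sha256(payload).hexdigest()
    processed.table = artifact_id  # replace the bulky data with a reference
    return processed, [(artifact_id, payload)]


def toy_deserialize(env, artifact_store):
    """Inverse step: resolve the stored reference back into the data."""
    restored = copy.copy(env)
    restored.table = json.loads(artifact_store[env.table].decode("utf-8"))
    return restored


original = ToyEnv(table=[[0.0, 1.5], [2.0, 3.5]])
processed, artifacts = toy_serialize(original)
store = dict(artifacts)  # a dict stands in for the storage backend
restored = toy_deserialize(processed, store)
```

The idea is that data which does not serialize well inline is swapped for a reference before serialization and swapped back afterwards, so the serialized environment stays small.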

activate_recording()

Activates the recording of environment interactions.

When active, the wrapper will record tuples of (state, action, reward, etc.) and store them as episodes.

deactivate_recording()

Deactivates the recording of environment interactions.

When deactivated, the wrapper will not record or store any environment interactions.

delete(delete_artifacts: bool = False, delete_episodes: bool = True)

Deletes the benchmark and optionally its related data from storage.

Parameters:
  • delete_artifacts (bool, optional) – Whether to delete related artifacts. Defaults to False

  • delete_episodes (bool, optional) – Whether to delete related episodes. Defaults to True

Raises:

Exception – If deletion of benchmark, episodes, or artifacts fails

classmethod deserialize_env(serialized_env: str, storage: TupliStorage) -> Env

Deserializes a JSON string back into a Gymnasium environment.

Parameters:
  • serialized_env (str) – The JSON string representation of the environment

  • storage (TupliStorage) – Storage backend for loading related artifacts

Returns:

Env – The deserialized Gymnasium environment

classmethod load(storage: pytupli.storage.TupliStorage, benchmark_id: str | None = None, metadata_callback: pytupli.schema.EpisodeMetadataCallback | None = None, rl_tuple_cls: type[pytupli.schema.RLTuple] = pytupli.schema.RLTuple) -> TupliEnvWrapper

Loads a benchmark from storage.

Parameters:
  • storage (TupliStorage) – Storage backend to load from

  • benchmark_id (str | None) – ID of the benchmark to load. Defaults to None

  • metadata_callback (EpisodeMetadataCallback | None) – Callback for generating episode metadata. Defaults to None

  • rl_tuple_cls (type[RLTuple]) – Class to use for creating RL tuples. Defaults to RLTuple

Returns:

TupliEnvWrapper – A new wrapper instance with the loaded benchmark

publish() -> None

Publishes the benchmark, making it available for other users depending on their access rights.

This method should be called after storing the benchmark when it’s ready to be used by others.
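
The store-then-publish order can be sketched with a small helper that relies only on the `store` and `publish` methods documented on this page. The helper name `store_then_publish` and the `_StubWrapper` test double are hypothetical; in real use the first argument would be a `TupliEnvWrapper` instance.

```python
from typing import Any


def store_then_publish(wrapper: Any, name: str, description: str = "") -> str:
    """Store the benchmark first, then publish it; returns the benchmark ID."""
    benchmark_id = wrapper.store(name=name, description=description)
    wrapper.publish()
    return benchmark_id


class _StubWrapper:
    """Hypothetical test double that only records the order of calls."""

    def __init__(self):
        self.calls = []

    def store(self, name, description=""):
        self.calls.append("store")
        return "stub-id"

    def publish(self):
        self.calls.append("publish")


stub = _StubWrapper()
result = store_then_publish(stub, name="demo-benchmark")
```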

reset(*, seed: int | None = None, options: dict[str, Any] | None = None) -> tuple[Any, dict[str, Any]]

Resets the environment and returns the initial observation.

Parameters:
  • seed (int | None) – Random seed for environment reset. Defaults to None

  • options (dict[str, Any] | None) – Additional options for reset. Defaults to None

Returns:

tuple[Any, dict[str, Any]] – Initial observation and info dictionary

serialize_env(env: Env) -> str

Serializes a Gymnasium environment to a JSON string.

This method handles the serialization of the environment and any related artifacts.

Parameters:

env (Env) – The environment to serialize

Returns:

str – JSON string representation of the environment

step(action: Any) -> tuple[Any, SupportsFloat, bool, bool, dict[str, Any]]

Takes a step in the environment and optionally records the interaction.

If recording is active, stores the interaction in the tuple buffer and creates an episode when the episode terminates.

Parameters:

action (Any) – The action to take in the environment

Returns:

tuple[Any, SupportsFloat, bool, bool, dict[str, Any]]

Tuple containing:
  • observation: The environment observation

  • reward: The reward for the action

  • terminated: Whether the episode terminated naturally

  • truncated: Whether the episode was artificially terminated

  • info: Additional information from the environment
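
A recorded rollout built on the documented `activate_recording`, `reset`, `step`, and `deactivate_recording` methods might look like the sketch below. The helper `record_rollout` and the `_CountdownStub` class are hypothetical and exist only to keep the example runnable without a storage backend. The stub's buffering is a simplified picture of the behaviour described above, not the library's implementation.

```python
from typing import Any, Callable, Optional


def record_rollout(wrapper: Any, policy: Callable[[Any], Any],
                   seed: Optional[int] = None, max_steps: int = 1000) -> int:
    """Run one episode with recording on; returns the number of steps taken."""
    wrapper.activate_recording()
    try:
        obs, info = wrapper.reset(seed=seed)
        steps = 0
        for _ in range(max_steps):
            obs, reward, terminated, truncated, info = wrapper.step(policy(obs))
            steps += 1
            if terminated or truncated:
                break
        return steps
    finally:
        # Switch recording off again even if the policy or env raises.
        wrapper.deactivate_recording()


class _CountdownStub:
    """Hypothetical stand-in: terminates after three steps, buffers while recording."""

    def __init__(self):
        self.recording = False
        self.buffer = []
        self.episodes = []
        self._t = 0

    def activate_recording(self):
        self.recording = True

    def deactivate_recording(self):
        self.recording = False

    def reset(self, *, seed=None, options=None):
        self._t = 0
        return self._t, {}

    def step(self, action):
        self._t += 1
        terminated = self._t >= 3
        if self.recording:
            self.buffer.append((self._t, action, 1.0, terminated))
            if terminated:
                # Flush the buffered tuples as one finished episode.
                self.episodes.append(self.buffer)
                self.buffer = []
        return self._t, 1.0, terminated, False, {}


stub_env = _CountdownStub()
n_steps = record_rollout(stub_env, policy=lambda obs: 0, seed=42)
```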

store(name: str, description: str = '', difficulty: str | None = None, version: str | None = None, metadata: dict[str, Any] = {}) -> str

Stores the benchmark in the storage backend.

Parameters:
  • name (str) – Name of the benchmark

  • description (str, optional) – Description of the benchmark. Defaults to ''

  • difficulty (str | None, optional) – Difficulty level of the benchmark. Defaults to None

  • version (str | None, optional) – Version of the benchmark. Defaults to None

  • metadata (dict[str, Any], optional) – Additional metadata. Defaults to {}

Returns:

str – The ID of the stored benchmark