
Pufferlib
Spin up fast, vectorized single- or multi-agent RL environments with PufferLib’s PufferEnv API and the Ocean suite instead of slow Gym wrappers.
Overview
PufferLib is an agent skill for the Build phase that guides you through PufferEnv and Ocean to implement vectorized single- and multi-agent reinforcement-learning environments.
Install
npx skills add https://github.com/k-dense-ai/scientific-agent-skills --skill pufferlib
What is this skill?
- PufferEnv API with shared-buffer, in-place observations, actions, and rewards for high-throughput vectorization
- Ocean suite with 20+ pre-built environments for single-agent and multi-agent scenarios
- Flat observation and action spaces plus discrete/structured space helpers for efficient batch training
- Patterns for custom env state, reset/step loops, and integration with PyTorch PPO-style training workflows
- Performance-oriented guidance: native vectorization vs Gymnasium/Gym bridges and profiling-minded env design
- Ocean suite includes 20+ pre-built environments
- PufferEnv uses shared-buffer in-place updates for observations, actions, and rewards
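To make the flat-space point above concrete, here is a minimal NumPy-only sketch of packing a structured observation into one flat row per agent. It does not call PufferLib; `LAYOUT`, `flat_size`, `pack`, and `unpack` are hypothetical names chosen for illustration, and the field names are borrowed from the multi-agent example in SKILL.md.

```python
import numpy as np

# Hypothetical structured observation layout (illustrative, not a PufferLib object).
LAYOUT = {"position": (2,), "velocity": (2,), "global": (10,)}

def flat_size(layout):
    """Total number of scalars across all fields."""
    return sum(int(np.prod(shape)) for shape in layout.values())

def pack(obs, layout, out):
    """Write each field of `obs` into consecutive slices of the flat row `out`."""
    offset = 0
    for name, shape in layout.items():
        n = int(np.prod(shape))
        out[offset:offset + n] = np.asarray(obs[name], dtype=out.dtype).ravel()
        offset += n
    return out

def unpack(flat, layout):
    """Return per-field views into `flat` (no data is copied)."""
    fields, offset = {}, 0
    for name, shape in layout.items():
        n = int(np.prod(shape))
        fields[name] = flat[offset:offset + n].reshape(shape)
        offset += n
    return fields

num_agents = 4
# One preallocated batch: a flat row per agent.
batch = np.zeros((num_agents, flat_size(LAYOUT)), dtype=np.float32)

obs = {"position": [1.0, 2.0], "velocity": [0.5, -0.5], "global": np.arange(10)}
pack(obs, LAYOUT, batch[0])        # writes directly into agent 0's row
fields = unpack(batch[0], LAYOUT)  # structured views over the same memory
```

Because every agent's observation has the same flat length, the whole batch can be handed to a policy network as a single 2-D array, and the structured view is only rebuilt where it is needed.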
Adoption & trust: 520 installs on skills.sh; 27.6k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You need RL environments that keep up with vectorized training, but standard Gym wrappers copy data every step and choke your rollout throughput.
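The difference between the two patterns can be sketched with plain NumPy. This is an illustration of the general idea, not PufferLib's implementation; `step_with_copies` and `step_in_place` are hypothetical helpers and the "environment" is just an increment.

```python
import numpy as np

NUM_ENVS, OBS_DIM = 4, 8

def step_with_copies(states):
    # Wrapper-style pattern: each env allocates a fresh observation,
    # then the batch is stacked, which copies the data again.
    per_env = [s.copy() + 1.0 for s in states]
    return np.stack(per_env)

def step_in_place(obs_buffer):
    # Shared-buffer pattern: each env writes into its own row of one
    # preallocated array, so no new arrays are created per step.
    for env_id in range(NUM_ENVS):
        row = obs_buffer[env_id]  # a view into the shared buffer, not a copy
        row += 1.0
    return obs_buffer

states = [np.zeros(OBS_DIM, dtype=np.float32) for _ in range(NUM_ENVS)]
copied = step_with_copies(states)

shared = np.zeros((NUM_ENVS, OBS_DIM), dtype=np.float32)
result = step_in_place(shared)

print(np.shares_memory(copied, states[0]))  # the stacked batch is new memory
print(result is shared)                     # the in-place batch is the same object
```

With the copying pattern, allocation and copy cost grows with both the number of environments and the observation size on every step; with the shared buffer, the batch the trainer reads is the same memory the environments wrote.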
Who is it for?
Indie or solo builders running Python RL experiments who want custom envs or Ocean presets with native vectorization and multi-agent support.
Skip if: Teams building typical CRUD SaaS or web frontends with no reinforcement-learning training loop, or beginners who have not worked with Gym-style APIs and PyTorch yet.
When should I use this skill?
You are implementing custom RL environments, adopting Ocean presets, or optimizing vectorized single- and multi-agent training with PufferLib in Python.
What do I get? / Deliverables
You leave with a PufferEnv (or Ocean) setup, correct space contracts, and buffer-friendly reset/step code ready to plug into a PyTorch training loop.
- PufferEnv subclass or Ocean env configuration with defined observation and action spaces
- Reset/step implementation compatible with shared-buffer vectorization
- Training integration notes for PyTorch (or similar) rollout pipelines
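As a rough picture of what a buffer-friendly reset/step pair looks like, the sketch below uses a toy class that writes into caller-owned arrays. `CounterEnv` is hypothetical and deliberately does not subclass PufferEnv; it only assumes NumPy, and the real PufferEnv constructor and buffer layout may differ.

```python
import numpy as np

class CounterEnv:
    """Toy environment (not a PufferLib class) that writes into caller-owned buffers."""

    def __init__(self, obs, rewards, terminals, max_steps=3):
        # Each argument is a view into a larger array shared by all envs.
        self.obs, self.rewards, self.terminals = obs, rewards, terminals
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        self.obs[:] = 0.0
        self.rewards[:] = 0.0
        self.terminals[:] = False

    def step(self, action):
        self.t += 1
        self.obs[:] = self.t              # overwrite in place, no new array
        self.rewards[:] = float(action)
        self.terminals[:] = self.t >= self.max_steps
        if self.terminals[0]:
            self.t = 0                    # start a new episode on the next step

num_envs, obs_dim = 2, 4
obs = np.zeros((num_envs, obs_dim), dtype=np.float32)
rewards = np.zeros(num_envs, dtype=np.float32)
terminals = np.zeros(num_envs, dtype=bool)

# Slices such as rewards[i:i + 1] keep each env's view writable.
envs = [CounterEnv(obs[i], rewards[i:i + 1], terminals[i:i + 1])
        for i in range(num_envs)]
for env in envs:
    env.reset()

actions = np.array([1, 0])
for _ in range(3):
    for env, action in zip(envs, actions):
        env.step(action)
```

After the loop, `obs`, `rewards`, and `terminals` already hold the batched results, so a training loop can read them directly (for example by wrapping them as tensors) without gathering per-environment return values.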
Journey fit
Reinforcement-learning environment code is product engineering—you implement training infrastructure while building an agent-capable system, not during idea or launch work. PufferLib is agent-training infrastructure (custom PufferEnv subclasses, Ocean presets, vectorized rollouts), which maps directly to agent-tooling rather than generic frontend or docs work.
How it compares
Use instead of hand-rolled Gymnasium environments when rollout throughput and native vectorization matter more than a minimal tutorial env.
Common Questions / FAQ
Who is PufferLib for?
PufferLib is for solo builders and small teams implementing reinforcement-learning environments in Python—custom PufferEnv classes, Ocean suite envs, and vectorized training integrations.
When should I use PufferLib?
Use it during Build while you design agent-training infrastructure: defining observation and action spaces, implementing reset/step with shared buffers, choosing an Ocean preset, or tuning env performance before long training runs.
Is PufferLib safe to install?
Treat it like any third-party agent skill: review the Security Audits panel on this Prism page and inspect the skill package in your repo before granting shell or network access to your training environment.
SKILL.md
# PufferLib Environments Guide

## Overview

PufferLib provides the PufferEnv API for creating high-performance custom environments, and the Ocean suite containing 20+ pre-built environments. Environments support both single-agent and multi-agent scenarios with native vectorization.

## PufferEnv API

### Core Characteristics

PufferEnv is designed for performance through in-place operations:

- Observations, actions, and rewards are initialized from a shared buffer object
- All operations happen in-place to avoid creating and copying arrays
- Native support for both single-agent and multi-agent environments
- Flat observation/action spaces for efficient vectorization

### Creating a PufferEnv

```python
import numpy as np
import pufferlib
from pufferlib import PufferEnv

class MyEnvironment(PufferEnv):
    def __init__(self, buf=None):
        super().__init__(buf)

        # Define observation and action spaces
        self.observation_space = self.make_space({
            'image': (84, 84, 3),
            'vector': (10,)
        })
        self.action_space = self.make_discrete(4)  # 4 discrete actions

        # Initialize state
        self.reset()

    def reset(self):
        """Reset environment to initial state."""
        # Reset internal state
        self.agent_pos = np.array([0, 0])
        self.step_count = 0

        # Return initial observation
        obs = {
            'image': np.zeros((84, 84, 3), dtype=np.uint8),
            'vector': np.zeros(10, dtype=np.float32)
        }
        return obs

    def step(self, action):
        """Execute one environment step."""
        # Update state based on action
        self.step_count += 1

        # Calculate reward
        reward = self._compute_reward()

        # Check if episode is done
        done = self.step_count >= 1000

        # Generate observation
        obs = self._get_observation()

        # Additional info
        info = {'episode': {'r': reward, 'l': self.step_count}} if done else {}

        return obs, reward, done, info

    def _compute_reward(self):
        """Compute reward for current state."""
        return 1.0

    def _get_observation(self):
        """Generate observation from current state."""
        return {
            'image': np.random.randint(0, 256, (84, 84, 3), dtype=np.uint8),
            'vector': np.random.randn(10).astype(np.float32)
        }
```

### Observation Spaces

#### Discrete Spaces

```python
# Single discrete value
self.observation_space = self.make_discrete(10)  # Values 0-9

# Dict with discrete values
self.observation_space = self.make_space({
    'position': (1,),  # Continuous
    'type': self.make_discrete(5)  # Discrete
})
```

#### Continuous Spaces

```python
# Box space (continuous)
self.observation_space = self.make_space({
    'image': (84, 84, 3),  # Image
    'vector': (10,),  # Vector
    'scalar': (1,)  # Single value
})
```

#### Multi-Discrete Spaces

```python
# Multiple discrete values
self.observation_space = self.make_multi_discrete([3, 5, 2])  # 3 values, 5 values, 2 values
```

### Action Spaces

```python
# Discrete actions
self.action_space = self.make_discrete(4)  # 4 actions: 0, 1, 2, 3

# Continuous actions
self.action_space = self.make_space((3,))  # 3D continuous action

# Multi-discrete actions
self.action_space = self.make_multi_discrete([3, 3])  # Two 3-way discrete choices
```

## Multi-Agent Environments

PufferLib has native multi-agent support, treating single-agent and multi-agent environments uniformly.

### Multi-Agent PufferEnv

```python
class MultiAgentEnv(PufferEnv):
    def __init__(self, num_agents=4, buf=None):
        super().__init__(buf)
        self.num_agents = num_agents

        # Per-agent observation space
        self.single_observation_space = self.make_space({
            'position': (2,),
            'velocity': (2,),
            'global': (10,)
        })

        # Per-agent action space
        self.single_action_spa