---
tags:
- HalfCheetah-v5
- reinforcement-learning
- ppo
- halfcheetah
- mujoco
- gymnasium
- pytorch
model-index:
- name: PPO-MuJoCo-HalfCheetah-v5
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: HalfCheetah-v5
      type: HalfCheetah-v5
    metrics:
    - type: mean_reward
      value: 1244.04 +/- 36.99
      name: mean_reward
      verified: false
---

# Proximal Policy Optimization (PPO) Agent playing HalfCheetah-v5

This is a trained Proximal Policy Optimization (PPO) agent for the MuJoCo HalfCheetah-v5 environment.

## Model Details

The model was trained using the code available [here](https://github.com/giansimone/ppo-mujoco-halfcheetah/).

## Usage

To load and use this model for inference:

```python
import torch
import json
import gymnasium as gym

from agent import SimpleAgent
from environment import make_env

# Load the configuration
with open("config.json", "r") as f:
    config = json.load(f)

env_id = config["env_id"]
hidden_dim = config["hidden_dim"]

# Create the environment and get the state and action dimensions
env, state_size, action_size = make_env(
    env_id,
    render_mode="human",
)

# Instantiate the agent and load the trained policy network
agent = SimpleAgent(state_size, action_size, hidden_dim)
# map_location="cpu" lets the checkpoint load on machines without a GPU
agent.policy.load_state_dict(torch.load("model.pt", map_location="cpu"))

# Enjoy the agent!
state, _ = env.reset()
done = False

while not done:
    # Use the deterministic action (no sampling) for evaluation
    action = agent.select_action(state, deterministic=True)
    state, reward, terminated, truncated, _ = env.step(action)
    done = terminated or truncated
    env.render()

env.close()
```
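
The `mean_reward` in the metadata is given as a mean plus or minus a spread over evaluation episodes. The sketch below shows one way such a figure could be computed for any environment that follows the Gymnasium `reset`/`step` interface. The helper name `evaluate_policy`, the default of 10 episodes, and the use of the population standard deviation are assumptions made for illustration; they are not taken from the training repository.

```python
import statistics
from typing import Callable, Tuple


def evaluate_policy(
    env,
    select_action: Callable,
    n_episodes: int = 10,  # assumed episode count, not from the repository
) -> Tuple[float, float]:
    """Return (mean, std) of the undiscounted return over n_episodes."""
    returns = []
    for _ in range(n_episodes):
        state, _ = env.reset()
        done = False
        episode_return = 0.0
        while not done:
            action = select_action(state)
            state, reward, terminated, truncated, _ = env.step(action)
            episode_return += float(reward)
            # An episode ends on termination or on time-limit truncation
            done = terminated or truncated
        returns.append(episode_return)
    # Population standard deviation (an assumption; a sample std is also common)
    return statistics.fmean(returns), statistics.pstdev(returns)
```

With the objects from the usage snippet, a call might look like `mean, std = evaluate_policy(env, lambda s: agent.select_action(s, deterministic=True))`. For a faster, headless evaluation you would probably want an environment created without `render_mode="human"`, assuming `make_env` allows that.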