---
tags:
- HalfCheetah-v5
- reinforcement-learning
- ppo
- halfcheetah
- mujoco
- gymnasium
- pytorch
model-index:
- name: PPO-MuJoCo-HalfCheetah-v5
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: HalfCheetah-v5
      type: HalfCheetah-v5
    metrics:
    - type: mean_reward
      value: 1244.04 +/- 36.99
      name: mean_reward
      verified: false
---

# Proximal Policy Optimization (PPO) Agent playing HalfCheetah-v5

This is a trained Proximal Policy Optimization (PPO) agent for the MuJoCo HalfCheetah-v5 environment.

## Model Details

The model was trained with the code in the [ppo-mujoco-halfcheetah repository](https://github.com/giansimone/ppo-mujoco-halfcheetah/).

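## Usage

To load the model and run it for inference, use the snippet below. It imports `SimpleAgent` and `make_env` from the `agent` and `environment` modules, and reads `config.json` and `model.pt` from the working directory: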

```python
import torch
import json
import gymnasium as gym

from agent import SimpleAgent
from environment import make_env

# Load the configuration
with open("config.json", "r") as f:
    config = json.load(f)

env_id = config["env_id"]
hidden_dim = config["hidden_dim"]

# Create the environment and get the state and action space dimensions
env, state_size, action_size = make_env(
    env_id,
    render_mode="human",
)

# Instantiate the agent and load the trained policy network
agent = SimpleAgent(state_size, action_size, hidden_dim)
# map_location="cpu" lets the checkpoint load on machines without a GPU
agent.policy.load_state_dict(torch.load("model.pt", map_location="cpu"))

# Enjoy the agent!
state, _ = env.reset()
done = False

while not done:
    action = agent.select_action(state, deterministic=True)
    state, reward, terminated, truncated, _ = env.step(action)

    done = terminated or truncated

    env.render()

env.close()
```
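
The `mean_reward` in the metadata is given as a mean plus or minus a spread over evaluation episodes. The card does not state the evaluation protocol, so the following is only a minimal sketch of how such a figure could be computed. The helper `evaluate_policy`, the default of 10 episodes, and the use of the population standard deviation are all assumptions for illustration, not the author's method.

```python
import statistics


def evaluate_policy(env, select_action, n_episodes=10):
    """Run full episodes and return (mean, std) of the undiscounted returns.

    Hypothetical helper: `env` follows the Gymnasium reset/step API and
    `select_action` maps a state to an action.
    """
    returns = []
    for _ in range(n_episodes):
        state, _ = env.reset()
        done = False
        episode_return = 0.0
        while not done:
            action = select_action(state)
            state, reward, terminated, truncated, _ = env.step(action)
            episode_return += float(reward)
            done = terminated or truncated
        returns.append(episode_return)
    # Population standard deviation is an assumption; the card does not
    # say which estimator produced the reported spread.
    return statistics.mean(returns), statistics.pstdev(returns)
```

With the agent from the usage snippet, a call could look like `mean, std = evaluate_policy(env, lambda s: agent.select_action(s, deterministic=True))`. Creating the environment without `render_mode="human"` should make evaluation run considerably faster.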