Upload folder using huggingface_hub


Files changed (6)

.gitattributes +1 -0
README.md +75 -0
config.json +1 -0
model.pt +3 -0
replay.mp4 +3 -0
results.json +1 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+replay.mp4 filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,75 @@

+---
+tags:
+- HalfCheetah-v5
+- reinforcement-learning
+- ppo
+- halfcheetah
+- mujoco
+- gymnasium
+- pytorch
+model-index:
+- name: PPO-MuJoCo-HalfCheetah-v5
+  results:
+  - task:
+      type: reinforcement-learning
+      name: reinforcement-learning
+    dataset:
+      name: HalfCheetah-v5
+      type: HalfCheetah-v5
+    metrics:
+    - type: mean_reward
+      value: 1244.04 +/- 36.99
+      name: mean_reward
+      verified: false
+---
+# Proximal Policy Optimization (PPO) Agent playing HalfCheetah-v5
+This is a trained Proximal Policy Optimization (PPO) agent for the MuJoCo HalfCheetah-v5 environment.
+## Model Details
+The model was trained using the code available [here](https://github.com/giansimone/ppo-mujoco-halfcheetah/).
+## Usage
+To load and use this model for inference:
+```python
+import torch
+import json
+import gymnasium as gym
+from agent import SimpleAgent
+from environment import make_env
+# Load the configuration
+with open("config.json", "r") as f:
+    config = json.load(f)
+env_id = config["env_id"]
+hidden_dim = config["hidden_dim"]
+# Create the environment and get the state and action dimensions
+env, state_size, action_size = make_env(
+    env_id,
+    render_mode="human",
+)
+# Instantiate the agent and load the trained policy network
+agent = SimpleAgent(state_size, action_size, hidden_dim)
+agent.policy.load_state_dict(torch.load("model.pt", map_location="cpu"))
+# Enjoy the agent!
+state, _ = env.reset()
+done = False
+while not done:
+    action = agent.select_action(state, deterministic=True)
+    state, reward, terminated, truncated, _ = env.step(action)
+    done = terminated or truncated
+    env.render()
+env.close()
+```
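
Review note: the model card reports `mean_reward` as `1244.04 +/- 36.99`, but the usage snippet only plays a single episode. The sketch below shows one way such a figure could be reproduced over several episodes. It is not taken from the training repository: `evaluate` is a hypothetical helper, it assumes a Gymnasium-style `reset`/`step` interface, and it assumes the reported spread is a population standard deviation, which the commit does not state.

```python
import statistics


def evaluate(env, select_action, n_episodes=10):
    """Return (mean, std) of undiscounted episode returns.

    `env` is any object with Gymnasium-style reset() and step();
    `select_action` maps an observation to an action.
    """
    returns = []
    for _ in range(n_episodes):
        state, _ = env.reset()
        done = False
        total = 0.0
        while not done:
            action = select_action(state)
            state, reward, terminated, truncated, _ = env.step(action)
            total += float(reward)
            done = terminated or truncated
        returns.append(total)
    # Assumption: population standard deviation; the repository may use another estimator.
    return statistics.fmean(returns), statistics.pstdev(returns)
```

With the names from the README, a call might look like `evaluate(env, lambda s: agent.select_action(s, deterministic=True), n_episodes=10)`, ideally on an environment created without `render_mode="human"` so that evaluation is not slowed down by rendering.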

config.json ADDED

	@@ -0,0 +1 @@


+{"env_id": "HalfCheetah-v5", "num_envs": 8, "hidden_dim": 256, "total_timesteps": 1000000, "n_steps": 1024, "batch_size": 64, "learning_rate": 0.0003, "gamma": 0.99, "gae_lambda": 0.95, "clip_epsilon": 0.2, "value_coef": 0.5, "entropy_coef": 0.01, "max_grad_norm": 0.5, "ppo_epochs": 10, "log_dir": "runs/", "seed": 42}
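
Review note: the hyperparameters in `config.json` imply a training schedule that is easier to see when worked out. The sketch below assumes the common PPO convention that `n_steps` is counted per environment, so one rollout holds `num_envs * n_steps` transitions, and that each epoch splits the rollout into minibatches of `batch_size`. The training code may organize its loop differently.

```python
import json

# Subset of the committed config.json that determines the schedule.
config = json.loads(
    '{"num_envs": 8, "total_timesteps": 1000000, "n_steps": 1024, '
    '"batch_size": 64, "ppo_epochs": 10}'
)

# Assumption: n_steps is per environment, so a rollout spans all environments.
rollout_size = config["num_envs"] * config["n_steps"]
# Whole rollouts that fit in the timestep budget.
num_updates = config["total_timesteps"] // rollout_size
# Minibatches drawn from one rollout in a single epoch.
minibatches_per_epoch = rollout_size // config["batch_size"]
# Total optimizer steps over the whole run under these assumptions.
gradient_steps = num_updates * config["ppo_epochs"] * minibatches_per_epoch

print(rollout_size, num_updates, minibatches_per_epoch, gradient_steps)
```

Under these assumptions the run collects 8192 transitions per rollout, performs 122 policy updates, and takes 156160 gradient steps in total.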

model.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0a624ceabc3f753c2fa13d3fd139c10619c6e6dc9b5a39e07a85a112fb03e4d8
+size 298661
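
Review note: `model.pt` is stored as a Git LFS pointer, so the repository itself only holds the hash and size of the weights. The sketch below shows how a downloaded file could be checked against such a pointer. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the code assumes the three-line `version` / `oid` / `size` pointer layout shown in this diff.

```python
import hashlib

# Pointer text as committed for model.pt (diff "+" markers removed).
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:0a624ceabc3f753c2fa13d3fd139c10619c6e6dc9b5a39e07a85a112fb03e4d8
size 298661
"""


def parse_lfs_pointer(text):
    """Return (hash_algorithm, hex_digest, size_in_bytes) from pointer text."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return algo, digest, int(fields["size"])


def matches_pointer(data, pointer_text):
    """Check that raw file bytes agree with the pointer's size and SHA-256."""
    algo, digest, size = parse_lfs_pointer(pointer_text)
    return (
        algo == "sha256"
        and len(data) == size
        and hashlib.sha256(data).hexdigest() == digest
    )
```

For a real check, read the downloaded file in binary mode, for example `matches_pointer(open("model.pt", "rb").read(), POINTER)`.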

replay.mp4 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2a50239636139bd0bece27cf5edc0a953532f46611fdca931443c501cf839a0a
+size 1784791

results.json ADDED

	@@ -0,0 +1 @@


+{"env_id": "HalfCheetah-v5", "mean_reward": 1244.0386862126802, "n_eval_episodes": 10, "eval_datetime": "2025-11-10T18:00:50.269012"}
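
Review note: the metric in the model card front matter should agree with `results.json`. The sketch below is a simple consistency check using the values copied from this commit. The variable names are illustrative only.

```python
import json

# Values copied from results.json and the README front matter in this commit.
results = json.loads(
    '{"env_id": "HalfCheetah-v5", "mean_reward": 1244.0386862126802, '
    '"n_eval_episodes": 10, "eval_datetime": "2025-11-10T18:00:50.269012"}'
)
card_value = "1244.04 +/- 36.99"

# The card lists "mean +/- std"; compare only the mean part.
card_mean = card_value.split(" +/- ")[0]
rounded = f"{results['mean_reward']:.2f}"
consistent = card_mean == rounded
print(consistent)
```

The mean matches to two decimal places. `results.json` does not record a standard deviation, so the `36.99` in the model card cannot be checked against this file.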