# 🟩 Wordle AI Solver

Neural network models for solving Wordle puzzles. This repo contains two models, a supervised baseline and a reinforcement learning variant; both are deployable via the live app.


## Files

| File | Description |
|---|---|
| `model_weights.pt` | Supervised model (WordleNet) |
| `config.json` | Supervised model config |
| `rl_model_weights.pt` | RL model (REINFORCE-filtered) |
| `rl_config.json` | RL model config |
| `answers.json` | 2,315 valid Wordle answers |
| `allowed.json` | 12,972 valid guess words |

## Model Comparison

| | 🧠 Supervised | 🤖 Reinforcement |
|---|---|---|
| Training method | Cross-entropy on entropy-optimal games | REINFORCE with elite game filtering |
| Win rate | 100% | 98.2% |
| Avg guesses | 3.46 | 3.75 |
| Opener | CRANE | CRANE |
| Parameters | ~13M | ~13M |

## Architecture

Both models share the same encoder:

```text
Input:  390-dim binary vector
        (26 letters × 5 positions × 3 states: grey/yellow/green)

Hidden: Linear(390 → 512) → BatchNorm1d → ReLU → Dropout(0.3)
        Linear(512 → 512) → BatchNorm1d → ReLU → Dropout(0.3)
        Linear(512 → 256) → BatchNorm1d → ReLU

Output: Linear(256 → 12972)
        logits over all 12,972 allowed guess words
```

Board encoding:

```python
vec[letter_index * 15 + position * 3 + state] = 1.0
# letter_index: 0-25 (a-z)
# position:     0-4
# state:        0=grey, 1=yellow, 2=green
```
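As a concrete sketch, the indexing above can be wrapped in an encoder. This is illustrative rather than the repo's actual code: the `encode_board` name and the `(guess, feedback)` history format are assumptions.

```python
GREY, YELLOW, GREEN = 0, 1, 2

def encode_board(history):
    """Encode a game history as the 390-dim binary vector.

    history: list of (guess, feedback) pairs, where feedback is a
    sequence of 5 states (0=grey, 1=yellow, 2=green).
    """
    vec = [0.0] * 390
    for guess, feedback in history:
        for position, (letter, state) in enumerate(zip(guess, feedback)):
            letter_index = ord(letter) - ord("a")
            # 15 slots per letter = 5 positions x 3 states
            vec[letter_index * 15 + position * 3 + state] = 1.0
    return vec
```

The result can be handed to the model as `torch.tensor([vec])`.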

## Training

### Supervised Model

Trained on ~10,000 (board_state, best_guess) pairs generated by an entropy-optimal solver that plays all 2,315 Wordle games. The solver picks the guess maximising expected information gain at each step:

$$E[\text{Info}] = \sum_{p} P(p) \cdot \log_2\left(\frac{1}{P(p)}\right)$$
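Here $P(p)$ is the fraction of remaining answers that would produce feedback pattern $p$, so the score is the entropy of the feedback distribution a guess induces. A minimal sketch, assuming standard Wordle scoring rules (the `feedback` and `entropy_score` names are illustrative, not necessarily the repo's):

```python
import math
from collections import Counter

def feedback(guess, answer):
    """Wordle feedback as a tuple of 0=grey, 1=yellow, 2=green."""
    states = [0] * 5
    # Letters of the answer not matched green, available for yellows
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            states[i] = 2
        elif remaining[g] > 0:
            states[i] = 1
            remaining[g] -= 1
    return tuple(states)

def entropy_score(guess, possible):
    """Expected information gain of a guess over the remaining answers."""
    counts = Counter(feedback(guess, ans) for ans in possible)
    n = len(possible)
    return sum((c / n) * math.log2(n / c) for c in counts.values())
```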

### RL Model

1. Warm start from supervised weights
2. Elite game collection: greedy rollouts with constraint-filtered action masking, keeping only games solved in ≤3 guesses (~11% hit rate)
3. REINFORCE training: supervised loss on elite (state, action) pairs
4. Benchmark against all 2,315 answers using constraint-filtered suggestion logic

The RL model learns purely from reward signal (win/lose, guesses used) without access to the entropy oracle used to train the supervised model.
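Steps 2 and 3 above can be sketched as follows; `play_game` is a hypothetical rollout function (a greedy policy with constraint-filtered action masking) assumed to return the guess count and the visited (state, action) pairs:

```python
def collect_elite(play_game, answers, max_guesses=3):
    """Keep (state, action) pairs only from games won in <= max_guesses."""
    elite = []
    for answer in answers:
        n_guesses, trajectory = play_game(answer)  # one greedy rollout
        if n_guesses <= max_guesses:               # elite filter (~11% of games)
            elite.extend(trajectory)
    return elite
```

The kept pairs are then trained on with a plain cross-entropy loss, which is why step 3 behaves like supervised learning on self-generated data.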


## Inference

The models are not used as raw classifiers; the backend combines model logits with constraint filtering:

```python
# 1. Get top-20 model words
logits = model(encode_board(history))
model_words = [ALLOWED[i] for i in logits.topk(20).indices]

# 2. Filter to words consistent with all previous guesses
possible = filter_words(ANSWERS, history)

# 3. Score by entropy against remaining possible set
candidates = model_words + possible
best = max(candidates, key=lambda w: entropy_score(w, possible))
```

This hybrid approach is why the supervised model achieves 100%: the neural net narrows the search, and entropy scoring picks the optimal move.
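The constraint filter in step 2 keeps only answers that would have reproduced every observed feedback. A minimal sketch, with the scoring function passed in explicitly; the backend's real `filter_words` signature may differ:

```python
def filter_words(words, history, score):
    """Keep words consistent with every (guess, feedback) pair.

    score(guess, answer) -> feedback tuple, matching Wordle's rules.
    """
    return [
        w for w in words
        if all(score(guess, w) == tuple(fb) for guess, fb in history)
    ]
```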


## Usage

```python
import json

import torch
import torch.nn as nn
from huggingface_hub import hf_hub_download

REPO_ID = "sato2ru/wordle-solver"

config  = json.load(open(hf_hub_download(REPO_ID, "config.json")))
ALLOWED = json.load(open(hf_hub_download(REPO_ID, "allowed.json")))

class WordleNet(nn.Module):
    def __init__(self):
        super().__init__()
        h = config["hidden"]
        self.net = nn.Sequential(
            nn.Linear(390, h), nn.BatchNorm1d(h), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(h, h),   nn.BatchNorm1d(h), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(h, 256), nn.BatchNorm1d(256), nn.ReLU(),
            nn.Linear(256, 12972),
        )

    def forward(self, x):
        return self.net(x)

# Load supervised model
model = WordleNet()
model.load_state_dict(
    torch.load(hf_hub_download(REPO_ID, "model_weights.pt"), map_location="cpu")
)
model.eval()
```

Or use the live API directly:

```shell
curl -X POST "https://web-production-ea1d.up.railway.app/suggest?model=supervised" \
  -H "Content-Type: application/json" \
  -d '{"history": []}'

curl -X POST "https://web-production-ea1d.up.railway.app/suggest?model=rl" \
  -H "Content-Type: application/json" \
  -d '{"history": []}'
```
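The same calls can be made from Python with only the standard library; the endpoint and payload shape are taken from the curl examples above:

```python
import json
import urllib.request

API = "https://web-production-ea1d.up.railway.app/suggest"

def build_request(history, model="supervised"):
    """Build the POST request matching the curl calls above."""
    return urllib.request.Request(
        f"{API}?model={model}",
        data=json.dumps({"history": history}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def suggest(history, model="supervised"):
    with urllib.request.urlopen(build_request(history, model)) as resp:
        return json.load(resp)
```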

## Results

### Supervised (all 2,315 answers, greedy + entropy filter)

```text
1 guess  :    1
2 guesses:   59  ████████████
3 guesses: 1188  ██████████████████████████████████████████████
4 guesses: 1010  ████████████████████████████████████████
5 guesses:   56  ███████████
6 guesses:    1
FAILED   :    0  ✅ 100% win rate
```

### RL (all 2,315 answers, greedy + entropy filter)

```text
1 guess  :    1
2 guesses:  141  ████████████
3 guesses:  810  ██████████████████████████████████████████████
4 guesses:  893  ████████████████████████████████████████
5 guesses:  343  ███████████
6 guesses:   86  ████
FAILED   :   41  ✅ 98.2% win rate
```

## License

MIT
