# 🟩 Wordle AI Solver

Neural network models for solving Wordle puzzles. This repo contains two models, a supervised baseline and a reinforcement learning variant; both are deployable via the live app.


## Files

| File | Description |
|---|---|
| `model_weights.pt` | Supervised model (WordleNet) |
| `config.json` | Supervised model config |
| `rl_model_weights.pt` | RL model (REINFORCE-filtered) |
| `rl_config.json` | RL model config |
| `answers.json` | 2,315 valid Wordle answers |
| `allowed.json` | 12,972 valid guess words |

## Model Comparison

| | 🧠 Supervised | 🤖 Reinforcement |
|---|---|---|
| Training method | Cross-entropy on entropy-optimal games | REINFORCE with elite game filtering |
| Win rate | 100% | 98.2% |
| Avg guesses | 3.46 | 3.75 |
| Opener | CRANE | CRANE |
| Parameters | ~13M | ~13M |

## Architecture

Both models share the same encoder:

```text
Input:  390-dim binary vector
        (26 letters × 5 positions × 3 states: grey/yellow/green)

Hidden: Linear(390 → 512) → BatchNorm1d → ReLU → Dropout(0.3)
        Linear(512 → 512) → BatchNorm1d → ReLU → Dropout(0.3)
        Linear(512 → 256) → BatchNorm1d → ReLU

Output: Linear(256 → 12972)
        logits over all 12,972 allowed guess words
```

Board encoding:

```python
vec[letter_index * 15 + position * 3 + state] = 1.0
# letter_index: 0-25 (a-z)
# position:     0-4
# state:        0=grey, 1=yellow, 2=green
```
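As a concrete sketch, the indexing above can be wrapped in an encoder. This is illustrative rather than the repo's actual code: the `encode_board` name and the `(guess, feedback)` history format are assumptions.

```python
GREY, YELLOW, GREEN = 0, 1, 2

def encode_board(history):
    """Encode a game history as the 390-dim binary vector.

    history: list of (guess, feedback) pairs, where feedback is a
    sequence of 5 states (0=grey, 1=yellow, 2=green).
    """
    vec = [0.0] * 390
    for guess, feedback in history:
        for position, (letter, state) in enumerate(zip(guess, feedback)):
            letter_index = ord(letter) - ord("a")
            # 15 slots per letter = 5 positions x 3 states
            vec[letter_index * 15 + position * 3 + state] = 1.0
    return vec
```

The result can be handed to the model as `torch.tensor([vec])`.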

## Training

### Supervised Model

Trained on ~10,000 (board_state, best_guess) pairs generated by an entropy-optimal solver that plays all 2,315 Wordle games. The solver picks the guess maximising expected information gain at each step:

$$E[\text{Info}] = \sum_{p} P(p) \cdot \log_2\left(\frac{1}{P(p)}\right)$$
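Here $P(p)$ is the fraction of remaining answers that would produce feedback pattern $p$, so the score is the entropy of the feedback distribution a guess induces. A minimal sketch, assuming standard Wordle scoring rules (the `feedback` and `entropy_score` names are illustrative, not necessarily the repo's):

```python
import math
from collections import Counter

def feedback(guess, answer):
    """Wordle feedback as a tuple of 0=grey, 1=yellow, 2=green."""
    states = [0] * 5
    # Letters of the answer not matched green, available for yellows
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            states[i] = 2
        elif remaining[g] > 0:
            states[i] = 1
            remaining[g] -= 1
    return tuple(states)

def entropy_score(guess, possible):
    """Expected information gain of a guess over the remaining answers."""
    counts = Counter(feedback(guess, ans) for ans in possible)
    n = len(possible)
    return sum((c / n) * math.log2(n / c) for c in counts.values())
```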

### RL Model

1. Warm start from supervised weights
2. Elite game collection: greedy rollouts with constraint-filtered action masking, keeping only games solved in ≤3 guesses (~11% hit rate)
3. REINFORCE training: supervised loss on elite (state, action) pairs
4. Benchmark against all 2,315 answers using constraint-filtered suggestion logic

The RL model learns purely from reward signal (win/lose, guesses used) without access to the entropy oracle used to train the supervised model.
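Steps 2 and 3 above can be sketched as follows; `play_game` is a hypothetical rollout function (a greedy policy with constraint-filtered action masking) assumed to return the guess count and the visited (state, action) pairs:

```python
def collect_elite(play_game, answers, max_guesses=3):
    """Keep (state, action) pairs only from games won in <= max_guesses."""
    elite = []
    for answer in answers:
        n_guesses, trajectory = play_game(answer)  # one greedy rollout
        if n_guesses <= max_guesses:               # elite filter (~11% of games)
            elite.extend(trajectory)
    return elite
```

The kept pairs are then trained on with a plain cross-entropy loss, which is why step 3 behaves like supervised learning on self-generated data.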


## Inference

The models are not used as raw classifiers; the backend combines model logits with constraint filtering:

```python
# 1. Get top-20 model words
logits = model(encode_board(history))
model_words = [ALLOWED[i] for i in logits.topk(20).indices]

# 2. Filter to words consistent with all previous guesses
possible = filter_words(ANSWERS, history)

# 3. Score by entropy against remaining possible set
candidates = model_words + possible
best = max(candidates, key=lambda w: entropy_score(w, possible))
```

This hybrid approach is why the supervised model achieves 100%: the neural net narrows the search, and entropy scoring picks the optimal move.
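The constraint filter in step 2 keeps only answers that would have reproduced every observed feedback. A minimal sketch, with the scoring function passed in explicitly; the backend's real `filter_words` signature may differ:

```python
def filter_words(words, history, score):
    """Keep words consistent with every (guess, feedback) pair.

    score(guess, answer) -> feedback tuple, matching Wordle's rules.
    """
    return [
        w for w in words
        if all(score(guess, w) == tuple(fb) for guess, fb in history)
    ]
```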


## Usage

```python
import json

import torch
import torch.nn as nn
from huggingface_hub import hf_hub_download

REPO_ID = "sato2ru/wordle-solver"

config  = json.load(open(hf_hub_download(REPO_ID, "config.json")))
ALLOWED = json.load(open(hf_hub_download(REPO_ID, "allowed.json")))

class WordleNet(nn.Module):
    def __init__(self):
        super().__init__()
        h = config["hidden"]
        self.net = nn.Sequential(
            nn.Linear(390, h), nn.BatchNorm1d(h), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(h, h),   nn.BatchNorm1d(h), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(h, 256), nn.BatchNorm1d(256), nn.ReLU(),
            nn.Linear(256, 12972),
        )

    def forward(self, x):
        return self.net(x)

# Load supervised model
model = WordleNet()
model.load_state_dict(
    torch.load(hf_hub_download(REPO_ID, "model_weights.pt"), map_location="cpu")
)
model.eval()
```

Or use the live API directly:

```shell
curl -X POST "https://web-production-ea1d.up.railway.app/suggest?model=supervised" \
  -H "Content-Type: application/json" \
  -d '{"history": []}'

curl -X POST "https://web-production-ea1d.up.railway.app/suggest?model=rl" \
  -H "Content-Type: application/json" \
  -d '{"history": []}'
```
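The same calls can be made from Python with only the standard library; the endpoint and payload shape are taken from the curl examples above:

```python
import json
import urllib.request

API = "https://web-production-ea1d.up.railway.app/suggest"

def build_request(history, model="supervised"):
    """Build the POST request matching the curl calls above."""
    return urllib.request.Request(
        f"{API}?model={model}",
        data=json.dumps({"history": history}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def suggest(history, model="supervised"):
    with urllib.request.urlopen(build_request(history, model)) as resp:
        return json.load(resp)
```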

## Results

### Supervised (all 2,315 answers, greedy + entropy filter)

```text
1 guess  :    1
2 guesses:   59  ████████████
3 guesses: 1188  ██████████████████████████████████████████████
4 guesses: 1010  ████████████████████████████████████████
5 guesses:   56  ███████████
6 guesses:    1
FAILED   :    0  ✅ 100% win rate
```

### RL (all 2,315 answers, greedy + entropy filter)

```text
1 guess  :    1
2 guesses:  141  ████████████
3 guesses:  810  ██████████████████████████████████████████████
4 guesses:  893  ████████████████████████████████████████
5 guesses:  343  ███████████
6 guesses:   86  ████
FAILED   :   41  ✅ 98.2% win rate
```

## License

MIT
