---
license: mit
tags:
- chess
- game-ai
- pytorch
- safetensors
library_name: transformers
datasets:
- Maxlegrec/ChessFENS
---

# ChessBot Chess Model

This is a ChessBot model for chess move prediction and position evaluation. It is far weaker than Stockfish, but stronger than most human players.

For stronger play, reduce the sampling temperature `T`: lower values give stronger, more deterministic play.
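
Temperature presumably follows the standard formulation: the policy logits are divided by `T` before the softmax, so a low `T` concentrates probability mass on the top move. The snippet below is an illustrative sketch of that formula with made-up logits, not the model's actual code:

```python
import torch

# Illustrative temperature scaling on hypothetical move logits.
# Lower T sharpens the distribution toward the highest-scoring move.
logits = torch.tensor([2.0, 1.0, 0.1])
for T in (1.0, 0.5, 0.1):
    probs = torch.softmax(logits / T, dim=-1)
    print(f"T={T}: {probs.tolist()}")
```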

## Model Description

The ChessBot model is a transformer-based architecture designed for chess gameplay. It can:

- Predict the best move for a given chess position (FEN)
- Evaluate chess positions
- Generate move probabilities

## Please Like if this model is useful to you :)

A like goes a long way!

## Usage

```python
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained("Maxlegrec/ChessBot", trust_remote_code=True)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Example position: the standard starting position
fen = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

# Sample a move from the policy head
move = model.get_move_from_fen_no_thinking(fen, T=0.1, device=device)
print(f"Policy-based move: {move}")
# e2e4

# Get the best move using value analysis
value_move = model.get_best_move_value(fen, T=0, device=device)
print(f"Value-based move: {value_move}")
# e2e4

# Get the position evaluation as outcome probabilities
position_value = model.get_position_value(fen, device=device)
print(f"Position value [black_win, draw, white_win]: {position_value}")
# [0.2318, 0.4618, 0.3064]

# Get the full move-probability distribution
probs = model.get_move_from_fen_no_thinking(fen, T=1, device=device, return_probs=True)
top_moves = sorted(probs.items(), key=lambda x: x[1], reverse=True)[:5]
print("Top 5 moves:")
for move, prob in top_moves:
    print(f" {move}: {prob:.4f}")
# Top 5 moves:
#  e2e4: 0.9285
#  d2d4: 0.0712
#  g1f3: 0.0001
#  e2e3: 0.0000
#  c2c3: 0.0000
```
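
Since moves come back as UCI strings (e.g. `e2e4`), the policy method pairs naturally with `python-chess` for interactive or self-play loops. Below is a minimal sketch under that assumption, with `model` and `device` set up as in the usage example above:

```python
import chess

# Minimal self-play sketch using python-chess (a listed requirement).
# Assumes `model` and `device` are initialized as in the usage example
# and that moves are returned as UCI strings such as "e2e4";
# push_uci raises ValueError if the model ever returns an illegal move.
board = chess.Board()
while not board.is_game_over() and board.fullmove_number <= 40:
    move = model.get_move_from_fen_no_thinking(board.fen(), T=0.1, device=device)
    board.push_uci(move)

print(board.result(claim_draw=True))
print(board.fen())
```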

## Requirements

- torch>=2.0.0
- transformers>=4.48.1
- python-chess>=1.10.0
- numpy>=1.21.0

## Model Architecture

The architecture is strongly inspired by the LCzero project, although it is written in PyTorch.

- **Transformer layers**: 10
- **Hidden size**: 512
- **Feed-forward size**: 736
- **Attention heads**: 8
- **Vocabulary size**: 1929 (chess moves)
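
For orientation only, the sketch below maps these hyperparameters onto a vanilla PyTorch encoder stack. It is not the actual implementation, which follows LCzero's design and differs in its module layout:

```python
import torch.nn as nn

# Rough, illustrative mapping of the listed hyperparameters onto a
# vanilla PyTorch encoder; the real LCzero-style model differs.
encoder_layer = nn.TransformerEncoderLayer(
    d_model=512,          # hidden size
    nhead=8,              # attention heads
    dim_feedforward=736,  # feed-forward size
    batch_first=True,
)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=10)  # 10 transformer layers
policy_head = nn.Linear(512, 1929)  # one logit per move in the 1929-move vocabulary
```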

## Training Data

This model was trained on data from the LCzero project, consisting of around 750M chess positions. I will publish the training dataset soon.

## Limitations

- The model works best with standard chess positions
- Performance may vary with unusual or rare positions
- Requires a GPU for optimal inference speed