---
license: mit
datasets:
- MuzzammilShah/people-names
language:
- en
model_name: Bigram Character-Level Language Model
library_name: pytorch
tags:
- makemore
- bigram
- language-model
- andrej-karpathy
---
# Bigram Character-Level Language Model: Makemore (Part 1)
This repository explores the **training**, **sampling**, and **evaluation** of a bigram character-level language model. Model quality was evaluated using the **Negative Log Likelihood (NLL)** loss.
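As a quick illustration of the NLL metric, the sketch below computes it for a handful of bigram probabilities. The values are made up for the example and do not come from the trained model:

```python
import torch

# Hypothetical probabilities the model assigns to each observed
# bigram in a name (values are illustrative only).
probs = torch.tensor([0.04, 0.02, 0.10, 0.05])

# NLL is the mean of the negative log probabilities of the data.
nll = -probs.log().mean()
```

A lower NLL means the model assigns higher probability to the observed bigrams, so minimizing it directly improves model quality.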
## Overview
The model was trained in two distinct ways, both yielding identical results:
1. **Frequency-Based Approach**: Directly counting and normalizing bigram frequencies.
2. **Gradient-Based Optimization**: Optimizing the counts matrix using a gradient-based framework guided by minimizing the NLL loss.
This demonstrates that **both methods converge to the same probability matrix**: gradient descent on the NLL recovers the same bigram probabilities as direct counting.
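The two approaches can be sketched side by side on a toy word list. The words, smoothing, learning rate, and iteration count below are illustrative stand-ins, not the repository's actual settings:

```python
import torch

words = ["emma", "olivia", "ava"]  # toy stand-in for the names dataset
chars = sorted(set("".join(words)))
stoi = {c: i + 1 for i, c in enumerate(chars)}
stoi["."] = 0  # "." marks both the start and the end of a name
V = len(stoi)

# 1) Frequency-based: count bigrams, then normalize rows into probabilities.
N = torch.zeros((V, V), dtype=torch.int32)
for w in words:
    cs = ["."] + list(w) + ["."]
    for c1, c2 in zip(cs, cs[1:]):
        N[stoi[c1], stoi[c2]] += 1
P = (N + 1).float()          # add-one smoothing avoids log(0)
P /= P.sum(1, keepdim=True)  # each row sums to 1

# 2) Gradient-based: train a logits matrix W by minimizing the NLL;
# it converges toward (the log of) the same probabilities.
xs, ys = [], []
for w in words:
    cs = ["."] + list(w) + ["."]
    for c1, c2 in zip(cs, cs[1:]):
        xs.append(stoi[c1]); ys.append(stoi[c2])
xs, ys = torch.tensor(xs), torch.tensor(ys)

W = torch.zeros((V, V), requires_grad=True)
for _ in range(200):
    probs = W.exp() / W.exp().sum(1, keepdim=True)  # row-wise softmax
    loss = -probs[xs, ys].log().mean()              # NLL of observed bigrams
    W.grad = None
    loss.backward()
    W.data -= 10.0 * W.grad
```

Either matrix can then be sampled row by row (e.g. with `torch.multinomial`), starting from the `"."` row and stopping when `"."` is drawn again.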
## Documentation
For a better reading experience and detailed notes, visit my **[Road to GPT Documentation Site](https://muzzammilshah.github.io/Road-to-GPT/Makemore-part1/)**.
## Acknowledgments
Notes and implementations inspired by the **Makemore - Part 1** video by [Andrej Karpathy](https://karpathy.ai/).
For more of my projects, visit my [Portfolio Site](https://muhammedshah.com).