---
license: mit
datasets:
- MuzzammilShah/people-names
language:
- en
model_name: Bigram Character-Level Language Model
library_name: pytorch
tags:
- makemore
- bigram
- language-model
- andrej-karpathy
---

# Bigram Character-Level Language Model: Makemore (Part 1)

This repository explores the **training**, **sampling**, and **evaluation** of a bigram character-level language model. Model quality is assessed with the **Negative Log Likelihood (NLL)** loss.
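
Concretely, for a dataset of $N$ observed bigrams $(c_i, c_{i+1})$, the loss is the average negative log probability the model assigns to each next character:

$$
\text{NLL} = -\frac{1}{N} \sum_{i=1}^{N} \log P(c_{i+1} \mid c_i)
$$

Lower is better: a model that assigned probability 1 to every observed bigram would reach a loss of 0.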

## Overview

The model was trained in two distinct ways, both yielding identical results:

1. **Frequency-Based Approach**: directly counting bigram occurrences and normalizing each row of the counts matrix into probabilities.
2. **Gradient-Based Optimization**: optimizing a matrix of log-counts with gradient descent, guided by minimizing the NLL loss.

Both routes are sketched below.
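
A minimal sketch of the frequency-based route (a simplification, not the repository's exact code), assuming a `words` list of lowercase names, where a tiny placeholder sample stands in for the MuzzammilShah/people-names data, and the makemore convention of a `.` token marking the start and end of each word:

```python
import torch

# Placeholder sample; in the real setup, `words` would hold the full
# list of names from the dataset.
words = ["emma", "olivia", "ava"]

# 27 tokens: '.' (index 0) marks both the start and the end of a word.
stoi = {s: i + 1 for i, s in enumerate("abcdefghijklmnopqrstuvwxyz")}
stoi["."] = 0
itos = {i: s for s, i in stoi.items()}

# Count every observed bigram into a 27x27 matrix N.
N = torch.zeros((27, 27), dtype=torch.int32)
for w in words:
    chs = ["."] + list(w) + ["."]
    for ch1, ch2 in zip(chs, chs[1:]):
        N[stoi[ch1], stoi[ch2]] += 1

# Normalize each row into probabilities; the +1 smoothing keeps every
# probability nonzero, so the NLL never hits log(0).
P = (N + 1).float()
P /= P.sum(dim=1, keepdim=True)

# Sampling: start at '.', repeatedly draw the next character from the
# current row of P, and stop when '.' is drawn again.
g = torch.Generator().manual_seed(2147483647)
ix, out = 0, []
while True:
    ix = torch.multinomial(P[ix], num_samples=1, generator=g).item()
    if ix == 0:
        break
    out.append(itos[ix])
print("".join(out))
```

And a sketch of the gradient-based route under the same assumptions. The entire "network" is a single 27x27 weight matrix `W`, interpretable as log-counts: exponentiating and row-normalizing it (a softmax) recovers a probability matrix like `P` above.

```python
import torch
import torch.nn.functional as F

words = ["emma", "olivia", "ava"]  # same placeholder sample as above
stoi = {s: i + 1 for i, s in enumerate("abcdefghijklmnopqrstuvwxyz")}
stoi["."] = 0

# Training set of bigrams: current-character index -> next-character index.
xs, ys = [], []
for w in words:
    chs = ["."] + list(w) + ["."]
    for ch1, ch2 in zip(chs, chs[1:]):
        xs.append(stoi[ch1])
        ys.append(stoi[ch2])
xs, ys = torch.tensor(xs), torch.tensor(ys)

g = torch.Generator().manual_seed(2147483647)
W = torch.randn((27, 27), generator=g, requires_grad=True)
xenc = F.one_hot(xs, num_classes=27).float()  # one-hot rows select rows of W

for step in range(300):
    # Forward pass: logits -> softmax probabilities -> average NLL.
    logits = xenc @ W
    probs = logits.exp() / logits.exp().sum(dim=1, keepdim=True)
    loss = -probs[torch.arange(len(ys)), ys].log().mean()
    # (Adding a small term like 0.01 * (W**2).mean() to the loss would play
    # the same smoothing role as the +1 count in the counting version.)

    # Backward pass and a plain gradient-descent update.
    W.grad = None
    loss.backward()
    W.data += -10 * W.grad  # learning rate chosen for this tiny sample

print(f"NLL after training: {loss.item():.4f}")
```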

**Both methods converge to the same bigram probabilities**, and hence the same NLL loss, demonstrating that counting-and-normalizing and gradient descent are two routes to the same result.
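
To check that equivalence numerically, the same NLL formula can be evaluated on the counting model; this hypothetical snippet reuses `P`, `xs`, and `ys` from the sketches above:

```python
# Average NLL of the counting model over the training bigrams:
# P[xs, ys] picks out P(next char | current char) for every example.
nll = -P[xs, ys].log().mean()
print(f"counting-model NLL: {nll.item():.4f}")
```

With enough steps, the gradient-based loss settles near this value; the L2 term noted in the code comment plays the same smoothing role as the +1 count.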

## Documentation

For a better reading experience and detailed notes, visit my **[Road to GPT Documentation Site](https://muzzammilshah.github.io/Road-to-GPT/Makemore-part1/)**.

## Acknowledgments

Notes and implementations inspired by the **Makemore - Part 1** video by [Andrej Karpathy](https://karpathy.ai/).

For more of my projects, visit my [Portfolio Site](https://muhammedshah.com).