MuzzammilShah / NeuralNetworks-LanguageModels-3

	---
	license: mit
	datasets:
	- MuzzammilShah/people-names
	language:
	- en
	model_name: Batch Normalization for Neural Networks
	library_name: pytorch
	tags:
	- makemore
	- batch-normalization
	- neural-networks
	- andrej-karpathy
	---

	# Batch Normalization for Neural Networks: Makemore (Part 3)

	In this repository, I implemented Batch Normalization in a neural network to improve training stability and performance, following Andrej Karpathy's approach in the Makemore - Part 3 video.

	## Overview
	This implementation focuses on:
	- Normalizing layer activations with Batch Normalization so that activation and gradient statistics stay well behaved.
	- Addressing initialization issues.
	- Using Kaiming initialization to keep activation functions from saturating.
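
As a rough illustration of the normalization step, here is a minimal sketch of a training-mode batch normalization forward pass in PyTorch. It is not the repository's code: the names `batchnorm_forward`, `hpreact`, `gain`, and `bias`, as well as the tensor sizes, are assumptions chosen for the example, and the running statistics needed at inference time are left out.

```python
import torch

def batchnorm_forward(x, gain, bias, eps=1e-5):
    """Normalize each feature over the batch dimension, then scale and shift."""
    mean = x.mean(dim=0, keepdim=True)                # per-feature batch mean
    var = x.var(dim=0, keepdim=True, unbiased=False)  # per-feature batch variance
    x_hat = (x - mean) / torch.sqrt(var + eps)        # roughly zero mean, unit variance
    return gain * x_hat + bias                        # learnable scale and shift

torch.manual_seed(0)
batch_size, n_hidden = 32, 200  # illustrative sizes, not taken from the repository

# Deliberately badly scaled pre-activations (hypothetical input).
hpreact = torch.randn(batch_size, n_hidden) * 5.0 + 3.0

gain = torch.ones(1, n_hidden)   # scale, initialized to 1
bias = torch.zeros(1, n_hidden)  # shift, initialized to 0

out = batchnorm_forward(hpreact, gain, bias)
out_mean = out.mean(dim=0)                # per-feature mean after normalization
out_std = out.std(dim=0, unbiased=False)  # per-feature std after normalization
```

With `gain` at 1 and `bias` at 0, every feature of `out` should have a mean close to 0 and a standard deviation close to 1 across the batch, regardless of how the input was scaled.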

	Additionally, visualization graphs at the end analyze how these techniques affect the training process and model performance.
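
One quantity that such an analysis can track is the share of saturated activation units. The sketch below compares unit-variance weights with Kaiming-scaled weights for a single tanh layer. It is an illustration under assumptions, not the repository's code: the layer sizes, the 0.97 saturation threshold, and the helper `saturated_fraction` are all hypothetical choices.

```python
import torch

torch.manual_seed(0)
fan_in, n_hidden, batch_size = 30, 200, 1000  # illustrative sizes
x = torch.randn(batch_size, fan_in)           # unit-variance inputs

def saturated_fraction(W, threshold=0.97):
    """Fraction of tanh outputs whose magnitude exceeds the threshold."""
    h = torch.tanh(x @ W)
    return (h.abs() > threshold).float().mean().item()

# Unit-variance weights: pre-activation std grows like sqrt(fan_in).
W_naive = torch.randn(fan_in, n_hidden)

# Kaiming-style scaling: std = gain / sqrt(fan_in), using PyTorch's tanh gain.
tanh_gain = torch.nn.init.calculate_gain("tanh")
W_kaiming = torch.randn(fan_in, n_hidden) * tanh_gain / fan_in**0.5

frac_naive = saturated_fraction(W_naive)
frac_kaiming = saturated_fraction(W_kaiming)
```

Under these assumptions, `frac_kaiming` should come out well below `frac_naive`, because the scaled weights keep the pre-activations in the range where tanh is not flat.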

	## Documentation
	For a better reading experience and detailed notes, visit my [Road to GPT Documentation Site](https://muzzammilshah.github.io/Road-to-GPT/Makemore-part3/).

	## Acknowledgments
	The notes and implementation are inspired by the Makemore - Part 3 video by [Andrej Karpathy](https://karpathy.ai/).

	For more of my projects, visit my [Portfolio Site](https://muhammedshah.com).