---
license: llama3.2
pipeline_tag: text-generation
tags:
- executorch
---
# Introduction
This repository hosts the **Llama 3.2** models for the [React Native ExecuTorch](https://www.npmjs.com/package/react-native-executorch) library. It includes both the **1B** and **3B** variants of the model, as well as their **quantized** versions, all in `.pte` format and ready for use with the **ExecuTorch** runtime.
If you'd like to run these models in your own ExecuTorch runtime, refer to the [official documentation](https://pytorch.org/executorch/stable/index.html) for setup instructions.
## Compatibility
If you intend to use this model outside of React Native ExecuTorch, make sure your runtime is compatible with the **ExecuTorch** version used to export the `.pte` files. For more details, see the compatibility note in the [ExecuTorch GitHub repository](https://github.com/pytorch/executorch/blob/11d1742fdeddcf05bc30a6cfac321d2a2e3b6768/runtime/COMPATIBILITY.md?plain=1#L4). If you use React Native ExecuTorch, the constants exported by the library guarantee compatibility with the runtime used behind the scenes.
These models were exported with ExecuTorch `v0.6.0`, and **no forward compatibility** is guaranteed: older versions of the runtime may not work with these files.
### Repository Structure
The repository is organized into two main directories:
- `llama-3.2-1B`
- `llama-3.2-3B`
Each directory contains several variants of the model, including the **QLoRA** and **SpinQuant** quantized versions as well as the **original** (unquantized) checkpoints.
- The `.pte` file should be passed to the `modelSource` parameter.
- The tokenizer files for the models are available in the repository root, as `tokenizer.json` and `tokenizer_config.json`.
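As a rough sketch, wiring one of these models into a React Native ExecuTorch app might look like the following. The `modelSource` parameter is documented above; the `LLAMA3_2_1B_QLORA` constant and the `tokenizerSource`/`tokenizerConfigSource` option names are assumptions here, so check the react-native-executorch documentation for your installed version:

```typescript
// Hypothetical usage sketch for react-native-executorch.
// Constant and option names below are assumptions; verify them
// against the library docs for your installed version.
import { useLLM, LLAMA3_2_1B_QLORA } from 'react-native-executorch';

export function useLlamaChat() {
  const llm = useLLM({
    // The library's exported constants point at the .pte files in this
    // repository and guarantee a runtime-compatible ExecuTorch version.
    modelSource: LLAMA3_2_1B_QLORA,
    // Assumed option names for the tokenizer files from the repo root:
    tokenizerSource: require('../assets/tokenizer.json'),
    tokenizerConfigSource: require('../assets/tokenizer_config.json'),
  });
  return llm;
}
```

Using the library's constants rather than hard-coded URLs is preferable, since (as noted above) they keep the model files in sync with the runtime version bundled with the library.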
If you wish to export the model yourself, you’ll need to obtain model weights and the `params.json` file from the official repositories, which can be found [here](https://huggingface.co/collections/meta-llama/llama-32-66f448ffc8c32f949b04c8cf).
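A self-export typically goes through the Llama example in the `executorch` repository. The sketch below shows the general shape of such an invocation; flag names and module paths vary between ExecuTorch releases, so treat this as an illustration and follow the Llama example README in the `executorch` repo for the exact command:

```shell
# Illustrative only -- module path and flags differ across ExecuTorch
# releases; consult examples/models/llama in the executorch repo.
python -m examples.models.llama.export_llama \
  --checkpoint consolidated.00.pth \
  --params params.json \
  -kv \
  -X \
  --output_name "llama3_2_1b.pte"
```

Here `consolidated.00.pth` and `params.json` are the weight and parameter files obtained from the official Meta repositories linked above.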
For the **best performance-to-quality ratio**, we highly recommend the **QLoRA** version, which is optimized for speed without sacrificing much model quality.