Update README.md
README.md
CHANGED
@@ -26,7 +26,8 @@ tags:
 - **Input:** Text
 - **Output:** Text
 - **Model Optimizations:**
-- **
+- **Activation quantization:** FP8
+- **Weight quantization:** FP8
 - **Intended Use Cases:** This model is designed to accelerate research on language models, for use as a building block for generative AI powered features. It provides uses for general purpose AI systems and applications (primarily in English) which require:
 1. Memory/compute constrained environments.
 2. Latency bound scenarios.