# palmyra-mini-thinking-a GGUF Model Import Guide for Ollama

This guide provides step-by-step instructions for importing the palmyra-mini-thinking-a GGUF model files into Ollama for local inference.

## 📁 Available Model Files

This directory contains two quantized versions of the palmyra-mini-thinking-a model:

- `palmyra-mini-thinking-a-BF16.gguf` - BFloat16 precision (highest quality, largest size)
- `palmyra-mini-thinking-a-Q8_0.gguf` - 8-bit quantization (high quality, medium size)

## 🔧 Prerequisites

Before getting started, ensure you have (a quick check is sketched below):

- **Ollama installed** on your system ([Download from ollama.com](https://ollama.com/))
- **Sufficient RAM/VRAM** for your chosen model:
  - BF16: ~16GB+ RAM recommended
  - Q8_0: ~8GB+ RAM recommended
- **Terminal/Command Line access**

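You can sanity-check these prerequisites from the terminal. A minimal sketch; the memory commands assume macOS (`sysctl`) or Linux (`free`):

```bash
# Confirm Ollama is installed and on your PATH
ollama --version

# Total RAM on macOS, reported in GB
sysctl -n hw.memsize | awk '{printf "%.1f GB\n", $1/1073741824}'

# Total and available RAM on Linux
free -h
```
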
## 🚀 Quick Start Guide

### Method 1: Import Local GGUF File (Recommended)

#### Step 1: Navigate to Model Directory
```bash
cd "/Users/[user]/Documents/Model Weights/SPW2 Mini Launch/palmyra-mini-thinking-a/GGUF/palmyra-mini-thinking-a FIXED GGUF-BF16"
```

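Optionally, confirm the GGUF files are actually present before writing the Modelfile:

```bash
# List the GGUF files in this directory along with their sizes
ls -lh *.gguf
```
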
#### Step 2: Create a Modelfile
Create a new file named `Modelfile` (no extension) with the following content:

**For BF16 version (highest quality):**
```
FROM ./palmyra-mini-thinking-a-BF16.gguf
PARAMETER temperature 0.3
PARAMETER num_ctx 4096
PARAMETER top_k 40
PARAMETER top_p 0.95
SYSTEM "You are Palmyra, an advanced AI assistant created by Writer. You are helpful and honest. You provide accurate and detailed responses while being concise and clear."
```

**For Q8_0 version (balanced):**
```
FROM ./palmyra-mini-thinking-a-Q8_0.gguf
PARAMETER temperature 0.3
PARAMETER num_ctx 4096
PARAMETER top_k 40
PARAMETER top_p 0.95
SYSTEM "You are Palmyra, an advanced AI assistant created by Writer. You are helpful and honest. You provide accurate and detailed responses while being concise and clear."
```

#### Step 3: Import the Model
```bash
ollama create palmyra-mini-thinking-a -f Modelfile
```

#### Step 4: Run the Model
```bash
ollama run palmyra-mini-thinking-a
```

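Beyond the interactive prompt, you can pass a one-off prompt on the command line, or call Ollama's local REST API, which listens on port 11434 by default (the prompt text here is just an example):

```bash
# One-off prompt without entering the interactive session
ollama run palmyra-mini-thinking-a "Summarize the GGUF format in two sentences."

# The same request through the local REST API
curl http://localhost:11434/api/generate -d '{
  "model": "palmyra-mini-thinking-a",
  "prompt": "Summarize the GGUF format in two sentences.",
  "stream": false
}'
```
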
### Method 2: Using Absolute Paths

If you prefer to create the Modelfile elsewhere, use absolute paths:

```
FROM "/Users/[user]/Documents/Model Weights/SPW2 Mini Launch/palmyra-mini-thinking-a/GGUF/palmyra-mini-thinking-a FIXED GGUF-BF16/palmyra-mini-thinking-a-BF16.gguf"
PARAMETER temperature 0.3
PARAMETER num_ctx 4096
SYSTEM "You are Palmyra, an advanced AI assistant created by Writer."
```

Then create and run:
```bash
ollama create palmyra-mini-thinking-a -f /path/to/your/Modelfile
ollama run palmyra-mini-thinking-a
```

## ⚙️ Advanced Configuration

### Custom Modelfile Parameters

You can customize the model's behavior by modifying these parameters in your Modelfile (or override them per request, as sketched after the parameter explanations):

```
FROM ./palmyra-mini-thinking-a-BF16.gguf

# Sampling parameters (explained in the next section)
PARAMETER temperature 0.3
PARAMETER top_k 40
PARAMETER top_p 0.95
PARAMETER repeat_penalty 1.1
PARAMETER num_ctx 4096
PARAMETER num_predict 512

# Stop sequences
PARAMETER stop "<|end|>"
PARAMETER stop "<|endoftext|>"

# System message
SYSTEM """You are Palmyra, an advanced AI assistant created by Writer.
You are helpful, harmless, and honest. You provide accurate and detailed
responses while being concise and clear. You can assist with a wide range
of tasks including writing, analysis, coding, and general questions."""
```

### Parameter Explanations

- **temperature** (0.1-2.0): Controls randomness (lower = more focused, higher = more creative)
- **top_k** (1-100): Limits vocabulary to the top K tokens
- **top_p** (0.1-1.0): Nucleus sampling threshold
- **repeat_penalty** (0.8-1.5): Reduces repetitive text
- **num_ctx**: Context window size (how much text the model remembers)
- **num_predict**: Maximum tokens to generate per response

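These options can also be overridden per request instead of being baked into the Modelfile. A minimal sketch against the local REST API (default port 11434; the prompt and values are just examples):

```bash
# Override sampling options for a single request via the "options" field
curl http://localhost:11434/api/generate -d '{
  "model": "palmyra-mini-thinking-a",
  "prompt": "Write a haiku about quantization.",
  "stream": false,
  "options": {
    "temperature": 0.7,
    "top_p": 0.9,
    "num_predict": 128
  }
}'
```
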
## 🛠️ Useful Commands

### List Available Models
```bash
ollama list
```

### View Model Information
```bash
ollama show palmyra-mini-thinking-a
```

### View Modelfile of Existing Model
```bash
ollama show --modelfile palmyra-mini-thinking-a
```

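The last command is useful for iterating on a configuration: export the current Modelfile, edit it, and rebuild. A sketch (the `Modelfile.new` filename is just an example):

```bash
# Export the model's current Modelfile to a file
ollama show --modelfile palmyra-mini-thinking-a > Modelfile.new
# ... edit Modelfile.new in your editor ...
# Rebuild the model from the edited file
ollama create palmyra-mini-thinking-a -f Modelfile.new
```
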
### Remove Model
```bash
ollama rm palmyra-mini-thinking-a
```

### Pull Model from Hugging Face (Alternative Method)
If the model were available on Hugging Face, you could also use:
```bash
ollama run hf.co/username/repository-name
```

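Hugging Face's Ollama integration also lets you select a specific quantization by appending its tag to the repository name (the repository here is a placeholder):

```bash
# Run a specific quantization from a Hugging Face GGUF repository
ollama run hf.co/username/repository-name:Q8_0
```
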
## 🔍 Choosing the Right Quantization

| Version | File Size | Quality | Speed | RAM Usage | Best For |
|---------|-----------|---------|-------|-----------|----------|
| BF16 | Largest | Highest | Slower | ~16GB+ | Production, highest accuracy |
| Q8_0 | Medium | High | Faster | ~8GB+ | Balanced performance |

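A quick way to ground this choice is to compare the on-disk file sizes against your free memory; as a rough rule of thumb, the model file plus context overhead should fit in RAM:

```bash
# Compare the on-disk sizes of the two quantizations
ls -lh palmyra-mini-thinking-a-BF16.gguf palmyra-mini-thinking-a-Q8_0.gguf
```
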
## 🐛 Troubleshooting

### Common Issues

**1. "File not found" error:**
- Verify the file path in your Modelfile
- Use absolute paths if relative paths don't work
- Ensure the GGUF file exists in the specified location

**2. "Out of memory" error:**
- Try the Q8_0 quantization instead of BF16
- Reduce the `num_ctx` parameter (see the sketch below)
- Close other applications to free up RAM

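The context window can also be shrunk for the current session from inside the interactive prompt, without editing and re-creating the model:

```
>>> /set parameter num_ctx 2048
```
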
**3. Model runs but gives poor responses:**
- Adjust the temperature and sampling parameters
- Modify the system message
- Try a higher quality quantization

**4. Slow performance:**
- Use the Q8_0 quantization for faster inference
- Reduce `num_ctx` if you don't need long context
- Ensure you have sufficient RAM/VRAM (check with `ollama ps`, shown below)

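To check whether the loaded model fits in GPU memory or has spilled over to the CPU, inspect Ollama's process list while the model is running:

```bash
# Show loaded models, their memory footprint, and CPU/GPU placement
ollama ps
```
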
### Getting Help

- Ollama documentation: [https://github.com/ollama/ollama](https://github.com/ollama/ollama)
- Ollama Discord community
- Hugging Face GGUF documentation: [https://huggingface.co/docs/hub/en/gguf](https://huggingface.co/docs/hub/en/gguf)

## 📚 Additional Resources

- [Ollama Official Documentation](https://github.com/ollama/ollama/blob/main/docs/README.md)
- [Hugging Face Ollama Integration Guide](https://huggingface.co/docs/hub/en/ollama)
- [GGUF Format Documentation](https://huggingface.co/docs/hub/en/gguf)
- [Modelfile Syntax Reference](https://github.com/ollama/ollama/blob/main/docs/modelfile.md)

## 🎯 Example Usage

Once your model is running, you can interact with it:

```
>>> Hello! Can you tell me about yourself?

Hello! I'm Palmyra, an AI assistant created by Writer. I'm designed to be helpful,
harmless, and honest in my interactions. I can assist you with a wide variety of
tasks including writing, analysis, answering questions, coding help, and general
conversation. I aim to provide accurate and detailed responses while being concise
and clear. How can I help you today?

>>> What's the significance of rabbits to Fibonacci?

Rabbits played a significant role in the development of the Fibonacci sequence...
```

## 📄 License

Please refer to the original model license and terms of use from Writer/palmyra-mini-thinking-a.

---

**Note**: This guide is based on Ollama's official documentation and community best practices. For the most up-to-date information, always refer to the [official Ollama documentation](https://github.com/ollama/ollama).