---
base_model: google/gemma-2-2b-it
library_name: transformers
model_name: gemma-2-2B-it-thinking-function_calling-V0
tags:
- generated_from_trainer
- trl
- sft
- function-calling
- thinking-layer
license: mit
---

# Model Card for gemma-2-2B-it-thinking-function_calling-V0

This model is a fine-tuned version of [google/gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it), adapted for function calling with an added "Thinking Layer". It was fine-tuned with [TRL](https://github.com/huggingface/trl) and produces an explicit reasoning step before each function call.

## 🎯 Key Features

- **Function Calling**: Generation of structured function calls
- **Thinking Layer**: Explicit reasoning process before execution
- **Supported Functions**:
  - `convert_currency`: Currency conversion
  - `calculate_distance`: Distance calculation between locations

## 🚀 Quick Start

### Function Calling Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_name = "Sellid/gemma-2-2B-it-thinking-function_calling-V0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example for currency conversion
prompt = """<bos><start_of_turn>human
You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags.
Here are the available tools:<tools>[{
    "type": "function",
    "function": {
        "name": "convert_currency",
        "description": "Convert from one currency to another",
        "parameters": {
            "type": "object",
            "properties": {
                "amount": {"type": "number", "description": "The amount to convert"},
                "from_currency": {"type": "string", "description": "The currency to convert from"},
                "to_currency": {"type": "string", "description": "The currency to convert to"}
            },
            "required": ["amount", "from_currency", "to_currency"]
        }
    }
}]</tools>

Hi, I need to convert 500 USD to Euros. Can you help me with that?<end_of_turn><eos>
<start_of_turn>model"""

# Generate response
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0]))
```

## 🤖 Model Architecture

The model uses a special prompt structure with three main components:

1. **Tools Definition**:
```xml
<tools>
[Function signatures in JSON format]
</tools>
```

2. **Thinking Layer**:
```xml
<think>
[Explicit thinking process of the model]
</think>
```

3. **Function Call**:
```xml
<tool_call>
{
    "name": "function_name",
    "arguments": {
        "param1": "value1",
        ...
    }
}
</tool_call>
```
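Putting these pieces together, a generated response can be parsed back into its reasoning trace and a structured call. The `parse_response` helper below is an illustrative sketch (it is not part of the model or any library) and assumes the output follows the tag structure shown above:

```python
import json
import re

def parse_response(text: str):
    """Split a model response into its thinking trace and parsed tool call.

    Returns (thinking, call_dict); call_dict is None if no <tool_call> is found.
    """
    think_match = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    call_match = re.search(r"<tool_call>(.*?)</tool_call>", text, re.DOTALL)
    thinking = think_match.group(1).strip() if think_match else ""
    call = json.loads(call_match.group(1)) if call_match else None
    return thinking, call

# Hypothetical response in the format described above
example = """<think>The user wants 500 USD in EUR, so convert_currency fits.</think>
<tool_call>
{"name": "convert_currency", "arguments": {"amount": 500, "from_currency": "USD", "to_currency": "EUR"}}
</tool_call>"""

thinking, call = parse_response(example)
print(call["name"])  # convert_currency
```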

### Thinking Layer Process

The Thinking Layer executes the following steps:
1. **Analysis** of user request
2. **Selection** of appropriate function
3. **Validation** of parameters
4. **Generation** of function call
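As a rough illustration of the validation step, a caller can check a generated tool call against the function's JSON schema before executing it. The `validate_arguments` helper below is hypothetical, not something the model or TRL provides:

```python
def validate_arguments(call: dict, schema: dict) -> list[str]:
    """Return the names of required parameters missing from a tool call.

    `schema` is one entry from the <tools> list shown in the Quick Start.
    """
    params = schema["function"]["parameters"]
    required = params.get("required", [])
    provided = call.get("arguments", {})
    return [name for name in required if name not in provided]

convert_currency_schema = {
    "type": "function",
    "function": {
        "name": "convert_currency",
        "description": "Convert from one currency to another",
        "parameters": {
            "type": "object",
            "properties": {
                "amount": {"type": "number"},
                "from_currency": {"type": "string"},
                "to_currency": {"type": "string"},
            },
            "required": ["amount", "from_currency", "to_currency"],
        },
    },
}

# A call that forgot one required parameter
call = {"name": "convert_currency",
        "arguments": {"amount": 500, "from_currency": "USD"}}
print(validate_arguments(call, convert_currency_schema))  # ['to_currency']
```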

## 📊 Performance & Limitations

- **Memory Requirements**: ~4GB RAM
- **Inference Time**: ~1-2 seconds/request
- **Supported Platforms**:
  - CPU
  - NVIDIA GPUs (CUDA)
  - Apple Silicon (MPS)
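Device selection across these platforms can be sketched as follows (a minimal example; adjust dtype and memory settings to your hardware):

```python
import torch

def pick_device() -> str:
    """Return the best available backend name: 'cuda', 'mps', or 'cpu'."""
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"

device = pick_device()
```

After loading, move the model with `model.to(device)` and the tokenized inputs with `inputs.to(device)` before calling `generate`.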

### Limitations

- Limited to pre-trained functions
- No function call chaining
- No dynamic function extension

## 🔧 Training Details

The model was trained with SFT (Supervised Fine-Tuning).

### Framework Versions

- TRL: 0.15.1
- Transformers: 4.49.0
- PyTorch: 2.7.0.dev20250222
- Datasets: 3.3.2
- Tokenizers: 0.21.0

## 📚 Citations

If you use this model, please cite TRL:

```bibtex
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}
```

And this model:

```bibtex
@misc{gemma-function-calling-thinking,
    title        = {Gemma Function-Calling with Thinking Layer},
    author       = {Sellid},
    year         = 2024,
    publisher    = {Hugging Face Model Hub},
    howpublished = {\url{https://huggingface.co/Sellid/gemma-2-2B-it-thinking-function_calling-V0}}
}
```