	---
	base_model: google/gemma-2-2b-it
	library_name: transformers
	model_name: gemma-2-2B-it-thinking-function_calling-V0
	tags:
	- generated_from_trainer
	- trl
	- sft
	- function-calling
	- thinking-layer
	license: mit
	---

	# Model Card for gemma-2-2B-it-thinking-function_calling-V0

	This model is a fine-tuned version of [google/gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it), specifically trained for function calling with an added "Thinking Layer". The model was trained using [TRL](https://github.com/huggingface/trl) and incorporates an explicit thinking process before making function calls.

	## 🎯 Key Features

	- Function Calling: Generation of structured function calls
	- Thinking Layer: Explicit reasoning process before execution
	- Supported Functions:
	  - `convert_currency`: Currency conversion
	  - `calculate_distance`: Distance calculation between locations

	## 🚀 Quick Start

	### Function Calling Example

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	import torch

	# Load model and tokenizer
	model_name = "Sellid/gemma-2-2B-it-thinking-function_calling-V0"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForCausalLM.from_pretrained(model_name)

	# Example for currency conversion
	prompt = """<bos><start_of_turn>human
	You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags.
	Here are the available tools:<tools>[{
	"type": "function",
	"function": {
	"name": "convert_currency",
	"description": "Convert from one currency to another",
	"parameters": {
	"type": "object",
	"properties": {
	"amount": {"type": "number", "description": "The amount to convert"},
	"from_currency": {"type": "string", "description": "The currency to convert from"},
	"to_currency": {"type": "string", "description": "The currency to convert to"}
	},
	"required": ["amount", "from_currency", "to_currency"]
	}
	}
	}]</tools>

	Hi, I need to convert 500 USD to Euros. Can you help me with that?<end_of_turn><eos>
	<start_of_turn>model"""

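	# Generate response
	# The prompt already contains <bos>, so skip the tokenizer's automatic special tokens
	inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
	outputs = model.generate(**inputs, max_new_tokens=200)
	print(tokenizer.decode(outputs[0]))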
	```
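
The prompt in the example above is written by hand. As a minimal sketch, the same layout can be assembled from a list of tool definitions. The helper name `build_prompt` is hypothetical and not part of this repository; the turn markers and system text are copied from the Quick Start example.

```python
import json

def build_prompt(tools, user_message):
    """Assemble a prompt in the layout used by the Quick Start example.

    `build_prompt` is an illustrative helper, not an API shipped with the model.
    """
    system = (
        "You are a function calling AI model. You are provided with function "
        "signatures within <tools></tools> XML tags.\n"
        "Here are the available tools:"
    )
    # Serialize the tool definitions as JSON inside the <tools> tags
    tools_block = "<tools>" + json.dumps(tools, indent=2) + "</tools>"
    return (
        "<bos><start_of_turn>human\n"
        f"{system}{tools_block}\n\n"
        f"{user_message}<end_of_turn><eos>\n"
        "<start_of_turn>model"
    )

# Tool definition taken from the Quick Start example
convert_currency_tool = {
    "type": "function",
    "function": {
        "name": "convert_currency",
        "description": "Convert from one currency to another",
        "parameters": {
            "type": "object",
            "properties": {
                "amount": {"type": "number", "description": "The amount to convert"},
                "from_currency": {"type": "string", "description": "The currency to convert from"},
                "to_currency": {"type": "string", "description": "The currency to convert to"},
            },
            "required": ["amount", "from_currency", "to_currency"],
        },
    },
}

prompt = build_prompt(
    [convert_currency_tool],
    "Hi, I need to convert 500 USD to Euros. Can you help me with that?",
)
print(prompt)
```

Because the returned string already starts with `<bos>`, it is probably best tokenized with `add_special_tokens=False` so that the marker is not added twice.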

	## 🤖 Model Architecture

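	The model relies on a tagged text format with three main components. The tools definition is supplied in the prompt; the thinking block and the function call are generated by the model: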

	1. Tools Definition:
	```xml
	<tools>
	[Function signatures in JSON format]
	</tools>
	```

	2. Thinking Layer:
	```xml
	<think>
	[Explicit thinking process of the model]
	</think>
	```

	3. Function Call:
	```xml
	<tool_call>
	{
	  "name": "function_name",
	  "arguments": {
	    "param1": "value1",
	    ...
	  }
	}
	</tool_call>
	```
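
The tagged format above lends itself to simple post-processing. The following is a rough sketch of how a caller might pull the thinking text and the function call out of a generated response. The function name `parse_response` and the sample string are illustrative assumptions; the sample is hand-written and is not real model output.

```python
import json
import re

def parse_response(text):
    """Extract the <think> text and the <tool_call> JSON from a response.

    Returns (thinking, tool_call); either item is None if it is missing
    or, for the tool call, if its body is not valid JSON.
    """
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    call = re.search(r"<tool_call>(.*?)</tool_call>", text, re.DOTALL)

    thinking = think.group(1).strip() if think else None
    tool_call = None
    if call:
        try:
            tool_call = json.loads(call.group(1))
        except json.JSONDecodeError:
            tool_call = None  # malformed JSON: let the caller decide how to recover
    return thinking, tool_call

# Hand-written sample in the documented format (not actual model output)
sample = """<think>
The user wants to convert 500 USD to EUR, so convert_currency fits.
</think>
<tool_call>
{"name": "convert_currency", "arguments": {"amount": 500, "from_currency": "USD", "to_currency": "EUR"}}
</tool_call>"""

thinking, tool_call = parse_response(sample)
print(thinking)
print(tool_call)
```

Returning `None` rather than raising keeps the sketch tolerant of truncated generations, for example when `max_new_tokens` cuts the response off before the closing tag.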

	### Thinking Layer Process

	The Thinking Layer executes the following steps:
	1. Analysis of user request
	2. Selection of appropriate function
	3. Validation of parameters
	4. Generation of function call
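
These steps happen inside the generated text, so a caller may still want to repeat step 3 before executing anything. The sketch below checks a call's arguments against the `required` list and the declared types of a tool definition. `validate_arguments` is a hypothetical helper, and the type mapping only covers the two JSON types used in this card's example.

```python
def validate_arguments(tool, arguments):
    """Return a list of problems; an empty list means the call looks valid.

    Hypothetical client-side check, not part of the model or of TRL.
    """
    params = tool["function"]["parameters"]
    properties = params.get("properties", {})
    # Only the JSON types that appear in this card's example are mapped
    type_map = {"number": (int, float), "string": str}

    problems = []
    for name in params.get("required", []):
        if name not in arguments:
            problems.append(f"missing required parameter: {name}")
    for name, value in arguments.items():
        if name not in properties:
            problems.append(f"unexpected parameter: {name}")
            continue
        declared = properties[name].get("type")
        expected = type_map.get(declared)
        # bool is a subclass of int in Python, so reject it explicitly
        if expected is not None and (isinstance(value, bool) or not isinstance(value, expected)):
            problems.append(f"{name} should be of type {declared}")
    return problems

# Parameter schema taken from the Quick Start example (descriptions omitted)
tool = {
    "type": "function",
    "function": {
        "name": "convert_currency",
        "parameters": {
            "type": "object",
            "properties": {
                "amount": {"type": "number"},
                "from_currency": {"type": "string"},
                "to_currency": {"type": "string"},
            },
            "required": ["amount", "from_currency", "to_currency"],
        },
    },
}

good = validate_arguments(tool, {"amount": 500, "from_currency": "USD", "to_currency": "EUR"})
bad = validate_arguments(tool, {"amount": "500", "from_currency": "USD"})
print(good)
print(bad)
```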

	## 📊 Performance & Limitations

	- Memory Requirements: ~4 GB RAM
	- Inference Time: ~1-2 seconds/request
	- Supported Platforms:
	  - CPU
	  - NVIDIA GPUs (CUDA)
	  - Apple Silicon (MPS)
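
To target one of the platforms listed above, the device can be chosen at runtime. This is a small sketch, assuming a PyTorch build that exposes `torch.cuda.is_available()` and `torch.backends.mps.is_available()`; the helper name `pick_device` is hypothetical.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu", preferring an accelerator when present."""
    try:
        import torch
    except ImportError:
        return "cpu"  # without PyTorch there is nothing to probe

    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds may not ship the MPS backend at all
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"

device = pick_device()
print(device)
```

With the Quick Start code, the result would typically be applied as `model.to(device)` and `inputs = inputs.to(device)` before calling `model.generate`.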

	### Limitations

	- Limited to the functions seen during training
	- No function call chaining
	- No dynamic function extension
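
Given these limitations, one cautious way to execute a parsed call is to dispatch through a fixed allowlist and run exactly one call per response. The sketch below is an assumption about client code, not something shipped with the model. Both handlers are stubs, and because this card does not specify the parameters of `calculate_distance`, its stub accepts arbitrary keyword arguments.

```python
def convert_currency(amount, from_currency, to_currency):
    # Stub: a real implementation would query an exchange-rate source
    return f"would convert {amount} {from_currency} to {to_currency}"

def calculate_distance(**arguments):
    # Stub: parameter names are not documented in this card, so accept any
    return f"would calculate distance with {sorted(arguments)}"

# Fixed allowlist: only the two functions the model card lists
HANDLERS = {
    "convert_currency": convert_currency,
    "calculate_distance": calculate_distance,
}

def dispatch(tool_call):
    """Run a single parsed tool call, rejecting names outside the allowlist."""
    name = tool_call.get("name")
    if name not in HANDLERS:
        raise ValueError(f"unsupported function: {name!r}")
    return HANDLERS[name](**tool_call.get("arguments", {}))

result = dispatch({
    "name": "convert_currency",
    "arguments": {"amount": 500, "from_currency": "USD", "to_currency": "EUR"},
})
print(result)
```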

	## 🔧 Training Details

	The model was trained with supervised fine-tuning (SFT) using TRL.

	### Framework Versions

	- TRL: 0.15.1
	- Transformers: 4.49.0
	- PyTorch: 2.7.0.dev20250222
	- Datasets: 3.3.2
	- Tokenizers: 0.21.0

	## 📚 Citations

	If you use this model, please cite TRL:

	```bibtex
	@misc{vonwerra2022trl,
	    title        = {{TRL: Transformer Reinforcement Learning}},
	    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
	    year         = 2020,
	    journal      = {GitHub repository},
	    publisher    = {GitHub},
	    howpublished = {\url{https://github.com/huggingface/trl}}
	}
	```

	And this model:

	```bibtex
	@misc{gemma-function-calling-thinking,
	    title        = {Gemma Function-Calling with Thinking Layer},
	    author       = {Sellid},
	    year         = 2024,
	    publisher    = {Hugging Face Model Hub},
	    howpublished = {\url{https://huggingface.co/Sellid/gemma-2-2B-it-thinking-function_calling-V0}}
	}
	```