---
license: cc-by-nc-nd-4.0
base_model:
- Qwen/Qwen2.5-1.5B
language:
- en
- de
tags:
- Function_Call
- Automotive
- SLM
- GGUF
---

# Qwen2.5-1.5B-Auto-FunctionCaller

## Model Details

* **Model Name:** Qwen2.5-1.5B-Auto-FunctionCaller
* **Base Model:** [Qwen/Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B)
* **Model Type:** Language Model fine-tuned for Function Calling.
* **Recommended Quantization:** `Qwen2.5-1.5B-Auto-FunctionCaller.Q4_K_M_I.gguf`
    * This GGUF file uses Q4\_K\_M quantization with an importance matrix. It is recommended because it offered the best balance between accuracy and computational efficiency (inference speed, memory usage) in our evaluation.

## Intended Use

* **Primary Use:** Function calling extraction from natural language queries within an automotive context. The model is designed to identify user intent and extract relevant parameters (arguments/slots) for triggering vehicle functions or infotainment actions.
* **Research Context:** This model was specifically developed and fine-tuned as part of a research publication investigating the feasibility and performance of Small Language Models (SLMs) for function-calling tasks in resource-constrained automotive environments.
* **Target Environment:** Embedded systems or edge devices within vehicles where computational resources may be limited.
* **Out-of-Scope Uses:** General conversational AI, creative writing, tasks outside automotive function calling, safety-critical vehicle control.
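
As a sketch of how a client might consume the model's output: the card does not publish the exact output schema, so the function name, argument keys, and JSON layout below are illustrative assumptions, not the model's documented format.

```python
import json

def parse_function_call(model_output: str) -> dict:
    """Parse a JSON function call emitted by the model and check its shape.

    Hypothetical schema: a top-level object with "name" and "arguments" keys.
    """
    call = json.loads(model_output)
    if "name" not in call or "arguments" not in call:
        raise ValueError("output is missing 'name' or 'arguments'")
    return call

# e.g. for the query "Set the driver-side temperature to 21 degrees":
raw = '{"name": "set_temperature", "arguments": {"zone": "driver", "value": 21}}'
call = parse_function_call(raw)
print(call["name"], call["arguments"])
```

Validating the shape before dispatching to a vehicle function keeps malformed generations from reaching downstream systems, which matters in the non-critical but user-facing contexts this model targets.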

## Performance Metrics

The following metrics were evaluated on the `Qwen2.5-1.5B-Auto-FunctionCaller.Q4_K_M_I.gguf` model:

* **Evaluation Setup:**
    * Total Evaluation Samples: 2074
* **Performance:**
    * **Exact Match Accuracy:** 0.8414
    * **Average Component Accuracy:** 0.9352
* **Efficiency & Confidence:**
    * **Throughput:** 10.31 tokens/second
    * **Latency (Per Token):** 0.097 seconds
    * **Latency (Per Instruction):** 0.427 seconds
    * **Average Model Confidence:** 0.9005
    * **Calibration Error:** 0.0854

*Note: Latency and throughput figures are hardware-dependent and should be benchmarked on the target deployment environment.*
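
The calibration error above measures how well the model's confidence tracks its actual accuracy. As a sketch only (the card does not specify the binning or protocol used in the evaluation), a standard expected-calibration-error computation looks like:

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin predictions by confidence; average |accuracy - confidence| per bin,
    weighted by bin size. Illustrative, not the paper's exact methodology."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # clamp conf == 1.0 into last bin
        bins[idx].append((conf, ok))
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1 for _, ok in b if ok) / len(b)
        ece += (len(b) / n) * abs(accuracy - avg_conf)
    return ece

# Well-calibrated toy data (80% confidence, 4/5 correct) yields an ECE near zero:
print(expected_calibration_error([0.8] * 5, [True, True, True, True, False]))
```

A low calibration error alongside high average confidence, as reported above, suggests the confidence scores could be used for fallback logic (e.g. asking the user to confirm low-confidence calls).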

## Limitations

* **Domain Specificity:** Performance is optimized for automotive function calling. Generalization to other domains or complex, non-structured conversations may be limited.
* **Quantization Impact:** The `Q4_K_M_I` quantization significantly improves efficiency but may result in a slight reduction in accuracy compared to higher-precision versions (e.g., FP16).
* **Complex Queries:** May struggle with highly nested, ambiguous, or unusually phrased requests not well-represented in the fine-tuning data.
* **Safety Criticality:** This model is **not** intended or validated for safety-critical vehicle operations (e.g., braking, steering). Use should be restricted to non-critical systems like infotainment and comfort controls.
* **Bias:** Like any model, performance and fairness depend on the underlying data. Biases present in the fine-tuning or evaluation datasets may be reflected in the model's behavior.

## Training Data (Summary)

The model was fine-tuned on a synthetic dataset specifically curated for automotive function calling tasks. Details will be referenced in the associated publication.

## Citation

- Systematic Deployment of Small Language Models to Edge Devices - FEV.io
- 2025 JSAE Annual Congress (Spring) / Publication code: 20255372