FinChat-XS

FinChat-XS is a lightweight financial domain language model designed to answer questions about finance, markets, investments, and economics in a conversational style.

Model Overview

FinChat-XS is a fine-tuned version of HuggingFaceTB/SmolLM2-360M-Instruct, optimized for financial domain conversations using LoRA (Low-Rank Adaptation). With only 360M parameters, it offers a balance between performance and efficiency, making it accessible for deployment on consumer hardware.

The model combines professional financial knowledge with a conversational communication style, making it suitable for applications where users need expert financial information delivered in an approachable manner.

Repository & Resources

For full code, training process, and additional details, visit the GitHub repository:

🔗 FinLLMOpt Repository

How the Model Was Created

FinChat-XS was developed through a focused fine-tuning process designed to enhance financial domain expertise while maintaining conversational abilities:

  1. Base model selection: Started with SmolLM2-360M-Instruct, a lightweight instruction-tuned language model

  2. Dataset preparation:

    • Filtered the sujet-ai/Sujet-Finance-Instruct-177k dataset to focus on QA and conversational QA examples
    • Applied length filtering to keep responses below 500 characters
    • Augmented short conversational QA examples to improve conciseness
  3. Fine-tuning approach:

    • Applied LoRA (Low-Rank Adaptation) to efficiently fine-tune the model
    • Targeted key attention modules (q_proj, v_proj)
    • Used rank r=4 and alpha=16
    • Training configuration:
      • Batch size: 2 (effective batch size 16 with gradient accumulation)
      • Learning rate: 1.5e-4
      • BF16 precision
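
A minimal sketch of the filtering logic in step 2, using plain Python over hypothetical records (the field names `task_type` and `answer` are assumptions for illustration, not the dataset's actual schema):

```python
# Hypothetical records standing in for Sujet-Finance-Instruct-177k rows;
# the real dataset's column names may differ.
records = [
    {"task_type": "qa", "answer": "Stocks represent ownership; bonds are debt."},
    {"task_type": "sentiment", "answer": "positive"},          # not QA, dropped
    {"task_type": "conversational_qa", "answer": "x" * 600},   # too long, dropped
]

def keep(rec):
    # Keep only QA / conversational QA examples with responses under 500 characters
    return rec["task_type"] in {"qa", "conversational_qa"} and len(rec["answer"]) < 500

filtered = [r for r in records if keep(r)]
print(len(filtered))  # 1
```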

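To make the r=4, alpha=16 configuration concrete, here is a toy numpy sketch of how a LoRA update modifies a frozen projection weight (the dimensions are illustrative, not SmolLM2's actual sizes):

```python
import numpy as np

d, r, alpha = 64, 4, 16          # toy hidden size; rank and alpha as configured above
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))  # frozen base weight (e.g. q_proj)
A = rng.standard_normal((r, d))  # trainable low-rank factor
B = np.zeros((d, r))             # initialized to zero, so the adapter starts as a no-op

x = rng.standard_normal(d)
scaling = alpha / r              # LoRA scales the low-rank update by alpha / r
y = W @ x + scaling * (B @ (A @ x))

assert np.allclose(y, W @ x)     # before training, outputs match the base model exactly
print(int(scaling))              # 4
```

Only A and B are trained, which is why the adapter is so small relative to the base model.
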
Challenges

The primary challenge encountered during the development of FinChat-XS was the lack of high-quality conversational datasets specifically focused on personal finance. While the Sujet-Finance-Instruct-177k dataset provided valuable financial QA examples, there remains a notable gap in naturalistic, multi-turn conversations about personal financial scenarios.

Why Use This Model?

FinChat-XS offers several advantages for specific use cases:

  • Efficient deployment: At only 362MB, it can run on devices with limited resources
  • Financial domain knowledge: Fine-tuned specifically on financial QA data
  • Balanced communication style: Combines professional financial knowledge with conversational delivery
  • Low deployment cost: Requires significantly less computational resources than larger models
  • Customizable: The LoRA adapter can be mixed with other adapters or further fine-tuned

Ideal for:

  • Embedded financial assistants in mobile apps
  • Personal financial planning tools
  • Educational applications about finance and investing
  • Customer service automation for financial institutions
  • Quick deployment scenarios where larger models aren't practical

How to Use the Model

Basic Usage with Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "oopere/FinChat-XS"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# Create a conversation
messages = [
    {"role": "user", "content": "What's the difference between stocks and bonds?"}
]

# Format the prompt using the chat template
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Tokenize the prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate a response
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.2
)

# Decode and print the response
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)

Optimized Inference with 8-bit Quantization

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# Configure 8-bit quantization
# (bnb_4bit_compute_dtype only applies to 4-bit loading, so it is omitted here)
bnb_config = BitsAndBytesConfig(
    load_in_8bit=True
)

# Load model with quantization
model = AutoModelForCausalLM.from_pretrained(
    "oopere/FinChat-XS", 
    quantization_config=bnb_config,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("oopere/FinChat-XS")

# Continue with the same usage pattern as above
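
As a rough back-of-envelope check of why 8-bit loading helps (assuming weight storage dominates, and ignoring activations and framework overhead):

```python
params = 360e6                     # ~360M parameters
bf16_gb = params * 2 / 1e9         # BF16: 2 bytes per weight
int8_gb = params * 1 / 1e9         # INT8: 1 byte per weight
print(round(bf16_gb, 2), round(int8_gb, 2))  # 0.72 0.36
```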

Using with LoRA Adapter Only

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-360M-Instruct")
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-360M-Instruct")

# Load LoRA adapter
peft_model = PeftModel.from_pretrained(base_model, "oopere/qa-adapterFinChat-XS")

# Continue with the same usage pattern as above

Limitations & Considerations

While FinChat-XS performs well in many financial conversation scenarios, users should be aware of these limitations:

  1. Knowledge limitations: The model's knowledge is limited to its training data and inherits the knowledge cutoff of the base model (SmolLM2).

  2. Size trade-offs: As a 360M parameter model, it has less capacity than larger models (7B+) and may provide less nuanced or detailed responses on complex topics.

  3. Financial advice disclaimer: The model is not a certified financial advisor and should not be used for making investment decisions. Its responses should be considered educational, not professional financial advice.

  4. Domain boundaries: While focused on finance, the model may struggle with highly specialized financial topics or recent developments not covered in its training data.

  5. Hallucination potential: Like all language models, FinChat-XS may occasionally generate plausible-sounding but incorrect information, especially when asked about specific numerical data or complex financial details.

  6. Style variations: The model balances formal financial knowledge with a conversational style, which may not be appropriate for all professional contexts.

  7. Regulatory compliance: This model has not been specifically audited for compliance with financial regulations in various jurisdictions.

Citation

If you use FinChat-XS in your research or applications, please consider citing it as:

@misc{oopere2025finchatxs,
  author = {Martra, P.},
  title = {FinChat-XS: A Lightweight Financial Domain Chat Language Model},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/oopere/FinChat-XS}}
}

Acknowledgements

  • HuggingFaceTB for creating the SmolLM2 model series
  • Sujet AI for their financial instruction dataset
  • Hugging Face for providing the infrastructure and tools for model development