# TinyBERT Financial News Sentiment Analysis
A lightweight TinyBERT model fine-tuned for financial news sentiment analysis, achieving 89.2% accuracy with a model size under 60 MB and CPU inference latency under 50 ms.
## Model Details
- Model Type: Text Classification (Sentiment Analysis)
- Architecture: TinyBERT (4-layer, 312-hidden)
- Pretrained Base: `huawei-noah/TinyBERT_General_4L_312D`
- Fine-tuned Dataset: Financial news headlines with sentiment labels
- Input: Financial news text (max 128 tokens)
- Output: Sentiment classification (Negative/Neutral/Positive)
## Performance
| Metric | Value |
|---|---|
| Accuracy | 89.2% |
| F1-Score | 0.87 |
| Model Size | 54.84 MB |
| CPU Latency | 28 ms |
| Quantized Size | 5.3 MB |
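The quantization procedure behind the 5.3 MB figure is not documented here; dynamic INT8 quantization of the linear layers is a common way to get this kind of reduction, so the following is a hedged sketch on a stand-in module (the `nn.Sequential` below is a placeholder, not the actual fine-tuned model):

```python
import io

import torch
import torch.nn as nn

def size_mb(module: nn.Module) -> float:
    """Serialize a module's weights to memory and report the size in MB."""
    buf = io.BytesIO()
    torch.save(module.state_dict(), buf)
    return buf.getbuffer().nbytes / 1e6

# Stand-in for the fine-tuned TinyBERT (312-hidden, 3 output classes).
model = nn.Sequential(nn.Linear(312, 312), nn.ReLU(), nn.Linear(312, 3))

# Dynamic quantization rewrites Linear layers to store INT8 weights.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

print(f"fp32: {size_mb(model):.2f} MB, int8: {size_mb(quantized):.2f} MB")
```

The same call applied to the full model would produce the size drop reported in the table above, at a small cost in accuracy that should be re-measured after quantizing.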
## Usage

### Direct Inference with Pipeline
```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="mikeysharma/finance-sentiment-analysis"
)

result = classifier("$TSLA - Morgan Stanley upgrades Tesla to Overweight")
print(result)
```
### Using Model & Tokenizer Directly
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("mikeysharma/finance-sentiment-analysis")
model = AutoModelForSequenceClassification.from_pretrained("mikeysharma/finance-sentiment-analysis")

inputs = tokenizer(
    "$BYND - JPMorgan cuts Beyond Meat price target",
    return_tensors="pt",
    truncation=True,
    max_length=128
)

with torch.no_grad():
    outputs = model(**inputs)

predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
print(predictions)
```
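The softmax above yields class probabilities but not a label string. Assuming the Negative/Neutral/Positive label order described later in this card (check `model.config.id2label` for the authoritative mapping), the argmax step can be written in plain Python:

```python
import math

# Hypothetical logits for one headline; in practice use outputs.logits[0].tolist()
# from the snippet above.
logits = [-1.2, 0.3, 2.1]
exps = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]

labels = ["Negative", "Neutral", "Positive"]  # assumed order; verify via model.config.id2label
pred = labels[probs.index(max(probs))]
print(pred)  # Positive
```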
### ONNX Runtime (Recommended for Production)
```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mikeysharma/finance-sentiment-analysis")
model = ORTModelForSequenceClassification.from_pretrained("mikeysharma/finance-sentiment-analysis")

inputs = tokenizer(
    "Cemex shares fall after Credit Suisse downgrade",
    return_tensors="pt",
    truncation=True,
    max_length=128
)

outputs = model(**inputs)
print(outputs.logits)
```
## Training Data
The model was fine-tuned on a dataset of financial news headlines with three sentiment classes:
- Negative: Bearish sentiment, downgrades, losses
- Neutral: Factual reporting, no strong sentiment
- Positive: Bullish sentiment, upgrades, gains
Example samples:

```
$AAPL - Apple hits record high after earnings beat   (Positive)
$TSLA - Tesla misses Q2 delivery estimates           (Negative)
$MSFT - Microsoft announces new Azure features       (Neutral)
```
## Preprocessing
Text is preprocessed with:
- Lowercasing
- Ticker symbol normalization ($AAPL → AAPL)
- URL removal
- Special character cleaning
- Truncation to 128 tokens
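The exact cleaning code is not published with the model; a minimal sketch of the steps above (note that the real pipeline truncates at the tokenizer level with `max_length=128`, so the whitespace-based cut here is only an approximation):

```python
import re

def preprocess(text: str, max_tokens: int = 128) -> str:
    """Hypothetical reimplementation of the preprocessing steps listed above."""
    text = text.lower()                            # lowercasing
    text = re.sub(r"\$([a-z]+)", r"\1", text)      # ticker normalization: $aapl -> aapl
    text = re.sub(r"https?://\S+", "", text)       # URL removal
    text = re.sub(r"[^a-z0-9\s]", " ", text)       # special character cleaning
    text = re.sub(r"\s+", " ", text).strip()       # collapse whitespace
    return " ".join(text.split()[:max_tokens])     # rough whitespace-token truncation

print(preprocess("$AAPL - Apple hits record high! https://example.com"))
# aapl apple hits record high
```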
## Deployment
For production deployment, we recommend:
- ONNX Runtime for CPU-optimized inference
- FastAPI for REST API serving
- Docker containerization
Example Dockerfile:

```dockerfile
FROM python:3.8-slim

WORKDIR /app
COPY . .

RUN pip install transformers "optimum[onnxruntime]" fastapi uvicorn

CMD ["uvicorn", "api:app", "--host", "0.0.0.0", "--port", "8000"]
```
## Limitations
- Primarily trained on English financial news
- Performance may degrade on non-financial text
- Short-form text (headlines) works best
- May not capture nuanced sarcasm/irony
## Ethical Considerations
While useful for market analysis, this model should not be used as sole input for investment decisions. Always combine with human judgment and other data sources.
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{tinybert-fin-sentiment,
  author = {Mikey Sharma},
  title = {Lightweight Financial News Sentiment Analysis with TinyBERT},
  year = {2023},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/mikeysharma/finance-sentiment-analysis}}
}
```
## License

MIT