```bash
pip install -r requirements.txt
python run_api.py
curl -X POST http://127.0.0.1:5000/classify -H "Content-Type: application/json" -d '{"purpose_text": "paid rent"}'
```
## LLM/Transformer Conceptual Plan

To adapt a transformer-based model like BERT to this classification task, I would:

- Use a pre-trained model like `bert-base-uncased` from Hugging Face Transformers.
- Tokenize the `purpose_text` field using the BERT tokenizer.
- Add a classification head (dense layer) on top of the [CLS] token representation.
- Fine-tune the model on the labeled dataset using cross-entropy loss.

Due to hardware limitations, I am not implementing this, but a minimal prototype could be done with the `Trainer` API in Hugging Face.
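For illustration only, a minimal `Trainer`-based prototype could look like the sketch below. The CSV file name, the `purpose_text`/`label` column names, the number of categories (`NUM_LABELS`), and the output path are assumptions made for this example, not details of this project.

```python
# Minimal fine-tuning sketch (illustrative; names and paths are placeholders).
# Assumes a CSV with a "purpose_text" column and an integer-encoded "label" column.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "bert-base-uncased"
NUM_LABELS = 5  # placeholder: set to the real number of purpose categories

# Load the labeled data and hold out a test split (file name is a placeholder).
dataset = load_dataset("csv", data_files="transactions.csv")["train"]
dataset = dataset.train_test_split(test_size=0.2)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def tokenize(batch):
    # Tokenize purpose_text; fixed-length padding keeps batching simple.
    return tokenizer(
        batch["purpose_text"], truncation=True, padding="max_length", max_length=64
    )

dataset = dataset.map(tokenize, batched=True)

# Pre-trained BERT encoder with a fresh classification head on top of the
# [CLS] representation.
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=NUM_LABELS
)

training_args = TrainingArguments(
    output_dir="bert-purpose-classifier",  # hypothetical output path
    num_train_epochs=3,
    per_device_train_batch_size=16,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
)

trainer.train()
trainer.save_model("bert-purpose-classifier")
```

With integer labels and `num_labels > 1`, the model computes cross-entropy loss internally, which matches the plan above.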
## How the Data Was Cleaned for Training

Example of the cleaning applied to raw `purpose_text` values:

- Raw: "Monthly apartment payment - paid"
- Cleaned: "monthly apartment payment"
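A sketch of the kind of cleaning step implied by this example follows; the helper name and exact rules are assumptions, and the project's actual preprocessing may differ.

```python
import re

def clean_purpose_text(text: str) -> str:
    """Hypothetical cleaner: lowercase, drop punctuation and status words, collapse spaces."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # strip punctuation such as "-"
    text = re.sub(r"\bpaid\b", " ", text)      # drop status words like "paid"
    return re.sub(r"\s+", " ", text).strip()

print(clean_purpose_text("Monthly apartment payment - paid"))  # "monthly apartment payment"
```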
# Transformer-Based Classification Notes

Instead of traditional models, we could use a transformer like BERT for this task.

## Approach

1. Load a pre-trained model like `bert-base-uncased`
2. Tokenize `purpose_text` using Hugging Face's tokenizer
3. Add a classification head to the model
4. Fine-tune the model on your labeled dataset (see the inference sketch below)
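Once fine-tuned, the model could score new purpose texts directly. A minimal inference sketch, assuming the fine-tuned checkpoint was saved to a local `bert-purpose-classifier` directory (the path is a placeholder):

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from a hypothetical local path.
classifier = pipeline("text-classification", model="bert-purpose-classifier")

print(classifier("paid rent"))
# Returns a label/score pair; the label names depend on how the dataset was encoded.
```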
## Benefits

- Better semantic understanding of context
- No need for manual preprocessing or TF-IDF

## Tools

- `transformers` from Hugging Face
- `datasets` for handling input
- `torch` for training

## Reason for Not Using It

Due to hardware limitations and time constraints, traditional models were preferred.