```bash
pip install -r requirements.txt
python run_api.py
curl -X POST http://127.0.0.1:5000/classify -H "Content-Type: application/json" -d '{"purpose_text": "paid rent"}'
```
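The `run_api.py` source is not shown here, but a minimal sketch of what a Flask `/classify` endpoint like the one above could look like follows. The `classify_purpose` function is a hypothetical placeholder; the real app would load the trained model and call its `predict()` method instead.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def classify_purpose(text: str) -> str:
    # Placeholder for the trained classifier; the real service would load a
    # fitted pipeline (e.g. TF-IDF + classifier) and call predict() here.
    return "rent" if "rent" in text.lower() else "other"

@app.route("/classify", methods=["POST"])
def classify():
    payload = request.get_json(force=True)
    text = payload.get("purpose_text", "")
    return jsonify({"purpose_text": text, "category": classify_purpose(text)})

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=5000)
```

Running the app and issuing the `curl` request above would return a JSON body containing the predicted category.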
## LLM/Transformer Conceptual Plan
To adapt a transformer-based model like BERT to this classification task, I would:
- Use a pre-trained model like `bert-base-uncased` from Hugging Face Transformers.
- Tokenize the `purpose_text` field using the BERT tokenizer.
- Add a classification head (dense layer) on top of the [CLS] token representation.
- Fine-tune the model on the labeled dataset using cross-entropy loss.
Due to hardware limitations, I am not implementing this here, but a minimal prototype could be built with the Hugging Face `Trainer` API.
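The plan above could be prototyped roughly as follows. This is only an illustrative sketch, not the project's code: the toy texts/labels and the training hyperparameters are assumptions, and the heavy `transformers`/`torch` imports are kept inside the function since the dependencies are optional here.

```python
def build_trainer(texts, labels, num_labels):
    """Sketch of fine-tuning bert-base-uncased on purpose_text labels."""
    import torch
    from transformers import (AutoModelForSequenceClassification,
                              AutoTokenizer, Trainer, TrainingArguments)

    # Tokenize the purpose_text field with the BERT tokenizer.
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    enc = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")

    class PurposeDataset(torch.utils.data.Dataset):
        def __len__(self):
            return len(labels)

        def __getitem__(self, i):
            item = {k: v[i] for k, v in enc.items()}
            item["labels"] = torch.tensor(labels[i])
            return item

    # AutoModelForSequenceClassification puts a classification head on top of
    # the [CLS] representation and uses cross-entropy loss by default.
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=num_labels)

    args = TrainingArguments(output_dir="bert-purpose", num_train_epochs=3,
                             per_device_train_batch_size=16)
    return Trainer(model=model, args=args, train_dataset=PurposeDataset())

if __name__ == "__main__":
    # Toy example data; a real run would use the labeled dataset.
    trainer = build_trainer(["paid rent", "grocery shopping"], [0, 1], 2)
    trainer.train()
```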
## How the Data Was Cleaned
```
Raw:     "Monthly apartment payment - paid"
Cleaned: "monthly apartment payment"
```
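A cleaning step matching the example above can be sketched as follows; the exact status-word list is an assumption for illustration.

```python
import re

# Hypothetical status words that describe payment state, not purpose.
STATUS_WORDS = {"paid", "pending", "due"}

def clean_purpose(text: str) -> str:
    """Lowercase, strip punctuation, and drop status words."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # punctuation -> spaces
    tokens = [t for t in text.split() if t not in STATUS_WORDS]
    return " ".join(tokens)

print(clean_purpose("Monthly apartment payment - paid"))
# -> monthly apartment payment
```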
# Transformer-Based Classification Notes
Instead of traditional models, we could use a transformer like BERT for this task.
## Approach
1. Load a pre-trained model like `bert-base-uncased`
2. Tokenize `purpose_text` using HuggingFace's tokenizer
3. Add a classification head to the model
4. Fine-tune the model on your labeled dataset
## Benefits
- Better semantic understanding of context
- No need for manual preprocessing or TF-IDF
## Tools
- `transformers` from HuggingFace
- `datasets` for handling input
- `torch` for training
## Reason for Not Using It
Due to hardware limitations and time constraints, traditional models were preferred.