bank_transaction / ReadMe.md

Quickstart

```bash
pip install -r requirements.txt
python run_api.py
curl -X POST http://127.0.0.1:5000/classify \
  -H "Content-Type: application/json" \
  -d '{"purpose_text": "paid rent"}'
```

LLM/Transformer Conceptual Plan

To adapt a transformer-based model like BERT to this classification task, I would:

  • Use a pre-trained model like bert-base-uncased from Hugging Face Transformers.
  • Tokenize the purpose_text field using the BERT tokenizer.
  • Add a classification head (dense layer) on top of the [CLS] token representation.
  • Fine-tune the model on the labeled dataset using cross-entropy loss.

Due to hardware limitations, I am not implementing this here, but a minimal prototype could be built with the Hugging Face Trainer API.
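The classification-head idea above can be sketched in plain PyTorch. This is a minimal illustration, not the repo's implementation: the encoder below is a random stand-in for BERT (in practice you would use `AutoModel.from_pretrained("bert-base-uncased")`), and `hidden_size = 768` / `num_classes = 5` are illustrative assumptions.

```python
import torch
import torch.nn as nn

hidden_size, num_classes = 768, 5  # 768 matches bert-base; 5 is a placeholder class count

class DummyEncoder(nn.Module):
    """Stand-in for a pre-trained BERT encoder.

    A real encoder returns last_hidden_state of shape (batch, seq, hidden);
    position 0 along the sequence axis is the [CLS] token.
    """
    def forward(self, input_ids):
        batch, seq_len = input_ids.shape
        return torch.randn(batch, seq_len, hidden_size)

class TransactionClassifier(nn.Module):
    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(hidden_size, num_classes)  # dense layer on [CLS]

    def forward(self, input_ids):
        hidden = self.encoder(input_ids)  # (batch, seq, hidden)
        cls = hidden[:, 0, :]             # [CLS] token representation
        return self.head(cls)             # (batch, num_classes) logits

model = TransactionClassifier(DummyEncoder())
input_ids = torch.randint(0, 30522, (4, 16))  # fake token ids (BERT vocab size)
logits = model(input_ids)
labels = torch.tensor([0, 1, 2, 3])
loss = nn.CrossEntropyLoss()(logits, labels)  # fine-tuning objective
print(logits.shape, loss.item())
```

Fine-tuning would then just be the usual loop: backpropagate this cross-entropy loss through both the head and the (unfrozen) encoder.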

How the training data was cleaned

Raw: "Monthly apartment payment - paid"
Cleaned: "monthly apartment payment"
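A minimal cleaning step consistent with the example above might look like the following. The repo does not state its exact rules, so the regex and the list of status words here are assumptions:

```python
import re

def clean_text(text: str) -> str:
    """Lowercase and strip a trailing status marker such as ' - paid'."""
    text = text.lower().strip()
    # Assumed status vocabulary; extend as needed for the real data.
    text = re.sub(r"\s*-\s*(paid|pending|failed)\s*$", "", text)
    return text

print(clean_text("Monthly apartment payment - paid"))  # → monthly apartment payment
```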

Transformer-Based Classification Notes

Instead of traditional models, we could use a transformer like BERT for this task.

Approach

  1. Load a pre-trained model like bert-base-uncased
  2. Tokenize purpose_text using the Hugging Face tokenizer
  3. Add a classification head to the model
  4. Fine-tune the model on your labeled dataset

Benefits

  • Better semantic understanding of context
  • No need for manual preprocessing or TF-IDF

Tools

  • transformers from HuggingFace
  • datasets for handling input
  • torch for training
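For the transformer approach, the tools above would be pinned in a requirements file. This is a plausible sketch, not the repo's actual requirements.txt (flask is inferred from the port-5000 endpoint served by run_api.py):

```
transformers
datasets
torch
flask
```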

Reason for Not Using It

Due to hardware limitations and time constraints, traditional models (e.g. TF-IDF features with a classical classifier) were preferred.