Model Card: BLOOM-560m for Personal Sharing Classification
This model is a fine-tuned version of BLOOM-560m designed to classify personal experience sharing in social media text. It was developed to explore how different generations (Baby Boomers and Gen X) express themselves on pseudonymous platforms like Reddit.
Model Details
- Model Type: Large Language Model (Decoder-only) fine-tuned for sequence classification.
- Language: English.
- Finetuned from model: bigscience/bloom-560m.
- Application: Sociotechnical research on digital aging and online self-disclosure.
Intended Use
Primary Task
The model classifies individual sentences into one of four categories to analyze domains of self-disclosure in online forums.
Categories
- Health and Wellness (Label 0): Personal experiences regarding physical/mental health, treatments, or aging-related bodily changes.
- Personal Relationships and Identity (Label 1): Sentences describing social ties, family, friendships, or social identities.
- Professional and Financial (Label 2): Reflections on work, career history, retirement planning, and financial management.
- Not Related to Personal Sharing (Label 3): Non-reflective content, general information, or social pleasantries.
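The label ids above can be turned into readable names with a small lookup. This is a minimal sketch: the ids and names come from the list above, but `readable_label` is a hypothetical helper, and it assumes the checkpoint returns either the transformers default label strings (`LABEL_0` to `LABEL_3`) or names of its own, which the card does not specify.

```python
# Label ids and names as listed in this model card.
ID2LABEL = {
    0: "Health and Wellness",
    1: "Personal Relationships and Identity",
    2: "Professional and Financial",
    3: "Not Related to Personal Sharing",
}


def readable_label(raw_label: str) -> str:
    """Map a raw pipeline label to a category name (hypothetical helper).

    Assumption: the raw label is 'LABEL_<id>' (the transformers default when
    a checkpoint defines no custom label names). Anything else is returned
    unchanged, since it may already be a readable name.
    """
    prefix = "LABEL_"
    if raw_label.startswith(prefix):
        suffix = raw_label[len(prefix):]
        if suffix.isdigit() and int(suffix) in ID2LABEL:
            return ID2LABEL[int(suffix)]
    return raw_label


print(readable_label("LABEL_2"))
```

If the checkpoint's configuration already defines readable names, the helper simply passes them through.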
Training Data
- Source: Publicly available posts and comments from the Reddit subreddit r/AskOldPeople.
- Size: 2,000 manually labeled sentences (stratified sampling: 500 per category).
- Data Split: 80% Training, 10% Validation, 10% Test.
- Preprocessing: Posts and comments were split into sentences using the Punkt sentence tokenizer.
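An 80/10/10 split over 2,000 balanced sentences could be produced roughly as follows. This is a sketch, not the authors' procedure: the card does not say whether the split itself was stratified by label or which random seed was used, so `stratified_split`, its seed, and the placeholder sentences are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(examples, seed=0, train_frac=0.8, val_frac=0.1):
    """Split (sentence, label) pairs into train/val/test, label by label.

    Hypothetical helper: splitting within each label keeps the class
    balance identical across the three subsets.
    """
    by_label = defaultdict(list)
    for sentence, label in examples:
        by_label[label].append((sentence, label))

    rng = random.Random(seed)
    train, val, test = [], [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_train = round(len(items) * train_frac)
        n_val = round(len(items) * val_frac)
        train += items[:n_train]
        val += items[n_train:n_train + n_val]
        test += items[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Placeholder data with the card's shape: 4 labels x 500 sentences each.
examples = [(f"sentence {i}", i % 4) for i in range(2000)]
train, val, test = stratified_split(examples)
print(len(train), len(val), len(test))  # 1600 200 200
```

With the sizes stated in this card, such a split yields 1,600 training, 200 validation, and 200 test sentences.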
Performance
The model achieved a high F1 score on the held-out test set (10% of the labeled data):
| Metric | Value |
|---|---|
| F1 Score | 0.9599 |
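For readers who want to compute a comparable score on their own labeled sentences, here is a minimal pure-Python sketch of a macro-averaged F1. The card does not state which averaging (macro, micro, or weighted) produced the reported value, so macro averaging is an assumption here, and both function names are illustrative.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 for one class: 2*TP / (2*TP + FP + FN), or 0.0 if undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels=(0, 1, 2, 3)):
    """Unweighted mean of per-class F1 over the four categories."""
    return sum(f1_for_label(y_true, y_pred, l) for l in labels) / len(labels)


# Toy example: one "Health" sentence (0) mislabeled as "Relationships" (1).
y_true = [0, 0, 1, 1, 2, 3]
y_pred = [0, 1, 1, 1, 2, 3]
print(round(macro_f1(y_true, y_pred), 4))
```

Because the dataset is balanced at 500 sentences per category, macro and micro averages should be close to each other, though not necessarily identical.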
Usage
You can use this model directly with the Hugging Face transformers library:
```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline("text-classification", model="ernchern/personal_info_classification")

text = "I am 67, retired in August, and most basic expenses are covered by Social Security."
result = classifier(text)
print(result)
```
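Since the model works on individual sentences, analyzing a whole post means classifying each sentence and aggregating the results. The sketch below shows one way to do that. `label_distribution` is a hypothetical helper, and it assumes the classifier returns one `{"label": ..., "score": ...}` dict per input string, as a transformers text-classification pipeline does when given a list of strings.

```python
from collections import Counter


def label_distribution(sentences, classifier):
    """Return the share of sentences assigned to each predicted label.

    `classifier` is any callable that maps a list of strings to a list of
    {'label': ..., 'score': ...} dicts (hypothetical helper, see lead-in).
    """
    sentences = list(sentences)
    if not sentences:
        return {}
    predictions = classifier(sentences)
    counts = Counter(pred["label"] for pred in predictions)
    return {label: n / len(sentences) for label, n in counts.items()}


# Usage with the pipeline from this card (requires downloading the model):
# from transformers import pipeline
# classifier = pipeline("text-classification",
#                       model="ernchern/personal_info_classification")
# print(label_distribution(["I retired in August.", "Thanks for sharing!"],
#                          classifier))
```

Passing the classifier in as an argument keeps the aggregation logic independent of how the model is loaded.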