Model Card: BLOOM-560m for Personal Sharing Classification

This model is a fine-tuned version of BLOOM-560m designed to classify personal experience sharing in social media text. It was developed to explore how different generations (Baby Boomers and Gen X) express themselves on pseudonymous platforms like Reddit.

Model Details

  • Model Type: Large Language Model (Decoder-only) fine-tuned for sequence classification.
  • Language: English.
  • Finetuned from model: bigscience/bloom-560m.
  • Application: Sociotechnical research on digital aging and online self-disclosure.

Intended Use

Primary Task

The model classifies individual sentences into one of four categories to analyze domains of self-disclosure in online forums.

Categories

  • Health and Wellness (Label 0): Personal experiences regarding physical/mental health, treatments, or aging-related bodily changes.
  • Personal Relationships and Identity (Label 1): Sentences describing social ties, family, friendships, or social identities.
  • Professional and Financial (Label 2): Reflections on work, career history, retirement planning, and financial management.
  • Not Related to Personal Sharing (Label 3): Non-reflective content, general information, or social pleasantries.
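Downstream code typically needs to translate the integer labels above back into category names. A minimal sketch of that mapping, noting that the exact id2label strings stored in the model config are an assumption based on the list above:

```python
# Hypothetical id-to-category mapping following the label list above;
# the exact strings in the model's config.json are an assumption.
ID2LABEL = {
    0: "Health and Wellness",
    1: "Personal Relationships and Identity",
    2: "Professional and Financial",
    3: "Not Related to Personal Sharing",
}

def label_name(label_id: int) -> str:
    """Map a predicted class id to its human-readable category."""
    return ID2LABEL[label_id]
```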

Training Data

  • Source: Publicly available posts and comments from the Reddit subreddit r/AskOldPeople.
  • Size: 2,000 manually labeled sentences (stratified sampling: 500 per category).
  • Data Split: 80% Training, 10% Validation, 10% Test.
  • Preprocessing: Sentences were tokenized using the Punkt sentence tokenizer.
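The 80/10/10 split above is stratified, so each category contributes proportionally to every partition. A minimal stdlib sketch of such a split (illustrative only; the authors' actual tooling is not specified):

```python
import random

def stratified_split(examples, train_frac=0.8, val_frac=0.1, seed=42):
    """Split (sentence, label) pairs into train/val/test partitions
    per label, mirroring the 80/10/10 stratified split described above."""
    rng = random.Random(seed)
    by_label = {}
    for sent, label in examples:
        by_label.setdefault(label, []).append((sent, label))
    splits = {"train": [], "val": [], "test": []}
    for items in by_label.values():
        rng.shuffle(items)
        n_train = int(len(items) * train_frac)
        n_val = int(len(items) * val_frac)
        splits["train"].extend(items[:n_train])
        splits["val"].extend(items[n_train:n_train + n_val])
        splits["test"].extend(items[n_train + n_val:])
    return splits

# Dummy data shaped like the dataset: 500 sentences per category.
data = [(f"sentence {i}", i % 4) for i in range(2000)]
parts = stratified_split(data)
```

With 500 sentences per category, this yields 1,600 training, 200 validation, and 200 test sentences, each partition balanced across the four labels.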

Performance

The model achieved strong performance on the held-out test set:

  • F1 Score: 0.9599

Usage

You can use this model directly with the Hugging Face transformers library:

from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub
classifier = pipeline("text-classification", model="ernchern/personal_info_classification")

text = "I am 67, retired in August, and most basic expenses are covered by Social Security."
result = classifier(text)
print(result)  # e.g. [{'label': ..., 'score': ...}]
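The score the pipeline reports is a softmax probability over the four class logits, and the label is the argmax. A minimal sketch of that scoring step with hypothetical logits (the values are illustrative, not model outputs):

```python
import math

def softmax(logits):
    """Convert raw classifier logits into probabilities, as the
    text-classification pipeline does before reporting a score."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the four labels; here label 2
# ("Professional and Financial") has the largest logit.
logits = [-1.2, 0.3, 4.1, -0.5]
probs = softmax(logits)
pred = max(range(len(probs)), key=probs.__getitem__)
```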