## Model Overview
- This model is fine-tuned for sentiment classification using the Sentiment140 dataset.
- It predicts one of the following sentiment labels: `0`: Negative, `1`: Positive
- Model Type: DistilBERT
- Language: English (with future support for Arabic tweets planned)
- Framework: Transformers (🤗 Hugging Face)
## Model Details
- Base model: `distilbert-base-uncased`
- Task: Sentiment Analysis (Binary: Positive / Negative)
- Trained on: Sentiment140 dataset (1.6M tweets)
- Creator: Hatem Moushir
- Language: English
- Use case: General sentiment classification for social media or informal texts.
## Training Data
- Dataset: Sentiment140
- Samples Used: 1,600,000 tweets
- Format: CSV with columns: `target`, `id`, `date`, `query`, `user`, `text`
- Labels: `0`: Negative, `2`: Neutral, `4`: Positive
- Preprocessing:
  - Lowercasing
  - Removing URLs, mentions, hashtags
  - Optional: Stopword filtering
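As a rough sketch of this loading and cleaning step (the file name, the mapping of labels to 0/1, and the exact regular expressions are assumptions; the card only lists the steps above):

```python
import re
import pandas as pd

# Path to a local copy of the Sentiment140 CSV (illustrative file name).
CSV_PATH = "training.1600000.processed.noemoticon.csv"

# The raw file has no header row; the column names follow the list above.
columns = ["target", "id", "date", "query", "user", "text"]
df = pd.read_csv(CSV_PATH, encoding="latin-1", names=columns)

# Keep Negative (0) and Positive (4) tweets and map them to the binary
# labels this model predicts: 0 = Negative, 1 = Positive.
df = df[df["target"].isin([0, 4])].copy()
df["label"] = (df["target"] == 4).astype(int)

def clean_tweet(text: str) -> str:
    """Lowercase and strip URLs, @mentions, and #hashtags (the steps listed above)."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", "", text)  # URLs
    text = re.sub(r"@\w+", "", text)                   # mentions
    text = re.sub(r"#\w+", "", text)                   # hashtags
    return re.sub(r"\s+", " ", text).strip()

df["text"] = df["text"].apply(clean_tweet)
```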
## Training Configuration
| Setting | Value |
|---|---|
| Model Base | distilbert-base-uncased |
| Epochs | 2 |
| Batch Size | 16 |
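A minimal fine-tuning sketch matching the configuration above (base model, 2 epochs, batch size 16). The tokenization settings, the 90/10 validation split, and all other hyperparameters are illustrative assumptions, and `df` refers to the cleaned DataFrame from the preprocessing sketch:

```python
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# `df` is the cleaned DataFrame from the preprocessing sketch above.
dataset = Dataset.from_pandas(df[["text", "label"]]).train_test_split(test_size=0.1)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

tokenized = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# Epochs and batch size come from the table above; everything else uses defaults.
args = TrainingArguments(
    output_dir="sentiment140-distilbert",
    num_train_epochs=2,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
)
trainer.train()
print(trainer.evaluate())  # reports eval_loss on the held-out split
```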
## Evaluation Results
| Metric | Value |
|---|---|
| Validation Accuracy (Epoch 1) | 83.43% |
| Validation Accuracy (Epoch 2) | 83.43% (no improvement) |
| Training Loss (Epoch 1) | 0.4022 |
| Validation Loss (Epoch 1) | 0.3784 |
| Training Loss (Epoch 2) | 0.2536 |
| Validation Loss (Epoch 2) | 0.4179 |

The training loss keeps falling in epoch 2 while the validation loss rises and accuracy stays flat, which suggests the model begins to overfit after the first epoch.
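The per-epoch validation accuracy above can be reported during training by passing a `compute_metrics` function to the `Trainer` in the training sketch; a minimal version (illustrative, not necessarily the exact code used for this card):

```python
import numpy as np

def compute_metrics(eval_pred):
    """Accuracy over the validation split; pass via Trainer(..., compute_metrics=compute_metrics)."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": float((predictions == labels).mean())}
```

With this in place, each evaluation run reports `eval_accuracy` alongside `eval_loss`.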
## How to Use
- Example 1: quick inference with the `pipeline` API

```python
from transformers import pipeline

model_name = "HatemMoushir/sentiment140-distilbert-hatem"
pipe = pipeline("sentiment-analysis", model=model_name)

pipe([
    "I love this place!",
    "This is a terrible experience."
])
```
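Each input returns a dictionary such as `{'label': 'LABEL_1', 'score': 0.98}` (score shown for illustration); `LABEL_0` corresponds to Negative and `LABEL_1` to Positive, as in the mapping used in Example 2 below.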
- Example 2: a simple Gradio demo for multi-line input

```python
import gradio as gr
from transformers import pipeline

# Label mapping from model output to a readable format
label_map = {"LABEL_0": "Negative", "LABEL_1": "Positive"}

# Load the sentiment analysis pipeline with the custom model
pipe = pipeline("sentiment-analysis", model="HatemMoushir/sentiment140-distilbert-hatem")

def analyze_sentiment(texts):
    # Split the input by newline (customize the splitting rule as needed)
    sentences = texts.split('\n')
    results = pipe(sentences)
    output = []
    for text, res in zip(sentences, results):
        label = label_map[res["label"]]
        score = round(res["score"], 2)
        output.append(f"{text} → {label} ({score})")
    return "\n".join(output)

# Multi-line input using a Gradio Textbox
gr.Interface(
    fn=analyze_sentiment,
    inputs=gr.Textbox(lines=5, placeholder="Enter one or more sentences..."),
    outputs="text"
).launch()
```
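Launching this script starts a local Gradio interface; each line typed into the textbox is scored separately and returned with its readable label and a rounded confidence score.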
- Example 3: test the model on a set of labeled sentences
```python
from transformers import pipeline

# Load the English model trained on Sentiment140
classifier = pipeline("sentiment-analysis", model="HatemMoushir/sentiment140-distilbert-hatem")

# English test sentences with their true labels: 1 = Positive, 0 = Negative
samples = [
("I love this place!", 1),
("I hate waiting in traffic.", 0),
("Today is a beautiful day", 1),
("I am really disappointed", 0),
("Feeling great about this opportunity", 1),
("This movie was terrible", 0),
("Absolutely loved the dinner", 1),
("Iโm sad and frustrated", 0),
("My friends make me happy", 1),
("Everything went wrong today", 0),
("What a fantastic game!", 1),
("Worst experience ever", 0),
("The weather is amazing", 1),
("I canโt stand this anymore", 0),
("So proud of my achievements", 1),
("Feeling down", 0),
("Just got a promotion!", 1),
("Why does everything suck?", 0),
("Best vacation ever", 1),
("Iโm tired of this nonsense", 0),
("Such a lovely gesture", 1),
("That was rude and uncalled for", 0),
("Finally some good news!", 1),
("I'm so lonely", 0),
("My cat is the cutest", 1),
("This food tastes awful", 0),
("Celebrating small wins today", 1),
("Not in the mood", 0),
("Grateful for everything", 1),
("I feel useless", 0),
("Such a peaceful morning", 1),
("Another failure, just great", 0),
("Got accepted into college!", 1),
("I hate being ignored", 0),
("The sunset was breathtaking", 1),
("You ruined my day", 0),
("He makes me feel special", 1),
("Everything is falling apart", 0),
("Can't wait for the weekend", 1),
("So much stress right now", 0),
("Iโm in love", 1),
("I donโt care anymore", 0),
("Won first place!", 1),
("This is so frustrating", 0),
("He always cheers me up", 1),
("Feeling stuck", 0),
("Had a wonderful time", 1),
("Nothing matters", 0),
("Looking forward to tomorrow", 1),
("Just leave me alone", 0),
("We made it!", 1),
("Horrible customer service", 0),
("The music lifts my spirits", 1),
("I'm drowning in problems", 0),
("My team won the match", 1),
("I wish I never came", 0),
("Sunshine and good vibes", 1),
("Everything is a mess", 0),
("Love the energy here", 1),
("Feeling hopeless", 0),
("She always makes me smile", 1),
("So many regrets", 0),
("Today was a success", 1),
("Bad day again", 0),
("Iโm truly blessed", 1),
("This is depressing", 0),
("Can't stop smiling", 1),
("Everything hurts", 0),
("So excited for this!", 1),
("I hate myself", 0),
("Best concert ever", 1),
("Life is unfair", 0),
("Happy and content", 1),
("Crying inside", 0),
("Feeling inspired", 1),
("The service was awful", 0),
("Joy all around", 1),
("I feel dead inside", 0),
("Itโs a dream come true", 1),
("Nothing good ever happens", 0),
("Feeling positive", 1),
("That hurt my feelings", 0),
("Success tastes sweet", 1),
("I can't handle this", 0),
("We had a blast", 1),
("Itโs not worth it", 0),
("Heโs such a kind soul", 1),
("I'm broken", 0),
("Everything is perfect", 1),
("So tired of pretending", 0),
("What a nice surprise!", 1),
("I feel empty", 0),
("Canโt wait to start!", 1),
("It's always my fault", 0),
("A new beginning", 1),
("So much pain", 0),
("My heart is full", 1),
("This sucks", 0),
("I feel accomplished", 1),
("Why bother", 0),
("Living my best life", 1),
("I just want to disappear", 0)
]
# Run the model and compare its predictions with the true labels
correct = 0
for i, (text, true_label) in enumerate(samples):
    result = classifier(text)[0]
    predicted_label = 1 if result["label"] == "LABEL_1" else 0
    is_correct = predicted_label == true_label
    correct += is_correct
    print(f"{i+1}. \"{text}\"")
    print(f"   Model → {predicted_label} | True → {true_label} | {'✔️ Correct' if is_correct else '❌ Wrong'}\n")

# Model accuracy
accuracy = correct / len(samples)
print(f"✅ Accuracy: {accuracy * 100:.2f}%")
```
## Development and Assistance
This model was developed and trained using Google Colab, with guidance and technical assistance from ChatGPT, which was used for idea generation, code authoring, and troubleshooting throughout the development process.
## Source Code
The full code used to prepare and train the model is available on GitHub:
GitHub file source.
## Model Versions
| Version | Notes |
|---|---|
| v1.0 | Trained on 10,000 samples |
| v1.1 | Trained on 50,000 samples |
## Limitations
- Bias may exist due to the dataset source (Twitter, English only)
- Slang and sarcasm are not always handled well
- The model is not yet optimized for Arabic tweets
## License
This model is released under the MIT License.
## Acknowledgments
Special thanks to the Stanford NLP group for publishing Sentiment140, and to the Hugging Face team for making model sharing easy.
## Contact
Created by Hatem Moushir. For questions or collaboration: h_moushir@hotmail.com