---
license: mit
language:
- en
pipeline_tag: tabular-classification
tags:
- sklearn
- classification
- iris
- tabular
datasets:
- brjapon/iris
metrics:
- accuracy
library_name: scikit-learn
new_version: "v1.0"
model-index:
- name: Iris Decision Tree
  results:
  - task:
      type: tabular-classification
      name: Classification
    metrics:
    - type: accuracy
      value: 0.97
      name: Test Accuracy
---

# Iris Classification Models

This repository starts with a **Decision Tree** model trained on the classic **Iris dataset**. The model classifies iris flowers into three species (*setosa*, *versicolor*, or *virginica*) from four numeric features: sepal length, sepal width, petal length, and petal width.

Because the model is small and simple, it is intended primarily for **demonstration and educational** purposes.

## Model Description
- **Framework**: [Scikit-Learn](https://scikit-learn.org/stable/)
- **Algorithm**: Decision Tree (`DecisionTreeClassifier` class)
- **Hyperparameters**:
  - Defaults for Decision Tree in Scikit-Learn

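To see what "defaults" means concretely, the parameters of a freshly constructed `DecisionTreeClassifier` can be inspected with `get_params()`. This is a minimal sketch, assuming a recent scikit-learn release; default values can differ between versions, and the subset printed here is chosen only for illustration.

```python
from sklearn.tree import DecisionTreeClassifier

# A classifier constructed without arguments uses scikit-learn's defaults.
clf = DecisionTreeClassifier()
params = clf.get_params()

# Print a few parameters that strongly affect tree shape (illustrative subset).
for name in ("criterion", "max_depth", "min_samples_split", "min_samples_leaf"):
    print(f"{name} = {params[name]!r}")
```

With `max_depth=None` the tree keeps splitting until its leaves are pure or too small to split, which is why it can fit a dataset as small as Iris very closely.
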
### Intended Uses
- **Education/Proof-of-Concept**: Demonstrates loading a scikit-learn model from the Hugging Face Hub.
- **Beginner ML Tutorials**: Introduction to classification tasks, usage of Hugging Face model hosting, and deploying simple demos in Spaces.

### Limitations
- **Dataset Size**: The Iris dataset is small (150 samples). Performance metrics may not extrapolate to real-world scenarios.
- **Domain Constraints**: The dataset only covers three iris species and may not generalize to other types of flowers.
- **Not Production-Ready**: This model is not suited for critical applications (e.g., healthcare, autonomous vehicles).

## How to Use
To use this model, download the `.joblib` file from the Hub and load it with `joblib`. The example requires the `joblib`, `scikit-learn`, and `huggingface_hub` packages:

```python
import joblib
from huggingface_hub import hf_hub_download

# Download the serialized model file from the Hub.
# Accompanying dataset is hosted in Hugging Face under 'Jesus02/iris-clase'
model_path = hf_hub_download(
    repo_id="brjapon/iris",
    filename="iris_dt.joblib",
    repo_type="model",
)

# Deserialize the fitted scikit-learn estimator.
model = joblib.load(model_path)

# Example prediction: sepal length, sepal width, petal length, petal width (cm)
sample_input = [[5.1, 3.5, 1.4, 0.2]]
prediction = model.predict(sample_input)
print(prediction)  # e.g., [0], which might correspond to 'setosa'
```

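The value returned by `predict` still has to be mapped to a species name. The sketch below shows one way to do that and to read class probabilities. It rests on two assumptions that this card does not confirm: that labels follow the common 0 = setosa, 1 = versicolor, 2 = virginica encoding, and that a locally trained stand-in tree (fitted on scikit-learn's bundled copy of Iris) behaves like the hosted `iris_dt.joblib`. The `SPECIES` list is a hypothetical helper, not part of the repository.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: label order 0/1/2 = setosa/versicolor/virginica (scikit-learn's
# convention). Check the hosted model's `classes_` attribute before relying on it.
SPECIES = ["setosa", "versicolor", "virginica"]

# Stand-in for the hosted model so the example runs offline. With the real
# model, replace these two lines with the joblib.load(...) call.
X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier(random_state=0).fit(X, y)

# sepal length, sepal width, petal length, petal width (cm)
sample_input = [[5.1, 3.5, 1.4, 0.2]]

label = int(model.predict(sample_input)[0])
probabilities = model.predict_proba(sample_input)[0]

print(SPECIES[label])
print({name: round(float(p), 2) for name, p in zip(SPECIES, probabilities)})
```

If the hosted model was trained on string labels instead, `predict` would return the species name directly and the lookup would be unnecessary.
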
## Training Procedure
- **Training Data**: 80% of the 150-sample Iris dataset (120 samples).
- **Test Data**: 20% (30 samples).
- **Steps**:
  1. Loaded the dataset (obtained from the HF repository `brjapon/iris`)
  2. Split it into training and test sets with `train_test_split`
  3. Trained the Decision Tree model with default settings
  4. Evaluated accuracy on the test set

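The steps above can be sketched roughly as follows. This is not the original training script: it assumes scikit-learn's bundled Iris copy as a stand-in for the `brjapon/iris` dataset, and the `random_state` value and output path are illustrative choices that the card does not document.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# 1. Load the data. ASSUMPTION: the bundled Iris copy stands in for the
#    `brjapon/iris` dataset on the Hub.
X, y = load_iris(return_X_y=True)

# 2. Random 80/20 split. `random_state=42` is an illustrative choice.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 3. Train a Decision Tree with default hyperparameters.
model = DecisionTreeClassifier()
model.fit(X_train, y_train)

# 4. Evaluate accuracy on the held-out test set.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"train={len(X_train)} test={len(X_test)} accuracy={accuracy:.3f}")

# Serialize the fitted model under the filename used in "How to Use"
# (written to a temporary directory here to avoid cluttering the workspace).
out_path = os.path.join(tempfile.gettempdir(), "iris_dt.joblib")
joblib.dump(model, out_path)
```
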
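## Performance
Using a random 80/20 split, the model typically achieves **~97%** accuracy on the test subset, which corresponds to about 29 of 30 test samples classified correctly. Actual results vary with the `random_state` used for the train/test split: with only 30 test samples, each misclassification shifts accuracy by about 3.3 percentage points.
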
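One way to gauge how much the split matters is to repeat the experiment over several seeds and compare the resulting accuracies. The sketch below does this under the same assumption as before (scikit-learn's bundled Iris copy standing in for `brjapon/iris`); the choice of 20 seeds is arbitrary, and the numbers it prints are not results reported by this card.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: bundled Iris copy as a stand-in for the `brjapon/iris` dataset.
X, y = load_iris(return_X_y=True)

accuracies = []
for seed in range(20):  # 20 seeds is an arbitrary, illustrative choice
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = DecisionTreeClassifier(random_state=seed).fit(X_train, y_train)
    # `score` returns mean accuracy on the given test data.
    accuracies.append(model.score(X_test, y_test))

mean_accuracy = sum(accuracies) / len(accuracies)
print(f"min={min(accuracies):.3f} mean={mean_accuracy:.3f} max={max(accuracies):.3f}")
```

Reporting the spread alongside the mean gives a more honest picture than a single split on a dataset this small.
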
## Limitations & Bias
- The Iris dataset is not representative of modern, large-scale classification tasks.
- Results should not be generalized beyond the included species and scenario.