---
license: mit
language:
- en
pipeline_tag: tabular-classification
tags:
- sklearn
- classification
- iris
- tabular
datasets:
- brjapon/iris
metrics:
- accuracy
library_name: scikit-learn
new_version: "v1.0"
model-index:
- name: Iris Decision Tree
  results:
  - task:
      type: tabular-classification
      name: Classification
    metrics:
    - type: accuracy
      value: 0.97
      name: Test Accuracy
---

# Iris Classification Models

This repository starts with a **Decision Tree** model trained on the classic **Iris dataset**. The model classifies iris flowers into one of three species (*setosa*, *versicolor*, or *virginica*) based on four numeric features: sepal length, sepal width, petal length, and petal width.

Because of its small size and simplicity, this model is intended primarily for **demonstration and educational** purposes.

## Model Description
- **Framework**: [Scikit-Learn](https://scikit-learn.org/stable/)  
- **Algorithm**: Decision Tree (`DecisionTreeClassifier` class)  
- **Hyperparameters**:  
  - Defaults for Decision Tree in Scikit-Learn
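
The values behind "defaults" depend on the installed scikit-learn version, so one way to see them is to inspect a freshly constructed classifier. This is a minimal sketch; the four parameter names printed are a selection chosen for illustration, not a list given by this card.

```python
from sklearn.tree import DecisionTreeClassifier

# Construct an unfitted classifier with no arguments and read back its settings.
default_params = DecisionTreeClassifier().get_params()

# Print a few parameters that most affect tree shape (illustrative selection).
for name in ("criterion", "max_depth", "min_samples_split", "min_samples_leaf"):
    print(f"{name}: {default_params[name]}")
```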

### Intended Uses
- **Education/Proof-of-Concept**: Demonstrates loading a scikit-learn model from the Hugging Face Hub.  
- **Beginner ML Tutorials**: Introduction to classification tasks, usage of Hugging Face model hosting, and deploying simple demos in Spaces.

### Limitations
- **Dataset Size**: The Iris dataset is small (150 samples). Performance metrics may not extrapolate to real-world scenarios.  
- **Domain Constraints**: The dataset only covers three iris species and may not generalize to other types of flowers.  
- **Not Production-Ready**: This model is not suited for critical applications (e.g., healthcare, autonomous vehicles).  

## How to Use
To use this model, you can load the `.joblib` file from the Hub in Python code:

```python
import joblib
from huggingface_hub import hf_hub_download

# Download the serialized model from the Hub
# (this card's metadata lists the accompanying dataset as 'brjapon/iris')
model_path = hf_hub_download(repo_id="brjapon/iris",
                             filename="iris_dt.joblib",
                             repo_type="model")

model = joblib.load(model_path)

# Example prediction. Feature order: sepal length, sepal width, petal length, petal width (cm)
sample_input = [[5.1, 3.5, 1.4, 0.2]]
prediction = model.predict(sample_input)
print(prediction)  # e.g., [0] which might correspond to 'setosa'
```
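
This card does not state whether the hosted model returns integer codes or species names, so it can help to inspect the fitted estimator before interpreting its output. The sketch below uses a stand-in tree trained on scikit-learn's bundled copy of Iris (an assumption made so the example runs offline). The same attributes, `classes_` and `n_features_in_`, exist on any fitted `DecisionTreeClassifier`.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Stand-in for the downloaded model (assumption: trained on sklearn's bundled Iris).
iris = load_iris()
stand_in = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

print(stand_in.classes_)        # labels the model can return
print(stand_in.n_features_in_)  # number of input features expected

# With integer labels, map the predicted code to a species name.
sample_input = [[5.1, 3.5, 1.4, 0.2]]
predicted_code = stand_in.predict(sample_input)[0]
predicted_name = iris.target_names[predicted_code]
print(predicted_name)
```

If `classes_` on the hosted model already contains strings, the mapping step is unnecessary.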

## Training Procedure
- **Training Data**: 80% of the 150-sample Iris dataset (120 samples).  
- **Test Data**: The remaining 20% (30 samples), held out for evaluation.  
- **Steps**:  
  1. Loaded dataset (obtained from HF repository `brjapon/iris`) 
  2. Split into training and test sets with `train_test_split`
  3. Trained Decision Tree model with default settings
  4. Evaluated accuracy on the test set
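
The four steps above can be sketched as follows. This is not the author's training script: it assumes scikit-learn's bundled Iris copy in place of `brjapon/iris`, and the `random_state=42` and `stratify` settings are illustrative choices that the card does not specify.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# 1. Load the data (assumption: bundled copy instead of the Hub dataset).
X, y = load_iris(return_X_y=True)

# 2. Hold out 20% of the 150 samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# 3. Train a decision tree with default hyperparameters.
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

# 4. Evaluate accuracy on the held-out samples.
accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"Test accuracy: {accuracy:.3f}")
```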

## Performance
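Using a random 80/20 split, the model typically achieves **~97%** accuracy on the test subset, which corresponds to 29 of 30 test samples classified correctly. Because the test set is this small, a single misclassification shifts accuracy by about 3 percentage points, so results vary with the `random_state` used for the train/test split.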
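
To get a feel for how much the split matters, the experiment can be repeated over several seeds. The sketch below assumes scikit-learn's bundled Iris copy and an arbitrary choice of ten seeds. It illustrates the variability and does not reproduce the reported figure.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

scores = []
for seed in range(10):  # ten seeds, chosen arbitrarily for illustration
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    tree = DecisionTreeClassifier(random_state=seed).fit(X_tr, y_tr)
    scores.append(tree.score(X_te, y_te))  # accuracy on the 30 held-out samples

print(f"min={min(scores):.3f} max={max(scores):.3f} "
      f"mean={sum(scores) / len(scores):.3f}")
```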

## Limitations & Bias
- The Iris dataset is not representative of modern, large-scale classification tasks.  
- Results should not be generalized beyond the included species and scenario.