---
library_name: transformers
datasets:
- majeedkazemi/students-coding-questions-from-ai-assistant
language:
- en
base_model:
- Salesforce/codet5-base
---

# Model Card for Model ID

Vilnius University Deep Neural Networks course project.

## Model Details

A transformer-based query classification model.

### Model Description

This model was developed as part of a Deep Neural Networks (DNN) course project at Vilnius University.

It fine-tunes the `Salesforce/codet5-base` model to classify student queries about C programming into five categories: **General Question**, **Question from Code**, **Help Fix Code**, **Help Write Code**, and **Explain Code**.

- **Developed by:** Brigita Bruškytė, Artiom Hovhannisyan, Eglė Orinaitė (Faculty of Mathematics and Informatics, Vilnius University)

## Dataset

- **Size**: 6,776 student queries from a real C programming course.
- **Structure**: JSON entries with the fields `user_id`, `time`, `feature type`, `feature version`, `input question`, `input code`, `input intention`, and `input task description`.
- **Note**: The dataset contains only the student queries, not the AI assistant's responses.
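
As a rough illustration of the entry structure listed above, the sketch below parses one record and joins its text fields into a single classifier input. The record values are invented, and two things are assumptions rather than facts from this card: that `feature type` holds the category label, and that the four `input ...` fields are the ones fed to the model. `TEXT_FIELDS` and `build_input` are hypothetical names.

```python
import json

# Hypothetical record: the field names follow the structure listed above,
# but every value here is invented for illustration.
RAW = """
{
  "user_id": "u001",
  "time": "2023-01-01T12:00:00",
  "feature type": "Help Fix Code",
  "feature version": "v1",
  "input question": "Why does this loop never end?",
  "input code": "while (i < 10) { puts(s); }",
  "input intention": "Print the numbers 0 to 9",
  "input task description": null
}
"""

# Assumption: these are the text-bearing fields worth passing to the model.
TEXT_FIELDS = (
    "input question",
    "input code",
    "input intention",
    "input task description",
)


def build_input(entry: dict) -> str:
    """Join the non-empty text fields into one string, one field per line."""
    parts = []
    for field in TEXT_FIELDS:
        value = entry.get(field)
        if value:  # skip missing, null, and empty fields
            parts.append(f"{field}: {value.strip()}")
    return "\n".join(parts)


entry = json.loads(RAW)
label = entry["feature type"]  # assumption: this field holds the category
text = build_input(entry)
print(label)
print(text)
```

Prefixing each part with its field name exposes which fields are populated, which is exactly the kind of field-based hint a classifier can exploit. Dropping the prefixes or narrowing `TEXT_FIELDS` is one way to limit that.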

## Challenges

- **Class imbalance**: Some categories are much more frequent than others, e.g., “General Question”.
- **Field-based hints**: Some classes have fields unique to them (such as `input task description`), which inadvertently helps classification because the presence of a field hints at the label.
- **Token length**: Some queries, especially those with code snippets, are long enough to hit the transformer's input length limit.
- **Structural inconsistency**: The dataset's description did not always match the actual data.
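
Two of the challenges above have common mitigations, sketched here in plain Python. This is a minimal illustration, not a description of how the model was actually trained. `class_weights` uses the inverse-frequency heuristic that scikit-learn calls "balanced", and `truncate_head_tail` keeps the start and end of an over-long token sequence. Both function names, the toy label counts, and the default lengths are assumptions.

```python
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


def truncate_head_tail(token_ids, max_len=512, head=384):
    """Keep the first `head` and the last `max_len - head` tokens if too long."""
    if len(token_ids) <= max_len:
        return list(token_ids)
    tail = max_len - head
    if tail <= 0:  # nothing reserved for the tail: plain head truncation
        return list(token_ids[:max_len])
    return list(token_ids[:head]) + list(token_ids[-tail:])


# Toy label distribution (invented counts, not the real dataset statistics).
toy_labels = (
    ["General Question"] * 6 + ["Help Write Code"] * 2 + ["Explain Code"] * 2
)
weights = class_weights(toy_labels)
for label, weight in sorted(weights.items()):
    print(f"{label}: {weight:.3f}")

# A 1000-token sequence cut down to 10 tokens: 6 from the head, 4 from the tail.
short = truncate_head_tail(list(range(1000)), max_len=10, head=6)
print(short)
```

Weights like these are typically passed to a weighted cross-entropy loss so that errors on rare classes cost more. Keeping both ends of a long query preserves the question text at the start as well as the end of a code snippet, at the cost of dropping the middle.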

### Per-Category F1 Scores

| Category           | Codet-classy |
|--------------------|--------------|
| Explain Code       | 0.90         |
| General Question   | 0.97         |
| Help Fix Code      | 0.85         |
| Help Write Code    | 0.63         |
| Question from Code | 0.89         |

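
The per-category scores can be condensed into an unweighted (macro) average, as in the short sketch below. This is only arithmetic over the five reported values, not an additional reported result. A support-weighted average would need the per-class example counts, which are not given here.

```python
# Per-category F1 scores copied from the table above.
f1_scores = {
    "Explain Code": 0.90,
    "General Question": 0.97,
    "Help Fix Code": 0.85,
    "Help Write Code": 0.63,
    "Question from Code": 0.89,
}

# Macro F1: the plain mean over categories, so each class counts equally.
macro_f1 = sum(f1_scores.values()) / len(f1_scores)

# The category with the lowest score.
weakest = min(f1_scores, key=f1_scores.get)

print(f"macro F1: {macro_f1:.3f}")
print(f"weakest category: {weakest}")
```

Because the macro average weights every class equally, the low "Help Write Code" score pulls it down more than it would in an average weighted by class frequency.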