---
library_name: transformers
tags:
- code
- bug-fix
- code-generation
- code-repair
- codet5p
- ai
- machine-learning
- deep-learning
- huggingface
- finetuned-model
license: apache-2.0
datasets:
- Girinath11/aiml_code_debug_dataset
metrics:
- bleu
base_model:
- Salesforce/codet5p-220m
---

# Model Card for Girinath11/aiml_code_debug_model

This is a fine-tuned version of the [Salesforce/codet5p-220m](https://huggingface.co/Salesforce/codet5p-220m) model, specialized for real-world AI, ML, and Deep Learning code bug-fix tasks. The model was trained on 150,000 code pairs (buggy → fixed) extracted from GitHub projects in the AI/ML/GenAI ecosystem. It is optimized for suggesting correct fixes for faulty code snippets and is well suited to debugging and auto-correction in AI coding environments.

## Model Details

### Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub.

- **Developed by:** Girinath V
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** Text-to-text Transformer (Encoder-Decoder)
- **Language(s) (NLP):** Programming languages (primarily Python, with some support for other AI/ML languages)
- **License:** Apache 2.0
- **Finetuned from model:** [Salesforce/codet5p-220m](https://huggingface.co/Salesforce/codet5p-220m)

### Model Sources

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

- Fix real-world AI/ML/GenAI Python code bugs.
- Debug model training scripts, data pipelines, and inference code.
- Educational use for learning from code corrections.

### Downstream Use [optional]

- Integration into code review pipelines.
- LLM-enhanced IDE plugins for auto-fixing AI-related bugs.
- Assistant agents in AI-powered coding copilots.

### Out-of-Scope Use

- General-purpose natural language tasks.
- Code generation unrelated to AI/ML domains.
- Use on production code without human review.

## Bias, Risks, and Limitations

### Biases

- The model favors AI/ML/GenAI-related Python patterns.
- It is not trained for full-stack or UI/frontend code debugging.

### Limitations

- May not generalize well outside its fine-tuned domain.
- Struggles with ambiguous or undocumented buggy code.

### Recommendations

- Use alongside human review.
- Combine with static analysis for best results.

## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("Girinath11/aiml_code_debug_model")
model = AutoModelForSeq2SeqLM.from_pretrained("Girinath11/aiml_code_debug_model")

# Prefix the faulty snippet with "buggy: " and let the model generate the corrected code.
inputs = tokenizer("buggy: def add(a,b) return a+b", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details

### Training Data

- 150,000 real-world buggy–fixed Python code pairs.
- Data collected from GitHub AI/ML repositories.
- Preprocessing included data cleaning, formatting, and deduplication (see the illustrative sketch below).
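The exact preprocessing pipeline is not published. As a rough illustration only, buggy → fixed pairs such as those in [Girinath11/aiml_code_debug_dataset](https://huggingface.co/datasets/Girinath11/aiml_code_debug_dataset) could be tokenized for seq2seq fine-tuning along the following lines; the column names (`buggy_code`, `fixed_code`) and the `"buggy: "` prefix are assumptions for the sketch, not a description of the actual training script.

```python
# Illustrative sketch only: the preprocessing actually used for this model is not documented.
# Column names ("buggy_code", "fixed_code") and the "buggy: " prefix are assumptions.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5p-220m")
dataset = load_dataset("Girinath11/aiml_code_debug_dataset")

def preprocess(example):
    # Encoder input: the buggy snippet; decoder target: the fixed snippet.
    model_inputs = tokenizer("buggy: " + example["buggy_code"],
                             max_length=512, truncation=True)
    labels = tokenizer(example["fixed_code"], max_length=512, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, remove_columns=dataset["train"].column_names)
```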
### Training Procedure

#### Preprocessing [optional]

[More Information Needed]

#### Training Hyperparameters

- **Training regime:** [More Information Needed]

#### Speeds, Sizes, Times [optional]

[More Information Needed]

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary

## Model Examination [optional]

[More Information Needed]

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]