---
library_name: transformers
pipeline_tag: text-classification
tags:
- text-classification
- pytorch
- jax
- code_x_glue_cc_defect_detection
- code
- roberta
- security
- vulnerability-detection
- codebert
- apache-2.0
license: apache-2.0
---
# CodeBERT fine-tuned for Java Vulnerability Detection
A CodeBERT model fine-tuned to detect security vulnerabilities in Java code.
## Model Description
This model is fine-tuned from [microsoft/codebert-base](https://huggingface.co/microsoft/codebert-base) for binary classification of secure/insecure Java code.
## Intended Uses
- Detect security vulnerabilities in Java source code
- Binary classification: Safe (LABEL_0) vs Vulnerable (LABEL_1)
## How to Use
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("mangsense/codebert_java")
model = AutoModelForSequenceClassification.from_pretrained("mangsense/codebert_java")

code = "your Java code here"
inputs = tokenizer(code, return_tensors="pt", truncation=True)

# Inference only: no labels needed, no gradients required
with torch.no_grad():
    logits = model(**inputs).logits

prediction = logits.argmax(dim=-1).item()
print("Vulnerable" if prediction == 1 else "Safe")  # LABEL_1 = Vulnerable
```
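Alternatively, the high-level `pipeline` API from Transformers wraps tokenization and classification in one call; a minimal sketch (the model id is the one from this card, and the example snippet is an illustrative SQL-concatenation pattern, not from the training data):

```python
from transformers import pipeline

pipe = pipeline("text-classification", model="mangsense/codebert_java")

# Classify a single Java snippet; returns a list of {label, score} dicts
result = pipe('String query = "SELECT * FROM users WHERE id = " + userId;')
print(result)
```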
## Training Data
Trained on the CodeXGLUE Defect Detection dataset (`code_x_glue_cc_defect_detection`).
## Limitations
- Focused on Java code only
- May not detect all types of vulnerabilities