---
library_name: transformers
license: apache-2.0
datasets:
- TokenBender/code_instructions_122k_alpaca_style
metrics:
- accuracy
language:
- en
widget:
  - text: |
      ### Instruction: write a neural network complete code in Keras 
      ### Input: Use cifar dataset 
      ### Response:
  
    output:
      text: >-
        from keras.datasets import cifar10
        import numpy as np
        from keras.models import Sequential
        from keras.layers import Dense, Dropout, Activation, Flatten
        from keras.layers import Conv2D, MaxPooling2D
        from keras.utils import to_categorical
        from keras.optimizers import Adam
        
        # Load the CIFAR-10 dataset
        (X_train, y_train), (X_test, y_test) = cifar10.load_data()
        
        # Normalize pixel values to be between 0 and 1
        X_train = X_train.astype('float32') / 255
        X_test = X_test.astype('float32') / 255
        
        # Convert class vectors to binary class matrices
        y_train = to_categorical(y_train, num_classes=10)
        y_test = to_categorical(y_test, num_classes=10)
        
        # Create the model
        model = Sequential()
        model.add(Conv2D(32, (3, 3), input_shape=(32, 32, 3)))
        model.add(Activation('relu'))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        
        model.add(Conv2D(64, (3, 3)))
        model.add(Activation('relu'))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        
        model.add(Flatten())
        model.add(Dense(64))
        model.add(Activation('relu'))
        model.add(Dropout(0.5))
        model.add(Dense(10))
        model.add(Activation('softmax'))
        
        # Compile the model
        model.compile(loss='categorical_crossentropy', optimizer=Adam(), metrics=['accuracy'])
        
        # Train the model
        model.fit(X_train, y_train, batch_size=32, epochs=10, validation_split=0.2)
  
pipeline_tag: text-generation
base_model: codellama/CodeLlama-13b-Instruct-hf
---

<p align="center" style="font-size:34px;"><b>Panda-Coder 🐼</b></p>

# Panda Coder-13B vLLM Inference: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1yP-11PWqLrDn5ymKDWMfz9r6jLpTcTAH?usp=sharing)

![1*E4V6iZPaeE6iTZjAleCt3Q.webp](https://cdn-uploads.huggingface.co/production/uploads/630f3058236215d0b7078806/q4k9YbKDW3eypKmOJio5j.webp)

Panda Coder is an LLM fine-tuned to generate code from natural language instructions.

 ## Model description

 🤖 Panda-Coder is a fine-tuned version of `codellama/CodeLlama-13b-Instruct-hf`, designed specifically to generate code from natural language instructions. The goal is to make coding easier and more accessible for everyone.

## Inference 

> Hardware requirements:
>
> 30GB VRAM - A100 Preferred

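As a rough sanity check on the hardware requirement above, the memory needed for the weights alone can be estimated from the parameter count and the numeric precision. The sketch below assumes roughly 13 billion parameters (inferred from the "13B" in the model name) and ignores the KV cache, activations, and framework overhead, so actual usage will be higher. `estimate_weight_gb` is a hypothetical helper written for this illustration, not part of any library.

```python
def estimate_weight_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate size of the model weights in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

# Assumption: ~13 billion parameters, taken from the "13B" in the model name
NUM_PARAMS = 13_000_000_000

for label, bits in [("fp16/bf16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{estimate_weight_gb(NUM_PARAMS, bits):.1f} GB for weights")
```

Under these assumptions, 16-bit weights take about 26 GB, which is consistent with the 30GB recommendation once some headroom is added. 4-bit quantization reduces the weight footprint to roughly a quarter of that.
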
### vLLM - For Faster Inference

#### Installation

```
!pip install vllm
```

#### Implementation

```python
from vllm import LLM, SamplingParams

# Load the model; max_model_len caps the total sequence length (prompt plus output)
llm = LLM(model="aiplanet/panda-coder-13B", gpu_memory_utilization=0.95, max_model_len=4096)

prompts = [
    """### Instruction: Write a Java code to add 15 numbers randomly generated.
### Input: [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
### Response:
""",
    "### Instruction: write a neural network complete code in Keras ### Input: Use cifar dataset ### Response:",
]

# A low temperature keeps the generated code close to deterministic
sampling_params = SamplingParams(temperature=0.1, top_p=0.95, repetition_penalty=1.1, max_tokens=1000)

outputs = llm.generate(prompts, sampling_params)

# Each result holds a list of completions; print the first one for every prompt
for output in outputs:
    generated_text = output.outputs[0].text
    print(generated_text)
    print("\n\n")
```


### Transformers - Basic Implementation

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization to reduce the GPU memory needed for the weights
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

model_id = "aiplanet/panda-coder-13B"

base_model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="cuda")

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

prompt = """### Instruction:
Below is an instruction that describes a task. Write a response that appropriately completes the request.

Write a Python quickstart script to get started with TensorFlow

### Input:

### Response:
"""

input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.cuda()
outputs = base_model.generate(input_ids=input_ids, max_new_tokens=512, do_sample=True, top_p=0.9, temperature=0.1, repetition_penalty=1.1)

# Decode only the newly generated tokens (everything after the prompt tokens)
generated_text = tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True)
print(f"Output:\n{generated_text}")
```

Output

```bash
Output:
import tensorflow as tf

# Create a constant tensor
hello_constant = tf.constant('Hello, World!')

# Print the value of the constant
print(hello_constant)
```

## Prompt Template for Panda Coder 13B

```
### Instruction:
{<add your instruction here>}

### Input:
{<can be empty>}

### Response:
```
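
The template above can be filled in programmatically. The following is a minimal sketch, assuming the section headers are separated by blank lines exactly as shown in the template; `build_prompt` is a hypothetical helper introduced here for illustration and is not shipped with the model.

```python
def build_prompt(instruction: str, input_text: str = "") -> str:
    """Fill the Instruction/Input/Response template; input_text may be empty."""
    return (
        "### Instruction:\n"
        f"{instruction.strip()}\n\n"
        "### Input:\n"
        f"{input_text.strip()}\n\n"
        "### Response:\n"
    )

# Example: an instruction with no additional input
prompt = build_prompt("Write a Python function that reverses a string.")
print(prompt)
```

The returned string ends right after `### Response:` so that the model continues from that point. It can be passed to either inference path shown earlier in place of a hand-written prompt.
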

 ## 🔗 Key Features:

 🌟 NLP-Based Coding: With Panda-Coder, you can transform your plain text instructions into functional code effortlessly. No need to grapple with syntax and semantics - it understands your language.

 🎯 Precision and Efficiency: The model is tuned for accuracy, aiming for code that is not just functional but also efficient.

 ✨ Unleash Creativity: Whether you're a novice or an expert coder, Panda-Coder is here to support your coding journey, offering creative solutions to your programming challenges.

 📚 Evol Instruct Code: It's built on the Evol Instruct Code 80k-v1 dataset, which targets high-quality code generation from instructions.

 📢 What's Next?: We believe in continuous improvement and are excited to announce that in our next release, Panda-Coder will be enhanced with a custom dataset. This dataset will not only expand the language support but also include hardware programming languages like MATLAB, Embedded C, and Verilog. 🧰💡

 ## Get in Touch


 You can schedule a 1:1 meeting with our DevRel & Community Team to get started with AI Planet Open Source LLMs and GenAI Stack. Schedule the call here: [https://calendly.com/jaintarun](https://calendly.com/jaintarun)

 Stay tuned for more updates and be a part of the coding evolution. Join us on this exciting journey as we make AI accessible to all at AI Planet!



 ### Framework versions

- Transformers 4.33.3
- PyTorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3

 ### Citation

 ```
 @misc{lucifertrj,
   author    = {{Tarun Jain}},
   title     = {Panda Coder-13B by AI Planet},
   year      = 2023,
   url       = {https://huggingface.co/aiplanet/panda-coder-13B},
   publisher = {Hugging Face}
 }
 ```