---
license: apache-2.0
language:
- en
pipeline_tag: text2text-generation
library_name: transformers
tags:
- text-generation-inference
widget:
- text: >
    Given a SQL table named 'price_data' with the following columns:

    Transaction_ID, Platform, Product_ID, User_ID, Transaction_Amount

    Construct a SQL query to answer the following question:

    Q: How many rows are there
  example_title: "How many rows are there?"
---
A text-to-SQL T5 model, fine-tuned from Flan-T5-base. Code: [prompt.py](https://github.com/kevinng77/chat-table-t5/blob/master/prompt.py)

Further fine-tuning on text-to-SQL data can substantially improve the performance of the Flan-T5 model on text-to-SQL tasks.

## Inference Example:
```python
from transformers import T5Tokenizer, T5ForConditionalGeneration, pipeline

# Describe the table the question refers to.
table_columns = "Transaction_ID, Platform, Product_ID, User_ID, Transaction_Amount, Region, Transaction_Time, Transaction_Unit, User_Comments"
table_name = "my_data"

# The f-string fills in the table details now; the doubled braces leave a
# literal {question} placeholder that is filled in later.
PROMPT_INPUT = f"""
Given a SQL table named '{table_name}' with the following columns:
{table_columns}

Construct a SQL query to answer the following question:
Q: {{question}}.
"""

model_id = "kevinng77/chat-table-flan-t5"
tokenizer = T5Tokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

input_text = PROMPT_INPUT.format_map({"question": "How many rows are there in the table?"})

pipe = pipeline(
    "text2text-generation",
    model=model, tokenizer=tokenizer, max_length=512
)

# Run the model on the prompt and print the generated SQL query.
result = pipe(input_text)
print(result[0]["generated_text"])
```
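A generated query can be sanity-checked before it is used by running it against a small in-memory SQLite table. The snippet below is a minimal sketch of that idea, not part of the original project. The helper names `build_prompt` and `run_query`, the sample rows, and the hard-coded `generated_sql` string (a stand-in for real model output) are all assumptions made for illustration. The prompt template mirrors the one in the inference example.

```python
import sqlite3

# Same layout as PROMPT_INPUT in the inference example.
PROMPT_TEMPLATE = (
    "\nGiven a SQL table named '{table_name}' with the following columns:\n"
    "{table_columns}\n"
    "\n"
    "Construct a SQL query to answer the following question:\n"
    "Q: {question}.\n"
)


def build_prompt(table_name, columns, question):
    """Fill the prompt template for one table and one question (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(
        table_name=table_name,
        table_columns=", ".join(columns),
        question=question,
    )


def run_query(sql, table_name, columns, rows):
    """Execute `sql` against a throwaway in-memory SQLite table (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    try:
        column_list = ", ".join(f'"{c}"' for c in columns)
        placeholders = ", ".join("?" for _ in columns)
        conn.execute(f'CREATE TABLE "{table_name}" ({column_list})')
        conn.executemany(
            f'INSERT INTO "{table_name}" VALUES ({placeholders})', rows
        )
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


columns = ["Transaction_ID", "Platform", "Product_ID", "User_ID", "Transaction_Amount"]
rows = [(1, "web", 10, 100, 9.99), (2, "app", 11, 101, 19.5)]  # made-up sample data

prompt = build_prompt("my_data", columns, "How many rows are there in the table")

# Assumption: stand-in for the model output, e.g. pipe(prompt)[0]["generated_text"].
generated_sql = "SELECT COUNT(*) FROM my_data"
result = run_query(generated_sql, "my_data", columns, rows)
print(result)
```

Because the table lives only in memory, a malformed or destructive query cannot touch real data. `sqlite3` raises an exception for invalid SQL, which can be caught to reject the generation or retry it.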