---
title: ViVQA-X Vietnamese Visual Question Answering with Explanations
emoji: 🖼️
colorFrom: blue
colorTo: green
sdk: gradio
app_file: app.py
pinned: false
short_description: Interactive Vietnamese VQA demo with explanations.
---

# ViVQA-X: Vietnamese VQA with Explanations Demo

This Hugging Face Space provides a demo for the **ViVQA-X** project. It showcases the `LSTM-Generative` baseline model, which answers questions about an image in Vietnamese and generates a natural-language explanation for its prediction.

This demo is based on the research paper **"An Automated Pipeline for Constructing a Vietnamese VQA-NLE Dataset"**.

## Core Project, Dataset, and Paper

This demo is one part of the larger ViVQA-X project. For full access to the code, dataset, and additional models, please visit the official resources:

* **Main GitHub Repository:** [https://github.com/duongtruongbinh/ViVQA-X](https://github.com/duongtruongbinh/ViVQA-X)
* **Hugging Face Dataset:** [https://huggingface.co/datasets/duongtruongbinh/ViVQA-X](https://huggingface.co/datasets/duongtruongbinh/ViVQA-X)
* **Research Paper:** For complete details on the pipeline and benchmark results, please refer to our paper cited below.

## Features & Model

This demo provides:

* **Vietnamese Visual Question Answering**: Ask a question in Vietnamese about the uploaded image.
* **Explanation Generation**: The model provides a justification for its answer.
* **Confidence Score**: Each answer comes with a confidence score.
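
This README does not state how the confidence score is computed. One common approach, assumed here purely for illustration, is to apply a softmax to the answer classifier's logits and report the probability of the top-ranked answer. The function names, candidate answers, and logit values below are hypothetical and are not taken from the ViVQA-X code.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def answer_with_confidence(logits, answers):
    """Return the top-scoring answer and its softmax probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return answers[best], probs[best]


# Hypothetical candidate answers and logits (assumptions, not model output).
answers = ["có", "không", "hai"]
logits = [2.0, 0.5, -1.0]

answer, confidence = answer_with_confidence(logits, answers)
print(f"{answer} (confidence: {confidence:.1%})")
```

A score obtained this way reflects the model's relative preference among the candidate answers; it is not necessarily a calibrated probability that the answer is correct.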

The model architecture consists of a ResNet-based visual encoder, an LSTM-based question encoder, and an LSTM-based decoder for generating explanations.

## Running Locally

To run this demo on your own machine:

1. Clone the repository for this Space:

   ```bash
   # Replace {your-username} and {your-space-name} with your HF details
   git clone https://huggingface.co/spaces/{your-username}/{your-space-name}
   cd {your-space-name}
   ```

2. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Run the Gradio application:

   ```bash
   python app.py
   ```

   The interface will be available at `http://localhost:7860`.
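
Once `python app.py` is running, a quick way to confirm that the server is reachable is to request its root page. The sketch below uses only the Python standard library; `demo_is_up` is a hypothetical helper name, and the default URL assumes the address given in step 3.

```python
import urllib.error
import urllib.request


def demo_is_up(url="http://localhost:7860", timeout=2.0):
    """Return True if the URL answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


if __name__ == "__main__":
    print("Demo reachable:", demo_is_up())
```

If the check reports `False`, look at the terminal output of `python app.py` for the address the server actually bound to.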

---

## Citation

If you use our dataset, code, or models in your research, please cite our paper:

```bibtex
@misc{vivqax2025,
  author       = {Duong, Truong-Binh and Tran, Hoang-Minh and Le-Nguyen, Binh-Nam and Duong, Dinh-Thang},
  title        = {An Automated Pipeline for Constructing a Vietnamese VQA-NLE Dataset},
  howpublished = {Accepted for publication in the Proceedings of The International Conference on Intelligent Systems \& Networks (ICISN 2025), Springer Lecture Notes in Networks and Systems (LNNS)},
  year         = {2025}
}
```