---
title: ViVQA-X Vietnamese Visual Question Answering with Explanations
emoji: 🖼️
colorFrom: blue
colorTo: green
sdk: gradio
app_file: app.py
pinned: false
short_description: Interactive Vietnamese VQA demo with explanations.
---
# 🤖 ViVQA-X: Vietnamese VQA with Explanations Demo
This Hugging Face Space provides an interactive demo for the **ViVQA-X** project. It showcases the `LSTM-Generative` baseline model, which answers Vietnamese questions about an image and generates a natural-language explanation for each prediction.
This demo is based on the research paper **"An Automated Pipeline for Constructing a Vietnamese VQA-NLE Dataset"**.
## 🔗 Core Project, Dataset, and Paper
This demo is one part of the larger ViVQA-X project. For full access to the code, dataset, and additional models, please visit the official resources:
* **💻 Main GitHub Repository:** [https://github.com/duongtruongbinh/ViVQA-X](https://github.com/duongtruongbinh/ViVQA-X)
* **📚 Hugging Face Dataset:** [https://huggingface.co/datasets/duongtruongbinh/ViVQA-X](https://huggingface.co/datasets/duongtruongbinh/ViVQA-X)
* **📜 Research Paper:** For complete details on the pipeline and benchmark results, please refer to our paper cited below.
## 🚀 Features & Model
This demo provides:
* **Vietnamese Visual Question Answering**: Ask a question in Vietnamese about the uploaded image.
* **Explanation Generation**: The model provides a justification for its answer.
* **Confidence Score**: Each answer comes with a confidence score.
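
The README does not say how the confidence score is computed. A common choice for classification-style VQA heads is the softmax probability of the top-scoring answer; the sketch below illustrates that idea in plain Python. The function names, logits, and answer vocabulary are hypothetical and are not taken from `app.py`.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def answer_with_confidence(logits, answers):
    """Return the top-scoring answer and its softmax probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return answers[best], probs[best]


# Hypothetical logits over a tiny, made-up answer vocabulary.
answers = ["có", "không", "hai"]
answer, confidence = answer_with_confidence([2.0, 0.5, -1.0], answers)
print(f"{answer} ({confidence:.1%})")
```

A score obtained this way reflects how peaked the model's answer distribution is, not a calibrated probability of being correct.
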
The model architecture consists of a ResNet-based visual encoder, an LSTM-based question encoder, and an LSTM-based decoder for generating explanations.
## 🔧 Running Locally
To run this demo on your own machine:
1. Clone the repository for this Space:
```bash
# Replace {your-username} and {your-space-name} with your HF details
git clone https://huggingface.co/spaces/{your-username}/{your-space-name}
cd {your-space-name}
```
2. Install the required dependencies:
```bash
pip install -r requirements.txt
```
3. Run the Gradio application:
```bash
python app.py
```
The interface will be available at `http://localhost:7860`.
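
To confirm from a script that the local server is actually answering, a small standard-library check such as the following may help. It is a sketch only: `is_demo_up` is a hypothetical helper that is not part of this Space, and it assumes the app is serving plain HTTP at the address given above.

```python
import urllib.request


def is_demo_up(url="http://localhost:7860", timeout=2.0):
    """Return True if an HTTP server answers `url` with status 200."""
    # Ignore any configured HTTP proxy: the target is a local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # covers URLError, HTTPError, timeouts, refused connections
        return False


if __name__ == "__main__":
    print("demo reachable:", is_demo_up())
```
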
---
## 📜 Citation
If you use our dataset, code, or models in your research, please cite our paper:
```bibtex
@misc{vivqax2025,
author = {Duong, Truong-Binh and Tran, Hoang-Minh and Le-Nguyen, Binh-Nam and Duong, Dinh-Thang},
title = {An Automated Pipeline for Constructing a Vietnamese VQA-NLE Dataset},
howpublished = {Accepted for publication in the Proceedings of The International Conference on Intelligent Systems \& Networks (ICISN 2025), Springer Lecture Notes in Networks and Systems (LNNS)},
year = {2025}
}
```