	---
	title: Diabetes Assistant - Multilingual!
	emoji: 🔥
	colorFrom: indigo
	colorTo: green
	sdk: gradio
	sdk_version: 5.33.0
	app_file: app.py
	pinned: false
	license: cc-by-4.0
	short_description: Multi-lingual Diabetes chatbot. Responses in text and audio.
	---
	# Project Title: Diabetes Assistant

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/660ec05035d092e3fc20c415/65DL3GyTksWQ6OX23PbMS.png)



	## Objective
	The objective of this project was to showcase what each of us learned about large language models, translation applications, chatbots, Gradio, and Hugging Face.

	## Sources
	- ChatGPT
	- Copilot
	- Hugging Face
	- Gradio
	- OpenAI Whisper (https://openai.com/research/whisper)
	- Langchain (https://www.langchain.com/)
	- Amazon Polly (https://docs.aws.amazon.com/polly/latest/dg/what-is.html)
	- Helsinki-NLP/opus-mt models (https://huggingface.co/Helsinki-NLP)

	## Citations
	This project utilizes models from the OPUS-MT project. We thank Jörg Tiedemann and Santhosh Thottingal for their work:

	- Tiedemann, J., & Thottingal, S. (2020). OPUS-MT – Building open translation services for the World. In Proceedings of the 22nd Annual Conference of the European Association for Machine Translation (pp. 479–480). European Association for Machine Translation. [https://aclanthology.org/2020.eamt-1.61](https://aclanthology.org/2020.eamt-1.61)

	- Tiedemann, J. (2020). The Tatoeba Translation Challenge – Realistic Data Sets for Low Resource and Multilingual MT. In Proceedings of the Fifth Conference on Machine Translation (pp. 1174–1182). Association for Computational Linguistics. [https://aclanthology.org/2020.wmt-1.139](https://aclanthology.org/2020.wmt-1.139)



	## Method

	L3-AI created an assistant that answers your diabetes questions and, when needed, translates the responses into another language.

	1. <b>Transcription:</b> Users could ask a diabetes-related question by recording it with the microphone, uploading an MP3 of the question, or typing it into the Hugging Face application. For recorded or uploaded audio, we used <i>openai/whisper-large</i> to transcribe the speech into text.

	2. <b>LLM:</b> Using <i>WikipediaLoader</i>, we connected a large language model to Wikipedia so that it retrieved information specifically related to the diabetes question.

	3. <b>Chatbot Response and Voice Over:</b> L3-AI added a feature that lets the Hugging Face application speak the LLM's response aloud as well as show it in written form. We used <i>Amazon Polly</i> for text-to-speech.

	4. <b>Translation:</b> <i>Helsinki-NLP</i> OPUS-MT models were used to translate the response from the LLM.

	5. <b>Gradio:</b> L3-AI used Gradio to organize the inputs and outputs of the four models (transcription, LLM, text-to-speech, and translation) in a single interface.

	6. <b>Hugging Face:</b> Finally, L3-AI deployed the application to Hugging Face Spaces for speed and for production hosting.
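
The six steps above can be sketched as one function that chains the stages. This is a minimal sketch, not the project's actual `app.py`: the `Stages` container, `run_assistant`, and the `make_*` factory names are hypothetical, and passing each stage in as a callable is an assumption made so the flow can be tried with stand-ins. The library calls inside the factories (`transformers.pipeline`, LangChain's `WikipediaLoader`, and the `boto3` Polly client) are real, but the model pairings, the `Joanna` voice, and the `max_docs` value are illustrative choices. The LLM call itself is left to the caller because the README does not name the model.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Stages:
    """Hypothetical container: one plain callable per pipeline stage."""
    transcribe: Callable[[str], str]      # audio file path -> question text
    answer: Callable[[str], str]          # question text -> English answer
    translate: Callable[[str, str], str]  # (text, target language code) -> text
    speak: Callable[[str], bytes]         # text -> MP3 bytes


def run_assistant(stages: Stages, text: Optional[str] = None,
                  audio_path: Optional[str] = None,
                  target_lang: str = "en") -> Tuple[str, bytes]:
    """Transcribe if needed, answer, translate if requested, then synthesize."""
    if not text and not audio_path:
        raise ValueError("provide a typed question or an audio file")
    question = text if text else stages.transcribe(audio_path)
    reply = stages.answer(question)
    if target_lang != "en":
        reply = stages.translate(reply, target_lang)
    return reply, stages.speak(reply)


def make_whisper_transcriber() -> Callable[[str], str]:
    """Speech-to-text with openai/whisper-large (needs transformers and ffmpeg)."""
    from transformers import pipeline  # imported lazily; heavy dependency
    asr = pipeline("automatic-speech-recognition", model="openai/whisper-large")
    return lambda path: asr(path)["text"]


def fetch_wikipedia_context(question: str, max_docs: int = 2) -> str:
    """Load Wikipedia pages matching the question to use as LLM context."""
    from langchain_community.document_loaders import WikipediaLoader
    docs = WikipediaLoader(query=question, load_max_docs=max_docs).load()
    return "\n\n".join(doc.page_content for doc in docs)


def make_opus_mt_translator() -> Callable[[str, str], str]:
    """English-to-target translation; assumes an opus-mt-en-<lang> model exists."""
    from transformers import pipeline
    loaded = {}  # one translation pipeline per target language

    def translate(text: str, target_lang: str) -> str:
        if target_lang not in loaded:
            loaded[target_lang] = pipeline(
                "translation", model=f"Helsinki-NLP/opus-mt-en-{target_lang}")
        return loaded[target_lang](text)[0]["translation_text"]

    return translate


def make_polly_speaker(voice_id: str = "Joanna") -> Callable[[str], bytes]:
    """Text-to-speech with Amazon Polly (needs boto3 and AWS credentials).

    Joanna is an English voice; translated text needs a voice for that language.
    """
    import boto3
    polly = boto3.client("polly")

    def speak(text: str) -> bytes:
        result = polly.synthesize_speech(
            Text=text, OutputFormat="mp3", VoiceId=voice_id)
        return result["AudioStream"].read()

    return speak
```

Because every stage is injected, the flow can be exercised with simple stand-in functions before any model is downloaded or any AWS credentials are configured.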


	## Interface

	https://huggingface.co/spaces/L3-AI/diabetes_assistant

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/6604cd9fda664781b225e0b6/jP1tHn0iF6NVWChSTxcr9.png)
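
An interface with voice, upload, and text inputs plus text and audio outputs could be wired up in Gradio roughly as follows. This is a sketch under assumptions rather than the Space's actual layout: `respond` is a placeholder handler that only echoes the question, `build_demo` is a hypothetical name, and the language choices in the dropdown are examples, since the README does not list the supported languages.

```python
def respond(audio_path, typed_question, language):
    """Placeholder handler: returns (answer text, path to spoken answer or None)."""
    question = (typed_question or "").strip()
    if not question and audio_path:
        # A real handler would transcribe the audio file here.
        question = f"(audio question from {audio_path})"
    if not question:
        return "Please type a question or record one.", None
    # A real handler would call the LLM, translate, and synthesize speech.
    return f"[{language}] You asked: {question}", None


def build_demo():
    """Build the Gradio interface (gradio is imported lazily)."""
    import gradio as gr
    return gr.Interface(
        fn=respond,
        inputs=[
            gr.Audio(sources=["microphone", "upload"], type="filepath",
                     label="Ask by voice"),
            gr.Textbox(label="Or type your question"),
            gr.Dropdown(choices=["English", "Spanish", "French"],
                        value="English", label="Response language"),
        ],
        outputs=[gr.Textbox(label="Answer"), gr.Audio(label="Spoken answer")],
        title="Diabetes Assistant",
    )

# To serve the interface locally: build_demo().launch()
```

With `type="filepath"`, Gradio passes the recorded or uploaded audio to the handler as a path on disk, which is the form a transcription step would typically expect.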

	## Learnings
	Natural Language Processing (NLP):
	- Gained insights into NLP techniques and methodologies used for building our conversational agent.
	- Learned about tokenization, language modeling, and how to improve speed within our chatbot development.

	Model Selection and Evaluation:
	- Evaluated the different models, such as the LLM and Polly, for how well they generate human-like responses.
	- Compared model capabilities, including coherence, fluency, and ability to stay on topic.
	- Came to understand the strengths and limitations of each model in different conversational contexts.

	Fine-tuning:
	- Addressed issues such as speed and translation accuracy by fine-tuning model parameters and configurations.
	- Implemented strategies to mitigate challenges such as text truncation and limited language support, improving the overall user experience.
	- Iterated on model architecture, hyperparameters, and data preprocessing techniques to reach the desired outcomes and user satisfaction.

	Hugging Face:
	- Learned the importance of a comprehensive requirements file that lists the dependencies, libraries, and configurations needed for Hugging Face model integration.
	- Learned to avoid relying on Jupyter notebooks for production-level deployment because of their limitations in scalability, version control, and reproducibility.

	Streamlit vs. Gradio:
	- Recognized Streamlit's appeal for deployment, particularly its visual polish and user interface elements.
	- Chose Gradio for deployment because it fit the core functionality and focus of our model; we prioritized model performance and functionality over visual aesthetics.


	## Opportunities and Next Steps
	For the L3-AI concept design we focused on diabetes; in future work, expanding to other disease states would build on what we started.
	API restrictions limited which sources we could pull information from.


	## Credits
	We would like to thank our pets, who kept us company as we coded and built this application.