PIPELINE:
1. Pick a language
2. Provide input as an image, text, or audio
3. Pass the input to a chatbot, which explains the words and sentence structures
4. Parse the input into words and create a flashcard for each word
5. Show the flashcards and offer the option to add them to Anki
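
Steps 4 and 5 could be sketched in Python roughly as follows. This is a minimal sketch under assumptions, not a settled design: `Flashcard`, `extract_words`, `make_flashcards`, and `to_anki_tsv` are hypothetical names, the regex word split only suits space-delimited languages, and the `explain` callback stands in for the chatbot call. The export assumes Anki's text-file import, which accepts tab-separated fields.

```python
import csv
import io
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Flashcard:
    front: str  # word in the language being studied
    back: str   # explanation (would come from the chatbot)


def extract_words(text: str) -> List[str]:
    """Return unique lowercase words in order of first appearance."""
    seen = set()
    words = []
    # [^\W\d_]+ matches runs of Unicode letters (no digits, no underscore).
    for word in re.findall(r"[^\W\d_]+", text.lower()):
        if word not in seen:
            seen.add(word)
            words.append(word)
    return words


def make_flashcards(text: str, explain: Callable[[str], str]) -> List[Flashcard]:
    """Create one flashcard per unique word; `explain` is a placeholder."""
    return [Flashcard(front=w, back=explain(w)) for w in extract_words(text)]


def to_anki_tsv(cards: List[Flashcard]) -> str:
    """Serialize cards as tab-separated lines (front, back)."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for card in cards:
        writer.writerow([card.front, card.back])
    return buf.getvalue()


if __name__ == "__main__":
    # Placeholder explanation; a real version would query the chatbot.
    demo = make_flashcards("El gato come. El gato duerme.", lambda w: f"meaning of {w}")
    print(to_anki_tsv(demo))
```

Keeping the explanation step behind a callback means the flashcard logic can be tested without calling any model.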

TASKS NEEDED:
1. Image to text (OCR)
  - GOT-OCR (716M parameters)
2. Text to text (chatbot)
  - ChatGPT (GPT-4o)
3. Audio to text (speech recognition)
  - Whisper
4. Chatbot to explain
  - ChatGPT (GPT-4o)
5. Text to speech
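
One way to connect the three input types to the tasks above is a small dispatch table that normalizes everything to text before the chatbot step. This is a sketch only: `to_text`, `DEFAULT_HANDLERS`, and `_not_wired` are hypothetical names, and the image and audio handlers are deliberate stubs, since the actual GOT-OCR and Whisper calls are not specified here.

```python
from typing import Callable, Dict, Optional

Handler = Callable[[str], str]


def _not_wired(task: str) -> Handler:
    """Build a stub handler that fails loudly until a real model is plugged in."""
    def handler(_payload: str) -> str:
        raise NotImplementedError(f"{task} is not wired up in this sketch")
    return handler


# Maps each input kind to the task that turns it into text.
DEFAULT_HANDLERS: Dict[str, Handler] = {
    "image": _not_wired("OCR (e.g. GOT-OCR)"),
    "audio": _not_wired("speech-to-text (e.g. Whisper)"),
    "text": lambda payload: payload,  # text needs no conversion
}


def to_text(kind: str, payload: str,
            handlers: Optional[Dict[str, Handler]] = None) -> str:
    """Convert an input of the given kind to plain text."""
    table = DEFAULT_HANDLERS if handlers is None else handlers
    if kind not in table:
        raise ValueError(f"unsupported input kind: {kind!r}")
    return table[kind](payload)
```

Passing a custom `handlers` dict allows real models, or fakes for testing, to be swapped in without changing the rest of the pipeline.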