Generate captions from images
Generate text responses from image and text input
Generate captions for images