File type, then pass it alongside text in a list:
Image input requires a vision-capable model. Check Model Capabilities to see which models support it.
Key Features
- Automatic Processing: Images are automatically converted to the format the target vision model expects
- Multi-modal: Combine text and images in the same conversation
- File Support: Works with local files, URLs, and base64 data
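
To make the features above concrete, here is a minimal sketch of how local files, URLs, and base64 data could be normalized into one representation and then combined with text in a list. It is not this library's actual implementation: the `to_image_part` helper, the dictionary layout, and the example URL are all hypothetical, and only the Python standard library is used.

```python
import base64
import mimetypes
from pathlib import Path
from typing import Optional, Union


def to_image_part(source: Union[str, bytes], mime_type: Optional[str] = None) -> dict:
    """Normalize an image source into a single (hypothetical) message part.

    Accepts raw bytes, an http(s) URL, an existing data: URL, or a local path.
    """
    if isinstance(source, bytes):
        # Raw bytes: encode as a base64 data URL (PNG assumed if no type is given).
        data = base64.b64encode(source).decode("ascii")
        return {"type": "image", "url": f"data:{mime_type or 'image/png'};base64,{data}"}

    if source.startswith(("http://", "https://", "data:")):
        # Remote URLs and ready-made data URLs are passed through unchanged.
        return {"type": "image", "url": source}

    # Anything else is treated as a local file path.
    path = Path(source)
    guessed, _ = mimetypes.guess_type(path.name)
    data = base64.b64encode(path.read_bytes()).decode("ascii")
    mime = mime_type or guessed or "application/octet-stream"
    return {"type": "image", "url": f"data:{mime};base64,{data}"}


# Text and image side by side in one list (the URL is a placeholder).
message = [
    "Describe what you see in this image.",
    to_image_part("https://example.com/cat.png"),
]
```

Routing every source through one helper keeps the calling code identical regardless of where the image comes from, which is the same idea the Automatic Processing and File Support features describe.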