Task names follow the Hugging Face pipeline naming convention.

## Tasks

| Task | Supports Streaming | Supported Endpoints |
|---|---|---|
| audio-classification | No | /v1/classifications/audio |
| automatic-speech-recognition | Yes | /v1/audio/transcriptions |
| conversational | Yes | /v1/chat/completions, /v1/completions |
| depth-estimation | No | /v1/images/depth-estimation |
| document-question-answering | Yes | /v1/question-answering/document |
| feature-extraction | No | /v1/feature-extraction |
| fill-mask | No | /v1/fill-mask |
| image-classification | No | /v1/classifications/image |
| image-feature-extraction | No | /v1/images/feature-extraction |
| image-segmentation | No | /v1/images/segmentation |
| text-generation (decoder-only models) | Yes | /v1/chat/completions, /v1/completions |
| image-text-to-text | Yes | /v1/chat/completions, /v1/completions |
| audio-text-to-text | Yes | /v1/chat/completions, /v1/completions |
| video-text-to-text | Yes | /v1/chat/completions, /v1/completions |
| image-to-image | No | /v1/images/edits, /v1/images/generations |
| image-text-to-image | No | /v1/images/edits, /v1/images/generations |
| image-text-to-video | No | /v1/videos |
| unconditional-image-generation | No | /v1/images/generations |
| image-to-text | No | /v1/images/to-text |
| mask-generation | No | /v1/images/mask-generation |
| object-detection | No | /v1/images/object-detection |
| question-answering | No | /v1/question-answering |
| summarization | Yes | /v1/completions |
| sentence-similarity | No | /v1/embeddings |
| sentence-embeddings | No | /v1/embeddings |
| text-ranking | No | /v1/rerank |
| table-question-answering | No | /v1/question-answering/table |
| text2text-generation (encoder-decoder models) | Yes | /v1/completions |
| text-classification | No | /v1/classifications/text, /v1/classifications/zero-shot |
| text-to-audio | Yes | /v1/audio/speech |
| text-to-image | No | /v1/images/generations |
| text-to-speech | Yes | /v1/audio/speech |
| token-classification | No | /v1/classifications/token, /v1/classifications/zero-shot/token |
| translation | Yes | /v1/completions |
| video-classification | No | /v1/classifications/video |
| image-to-video | No | /v1/videos |
| text-to-video | No | /v1/videos |
| visual-question-answering | No | /v1/question-answering/visual |
| zero-shot-classification | No | /v1/classifications/zero-shot |
| zero-shot-image-classification | No | /v1/classifications/zero-shot/image |
| zero-shot-audio-classification | No | /v1/classifications/zero-shot/audio |
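
The table can also be consulted programmatically. The following is a minimal sketch, assuming a hand-copied subset of the rows above; `TaskInfo`, `TASKS`, `endpoints_for`, `tasks_for`, and `supports_streaming` are hypothetical names introduced for illustration and are not part of any documented API.

```python
from typing import NamedTuple


class TaskInfo(NamedTuple):
    """One table row: streaming support plus the endpoints listed for the task."""
    streaming: bool
    endpoints: tuple[str, ...]


# Hypothetical lookup table: a hand-copied subset of the rows above.
# Extend it with the remaining rows as needed.
TASKS: dict[str, TaskInfo] = {
    "automatic-speech-recognition": TaskInfo(True, ("/v1/audio/transcriptions",)),
    "text-generation": TaskInfo(True, ("/v1/chat/completions", "/v1/completions")),
    "summarization": TaskInfo(True, ("/v1/completions",)),
    "sentence-similarity": TaskInfo(False, ("/v1/embeddings",)),
    "text-ranking": TaskInfo(False, ("/v1/rerank",)),
    "text-classification": TaskInfo(
        False, ("/v1/classifications/text", "/v1/classifications/zero-shot")
    ),
}


def endpoints_for(task: str) -> tuple[str, ...]:
    """Return the endpoints listed for a task; raises KeyError if it is unknown."""
    return TASKS[task].endpoints


def tasks_for(endpoint: str) -> list[str]:
    """Return every task in TASKS that lists the given endpoint, sorted by name."""
    return sorted(name for name, info in TASKS.items() if endpoint in info.endpoints)


def supports_streaming(task: str) -> bool:
    """Return the 'Supports Streaming' column for a task."""
    return TASKS[task].streaming


if __name__ == "__main__":
    print(endpoints_for("text-ranking"))
    print(tasks_for("/v1/completions"))
    print(supports_streaming("summarization"))
```

Keeping the endpoints as a tuple preserves the order in which the table lists them, which is convenient when a task maps to more than one endpoint.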

