Model catalog

Deploy curated AI models in minutes

Launch world-class LLM, vision, audio, and generative models on dedicated GPUs. Every preset includes sensible defaults, autoscaling, and real-time analytics.

Latency (median)

220 ms

Measured across production workloads

Model refreshes

Weekly

Automatic updates with rollback support

Dedicated support

Specialist team

Fine-tuning, distillation, evals, and more

LLM models

Optimized presets with opinionated defaults so you can deploy without guesswork.

1 model

Llama3-8B

LLM

Fast and efficient large language model for chat and text generation

Available GPU Tiers:

EFG-S, EFG-M

Image models

3 models

Stable Diffusion 3.5 Flash

Image

Ultra-fast image generation with high quality output

Available GPU Tiers:

EFG-S, EFG-M

FLUX.1-dev

Image

State-of-the-art image generation model with exceptional detail

Available GPU Tiers:

EFG-M, EFG-L

Stable Video Diffusion

Image

Generate smooth video sequences from text or images

Available GPU Tiers:

EFG-L

Audio models

1 model

Whisper-v3

Audio

Advanced speech recognition and transcription model

Available GPU Tiers:

EFG-S

Vision models

1 model

Llama 3.2 Vision

Vision

Multimodal model for image understanding and analysis

Available GPU Tiers:

EFG-M, EFG-L

Need a proprietary or fine-tuned model?

We help teams productionize custom weights, manage fine-tuning pipelines, and deliver thousands of inferences per second without infrastructure overhead.

Book a consult

Pricing overview