
A speech-to-text model that uses GPT-4o to transcribe audio
Quick read · CRAISEE · Jul 3, 2026 · reference


OpenAI's state-of-the-art image generation model, excelling at prompt adherence, crisp text rendering, and precise editing capabilities.

A complete guide to Seedance 2.0: ByteDance's multimodal AI video model — covering architecture, core features, and practical prompts all in one place.

Alibaba's Happy Horse 1.1 generates videos from text, animates a single image, or builds a video from multiple reference images. Supports 720p and 1080p, 3-15 second durations, and five aspect ratios.

Kling v3 (Video 3.0) by Kuaishou: native 4K, 60fps, multi-shot cuts, multilingual audio. Full wiki covering specs, pricing, prompts & comparisons with Runway Gen-4 and Veo 3.1.