Easily build a production-ready Viral Clip Finder in days, not months
Create powerful media processing pipelines with just one API call. Our open-source platform makes it simple to build and scale your workflows.
What is MediaToad?
MediaToad is an open-source, enterprise-grade media processing platform. It emphasizes developer experience, scalability, and maintainability, and exposes its media processing capabilities through a unified API.
npm install @mediatoad/sdk
.transcription()
Transcribe your audio and video files with high accuracy and speed.
const mt = new MediaToad("API_KEY");

const result = await mt.createJob({
  jobId: 'transcription-job-mov_bbb', // Unique identifier for the job
  assets: [
    {
      name: 'mov_bbb.mp4', // Name of the media file
      url: 'https://www.w3schools.com/html/mov_bbb.mp4', // URL to fetch the media file
    },
  ],
  tasks: [
    {
      id: 'transcription-task-xzYGa6N1F9nyGEubqvXH9', // Unique identifier for the transcription task
      operation: 'transcription', // Task type: "transcription"
      asset: 'mov_bbb.mp4', // Asset picked from "assets" to transcribe
      provider: 'assemblyai', // Third-party transcription provider
      language: 'hi', // Language of the transcription
      notify: {
        url: 'https://webhook.site/<task-notification-url>', // Webhook URL to receive task status updates
      },
      apiKey: process.env.ASSEMBLYAI_API_KEY, // API key for the transcription provider (should be stored securely)
    },
  ],
  storage: {
    bucket: 'my-bucket', // Name of the storage bucket where results will be saved
    base: 'transcripts/', // Folder path in the bucket for storing transcripts
    storageType: 's3', // Storage type: 's3' for Amazon S3 or 'az' for Azure Blob Storage
  },
  notify: {
    url: 'https://webhook.site/<job-notification-url>', // Webhook to receive job-level status updates
  },
});
Key Features
Multi-Provider Support
AWS, Azure, AssemblyAI, Deepgram, Google and more
Fallback System
Automatic provider fallback for high reliability
Parallel Processing
Run multiple tasks simultaneously
Webhook Notifications
Real-time updates on job progress
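The fallback behavior listed above can be pictured roughly as follows. This is a minimal sketch of the general pattern, not MediaToad's actual implementation: the `runWithFallback` helper and the `{ name, run }` provider objects are hypothetical names invented for illustration.

```javascript
// Hypothetical sketch: try each provider in order until one succeeds.
// Provider objects are plain { name, run } pairs invented for this example.
async function runWithFallback(providers, input) {
  const errors = [];
  for (const provider of providers) {
    try {
      const result = await provider.run(input);
      return { provider: provider.name, result };
    } catch (err) {
      // Record the failure and move on to the next provider.
      errors.push(`${provider.name}: ${err.message}`);
    }
  }
  throw new Error(`All providers failed (${errors.join('; ')})`);
}
```

Ordering the list by preference keeps the primary provider in use whenever it is healthy, and the aggregated error message shows why each provider was skipped.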
.segmentation()
Automatically detect and segment video content into meaningful scenes.
const mt = new Mediatoad('API_KEY')
const data = await mt.transcribe('url', { provider: 'deepgram' })
const { speakers, transcription } = data
Key Features
Smart Scene Detection
AI-powered scene boundary detection
Custom Duration
Configurable minimum segment length
Keyframe Extraction
Automatic thumbnail generation
Metadata Export
Detailed segment information
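To illustrate what a configurable minimum segment length might mean in practice, here is a small sketch that filters detected scene boundaries. It assumes boundaries arrive as timestamps in seconds; the `enforceMinDuration` function and its rules are assumptions for illustration, not the platform's documented algorithm.

```javascript
// Hypothetical sketch: drop scene boundaries that would create a segment
// shorter than minDuration (all values in seconds).
function enforceMinDuration(boundaries, totalDuration, minDuration) {
  const kept = [];
  let lastCut = 0;
  for (const t of [...boundaries].sort((a, b) => a - b)) {
    // Keep a cut only if the segment before it and the remainder are both long enough.
    if (t - lastCut >= minDuration && totalDuration - t >= minDuration) {
      kept.push(t);
      lastCut = t;
    }
  }
  // Turn the surviving cut points into { start, end } segments.
  const points = [0, ...kept, totalDuration];
  return points.slice(0, -1).map((start, i) => ({ start, end: points[i + 1] }));
}
```

With boundaries `[1, 5, 5.5, 12, 19.5]`, a 20-second video, and a 2-second minimum, only the cuts at 5 and 12 survive, giving three segments.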
.findViralClips()
Automatically identify potentially viral moments in your video content.
Key Features
AI Analysis
ML-powered moment detection
Virality Score
Engagement potential prediction
Auto-Clip Generation
Ready-to-share viral moments
Platform Optimization
Format for different social platforms
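Once each candidate moment has a score, turning scores into clips is largely a selection problem. The sketch below shows one simple way to do it, assuming candidates shaped as `{ start, end, score }`; the shape and the `pickTopClips` helper are hypothetical, and how the virality score itself is computed is not covered here.

```javascript
// Hypothetical sketch: pick the highest-scoring moments that do not overlap in time.
// Each candidate is { start, end, score } with times in seconds.
function pickTopClips(candidates, maxClips) {
  const chosen = [];
  const byScore = [...candidates].sort((a, b) => b.score - a.score);
  for (const clip of byScore) {
    if (chosen.length >= maxClips) break;
    const overlaps = chosen.some((c) => clip.start < c.end && c.start < clip.end);
    if (!overlaps) chosen.push(clip);
  }
  // Return the picks in playback order.
  return chosen.sort((a, b) => a.start - b.start);
}
```

The greedy pass favors the strongest moments first, so a lower-scoring candidate is dropped whenever it overlaps a better one.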
.translation()
Translate content across multiple languages with high accuracy.
Key Features
100+ Languages
Support for major global languages
Context-Aware
Maintains context and nuance in translations
Batch Translation
Process multiple files simultaneously
Quality Metrics
Confidence scores for translations
.clip()
Extract and create clips from your video content with frame-perfect precision.
Key Features
Frame-Perfect Cutting
Precise timecode-based extraction
Batch Clipping
Create multiple clips in one job
Format Conversion
Export to various formats and resolutions
Preview Generation
Automated thumbnail creation
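Timecode-based extraction ultimately comes down to mapping a timecode to an exact frame. The following sketch shows that arithmetic for non-drop-frame `HH:MM:SS:FF` timecodes at an integer frame rate. The function names are illustrative and are not part of the MediaToad SDK.

```javascript
// Sketch: convert a non-drop-frame "HH:MM:SS:FF" timecode to an absolute frame index.
// Assumes an integer frame rate; drop-frame timecode (29.97 fps) needs extra handling.
function timecodeToFrame(timecode, fps) {
  const parts = timecode.split(':').map(Number);
  if (parts.length !== 4 || parts.some((n) => !Number.isInteger(n) || n < 0)) {
    throw new Error(`Invalid timecode: ${timecode}`);
  }
  const [hours, minutes, seconds, frames] = parts;
  if (frames >= fps) throw new Error(`Frame field must be below ${fps}`);
  return ((hours * 60 + minutes) * 60 + seconds) * fps + frames;
}

// Convert a frame index back to seconds, for example to use as a seek offset.
function frameToSeconds(frame, fps) {
  return frame / fps;
}
```

For example, `timecodeToFrame('00:01:00:12', 25)` is frame 1512, which `frameToSeconds` maps to 60.48 seconds.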
.dub()
Automatically generate voiceovers for your videos in multiple languages.
Key Features
Multi-Language Support
Support for 40+ languages and dialects
Multiple Voice Options
Various voice profiles per language
Audio Synchronization
Automatic lip-sync adjustment
Background Preservation
Maintains original music and effects
.textToSpeech()
Convert text to natural-sounding speech with customizable voices.
Key Features
Natural-Sounding Voices
High-quality, lifelike speech synthesis
Multi-Language Support
Generate speech in 30+ languages
Voice Customization
Adjust pitch, speed, and emphasis
Multiple Audio Formats
Export as MP3, WAV, or OGG
.generateVideo()
Create professional videos programmatically from templates and content.
Key Features
Template Library
Pre-designed professional templates
Dynamic Content
Automatically populate with your data
Custom Branding
Add logos, fonts, and color schemes
Batch Generation
Create multiple videos at scale
.faceDetection()
Identify and track objects, faces, and people in your video content with high accuracy.
Key Features
Multi-Object Detection
Identify faces, people, objects and custom classes
Temporal Tracking
Track objects across video frames
Custom Models
Support for specialized detection models
Detailed Metadata
Bounding boxes, timestamps, and confidence scores
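Tracking objects across frames is commonly built on bounding-box overlap. As a rough illustration, the sketch below links detections between two consecutive frames using intersection-over-union (IoU). The `{ x, y, width, height }` box shape, the `trackId` field, and the 0.5 threshold are assumptions, and the matching is a simple greedy pass rather than a full tracker.

```javascript
// Sketch: intersection-over-union of two boxes given as { x, y, width, height }.
function iou(a, b) {
  const x1 = Math.max(a.x, b.x);
  const y1 = Math.max(a.y, b.y);
  const x2 = Math.min(a.x + a.width, b.x + b.width);
  const y2 = Math.min(a.y + a.height, b.y + b.height);
  const intersection = Math.max(0, x2 - x1) * Math.max(0, y2 - y1);
  const union = a.width * a.height + b.width * b.height - intersection;
  return union > 0 ? intersection / union : 0;
}

// Link each detection in the current frame to the best-overlapping previous one.
// Detections with no sufficiently overlapping predecessor get trackId: null.
function matchDetections(previous, current, threshold = 0.5) {
  return current.map((det) => {
    let best = null;
    let bestScore = threshold;
    for (const prev of previous) {
      const score = iou(prev.box, det.box);
      if (score >= bestScore) {
        best = prev;
        bestScore = score;
      }
    }
    return { ...det, trackId: best ? best.trackId : null };
  });
}
```

A production tracker would also enforce one-to-one assignments and handle objects that disappear for a few frames; this version only shows the core overlap test.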
Powerful Media Processing Features
MediaToad provides a comprehensive suite of tools for handling all your media processing needs.
Workflow Orchestration
Powered by Temporal for fault-tolerant job execution with automatic retries
- Complex multi-step operations
- State tracking and resumption
- Scalable processing architecture
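For readers unfamiliar with automatic retries, the idea can be sketched in a few lines of plain JavaScript. This is an illustration of the retry-with-backoff pattern only; it is not Temporal's API, and the `withRetries` helper and its option names are hypothetical.

```javascript
// Sketch: retry an async step with exponential backoff.
// This is a plain-JavaScript illustration, not Temporal's API.
async function withRetries(step, { maxAttempts = 3, baseDelayMs = 100 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await step(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // Double the wait after each failed attempt: base, 2x base, 4x base, ...
        const delay = baseDelayMs * 2 ** (attempt - 1);
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

A workflow engine adds durable state on top of this, so a retry can resume after a process restart instead of starting over.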
Media Processing Engine
Built on FFmpeg with an intuitive API layer
- Video/audio encoding and transformations
- Scene detection and segmentation
- Thumbnail generation and metadata extraction
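To give a sense of what an API layer over FFmpeg has to produce, here is a sketch that builds FFmpeg argument lists for two of the operations above. The helper names are hypothetical and the platform's real invocations are not documented here; the flags themselves (`-ss`, `-i`, `-t`, `-c copy`, `-frames:v`) are standard FFmpeg options.

```javascript
// Sketch: build FFmpeg argument lists for two common operations.
// Helper names are hypothetical; the flags are standard FFmpeg options.
function clipArgs(input, output, startSeconds, durationSeconds) {
  return [
    '-ss', String(startSeconds),   // seek to the start position
    '-i', input,
    '-t', String(durationSeconds), // limit the output duration
    '-c', 'copy',                  // copy streams without re-encoding
    output,
  ];
}

function thumbnailArgs(input, output, atSeconds) {
  // Seek, then write a single video frame as an image.
  return ['-ss', String(atSeconds), '-i', input, '-frames:v', '1', output];
}
```

The arrays can be passed to Node's `child_process.spawn('ffmpeg', args)`. Note that stream copy cuts on keyframes, so a frame-accurate clip generally requires re-encoding instead of `-c copy`.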
Cloud-Agnostic Storage
Works with AWS S3, Azure Blob Storage, MinIO, and more
- No vendor lock-in
- Optimized for large-file handling
- Seamless media uploads/downloads
AI & Automation
Integrated AI-driven video analysis and processing
- Speech-to-text transcription
- Object/person detection
- Content moderation
Job Tracking & Notifications
Real-time monitoring and integration capabilities
- Webhook notifications
- Server-Sent Events (SSE)
- Automatic retries and failure handling
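On the client side, consuming Server-Sent Events means parsing a simple text format. The sketch below parses a complete chunk of SSE text into events. It assumes the chunk ends on an event boundary, and the `status` event name and JSON payload in the usage note are hypothetical, since the event names MediaToad emits are not specified here.

```javascript
// Sketch: parse a chunk of Server-Sent Events text into { event, data } objects.
// Field handling follows the SSE wire format; the payload shape is up to the server.
function parseSseEvents(text) {
  const events = [];
  for (const block of text.split(/\r?\n\r?\n/)) {
    let event = 'message'; // default event type in the SSE format
    const dataLines = [];
    for (const line of block.split(/\r?\n/)) {
      if (line === '' || line.startsWith(':')) continue; // skip blanks and comments
      const colon = line.indexOf(':');
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? '' : line.slice(colon + 1);
      if (value.startsWith(' ')) value = value.slice(1); // strip one leading space
      if (field === 'event') event = value;
      if (field === 'data') dataLines.push(value);
    }
    if (dataLines.length > 0) events.push({ event, data: dataLines.join('\n') });
  }
  return events;
}
```

For instance, the text `event: status\ndata: {"state":"running"}\n\n` yields one event named `status` whose `data` string can then be passed to `JSON.parse`. In a browser, the built-in `EventSource` handles this parsing for you.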
Developer-Friendly API
Simple JSON-based job definitions
- Intuitive job configuration
- Comprehensive documentation
- Flexible integration options
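Because jobs are plain JSON, they are easy to sanity-check before submission. The sketch below runs a few consistency checks on a job definition. The field names (`jobId`, `assets`, `tasks`, `asset`) mirror the transcription example earlier on this page, but the checks themselves and the `validateJob` helper are assumptions, not rules enforced by the API.

```javascript
// Sketch: basic consistency checks on a job definition before submitting it.
// Field names mirror the transcription example; the checks are assumptions.
function validateJob(job) {
  const problems = [];
  if (!job.jobId) problems.push('jobId is required');
  const assetNames = new Set((job.assets ?? []).map((a) => a.name));
  const taskIds = new Set();
  for (const task of job.tasks ?? []) {
    if (taskIds.has(task.id)) problems.push(`duplicate task id: ${task.id}`);
    taskIds.add(task.id);
    // Every task should point at an asset declared in the same job.
    if (!assetNames.has(task.asset)) {
      problems.push(`task ${task.id} references unknown asset: ${task.asset}`);
    }
  }
  return problems;
}
```

An empty array means the job passed these checks. Catching a mistyped asset name locally is cheaper than waiting for a failed job notification.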
Roadmap & Changelog
Upcoming Features
Text-to-Speech (TTS)
in-progress: AI-driven voice synthesis from transcripts
Programmatic Video Generation
planned: Automated creation of media content from templates
Advanced Face/Object Detection
planned: Enhanced video analytics powered by AI
Real-time Processing
planned: Support for live streaming and real-time analysis
Recent Updates
v1.2.0
- Added support for WebM format
- Improved transcoding speed by 40%
- New REST API endpoints
v1.1.0
- Introduced batch processing
- Fixed memory leaks in long operations
- Updated documentation
How MediaToad Works
A simple, powerful workflow for all your media processing needs.
Submit via API
Define your media operations with a simple JSON request
Temporal Orchestration
Workflows are managed with retries, state tracking, and scalable processing
Media Processing
FFmpeg and AI services handle encoding, transcription, and analysis
Storage & Delivery
Processed files are stored in your specified cloud backend
Notifications
Receive webhook/SSE notifications when jobs complete
Analytics & Insights
Track performance metrics and monitor workflow efficiency
Who Benefits from MediaToad?
MediaToad serves a wide range of users with diverse media processing needs.
For LLM Providers
Streamline your media processing pipeline for production
LLM providers can leverage MediaToad to efficiently process and analyze large volumes of media content, enabling better training data preparation and content moderation.
- Scale media processing for training data
- Automate content moderation workflows
- Integrate with existing AI systems
For Developers
Enhance developer experience with simplified media handling
Developers can focus on building great applications without worrying about the complexities of media processing, storage, and workflow management.
- Intuitive API for complex media operations
- Reliable, fault-tolerant processing
- Flexible integration with existing systems
For Content Platforms
Scale your media infrastructure reliably
Content platforms can handle growing media libraries with a scalable, reliable solution that adapts to changing requirements and traffic patterns.
- Process user-generated content at scale
- Automate thumbnail and preview generation
- Implement content analysis and moderation
For Enterprises
Avoid vendor lock-in with a flexible solution
Enterprises can maintain control over their media processing infrastructure while avoiding the limitations and costs of proprietary solutions.
- Cloud-agnostic architecture
- Customizable to specific business needs
- Integrate with existing enterprise systems
Ready to Transform Your Media Processing?
Get started with MediaToad today and experience the power of open-source media processing at scale.