
AssemblyAI API for AI Agents

AI-powered audio transcription and understanding

AssemblyAI provides APIs for audio transcription, speaker detection, content moderation, and summarization. AI agents can use AssemblyAI to process audio content, extract insights, and automate audio analysis workflows.

What AI agents can do with AssemblyAI

Structured actions an AI agent can execute through the AssemblyAI API

Action | Description | Inputs | Outputs
transcribe | Transcribe audio with AI features | audio_url, speaker_labels, auto_chapters, sentiment_analysis | transcript_id, text, words[], chapters[]
getTranscript | Retrieve a completed transcript | transcript_id | status, text, utterances[], summary
lemur | Ask questions about a transcript using LeMUR | transcript_ids, prompt, model | response, usage
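As a rough illustration of the transcribe action, the sketch below builds the HTTP request without sending it. It assumes the REST base URL `https://api.assemblyai.com/v2`, a `POST /transcript` endpoint, and an `Authorization` header that carries the API key. The helper names `build_transcribe_request` and `send` are hypothetical. The raw API response may name the identifier `id` rather than `transcript_id`, so check the current API reference before relying on either.

```python
import json
import urllib.request

API_BASE = "https://api.assemblyai.com/v2"  # assumed REST base URL


def build_transcribe_request(api_key, audio_url, speaker_labels=False,
                             auto_chapters=False, sentiment_analysis=False):
    """Build (but do not send) the request for the transcribe action."""
    payload = {
        "audio_url": audio_url,
        "speaker_labels": speaker_labels,
        "auto_chapters": auto_chapters,
        "sentiment_analysis": sentiment_analysis,
    }
    return urllib.request.Request(
        url=f"{API_BASE}/transcript",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending lets an agent log or validate the payload before any network call is made.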

Use cases for AssemblyAI + AI agents

Meeting notes and action item extraction
Content moderation for audio
Podcast summarization
Sentiment analysis on customer calls
Audio-based Q&A with LeMUR

How to connect AssemblyAI to an AI agent

1. Get your AssemblyAI API key
2. Generate an AgentSpec for transcription actions
3. Define transcription and analysis actions
4. Publish for discovery
5. Test with sample audio
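The AgentSpec format is not described here, so the following is only a hypothetical sketch of step 3: the three actions from the table declared as plain data, plus a check that an agent's call uses only declared inputs. The `Action` class, the `ACTIONS` registry, the `validate_call` helper, and the choice of required inputs are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    description: str
    inputs: tuple
    outputs: tuple
    required: tuple = ()  # which inputs are mandatory is an assumption


ACTIONS = {
    a.name: a
    for a in (
        Action("transcribe", "Transcribe audio with AI features",
               ("audio_url", "speaker_labels", "auto_chapters", "sentiment_analysis"),
               ("transcript_id", "text", "words", "chapters"),
               required=("audio_url",)),
        Action("getTranscript", "Retrieve a completed transcript",
               ("transcript_id",),
               ("status", "text", "utterances", "summary"),
               required=("transcript_id",)),
        Action("lemur", "Ask questions about a transcript using LeMUR",
               ("transcript_ids", "prompt", "model"),
               ("response", "usage"),
               required=("transcript_ids", "prompt")),
    )
}


def validate_call(action_name, arguments):
    """Return the Action if the call is well-formed, otherwise raise."""
    action = ACTIONS.get(action_name)
    if action is None:
        raise KeyError(f"unknown action: {action_name}")
    unknown = set(arguments) - set(action.inputs)
    missing = set(action.required) - set(arguments)
    if unknown or missing:
        raise ValueError(f"unknown={sorted(unknown)} missing={sorted(missing)}")
    return action
```

Validating arguments before dispatch gives the agent an immediate, local error instead of a failed API call.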

Best practices

✓ Enable speaker labels for multi-speaker content
✓ Use auto_chapters for long-form audio segmentation
✓ Leverage LeMUR for asking questions about transcripts
✓ Handle async transcription with polling
✓ Choose the appropriate model tier for accuracy vs speed
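To show what speaker labels and auto_chapters give an agent to work with, here is a minimal sketch that renders both into readable text. It assumes each utterance carries `speaker` and `text` fields, each chapter carries `start` and `headline` fields, and timestamps are in milliseconds. The function names are hypothetical.

```python
def format_utterances(utterances):
    """Render speaker-labelled utterances as one line per turn."""
    return "\n".join(f"Speaker {u['speaker']}: {u['text']}" for u in utterances)


def format_chapters(chapters):
    """Render chapters as '[MM:SS] headline' lines."""
    lines = []
    for chapter in chapters:
        seconds = chapter["start"] // 1000  # timestamps assumed to be milliseconds
        lines.append(f"[{seconds // 60:02d}:{seconds % 60:02d}] {chapter['headline']}")
    return "\n".join(lines)
```

For example, a chapter starting at 125000 ms with the headline "Budget review" is rendered as `[02:05] Budget review`.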

Frequently asked questions

What makes AssemblyAI different from Deepgram?

AssemblyAI includes built-in AI features such as auto-chapters, sentiment analysis, and LeMUR (LLM-based Q&A over transcripts). This positions it as an audio understanding platform rather than a transcription-only service.

What is LeMUR?

LeMUR lets you ask questions about transcripts using an LLM. Submit one or more transcript IDs and a prompt, and LeMUR returns answers grounded in the audio content. This is useful for agents that need to extract specific information from recordings.
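A LeMUR call might be prepared along these lines. This is a sketch under assumptions: the endpoint path `/lemur/v3/generate/task` and the REST field name `final_model` (for the `model` input listed in the actions table) should be verified against the current AssemblyAI documentation, and `build_lemur_request` is a hypothetical helper.

```python
import json
import urllib.request

# Assumed endpoint; confirm against the current API reference.
LEMUR_TASK_URL = "https://api.assemblyai.com/lemur/v3/generate/task"


def build_lemur_request(api_key, transcript_ids, prompt, model=None):
    """Build (but do not send) a LeMUR question over one or more transcripts."""
    if not transcript_ids:
        raise ValueError("at least one transcript ID is required")
    payload = {"transcript_ids": list(transcript_ids), "prompt": prompt}
    if model is not None:
        # Assumed REST name for the `model` input.
        payload["final_model"] = model
    return urllib.request.Request(
        url=LEMUR_TASK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

The request can be sent with `urllib.request.urlopen`. According to the actions table, the decoded response is expected to contain `response` and `usage`.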

How do agents handle long transcription jobs?

Submit the audio URL, receive a transcript ID, then poll getTranscript until the status is "completed" or "error". For production workflows, use webhooks to avoid polling overhead.
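The polling loop can be sketched as follows. The `fetch` argument is a hypothetical callable standing in for the getTranscript action, which takes a transcript ID and returns a dict with a `status` field. The interval and timeout defaults are arbitrary assumptions, and the `sleep` and `clock` parameters exist only so the loop can be tested without waiting.

```python
import time


def wait_for_transcript(fetch, transcript_id, interval=3.0, timeout=600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until the transcript completes, fails, or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        transcript = fetch(transcript_id)
        status = transcript.get("status")
        if status == "completed":
            return transcript
        if status == "error":
            raise RuntimeError(transcript.get("error", "transcription failed"))
        if clock() >= deadline:
            raise TimeoutError(f"transcript {transcript_id} not ready after {timeout}s")
        sleep(interval)  # any other status is treated as still in progress
```

Treating "error" as a terminal state matters: a loop that waits only for "completed" would spin until the timeout on a failed job.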