ElevenLabs Integration

Convert speech to text and generate natural-sounding speech with ElevenLabs' advanced AI voice models.


Timbal integrates with ElevenLabs' AI voice and speech services.

This integration allows you to:

  • Convert speech to text using ElevenLabs' advanced transcription models
  • Generate natural-sounding speech from text using various voices and models

Prerequisites

Before using the ElevenLabs integration, you'll need:

  1. An ElevenLabs account - Sign up here
  2. An API key - Get your API key here
  3. Set up your environment variable:
    export ELEVENLABS_API_KEY='your-api-key-here'
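
A missing key tends to surface only when the first request fails, so it can help to check the variable at startup. The following is a minimal sketch: `require_api_key` is a hypothetical helper, not part of Timbal or ElevenLabs, and it only reads the environment.

```python
import os


def require_api_key(env=None, name="ELEVENLABS_API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; not part of Timbal.
    """
    # Allow a plain dict to be passed in so the check is easy to test.
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it before using the ElevenLabs tools."
        )
    return key
```

Calling `require_api_key()` once at program start fails fast with a readable message instead of an authentication error later on.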

Installation

No additional installation is required.

Import the specific functions you need:

from timbal.steps.elevenlabs import stt, tts

Speech to Text (STT)

Description

The Speech to Text (STT) service converts audio files into text using ElevenLabs' advanced transcription models.

Example

from timbal.steps.elevenlabs import stt
from timbal.types import File

# Validate and transcribe an audio file.
# `await` must run inside an async function or an async-aware REPL/notebook.
audio_file = File.validate("path/to/audio.wav")
transcription = await stt(audio_file=audio_file, model_id="scribe_v1")
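
The example above uses `await` at the top level, which works in a notebook or inside another coroutine but not in a plain script. One way to run it from a script is to wrap the call in a coroutine and hand it to `asyncio.run`. The sketch below uses `fake_stt`, a made-up stand-in for `stt`, so that it runs without an API key or network access.

```python
import asyncio


async def fake_stt(audio_path: str) -> str:
    """Stand-in for `stt`; a placeholder so the sketch runs offline."""
    await asyncio.sleep(0)  # yield control once, as a real network call would
    return f"transcript of {audio_path}"


async def main() -> str:
    # In real code this line would be something like:
    #   await stt(audio_file=audio_file, model_id="scribe_v1")
    return await fake_stt("path/to/audio.wav")


if __name__ == "__main__":
    print(asyncio.run(main()))
```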

Parameters

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| audio_file | File | Audio file to transcribe. Must be a valid audio file (content type starting with "audio/"). | Yes |
| model_id | str | Transcription model to use. Available models: scribe_v1 (standard transcription model), scribe_v1_experimental (experimental model with enhanced features). | No |
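
The constraints in the parameter table can be checked before any request is made. The sketch below is one way to do that, under stated assumptions: `check_stt_inputs` is a hypothetical helper, and it guesses the content type from the file extension with the standard-library `mimetypes` module, which may differ from how Timbal's `File` determines it.

```python
import mimetypes

# Model IDs listed in the STT parameter table.
STT_MODELS = {"scribe_v1", "scribe_v1_experimental"}


def check_stt_inputs(path: str, model_id: str) -> str:
    """Return the guessed content type, or raise ValueError on bad input.

    Hypothetical pre-flight check; not part of Timbal.
    """
    # Guess from the extension only; the file itself is not opened.
    content_type, _ = mimetypes.guess_type(path)
    if content_type is None or not content_type.startswith("audio/"):
        raise ValueError(
            f"{path!r} does not look like an audio file (got {content_type})"
        )
    if model_id not in STT_MODELS:
        raise ValueError(
            f"unknown model_id {model_id!r}; expected one of {sorted(STT_MODELS)}"
        )
    return content_type
```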

Integration with Agent

from timbal import Agent
from timbal.steps.elevenlabs import stt
from timbal.types import File

# Create an agent with STT capability
agent = Agent(
    tools=[stt],
)

# Process an audio file.
# Assumes `prompt` accepts a list that mixes files and text.
audio_file = File.validate("path/to/audio.wav")
response = await agent.complete(prompt=[audio_file, "What does it say?"])

Text to Speech (TTS)

The Text to Speech (TTS) service converts text into natural-sounding speech using ElevenLabs' voice models.

Example

from timbal.steps.elevenlabs import tts

# Generate speech from text
audio_file = await tts(
    text="Hello, how are you?",
    voice_id="your-voice-id",
    model_id="eleven_flash_v2_5",
)

Parameters

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| text | str | Text to convert to speech. | Yes |
| voice_id | str | ID of the voice to use. | No |
| model_id | str | TTS model to use. Available models: eleven_flash_v2_5 (fast and efficient model), eleven_multilingual_v2 (model with multilingual support). | No |
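
Since `voice_id` and `model_id` are optional, one possible pattern is to build the keyword arguments in one place and pass only the values that were actually provided. The sketch below assumes nothing about the defaults `tts` applies; `build_tts_kwargs` is a hypothetical helper, and its rejection of empty text is an added check rather than documented behavior.

```python
# Model IDs listed in the TTS parameter table.
TTS_MODELS = {"eleven_flash_v2_5", "eleven_multilingual_v2"}


def build_tts_kwargs(text, voice_id=None, model_id=None):
    """Return keyword arguments for `tts`, omitting options left unset.

    Hypothetical helper for illustration; not part of Timbal.
    """
    if not text or not text.strip():
        raise ValueError("text is required and must not be empty")
    if model_id is not None and model_id not in TTS_MODELS:
        raise ValueError(
            f"unknown model_id {model_id!r}; expected one of {sorted(TTS_MODELS)}"
        )
    kwargs = {"text": text}
    # Only forward optional parameters the caller set explicitly.
    if voice_id is not None:
        kwargs["voice_id"] = voice_id
    if model_id is not None:
        kwargs["model_id"] = model_id
    return kwargs
```

The result can then be unpacked into the call, for example `await tts(**build_tts_kwargs("Hello, how are you?"))`.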

Integration with Agent

from timbal import Agent
from timbal.steps.elevenlabs import tts

# Create an agent that responds with audio
agent = Agent(
    system_prompt="Always answer with audio.",
    tools=[tts],
)

response = await agent.complete(prompt="What is 2+2?")