AI Audio Lab: Create Voices, Sounds, and More
Before each live online session, Tech Training will provide a Zoom link, along with any required class materials.
Unlock AI tools like Eleven Labs and OpenAI to create high-quality audio -- from sound effects to voice options. This fast-paced, hands-on session shows you how to craft pro-grade audio for projects, presentations, and creative storytelling.
- Program Description
We'll explore the Eleven Labs interface -- including the document editor, sound effects generator, and voice tools (voice options and voice changer) -- as well as both speech-to-text and text-to-speech capabilities. You'll leave with the skills to convert your documents into compelling audio experiences and experiment with voice styling using Eleven Labs and OpenAI's Fine-Tuned Models (FM).
Note: This course does not cover Conversational AI or voice cloning features.
Topics Covered
- Getting Started with Eleven Labs
Get familiar with the core toolkit you'll use to create AI-generated audio. This segment covers foundational skills and feature navigation.
- Document Editor
Learn to input and structure content for voice output. Discover best practices for formatting text to optimize clarity, tone, and pacing when read aloud by AI voices.
- Sound Effect Generator
Dive into Eleven Labs' SFX tool to enhance your audio projects and create custom sound effects from text prompts.
- Voice Options
Explore the pre-set voice library. We'll cover when to use different voices depending on audience, context, and content type.
- Voice Changer
Transform recorded or uploaded voice clips using AI-driven modulation.
- Speech-to-Text (STT)
Use Eleven Labs' transcription features to convert spoken audio into editable text. You'll practice uploading audio, refining transcripts, and using the output to repurpose content into articles, notes, or captions.
- Text-to-Speech (TTS)
Turn written content into spoken audio. Learn how to customize delivery settings (e.g., speed, tone, emphasis) and export final audio for use in podcasts, videos, or learning modules.
- Voice Styling & Design
Take your voiceovers from "robotic" to "resonant." This section shows how to fine-tune AI voices for personality, emotion, and branding.
- OpenAI Fine-Tuned Model (FM)
Explore how OpenAI's FM capabilities can add nuance and depth to your voice outputs. We'll walk through simple API-based workflows to connect your content to advanced voice stylization.
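As a rough illustration of the kind of API-based workflow mentioned under OpenAI Fine-Tuned Model (FM), the sketch below builds a text-to-speech request that pairs your content with a plain-language style instruction. It is not taken from the course materials. The endpoint URL, the model name "gpt-4o-mini-tts", the voice "alloy", and the "instructions" field are assumptions based on OpenAI's public speech API and should be checked against the current documentation. The helper names build_speech_request and synthesize are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint; verify against OpenAI's current API reference.
SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, style, api_key,
                         voice="alloy", model="gpt-4o-mini-tts"):
    """Build (but do not send) an HTTP request for styled text-to-speech.

    `style` is a free-text description of the desired delivery,
    e.g. "warm, slow, reassuring". Model and voice names are assumptions.
    """
    payload = {
        "model": model,
        "voice": voice,
        "input": text,          # the content to be read aloud
        "instructions": style,  # how the voice should sound
    }
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(request, out_path):
    """Send the request and save the returned audio bytes.

    Requires network access and a valid API key.
    """
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
```

To try it, you might call build_speech_request("Welcome to the workshop.", "Friendly and upbeat", your_key) and pass the result to synthesize along with an output filename. Keeping request construction separate from sending makes it easy to inspect the payload before spending API credits.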
Pre-Requisites
To make the most of this session, you should be comfortable with the following:
Computer Proficiency
- Opening and switching between programs (e.g., browsers, Word, etc.)
- Navigating multiple tabs
- Understanding browser differences (Chrome, Firefox, etc.)
Zoom Proficiency
- Accessing chat and participant panel
- Muting/unmuting, enabling/disabling video
- Sharing your screen
- Navigating the Zoom overlay controls
Generative AI Familiarity
- Regular use of tools like ChatGPT, Claude, Gemini, Mistral LeChat, DeepSeek, etc.
- Basic understanding of prompting techniques
Included with the Course: 1-month subscription to Eleven Labs for continued exploration
Custom training workshops are available for this program
Technology training sessions structured around individual or group learning objectives. Learn more about custom training.
Special Group Rates
For groups of 5 or more, special rates are available. Please contact techtraining@stanford.edu for more details.
University IT Technology Training sessions are available to a wide range of participants, including Stanford University staff, faculty, students, and employees of Stanford Hospitals & Clinics, such as Stanford Health Care, Stanford Health Care Tri-Valley, Stanford Medicine Partners, and Stanford Medicine Children's Health.
Additionally, some of these programs are open to interested individuals not affiliated with Stanford, allowing for broader community engagement and learning opportunities.
