If you've ever tried to edit audio or video using traditional software (Audacity, GarageBand, Adobe Premiere, or anything similar), you'll know it involves staring at a waveform. That wobbly line represents your audio, and to cut a section, you have to find the right spot by eye, play it back, scrub through the timeline, and hope you don't accidentally chop the beginning off a word. It works, but it's fiddly, time-consuming, and feels completely disconnected from what was actually said.
AI audio editing flips this. The tool transcribes your audio, shows you the text, and lets you edit the audio by editing the text. Want to remove that "umm" in the middle of your sentence? Delete it from the transcript. Want to cut an entire paragraph you rambled through? Select the text and press delete. The audio follows the text. It sounds too simple to be real until you try it, at which point you wonder why audio editing ever worked any other way.
Text-based editing is only part of the picture, though. AI can also clean up your audio quality after the fact: removing background noise, reducing echo, and making a recording from your kitchen sound like it was made in a proper studio. These enhancement tools have changed what counts as "good enough" recording conditions. You no longer need a quiet room and an expensive microphone to produce audio that sounds professional.
This Thing gives you hands-on experience with both: editing audio through text, and enhancing audio quality with AI. By the end, you'll have a good sense of how capable these tools are, where they fall short, and whether they're useful in your own life.
How AI audio editing works
As with image generation and voice synthesis, you don't need a deep technical understanding to use these tools well. But knowing the basics helps you understand what you're hearing and why results vary.
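To make the text-based editing idea concrete, here is a minimal Python sketch. It assumes the transcription step gives every word a start and end time; the `Word` type, the `keep_segments` function, and the sample timings are hypothetical illustrations, not a description of how Descript or any other tool works internally. Deleting words from the transcript leaves a list of time ranges to keep, and only those ranges are played back or exported.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float


def keep_segments(words, deleted_indices):
    """Return merged (start, end) time ranges for the words that remain."""
    segments = []
    previous_kept = None
    for index, word in enumerate(words):
        if index in deleted_indices:
            continue  # this word was deleted from the transcript
        if previous_kept is not None and previous_kept == index - 1:
            # Neighbouring kept words share one range, so the natural
            # pause between them is preserved.
            segments[-1] = (segments[-1][0], word.end)
        else:
            # First kept word, or the first one after a deletion.
            segments.append((word.start, word.end))
        previous_kept = index
    return segments


# Hypothetical transcript with made-up timings.
transcript = [
    Word("So", 0.0, 0.3),
    Word("umm", 0.4, 0.9),
    Word("welcome", 1.0, 1.5),
    Word("back", 1.5, 1.9),
]
fillers = {i for i, w in enumerate(transcript) if w.text.lower() in {"um", "umm", "uh"}}
print(keep_segments(transcript, fillers))
```

Running it prints `[(0.0, 0.3), (1.0, 1.9)]`: the stretch containing "umm" is gone, and the two neighbouring words that were kept share a single range. How cleanly the remaining ranges join together is what decides whether a cut sounds smooth.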
The current tool landscape
You've got two main categories of tool here: one for text-based editing, one for audio enhancement. Here's what's available and worth trying.
Resources to explore
Descript. Your primary text-based editing tool for the activity. Free tier with 60 media minutes per month. Requires desktop app (Windows/macOS).
Adobe Podcast Enhance. One-click audio enhancement. Free tier processes files up to 30 minutes with one hour of daily processing. No software to install.
Automated audio post-production with levelling, noise reduction, and loudness normalisation. Free tier of two hours per month.
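Loudness normalisation, mentioned in the last resource above, comes down to measuring how loud a recording is and applying a gain that moves it to a target level. The sketch below is a deliberately simplified illustration that uses the plain RMS level of samples scaled to the range -1.0 to 1.0. Production tools generally measure perceived loudness rather than raw RMS, and the function names and the -20 dB target are assumptions chosen for the example.

```python
import math


def rms_dbfs(samples):
    """RMS level in dB relative to full scale, for samples in [-1.0, 1.0]."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence has no measurable level
    return 10 * math.log10(mean_square)


def normalise(samples, target_dbfs=-20.0):
    """Scale samples so their RMS level matches target_dbfs."""
    current = rms_dbfs(samples)
    if current == float("-inf"):
        return list(samples)  # nothing to amplify
    gain = 10 ** ((target_dbfs - current) / 20)
    # Clamp to full scale so loud peaks cannot overflow.
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


# One second of a very quiet 440 Hz tone at 8000 samples per second.
quiet = [0.01 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
louder = normalise(quiet)
print(round(rms_dbfs(quiet), 1), round(rms_dbfs(louder), 1))
```

The quiet tone starts at roughly -43 dB and ends up at the -20 dB target. Levelling applies the same idea section by section, so that quiet and loud passages within one recording end up closer together.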
Activity: edit and enhance your audio
This activity has four parts. You'll record a short audio clip, then use two different AI tools to process it: one for text-based editing and one for audio enhancement. The comparison will show you two distinct ways AI is changing audio editing.
What you'll need: a computer with a microphone (built-in is fine), a free Descript account, and a free Adobe account.
Part 1: Record your source material
Record yourself talking for two to three minutes. Use your computer's built-in microphone or your phone. Don't worry about finding a quiet room or using good equipment. A slightly imperfect recording is actually better for this exercise because it gives the AI something to work with.
Don't script it; just talk naturally. The goal is a recording that sounds like a real person speaking off the cuff, complete with the "ums," pauses, restarts, and background noise that come with that.
Part 2: Text-based editing in Descript
- Install and sign up. Download and install Descript from descript.com if you haven't already, and create a free account.
- Upload your recording. Create a new project and upload your audio file.
- Wait for transcription. Descript will transcribe your audio. This usually takes a minute or two.
- Explore the transcript. Read through it and notice how it maps to your recording. Click on any word and the audio will play from that point.
- Remove filler words. Descript highlights "ums" and "uhs" automatically. Try removing them individually or use the filler word removal feature to clear them in bulk.
- Try a bigger edit. Delete a sentence or two from the middle of your recording. Play back the result and listen for how smooth (or not) the edit sounds.
- Try Studio Sound (optional). If you have AI credits remaining, apply Studio Sound, Descript's built-in audio enhancement, and listen to what it changes.
- Export your edit. Export the edited audio so you can compare it with the other versions in Part 4.
Part 3: Audio enhancement with Adobe Podcast
- Open the tool. Go to podcast.adobe.com/enhance and sign in with a free Adobe account.
- Upload your original. Upload your original recording (the unedited version, not the Descript export).
- Wait for processing. The AI typically takes under a minute for a short recording.
- Compare the versions. Listen to the enhanced version and toggle between the original and enhanced audio to hear the difference.
- Download the result. Download the enhanced version.
Part 4: Compare and reflect
You should now have three versions of your recording: the original, the Descript-edited version, and the Adobe-enhanced version. Listen to all three and write a short reflection (200–300 words) covering these questions:
- What did the text-based editing in Descript change about the content of your recording? Did removing filler words and pauses make it sound more polished, or did it lose some natural character?
- What did the Adobe Podcast enhancement change about the sound quality? Could you hear a difference in background noise, clarity, or overall polish?
- Which type of improvement (content editing or quality enhancement) made the bigger difference to your recording?
- Can you think of situations in your own life where either of these tools would be useful? Think about any audio or video content you create, even informally: voice messages, presentation recordings, training materials, or social media content.
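If you'd like some numbers to go with what you hear, the optional Python sketch below reports the length and average level of an audio file. It assumes each version has been saved as a 16-bit PCM WAV file, which may not be the format your tools export by default; `describe_wav`, `report`, and the file names in the final comment are hypothetical. A shorter duration shows how much the text-based edit removed, and a changed level is one rough sign of what enhancement did.

```python
import array
import math
import sys
import wave


def describe_wav(path):
    """Return (duration_seconds, rms_dbfs) for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        frame_count = wav.getnframes()
        duration = frame_count / wav.getframerate()
        samples = array.array("h", wav.readframes(frame_count))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is stored little-endian
    if not samples:
        return duration, float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return duration, float("-inf")
    # Level relative to the largest value a 16-bit sample can hold.
    return duration, 10 * math.log10(mean_square / 32768 ** 2)


def report(paths):
    """Print one summary line per file."""
    for path in paths:
        duration, level = describe_wav(path)
        print(f"{path}: {duration:.1f} s, average level {level:.1f} dBFS")


# Example call with hypothetical file names:
# report(["original.wav", "descript_edit.wav", "adobe_enhanced.wav"])
```

These figures are only a supplement. Whether the result sounds natural is still a judgement for your ears, and that judgement is what the reflection is about.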
Your output
A document or blog post containing:
- Your original recording
- The Descript-edited version (with filler words removed and at least one section cut)
- The Adobe Podcast-enhanced version
- A written reflection (200–300 words) comparing the three versions
Things to notice
As you work through the activity, pay attention to a few things that reveal how these tools work and where they have limitations.
Why this matters
Audio and video content is increasingly part of professional life. Organisations use it for training, internal communications, social media, and knowledge sharing. But the traditional barrier to creating polished audio has always been the editing. It was slow, required specialist software knowledge, and felt like a completely separate skill from the actual content creation.
AI audio editing removes that barrier in two ways. Text-based editing means you can edit a recording as naturally as you'd edit a document, a skill everyone already has. And AI enhancement means you don't need professional equipment or a treated recording space to produce audio that sounds clean and clear.
You don't need to be a "content creator" for this to be relevant. If you've ever recorded a voice note and wished you could tidy it up before sending it, or given a presentation that was recorded and wished the audio quality was better, or thought about starting a podcast but felt put off by the editing, these tools are directly useful.
You'll see this same pattern throughout the programme: AI is steadily removing the technical barriers between having an idea and executing it. You don't need to learn audio engineering to produce clean audio, just as you don't need to learn graphic design to create images (Thing 9) or voice acting to generate narration (Thing 10). The skills that matter more and more are the creative and editorial ones: knowing what you want to say, recognising what sounds good, and making thoughtful decisions about the tools you use.
Claim your Open Badge
Submit your original recording, the Descript-edited version, the Adobe Podcast-enhanced version, and your written reflection as evidence for your Thing 11 badge via cred.scot.