[Shorts Studio] #11 How To Clone Your Voice With ElevenLabs
If you’re creating faceless Shorts, or just want to save time recording your own voice, then voiceover becomes the backbone of your content.
And ElevenLabs is the most natural-sounding AI voice tool available right now.
In this lesson, you’ll learn:
How to generate voiceovers using 3 different methods
What makes an AI voiceover actually sound good
And how to use it in your Shorts workflow
Let’s start with the basics.
What Is ElevenLabs?
ElevenLabs is an AI tool that turns written text into human-like voiceovers.
You give it a script. It reads it aloud.
But unlike most text-to-speech tools, it doesn’t sound robotic. It can sound relaxed, excited, thoughtful, even sarcastic. The difference is in the tone, pacing, and emotion.
And when used well, the output feels like a real human.
Why Use A Voiceover?
A good voiceover can:
Make your Short feel more premium
Deliver tone (which text on screen alone can’t do)
Free you from needing to be on mic or camera
And you don’t have to hire a voice actor, rent a mic, or spend hours editing audio.
But ultimately a good “voice” keeps your viewer engaged.

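If you lose people while you’re speaking, it’s because you’re not engaging enough. To be engaging, you need to think about 3 areas: 1. Auditory engagement: how your voice sounds. 2. Visual engagement: how you use body language and gestures. 3. Content: is what you’re saying interesting to the other person? If you want people to stay engaged when you speak, practice improving these three areas. It’s that simple! Have you ever struggled to keep people engaged?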
Who Is ElevenLabs Best For?
You’ll get the most value out of ElevenLabs if:
You’re making faceless content
You want to sound more polished without recording yourself
You want to create a “voice” for your channel that’s consistent across videos
You want to create videos in different languages
You want to move faster without sacrificing quality
If you're already writing scripts, ElevenLabs makes it easy to scale.
3 Ways To Create Voiceovers With ElevenLabs
There are three ways to generate a voiceover using ElevenLabs.
Way #1: Use A Voice From The Library
This is the fastest path.
ElevenLabs has a Voice Library with thousands of community-made voices. Many are clones or custom creations, and you can preview before using.
Some of the most popular, default voices include:
Rachelle: neutral American female, often used for explainer content
Mark: male voice, good for conversational writing
Dakota H: female, clean, natural
There are hundreds of voices to choose from.
I recommend spending 15 minutes browsing the voice library. Filter by your preferences. Pick one that fits the tone of your video or just a voice that sounds good to you.
You can always switch voices later.
Pro tip: search using a sample of your voice to match your style and tone.
Way #2: Design Your Voice With An AI Prompt
If you want something specific, you can use the Voice Design tool.
You describe the voice in plain language, and ElevenLabs generates a few options.
Example prompt:
“Young American female. Casual tone. Slightly upbeat and friendly, but not over-the-top.”
You can also specify accent, age, emotion, speed, pitch, and more.
Preview a sample, tweak if needed, and save the one you like.
Way #3: Clone Your Own Voice
This is useful if you want your Shorts to sound like you, but you don’t want to record every time.
Here’s how it works:
Upload a 1–5 minute sample of your voice (must be clean and clear)
ElevenLabs trains a model that mimics your tone, cadence, and speech patterns
You can now generate voiceovers from your scripts in your own voice
Cloning works best when your sample includes natural speech. Try to record yourself speaking normally.
How To Use Your Voice With ElevenLabs (Text to Speech Tool)
Once you’ve written your script and picked your voice, generating the audio is simple.
Here’s how to do it step by step:
Go to the “Text to Speech” tab. This is where you generate your audio from a written script.
Paste your script into the editor. Make sure your text includes punctuation, line breaks, or any emphasis markers you want the voice to pick up on. (See the next section for a few tips.)
Choose your voice. Select from your custom voice, cloned voice, or one of the voices from the ElevenLabs library. You can preview each voice before generating.
Click “Generate.” ElevenLabs will read your script and produce a voiceover. This usually takes just a few seconds.
Listen and tweak. If something sounds off, edit the script and try again.
That’s it.
You’ll probably do a few rounds of tweaking before it sounds just right.
But once you're happy with it, download the audio file from ElevenLabs and drop it into your video editor (CapCut, Descript, Hedra, Final Cut, etc.).
Done. That’s your voiceover.
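If you'd rather script these steps than click through the tab, here is a minimal Python sketch. It is not part of the lesson's workflow. It assumes the public ElevenLabs REST endpoint (`/v1/text-to-speech/{voice_id}`), the `xi-api-key` header, and the `eleven_multilingual_v2` model id, so check the current API reference before relying on any of them. The function names, `YOUR_API_KEY`, and `YOUR_VOICE_ID` are placeholders made up for illustration.

```python
import json
import urllib.request

# Assumed public REST base URL; verify against the current ElevenLabs docs.
API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(api_key, voice_id, script, model_id="eleven_multilingual_v2"):
    """Build (but do not send) a text-to-speech request for one script."""
    payload = {"text": script, "model_id": model_id}
    return urllib.request.Request(
        url=f"{API_BASE}/text-to-speech/{voice_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def save_voiceover(request, out_path="voiceover.mp3"):
    """Send the request and write the returned audio bytes to disk."""
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path


# Building the request needs no network access; sending it does.
request = build_tts_request("YOUR_API_KEY", "YOUR_VOICE_ID", "Done. That’s your voiceover.")
print(request.full_url)
```

Calling `save_voiceover(request)` with a real key and voice id would play the role of the "Generate" and download steps. The listen-and-tweak loop stays manual either way.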
What Makes An AI Voiceover Sound Good?
Here’s where most creators mess up.
They think the tool does all the work. So, they write a clean, polished script and expect the AI to deliver it with the timing, tone, and personality of a real person. But that’s not how it works.
A great voiceover is 50% the voice…and 50% the script.
If you want your video to sound like a polished audiobook, fine. ElevenLabs can do that straight out of the box. But if you want it to sound like a real person, you need to intentionally add a little friction. Real people pause. Emphasize the wrong word. Trail off mid-thought. Say the same thing twice, use short sentences, interrupt themselves, and speak with emotion.
There are a few techniques that help:
Writing how you speak
Using punctuation to create rhythm and breaks
Guiding tone through context or dialogue-style cues
Playing with ALL CAPS to emphasize key moments
Using AI to “unpolish” your script
We’ll cover each one in a second.
Or, if you want to skip ahead for the prompts:
Click here to rewrite your script and add “human flaws” using AI.
This will automatically rewrite your text with natural pauses, filler phrases, and softer transitions, so it reads more like speech than writing.
Let’s take a look at a few of the most important rules that make your AI narration feel more natural:
1. Write How You Speak
When you use overly formal, academic, or “marketing” language, the voice follows your lead, and the output will sound stiff and over-rehearsed.
Instead, you want to write like you’re explaining something to a friend.
Let’s look at a quick comparison.
Bad:
“The following tutorial will explain three key components of optimizing your workflow.”
It’s abstract, uses filler phrases like “the following tutorial,” and has no real personality or rhythm. It sounds like a white paper.
Better:
“If you want to get more done in 1 hour than most people do in 7 days, you absolutely must avoid these 3 time-wasting tasks.”
Notice how this line gets straight to the point and uses plain everyday language. It feels like someone is about to help you, not lecture you.
To avoid this problem, say your script out loud before generating the voiceover.
If it sounds awkward when you say it, it’ll sound even worse coming from an AI.
2. Use Punctuation To Create Rhythm
Commas, ellipses, and line breaks signal natural pauses.
They help the AI know when to pause, when to emphasize, and when to slow down. So when you want a pause, a shift in tone, or emphasis, you have to build it into the writing.
Use commas to create short, natural pauses.
When you speak out loud, you naturally pause between ideas or clauses. Commas tell the AI to do the same.
Without the comma:
“Let’s talk about how you fix this mistake before it slows you down.”
Reads fast, all in one breath.
With the comma:
“Let’s talk about how you fix this mistake, before it slows you down.”
This adds a beat and gives the sentence room to breathe, which sounds more like a person.
Use ellipses to add hesitation.
Ellipses are great when you want to signal a moment of thought, or lead in to a reveal.
Example:
“You think the problem is your niche…but it’s not.”
That pause is doing a lot of work.
Without it, the same line doesn’t carry as much weight:
“You think the problem is your niche but it’s not.”
Use ellipses when you want the voice to linger, to make the listener lean in.
Use line breaks to control pacing.
Short-form video scripts are more punchy when you write one line per beat.
Example:
Most people don’t realize this:
Your first draft is never the problem.
This gives you:
A clean pacing cue for editing
Built-in emphasis without needing caps or exclamation marks
A way to match text-on-screen with voiceover moments
If your script looks like a paragraph, it’ll sound like one breathless, monotone paragraph.
Your editorial rhythm improves when your writing has structure.
Break it into beats.
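If your draft is already one big paragraph, a few lines of Python can give you a first pass at breaking it into beats. This is only a rough sketch: the helper name `split_into_beats` is made up, and it assumes a beat ends at a period, exclamation mark, question mark, or colon followed by a space. It will also split after abbreviations like "e.g.", so read the result before you paste it anywhere.

```python
import re


def split_into_beats(script):
    """Return the script with one sentence (beat) per line."""
    # Split wherever ., !, ?, or : is followed by whitespace.
    parts = re.split(r"(?<=[.!?:])\s+", script.strip())
    return "\n".join(part for part in parts if part)


paragraph = "Most people don’t realize this: Your first draft is never the problem."
print(split_into_beats(paragraph))
```

The printed result puts "Most people don’t realize this:" and "Your first draft is never the problem." on separate lines, matching the example above.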
<break time="1.0s"/>
ElevenLabs also supports SSML tags like this one.
If you want to insert an exact pause, you can write:
Here’s the biggest mistake I see…
<break time="1.2s"/>
Trying to say too much in 60 seconds.
This tells the AI: stop here for X seconds, then continue.
The more intentional your punctuation and formatting, the more human the voice sounds.
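If you use break tags a lot, you can add them programmatically instead of typing each one. The sketch below is a hypothetical helper (`add_breaks` is not an ElevenLabs function). It takes a list of beats and joins them with the same `<break time="…s"/>` tag shown above, using one pause length for the whole script.

```python
def add_breaks(beats, pause_seconds=1.0):
    """Join beats with a break tag on its own line between each pair."""
    tag = f'<break time="{pause_seconds:.1f}s"/>'
    return f"\n{tag}\n".join(beats)


script = add_breaks(
    ["Here’s the biggest mistake I see…", "Trying to say too much in 60 seconds."],
    pause_seconds=1.2,
)
print(script)
```

This reproduces the three-line example above. For different pauses at different moments, you'd still edit the numbers by hand.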
3. Insert Emotional Emphasis
If your script sounds flat, your voiceover will too.
To fix this, you need to build emotion and emphasis into the script itself.
Here’s how:
Use narrative cues to guide tone
AI responds best when emotion is made explicit.
You can do this in two ways:
Convey emotion through the writing itself
Use dialogue-style tags to instruct the model
Let’s look at both.
Example 1 – Emotion baked into the line:
“You spent all that time working… and this is what you walked away with?”
The rising tension is built into the sentence. The pacing, punctuation, and progression give the voice enough to read it with disbelief, even without being told explicitly.
Example 2 – Emotion delivered through a tag:
“You’re leaving?” she asked, her voice trembling with sadness.
“That’s it!” he exclaimed triumphantly.
Tags make the emotional delivery more predictable.
But the AI will speak the tags unless you remove them from the final audio.
Here’s what you can do:
Include the emotional tag in the script
Generate the audio with the tag
Trim that part in post-production (using CapCut, Descript, etc.)
This gives you expressive delivery without the spoken tag ending up in your final audio.
Fine-tune tone with ElevenLabs settings
In addition to writing for emphasis, you can also adjust how much range the voice is allowed to have.
Here’s how the two main sliders work:
Stability
Higher = flatter, more controlled tone
Lower = more expressive variation (pitch changes, natural pauses)
Similarity
Higher = stays truer to the original voice profile
Lower = allows more deviation and personality
If your voiceover feels too robotic:
Lower the stability to around 30–50%. You’ll hear more variation in tone and pacing.
If the voice starts to feel inconsistent or loses clarity:
Raise the similarity slider. It’ll tighten up the tone and pronunciation.
There’s no universal best setting. You’ll need to test a few combinations.
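To keep track of what you try, you can write the two rules of thumb above as a small function. This is a sketch built on assumptions: sliders are treated as values from 0.0 to 1.0, "around 30–50%" is approximated by capping stability at 0.4, and the 0.1 similarity step is an arbitrary choice. The function name and the symptom labels are made up for illustration.

```python
def suggest_settings(stability, similarity, symptom):
    """Return adjusted (stability, similarity) values on a 0.0-1.0 scale."""
    if symptom == "robotic":
        # Cap stability at 0.4, the middle of the 30-50% range (assumed value).
        stability = min(stability, 0.4)
    elif symptom == "inconsistent":
        # Raise similarity by one step (0.1 is an assumed step size).
        similarity = min(1.0, similarity + 0.1)
    return round(stability, 2), round(similarity, 2)


print(suggest_settings(0.75, 0.7, "robotic"))
print(suggest_settings(0.4, 0.7, "inconsistent"))
```

Treat the output as a starting point for the next test, not a final answer.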
4. Use ALL CAPS
Finally, one of the most reliable ways to signal emphasis in ElevenLabs is to use ALL CAPS on individual words or short phrases.
This tells the model to “say this part sharply, with force.”
Use ALL CAPS when you want to:
Hit a key word in the sentence
Add weight to a punchline or reversal
Make the voice land a moment the way a human would
Examples
Without emphasis:
“The biggest mistake I see people make is trying to do too much.”
With emphasis:
“The BIGGEST mistake I see people make… is trying to do TOO MUCH.”
Another one:
“You think the problem is your niche… but it’s NOT.”
Or:
“I told you once. I won’t say it AGAIN.”
ALL CAPS helps the AI voice punch the line, without needing any extra direction.
Be careful not to overdo it, or the model starts to sound overly intense, like shouting in an email.
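If you like marking emphasis after the draft is written, a small helper can capitalize the phrases you pick and warn you when the script is getting shouty. This is a sketch, not an ElevenLabs feature. The function name is made up, and the 25% threshold for "too many caps" is an arbitrary assumption you should tune by ear.

```python
import re


def emphasize(script, phrases, max_ratio=0.25):
    """Uppercase the chosen phrases and flag scripts with too many capitalized words."""
    for phrase in phrases:
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        script = pattern.sub(lambda match: match.group(0).upper(), script)
    words = re.findall(r"[A-Za-z’']+", script)
    # Ignore one-letter words such as "I" when counting capitalized words.
    caps = [word for word in words if len(word) > 1 and word.isupper()]
    overdone = len(caps) / max(len(words), 1) > max_ratio
    return script, overdone


line = "The biggest mistake I see people make… is trying to do too much."
text, overdone = emphasize(line, ["biggest", "too much"])
print(text)
print("Overdone?", overdone)
```

On the example line this gives the same "BIGGEST … TOO MUCH" version shown above and does not raise the warning.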
5. Use AI to “Unpolish” Your Script
Sometimes, the best move is to take a polished script…and deliberately make it messier.
You can do this manually by:
Adding filler phrases (“you know,” “actually,” “look…”)
Breaking up longer sentences
Repeating key phrases
Starting thoughts mid-sentence or interrupting them
But you can also use AI to help.
Drop your clean script into one of these prompts, and AI will rewrite it to sound more human.
It’ll add:
Conversational phrasing
Natural stumbles
Human pacing
Think of it like handing your script to a voice coach who makes it sound more like something a person would actually say on camera.
Click here for the prompts.
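If you want to build your own version of such a prompt, here is a minimal sketch that turns the manual edits listed above into instructions for a chat model. To be clear, this is not the prompt linked in this lesson. The wording, the function name, and the "keep the meaning and length" rule are all assumptions made for illustration.

```python
MANUAL_EDITS = [
    "Add filler phrases such as “you know,” “actually,” or “look…”",
    "Break up longer sentences",
    "Repeat key phrases",
    "Start some thoughts mid-sentence or interrupt them",
]


def build_unpolish_prompt(script):
    """Return a prompt asking a chat model to make the script sound spoken."""
    rules = "\n".join(f"- {edit}" for edit in MANUAL_EDITS)
    return (
        "Rewrite the script below so it sounds like natural speech.\n"
        f"{rules}\n"
        "Keep the meaning and the approximate length.\n\n"
        f"Script:\n{script}"
    )


print(build_unpolish_prompt("Your first draft is never the problem."))
```

Paste the printed prompt into the AI tool you already use, then read the rewrite out loud before generating audio.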
Here’s your assignment:
By the end of this lesson, you should:
Write a 60-second script (150–170 words)
Pick one of the three voice generation methods
Create and download your voiceover
Add it into your library of video assets for your Short.
If you’ve never used ElevenLabs before, try at least two different voice styles. It will help you hear the difference between tones and understand which voice fits which type of video.
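Before you generate anything, it helps to check the script against the 150–170 word target. Here is a minimal sketch with a made-up helper name. It counts whitespace-separated words, so a break tag on its own line would count as two words.

```python
def check_script_length(script, min_words=150, max_words=170):
    """Return the word count and whether it fits the 60-second target."""
    count = len(script.split())
    if count < min_words:
        verdict = "too short"
    elif count > max_words:
        verdict = "too long"
    else:
        verdict = "on target"
    return count, verdict


draft = "Most people don’t realize this: your first draft is never the problem."
print(check_script_length(draft))
```

The one-line draft above comes back as too short, which is the cue to keep writing.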
See you tomorrow,
Dickie & Cole
Co-Founders of Ship 30 For 30
Co-Founders of Premium Ghostwriting Academy
Co-Founders of Typeshare
Co-Founders of Write With AI
