Generate your own mp3 audio for Anki without downloads or add-ons

How to GENERATE YOUR OWN AUDIO for English and Vietnamese sentences in Anki without downloading ready-made mp3 files, working from the PowerShell command line on a laptop!
There are currently 3 options:

Anki's built-in TTS (runs only on Desktop, audio is not synced).
edge-tts (Python script) → exports mp3 files, which also sync to Mobile.
Add-ons (for example AwesomeTTS, PiperTTS).

This post only covers the do-it-yourself options for producing audio for Anki, without add-ons.
1) Using Anki's built-in TTS:
This solution works only on Anki Desktop. Because it does not export mp3 files, it cannot be used on Anki Android, nor in shared decks that load their media online from a GitHub repository. It converts the text of fields such as

{{English}} {{EnglishAns}} {{EnglishAns2}} {{EnglishAns3}}

... into speech and plays it, so the deck contains no audio fields at all! To get this, put the following code in the "Front" or "Back" template, depending on where you want the audio to play. Specifically:

<script>
function speakText(text, lang="en-US") {
    const utterance = new SpeechSynthesisUtterance(text);
    utterance.lang = lang;
    speechSynthesis.speak(utterance);
}
speakText(`{{English}}`, "en-US");
</script>

Combine it with the default snippet in the template:

<script>
  var elem = document.querySelector("#played_audio .soundLink, #played_audio .replaybutton");
  if (elem) { elem.click(); }
</script>

To add a "button" that you can click manually whenever you want to hear a particular line, add this code:

<div style="font-size:32px; text-align:center; font-weight:bold; color:red; margin-top:20px;">
  {{English}}
  <button class="audio-btn" onclick="speakText(`{{English}}`, 'en-US')" title="Play the question audio">
    <svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" fill="white" viewBox="0 0 24 24">
      <path d="M3 10v4h4l5 5V5l-5 5H3zm13.5 2a4.5 4.5 0 0 0-4.5-4.5v9a4.5 4.5 0 0 0 4.5-4.5zm2.5 0a7 7 0 0 0-7-7v2a5 5 0 0 1 0 10v2a7 7 0 0 0 7-7z"/>
    </svg>
  </button>
</div>

<div style="font-size:30px; margin-top:10px;">
<span style="color:blue; font-weight:bold;">{{EnglishAns}}</span>
  <button class="audio-btn" onclick="speakText(`{{EnglishAns}}`, 'en-US')" title="Play audio">
    <svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" fill="white" viewBox="0 0 24 24">
      <path d="M3 10v4h4l5 5V5l-5 5H3zm13.5 2a4.5 4.5 0 0 0-4.5-4.5v9a4.5 4.5 0 0 0 4.5-4.5zm2.5 0a7 7 0 0 0-7-7v2a5 5 0 0 1 0 10v2a7 7 0 0 0 7-7z"/>
    </svg>
  </button><br>
  
</div>

2) Using Edge-TTS manually:
Edge-TTS (Microsoft Azure Neural Voices) produces .mp3 files with natural, lifelike voices (Aria = US English female, HoaiMy = Vietnamese female).
👉 Works well on Anki Desktop, Web, Android, and iOS. Follow these steps:
a) Check the environment (to see whether Python is installed on the laptop; if not, install it first). Check the Python version:

python --version

(or py --version). Check pip:

pip --version

Install edge-tts (the --user flag avoids permission errors):

pip install --user edge-tts
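
As a quick sanity check after installing, the sketch below asks Python itself whether the package can be imported. The helper name `missing_modules` is made up for this example; `edge_tts` is the import name used by the script later in this post.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the module names from `names` that Python cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Report the interpreter version and whether edge_tts is importable.
print("Python", sys.version.split()[0])
missing = missing_modules(["edge_tts"])
print("Missing modules:", ", ".join(missing) if missing else "none")
```

If `edge_tts` is reported as missing, the pip command above probably ran under a different Python installation than the one running this check.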

b) Create the sample CSV file (name: sentences.csv)

Create this file with Notepad, Excel, or Google Sheets and save it as .csv (UTF-8).

English,Vietnamese
What is the reason you learn English?,Lý do bạn học tiếng Anh là gì?
I learn English to get a better job.,Tôi học tiếng Anh để có một công việc tốt hơn.
I learn English to enjoy English movies and music.,Tôi học tiếng Anh để thưởng thức phim và nhạc tiếng Anh.
I learn English to communicate with people from different countries.,Tôi học tiếng Anh để giao tiếp với người từ nhiều quốc gia.
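
A sentence that contains a comma must be wrapped in double quotes, otherwise it splits into extra columns. The sketch below is one possible way to catch that, along with empty cells, before generating any audio. `check_sentences` is a hypothetical helper, not part of any library, and the reported line numbers are only approximate if a quoted cell spans several lines.

```python
import csv
import io

REQUIRED_COLUMNS = ("English", "Vietnamese")


def check_sentences(csv_text):
    """Return a list of problems found in the CSV text (empty list means OK)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = [f"missing column: {c}" for c in REQUIRED_COLUMNS if c not in header]
    if problems:
        return problems
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for col in REQUIRED_COLUMNS:
            if not (row[col] or "").strip():
                problems.append(f"line {line_no}: empty {col}")
        if None in row:  # DictReader stores surplus cells under the key None
            problems.append(f"line {line_no}: too many columns")
    return problems
```

To use it, read the file with `open("sentences.csv", encoding="utf-8-sig").read()` and pass the text in; an empty list means the file has the shape the script expects.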

c) Create the Python script that uses Edge-TTS

Save this file as generate_audio.py.

import edge_tts
import asyncio
import csv
import os

# Folder for the exported mp3 files
OUTPUT_FOLDER = "output_audio"
os.makedirs(OUTPUT_FOLDER, exist_ok=True)

# Synthesize one piece of text with the given voice and save it as mp3
async def save_tts(text, voice, filename):
    communicate = edge_tts.Communicate(text, voice)
    await communicate.save(filename)

async def main():
    with open("sentences.csv", "r", encoding="utf-8-sig") as f:  # utf-8-sig also accepts a BOM (Excel's "CSV UTF-8")
        reader = csv.DictReader(f)
        for i, row in enumerate(reader, start=1):
            en_text = row["English"].strip()
            vi_text = row["Vietnamese"].strip()

            # US English female voice (AriaNeural), Vietnamese female voice (HoaiMyNeural)
            en_file = os.path.join(OUTPUT_FOLDER, f"en_{i}.mp3")
            vi_file = os.path.join(OUTPUT_FOLDER, f"vi_{i}.mp3")

            await save_tts(en_text, "en-US-AriaNeural", en_file)
            await save_tts(vi_text, "vi-VN-HoaiMyNeural", vi_file)

            print(f"✔ Saved: {en_file}, {vi_file}")

asyncio.run(main())
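
The loop above waits for each file before starting the next one. For long sentence lists, a small bounded-concurrency helper might shorten the run. This is only a sketch: `run_limited` is a hypothetical name, and it assumes the TTS service tolerates a few parallel requests, which this post has not verified.

```python
import asyncio


async def run_limited(coros, limit=3):
    """Await all coroutines, at most `limit` at a time; results keep input order."""
    sem = asyncio.Semaphore(limit)

    async def guarded(coro):
        # Each coroutine waits for a free slot before it starts running.
        async with sem:
            return await coro

    return await asyncio.gather(*(guarded(c) for c in coros))
```

Inside `main()`, one could collect the `save_tts(...)` calls into a list without awaiting them and then run `await run_limited(jobs)` once after the loop.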

d) How to run: put sentences.csv and generate_audio.py in the same folder. Run:

python generate_audio.py

Result: the output_audio folder will contain en_1.mp3, vi_1.mp3, en_2.mp3, vi_2.mp3… You can then import into Anki by adding a column with [sound:en_1.mp3] and [sound:vi_1.mp3] to the CSV, so the audio plays on Desktop, Web, Android, and iOS.
3) Extension: generate a CSV file that imports directly into Anki
This extra step produces the text plus the [sound:...] tags, so nothing has to be edited by hand after the audio is generated:
a) CSV structure for importing into Anki. Example: each card has 2 sides (English → Vietnamese). The CSV has 3 columns:

English,Vietnamese,Audio
What is the reason you learn English?,Lý do bạn học tiếng Anh là gì?,[sound:en_1.mp3][sound:vi_1.mp3]
I learn English to get a better job.,Tôi học tiếng Anh để có một công việc tốt hơn.,[sound:en_2.mp3][sound:vi_2.mp3]
I learn English to enjoy English movies and music.,Tôi học tiếng Anh để thưởng thức phim và nhạc tiếng Anh.,[sound:en_3.mp3][sound:vi_3.mp3]
I learn English to communicate with people from different countries.,Tôi học tiếng Anh để giao tiếp với người từ nhiều quốc gia.,[sound:en_4.mp3][sound:vi_4.mp3]

English column: shown on the front.
Vietnamese column: shown on the back.
Audio column: holds the generated audio files.
- [sound:en_1.mp3] → plays the English sentence.
- [sound:vi_1.mp3] → plays the Vietnamese sentence.
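
The Audio column follows a fixed pattern, so it can be built and checked programmatically. The sketch below shows one way to do that; the helper names `sound_tag`, `audio_field`, and `referenced_files` are invented for this example.

```python
import re

SOUND_RE = re.compile(r"\[sound:([^\]]+)\]")


def sound_tag(filename):
    """Wrap a media file name in Anki's [sound:...] syntax."""
    return f"[sound:{filename}]"


def audio_field(index, prefixes=("en", "vi")):
    """Build the Audio cell for row `index`, e.g. [sound:en_1.mp3][sound:vi_1.mp3]."""
    return "".join(sound_tag(f"{p}_{index}.mp3") for p in prefixes)


def referenced_files(field_value):
    """List the file names mentioned in a field's [sound:...] tags."""
    return SOUND_RE.findall(field_value)


print(audio_field(1))
```

`referenced_files` is handy for checking that every file a row mentions really exists in the audio folder before importing.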

b) Adjust the script to output this CSV
Extend the end of the script so that it writes anki_import.csv automatically:

import edge_tts
import asyncio
import csv
import os

OUTPUT_FOLDER = "output_audio"
os.makedirs(OUTPUT_FOLDER, exist_ok=True)

async def save_tts(text, voice, filename):
    communicate = edge_tts.Communicate(text, voice)
    await communicate.save(filename)

async def main():
    rows = []
    with open("sentences.csv", "r", encoding="utf-8-sig") as f:  # utf-8-sig also accepts a BOM (Excel's "CSV UTF-8")
        reader = csv.DictReader(f)
        for i, row in enumerate(reader, start=1):
            en_text = row["English"].strip()
            vi_text = row["Vietnamese"].strip()

            en_file = f"en_{i}.mp3"
            vi_file = f"vi_{i}.mp3"

            await save_tts(en_text, "en-US-AriaNeural", os.path.join(OUTPUT_FOLDER, en_file))
            await save_tts(vi_text, "vi-VN-HoaiMyNeural", os.path.join(OUTPUT_FOLDER, vi_file))

            rows.append([en_text, vi_text, f"[sound:{en_file}][sound:{vi_file}]"])
            print(f"✔ Saved: {en_file}, {vi_file}")

    # Write the CSV that imports directly into Anki
    with open("anki_import.csv", "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["English", "Vietnamese", "Audio"])
        writer.writerows(rows)
    print("👉 Done! File 'anki_import.csv' ready to import into Anki.")

asyncio.run(main())
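
Re-running the script regenerates every file, even those that already exist. A possible refinement, sketched below, is to compute only the missing files first. `TRACKS` and `pending_jobs` are hypothetical names; the column names, prefixes, and voices are copied from the script above. Because files are numbered by row position, this only makes sense while rows are appended at the end of sentences.csv, not inserted in the middle.

```python
from pathlib import Path

# (CSV column, file prefix, voice) for each audio track, as in the script above
TRACKS = (
    ("English", "en", "en-US-AriaNeural"),
    ("Vietnamese", "vi", "vi-VN-HoaiMyNeural"),
)


def pending_jobs(rows, out_dir, tracks=TRACKS):
    """Return (text, voice, path) for each mp3 that is missing or empty."""
    out_dir = Path(out_dir)
    jobs = []
    for i, row in enumerate(rows, start=1):
        for column, prefix, voice in tracks:
            path = out_dir / f"{prefix}_{i}.mp3"
            if not path.exists() or path.stat().st_size == 0:
                jobs.append((row[column].strip(), voice, path))
    return jobs
```

Each returned tuple could then be passed to `save_tts(text, voice, str(path))`.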

c) How to use:

Run the script as before → you get output_audio (containing the mp3 files) and anki_import.csv.
In Anki → File → Import → select anki_import.csv.
Copy the mp3 files inside output_audio (the files themselves, not the folder) into Anki's collection.media folder; Anki uploads them on the next sync.
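
Anki's media folder is flat, so the mp3 files themselves, rather than the output_audio folder, need to end up in collection.media. The sketch below automates that copy. `copy_audio_to_media` is a made-up helper, and the media folder path has to be supplied by you because it depends on your user name and Anki profile.

```python
import shutil
from pathlib import Path


def copy_audio_to_media(audio_dir, media_dir):
    """Copy every .mp3 from audio_dir into media_dir (flat, no subfolders)."""
    audio_dir, media_dir = Path(audio_dir), Path(media_dir)
    if not media_dir.is_dir():
        raise FileNotFoundError(f"media folder not found: {media_dir}")
    copied = []
    for mp3 in sorted(audio_dir.glob("*.mp3")):
        shutil.copy2(mp3, media_dir / mp3.name)  # overwrites a file of the same name
        copied.append(mp3.name)
    return copied
```

A call would look like `copy_audio_to_media("output_audio", media_path)`, where `media_path` is the full path to your profile's collection.media folder.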

👉 Result: a complete deck with English sentences + Vietnamese meanings + clear, natural audio that plays on Desktop, Web, iOS, and Android.
4) Besides the approach above, you can also use gTTS (Google Text-to-Speech)

a) Install the gTTS library. Open Command Prompt (Windows) or Terminal (Mac/Linux) and type:

pip install gTTS

b) Create a Python script (e.g. anki_tts.py). Copy the following into the .py file:

from gtts import gTTS

# List of sentences (English, Vietnamese)
sentences = [
    ("What is the reason you learn English?", "Lý do bạn học tiếng Anh là gì?"),
    ("I learn English to get a better job.", "Tôi học tiếng Anh để có một công việc tốt hơn."),
    ("I learn English to enjoy English movies and music.", "Tôi học tiếng Anh để thưởng thức phim và nhạc tiếng Anh."),
    ("I learn English to communicate with people from different countries.", "Tôi học tiếng Anh để giao tiếp với người từ nhiều quốc gia.")
]

for i, (en_text, vi_text) in enumerate(sentences, start=1):
    # English
    tts_en = gTTS(en_text, lang="en", tld="com")
    tts_en.save(f"en_{i}.mp3")

    # Vietnamese
    tts_vi = gTTS(vi_text, lang="vi")
    tts_vi.save(f"vi_{i}.mp3")

print("✅ mp3 files exported! Copy them into Anki's collection.media folder.")

c) Run the command:

python anki_tts.py

This produces the files:

en_1.mp3, vi_1.mp3
en_2.mp3, vi_2.mp3
en_3.mp3, vi_3.mp3
en_4.mp3, vi_4.mp3

d) Add to Anki: copy these files into the folder:

C:\Users\<UserName>\AppData\Roaming\Anki2\<ProfileName>\collection.media
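
The folder path above can also be assembled in code. The sketch below assumes the default Windows layout shown in the path and reads the Roaming folder from the APPDATA environment variable. `windows_media_dir` is a hypothetical helper, and the profile name must be whatever your own Anki profile is called.

```python
import os
from pathlib import PureWindowsPath


def windows_media_dir(profile, appdata=None):
    """Build <APPDATA>\\Anki2\\<profile>\\collection.media as a Windows path."""
    appdata = appdata or os.environ.get("APPDATA", "")
    return PureWindowsPath(appdata) / "Anki2" / profile / "collection.media"
```

If Anki was set up with a custom data folder, this guess will not match and the path has to be looked up manually.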

In the note's audio field (for example the Audio field), reference them with:

[sound:en_1.mp3]
[sound:vi_1.mp3]

With this approach the audio is clear and natural-sounding (Google TTS). The mp3 files are stored locally and sync to AnkiWeb → they work well on Anki Android/iOS.
5) Extended template: with the audio files you generated yourself, you can build an extended template made of 1 question and several answers, matching fields such as

{{English Audio}} {{English Audio Ans}} {{English Audio Ans2}} {{English Audio Ans3}}

where each sentence has its own icon to click and listen. The code for this "Back" template can be written as follows:

<script>
  var elem = document.querySelector("#played_audio .soundLink, #played_audio .replaybutton");
  if (elem) { elem.click(); }
</script>
<div style='font-family: Arial; font-size: 20px;'></div>

<div style="font-size:32px; text-align:center; font-weight:bold; color:red; margin-top:20px;">
  {{English}}
 {{English Audio}} 
</div>

<div style="font-size:30px; margin-top:10px;">
<span style="color:blue; font-weight:bold;">{{EnglishAns}}</span><br>
 {{English Audio Ans}} 
<br>
  
</div>

<div style="font-size:30px; margin-top:10px;">
<span style="color:blue; font-weight:bold;">{{EnglishAns2}}</span><br>
 {{English Audio Ans2}} 
<br>
  
</div>
<div style="font-size:30px; margin-top:10px;">
<span style="color:blue; font-weight:bold;">{{EnglishAns3}}</span><br>
 {{English Audio Ans3}} 
<br>
  
</div>