Local STT (Speech-to-Text) SOP

Prerequisites

Install Whisper: pip install openai-whisper
Install ffmpeg, which Whisper uses to decode audio files: brew install ffmpeg
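
A quick way to confirm the ffmpeg prerequisite before transcribing is to check whether the executable is on PATH. The following is a minimal sketch using only the standard library; the helper name missing_tools is hypothetical and not part of Whisper.

```python
import shutil


def missing_tools(tools):
    """Return the names from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(["ffmpeg"])
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("ffmpeg found on PATH")
```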

Usage

import whisper

# "tiny" and "base" are the fastest models on Apple Silicon
model = whisper.load_model("base")
result = model.transcribe("audio.wav")  # path to the audio file
print(result["text"])  # full transcript as a single string

Audio Capture (R14 Verified)

Microphone: Verified with sounddevice. Call sd.query_devices() to list devices and find the index of the input device.
System Loopback (Internal Audio):
- macOS does not expose system audio as an input device. Standard capture APIs (avfoundation, coreaudio) list only physical hardware.
- Requirement: the BlackHole virtual audio driver (2ch or 16ch) must be installed to provide a loopback device.
- Workflow: Install BlackHole -> Set macOS Audio Output to "BlackHole" -> Select "BlackHole" as the Input Device in sounddevice.
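
The device lookup and capture steps above can be sketched as follows. This is an illustration under assumptions, not a verified part of this SOP: the helper names find_input_device and record_seconds are hypothetical, the 16 kHz mono float32 format is chosen because Whisper works on 16 kHz audio, and the selected device is assumed to accept that sample rate. The lookup operates on the list of device dictionaries that sd.query_devices() returns.

```python
def find_input_device(devices, name_substring):
    """Return the index of the first input-capable device whose name
    contains `name_substring` (case-insensitive), or None if none match."""
    needle = name_substring.lower()
    for index, device in enumerate(devices):
        if device["max_input_channels"] > 0 and needle in device["name"].lower():
            return index
    return None


def record_seconds(device_index, seconds=5, samplerate=16000):
    """Record mono float32 audio from the given input device."""
    # Imported lazily so the lookup helper works without sounddevice installed.
    import sounddevice as sd

    frames = int(seconds * samplerate)
    audio = sd.rec(frames, samplerate=samplerate, channels=1,
                   dtype="float32", device=device_index)
    sd.wait()  # block until the recording has finished
    return audio[:, 0]  # flatten (frames, 1) to a 1-D array


if __name__ == "__main__":
    import sounddevice as sd

    index = find_input_device(sd.query_devices(), "BlackHole")
    if index is None:
        print("No BlackHole input device found")
    else:
        samples = record_seconds(index, seconds=5)
        print("Captured", len(samples), "samples")
```

The returned array can be passed to model.transcribe() in place of a file path, since Whisper accepts a float32 NumPy array of 16 kHz mono audio.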

Notes

Running the larger models (medium, large) on the Apple Silicon CPU is slow. Use faster-whisper for production workloads.
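
For reference, a minimal faster-whisper sketch might look like the following. It assumes faster-whisper has been installed separately (pip install faster-whisper); the helper names join_segments and transcribe_fast are hypothetical, and the int8 compute type on CPU is an assumed setting rather than one tested in this SOP.

```python
def join_segments(segments):
    """Concatenate the text of each transcribed segment into one string."""
    return "".join(segment.text for segment in segments).strip()


def transcribe_fast(path, model_size="base"):
    """Transcribe an audio file with faster-whisper on the CPU."""
    # Imported lazily so join_segments can be used without faster-whisper installed.
    from faster_whisper import WhisperModel

    # compute_type="int8" is an assumption, not a setting from this SOP.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    # transcribe() returns a lazy generator of segments plus an info object;
    # the audio is only processed as the generator is consumed.
    segments, _info = model.transcribe(path)
    return join_segments(segments)


if __name__ == "__main__":
    print(transcribe_fast("audio.wav"))
```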
