Audio Transcriber API
Transcribe audio from a URL or from a file uploaded from local disk.
- Python
- FastAPI
- OpenAI Whisper
- Docker

About the project
With this API, users can post a URL or upload an audio file, and the API will transcribe the audio and return the transcription. I created this project so I can transcribe an unlimited number of audio files locally without calling external APIs.
I built the API endpoints using FastAPI, a modern, high-performance web framework for building APIs based on standard Python type hints. It provides automatic interactive documentation, and it is a lot of fun to work with.
How it works
When the client uploads an audio file to the API, the API queues a Celery task to transcribe the audio and returns the task ID.
import os
import pathlib
import tempfile

@app.post('/transcribe')
async def transcribe(audio: AudioFile = None, url: str = Form(None)):
    ...
    if audio:
        # Save the uploaded file into a temporary file
        ext = pathlib.Path(audio.filename).suffix
        # mkstemp returns an open file descriptor and the path;
        # write through the descriptor so it is closed afterwards
        fd, filepath = tempfile.mkstemp(dir='/tmp', suffix=ext)
        with os.fdopen(fd, 'wb') as f:
            f.write(audio.file.read())
        # Transcribe asynchronously
        try:
            task = transcribe_from_file.delay(filepath)
        except TaskException as e:
            raise HTTPException(status_code=500, detail=str(e))
    return {'taskId': task.id}
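The save step in the endpoint can also be factored into a small helper. The following is a minimal sketch using only the standard library; the name save_upload and its parameters are hypothetical and not part of the project. It copies the upload in chunks rather than reading it fully into memory, which may matter for long recordings.

```python
import os
import pathlib
import shutil
import tempfile
from typing import BinaryIO, Optional


def save_upload(stream: BinaryIO, filename: str, tmp_dir: Optional[str] = None) -> str:
    """Copy an uploaded stream to a temporary file and return its path.

    Hypothetical helper for illustration; not part of the original project.
    """
    # Keep the original file extension, as the endpoint does.
    ext = pathlib.Path(filename).suffix
    # mkstemp returns an already-open OS-level descriptor plus the path.
    fd, filepath = tempfile.mkstemp(dir=tmp_dir, suffix=ext)
    # Wrapping the descriptor ensures it is closed when the block exits.
    with os.fdopen(fd, 'wb') as out:
        # copyfileobj reads and writes in fixed-size chunks.
        shutil.copyfileobj(stream, out)
    return filepath
```

Inside the endpoint, this could be called as save_upload(audio.file, audio.filename, '/tmp'), assuming audio.file is a binary file-like object as the original code suggests.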
The client then needs to check the status of the task on a separate endpoint. Once transcription is done, that endpoint also returns the result.
@app.get('/transcribe/{task_id}')
async def transcribe_status(task_id: str):
    task = celery.AsyncResult(task_id)
    if task.ready():
        return {'status': 'DONE', 'result': task.get()}
    else:
        return {'status': 'IN_PROGRESS'}
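On the client side, checking the status usually means polling. Here is a rough sketch of such a loop using only the standard library. The function names, the polling interval, the timeout, and the base URL http://localhost:8000 are all assumptions for illustration; the project does not specify them.

```python
import json
import time
from typing import Any, Callable, Dict
from urllib.request import urlopen


def http_fetch_status(task_id: str, base_url: str = 'http://localhost:8000') -> Dict[str, Any]:
    """Call GET /transcribe/{task_id} and return the parsed JSON body.

    The base URL is an assumption; adjust it to wherever the API is served.
    """
    with urlopen(f'{base_url}/transcribe/{task_id}') as resp:
        return json.load(resp)


def wait_for_transcription(
    fetch_status: Callable[[str], Dict[str, Any]],
    task_id: str,
    interval: float = 2.0,
    timeout: float = 600.0,
) -> Any:
    """Poll until the task reports DONE, then return its result.

    Hypothetical helper; fetch_status is any callable that maps a task ID
    to the status endpoint's JSON body.
    """
    deadline = time.monotonic() + timeout
    while True:
        body = fetch_status(task_id)
        if body.get('status') == 'DONE':
            return body.get('result')
        # Give up once the deadline has passed instead of polling forever.
        if time.monotonic() >= deadline:
            raise TimeoutError(f'task {task_id} not finished after {timeout} seconds')
        time.sleep(interval)
```

A call such as wait_for_transcription(http_fetch_status, task_id) would then block until the result is available. Passing the fetch function as an argument keeps the loop easy to test without a running server.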
Under the hood, the API uses OpenAI's Whisper model to transcribe the audio. The 'base' model is relatively small, but it gives very good results.
import whisper
...
@celery.task
def transcribe_from_file(filepath: str):
    model = whisper.load_model('base')
    result = model.transcribe(filepath)
    ...
    return result
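In the task above, whisper.load_model runs on every call. One possible refinement, sketched here on the assumption that each Celery worker process handles many tasks, is to cache the loaded model so it is loaded only once per process. The names get_model and transcribe_file are hypothetical and not part of the project.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def get_model(name: str = 'base'):
    """Load a Whisper model once per process and reuse it afterwards.

    Hypothetical helper for illustration; not part of the original project.
    """
    # Imported lazily so the import cost is paid only where the model is used.
    import whisper
    return whisper.load_model(name)


def transcribe_file(filepath: str, model_name: str = 'base') -> dict:
    """Transcribe one file with the cached model (plain function, no Celery)."""
    # Always pass the name positionally so lru_cache sees a consistent key.
    model = get_model(model_name)
    return model.transcribe(filepath)
```

The Celery task could then delegate to transcribe_file. The trade-off is that the model stays in memory for the lifetime of each worker process.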
Summary
By using this API, users can transcribe their audio files locally without paying for third-party APIs. You can download the source code on my GitHub.