Transcription Service App

Transcription Service App is a web app for universities to make simple transcriptions from video or audio files in multiple languages, currently tailored towards Open AI's Whisper models.

Some of its features are:

Supports transcriptions with or without simultaneous translations to multiple languages.
Simple interface.
Access to two of Open AI's Whisper models (base and large-v3).
Supports upload from videos and audio files (up to 1gb) as well as YouTube links.
Users can edit and download transcription results in 4 different formats (txt, vtt, srt and json).
Diarization support to detect multiple speakers (up to 20).
Srt, vtt and json formats provide timestamp and speaker information (when available).
Transcribed subtitles can be activated in uploaded videos.

Usage & Configuration

You first need to set up a whisperx API server to work with this app.

Some environment variables should be set. Here is an example of a .env file:

# PATH to the ffmpeg library in your system
FFMPEG_PATH=/usr/bin/ffmpeg
# Path where temporal files will be generated 
TEMP_PATH=transcription-whisper-temp
# Uncomment this up if you're using an authentication process to allow users to log out
#LOGOUT_URL=/oauth2/sign_out
# Url and port to the API server
API_URL=http://111.111.111.11:11300

Development

The app is developed in the streamlit framework.

You can install the requirements needed to run and develop the app using pip install -r requirements.txt. Then simply run a development server like this:

streamlit run app.py

Authors

virtUOS

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Transcription Service App

Usage & Configuration

Development

Authors

Files

README.md

Latest commit

History

README.md

File metadata and controls

Transcription Service App

Usage & Configuration

Development

Authors