This project uses OpenAI's Whisper model to transcribe audio and video files. It provides an automated workflow for processing multiple files, organizing transcriptions, and managing transcription history.
- Transcribes audio and video files using OpenAI's Whisper model
- Organizes input and output files automatically
- Maintains a history of transcriptions
- Supports multiple audio and video formats
Before you begin, ensure you have the following installed:
- Python 3.7 or higher
- pip (Python package manager)
- Git (for version control)
You'll also need an OpenAI API key to use the Whisper model.
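If you want to confirm that your interpreter meets the version requirement before installing anything, a one-off check like the following works (this snippet is a convenience, not part of the project):

```python
# Quick check that the current interpreter satisfies the 3.7+ requirement.
import sys

assert sys.version_info >= (3, 7), f"Python 3.7+ required, found {sys.version}"
print("Python version OK:", sys.version.split()[0])
```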
- Clone the repository:

  ```bash
  git clone https://github.com/Yan-Yu-Lin/whisper-transcription.git
  cd whisper-transcription
  ```

- Create a virtual environment:

  ```bash
  python -m venv myenv
  ```

- Activate the virtual environment:

  - On macOS and Linux:

    ```bash
    source myenv/bin/activate
    ```

  - On Windows:

    ```bash
    myenv\Scripts\activate
    ```

- Install the required packages:

  ```bash
  pip install -r requirements.txt
  ```

- Create a `.env` file in the project root and add your OpenAI API key (a sketch of how the script can load it follows this list):

  ```
  OPENAI_API_KEY=your_api_key_here
  ```
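For reference, here is a minimal sketch of how a script can pick up the key from `.env`. It assumes the `python-dotenv` package is available (a common choice for this pattern); the actual loading code in `transcribe.py` may differ.

```python
# Minimal sketch: load the OpenAI API key from a .env file in the project root.
# Assumes the python-dotenv package; transcribe.py may do this differently.
import os

from dotenv import load_dotenv

load_dotenv()  # reads KEY=value pairs from .env into os.environ
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
```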
The project creates and uses the following folder structure:

- `Processing_Video`: Place your audio/video files here for transcription
- `Processed_Video`: Processed files are moved here after transcription
- `Result_Text`: Contains the latest transcription results
- `Result_Archive`: Stores previous transcription results
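Creating these folders is straightforward with the standard library; the following is a sketch of the likely setup step, not necessarily the exact code in `transcribe.py`:

```python
# Sketch: ensure the four project folders exist before processing begins.
from pathlib import Path

FOLDERS = ["Processing_Video", "Processed_Video", "Result_Text", "Result_Archive"]

for name in FOLDERS:
    Path(name).mkdir(exist_ok=True)  # no error if the folder already exists
```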
- Place the audio or video files you want to transcribe in the `Processing_Video` folder.

- Run the transcription script:

  ```bash
  python transcribe.py
  ```

- The script will process all files in the `Processing_Video` folder (see the sketch after this list):

  - Transcribe each file using the Whisper model
  - Move processed files to the `Processed_Video` folder
  - Save transcriptions in the `Result_Text` folder
  - Move any existing transcriptions to the `Result_Archive` folder

- Check the `Result_Text` folder for your transcriptions.
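The core loop might look something like the sketch below. It assumes the `openai` Python SDK (v1+) and OpenAI's hosted `whisper-1` model, with the API key already in the environment; the real `transcribe.py` may be organized differently.

```python
# Sketch of the processing loop described above; the actual transcribe.py
# may differ. Assumes the openai SDK v1+ and OPENAI_API_KEY in the environment.
import shutil
from pathlib import Path

from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

for media in sorted(Path("Processing_Video").iterdir()):
    if not media.is_file():
        continue
    with media.open("rb") as audio_file:
        transcript = client.audio.transcriptions.create(
            model="whisper-1", file=audio_file
        )
    result = Path("Result_Text") / f"{media.stem}.txt"
    if result.exists():  # archive any older transcription with the same name
        shutil.move(str(result), str(Path("Result_Archive") / result.name))
    result.write_text(transcript.text, encoding="utf-8")
    shutil.move(str(media), str(Path("Processed_Video") / media.name))
```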
This project supports the following input file types:
- Audio: mp3, wav, m4a, flac, aac, ogg, wma
- Video: mp4, avi, mov, wmv, flv, mkv
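To check programmatically whether a given file will be picked up, a simple extension filter mirroring the list above can be used (the exact check inside `transcribe.py` may differ):

```python
# Sketch: extension filter mirroring the supported-format list above.
from pathlib import Path

SUPPORTED_EXTENSIONS = {
    ".mp3", ".wav", ".m4a", ".flac", ".aac", ".ogg", ".wma",  # audio
    ".mp4", ".avi", ".mov", ".wmv", ".flv", ".mkv",           # video
}

def is_supported(path: str) -> bool:
    """Return True if the file's extension is one the project accepts."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```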
If you encounter any issues:
- Ensure your OpenAI API key is correctly set in the `.env` file
- Check that you have sufficient API credits with OpenAI
- Verify that your input files are in a supported format
- Make sure you're running the script from the project root directory
Contributions to improve the project are welcome. Please follow these steps:
- Fork the repository
- Create a new branch (`git checkout -b feature/AmazingFeature`)
- Make your changes
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for providing the Whisper model
- All contributors who have helped to improve this project