Realtime GPT-4o photo, video & voice chat

A lightweight demo application for real-time GPT-4o communication including photo/video support, GPT-4o vision integration and voice chat.

https://youtu.be/Bh5tORytR90

Introduction

OpenAI has demonstrated GPT-4o's real-time vision capabilities, but these features are not yet publicly available. This demo application shows how you can use OpenAI's APIs to build similar real-time voice/video communication with integrated AI vision.

Quick Start

Configuration, installation & running

mv config.example.js config.js  # then set your API key and language in config.js
npm install
npm run start

Open: http://localhost:3000
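
As a rough illustration of the configuration step, the sketch below shows what a config.js with a key and a language setting might look like, plus a small check that the placeholder key was replaced. The field names apiKey and language and the helper validateConfig are assumptions made for this example; the actual names are whatever config.example.js defines.

```javascript
// Hypothetical shape of config.js. The real field names are defined in
// config.example.js and may differ from the ones assumed here.
const config = {
  apiKey: "YOUR_OPENAI_API_KEY", // assumption: field holding the OpenAI API key
  language: "en",                // assumption: language used in the conversation
};

// Return a list of human-readable problems; an empty list means the
// config looks usable.
function validateConfig(cfg) {
  const problems = [];
  if (
    typeof cfg.apiKey !== "string" ||
    cfg.apiKey.trim() === "" ||
    cfg.apiKey === "YOUR_OPENAI_API_KEY"
  ) {
    problems.push("apiKey is missing or still the placeholder");
  }
  if (typeof cfg.language !== "string" || cfg.language.trim() === "") {
    problems.push("language is missing");
  }
  return problems;
}

// With the untouched placeholder above, this reports the apiKey problem.
console.log(validateConfig(config));
```

Running a check like this before starting the server would surface a forgotten key early, rather than as a failed connection in the browser.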

How it works:

Turn on your webcam and microphone
Click "Connect", then hold the "Push to Talk" button while speaking
Ask something about an object/situation on the video stream
Release the button; the question and a screenshot will be analyzed by GPT-4o vision
From there you can ask follow-up questions or move on to a new topic
Have fun!
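
The "question plus screenshot" step above can be sketched as follows. This is a minimal illustration, not the code used in this repository: it assumes the question is already available as text and the video frame as a base64-encoded JPEG, and it builds a request body in the image-input format of OpenAI's Chat Completions API. The function name buildVisionRequest is hypothetical.

```javascript
// Build a Chat Completions request body that pairs a text question
// with one JPEG frame, passed as a base64 data URL.
function buildVisionRequest(question, jpegBase64, model = "gpt-4o") {
  if (!question || !jpegBase64) {
    throw new Error("question and jpegBase64 are both required");
  }
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: question },
          {
            type: "image_url",
            image_url: { url: `data:image/jpeg;base64,${jpegBase64}` },
          },
        ],
      },
    ],
  };
}

// In the browser, a frame could be grabbed from a <video> element like this
// (left as a comment because it needs a DOM):
//   const canvas = document.createElement("canvas");
//   canvas.width = video.videoWidth;
//   canvas.height = video.videoHeight;
//   canvas.getContext("2d").drawImage(video, 0, 0);
//   const jpegBase64 = canvas.toDataURL("image/jpeg").split(",")[1];

const body = buildVisionRequest("What am I holding?", "QUJD");
console.log(JSON.stringify(body, null, 2));
```

The resulting object would be sent as the JSON body of a POST request; keeping the payload construction in a pure function makes it easy to test without a camera or network access.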

Dev notes:

The code is unsafe for production: the API key is exposed in the client-side code.
Requirements: Node.js, a modern web browser and an OpenAI API key
By default, Chrome blocks camera access on insecure (HTTP) origins. This can be changed per origin via: chrome://flags/#unsafely-treat-insecure-origin-as-secure
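
To illustrate the first note, one common way to avoid exposing the key is to have the browser call your own server, which adds the key before forwarding the request. The sketch below shows only that idea for a plain HTTP call; it is not part of this repository, the helper name buildUpstreamRequest is hypothetical, and reading the key from an OPENAI_API_KEY environment variable is an assumption. The realtime connection used by this demo would need a comparable relay on the server side.

```javascript
// Server-side helper: attach the API key from the server environment so it
// never has to be shipped to the browser.
function buildUpstreamRequest(clientBody, env = process.env) {
  const key = env.OPENAI_API_KEY; // assumption: key stored in this env variable
  if (!key) {
    throw new Error("OPENAI_API_KEY is not set on the server");
  }
  return {
    url: "https://api.openai.com/v1/chat/completions",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${key}`,
      },
      body: JSON.stringify(clientBody),
    },
  };
}

// Possible use inside a server route (hypothetical, Node 18+ with global fetch):
//   const { url, options } = buildUpstreamRequest(requestBodyFromBrowser);
//   const upstream = await fetch(url, options);

const demo = buildUpstreamRequest({ model: "gpt-4o" }, { OPENAI_API_KEY: "test-key" });
console.log(demo.url, Object.keys(demo.options.headers));
```

With this split, the browser only ever talks to your server, and the key stays in the server's environment.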

Disclaimer

Most of the code is based on the OpenAI demo repo; their code has been slightly modified/minified.
The model (gpt-4o-realtime-preview) currently has strict usage limits. Expect only about 5 minutes of usage in Tier 1 (to be fair, the realtime model is quite expensive to run).

basvandorst/realtime-gpt4o-videochat
