Follow This Series Live: Jarvis 2.0 Series
JARVIS-AGI is an advanced AI project designed to integrate multiple AI capabilities, including speech recognition, text processing, and image analysis, into a cohesive system. Named after the iconic AI assistant from popular culture, Jarvis is built with cutting-edge natural language processing capabilities, allowing users to interact with it through voice commands. Whether it's checking the weather, setting reminders, managing calendars, or searching the web, Jarvis is equipped to handle a wide range of tasks efficiently and effectively. With its intuitive interface and robust functionality, Jarvis aims to revolutionize the way users engage with technology, making everyday tasks simpler and more convenient.
- Project Overview
- Features
- Directory Structure
- Installation
- Usage
- Contributing
- License
- Connect with Us
- Speech Recognition: Convert spoken language into text using various models.
- Text Processing: Analyze and generate text with multiple AI tools.
- Image Analysis: Perform image recognition and processing tasks.
- Audio Tools: Detect hotwords and manage audio playback interruptions.
- Interactive Prompts: Predefined prompts to guide AI interactions.
The project is organized into several key directories:
JARVIS-AGI/
├── .env
├── .env.example
├── .gitattributes
├── .gitignore
├── ASSETS/
│   ├── CLAP_DETECTS/
│   │   ├── MODELS/
│   │   └── Model.txt
│   ├── SOUNDS/
│   │   ├── activation_sound.wav
│   │   ├── audio_file.mp3
│   │   └── deactivation_sound.wav
│   ├── STREAM_AUDIOS/
│   │   ├── output_audio_6.mp3
│   │   ├── output_audio_7.mp3
│   │   ├── output_audio_8.mp3
│   │   ├── output_audio_9.mp3
│   │   ├── output_audio_10.mp3
│   │   ├── output_audio_11.mp3
│   │   ├── output_audio_12.mp3
│   │   ├── output_audio_13.mp3
│   │   ├── output_audio_14.mp3
│   │   ├── output_audio_15.mp3
│   │   ├── output_audio_16.mp3
│   │   ├── output_audio_17.mp3
│   │   ├── output_audio_18.mp3
│   │   ├── output_audio_19.mp3
│   │   └── output_audio_20.mp3
│   ├── USERDATA/
│   │   ├── LE CHAT/
│   │   └── How_To_Store_UserData.txt
│   ├── Vosk/
│   ├── available_working_proxies.txt
│   ├── conversation_history.json
│   └── openGPT_IDs.txt
├── BRAIN/
│   ├── AI/
│   │   ├── IMAGE/
│   │   │   ├── decohere_ai.py
│   │   │   └── deepInfra_IMG.py
│   │   ├── TEXT/
│   │   │   ├── API/
│   │   │   │   ├── Blackbox_ai.py
│   │   │   │   ├── Bnn_GPT.py
│   │   │   │   ├── FarFalle.py
│   │   │   │   ├── Hugging_Face_TEXT.py
│   │   │   │   ├── Le_Chat.py
│   │   │   │   ├── Phind.py
│   │   │   │   ├── Pi_Ai.py
│   │   │   │   ├── Uncensored.py
│   │   │   │   ├── basedGPT.py
│   │   │   │   ├── deepInfra_TEXT.py
│   │   │   │   ├── deepseek_ai.py
│   │   │   │   ├── hugging_chat.py
│   │   │   │   ├── liaobots.py
│   │   │   │   ├── openGPT.py
│   │   │   │   └── openrouter.py
│   │   │   ├── LOCAL/
│   │   │   │   └── llama_CPP.py
│   │   │   └── STREAM/
│   │   │       ├── basedGPT.py
│   │   │       └── deepInfra_TEXT.py
│   │   └── VISION/
│   │       └── deepInfra_VISION.py
│   └── TOOLS/
│       └── groq_web_access.py
├── ENGINE/
│   ├── STT/
│   │   ├── DevsDoCode.py
│   │   ├── NetHyTech.py
│   │   ├── src/
│   │   │   └── index.html
│   │   └── vosk_recog.py
│   └── TTS/
│       ├── STREAMING/
│       │   ├── DeepGram.py
│       │   └── speechify.py
│       ├── DeepGram.py
│       ├── ElevenLabs.py
│       ├── ai_voice.py
│       ├── deepAI.py
│       ├── edge_tts.py
│       ├── hearling.py
│       ├── speechify.py
│       └── stream_elements_api.py
├── PLAYGROUND/
│   ├── ADB_CALL/
│   │   ├── ADB COMMANDS.txt
│   │   ├── Details.txt
│   │   ├── IMP Commands.txt
│   │   ├── Information.txt
│   │   ├── android_device_connection_setup.py
│   │   └── make_call.py
│   ├── CAMERA/
│   │   └── camera_vision.py
│   ├── CLAP_NN/
│   │   ├── DATASETS/
│   │   │   └── Informtation.txt
│   │   ├── ClapDetector.py
│   │   ├── Model_Trainer.py
│   │   ├── audio_inference.py
│   │   ├── cnn_sound_model.py
│   │   └── load_dataset.py
│   └── WEBSITE_ASSISTANT/
│       ├── chrome_latest_url.py
│       └── jenna_reader.py
├── PROMPTS/
│   ├── BISECTORS.py
│   ├── INSTRUCTIONS.py
│   ├── PROMPTS.py
│   └── SYSTEM.py
├── TOOLS/
│   ├── AUDIO/
│   │   ├── Hotword_Detection.py
│   │   └── Interrupted_Playsound.py
│   ├── LE_CHAT_COOKIES/
│   │   └── Cookie_Extractor.py
│   ├── SYSTEM_SETTINGS/
│   │   ├── SETTING.py
│   │   ├── system_theme.py
│   │   └── taskbar.py
│   ├── Alpaca_DS_Converser.py
│   ├── ProxyAPI.py
│   ├── RawDog.py
│   ├── TXT_DS_Converser.py
│   ├── Web_Results.py
│   └── stream_audio_cleanup.py
├── CODE_OF_CONDUCT.md
├── IMPORTS.py
├── LICENCE
├── Le_Chat_Tester.py
├── Memory ConvoTxt.py
├── SpeedTester.py
├── StreamSpeak.py
├── WebTester.py
├── main.py
├── readme.md
└── requirements.txt
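Scripts in this project import modules by package path from the project root, so an incomplete clone tends to surface as an import error. The following is a minimal sketch of a layout check, not code from the repository: the helper name missing_directories is hypothetical, and the directory names are simply the top-level folders listed in the tree.

```python
from pathlib import Path

# Top-level directories listed in the project tree.
EXPECTED_DIRS = ("ASSETS", "BRAIN", "ENGINE", "PLAYGROUND", "PROMPTS", "TOOLS")


def missing_directories(project_root):
    """Return the expected top-level directories absent from project_root."""
    root = Path(project_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run from the project root; "." is the current working directory.
    missing = missing_directories(".")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Project layout looks complete.")
```

Run it from the JARVIS-AGI folder; an empty result means all six top-level directories are present.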
- Clone the repository:
git clone https://github.com/SreejanPersonal/JARVIS-AGI.git
cd JARVIS-AGI
- Install the required packages:
pip install -r requirements.txt
- (Optional) Install Vosk speech recognition models:
Vosk provides pre-trained models for various languages. To install a model for your language, follow these steps:
  - Go to the Vosk GitHub repository releases page: Vosk GitHub Releases
  - Download the model folder for your language. For example, for English, download the folder named vosk-model-en-us-aspire-0.2.
  - Extract the contents of the folder into the ASSETS directory in your project directory.
  - Ensure that the extracted model folder is directly under the ASSETS directory, without any additional nesting.
  - You should now have a structure like this: <your-main-project-directory>/ASSETS/vosk-model-en-us-aspire-0.2.
  - Modify main.py or any relevant script to point to the model directory. Note that the example below uses a different model stored under ASSETS/Vosk; set model_path to wherever your model folder actually lives. For example:
from ENGINE.STT.vosk_recog import speech_to_text

for speech in speech_to_text(model_path="ASSETS/Vosk/vosk-model-small-en-us-0.15"):
    if speech != "":
        print("Human >>", speech)
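A wrong model_path is an easy mistake here, because the folder name depends on which model you downloaded and where you extracted it. The sketch below looks for a model folder under ASSETS before recognition starts. It is an illustration only, not code from the repository: find_model_dir is a hypothetical helper, and the assumption that model folders start with vosk-model comes solely from the example names in the steps above.

```python
from pathlib import Path


def find_model_dir(assets_dir="ASSETS"):
    """Return the first folder under assets_dir (searched recursively)
    whose name starts with 'vosk-model', or None if there is none."""
    root = Path(assets_dir)
    if not root.is_dir():
        return None
    candidates = sorted(p for p in root.rglob("vosk-model*") if p.is_dir())
    return candidates[0] if candidates else None


if __name__ == "__main__":
    model_dir = find_model_dir()
    if model_dir is None:
        print("No Vosk model folder found under ASSETS; download one first.")
    else:
        # Pass this value as model_path to speech_to_text.
        print("Use model_path =", model_dir.as_posix())
```

Searching recursively means the check works whether the model sits directly under ASSETS or under ASSETS/Vosk.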
- Run the main script:
python main.py
- Configuration: Modify the API configuration in the .env file to suit your needs.
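For reference, a .env file is a plain-text list of KEY=VALUE lines. The sketch below shows one way such a file could be read with only the standard library. It is not the loader this project uses: load_env and the key name EXAMPLE_API_KEY are hypothetical, and the keys the project actually expects are presumably the ones listed in .env.example.

```python
import os
from pathlib import Path


def load_env(path=".env"):
    """Parse KEY=VALUE lines from path into a dict.

    Blank lines, lines starting with '#', and lines without '=' are
    skipped. Matching single or double quotes around a value are removed.
    """
    values = {}
    env_file = Path(path)
    if not env_file.is_file():
        return values
    for raw_line in env_file.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


if __name__ == "__main__":
    config = load_env()
    # EXAMPLE_API_KEY is a placeholder name, not a key the project defines.
    api_key = config.get("EXAMPLE_API_KEY") or os.environ.get("EXAMPLE_API_KEY")
    print("API key configured:", bool(api_key))
```

Keeping secrets in .env rather than in source files also explains why the repository ships a separate .env.example: the example file can be committed while the real one stays local.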
We welcome contributions to improve JARVIS-AGI. To contribute, follow these steps:
- Fork the repository.
- Create a new branch (git checkout -b feature-branch).
- Commit your changes (git commit -m 'Add new feature').
- Push to the branch (git push origin feature-branch).
- Create a new Pull Request.
This project is licensed under the MIT License. See the LICENCE file for details.
Made by Sree (Devs Do Code)
For any questions or concerns, reach out to us via our social media handles. Our top choice for contact is Telegram: Devs Do Code Telegram
- YouTube Channel: Devs Do Code
- Telegram Group: Devs Do Code Telegram
- Discord Server: Devs Do Code Discord
- Instagram:
  - Personal: Sree
  - Channel: Devs Do Code
Dive into the world of coding with Devs Do Code - where passion meets programming! Make sure to hit that Subscribe button to stay tuned for exciting content!
Pro Tip: For the most reliable experience, we recommend sticking with the library versions this project was built with rather than upgrading them. Happy coding!
Now you're all set to explore the Devs Do Code project! Enjoy coding!