An image-to-text agent using NLP and the Llama 3.2 11B Vision Model.
The agent, acting as an expert English teacher, analyzes the image file, extracts keywords, groups them semantically, and crafts concise sentences demonstrating correct usage.
- Feed a scanned textbook page to the model:
  => Website
- Make sure to run the Flask app in your terminal and turn off the ad blocker in your browser.
Automate the vocabulary-building process through the following steps:
- Image Upload:
  - Upload an image (jpg/png) to the system.
  - Encode the image by paragraph.
- Retrieve Key Information:
  - Employs the Llama 3.2 11B Vision Model (running on Together AI).
  - Uses NLTK to process the prompt.
  - Extracts the paragraph, questions, and correct answers as strings from the encoded image.
- AI-Powered Vocabulary List:
  - Send the retrieved string data to the agent.
  - Generate a vocabulary list with a meaning and a sample sentence for each word.
- User Interaction:
  - Present the vocabulary list to the user. (This repository contains a simple React app as the user interface.)
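The upload/encode/retrieve steps above can be sketched as a single request-building helper. This is a minimal sketch, assuming Together AI's OpenAI-compatible chat message format; the model identifier and prompt wording are assumptions to verify against the actual `agents.py`:

```python
import base64

# Assumed Together AI model identifier -- check agents.py for the real one.
MODEL = "meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo"

def encode_image(image_bytes: bytes) -> str:
    """Base64-encode raw image bytes for inline transport (upload step)."""
    return base64.b64encode(image_bytes).decode("utf-8")

def build_vision_request(image_bytes: bytes, prompt: str) -> dict:
    """Build a chat request asking the vision model to extract the
    paragraph, questions, and correct answers from the image."""
    return {
        "model": MODEL,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": "data:image/png;base64,"
                                      + encode_image(image_bytes)}},
            ],
        }],
    }
```

With the `together` SDK, the resulting dict could be passed as `client.chat.completions.create(**request)`; the response text then feeds the vocabulary-list agent.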
[data-doc-management]
- Chroma DB: Vector database for storing and querying the uploaded images
- SQLite: Database for storing application data
[ai-model-curation]
- Together AI: Hosting the Llama 3.2 11B Vision Model for image analysis, text processing, and response generation
- AIML API: Curation platform for accessing AI models and other tasks
[task-handling]
- NLTK: Natural Language Toolkit for building Python programs that work with human language data
[deployment-framework]
- Python: Primary programming language (we use ver 3.12)
- Flask: Web framework for the backend API
- Flask-Cors: A Flask extension for handling Cross-Origin Resource Sharing (CORS), making cross-origin AJAX possible
- pipenv: Python package manager
- pre-commit: Manages and maintains pre-commit hooks
- React: Frontend framework
- Vercel: User endpoint
.
├── __init__.py
├── app.py # Flask application
├── agents.py # Define the AI agents
├── Prompts/ # Store prompt and system context templates
│   ├── System.py
│   ├── User.py
│   └── ...
├── db/ # Database files
│ ├── chroma.sqlite3
│ └── ...
├── sample_textbook_images/ # Sample textbook images for the test
└── uploads/ # Uploaded image files
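The `app.py` entry point pairs the Flask API with the `uploads/` directory shown above. A minimal sketch of what the upload route might look like; the route path, form field name, and response shape are assumptions, not the repository's exact implementation (the real app also enables Flask-Cors for the React frontend):

```python
import os
from flask import Flask, jsonify, request

UPLOAD_DIR = "uploads"          # matches the directory in the tree above
ALLOWED = {".jpg", ".jpeg", ".png"}

app = Flask(__name__)

@app.route("/upload", methods=["POST"])
def upload():
    """Accept a jpg/png upload and save it for later analysis."""
    file = request.files.get("image")
    if file is None:
        return jsonify(error="no file provided"), 400
    ext = os.path.splitext(file.filename)[1].lower()
    if ext not in ALLOWED:
        return jsonify(error="only jpg/png accepted"), 400
    os.makedirs(UPLOAD_DIR, exist_ok=True)
    path = os.path.join(UPLOAD_DIR, file.filename)
    file.save(path)
    return jsonify(saved=path), 201
```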
- Install the `pipenv` package manager:

  ```
  pip install pipenv
  ```

- Install dependencies:

  ```
  pipenv shell
  pip install -r requirements.txt -v
  ```

- Set up environment variables: create a `.env` file in the project root and add the following:

  ```
  TOGETHER_API_KEY=your_together_api_key
  ```

- Test the AI assistant:

  ```
  pipenv shell
  python main.py
  ```

  In the terminal, you can trace the process as it analyzes the sample textbook data.

- Start the Flask backend:

  ```
  python -m flask run --debug
  ```

  The backend will be available at http://localhost:5000.

- In a separate terminal, run the React frontend app:

  ```
  cd frontend
  npm start
  ```

  The frontend will be available at http://localhost:3000. Call the Flask API from the frontend app to see the result in the user interface.
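The `TOGETHER_API_KEY` set during the steps above has to reach the Python process. If you prefer not to add a dependency such as python-dotenv, a few lines of stdlib Python can read the `.env` file; this sketch assumes plain `KEY=value` lines with no quoting or variable expansion:

```python
import os

def load_env(path: str = ".env") -> None:
    """Read simple KEY=value lines into os.environ.

    Existing environment variables win (setdefault), so a key exported
    in the shell is never overwritten by the file.
    """
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```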
- Add a package: `pipenv install <package>`
- Remove a package: `pipenv uninstall <package>`
- Run a command in the virtual environment: `pipenv run <command>`
- After adding or removing a package, update `requirements.txt` accordingly, or run `pip freeze > requirements.txt` to reflect the changes in dependencies.
- To reinstall all the dependencies, delete `Pipfile` and `Pipfile.lock`, then run:

  ```
  pipenv shell
  pipenv install -r requirements.txt -v
  ```
- Install pre-commit hooks: `pipenv run pre-commit install`
- Run pre-commit checks manually: `pipenv run pre-commit run --all-files`

Pre-commit hooks help maintain code quality by running formatting, linting, and other checks before each commit.

*To skip pre-commit hooks: `git commit --no-verify -m "your-commit-message"`
To modify or add new AI agents, edit the `agents.py` file. Each agent is defined with a specific role, goal, and set of tools.
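A hypothetical sketch of what such a role/goal/tools definition might look like; the class and instance names here are illustrative, not the actual contents of `agents.py`:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    role: str                                   # persona the model adopts
    goal: str                                   # what the agent must produce
    tools: list[Callable] = field(default_factory=list)  # callable helpers

    def system_prompt(self) -> str:
        """Render the role and goal as a system message."""
        return f"You are {self.role}. Your goal: {self.goal}"

# Illustrative agent matching this project's use case.
teacher = Agent(
    role="an expert English teacher",
    goal="build a vocabulary list with meanings and sample sentences",
)
```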
To modify or add templated prompts, edit or add files in the `Prompts/` folder.
- This project employs the Chain-of-Thought technique as well as Role-Based Prompting.
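The two techniques combine naturally in the system/user templates: the system message assigns the persona (role-based prompting) and the user template asks for explicit intermediate steps (chain of thought). An illustrative sketch; the wording is hypothetical, so check the actual templates under `Prompts/`:

```python
# Role-based prompting: the system message assigns a persona.
SYSTEM = ("You are an expert English teacher who builds vocabulary lists "
          "from textbook passages.")

# Chain of thought: the user template requests explicit intermediate steps.
USER_TEMPLATE = (
    "Passage:\n{passage}\n\n"
    "Think step by step:\n"
    "1. List the key vocabulary words.\n"
    "2. Group them by semantic similarity.\n"
    "3. For each word, give a short meaning and one sample sentence."
)

def build_messages(passage: str) -> list[dict]:
    """Assemble the chat messages for one textbook passage."""
    return [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": USER_TEMPLATE.format(passage=passage)},
    ]
```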
The system uses Chroma DB to store and query the images uploaded. To update the knowledge base:
- Add new textbook images to the `uploads/` directory.
- Modify the `agents.py` file to update the ingestion process if necessary.
- Run the ingestion process to update the Chroma DB.
- Fork the repository
- Create your feature branch (`git checkout -b feature/your-amazing-feature`)
- Commit your changes (`git commit -m 'Add your-amazing-feature'`)
- Push to the branch (`git push origin feature/your-amazing-feature`)
- Open a pull request
Common issues and solutions:
- API key errors: Ensure all API keys in the `.env` file are correct and up to date.
- Database connection issues: Check if the Chroma DB is properly initialized and accessible.
- Memory errors: If processing large images, you may need to increase the available memory for the Python process.
- Issues related to the AI agents or RAG system: Check the `output.log` file for detailed error messages and stack traces.