From 9f42e255a292f68782b8f4715c24f21bf463437c Mon Sep 17 00:00:00 2001
From: Shreya Shankar
Date: Tue, 17 Sep 2024 10:35:52 -0700
Subject: [PATCH] Update docs

---
 README.md                 |  8 +++++---
 docs/community/roadmap.md |  1 +
 docs/installation.md      | 22 ++++++++++++++++------
 3 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/README.md b/README.md
index 514f1260..e090781c 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@
 
 DocETL is a powerful tool for creating and executing data processing pipelines, especially suited for complex document processing tasks. It offers a low-code, declarative YAML interface to define complex data operations on complex data.
 
-[Documentation](https://shreyashankar.github.io/docetl) | [Website](https://docetl.com)
+[Website](https://docetl.com) | [Documentation](https://shreyashankar.github.io/docetl) | [Discord](https://discord.gg/fHp7B2X3xx)
 
 ## When to Use DocETL
 
@@ -23,13 +23,15 @@ DocETL is the ideal choice when you're looking to maximize correctness and outpu
 
 ## Installation
 
+See the documentation for installing from PyPI.
+
 ### Prerequisites
 
 Before installing DocETL, ensure you have Python 3.10 or later installed on your system. You can check your Python version by running:
 
 python --version
 
-### Installation Steps
+### Installation Steps (from Source)
 
 1. Clone the DocETL repository:
 
@@ -60,7 +62,7 @@ OPENAI_API_KEY=your_api_key_here
 
 Alternatively, you can set the OPENAI_API_KEY environment variable in your shell.
 
-5. Run the basic test suite to ensure everything is working:
+5. Run the basic test suite to ensure everything is working (this costs less than $0.01 with OpenAI):
 
 ```bash
 make tests-basic
diff --git a/docs/community/roadmap.md b/docs/community/roadmap.md
index f217deda..db8d8646 100644
--- a/docs/community/roadmap.md
+++ b/docs/community/roadmap.md
@@ -35,6 +35,7 @@ mindmap
 
 - **Model Diversity**: Extending support beyond OpenAI to include a wider range of models, with a focus on local models.
 - **OCR and PDF Extraction**: Improving integration with OCR technologies and PDF extraction tools for more robust document processing.
+- **Multimodal Data Processing**: Enhancing DocETL to handle multimodal data, including text, images, audio, and video (as many of the LLMs support multimodal inputs, anyways).
 
 ## Agents and Planning
 
diff --git a/docs/installation.md b/docs/installation.md
index dbc06dc1..cd4fc9f1 100644
--- a/docs/installation.md
+++ b/docs/installation.md
@@ -18,7 +18,11 @@ python --version
 pip install docetl
 ```
 
-This command will install DocETL along with its dependencies as specified in the pyproject.toml file.
+This command will install DocETL along with its dependencies as specified in the pyproject.toml file. To verify that DocETL has been installed correctly, you can run the following command in your terminal:
+
+```bash
+docetl version
+```
 
 ## Installation from Source
 
@@ -45,15 +49,21 @@ poetry install
 
 This will create a virtual environment and install all the required dependencies.
 
-## Verifying the Installation
+4. Set up your OpenAI API key:
 
-To verify that DocETL has been installed correctly, you can run the following command in your terminal:
+Create a .env file in the project root and add your OpenAI API key:
 
 ```bash
-docetl version
+OPENAI_API_KEY=your_api_key_here
 ```
 
-If the installation was successful, this command will display the version of DocETL installed on your system.
+Alternatively, you can set the OPENAI_API_KEY environment variable in your shell.
+
+5. Run the basic test suite to ensure everything is working (this costs less than $0.01 with OpenAI):
+
+```bash
+make tests-basic
+```
 
 ## Troubleshooting
 
@@ -63,4 +73,4 @@ If you encounter any issues during installation, please ensure that:
 - You have the latest version of pip installed
 - Your system meets all the requirements specified in the pyproject.toml file
 
-For further assistance, please refer to the project's GitHub repository or reach out to the community for support.
+For further assistance, please refer to the project's GitHub repository or reach out on the [Discord server](https://discord.gg/fHp7B2X3xx).