Paper Translator with Chroma-db / Create Knowledge Database with Chromadb, Pinecone

seohyunjun/paper-translator

paper-translator

  1. This is a paper translator (to Korean) built with LangChain.
  2. It automatically translates PDFs supplied as a URL or a local file.

version history

v0.1.9 2024/03/10

  • select embedding model
    • text-embedding-3-small
    • text-embedding-3-large

v0.1.8 2024/02/13

  • Add ChromaDB

v0.1.7 2023/12/16

  • Vectorstore using pinecone

v0.1.6 2023/11/27

  • add GPT4-Vision API

v0.1.5 2023/7/25

  • Add YouTube script translator (using youtube-dl)

v0.1.4 2023/7/9

  • use langchain schema

v0.1.3 2023/6/23

  • URL -> markdown
    • require brew install libmagic

v0.1.2 2023/6/15

  • ChatGPT API update: gpt-3.5-turbo-16k
    • Token limit 4k -> 16k (covers about 3 pages per request)

v0.1.1 2023/6/6

  • ConstitutionalChain (test): fixes the output when its format is wrong.

v0.1.0 2023/6/4

  • Paper translator using LangChain
  • Preprocessing for papers (e.g., splitting off the References section)
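
The reference-splitting step mentioned under v0.1.0 could look roughly like the sketch below. This is not the repository's code: the function name `split_references`, the heading pattern, and the choice to split at the last matching heading are all assumptions made for illustration.

```python
import re

# Assumption: the reference list begins at a line that contains only
# "References" or "Bibliography", optionally preceded by a section number.
_REF_HEADING = re.compile(
    r"^[ \t]*(?:\d+\.?[ \t]*)?(?:references|bibliography)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def split_references(text):
    """Split paper text into (body, references) at the last reference heading.

    Returns (text, "") when no heading is found.
    """
    matches = list(_REF_HEADING.finditer(text))
    if not matches:
        return text, ""
    start = matches[-1].start()
    return text[:start].rstrip(), text[start:].strip()
```

Splitting at the last heading rather than the first is a guess meant to avoid cutting the paper at an earlier line that happens to read "References"; only the body would then be sent for translation.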

Usage guide

The LangChain LLM used here is backed by OpenAI, so an OpenAI API key is required.

# OPENAI API key
OPENAI_API_KEY="..."

# Pinecone API key
PINECONE_API_KEY="..."
PINECONE_ENVIRONEMENT="..."
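
Before running the translator, it can help to confirm that these variables are actually set. The sketch below is illustrative only and not part of the repository: the helper names are hypothetical, the variable names are copied verbatim from the block above, and treating the Pinecone variables as optional is an assumption.

```python
import os

# Names copied verbatim from the README. Which ones a given run needs
# is an assumption: OpenAI always, Pinecone only for the Pinecone store.
REQUIRED_VARS = ("OPENAI_API_KEY",)
OPTIONAL_VARS = ("PINECONE_API_KEY", "PINECONE_ENVIRONEMENT")


def missing_vars(names, env=None):
    """Return the names that are unset or empty in env (default: os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


def check_environment(env=None):
    """Return (missing_required, missing_optional) as two lists of names."""
    return missing_vars(REQUIRED_VARS, env), missing_vars(OPTIONAL_VARS, env)
```

Calling `check_environment()` with no arguments inspects the current process environment; a non-empty first list means the OpenAI key still needs to be exported.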

Install guide

git clone https://github.com/seohyunjun/paper-translator
cd paper-translator
python -m pip install -r ./requirements.txt

Example

python main.py --pdf https://arxiv.org/pdf/2304.06035v1.pdf --verbose 1 --outputfile ChooseYourWeapon.md

