Tool Collection

A collection of Python and shell scripts that use the GPT-4 language model to process and analyze text files, particularly scientific papers. The tools automate tasks such as summarization, reformatting, and extraction of specific information from the text.

Scripts

1. `read_paper.py`

This Python script performs agglomerative information extraction and summarization on a directory of text files. It reads the files, splits them into paragraphs or sentences, classifies the sentences into predefined topics using a large language model (LLM), and generates a summary for each topic and for the overall document.
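
The split, classify, and group steps described above can be sketched as follows. This is a minimal illustration, not the script's actual code: `split_sentences`, `classify_sentence`, and `group_by_topic` are hypothetical names, the three topics are made up for the example, and the classifier is a keyword stand-in for the LLM call so the sketch runs offline. The per-topic and overall summarization steps are omitted.

```python
import re
from collections import defaultdict


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def classify_sentence(sentence):
    """Stand-in for the LLM classifier (assumption): simple keyword matching.

    The topics 'methods', 'results' and 'other' are illustrative only.
    """
    lowered = sentence.lower()
    if "method" in lowered or "we used" in lowered:
        return "methods"
    if "result" in lowered or "found" in lowered:
        return "results"
    return "other"


def group_by_topic(text, classify=classify_sentence):
    """Classify every sentence and collect the sentences under their topic."""
    grouped = defaultdict(list)
    for sentence in split_sentences(text):
        grouped[classify(sentence)].append(sentence)
    return dict(grouped)


if __name__ == "__main__":
    sample = "We used a transformer model. We found improved accuracy. Thanks to all."
    for topic, sentences in group_by_topic(sample).items():
        print(topic, sentences)
```

Passing the classifier as a parameter keeps the grouping logic testable without network access; a real run would swap in a function that queries the model.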

2. `embed_dist.py`

This Python script calculates the Euclidean distance and cosine similarity between the embeddings of two strings. It calls an API to generate the embedding for each string.
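
For reference, the two metrics can be computed as in the sketch below. It is an illustration under assumptions, not the script's code: the function names are hypothetical, the vectors are toy values standing in for API embeddings, and only the standard library is used so the example runs without numpy or scikit-learn.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy vectors standing in for embeddings returned by the API.
    emb_a = [1.0, 2.0, 3.0]
    emb_b = [2.0, 4.0, 6.0]
    print("distance:", euclidean_distance(emb_a, emb_b))
    print("cosine:", cosine_similarity(emb_a, emb_b))
```

Euclidean distance is sensitive to vector magnitude, while cosine similarity compares direction only, which is why the two can disagree for embeddings of different norms.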

3. `summarize_paper.py`

This Python script summarizes the content of a file or a directory of files using an API. It breaks down the content into chunks, sends each chunk to the API for summarization, and then concatenates the summaries to generate a final summary.
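
The chunk, summarize, and concatenate flow can be sketched as below. The names `chunk_text`, `summarize_document`, and `first_sentence` are hypothetical, the paragraph-based chunking strategy and the 2000-character limit are assumptions, and `first_sentence` is an offline stand-in for the API call.

```python
def chunk_text(text, max_chars=2000):
    """Group paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept whole as its own chunk
    rather than being cut mid-sentence.
    """
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)  # current chunk is full; start a new one
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


def summarize_document(text, summarize_chunk, max_chars=2000):
    """Summarize each chunk independently and join the partial summaries."""
    return "\n".join(summarize_chunk(chunk) for chunk in chunk_text(text, max_chars))


def first_sentence(chunk):
    """Offline stand-in for the summarization API: keep only the first sentence."""
    return chunk.split(". ")[0].rstrip(".") + "."


if __name__ == "__main__":
    doc = "Alpha one. Alpha two.\n\nBeta one. Beta two."
    print(summarize_document(doc, first_sentence, max_chars=25))
```

Because each chunk is summarized in isolation, the joined result can repeat points that span chunk boundaries; a second summarization pass over the joined text is one way to reduce that.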

4. `rewrite.py`

This Python script processes a file or a directory of files, applying a specific instruction to the content using an API. It breaks down the content into chunks, sends each chunk to the API for processing, and concatenates the processed chunks to generate the final output.

5. `go_linux.sh`

This Bash script takes a natural language prompt as input and sends it to an API endpoint using curl. The API response, which is expected to be a precise Linux command based on the input prompt, is then displayed in the terminal.

6. `rewrite-line.py`

This Python script processes a file line by line, applying a specific instruction to each line using an API. It sends each line to the API and stores the original line together with the corresponding output in JSON format.
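
A minimal sketch of the line-by-line flow follows. The record keys `input` and `output`, the function names, and the decision to skip blank lines are assumptions made for illustration; `uppercase_stub` replaces the API call so the example runs offline.

```python
import json


def rewrite_lines(lines, instruction, process):
    """Apply process(instruction, line) to each non-blank line.

    Returns a list of records pairing the original line with its output.
    """
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines rather than sending them to the API
        records.append({"input": line, "output": process(instruction, line)})
    return records


def uppercase_stub(instruction, line):
    """Offline stand-in for the API call; it ignores the instruction."""
    return line.upper()


records = rewrite_lines(
    ["first line\n", "\n", "second line\n"],
    "Rewrite in capitals",
    uppercase_stub,
)
print(json.dumps(records, indent=2))
```

Keeping the original line next to its output makes it easy to review or diff the results afterwards.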

Usage

Detailed usage instructions for each script can be found in their respective sections in this README.

Dependencies

Python 3.x
argparse, os, json (part of the Python standard library)
requests
numpy
scikit-learn
curl
jq

Example Workflow

Convert PDF files to text format using pdf_to_text.
Summarize the extracted text files using summarize_paper.py with specific instructions.
Rewrite the generated summaries using rewrite.py to group relevant applications by type and cancer type.
Extract verbatim prompt examples from the summaries using rewrite.py and create a table.
Extract high-level results worth highlighting using rewrite.py.

Notes

These scripts assume the existence of an API endpoint that accepts JSON payloads and returns the processed data.
Make sure to have the necessary dependencies installed before running the scripts.
Adjust the API endpoint URLs and request payloads based on your specific API requirements.
Feel free to customize the scripts to suit your specific needs and use cases.
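
As an illustration of the kind of JSON request the notes above refer to, the sketch below builds a POST request with the standard library. Everything specific in it is an assumption to be adapted: the placeholder URL, the `model` and `prompt` field names, and the helper names `build_request` and `send`. The scripts themselves list `requests` as a dependency; `urllib` is used here only so the sketch runs without extra installs.

```python
import json
import urllib.request

# Placeholder endpoint (assumption); replace with your own API URL.
API_URL = "https://example.invalid/api/v1/process"


def build_request(prompt, url=API_URL, model="gpt-4"):
    """Build, but do not send, a JSON POST request.

    The payload field names are assumptions; match them to your API.
    """
    payload = {"model": model, "prompt": prompt}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_request("Summarize this paragraph.")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Separating request construction from sending lets the payload be inspected or logged before any network call is made.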

For more detailed information on each script, please refer to their respective sections in this README.
