Commit

describegpt: revise documentation for clarity
[skip ci]
rzmk committed Apr 20, 2024
1 parent f3b217b commit 182156f
Showing 3 changed files with 4 additions and 4 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -34,7 +34,7 @@
| [count](/src/cmd/count.rs#L2)<br>📇🏎️🐻‍❄️ | Count the rows in a CSV file. (11.87 seconds for a 15gb, 27m row NYC 311 dataset without an index. Instantaneous with an index.) If the `polars` feature is enabled, uses Polars' multithreaded, mem-mapped CSV reader for fast counts even without an index |
| [datefmt](/src/cmd/datefmt.rs#L2)<br>🚀 | Formats recognized date fields ([19 formats recognized](https://docs.rs/qsv-dateparser/latest/qsv_dateparser/#accepted-date-formats)) to a specified date format using [strftime date format specifiers](https://docs.rs/chrono/latest/chrono/format/strftime/). |
| [dedup](/src/cmd/dedup.rs#L2)<br>🤯🚀 | Remove duplicate rows (See also `extdedup`, `extsort`, `sort` & `sortcheck` commands). |
-| [describegpt](/src/cmd/describegpt.rs#L2)<br>🌐🤖 | Infer extended metadata about a CSV using a GPT model from [OpenAI's API](https://platform.openai.com/docs/introduction), [Ollama](https://ollama.com), or another OpenAI API compatible server such as [Jan](https://jan.ai). |
+| [describegpt](/src/cmd/describegpt.rs#L2)<br>🌐🤖 | Infer extended metadata about a CSV using a GPT model from [OpenAI's API](https://platform.openai.com/docs/introduction), [Ollama](https://ollama.com), or another API compatible with the OpenAI API specification such as [Jan](https://jan.ai). |
| [diff](/src/cmd/diff.rs#L2)<br>🚀 | Find the difference between two CSVs with ludicrous speed!<br/>e.g. _compare two CSVs with 1M rows x 9 columns in under 600ms!_ |
| [enum](/src/cmd/enumerate.rs#L2) | Add a new column enumerating rows by adding a column of incremental or uuid identifiers. Can also be used to copy a column or fill a new column with a constant value. |
| [excel](/src/cmd/excel.rs#L2)<br>🚀 | Exports a specified Excel/ODS sheet to a CSV file. |
2 changes: 1 addition & 1 deletion docs/Describegpt.md
@@ -1,6 +1,6 @@
# `describegpt` command

-`describegpt` allows users to infer extended metadata about a CSV dataset using large language models, in particular GPT chat completion models from OpenAI's API, Ollama, or an API compatible with OpenAI's API such as Jan. `describegpt` uses `qsv stats` and `qsv frequency` in the background to provide context to the model.
+`describegpt` allows users to infer extended metadata about a CSV dataset using large language models, in particular GPT chat completion models from OpenAI's API, Ollama, or an API compatible with the OpenAI API specification such as Jan. `describegpt` uses `qsv stats` and `qsv frequency` in the background to provide context to the model.

Note that this command uses LLMs for inferencing and is therefore prone to inaccurate information being produced. Verify output results before using them.

4 changes: 2 additions & 2 deletions src/cmd/describegpt.rs
@@ -28,8 +28,8 @@ describegpt options:
--jsonl Return results in JSON Lines format.
--prompt-file <file> The JSON file containing the prompts to use for inferencing.
If not specified, default prompts will be used.
---base-url <url> The endpoint for interacting with LLMs. Can be used to run local
-local LLMs with Ollama, Jan, etc.
+--base-url <url> The URL of the API for interacting with LLMs. Supports APIs
+compatible with the OpenAI API specification (Ollama, Jan, etc.).
[default: https://api.openai.com/v1]
--ollama Required flag when using Ollama.
--model <model> The model to use for inferencing.
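
Reviewer note: the revised `--base-url` wording implies that the same request paths are resolved against whichever compatible server the URL points to. The sketch below illustrates that idea only; it is not taken from `describegpt.rs`. The helper `endpoint_url` is hypothetical, the `chat/completions` path follows the OpenAI API specification, and the Ollama address is an assumed local default that may differ on a given machine.

```rust
/// Hypothetical helper (not part of qsv): join a base URL and an
/// OpenAI-style endpoint path without doubling or dropping slashes.
fn endpoint_url(base_url: &str, path: &str) -> String {
    format!(
        "{}/{}",
        base_url.trim_end_matches('/'),
        path.trim_start_matches('/')
    )
}

fn main() {
    // Default value shown in the usage text.
    let openai = "https://api.openai.com/v1";
    // Assumption: a local Ollama server exposing its OpenAI-compatible API.
    let ollama = "http://localhost:11434/v1/";

    // The same relative path is appended regardless of the provider.
    for base in [openai, ollama] {
        println!("{}", endpoint_url(base, "chat/completions"));
    }
}
```

Trimming slashes on both sides keeps the result stable whether or not the user supplies a trailing slash in `--base-url`.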