can we make the gptscript prompt type configurable #913

Open
milosgajdos opened this issue Dec 2, 2024 · 0 comments

milosgajdos commented Dec 2, 2024

We are building a GPTScript integration in one of our products that uses OSS LLMs.

We access these models via an OpenAI-compatible API; we primarily use Ollama for inference.

We've noticed that GPTScript sets the (first) chat completion message role to system instead of user, thus making it a system prompt.

This appears to have a negative effect on some of the OSS models that are fine-tuned for tool calling; specifically, we noticed it with the llama3.1:8b-instruct-q8_0 model.

It's worth mentioning that GPTScript works as expected, without any modifications, against OpenAI; we've only noticed the problems described in this issue when interacting with OSS LLMs.

For some reason, setting the completion message role to system instead of user prevents the following gptscript from running correctly (i.e. from returning the expected results).

Here's the script we noticed the strange behaviour on:

model: llama3.1:8b-instruct-q8_0 from http://localhost:8080/v1
tools: sys.exec, sys.read, sys.write, sys.ls
description: Find out the current time

Find out the current time and return it. You must only use the defined tools. Do not guess answers.

When we run the above script with gptscript v0.9.5 like so (note: we disable streaming and the cache):

GPTSCRIPT_PROVIDER_LOCALHOST_API_KEY=XXXX GPTSCRIPT_INTERNAL_OPENAI_STREAMING="false" gptscript --debug-messages --disable-cache hlx.gpt

We get the following output:

14:32:26 started  [main]
14:32:27 sent     [main]
         content  [1] content | Waiting for model response...
14:32:27 ended    [main]

OUTPUT:

Here is the payload we observed GPTScript sending to Ollama:

{
  "messages": [
    {
      "content": "Find out current time and return it. Use only available tools, Do not make up answers.",
      "role": "system"
    }
  ],
  "model": "llama3.1:8b-instruct-q8_0",
  "stream": true,
  "tools": [
    {
      "function": {
        "description": "Execute a command and get the output of the command",
        "name": "exec",
        "parameters": {
          "properties": {
            "command": {
              "description": "The command to run including all applicable arguments",
              "type": "string"
            },
            "directory": {
              "description": "The directory to use as the current working directory of the command. The current directory \".\" will be used if no argument is passed",
              "type": "string"
            }
          },
          "type": "object"
        }
      },
      "type": "function"
    },
    {
      "function": {
        "description": "Reads the contents of a file. Can only read plain text files, not binary files",
        "name": "read",
        "parameters": {
          "properties": {
            "filename": {
              "description": "The name of the file to read",
              "type": "string"
            }
          },
          "type": "object"
        }
      },
      "type": "function"
    },
    {
      "function": {
        "description": "Write the contents to a file",
        "name": "write",
        "parameters": {
          "properties": {
            "content": {
              "description": "The content to write",
              "type": "string"
            },
            "filename": {
              "description": "The name of the file to write to",
              "type": "string"
            }
          },
          "type": "object"
        }
      },
      "type": "function"
    },
    {
      "function": {
        "description": "Lists the contents of a directory",
        "name": "ls",
        "parameters": {
          "properties": {
            "dir": {
              "description": "The directory to list",
              "type": "string"
            }
          },
          "type": "object"
        }
      },
      "type": "function"
    }
  ]
}

And here's the response we get back:

{
  "id": "chatcmpl-156",
  "object": "chat.completion",
  "created": 1733157640,
  "model": "llama3.1:8b-instruct-q8_0",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": ""
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 69,
    "completion_tokens": 1,
    "total_tokens": 70,
    "completion_tokens_details": null
  },
  "system_fingerprint": "fp_ollama"
}

You can see we get back an empty response, i.e. no tool is called. This happens regardless of how many times gptscript is run.

After a long debugging session, we noticed that if we change the role from system to user (thus no longer sending a system prompt), llama suddenly starts responding in a way that leads to GPTScript working as expected. Here's the response we get back; the only change in the request payload is the prompt type, i.e. the message role.

{
  "id": "chatcmpl-168",
  "object": "chat.completion",
  "created": 1733157657,
  "model": "llama3.1:8b-instruct-q8_0",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
          {
            "index": 0,
            "id": "call_6zerrtfx",
            "type": "function",
            "function": {
              "name": "exec",
              "arguments": "{\"command\":\"date\",\"directory\":\".\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 395,
    "completion_tokens": 21,
    "total_tokens": 416,
    "completion_tokens_details": null
  },
  "system_fingerprint": "fp_ollama"
}
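
For completeness, here is a minimal standalone sketch (Go, not part of GPTScript) that reproduces the comparison against the same local endpoint: it sends the captured request twice, once with the prompt as a system message and once as a user message. The endpoint URL, model, prompt text and exec tool definition are taken from the payload above; everything else is illustrative.

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// exec tool definition, taken from the captured payload above, so the model
// has at least one tool it can call.
var execTool = map[string]any{
	"type": "function",
	"function": map[string]any{
		"name":        "exec",
		"description": "Execute a command and get the output of the command",
		"parameters": map[string]any{
			"type": "object",
			"properties": map[string]any{
				"command": map[string]any{
					"type":        "string",
					"description": "The command to run including all applicable arguments",
				},
				"directory": map[string]any{
					"type":        "string",
					"description": "The directory to use as the current working directory of the command. The current directory \".\" will be used if no argument is passed",
				},
			},
		},
	},
}

// send posts one chat completion request with the given message role and
// prints the raw response so the two runs can be compared.
func send(role string) {
	payload := map[string]any{
		"model":  "llama3.1:8b-instruct-q8_0",
		"stream": false,
		"messages": []map[string]string{
			{"role": role, "content": "Find out current time and return it. Use only available tools, Do not make up answers."},
		},
		"tools": []any{execTool},
	}
	body, err := json.Marshal(payload)
	if err != nil {
		panic(err)
	}
	resp, err := http.Post("http://localhost:8080/v1/chat/completions", "application/json", bytes.NewReader(body))
	if err != nil {
		fmt.Println(role, "request failed:", err)
		return
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Printf("role=%q response: %s\n\n", role, out)
}

func main() {
	send("system") // empty completion in our tests
	send("user")   // tool_calls response in our tests
}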

This brings us to the question asked in this issue: is there any particular reason gptscript sets the prompt to a system prompt? Changing it to a user prompt seems to make things work, at least with the llama family of models.

Would you be open to making the prompt type configurable/overridable, say, via an environment variable?
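
To make the ask concrete, here is a rough illustration of what we mean. Both the environment variable name (GPTSCRIPT_INTERNAL_PROMPT_ROLE) and the helper function are hypothetical; they are not part of GPTScript today.

package main

import (
	"fmt"
	"os"
)

// promptRole returns the role to use for the first chat completion message.
// It keeps the current default ("system") but lets deployments that target
// OSS models override it. GPTSCRIPT_INTERNAL_PROMPT_ROLE is a made-up name,
// used purely for illustration.
func promptRole() string {
	if r := os.Getenv("GPTSCRIPT_INTERNAL_PROMPT_ROLE"); r == "user" || r == "system" {
		return r
	}
	return "system"
}

func main() {
	// e.g. GPTSCRIPT_INTERNAL_PROMPT_ROLE=user gptscript hlx.gpt
	fmt.Println("first message role:", promptRole())
}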

milosgajdos added a commit to helixml/gptscript that referenced this issue Dec 4, 2024
milosgajdos added a commit to milosgajdos/helix that referenced this issue Dec 4, 2024