
Unable to use any Ollama models, although Ollama itself is working fine. #259

Closed
marhensa opened this issue Nov 12, 2024 · 47 comments · May be fixed by #344
Assignees: chrismahoney
Labels: bug (Something isn't working), stale (The pull / issue is stale and will be closed soon)

Comments

@marhensa

Describe the bug

I have Ollama installed on Windows 11 24H2, default port 11434.

 ollama list
NAME                        ID              SIZE      MODIFIED
opencoder-extra:8b          a8a4a23defc6    4.7 GB    5 seconds ago
opencoder:8b                c320df6c224d    4.7 GB    10 minutes ago
granite3-dense-extra:8b     2763b99345c8    4.9 GB    8 hours ago
granite3-dense:8b           b5e91128f3ef    4.9 GB    21 hours ago
llama3.2:latest             a80c4f17acd5    2.0 GB    2 weeks ago
llava-phi3:latest           c7edd7b87593    2.9 GB    2 weeks ago
mxbai-embed-large:latest    468836162de7    669 MB    2 weeks ago
mistral-nemo:latest         994f3b8b7801    7.1 GB    2 weeks ago
qwen2.5:7b                  845dbda0ea48    4.7 GB    2 weeks ago
 ollama serve
2024/11/12 18:30:19 routes.go:1189: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:C:\\Users\\Marhensa\\.ollama\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://*] OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-11-12T18:30:19.019+07:00 level=INFO source=images.go:755 msg="total blobs: 38"
time=2024-11-12T18:30:19.025+07:00 level=INFO source=images.go:762 msg="total unused blobs removed: 0"
time=2024-11-12T18:30:19.031+07:00 level=INFO source=routes.go:1240 msg="Listening on 127.0.0.1:11434 (version 0.4.1)"
time=2024-11-12T18:30:19.033+07:00 level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[cpu_avx cpu_avx2 cuda_v11 cuda_v12 rocm cpu]"
time=2024-11-12T18:30:19.033+07:00 level=INFO source=gpu.go:221 msg="looking for compatible GPUs"
time=2024-11-12T18:30:19.033+07:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2024-11-12T18:30:19.033+07:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=6 efficiency=0 threads=12
time=2024-11-12T18:30:19.301+07:00 level=INFO source=types.go:123 msg="inference compute" id=GPU-025d3e54-b313-bf37-470b-5f87afd418ee library=cuda variant=v12 compute=8.6 driver=12.7 name="NVIDIA GeForce RTX 3060"

I installed bolt.new-any-llm on WSL2 Debian 12 (mirrored network mode).

I can access and use Ollama just fine from the WSL2 Debian bash terminal:

wsluser@DESKTOP:~/labs/bolt.new-any-llm$ curl http://127.0.0.1:11434
Ollama is running

I can even ask it "why is the sky blue?" from the WSL2 Debian bash terminal using curl, so the network/connection is not the problem.

wsluser@DESKTOP:~/labs/bolt.new-any-llm$ curl -D - http://127.0.0.1:11434/api/chat -d '{"model":"opencoder-extra:8b","messages":[{"role":"user","content":"why is the sky blue?"}],"stream":false}'
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Date: Tue, 12 Nov 2024 11:47:08 GMT
Content-Length: 740

{"model":"opencoder-extra:8b","created_at":"2024-11-12T11:47:08.0489403Z","message":{"role":"assistant","content":"This isn't actually a problem to solve through programming per se. The color of the sky can vary depending on the time of year and location. In general terms, it's due to scattering from millions of tiny particles in the atmosphere, referred to as Rayleigh scattering. This effect makes the sky appear blue when viewed from above. But this could also be influenced by factors like atmospheric conditions or even human eye adaptation.\n"},"done_reason":"stop","done":true,"total_duration":7165754700,"load_duration":2206961600,"prompt_eval_count":31,"prompt_eval_duration":570000000,"eval_count":86,"eval_duration":4295000000}

I set the Ollama base URL in .env.local to http://127.0.0.1:11434

I run the container with that .env.local file.
The command is docker compose --profile development --env-file .env.local up

wsluser@DESKTOP:~/labs/bolt.new-any-llm$ docker compose --profile development --env-file .env.local up
[+] Running 1/0
 ✔ Container boltnew-any-llm-bolt-ai-dev-1  Created                                                                0.0s
Attaching to bolt-ai-dev-1
bolt-ai-dev-1  |
bolt-ai-dev-1  | > bolt@ dev /app
bolt-ai-dev-1  | > remix vite:dev "--host" "0.0.0.0"
bolt-ai-dev-1  |
bolt-ai-dev-1  |   ➜  Local:   http://localhost:5173/
bolt-ai-dev-1  |   ➜  Network: http://172.18.0.2:5173/

The container runs fine. When I open the browser, Bolt can see the Ollama model list:
[screenshot]

But when the conversation starts, it gives an error RIGHT AWAY.
[screenshot]

In the bash console there is an error accessing Ollama:

bolt-ai-dev-1  | > bolt@ dev /app
bolt-ai-dev-1  | > remix vite:dev "--host" "0.0.0.0"
bolt-ai-dev-1  |
bolt-ai-dev-1  |   ➜  Local:   http://localhost:5173/
bolt-ai-dev-1  |   ➜  Network: http://172.18.0.2:5173/
bolt-ai-dev-1  | RetryError [AI_RetryError]: Failed after 3 attempts. Last error: Cannot connect to API: connect ECONNREFUSED 127.0.0.1:11434
bolt-ai-dev-1  |     at _retryWithExponentialBackoff (file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/dist/index.mjs:98:13)
bolt-ai-dev-1  |     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
bolt-ai-dev-1  |     at async startStep (file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/dist/index.mjs:3903:13)
bolt-ai-dev-1  |     at async fn (file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/dist/index.mjs:3977:11)
bolt-ai-dev-1  |     at async file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/dist/index.mjs:256:22
bolt-ai-dev-1  |     at async chatAction (/app/app/routes/api.chat.ts:49:20)
bolt-ai-dev-1  |     at async Object.callRouteAction (/app/node_modules/.pnpm/@[email protected][email protected]/node_modules/@remix-run/server-runtime/dist/data.js:37:16)
bolt-ai-dev-1  |     at async /app/node_modules/.pnpm/@[email protected]/node_modules/@remix-run/router/dist/router.cjs.js:4612:21
bolt-ai-dev-1  |     at async callLoaderOrAction (/app/node_modules/.pnpm/@[email protected]/node_modules/@remix-run/router/dist/router.cjs.js:4677:16)
bolt-ai-dev-1  |     at async Promise.all (index 1)
bolt-ai-dev-1  |     at async callDataStrategyImpl (/app/node_modules/.pnpm/@[email protected]/node_modules/@remix-run/router/dist/router.cjs.js:4552:17)
bolt-ai-dev-1  |     at async callDataStrategy (/app/node_modules/.pnpm/@[email protected]/node_modules/@remix-run/router/dist/router.cjs.js:4041:19)
bolt-ai-dev-1  |     at async submit (/app/node_modules/.pnpm/@[email protected]/node_modules/@remix-run/router/dist/router.cjs.js:3900:21)
bolt-ai-dev-1  |     at async queryImpl (/app/node_modules/.pnpm/@[email protected]/node_modules/@remix-run/router/dist/router.cjs.js:3858:22)
bolt-ai-dev-1  |     at async Object.queryRoute (/app/node_modules/.pnpm/@[email protected]/node_modules/@remix-run/router/dist/router.cjs.js:3827:18)
bolt-ai-dev-1  |     at async handleResourceRequest (/app/node_modules/.pnpm/@[email protected][email protected]/node_modules/@remix-run/server-runtime/dist/server.js:413:20)
bolt-ai-dev-1  |     at async requestHandler (/app/node_modules/.pnpm/@[email protected][email protected]/node_modules/@remix-run/server-runtime/dist/server.js:156:18)
bolt-ai-dev-1  |     at async /app/node_modules/.pnpm/@[email protected]_@[email protected][email protected][email protected][email protected]_typ_qwyxqdhnwp3srgtibfrlais3ge/node_modules/@remix-run/dev/dist/vite/cloudflare-proxy-plugin.js:70:25 {
bolt-ai-dev-1  |   cause: undefined,
bolt-ai-dev-1  |   reason: 'maxRetriesExceeded',
bolt-ai-dev-1  |   errors: [
bolt-ai-dev-1  |     APICallError [AI_APICallError]: Cannot connect to API: connect ECONNREFUSED 127.0.0.1:11434
bolt-ai-dev-1  |         at postToApi (/app/node_modules/.pnpm/@[email protected][email protected]/node_modules/@ai-sdk/provider-utils/dist/index.js:446:15)
bolt-ai-dev-1  |         at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
bolt-ai-dev-1  |         at async OllamaChatLanguageModel.doStream (/app/node_modules/.pnpm/[email protected][email protected]/node_modules/ollama-ai-provider/dist/index.js:485:50)
bolt-ai-dev-1  |         at async fn (file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/dist/index.mjs:3938:23)
bolt-ai-dev-1  |         at async file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/dist/index.mjs:256:22
bolt-ai-dev-1  |         at async _retryWithExponentialBackoff (file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/dist/index.mjs:86:12)
bolt-ai-dev-1  |         at async startStep (file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/dist/index.mjs:3903:13)
bolt-ai-dev-1  |         at async fn (file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/dist/index.mjs:3977:11)
bolt-ai-dev-1  |         at async file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/dist/index.mjs:256:22
bolt-ai-dev-1  |         at async chatAction (/app/app/routes/api.chat.ts:49:20)
bolt-ai-dev-1  |         at async Object.callRouteAction (/app/node_modules/.pnpm/@[email protected][email protected]/node_modules/@remix-run/server-runtime/dist/data.js:37:16)
bolt-ai-dev-1  |         at async /app/node_modules/.pnpm/@[email protected]/node_modules/@remix-run/router/dist/router.cjs.js:4612:21
bolt-ai-dev-1  |         at async callLoaderOrAction (/app/node_modules/.pnpm/@[email protected]/node_modules/@remix-run/router/dist/router.cjs.js:4677:16)
bolt-ai-dev-1  |         at async Promise.all (index 1)
bolt-ai-dev-1  |         at async callDataStrategyImpl (/app/node_modules/.pnpm/@[email protected]/node_modules/@remix-run/router/dist/router.cjs.js:4552:17)
bolt-ai-dev-1  |         at async callDataStrategy (/app/node_modules/.pnpm/@[email protected]/node_modules/@remix-run/router/dist/router.cjs.js:4041:19)
bolt-ai-dev-1  |         at async submit (/app/node_modules/.pnpm/@[email protected]/node_modules/@remix-run/router/dist/router.cjs.js:3900:21)
bolt-ai-dev-1  |         at async queryImpl (/app/node_modules/.pnpm/@[email protected]/node_modules/@remix-run/router/dist/router.cjs.js:3858:22)
bolt-ai-dev-1  |         at async Object.queryRoute (/app/node_modules/.pnpm/@[email protected]/node_modules/@remix-run/router/dist/router.cjs.js:3827:18)
bolt-ai-dev-1  |         at async handleResourceRequest (/app/node_modules/.pnpm/@[email protected][email protected]/node_modules/@remix-run/server-runtime/dist/server.js:413:20)
bolt-ai-dev-1  |         at async requestHandler (/app/node_modules/.pnpm/@[email protected][email protected]/node_modules/@remix-run/server-runtime/dist/server.js:156:18)
bolt-ai-dev-1  |         at async /app/node_modules/.pnpm/@[email protected]_@[email protected][email protected][email protected][email protected]_typ_qwyxqdhnwp3srgtibfrlais3ge/node_modules/@remix-run/dev/dist/vite/cloudflare-proxy-plugin.js:70:25 {
bolt-ai-dev-1  |       cause: [Error],
bolt-ai-dev-1  |       url: 'http://127.0.0.1:11434/api/chat',
bolt-ai-dev-1  |       requestBodyValues: [Object],
bolt-ai-dev-1  |       statusCode: undefined,
bolt-ai-dev-1  |       responseHeaders: undefined,
bolt-ai-dev-1  |       responseBody: undefined,
bolt-ai-dev-1  |       isRetryable: true,
bolt-ai-dev-1  |       data: undefined,
bolt-ai-dev-1  |       [Symbol(vercel.ai.error)]: true,
bolt-ai-dev-1  |       [Symbol(vercel.ai.error.AI_APICallError)]: true
bolt-ai-dev-1  |     },

So it cannot access http://127.0.0.1:11434/api/chat? Or what?

Is it a POST vs GET thing that Ollama has a problem with?

Link to the Bolt URL that caused the error

http://localhost:5173

Steps to reproduce

  1. Install Ollama on Windows, pull some models, and serve it on the default port.
  2. Set WSL2 Debian 12 to mirrored networking mode, so Debian can access Windows resources (like 127.0.0.1:11434) from inside WSL2.
  3. Set the Ollama base URL in .env.local.
  4. Start the Docker container using the development profile: docker compose --profile development --env-file .env.local up
  5. Open the browser, choose Ollama, choose a model.
  6. Ask something.
  7. ERROR.

Expected behavior

Error.

[screenshot]

Screen Recording / Screenshot

Rekaman.2024.11.12.192136.mp4

Platform

  • OS: Windows (for Ollama), WSL2 Debian 12 Mirrored Network Mode (for this to run from Docker)
  • Browser: Edge
  • Version:

Additional context

No response

@chrismahoney
Collaborator

In your .env.local file, are you adding http://localhost:11434 as your Ollama base URL? If not, please duplicate .env.example to .env.local, add that entry, and try again.

@chrismahoney
Collaborator

I'm not sure at the moment (and it may depend on your execution context) if you need to use 127.0.0.1 instead of localhost. Localhost has been working for me more or less from the start.

@marhensa
Author

I'm not sure at the moment (and it may depend on your execution context) if you need to use 127.0.0.1 instead of localhost. Localhost has been working for me more or less from the start.

The problem is not localhost vs 127.0.0.1; I tried both and still got the error.

Even with 127.0.0.1, the Ollama models are listed in the browser of this forked Bolt.new, as you can see in my screenshot. The models installed in Ollama are recognized by Bolt.

The problem is that when the conversation starts, it immediately gives an error.

@chrismahoney
Collaborator

Oh, you're running from docker. You should have an .env.local file with the following specified for the Ollama base URL:

# You only need this environment variable set if you want to use oLLAMA models
# EXAMPLE http://localhost:11434
OLLAMA_API_BASE_URL=http://host.docker.internal:11434

Your oTToDev instance is trying to locate Ollama at http://127.0.0.1:11434, which is not valid while you're running within a Docker container, based on the output in the video you posted (thanks for the video!).

[screenshot]

Please update .env.local, rebuild based on docker instructions and then re-run the container. Hopefully that takes care of it.

@customize9292

I had the same problem. I just reinstalled it and everything worked.

@chrismahoney
Collaborator

If the above host.docker.internal change fixes this for you, please respond so we can close this issue.

@mmundy3832

I am also using Docker, and I have the same issue as OP. Unfortunately, I was not able to get it running using the change to OLLAMA_API_BASE_URL. In fact, that doesn't make sense to me because, as OP mentioned, this command works:
curl -D - http://127.0.0.1:11434/api/chat -d '{"model":"qwen2.5-coder-extra-ctx:7b","messages":[{"role":"user","content":"why is the sky blue?"}],"stream":false}'

@marhensa
Author

marhensa commented Nov 12, 2024

Oh, you're running from docker. You should have an .env.local file with the following specified for the Ollama base URL:

Your oTToDev instance is trying to locate Ollama at http://127.0.0.1:11434, which is not valid while you're running within a Docker container based on this output in the video you posted (thanks for the video!)

Hi, thank you for helping me resolve this, but sadly it's still not working.

[screenshot]

As I said in the first post, oTToDev already sees the models (no matter whether it's 127.0.0.1, localhost, or host.docker.internal).

Here is the Ollama console; it succeeds in serving /api/tags, which basically lists the models (it's called from oTToDev when I open the browser).

It shows 200, which is successful, but still, whenever the conversation starts, it gives an error.

So /api/tags is reachable but /api/chat somehow fails.
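
As a quick way to narrow this down (a hypothetical check, not part of the repo), something like the following could be run inside the bolt container with Node 18+ (e.g. via npx tsx); the base URL and model name are assumptions, so substitute whatever is in your .env.local. If both calls succeed from inside the container, the problem is in how the backend resolves its base URL rather than in Ollama itself.

// check-ollama.ts -- hypothetical diagnostic, not part of the project
const baseUrl = process.env.OLLAMA_API_BASE_URL ?? 'http://host.docker.internal:11434';

async function main() {
  // Listing models: the call that already works for the UI dropdown.
  const tags = await fetch(`${baseUrl}/api/tags`);
  console.log('GET  /api/tags ->', tags.status);

  // A minimal chat request: the call that fails with ECONNREFUSED in the logs above.
  const chat = await fetch(`${baseUrl}/api/chat`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'llama3.2:latest', // any model shown by `ollama list`
      messages: [{ role: 'user', content: 'ping' }],
      stream: false,
    }),
  });
  console.log('POST /api/chat ->', chat.status);
}

main().catch((err) => console.error('request failed:', err));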

Rekaman.2024.11.13.054919.mp4

@schilling3003

I am having a similar issue where the app sees the Ollama models, but the chat fails. What model is it trying to send the request to? For me, no matter which Ollama model I choose, it always tries to send the request to Claude 3.5 Sonnet, which makes it fail. Same problem when I try to use LMStudio. All local LLMs appear to be broken.

@marhensa
Author

I am having a similar issue where the app sees the Ollama models, but the chat fails. What model is it trying to send the request to? For me, no matter which Ollama model I choose, it always tries to send the request to Claude 3.5 Sonnet, which makes it fail. Same problem when I try to use LMStudio. All local LLMs appear to be broken.

All Ollama models can be listed in oTToDev, but they fail whenever a conversation starts, all of them.

@radumalica

radumalica commented Nov 13, 2024

Same here. I pulled the latest version and I have my OLLAMA_BASE_URL set properly. The problem is that when choosing Ollama, I can see my downloaded models, but for some reason when you hit enter to code something, the app is trying to use "claude-3-5-sonnet-latest" instead of the selected Ollama model:

    at null.<anonymous> (file:///app/node_modules/.pnpm/@[email protected][email protected]/node_modules/@ai-sdk/provider-utils/src/response-handler.ts:72:16)
    at async postToApi (file:///app/node_modules/.pnpm/@[email protected][email protected]/node_modules/@ai-sdk/provider-utils/src/post-to-api.ts:81:28)
    at async OllamaChatLanguageModel.doStream (file:///app/node_modules/.pnpm/[email protected][email protected]/node_modules/ollama-ai-provider/src/ollama-chat-language-model.ts:230:50)
    at async fn (file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/core/generate-text/stream-text.ts:345:23)
    at null.<anonymous> (async file:///app/.wrangler/tmp/dev-Ih0o3c/functionsWorker-0.48321999920050174.js:30634:22)
    at async _retryWithExponentialBackoff (file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/util/retry-with-exponential-backoff.ts:37:12)
    at async startStep (file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/core/generate-text/stream-text.ts:310:13)
    at async fn (file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/core/generate-text/stream-text.ts:387:11)
    at null.<anonymous> (async file:///app/.wrangler/tmp/dev-Ih0o3c/functionsWorker-0.48321999920050174.js:30634:22)
    at async chatAction (file:///app/build/server/index.js:1038:20) {
  cause: undefined,
  url: 'http://ollama_url:11434/api/chat',
  requestBodyValues: {
    format: undefined,
    model: 'claude-3-5-sonnet-latest',
    options: { num_ctx: 32768, num_predict: 8000, temperature: 0 },
    messages: [ [Object], [Object] ],
    tools: undefined
  },
  statusCode: 404,
  responseHeaders: {
    'content-length': '78',
    'content-type': 'application/json; charset=utf-8',
    date: 'Wed, 13 Nov 2024 08:32:12 GMT'
  },
  responseBody: '{"error":"model \\"claude-3-5-sonnet-latest\\" not found, try pulling it first"}',
  isRetryable: false,
  data: undefined,
  [Symbol(vercel.ai.error)]: true,
  [Symbol(vercel.ai.error.AI_APICallError)]: true
}
[wrangler:inf] POST /api/chat 500 Internal Server Error (39ms)

Later edit with additional information:

In the browser dev tools (latest Chrome), the payload sent to /api/chat is:

{
  "messages": [
    {
      "role": "user",
      "content": "[Model: qwen2.5-coder:7b]\n\n[Provider: Ollama]\n\nbuild a simple express API"
    }
  ],
  "apiKeys": {}
}

So the model and provider are properly sent to the app. This is picked up by Chat.client.tsx line 206, I think.

Also, from the error it seems that the provider is properly selected but not the model, and it defaults to DEFAULT_MODEL, which is claude-3-5-sonnet-latest.

LATER EDIT 2:

The problem is in PR #188, which keeps the user choice and introduces a regex match for provider and model. I added some console.log entries in the stream-text.ts file; here is the output while choosing Ollama and qwen2.5-coder in the frontend:

more debug:

Model match result: null
Provider match result: null
Extracted Model in function: claude-3-5-sonnet-latest
Extracted Provider in function: Anthropic
Message content: [Model: qwen2.5-coder:7b]

[Provider: Ollama]

test
Model Match: null
Provider Match: null
Setting currentModel to: claude-3-5-sonnet-latest
[
  'gpt-4o',
  'anthropic/claude-3.5-sonnet',
  'anthropic/claude-3-haiku',
  'deepseek/deepseek-coder',
  'google/gemini-flash-1.5',
  'google/gemini-pro-1.5',
  'x-ai/grok-beta',
  'mistralai/mistral-nemo',
  'qwen/qwen-110b-chat',
  'cohere/command',
  'gemini-1.5-flash-latest',
  'gemini-1.5-pro-latest',
  'llama-3.1-70b-versatile',
  'llama-3.1-8b-instant',
  'llama-3.2-11b-vision-preview',
  'llama-3.2-3b-preview',
  'llama-3.2-1b-preview',
  'claude-3-5-sonnet-latest',
  'claude-3-5-sonnet-20240620',
  'claude-3-5-haiku-latest',
  'claude-3-opus-latest',
  'claude-3-sonnet-20240229',
  'claude-3-haiku-20240307',
  'gpt-4o-mini',
  'gpt-4-turbo',
  'gpt-4',
  'gpt-3.5-turbo',
  'grok-beta',
  'deepseek-coder',
  'deepseek-chat',
  'open-mistral-7b',
  'open-mixtral-8x7b',
  'open-mixtral-8x22b',
  'open-codestral-mamba',
  'open-mistral-nemo',
  'ministral-8b-latest',
  'mistral-small-latest',
  'codestral-latest',
  'mistral-large-latest'
]

The regex is failing because MODEL_LIST doesn't contain the Ollama models (I have two of them: qwen2.5-coder:7b and llama-3.1:latest, neither of which is in the list), hence the regex match fails and the app defaults to DEFAULT_MODEL and DEFAULT_PROVIDER.
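
A simplified sketch of the behaviour described above, assuming the matching works roughly the way the debug output suggests (this is not the literal stream-text.ts source; the function and variable names are illustrative):

const DEFAULT_MODEL = 'claude-3-5-sonnet-latest';
const DEFAULT_PROVIDER = 'Anthropic';

// Stands in for the static list dumped above; dynamically pulled Ollama models
// (e.g. qwen2.5-coder:7b) never make it into it.
const MODEL_LIST = ['gpt-4o', 'claude-3-5-sonnet-latest', 'gpt-4o-mini'];

function extractModelAndProvider(content: string) {
  // The pattern is effectively restricted to names present in MODEL_LIST,
  // so an Ollama model that is missing from the list can never match.
  const modelPattern = new RegExp(`\\[Model: (${MODEL_LIST.join('|')})\\]`);
  const providerPattern = /\[Provider: (.*?)\]/;

  const modelMatch = content.match(modelPattern); // -> null for qwen2.5-coder:7b
  const providerMatch = content.match(providerPattern);

  return {
    model: modelMatch ? modelMatch[1] : DEFAULT_MODEL, // silently falls back to Claude
    provider: modelMatch ? (providerMatch?.[1] ?? DEFAULT_PROVIDER) : DEFAULT_PROVIDER,
  };
}

console.log(extractModelAndProvider('[Model: qwen2.5-coder:7b]\n\n[Provider: Ollama]\n\ntest'));
// -> { model: 'claude-3-5-sonnet-latest', provider: 'Anthropic' }, matching the debug output above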

@radumalica

radumalica commented Nov 13, 2024

Workaround: if you are running in Docker and you use an external URL for Ollama, just set RUNNING_IN_DOCKER=false in docker-compose.yaml; otherwise the app will silently use the host.docker.internal URL in this function:

const getOllamaBaseUrl = () => {
  const defaultBaseUrl = import.meta.env.OLLAMA_API_BASE_URL || 'http://default_ollama_url:11434';
  // Check if we're in the browser
  if (typeof window !== 'undefined') {
    // Frontend always uses localhost
    return defaultBaseUrl;
  }

  // Backend: Check if we're running in Docker
  const isDocker = process.env.RUNNING_IN_DOCKER === 'true';

  return isDocker
    ? defaultBaseUrl.replace("localhost", "host.docker.internal")
    : defaultBaseUrl;
};

So if RUNNING_IN_DOCKER is true, the base URL for Ollama in the BACKEND will be set to host.docker.internal and the model list pull will silently fail with no warning or errors.
In the FRONTEND, however, it will be set to whatever you added in the .env.local file as OLLAMA_API_BASE_URL.

I haven't found out where that is happening.

This is a major inconsistency in the app. The base URL should be retrieved from .env.local to keep the application consistent, with instructions added to the README for users who need host.docker.internal as an additional DNS entry in the docker-compose file.

@new4u

new4u commented Nov 13, 2024

RUNNING_IN_DOCKER=false

This is true! It solved my problem (running in Docker and using OpenAILike).

@mmundy3832

RUNNING_IN_DOCKER=false

This is true! It solved my problem (running in Docker and using OpenAILike).

Unfortunately, it did not solve my problem. I still have an exception, but mine appears just a little different:

Error: Network connection lost.
bolt-ai-1  |     at async postToApi (file:///app/node_modules/.pnpm/@[email protected][email protected]/node_modules/@ai-sdk/provider-utils/src/post-to-api.ts:65:22)
bolt-ai-1  |     at async OllamaChatLanguageModel.doStream (file:///app/node_modules/.pnpm/[email protected][email protected]/node_modules/ollama-ai-provider/src/ollama-chat-language-model.ts:230:50)
bolt-ai-1  |     at async fn (file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/core/generate-text/stream-text.ts:345:23)
bolt-ai-1  |     at null.<anonymous> (async file:///app/.wrangler/tmp/dev-bpgYTk/functionsWorker-0.16294152281104846.js:30634:22)                                                                                                  
bolt-ai-1  |     at async _retryWithExponentialBackoff (file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/util/retry-with-exponential-backoff.ts:37:12)                                                                                                                                                                                                            
bolt-ai-1  |     at async startStep (file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/core/generate-text/stream-text.ts:310:13)
bolt-ai-1  |     at async fn (file:///app/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected][email protected][email protected][email protected]/node_modules/ai/core/generate-text/stream-text.ts:387:11)       
bolt-ai-1  |     at null.<anonymous> (async file:///app/.wrangler/tmp/dev-bpgYTk/functionsWorker-0.16294152281104846.js:30634:22)                                                                                                  
bolt-ai-1  |     at async chatAction (file:///app/build/server/index.js:998:20)
bolt-ai-1  |     at async Object.callRouteAction (file:///app/node_modules/.pnpm/@[email protected][email protected]/node_modules/@remix-run/server-runtime/dist/data.js:37:16) {                                    
bolt-ai-1  |   retryable: true                                                                                                                                                                                                     
bolt-ai-1  | }                                                                                                                                                                                                                     
[wrangler:inf] POST /api/chat 500 Internal Server Error (6ms) 

This error occurs with RUNNING_IN_DOCKER set to either true or false, and with all combinations of OLLAMA_API_BASE_URL mentioned above.

@chrismahoney
Collaborator

LATER EDIT 2:

Problem is in PR #188 which keeps the user choice and introduces regex match for provider and model. I added some console.log entries in stream-text.ts file and here is the output while choosing Ollama and code-qwen2.5 in the frontend

Thanks for the context. This sounds like an edge case based on remembering the last selected provider/model, with that update of the fields registering as a model change. There were a couple of PRs in flight recently around this; making a note, as I'm currently looking at env vars in general.

chrismahoney added the bug label on Nov 13, 2024
@chrismahoney
Collaborator

Note: To review this behavior with updates to providers in #251.

@hardj300

hardj300 commented Nov 13, 2024

I am not sure if my issue is the same or slightly different. I have Ollama and OpenWebUI deployed in a single container and known to be working just fine. When I deploy your Bolt fork into the same Docker instance, I am unable to see any models in the second dropdown. I am able to curl the Ollama API URL from the Bolt container just fine, and I am pretty certain the ENV variable was passed successfully. Feel free to help me check.
Screenshot_20241113_150426_Chrome

Here is what it looks like when I run the compose up command.
SmartSelect_20241113_153011_Termux

@radumalica

I am not sure if my issue is the same or slightly different. I have Ollama and OpenWebUI deployed in a single container and known to be working just fine. When I deploy your Bolt fork into the same Docker instance, I am unable to see any models in the second dropdown. I am able to curl the Ollama API URL from the Bolt container just fine, and I am pretty certain the ENV variable was passed successfully.

As I see in your screenshot, the .env.local is not passed to the container, since it is included in the .dockerignore file by default. You need to remove it from there so the file will be copied into the actual Docker container; bindings.sh needs that file.

Also, make sure you are running this in the same Docker network as the Ollama and WebUI containers. If so, you need to get the container IP address of the Ollama container (docker inspect ollama_container_name); it will be something like 172.19.x.x, and you need to pass that URL as OLLAMA_API_BASE_URL in your Bolt setup in .env.local.

@hardj300

I have updated my .dockerignore and commented out the lines so the .env file is pulled in. Thank you. This clears the startup error about the missing .env file. However, it did not solve the issue for me. When I rebuild and compose, I am still unable to see any models in the dropdown for Ollama.

When I console into the running Bolt container, I can run "curl http://172.16.1.19:11434/api/tags" and I get a response with a list of the models running in the Ollama container. 172.16.1.19 is the host IP of my Docker instance running on an Ubuntu VM. Both the Ollama container and the Bolt container are running on the same host.

How can I continue to look into why the models are not populating in the dropdown?

@hardj300

I should add that this issue happens with the Docker container as well as when running directly with pnpm.

@marhensa
Author

marhensa commented Nov 14, 2024

As I see in your screenshot, the .env.local is not passed to the container, since it is included in the .dockerignore file by default. You need to remove it from there so the file will be copied into the actual Docker container; bindings.sh needs that file.

This issue is not about .env.local; it's about how oTToDev can't hold a conversation with Ollama even though it can list the Ollama models.

If .env.local were the problem, oTToDev could not list the models in the first place.

Check the first post: it's run with docker compose --profile development --env-file .env.local up,

meaning it's directly using the --env-file .env.local option, so there is no problem with the .env file used here.

The problem is entirely different.

@PrepperShepherd

Found a solution that worked in my case. The issue was two-fold:

  1. In model.ts, the default case in the getModel function was incorrectly passing baseURL as the model parameter:

// Before (incorrect)
default:
  return getOllamaModel(baseURL, model);

// After (fixed)
default:
  return getOllamaModel(model);

  2. In docker-compose.yaml, I needed to set up the correct networking and Ollama URL:

services:
  bolt-ai-dev:
    environment:
      - OLLAMA_API_BASE_URL=http://ollama:11434
    networks:
      - demo

networks:
  demo:
    name: self-hosted-ai-starter-kit_demo
    external: true

These changes allowed the application to:

  • Properly pass the model name to Ollama instead of trying to use the URL as the model name
  • Connect to Ollama through the Docker network using the container name

After making these changes and rebuilding with:

docker compose --profile development down
docker compose --profile development up --build

The Ollama integration started working correctly. This seems to be related to but separate from the model selection issue others are experiencing with Claude defaults.

The only remaining issue: after fixing the connection, Ollama now generates code responses, but the Preview window in the bolt.new UI isn't working at all. This appears to be a separate issue from the initial connection problem.

Steps to reproduce the Preview issue:

  1. Select an Ollama model
  2. Ask it to generate code (e.g., "create a simple React component")
  3. Code is generated successfully, but the Preview window remains blank/non-functional

Can anyone else confirm if they're seeing the same behavior with the Preview functionality after getting Ollama working?

Environment details:

  • Running in Docker using the setup described in my previous comment
  • Ollama is functioning and generating code responses
  • Preview window specifically is non-functional

@schilling3003

I am having a similar issue where the app sees the Ollama models, but the chat fails. What model is it trying to send the request to? For me, no matter which Ollama model I choose, it always tries to send the request to Claude 3.5 Sonnet, which makes it fail. Same problem when I try to use LMStudio. All local LLMs appear to be broken.

It's working for me now. When I initially tried to install oTToDev, the Docker build failed, and so did the regular Windows install method. I finally got it to install inside WSL, but then Ollama did not work. It showed the models, but all requests went to Claude Sonnet 3.5 no matter what I had selected, and I am obviously not running Claude Sonnet.

I was able to get the normal Windows install method working and everything works now. I would ideally like to get this running in Docker, but I continue to have issues building the container.

@MightymonkNL

I had the same problem (not the Docker version, just the Windows install). I found out that on my setup I just had to fill in something for the Anthropic key in .env.local (ANTHROPIC_API_KEY=dw23423NOAI), and after that it worked fine.

@dzbeebo

dzbeebo commented Nov 14, 2024

I got the same issue here. I'm on macOS, deployed it using Docker, and I tried the "fix" of changing RUNNING_IN_DOCKER to false, but that didn't resolve the issue. As you can see below, the model being referenced is still Claude 3.5:

2024-11-14 14:03:32     cause: Error: connect ECONNREFUSED 127.0.0.1:11434
2024-11-14 14:03:32         at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1607:16)
2024-11-14 14:03:32         at TCPConnectWrap.callbackTrampoline (node:internal/async_hooks:130:17) {
2024-11-14 14:03:32       errno: -111,
2024-11-14 14:03:32       code: 'ECONNREFUSED',
2024-11-14 14:03:32       syscall: 'connect',
2024-11-14 14:03:32       address: '127.0.0.1',
2024-11-14 14:03:32       port: 11434
2024-11-14 14:03:32     },
2024-11-14 14:03:32     url: 'http://127.0.0.1:11434/api/chat',
2024-11-14 14:03:32     requestBodyValues: {
2024-11-14 14:03:32       format: undefined,
2024-11-14 14:03:32       model: 'claude-3-5-sonnet-latest',
2024-11-14 14:03:32       options: [Object],
2024-11-14 14:03:32       messages: [Array],
2024-11-14 14:03:32       tools: undefined
2024-11-14 14:03:32     },
2024-11-14 14:03:32     statusCode: undefined,
2024-11-14 14:03:32     responseHeaders: undefined,
2024-11-14 14:03:32     responseBody: undefined,
2024-11-14 14:03:32     isRetryable: true,
2024-11-14 14:03:32     data: undefined,
2024-11-14 14:03:32     [Symbol(vercel.ai.error)]: true,
2024-11-14 14:03:32     [Symbol(vercel.ai.error.AI_APICallError)]: true
2024-11-14 14:03:32   },
2024-11-14 14:03:32   [Symbol(vercel.ai.error)]: true,
2024-11-14 14:03:32   [Symbol(vercel.ai.error.AI_RetryError)]: true
2024-11-14 14:03:32 }

@unmotivatedgene

Same issue here; I installed Bolt for the first time today. Bolt can see Ollama and load models, but it cannot chat with them. Ollama doesn't even see the request. The errors are the same as OP's.

@hardj300

How are people getting their models for ollama to show up in the drop down? Hahaha. I can't even get that far. I was able to deploy OpenHands and use my Ollama instance with no issues at all.

@marhensa
Author

How are people getting their models for ollama to show up in the drop down? Hahaha. I can't even get that far. I was able to deploy OpenHands and use my Ollama instance with no issues at all.

Refresh the browser, wait, change to another source, then switch back to Ollama; the model list from Ollama will show.

@chrismahoney
Collaborator

Thanks for the info. Ollama is at the top for me in terms of provider issue resolution; I'm going to spend some time with these issues starting tomorrow.

chrismahoney self-assigned this on Nov 15, 2024
@radumalica

Since you will focus on it, here is another small issue. Whenever you choose Ollama and get the models in the dropdown list, if you refresh the page the Ollama provider remains selected (because of PR #188) but the model list is empty. You have to choose another provider and then choose Ollama again to see the models. This might happen with other providers such as OpenAILike and others that pull the model list dynamically.

Once the page is refreshed, the Ollama provider is selected but no models appear in the list, and if you give it a prompt it will throw an error and fall back again to Claude-3.5-Sonnet, which is the default, even though the browser's local storage still has the Ollama provider and the model you had chosen before. This happens until you select another provider with a static model list and then select Ollama again.
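
One possible direction for a fix (an illustrative sketch only, not the project's actual component code; the hook name and wiring are assumptions): re-fetch the dynamic model list whenever the provider is selected, including on first mount, so a refresh with Ollama pre-selected doesn't leave the dropdown empty.

import { useEffect, useState } from 'react';

// Hypothetical hook: fetch the live model list for dynamic providers.
function useProviderModels(provider: string, baseUrl: string) {
  const [models, setModels] = useState<string[]>([]);

  useEffect(() => {
    // Only dynamic providers like Ollama need a live fetch; static providers
    // keep their hardcoded lists.
    if (provider !== 'Ollama') {
      return;
    }

    fetch(`${baseUrl}/api/tags`)
      .then((res) => res.json())
      .then((data: { models: { name: string }[] }) => setModels(data.models.map((m) => m.name)))
      .catch(() => setModels([]));
  }, [provider, baseUrl]);

  return models;
}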

@mmundy3832

How are people getting their models for ollama to show up in the drop down? Hahaha. I can't even get that far. I was able to deploy OpenHands and use my Ollama instance with no issues at all.

refresh browser, wait, change to other source, back to Ollama, the model list from Ollama will show.

I'm one of the people who has seen it work, but most of the time my list is empty for Ollama.

@xorica27

xorica27 commented Nov 16, 2024

There's an easy fix without any hardcoding:

Open Terminal:
export OLLAMA_HOST=localhost:8888

Then,
ollama serve

Start the Bolt application, refresh the browser, choose an Ollama model, and it should be working.

@TheFoxStudio

So just for my understanding: Running Bolt with local Ollama is currently broken?

Would it maybe be possible/smart to add a big, highlighted banner to the README so that others do not try to get this running for hours like I did and know they will just have to wait?

@dctfor

dctfor commented Nov 16, 2024

Well... I have this issue too, but I managed to hard-fix the problem of running local Ollama at localhost/127.0.0.1:11434... there are some bugs in the current project preventing local use.

Not sure why I'm always getting the claude-sonnet model instead of the model selected in the dropdown (for me, a custom llama3.1:8b with a 32k context window for now).

First, set your model in app/lib/.server/llm/stream-text.ts in the streamText method; in my case I set:
currentModel = 'llama3.1:8b';
just before the return.

[screenshot]

Second, not sure why, but it's attempting to connect to the IPv6 localhost instead of IPv4, so I'm hardcoding this in the getOllamaModel method in app/lib/.server/llm/model.ts:
Ollama.config.baseURL = 'http://127.0.0.1:11434/api';

[screenshot]

And that's it, that's my hard fix which lets me make it work for now. I created a fork and PR just in case you want to pull from my fork.

@TheFoxStudio

@dctfor
That hardcoded fix works for me as well. I appreciate you figuring this out and sharing it.

At least I can get something done until this is addressed. Thanks a lot!

@dctfor

dctfor commented Nov 16, 2024

@TheFoxStudio
I will try to track down the related issues; it is mainly the IPv6 URL format that fails to reach the API, and I'm not sure why fetching the tags works with Ollama.

Then I will review why the dropdown is not working as expected and tries to send claude-sonnet.
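
One plausible explanation for the IPv6 behaviour, offered as an assumption rather than something confirmed in this thread: newer Node versions no longer reorder DNS results, so "localhost" can resolve to ::1 first, and a connection to http://localhost:11434 then fails when Ollama only listens on 127.0.0.1. A quick check:

// Hypothetical check, not part of the repo: print how this Node runtime
// resolves "localhost". If ::1 comes first, an IPv4-only listener on
// 127.0.0.1 will refuse the connection.
import dns from 'node:dns/promises';

const addresses = await dns.lookup('localhost', { all: true });
console.log(addresses);
// e.g. [ { address: '::1', family: 6 }, { address: '127.0.0.1', family: 4 } ]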

@dctfor

dctfor commented Nov 16, 2024

So far I know it may be something related to the way it gets the model, because the default model ('claude-3-5-sonnet-latest') is being used, so it is more likely a frontend issue (not 100% sure) when sending the chatAction to the backend.

[screenshot]

And I'm not sure about the URL, as it should be taken from the env.
[screenshot]

So there might be something around the setup process I'll debug, since at least it should use 'localhost' instead of '::1'.

@dctfor

dctfor commented Nov 17, 2024

The issue with DEFAULT_MODEL ('claude-3-5-sonnet-latest') being used instead of the selected one is that the static model list is not updated by default: when the app asks the running Ollama instance for its available models via /api/tags, it does not update staticModels for the streamText method. I tried to filter the model list by the Ollama provider, but it is empty.

[screenshot]

[screenshot]

@dctfor

dctfor commented Nov 17, 2024

Originally it used to send the message as-is because it used only one provider.

[screenshot]

Then it was adjusted to allow virtually any provider, but the issue is with the model list. So far I'm unable to fix that bug while keeping the filtering in the streamText method. However, if the model list is updated and set in the UI, the model selected from the UI should be fine, and then we can proceed to use the actual model that comes with the message (see the sketch below). If there's an error, something is odd with fetching the available models for the provider and the list should be updated. Otherwise, I think this is the best solution to the problem.

[screenshot]
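
A rough sketch of the change described above (illustrative only; the actual diff is in the linked PR): trust the model and provider that the UI wrote into the message header, and fall back to the defaults only when the header is missing, instead of rejecting any name that isn't in the static MODEL_LIST.

function extractModelAndProvider(content: string) {
  const modelMatch = content.match(/\[Model: (.*?)\]/);
  const providerMatch = content.match(/\[Provider: (.*?)\]/);

  return {
    // Fall back to the defaults only when the header is genuinely absent,
    // not when the name merely isn't in the static list.
    model: modelMatch?.[1] ?? 'claude-3-5-sonnet-latest',
    provider: providerMatch?.[1] ?? 'Anthropic',
  };
}

console.log(extractModelAndProvider('[Model: qwen2.5-coder:7b]\n\n[Provider: Ollama]\n\ntest'));
// -> { model: 'qwen2.5-coder:7b', provider: 'Ollama' }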

@dctfor

dctfor commented Nov 17, 2024

And now it is fixed in my PR; that change will allow virtually any model from the source you set in the env vars. No issue with this simple change on my local setup.

@unmotivatedgene

And now it is fixed in my PR; that change will allow virtually any model from the source you set in the env vars. No issue with this simple change on my local setup.

I tried your branch and I see no change, other than that the correct model is now listed in the error.
Ollama still never sees the chat request hit it.

@unmotivatedgene

unmotivatedgene commented Nov 18, 2024

OK, so using @dctfor's fixed branch and adding this to the docker compose:

network_mode: "host"

I am finally able to get responses from chatting. I had not tried network mode host before trying his branch.

@marhensa
Author

marhensa commented Nov 18, 2024

Ok so using @dctfor 's fixed branch and adding to the docker compose:

network_mode: "host"

I am finally able to get responses from chatting. I did not try network mode host before trying his branch.

finally 🥹

Desktop.2024.11.18.20.29.40.04.mp4

[screenshot]

[screenshot]

@mroxso

mroxso commented Nov 18, 2024

I am not sure if my issue is the same or slightly different. I have Ollama and OpenWebUI deployed on a single container and known to be working just fine. When I deploy your Bolt fork into the same docker instance I am unable to see any models in the second drop down. I am able to curl the ollama api url from the bolt container just fine and am pretty certain the ENV variable was passed succesfully.

As i see in your screenshot, the .env.local is not passed to the container since it is added it in .dockerignore file by default. You need to remove it from there so the file will be copied to the actual docker container, and bindings.sh needs that file.

Also, make sure you are running this in the same docker network as ollama and webui containers. If it is so, you need to get the container IP address for Ollama container , docker inspect ollama_container_name it will be something like 172.19.x.x and you actually need to pass that URL as OLLAMA_BASE_URL in your bolt setup in .env.local .

Removing the .env.local definition from the .dockerignore file did the trick for me! Thanks!

@ssmirr

ssmirr commented Nov 20, 2024

@chrismahoney @coleam00 I proposed a fix for this issue in #344

@itaylorweb

I had the same issue. I'm running locally (not in Docker). All I needed to do was set the base URL for Ollama in my .env.local to http://127.0.0.1:11434 (not localhost), and it now works fine.


github-actions bot commented Dec 2, 2024

This issue has been marked as stale due to inactivity. If no further activity occurs, it will be closed in 7 days.

github-actions bot added the stale label on Dec 2, 2024