[Bug]: Milvus exited after about 10 minutes #38453

Open
1 task done
HWZhang1234 opened this issue Dec 13, 2024 · 2 comments
Labels: kind/bug, needs-triage

Comments

@HWZhang1234

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version:milvus:v2.4.13-gpu-hotfix
- Deployment mode(standalone or cluster):standalone 
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): Ubuntu 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

When I start Milvus with docker compose, it exits after a few minutes.

Expected Behavior

Milvus should stay healthy.

Steps To Reproduce

1. Run docker compose up -d to start Milvus.
2. It exits after a few minutes.

Milvus Log

milvus.log

Anything else?

No response

@HWZhang1234 added the kind/bug and needs-triage labels on Dec 13, 2024
@yanliang567
Contributor

@HWZhang1234 I did not see an error about the exit in the log, but I found two potential issues, listed below. I suggest adding more resources and retrying:

  1. This is not a clean deployment, which means collections already existed in the cluster when you started Milvus, and there is not enough memory to load the collection.
    [2024/12/13 08:54:24.538 +00:00] [WARN] [meta/failed_load_cache.go:97] ["FailedLoadCache put failed record"] [collectionID=453258454674913397] [error="load segment failed, OOM if load, maxSegmentSize = 2269.283290863037 MB, memUsage = 55400.912271499634 MB, predictMemUsage = 57670.19556236267 MB, totalMem = 63887.4609375 MB thresholdFactor = 0.900000"]
  2. The MQ is too slow.
    [2024/12/13 08:54:22.968 +00:00] [WARN] [server/rocksmq_impl.go:690] ["rocksmq produce too slowly"] [topic=by-dev-rootcoord-dml_2] ["get lock elapse"=0] ["alloc elapse"=0] ["write elapse"=0] ["updatePage elapse"=201] ["produce total elapse"=201]
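
To make the numbers in the OOM warning easier to read, here is a minimal sketch that redoes the arithmetic with the values from that log line. In the log, predictMemUsage equals memUsage plus maxSegmentSize. The comparison against totalMem * thresholdFactor is an assumption inferred from the field names, not a confirmed copy of the Milvus implementation, and the function name `would_oom` is hypothetical.

```python
# Values copied from the "FailedLoadCache put failed record" warning (all in MB).
max_segment_size = 2269.283290863037
mem_usage = 55400.912271499634
total_mem = 63887.4609375
threshold_factor = 0.9


def would_oom(mem_usage, max_segment_size, total_mem, threshold_factor):
    """Assumed check: predicted usage compared with a fraction of total memory.

    This mirrors the fields in the log line; it is not taken from Milvus source.
    """
    predict_mem_usage = mem_usage + max_segment_size
    limit = total_mem * threshold_factor
    return predict_mem_usage, limit, predict_mem_usage > limit


predict_mem_usage, limit, oom = would_oom(
    mem_usage, max_segment_size, total_mem, threshold_factor
)
print(f"predicted usage: {predict_mem_usage:.2f} MB")
print(f"assumed limit:   {limit:.2f} MB")
print(f"over by:         {predict_mem_usage - limit:.2f} MB, would OOM: {oom}")
```

Under that assumption, the predicted usage exceeds the limit by a small margin, and memUsage already accounts for most of the 63887 MB reported as totalMem before the segment is loaded.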

/assign @HWZhang1234
/unassign

@HWZhang1234
Author

I captured another Milvus log. This time Milvus ran for less than 2 minutes. Could you please check this log?
milvus.log

I have also uploaded my docker-compose.yml. Could you please help me increase the memory in this file?
docker-compose.txt

2 participants