Because we have clusters that can grow beyond 100 nodes from time to time, the 7-day cluster cost calculation in the UI returns:
Failed to load report data
Request failed with status code 404
and in the logs we see:
opencost-bc6d766b9-fw2sg opencost-ui 2024/10/11 12:38:53 [error] 28#28: *296999 upstream timed out (110: Operation timed out) while reading response header from upstream, client: 10.34.146.18, server: _, request: "GET /model/allocation/compute?window=7d&aggregate=namespace&includeIdle=true&step=1d&accumulate=false HTTP/1.1", upstream: "http://0.0.0.0:9003/allocation/compute?window=7d&aggregate=namespace&includeIdle=true&step=1d&accumulate=false", host: "prod-spok-opencost.domain.net", referrer: "https://prod-spok-opencost.domain.net/allocation?window=7d"
opencost-bc6d766b9-fw2sg opencost-ui 2024/10/11 12:38:53 [error] 28#28: *296999 open() "/var/www/custom_504.html" failed (2: No such file or directory), client: 10.34.146.18, server: _, request: "GET /model/allocation/compute?window=7d&aggregate=namespace&includeIdle=true&step=1d&accumulate=false HTTP/1.1", upstream: "http://0.0.0.0:9003/allocation/compute?window=7d&aggregate=namespace&includeIdle=true&step=1d&accumulate=false", host: "prod-spok-opencost.domain.net", referrer: "https://prod-spok-opencost.domain.net/allocation?window=7d"
opencost-bc6d766b9-fw2sg opencost-ui 10.34.146.18 - - [11/Oct/2024:12:38:53 +0000] "GET /model/allocation/compute?window=7d&aggregate=namespace&includeIdle=true&step=1d&accumulate=false HTTP/1.1" 404 153 "https://prod-spok-opencost.domain.net/allocation?window=7d" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:131.0) Gecko/20100101 Firefox/131.0" "10.32.104.179"
So in this case we hit the 180s timeouts of the opencost UI container.
The opencost and prometheus pods are not OOM-killed, and their CPU requests (CPU limit = null) are higher than the usage their metrics show during 7-day calculation requests.
So it seems we need the ability to tune these parameters at the Docker level and then at the chart level. Otherwise, we need to build our own Docker image for the opencost UI with updated timeout values, which is a good solution as I see it.
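A minimal sketch of what a tunable version could look like, assuming the UI image renders its nginx template from environment variables at startup. The variable names `PROXY_CONNECT_TIMEOUT`, `PROXY_SEND_TIMEOUT`, and `PROXY_READ_TIMEOUT` are hypothetical and not part of the current image, and the `location` block is abbreviated, with the upstream address taken from the log lines above:

```nginx
# Hypothetical fragment of default.nginx.conf.template.
# The ${...} placeholders are assumed names that would be filled in
# from container environment variables (and, in turn, from chart values).
location /model/ {
    proxy_connect_timeout ${PROXY_CONNECT_TIMEOUT};  # time allowed to establish the upstream connection
    proxy_send_timeout    ${PROXY_SEND_TIMEOUT};     # max gap between two writes to the upstream
    proxy_read_timeout    ${PROXY_READ_TIMEOUT};     # max gap between two reads from the upstream
    proxy_pass            http://0.0.0.0:9003/;      # upstream as it appears in the logs
}
```

The "upstream timed out ... while reading response header from upstream" error is the case governed by `proxy_read_timeout`, so that is likely the value that matters most for long 7-day queries.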
It could be that I missed something, so suggestions are appreciated.
Describe the bug
OpenCost does not allow tuning the values below:
https://github.com/opencost/opencost-ui/blob/f243545ed8b04113d088f0e62e064acf5d950714/default.nginx.conf.template#L69-L71