OpenCost UI nginx timeouts tuning #40

vova3379 · 2024-10-11T16:11:45Z

Describe the bug

The OpenCost UI does not allow tuning the values below:
https://github.com/opencost/opencost-ui/blob/f243545ed8b04113d088f0e62e064acf5d950714/default.nginx.conf.template#L69-L71

        proxy_connect_timeout       180;
        proxy_send_timeout          180;
        proxy_read_timeout          180;

Our clusters can occasionally grow to more than 100 nodes. On clusters of that size, calculating the cluster cost for 7 days in the UI sometimes returns:

Failed to load report data
Request failed with status code 404

and in the logs we see:

opencost-bc6d766b9-fw2sg opencost-ui 2024/10/11 12:38:53 [error] 28#28: *296999 upstream timed out (110: Operation timed out) while reading response header from upstream, client: 10.34.146.18, server: _, request: "GET /model/allocation/compute?window=7d&aggregate=namespace&includeIdle=true&step=1d&accumulate=false HTTP/1.1", upstream: "http://0.0.0.0:9003/allocation/compute?window=7d&aggregate=namespace&includeIdle=true&step=1d&accumulate=false", host: "prod-spok-opencost.domain.net", referrer: "https://prod-spok-opencost.domain.net/allocation?window=7d"
opencost-bc6d766b9-fw2sg opencost-ui 2024/10/11 12:38:53 [error] 28#28: *296999 open() "/var/www/custom_504.html" failed (2: No such file or directory), client: 10.34.146.18, server: _, request: "GET /model/allocation/compute?window=7d&aggregate=namespace&includeIdle=true&step=1d&accumulate=false HTTP/1.1", upstream: "http://0.0.0.0:9003/allocation/compute?window=7d&aggregate=namespace&includeIdle=true&step=1d&accumulate=false", host: "prod-spok-opencost.domain.net", referrer: "https://prod-spok-opencost.domain.net/allocation?window=7d"
opencost-bc6d766b9-fw2sg opencost-ui 10.34.146.18 - - [11/Oct/2024:12:38:53 +0000] "GET /model/allocation/compute?window=7d&aggregate=namespace&includeIdle=true&step=1d&accumulate=false HTTP/1.1" 404 153 "https://prod-spok-opencost.domain.net/allocation?window=7d" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:131.0) Gecko/20100101 Firefox/131.0" "10.32.104.179"

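So in this case the request hits the 180 s proxy timeout in the OpenCost UI container. Judging by the second log line, the 404 shown in the UI appears to come from nginx failing to open the custom 504 page after the upstream timeout.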

The opencost and prometheus pods are not OOM-killed, and their CPU requests (CPU limit = null) are higher than the usage their metrics show during 7-day calculation requests.

So it seems we need the ability to tune these parameters at the Docker image level and then at the chart level. Otherwise, we would need to build our own Docker image for the OpenCost UI with updated timeout values. The former looks like the better solution to me.
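
A minimal sketch of how the template might expose these values, assuming the image renders `default.nginx.conf.template` with `envsubst`-style substitution when the container starts. The variable names `PROXY_CONNECT_TIMEOUT`, `PROXY_SEND_TIMEOUT`, and `PROXY_READ_TIMEOUT` are hypothetical, not existing OpenCost settings, and the entrypoint would also need to supply defaults (for example 180) when they are unset.

```nginx
# Hypothetical template fragment: the ${...} placeholders are assumed to be
# replaced from environment variables before nginx starts.
proxy_connect_timeout       ${PROXY_CONNECT_TIMEOUT};
proxy_send_timeout          ${PROXY_SEND_TIMEOUT};
proxy_read_timeout          ${PROXY_READ_TIMEOUT};
```

Since the log reports the timeout "while reading response header from upstream", `proxy_read_timeout` is probably the value that matters most here. The nginx documentation also notes that `proxy_connect_timeout` usually cannot exceed 75 seconds.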

I may have missed something, so suggestions are appreciated.


AjayTripathy · 2024-10-15T17:17:52Z

Would support a PR here! Should be a quick handful of helm changes!

vova3379 · 2024-10-16T14:40:09Z

I can provide a PR for this and, after approval, another PR for the opencost chart.

Is this level of performance expected for OpenCost on clusters of this size? Kubecost doesn't have such performance issues.

AjayTripathy · 2024-10-17T16:34:36Z

Kubecost handles caching and data layers on the user's behalf between Prometheus and the UI; that is beyond the scope of opencost today.

github-actions bot added needs-follow-up needs-triage labels Oct 11, 2024
