Overview

Version 2.4 is an ‘edge’ release focused on GPU efficiency, with many bug fixes and several quality-of-life improvements.

Important Notices

Upgrading to 2.4 will add new fields to the Kubecost ETL that support the GPU monitoring features. The new ETL files are not backward compatible with previous versions. Multi-cluster users MUST upgrade the primary before upgrading the secondaries (agents).

The current 2.3.x release is considered stable and will continue to be maintained. Kubecost will release a new 2.3.x version that is compatible with the ETL changes in 2.4.x, which will allow downgrading from 2.4.x to that version. That said, the 2.4.0 release has been extensively tested, and we recommend upgrading to take advantage of the new features and the significant number of bug and CVE fixes.

An agent upgrade to version 2.4+ is required to gather the additional metrics for NVIDIA GPU workloads. If NVIDIA GPUs are not used, the agent upgrade is not required.
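
The ordering rules above (primary first, agents only where NVIDIA GPU metrics are needed) can be sketched as a small planning helper. This is only an illustration: the `Cluster` class, its fields, and `upgrade_plan` are hypothetical names invented for the example and are not part of Kubecost or its Helm chart.

```python
from __future__ import annotations

from dataclasses import dataclass

TARGET = (2, 4)  # first major.minor that writes the new ETL fields


def parse_version(text: str) -> tuple[int, int]:
    """Return (major, minor) from a string such as 'v2.3.5' or '2.4.0'."""
    major, minor, *_ = text.lstrip("vV").split(".")
    return int(major), int(minor)


@dataclass
class Cluster:
    # Hypothetical inventory record, not a Kubecost type.
    name: str
    version: str
    is_primary: bool = False
    uses_nvidia_gpus: bool = False


def upgrade_plan(clusters: list[Cluster]) -> list[str]:
    """List cluster names in upgrade order: primaries first, then GPU agents."""
    def needs_upgrade(c: Cluster) -> bool:
        return parse_version(c.version) < TARGET

    primaries = [c.name for c in clusters if c.is_primary and needs_upgrade(c)]
    # Agents without NVIDIA GPUs may stay on their current version.
    agents = [
        c.name
        for c in clusters
        if not c.is_primary and c.uses_nvidia_gpus and needs_upgrade(c)
    ]
    return primaries + agents
```

Given one primary and two agents on 2.3.x, only one of which runs NVIDIA GPUs, the plan lists the primary followed by the GPU agent and leaves the other agent alone.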

Major Features

[Feature] Incorporate GPU Efficiency into efficiency metrics displayed around the application.
[Feature] Ability to rightsize node groups in cluster-sizing. Note that this requires agents (secondaries) to be at or above 1.100, which added support for node labels.
[Feature] Add options to the Allocations page to see Idle costs broken down per-node and per-cluster.
[Feature] Add support for Collections Budgets.
[Feature] Add support for Idle Costs to Collections.

Minor Features

[Feature] Add support for a new Helm setting that lets the standard discount configured in the Kubecost primary cluster installation apply to data coming from secondary clusters.
[Feature] Add support for certificates when using a custom SMTP server with Kubecost.
[Feature] Add new FOCUS spec fields to Cloud Cost to support Account Name, Invoice Entity Name, Region ID, and Availability Zone.
[Feature] Add the ability to support BYO certificates for SMTP integration.
[Feature] Add a check in the Settings page which alerts users when their Helm Chart, UI image, and API image versions are not in sync.
[Feature] Add four new fields from the FOCUS spec to Cloud Costs.
[Feature] Add four new fields from the FOCUS spec to Cloud Budgets.
[Feature] Add limited support for feature-flagging via the Helm chart.
[Feature] Agent diagnostics is now enabled by default.
[Enhancement] Substantial application-wide improvements to WCAG 2.1 AA accessibility.
[Enhancement] Add a loading indicator when downloading request sizing CSVs to show that the download has, in fact, been initiated.
[Enhancement] Add a loading indicator when request sizing data is refreshing.
[Enhancement] Remove the “New” badges from pages that were introduced in 2.0.
[Enhancement] Default Request Sizing window to 3d instead of 48h. Using 48 data points was causing the page to hang or crash for some larger data sets.
[Enhancement] When an array of empty data is returned from the custom costs API, show an informative message rather than an empty graph/table.
[Enhancement] Show an informative message when the Request Sizing API returns a response with an empty set of Recommendations.
[Enhancement] Show an informative message when attempting to create a Budget fails.
[Enhancement] More information in bug reports.
[Enhancement] Show a more informative error response when cluster sizing recommendations cannot be generated due to not finding cloud provider information for a cluster.
[Enhancement] Show friendlier Cloud Account Names in Overview / Cloud Cost tables instead of Cloud Account IDs, when names are available.
[Enhancement] Add the ability to see aggregator PV usage in /diagnostics page.
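
As a rough picture of the version-sync check mentioned in the list above (Helm chart, UI image, and API image), the comparison could look like the sketch below. The function name, the normalization rule, and the message text are assumptions made for illustration; the source does not describe how the Settings page actually compares versions.

```python
from __future__ import annotations


def version_sync_warning(chart: str, ui_image: str, api_image: str) -> str | None:
    """Return a warning string if the three versions differ, else None.

    Assumption: versions are compared as plain strings after stripping
    whitespace and a leading 'v'.
    """
    versions = {
        "Helm chart": chart.strip().lstrip("vV"),
        "UI image": ui_image.strip().lstrip("vV"),
        "API image": api_image.strip().lstrip("vV"),
    }
    if len(set(versions.values())) == 1:
        return None  # everything is on the same version
    details = ", ".join(f"{name} {value}" for name, value in versions.items())
    return f"Versions are not in sync: {details}"
```

A call such as `version_sync_warning("2.4.0", "v2.4.0", "2.3.5")` returns a message naming all three components, while matching versions return `None`.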

Fixes

[Fix] Add a new script for copying alerts to the aggregator pod from cost-model as we moved this endpoint over. If you have alerts configured prior to 2.4, you’ll need to run this script upon upgrading.
[Fix] Fix an issue where overview cluster efficiency shows usage as 0.
[Fix] Fix an issue where resource hourly cost is incorrectly calculated on drill down.
[Fix] Fix an issue when changing from separate idle by node to another idle configuration.
[Fix] Fix an issue with GPU idle calculations in allocation.
[Fix] Fix assets that appear to be missing account ID.
[Fix] Fix an issue causing discrepancies in collections cost in the k8s domain for query windows that yield relative date boundaries.
[Fix] Fix CSV pricing for GPUs not being correctly reflected in Kubecost.
[Fix] Fix an issue with the Allocation API not matching Allocation Summary API on costs.
[Fix] Fix an HTTP 500 error in cluster right-sizing.
[Fix] Fix an issue with Allocation API calculation on PV costs.
[Fix] Fix an issue with Allocation API and Allocation Summary API cost accuracy when cost metric is not set to cumulative cost.
[Fix] Fix an issue with AKS reconciliation of BRL currency costs.
[Fix] Fix an issue with Asset budgets using the ‘Project’ workload type.
[Fix] Fix several issues with the /clusters page, including inaccurate provider selection and inaccurate costs.
[Fix] Fix aws:eks:cluster-name tag not being picked up.
[Fix] Fix an issue causing inflated network costs for Azure clusters.
[Fix] Fix an issue where HA and DR icons are not working properly on /settings page.
[Fix] Fix an issue with Carbon Costs and Trends getting HTTP 500 in allocations.
[Fix] Fix an issue in the orphaned resources API where a single failed resource lookup from the provider caused a 500 error.
[Fix] Fix an issue in allocations presenting non-zero shared costs when sharing is disabled.
[Fix] Fix the scalability of the clusters API for accuracy and speed.
[Fix] Better error handling in some cases where the app fails to start. Allow users to enter a license key or start/extend an Enterprise trial when blocked on license violations.
[Fix] Update math in the Overview’s efficiency graph card so as not to show negative allocation, which is impossible.
[Fix] Remove the Category filter from Asset Budget filter options, as it is unsupported.
[Fix] Prevent drilling into Pod items in the Efficiency page. Previously, this would set the aggregation to Namespace and remove all filters.
[Fix] Request Sizing had two separate UI elements for setting Filters. The one in the Customize menu has been removed.
[Fix] Remove an unnecessary check for the presence of the Network Cost daemonset on the primary cluster before rendering the Network Costs page. Secondary clusters may be reporting network costs that can be viewed from this page, regardless of the state of the daemonset on the primary.
[Fix] Prevent querying for data older than the 15 day retention period for Free tier in the Collections and Efficiency pages.
[Fix] Correctly generate links from the Allocations page to the Request Right Sizing page when filtering and/or aggregating by custom label.
[Fix] Correct an error that resulted from saving Cloud Cost reports with custom labels.
[Fix] Correct a broken link to the Efficiency Report documentation.
[Fix] Fix a bug in Assets where updating the Cost Metric field would remove any applied filters.
[Fix] Fix an issue where step size was not honored in Efficiency Reports.
[Fix] Fix a variety of issues in the Allocation Detail Modal (shown when clicking on a Pod row). This modal would issue an incorrect and expensive Assets query to try to derive the Pod’s Node. When it failed, it would show a cryptic message about credentials.
[Fix] Fix a bug that caused the Clusters list to filter incorrectly.
[Fix] Remove the unallocated item from the Overview’s Namespace Breakdown table.
[Fix] Fix an issue where applying a license would sometimes hide the current active Free Enterprise Trial status and vice versa. The settings page now always shows both the active license and the state of an installation’s free trial.
[Fix] Fix an issue where custom SMTP tests/updates from the UI could fail.
[Fix] Fix Alerts only alerting on data from the Primary cluster. All alerts except Cluster/Application Health alerts will leverage data from secondary clusters.
[Fix] Don’t try to show all per-day cluster costs in the Overview page. Show top 10 like we do in other graphs.
[Fix] Fix an issue where UI-created Budgets that reset on Sunday did not create correctly.
[Fix] Fix an issue where the UI could send an incorrect parameter to the Cluster Sizing API.
[Fix] Fix an issue with Assets monthly totals not appropriately lining up.
[Fix] Fix category options in asset autocomplete.
[Fix] Fix an issue where namespace turndown always shows the next run as ‘coming soon’.
[Fix] Fix alerts to be multi-cluster aware.
[Fix] Fix missing claim names in persistent volume sizing.
[Fix] Fix the default experience for cluster right sizing when current daily data isn’t yet available.
[Fix] Fix an inaccuracy in pod costs on abandoned workloads savings page.
[Fix] Fix an issue where the cluster provider name could be incorrect on the clusters page.
[Fix] Fix an issue where total and page count on container right-sizing page had values when no recommendations were available.
[Fix] Fix an issue where database timestamps weren’t being correctly set for some data, defaulting to Jan 1st 1970.
[Fix] Fix an issue with PV discrepancy between allocation and allocation summary API.
[Fix] Fix an issue with saving SMTP configuration after edits.
[Fix] Fix an issue where the aggregator could run out of PV space without any warning being surfaced in the frontend.
[Fix] Fix an issue where shared costs do not show correctly in the top level allocations view.
[Fix] Fix an issue where node counts don’t match across allocation, assets, and cluster inspect.
[Fix] Fix an issue where allocation API does not matc...
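
The “Jan 1st 1970” symptom in the database timestamp fix above is what a zero Unix timestamp looks like once formatted. The sketch below shows that and a simple guard; the idea that the affected rows held a zero value is an assumption, and `is_unset_timestamp` is a hypothetical helper, not a Kubecost function.

```python
from datetime import datetime, timezone

# A Unix timestamp of 0 is the epoch: midnight UTC on 1 January 1970.
EPOCH = datetime.fromtimestamp(0, tz=timezone.utc)


def is_unset_timestamp(ts: datetime) -> bool:
    """Treat a timezone-aware timestamp equal to the epoch as 'never set'."""
    return ts == EPOCH


print(EPOCH.isoformat())  # 1970-01-01T00:00:00+00:00
```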

Fixes

Reduced memory footprint.
Fix an issue blocking settings page because of core count limit.
Fix an issue with Cloud costs page not leveraging cloud account mappings.
Fix an issue with higher than normal numbers in GCP cloud costs.
Fix an issue with aggregating by label on right-sizing.
Fix an issue with the request-sizing sort-by field causing an API error.
Fix an issue with scheduled reports not sending.
Fix an issue with collections adding cloud cost with custom labels.
Fix an issue with external costs not getting ingested properly.
Fix an issue with asset budgets getting an API error when using the service workload type.
Fix an issue with budgets page resetting weekly send 1-7 from 0-6.
Fix an issue in budgets collection selector not showing selected value.
Fix an issue causing a Prometheus query error in local storage queries.
Fix an issue creating cloud cost reports via the Helm chart.
Fix an issue with trial status disappearing on upgrade of EKS optimized.
Fix an issue with slow queries on the cluster status API.
Fix an issue with an invalid provider name in the cluster list and cluster detail.
Fix an error visible in aggregator logs, “append row Failure: acquiring max concurrency semaphore: context canceled”, resulting in hung API responses.
Fix an issue with the allocation top line matching the allocation summary API.
Fix an issue with assets not matching the allocation summary API.
Fix an issue with adjustments when no cloud integration or custom pricing is enabled.
Fix an issue with auth loops when OIDC values are set.
Fix an issue with Azure costs being lower in Kubecost than in Azure.

Helm Fixes

#3595 Fix Okta redirect loop.
#3600 Reduce memory usage.

Security Updates

#3618 Bump kubecost-modeling to v0.1.15 to fix CVE-2024-7592
#3609 Bump kubecost-modeling to v0.1.14 to fix CVE-2024-4603, CVE-2024-5535, CVE-2024-6923, CVE-2024-37891
#3594 Bump cluster-controller to 0.16.8 to fix CVE-2024-41110
#3603 Bump Grafana for CVE-2024-21490, CVE-2024-24557

Known issues:

prom/prometheus v2.53.1 in our Helm chart has a known critical CVE-2024-41110 that has not been resolved upstream. Once this is resolved we will patch again.
redhat/ubi9 has a known high CVE-2024-6345 that has not been resolved upstream. We are working on alternative resolutions, as ubi9 has left this open for some time, and will patch once we have a verified and tested solution.

Releases: kubecost/cost-analyzer-helm-chart