A ReportDataSource is a custom resource that represents how to store data, such as where it should be stored, and in some cases, how the data is to be collected.
There are currently four types of ReportDataSource: prometheusMetricsImporter, awsBilling, reportQueryView, and prestoTable.
Each has a corresponding configuration section within the spec of a ReportDataSource.
The main effect of creating a ReportDataSource is that the metering operator creates a table in Presto or Hive.
Depending on the type of ReportDataSource, the operator may then perform additional tasks.
For prometheusMetricsImporter datasources, the operator periodically collects metrics and stores them in the table.
For awsBilling, the operator configures the table to point at an S3 bucket containing AWS Cost and Usage reports, exposing these reports as a database table.
For more details on how the different ReportDataSources work, see the metering architecture document.
- prometheusMetricsImporter: If this section is present, then the ReportDataSource will be configured to periodically poll Prometheus for metrics using the specified Prometheus query.
  - query: The PromQL query to use.
  - storage: This section controls the StorageLocation options, allowing you to control on a per ReportDataSource level, where data is stored.
    - storageLocationName: The name of the StorageLocation resource to use.
  - prometheusConfig:
    - url: If present, the URL of the Prometheus instance to scrape for this ReportDataSource.
- awsBilling: If specified, the ReportDataSource will be configured to use an S3 bucket containing AWS billing reports as its source of data.
  - source:
    - bucket: The name of the bucket containing the billing reports.
    - prefix: The path within the bucket where the reports are located.
    - region: The region where the bucket is located.
- reportQueryView: If this section is present, then the ReportDataSource will be configured to create a View in Presto using the rendered spec.query as the query for the view.
  - queryName: The name of a ReportQuery to create a view from.
  - inputs: Used to override or set values defined in a ReportQuery's spec.input field. For details on how inputs can be specified read the Specifying Inputs section of the ReportQueries documentation.
  - storage: This section controls the StorageLocation options, allowing you to control on a per ReportDataSource level, where data is stored.
    - storageLocationName: The name of the StorageLocation resource to use.
- prestoTable: If present, then the ReportDataSource will simply make it possible to reference a database table within Presto as a ReportDataSource.
  - tableRef: The name of the PrestoTable that this ReportDataSource should refer to.
For ReportDataSources with a spec.prometheusMetricsImporter present, their tables have the following database table schema:
- timestamp: The type of this column is timestamp. This is the time at which the metric was collected.
  - Note: timestamp is also a reserved keyword (for the column type) in Presto, meaning any queries using it must use quotes to refer to the column, like so: SELECT "timestamp" FROM datasource_unready_deployment_replicas LIMIT 1;
- timeprecision: The type of this column is a double. This is the "query resolution step width" used to query this metric from Prometheus. This defines how accurate the data is. The bigger the value, the less accurate. This value is controlled globally by the operator, and has a default value of 60.
- labels: The type of this column is a map(varchar, varchar). This is the set of Prometheus labels and their values for the metric.
- amount: The type of this column is a double. Amount is the value of the metric at that timestamp.
Below is an example of one of the built-in ReportDataSource resources that are installed with the Metering Operator by default.
apiVersion: metering.openshift.io/v1
kind: ReportDataSource
metadata:
  name: "pod-request-memory-bytes"
  labels:
    operator-metering: "true"
spec:
  prometheusMetricsImporter:
    query: |
      sum(kube_pod_container_resource_requests_memory_bytes) by (pod, namespace, node)
If the data to be scraped is on a non-default Prometheus instance:
apiVersion: metering.openshift.io/v1
kind: ReportDataSource
metadata:
  name: "pod-request-memory-bytes"
  labels:
    operator-metering: "true"
spec:
  prometheusMetricsImporter:
    query: |
      sum(kube_pod_container_resource_requests_memory_bytes) by (pod, namespace, node)
    prometheusConfig:
      url: http://custom-prometheus-instance:9090
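The storage section described in the field list above can be set in the same way. The following is a minimal sketch, not one of the built-in resources: the metadata name and the StorageLocation name example-storage-location are hypothetical, and it assumes a StorageLocation resource with that name already exists.

```yaml
apiVersion: metering.openshift.io/v1
kind: ReportDataSource
metadata:
  # Hypothetical name, chosen for illustration only.
  name: "pod-request-memory-bytes-custom-storage"
spec:
  prometheusMetricsImporter:
    query: |
      sum(kube_pod_container_resource_requests_memory_bytes) by (pod, namespace, node)
    storage:
      # Assumption: a StorageLocation resource with this name already exists.
      storageLocationName: "example-storage-location"
```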
For ReportDataSources with a spec.reportQueryView present, a Presto view will be created using the rendered output of the specified ReportQuery's spec.query field.
This abstracts away the details of more complex queries by exposing them as a database table whose content is the result of the query the view is based on.
It also enables re-use: the complexity of a query lives in the view, and other queries can simply query the view as a regular table.
This example exposes the pod-memory-request-raw ReportQuery as a view.
The schema is based on the spec.columns of the ReportQuery.
apiVersion: metering.openshift.io/v1
kind: ReportDataSource
metadata:
  name: "pod-memory-request-raw"
  labels:
    operator-metering: "true"
spec:
  reportQueryView:
    queryName: pod-memory-request-raw
If you want to specify inputs to a ReportQuery that accepts them, set them in spec.reportQueryView.inputs:
apiVersion: metering.openshift.io/v1
kind: ReportDataSource
metadata:
  name: "cluster-cpu-capacity-2019"
spec:
  reportQueryView:
    queryName: cluster-cpu-capacity
    inputs:
    - name: ReportingStart
      value: "2019-01-01T00:00:00Z"
    - name: ReportingEnd
      value: "2020-01-01T00:00:00Z"
For more details on how inputs can be specified, see the Specifying Inputs section of the ReportQueries documentation.
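The field list above also documents a storage section for reportQueryView. As a rough sketch of how it might be combined with queryName, the following reuses the pod-memory-request-raw ReportQuery from the earlier example; the metadata name and the StorageLocation name example-storage-location are hypothetical, and a StorageLocation resource with that name is assumed to exist.

```yaml
apiVersion: metering.openshift.io/v1
kind: ReportDataSource
metadata:
  # Hypothetical name, chosen for illustration only.
  name: "pod-memory-request-raw-custom-storage"
spec:
  reportQueryView:
    queryName: pod-memory-request-raw
    storage:
      # Assumption: a StorageLocation resource with this name already exists.
      storageLocationName: "example-storage-location"
```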
For ReportDataSources with a spec.awsBilling present, see here for an example of what the table schema looks like.
apiVersion: metering.openshift.io/v1
kind: ReportDataSource
metadata:
  name: "aws-billing"
  labels:
    operator-metering: "true"
spec:
  awsBilling:
    source:
      bucket: "your-aws-cost-report-bucket"
      prefix: "path/to/report"
      region: "your-buckets-region"
For ReportDataSources with a spec.prestoTable present, the reporting-operator simply verifies that the referenced PrestoTable resource exists and that its status.tableName is set.
If so, the ReportDataSource points at the existing PrestoTable.
A PrestoTable ReportDataSource is merely a way to expose an arbitrary table to the rest of the metering resources, which expect to interact with a ReportDataSource.
apiVersion: metering.openshift.io/v1
kind: ReportDataSource
metadata:
  name: example-baremetal-cost
spec:
  prestoTable:
    tableRef:
      name: example-baremetal-cost