Skip to content

Commit

Permalink
[nexus] Project-scoped OxQL endpoint (#6873)
Browse files Browse the repository at this point in the history
#7047 moves the existing `/v1/timeseries/query` endpoint (which requires
the fleet viewer role) to `/v1/system/timeseries/query`. This PR is on
top of #7047 and adds `/v1/timeseries/query?project=my-project`, using a
sneaky trick to let us ensure that the user has read permissions on a
project before we let them see metrics for that project. See
#5298 (comment)
for a discussion of the OxQL authz problem more broadly.

- [x] Add `/v1/timeseries/query` with required project query param
- [x] Integration tests showing the authz works in an expected scenarios
- ~~Add a list schemas endpoint that only lists schemas with a
`project_id` field~~
- ~~Move schema filtering logic inside oximeter client~~
- ~~Fully test schema list endpoint~~

## The trick

1. Require the user to say (in a query param) what project they're
interested in
2. Look up that project and make sure they can read it
3. Jam `| filter silo_id == "<silo_id>" && project_id == "<project_id>"`
on the end of whatever query they passed in

It sounds silly, but I talked it over with @bnaecker and we couldn't
find any holes. If the user tries to fetch metrics from another project
inside their query, the query will end up with the filter `project_id ==
"def" && project_id == "xyz"` and the result set will always be empty.
If they try to query a metric that doesn't have a `project_id` on it, it
will error out. It works!

## API design

Initially I had the endpoint as
`/v1/timeseries/query/project/{project}`. This is really horrible. It
should be `/projects/`, but that doesn't feel any better. I also
considered `/v1/projects/{project}/timeseries/query`, which has some
precedent:


https://github.com/oxidecomputer/omicron/blob/45f5f1cc2c7eec2a1de0a5143e85e0794134f175/nexus/external-api/output/nexus_tags.txt#L59-L64

But it also feels awful. 

I like the query param approach. Right now, there are only fleet-scoped
metrics and project-scoped metrics, so requiring the project ID makes
sense, but the query params are nicely flexible in that we could make
project optional or add other optional fields and just do different auth
checks based on what's passed in. Neither path nor operation ID mention
projects.

<details>
<summary>Examples of fleet- and project-scoped metrics</summary


https://github.com/oxidecomputer/omicron/blob/45f5f1cc2c7eec2a1de0a5143e85e0794134f175/oximeter/oximeter/schema/virtual-machine.toml#L3-L18


https://github.com/oxidecomputer/omicron/blob/45f5f1cc2c7eec2a1de0a5143e85e0794134f175/oximeter/oximeter/schema/switch-data-link.toml#L3-L19
</details>


## No list endpoint yet

I backed out the schema list endpoint. We can't list project-scoped
schemas because `authz_scope` is not in the database! See
#5942. Currently the
schema list endpoint hard-codes `authz_scope: Fleet`.


https://github.com/oxidecomputer/omicron/blob/69de8b6288fde36fbcd6cabb4d632d62851230ad/oximeter/db/src/model/from_block.rs#L142

I am using the tag `hidden` on the query endpoint so that it goes in the
OpenAPI definition (so I can use it in the console) but it will not show
up on the docs site.
  • Loading branch information
david-crespo authored Dec 3, 2024
1 parent 9285a7c commit 289146b
Show file tree
Hide file tree
Showing 7 changed files with 337 additions and 40 deletions.
1 change: 1 addition & 0 deletions nexus/external-api/output/nexus_tags.txt
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ probe_create POST /experimental/v1/probes
probe_delete DELETE /experimental/v1/probes/{probe}
probe_list GET /experimental/v1/probes
probe_view GET /experimental/v1/probes/{probe}
timeseries_query POST /v1/timeseries/query

API operations found with tag "images"
OPERATION ID METHOD URL PATH
Expand Down
20 changes: 20 additions & 0 deletions nexus/external-api/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2567,6 +2567,26 @@ pub trait NexusExternalApi {
body: TypedBody<params::TimeseriesQuery>,
) -> Result<HttpResponseOk<views::OxqlQueryResult>, HttpError>;

// TODO: list endpoint for project-scoped schemas is blocked on
// https://github.com/oxidecomputer/omicron/issues/5942: the authz scope for
// each schema is not stored in Clickhouse yet.

/// Run project-scoped timeseries query
///
/// Queries are written in OxQL. Project must be specified by name or ID in
/// URL query parameter. The OxQL query will only return timeseries data
/// from the specified project.
#[endpoint {
method = POST,
path = "/v1/timeseries/query",
tags = ["hidden"],
}]
async fn timeseries_query(
rqctx: RequestContext<Self::Context>,
query_params: Query<params::ProjectSelector>,
body: TypedBody<params::TimeseriesQuery>,
) -> Result<HttpResponseOk<views::OxqlQueryResult>, HttpError>;

// Updates

/// Upload TUF repository
Expand Down
70 changes: 47 additions & 23 deletions nexus/src/app/metrics.rs
Original file line number Diff line number Diff line change
Expand Up @@ -140,28 +140,52 @@ impl super::Nexus {
self.timeseries_client
.oxql_query(query)
.await
.map(|result| {
// TODO-observability: The query method returns information
// about the duration of the OxQL query and the database
// resource usage for each contained SQL query. We should
// publish this as a timeseries itself, so that we can track
// improvements to query processing.
//
// For now, simply return the tables alone.
result.tables
})
.map_err(|e| match e {
oximeter_db::Error::DatabaseUnavailable(_)
| oximeter_db::Error::Connection(_) => {
Error::ServiceUnavailable {
internal_message: e.to_string(),
}
}
oximeter_db::Error::Oxql(_)
| oximeter_db::Error::TimeseriesNotFound(_) => {
Error::invalid_request(e.to_string())
}
_ => Error::InternalError { internal_message: e.to_string() },
})
// TODO-observability: The query method returns information
// about the duration of the OxQL query and the database
// resource usage for each contained SQL query. We should
// publish this as a timeseries itself, so that we can track
// improvements to query processing.
//
// For now, simply return the tables alone.
.map(|result| result.tables)
.map_err(map_timeseries_err)
}

/// Run an OxQL query against the timeseries database, scoped to a specific project.
pub(crate) async fn timeseries_query_project(
&self,
_opctx: &OpContext,
project_lookup: &lookup::Project<'_>,
query: impl AsRef<str>,
) -> Result<Vec<oxql_types::Table>, Error> {
// Ensure the user has read access to the project
let (authz_silo, authz_project) =
project_lookup.lookup_for(authz::Action::Read).await?;

// Ensure the query only refers to the project
let filtered_query = format!(
"{} | filter silo_id == \"{}\" && project_id == \"{}\"",
query.as_ref(),
authz_silo.id(),
authz_project.id()
);

self.timeseries_client
.oxql_query(filtered_query)
.await
.map(|result| result.tables)
.map_err(map_timeseries_err)
}
}

fn map_timeseries_err(e: oximeter_db::Error) -> Error {
match e {
oximeter_db::Error::DatabaseUnavailable(_)
| oximeter_db::Error::Connection(_) => Error::unavail(&e.to_string()),
oximeter_db::Error::Oxql(_)
| oximeter_db::Error::TimeseriesNotFound(_) => {
Error::invalid_request(e.to_string())
}
_ => Error::internal_error(&e.to_string()),
}
}
27 changes: 27 additions & 0 deletions nexus/src/external_api/http_entrypoints.rs
Original file line number Diff line number Diff line change
Expand Up @@ -5544,6 +5544,33 @@ impl NexusExternalApi for NexusExternalApiImpl {
.await
}

async fn timeseries_query(
rqctx: RequestContext<ApiContext>,
query_params: Query<params::ProjectSelector>,
body: TypedBody<params::TimeseriesQuery>,
) -> Result<HttpResponseOk<views::OxqlQueryResult>, HttpError> {
let apictx = rqctx.context();
let handler = async {
let nexus = &apictx.context.nexus;
let opctx =
crate::context::op_context_for_external_api(&rqctx).await?;
let project_selector = query_params.into_inner();
let query = body.into_inner().query;
let project_lookup =
nexus.project_lookup(&opctx, project_selector)?;
nexus
.timeseries_query_project(&opctx, &project_lookup, &query)
.await
.map(|tables| HttpResponseOk(views::OxqlQueryResult { tables }))
.map_err(HttpError::from)
};
apictx
.context
.external_latencies
.instrument_dropshot_handler(&rqctx, handler)
.await
}

// Updates

async fn system_update_put_repository(
Expand Down
23 changes: 19 additions & 4 deletions nexus/tests/integration_tests/endpoints.rs
Original file line number Diff line number Diff line change
Expand Up @@ -948,10 +948,14 @@ pub static DEMO_SILO_METRICS_URL: Lazy<String> = Lazy::new(|| {
)
});

pub static TIMESERIES_LIST_URL: Lazy<String> =
pub static TIMESERIES_QUERY_URL: Lazy<String> = Lazy::new(|| {
format!("/v1/timeseries/query?project={}", *DEMO_PROJECT_NAME)
});

pub static SYSTEM_TIMESERIES_LIST_URL: Lazy<String> =
Lazy::new(|| String::from("/v1/system/timeseries/schemas"));

pub static TIMESERIES_QUERY_URL: Lazy<String> =
pub static SYSTEM_TIMESERIES_QUERY_URL: Lazy<String> =
Lazy::new(|| String::from("/v1/system/timeseries/query"));

pub static DEMO_TIMESERIES_QUERY: Lazy<params::TimeseriesQuery> =
Expand Down Expand Up @@ -2208,7 +2212,18 @@ pub static VERIFY_ENDPOINTS: Lazy<Vec<VerifyEndpoint>> = Lazy::new(|| {
},

VerifyEndpoint {
url: &TIMESERIES_LIST_URL,
url: &TIMESERIES_QUERY_URL,
visibility: Visibility::Protected,
unprivileged_access: UnprivilegedAccess::None,
allowed_methods: vec![
AllowedMethod::Post(
serde_json::to_value(&*DEMO_TIMESERIES_QUERY).unwrap()
),
],
},

VerifyEndpoint {
url: &SYSTEM_TIMESERIES_LIST_URL,
visibility: Visibility::Public,
unprivileged_access: UnprivilegedAccess::None,
allowed_methods: vec![
Expand All @@ -2217,7 +2232,7 @@ pub static VERIFY_ENDPOINTS: Lazy<Vec<VerifyEndpoint>> = Lazy::new(|| {
},

VerifyEndpoint {
url: &TIMESERIES_QUERY_URL,
url: &SYSTEM_TIMESERIES_QUERY_URL,
visibility: Visibility::Public,
unprivileged_access: UnprivilegedAccess::None,
allowed_methods: vec![
Expand Down
Loading

0 comments on commit 289146b

Please sign in to comment.