Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace the OpenTelemetry protobuf library #5658

Open
GFriedrich opened this issue Nov 12, 2024 · 12 comments
Open

Replace the OpenTelemetry protobuf library #5658

GFriedrich opened this issue Nov 12, 2024 · 12 comments
Labels
waiting for team An issue we need members of the team to review

Comments

@GFriedrich
Copy link

Please describe the feature request.
Get rid of the OpenTelemetry protobuf library as it is not intended for public use.

Rationale
Recently #5493 was opened to express that some vulnerable protobuf version is used. The ticket was closed though, because the protobuf version is coming from the OpenTelemetry protobuf library. Now though, the OpenTelemetry maintainers decided that this library is not intended for public use (see open-telemetry/opentelemetry-proto-java#20). Therefore I guess the only option here is to get rid of this library and generating the files for protobuf by yourselves.

@jonatan-ivanov
Copy link
Member

jonatan-ivanov commented Nov 13, 2024

Thank you for the issue!

It seems CVE-2024-7254 is about parsing (untrusted) data. I don't think Micrometer is affected since it does not parse any protobuf payload (deserializing), it does the opposite, it produces protobuf payload (serializing). From security scanners perspective, this seems like a false-positive to me. It also worth to mention that other than being affected or not, from Micrometer's perspective, this should not be a CVE since users should be able to upgrade the protobuf dependency (protobuf-java:3.25.5 should be fixed).

We need to discuss internally how to handle the new opentelemetry-proto guidance. According to it:

[...] please generate your own bindings, consulting grpc codegen and possibly build.gradle.kts

and as you mentioned, we should be able to simply generate the code from the proto specs using grpc codegen. Unfortunately, this is not as simple as it sounds. I asked around the OTel Slack, it seems that:

  • The types that otel-java itself uses are not generated using codegen either but manually created and maintained.
  • otel-java cannot rely on the default grpc codegen tooling because upgrading codegen can introduce binary incompatible changes in the generated code (this happened in OTel multiple occasions in the past).
  • The spec has a few customizations, in some cases (traceId/spanID), it defines a different serialized representation than the default behavior of codegen.
  • Generating also means that instead of depending on binary distributions (jars) we need to depend on source distributions (zipped proto files) where the available Java tooling is not as good as it is for binaries.
  • otel-java follows proto updates manually and the code review process tries to prevent manually maintained code being incompatible with the proto definition.

Please feel free to share if you have any idea/comment/question/etc.

@patpatpat123
Copy link

Hello team,

Just wanted to highlight something the developers of the library are saying:
From the developers of opentelemetry-proto-java, it seems that (I am just quoting):

unfortunately it looks like this artifact io.micrometer:micrometer-registry-otlp didn't follow [our guidance](https://github.com/open-telemetry/opentelemetry-proto-java#support):

We have no intention of eventually publishing stable artifacts. If you need guarantees, please generate your own bindings, consulting [grpc codegen](https://grpc.io/docs/languages/java/generated-code/#codegen) and possibly [build.gradle.kts](https://github.com/open-telemetry/opentelemetry-proto-java/blob/main/build.gradle.kts)

I am afraid there is a gap between two teams here.

On one hand (proto java team) say the library is not ready for production use

On the other hand (micrometer team) might not know about this (recent) information.

If this is the case, is there a way to replace this library with something more "production ready"?

Please let us know, as OTLP is a widely used library.

Thank you and good day!

@GFriedrich
Copy link
Author

Hey @jonatan-ivanov,

thanks for getting back!
First of all it is good to know that the vulnerability can be excluded here. Thank you for the info.
Nevertheless though I want to say it is a rather bad developer experience that the first thing you see when using the library is a warning about a vulnerability. That simply gives a bad belly feeling when using it.
I do understand that a solution comes with some complexities and I'm not asking for getting rid of it from one day to the other. But I'm asking for that some planning is started on how to solve this issue, because keeping it that way is probably the worst solution.

From an external point of view I see two options:

In any case thanks so much for looking into this 🙏

@jonatan-ivanov
Copy link
Member

jonatan-ivanov commented Nov 13, 2024

@patpatpat123
As I mentioned in #5659 (comment), I think it worths to note that it seems the readme was clarified yesterday(7f3d19c) and a note about this was added earlier this year (e2efe5f). Before that, the readme only contained how to use the published artifact publicly available in Maven Central. We added OTLP support back in 2022 where there was no sign that this artifact should not be used. As of today, Maven Central does not have any note that users should not use these artifacts, neither the javadoc of the published sources. The artifact name does not indicate that it is not intended for eventual production use either and/or it should be only used by OTel test suites.

@GFriedrich
My comment above about the CVE is only about the context (Micrometer should not be affected) and how to fix it if you need it to be fixed right now. I think we should fix this even if Micrometer is seemingly not affected by the CVE, the discussion has already started, we are looking into this. To improve the dev experience, could you please report this to your security scanner (it should not report this, it is false-positive)?

Either you decide together with OTel that it probably makes sense to have some Java artifact with the protobuf Java classes

This is mostly an OTel decision. I think a binary distribution (jar in case of Java) of the OTLP spec should be available for everyone. I already started a discussion with otel-java folks, let's see where this will go.

@trask
Copy link
Contributor

trask commented Nov 14, 2024

@jonatan-ivanov I don't think this is a fair statement. The artifact version is (and has always been) tagged with -alpha. That is very standard way to indicate that an artifact is not indented for production use yet, including in semantic versioning "A pre-release version indicates that the version is unstable".

@jonatan-ivanov
Copy link
Member

Hey @trask, there are a lot of things going on in the comments, could you please let me know which statement is not fair so that I can correct it?

If I need to guess you mean the one about production use ("not intended for production use")? I think I should have written "eventual production use"(?). I edited it and added a note about the OTel test usage to make it clearer, but that's what I meant (since that was changed in the commits I linked) not the -alpha suffix. That's what I was talking about in the previous sentence too ("this artifact should not be used").

Right now, the repo says there is no intention of eventually publishing stable artifacts. To me this means that it is not just not indented for production use yet but ever. I think this is a completely different topic than the -alpha suffix in the version number especially if I consider OTel artifacts that have ~"stable parts" and also have the -alpha suffix.

@jsuereth
Copy link

We #3129 back in 2022 where there was no sign that this artifact should not be used.

I believe this is where the -alpha version comment comes from.

Wanted to say something else though:

A big motivator in opentelemetry for its proto solution was actually increased performance directly serializing the intermediate storage classes to proto vs. having to memory-map/convert to the generated classes and then serialize. That is:

  • OpenTelemetry does not use the generated protos directly as it exposes pure interfaces that can be evolved in a compatible way.
  • OpenTelemetry may optimise the internal storage of fields for the running of a program.
  • Protobuf generated classes aren't necessarily configured or optimised for these use cases.
  • Serializing directly from our interfaces to proto format was more efficient than transforming into generated classes, then serializing.

If micrometer is able to optimise using the same technique, it might be beneficial to everyone. I'm a bit overloaded, but could explain the basics on how this works if Micrometer wanted to create serialize-only templates from proto descriptions, a simplification of otel's code here: https://github.com/open-telemetry/opentelemetry-java/blob/main/buildSrc/src/main/kotlin/io/opentelemetry/gradle/ProtoFieldsWireHandler.kt

@patpatpat123
Copy link

Hey @jonatan-ivanov ,

Thank you for following up on this.
Would it make more sense to devise a strategy to decouple from its usage instead of trying to justify past, current, and future use of the library?

I think the micrometer OTLP community would benefit from that.

@jack-berg
Copy link

Any reason why micrometer can't just use io.opentelemetry:opentelemetry-exporter-otlp?

Micrometer could provide a mapping to the MetricData interface, and call OtlpHttpMetricExporter#export(Collection<MetricData>).

This would allow micrometer to leverage the high performance, hand-rolled serialization we use internally in opentelemetry-exporter-otlp, without having to worry about depending on io.opentelemetry.proto:opentelemetry-proto.

The only argument I can think of to use io.opentelemetry.proto:opentelemetry-proto instead of io.opentelemetry:opentelemetry-exporter-otlp, is that io.opentelemetry:opentelemetry-exporter-otlp has opinions about what HTTP client libraries can be used as the transport (OkHttp and JDK 11+ HttpClient). If micrometer really needs to support some other HTTP client library (I would challenge this but ultimately not my decision), then micrometer can generate their own proto bindings fairly simply as is done in opentelemetry-proto-java.

@jonatan-ivanov
Copy link
Member

@jsuereth Ah, I see, by saying "there was no sign that this artifact should not be used", what I meant here was no sign that it should not be used at all (other than OTel test suits). To me this is a different topic than the -alpha suffix since right now, the repo says there is no intention of eventually publishing stable artifacts.

Thank you for the details about proto and the offer to help. As a short term solution, it seems there will be another release of proto-java that contains the CVE fix since it seems there will be some changes in the proto specs.
As a long term solution, I'm planning to attend on the Spec SIG next week to see what can be done across the portfolio and if we can have a stable, easily consumable "binary" distribution of the OTLP types (does not need to be proto-generated but could be tested by proto-generated code, like what the SDK does with those types right now). Let's see where that discussion goes and if the optimizations you mentioned can be reused there, I'm also interested discussing them with you from Micrometer perspective.

@jonatan-ivanov
Copy link
Member

@jack-berg
I think that could be something we should look into. In Micrometer registries, users can inject their own http client (there is an interface for it) so the client aspect you mentioned is a valid concern. Also ThreadFactory but that might not be a concern. Dependencies could be also an interesting question, bringing in the whole OTel SDK in order to serialize OTLP is something I like to avoid.

@jack-berg
Copy link

Yup good points @jonatan-ivanov. There seems to be a tradeoff:

  • If micrometer uses io.opentelemetry:opentelemetry-exporter-otlp, we have:
    • Opinionated configuration, HTTP client library implementations
    • Extra io.opentelemetry:opentelemetry-exporter-sdk dependency, but get to avoid com.google.protobuf:protobuf-java
    • Higher performance, since we optimized serialization by hand-rolling it
    • Slightly less work since micrometer only needs to convert to MetricData
  • If micrometer generates its own proto bindings akin to io.opentelemetry.proto:opentelemetry-proto, we have:
    • More flexibility around configuration, HTTP client library implementations
    • Extra com.google.protobuf:protobuf-java dependency, but get to avoid io.opentelemetry:opentelemetry-exporter-sdk
    • Lower performance
    • Slightly more work, since micrometer needs to generate the bindings and convert

From a dependency size standpoint, we have:

  • io.opentelemetry:opentelemetry-exporter-sdk is 6.7k (opentelemetry-sdk) + 54.5k (opentelemetry-sdk-common) + 313k (opentelemetry-sdk-metrics) + 157k (opentelemetry-api) = ~531k.
  • io.opentelemetry.proto:opentelemetry-proto is 926k (opentelemetry-proto) + 1,742k (protobuf-java) = ~2,668k.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting for team An issue we need members of the team to review
Projects
None yet
Development

No branches or pull requests

6 participants