-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Telemetry work #22
base: main
Are you sure you want to change the base?
Telemetry work #22
Conversation
examples/AWSDriverExample/src/main/java/software/amazon/MetricsExample.java
Show resolved
Hide resolved
examples/AWSDriverExample/src/main/java/software/amazon/MetricsExample.java
Show resolved
Hide resolved
|
||
AWS Aurora are databases provided by AWS that are hosted in the AWS cloud. Currently available in Aurora are instances of MySQL and PostgreSQL databases. In order to enable the access from an application to an Aurora database, users need to set up a driver that has the ability to connect and interact with an Aurora database instance set up remotely. Existing drivers for MySQL and PostgreSQL work fine with Aurora instances, but do not take advantage of any of the additional features provided by the Aurora databases. | ||
|
||
Recently, AWS has created the AWS JDBC Wrapper driver. The AWS JDBC Wrapper is a driver application that is not a driver in itself, but more of a driver enhancer - an application that set up alongside a database driver, enables some Aurora-specific features to the database driver. Examples of those features are the ability of driver failover (link to doc) and integration with IAM and AWS Secrets Manager (link). The structure of the AWS JDBC Wrapper is organized in plugins, where each new wrapper feature is isolated and independent from other features. That allows users to select which wrapper features/plugins they require in their application workflow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we leave the responsibility of explaining what the wrapper is to the repo home page or do you think its helpful to have here as well?
|
||
Through the driver, applications interact with a remote database by sending requests. Those requests trigger a chain of execution where every one of the plugins enabled by the user is activated and executed, until the request reaches the database and returns. | ||
|
||
In its current form, the AWS JDBC Wrapper is much like a black box: during the chain of execution, there is actually no possibility to measure the individual performance of each plugin throughout the execution of a query. Which means that in the event of loss of performance while using the Wrapper, the troubleshooting process is manual, requiring users to dive deep inside application logs and manually inspecting the behavior of each plugin enabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this PR aims to fix this problem and this doc will be merged in with the solution, should we replace In its current form...
with Previously, the AWS...
?
|
||
OpenTelemetry aims to define vendor and language agnostic specs on how to monitor applications. | ||
|
||
The entities we introduce in this project are the following: Metrics and Traces. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we mention metrics before traces on this line, lets introduce them in the same order below (eg metrics section first, then traces section)
|
||
### Traces | ||
|
||
We define a trace as a defined sequence of an application execution, specifically defined by its start and its end. A trace can contain the entire application execution, or a single atomic operation. Traces can be related either by hierarchy either with symbolic links. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this read by heirarchy or with symbolic links
?
|
||
Collecting libraries will see the traces boundaries and measure the execution time for a particular trace. For a better understanding on the application behavior, different attributes can be attached to a trace, such as a success status and/or status code. | ||
|
||
In our project, traces are represented as `TelemetryContext` objects. A `TelemetryContext` object will be created/opened at every plugin invocation or execution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: "A TelemetryContext
object will be created for each wrapper invocation and each individual plugin invocation. For more details see the section on [plugin level tracing](#plugin-level-tracing)."
### Failover plugin | ||
|
||
Metrics: | ||
(Metric name | metric type | Unit (if applicable) | Dimensions | Description) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could consider removing this heading here and below, it seems quite clear without it. However, if we do keep it we could consider either removing the unit/dimension part (since neither are present in any of the metrics) or changing the dimensions part to Dimensions (if applicable)
wrapper/src/main/java/software/amazon/jdbc/util/telemetry/DefaultTelemetryFactory.java
Show resolved
Hide resolved
|
||
package software.amazon.jdbc.util.telemetry; | ||
|
||
public class NullTelemetryContext implements TelemetryContext { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're getting closer to a final stage of development, we probably need to add more comments to classes and methods. It's a general note.
wrapper/src/main/java/software/amazon/jdbc/util/telemetry/OpenTelemetryGauge.java
Show resolved
Hide resolved
Is the image file docs/images/telemetry-app.png is unused? |
try (final Connection conn = DriverManager.getConnection(POSTGRESQL_CONNECTION_STRING, properties); | ||
final Statement statement = conn.createStatement(); | ||
final ResultSet rs = statement.executeQuery(SQL_SLEEP)) { | ||
System.out.println(Util.getResult(rs)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does pg_sleep return anything here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pg_sleep returns an empty resultset I believe
@@ -89,6 +106,8 @@ public <T, E extends Exception> T execute(Class<T> resultClass, Class<E> excepti | |||
return jdbcMethodFunc.call(); | |||
} | |||
|
|||
totalCallsCounter.inc(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we move this to the beginning of the method and count it as a call to this plugin even if we enter the if-block on line 105?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we enter the if else block and execute the method directly, we are skipping all the data cache plugin specific code. I don't think we should count that towards our total calls.
if (PropertyDefinition.ENABLE_TELEMETRY.getBoolean(properties)) { | ||
if ("otlp".equalsIgnoreCase(PropertyDefinition.TELEMETRY_METRICS_BACKEND.getString(properties))) { | ||
return OpenTelemetryFactory.createCounter(name); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just checking, I'm guessing there isn't an xray createCounter or createGauge method? Not familiar with the differences between xray and open telemetry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OpenTelemetry is a standard for tracing and metrics.
XRay is an AWS tool that is only for tracing.
wrapper/src/main/java/software/amazon/jdbc/util/telemetry/OpenTelemetryMetrics.java
Show resolved
Hide resolved
# Proposed solution | ||
|
||
## Definitions | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following few paragraphs (lines 30-36) don't seem to be related Definitions. Could we move these up a level, right under Proposed solutions?
Metrics will also be added to our observability suite for the numeric data that will be collected throughout the application execution. | ||
|
||
For metrics, we follow the same standard defined by OpenTelemetry, which consists of: | ||
- Counters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have the plurality of the metrics be consistent?
- Gauge | ||
- Histogram | ||
|
||
Gauges will be used for situations where the numeric data is varies, but not incrementally throughout the execution of the application. An example of that would be a hit/miss ratio for a cache. Further information on those 3 entities, and to which kind of data they suit better can be found in the OpenTelemetry documentation (link). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the link for OpenTelemetry documentation missing?
this.properties = properties; | ||
this.monitorServiceSupplier = monitorServiceSupplier; | ||
|
||
this.invalidatedConnectionCounter = telemetryFactory.createCounter("efm.connection.invalidated"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to list the names of our counters in metrics-document.md? I don't see this counter being mentioned.
this.properties = properties; | ||
this.monitorServiceSupplier = monitorServiceSupplier; | ||
|
||
this.invalidatedConnectionCounter = telemetryFactory.createCounter("efm.connection.invalidated"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this counter updated in or used?
wrapper/src/main/java/software/amazon/jdbc/util/telemetry/DefaultTelemetryFactory.java
Show resolved
Hide resolved
wrapper/src/main/java/software/amazon/jdbc/util/telemetry/XRayTelemetryContext.java
Show resolved
Hide resolved
wrapper/src/main/java/software/amazon/jdbc/util/telemetry/XRayTelemetryContext.java
Show resolved
Hide resolved
examples/AWSDriverExample/src/main/java/software/amazon/MetricsExample.java
Show resolved
Hide resolved
} | ||
} | ||
|
||
public static void main(String[] args) throws SQLException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get a nullpointerexception when I run the example as-is even though you have the cluster endpoints and credentials specified
> Task :driverexample:MetricsExample.main() FAILED
Exception in thread "main" java.lang.NullPointerException: endpoint
at java.base/java.util.Objects.requireNonNull(Objects.java:233)
at io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporterBuilder.setEndpoint(OtlpGrpcSpanExporterBuilder.java:93)
at software.amazon.MetricsExample.<init>(MetricsExample.java:60)
at software.amazon.MetricsExample.main(MetricsExample.java:84)
@@ -41,6 +41,7 @@ | |||
import software.amazon.jdbc.mock.TestPluginThree; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test suites have some build errors right now.
wrapper/src/main/java/software/amazon/jdbc/util/telemetry/XRayTelemetryContext.java
Show resolved
Hide resolved
Were we planning on merging this into the bit-quill repo or the aws one? Currently it is targeting bit-quill/main which is not up to date with the aws repo |
Summary
Introducing Telemetry to the JDBC Wrapper
Description
Additional Reviewers
@karenc-bq
@aaron-congo
@aaronchung-bitquill
@sergiyv-improving
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.