COSI-65, COSI-46, COSI-21: Add GRPC Metrics Instrumentation and Documentation Updates #83
base: improvement/COSI-53-misc-ci-improvements
driverPrefix = flag.String("driver-prefix", defaultDriverPrefix, "prefix for COSI driver, e.g. <prefix>.scality.com, default cosi.scality.com")
driverMetricsAddress = flag.String("driver-metrics-address", defaultMetricsAddress, "The address to expose Prometheus metrics, default: :8080")
driverMetricsPath = flag.String("driver-metrics-path", defaultMetricsPath, "path for the metrics endpoint, default: /metrics")
driverMetricsPrefix = flag.String("driver-custom-metrics-prefix", defaultMetricsPrefix, "prefix for the metrics, default: scality_cosi_driver_")
For now this is a placeholder; it will be used for S3 and IAM metrics.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## improvement/COSI-53-misc-ci-improvements #83 +/- ##
============================================================================
+ Coverage 93.40% 93.67% +0.26%
============================================================================
Files 9 10 +1
Lines 637 680 +43
============================================================================
+ Hits 595 637 +42
- Misses 36 37 +1
Partials 6 6
1f69f34 to 87f52de
Update unit test for run methods usage to include prometheus registry.
87f52de to f5d58e9
e48db12 to e4b65e3
var (
	S3RequestsTotal *prometheus.CounterVec
	S3RequestDuration *prometheus.HistogramVec
	IAMRequestsTotal *prometheus.CounterVec
	IAMRequestDuration *prometheus.HistogramVec
)
Placeholders for S3 and IAM metrics.
var _ = Describe("InitializeMetrics", func() {
	var (
		registry *prometheus.Registry
		driverMetricsPath string
	)

	BeforeEach(func() {
		registry = prometheus.NewRegistry()
		driverMetricsPath = "/metrics"
	})

	It("should serve metrics via an HTTP endpoint", func() {
		addr := "127.0.0.1:0"
		server, err := metrics.StartMetricsServerWithRegistry(addr, registry, driverMetricsPath)
		Expect(err).NotTo(HaveOccurred())
		Expect(server).NotTo(BeNil())

		resp, err := http.Get("http://" + server.Addr + driverMetricsPath)
		Expect(err).NotTo(HaveOccurred())
		Expect(resp.StatusCode).To(Equal(http.StatusOK))

		err = server.Close()
		Expect(err).NotTo(HaveOccurred())
	})
})
Placeholder tests for InitializeMetrics, to be extended for S3 and IAM.
log_and_run() {
  echo "Running: $*" | tee -a "$LOG_FILE"
Not sure whether I already commented on this, but wouldn't set -x be simpler for printing each command as well? Then the output of the whole script could be piped through tee into a log file.
# Wait a few seconds to ensure port-forward is established
log_and_run sleep 5
You can test the port here.
# wait for $localport to become available
while ! nc -vz localhost $localport > /dev/null 2>&1 ; do
# echo sleeping
sleep 0.1
done
# This would show that the port is open
# nmap -sT -p $localport localhost
@@ -116,6 +116,14 @@ jobs:
run: |
  .github/scripts/e2e_tests_brownfield_use_case.sh

# the script accepts number of requests for APIs for CREATE_BUCKET, DELETE_BUCKET, GET_INFO
- # the script accepts number of requests for APIs for CREATE_BUCKET, DELETE_BUCKET, GET_INFO
+ # the script accepts number of requests for APIs for CREATE_BUCKET, DELETE_BUCKET, GET_INFO
Or is there a missing word?
# Example below we we are testing for 2 CREATE_BUCKET, 1 DELETE_BUCKET,
# 1 GET_INFO, 2 GRANT_ACCESS and 2 REVOKE_ACCESS API counts
Duplicated word ("we we"). Also, use a list; it's easier to read.
- # Example below we we are testing for 2 CREATE_BUCKET, 1 DELETE_BUCKET,
- # 1 GET_INFO, 2 GRANT_ACCESS and 2 REVOKE_ACCESS API counts
+ # Example below we are testing for those API counts:
+ # - 2 CREATE_BUCKET
+ # - 1 DELETE_BUCKET
+ # - 1 GET_INFO
+ # - 2 GRANT_ACCESS
+ # - 2 REVOKE_ACCESS
@@ -72,6 +72,14 @@ jobs:
run: |
  .github/scripts/verify_helm_install.sh

# the script accepts number of requests for APIs for CREATE_BUCKET, DELETE_BUCKET, GET_INFO
Same comment as in the previous file.
defaultDriverAddress = "unix:///var/lib/cosi/cosi.sock"
defaultDriverPrefix = "cosi"
defaultMetricsPath = "/metrics"
defaultMetricsPrefix = "scality_cosi_driver"
- defaultMetricsPrefix = "scality_cosi_driver"
+ defaultMetricsPrefix = "scality_cosi_driver_"
The documentation Markdown file needs the same change. Alternatively, the trailing underscore should be removed from the help text in flag.String.
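One way to keep a default constant and its help text from drifting apart, as they did here, is to build the help text from the constant itself. The sketch below is only an illustration: helpWithDefault is a hypothetical helper that is not in the PR, and the constant value with the trailing underscore is just one of the two options the reviewer mentions.

```go
package main

import (
	"flag"
	"fmt"
)

// Assumed value for illustration; the review leaves open whether the
// trailing underscore belongs in the constant or should be dropped from the help text.
const defaultMetricsPrefix = "scality_cosi_driver_"

// helpWithDefault is a hypothetical helper (not from the PR) that appends the
// actual default value to a flag description, so the two cannot disagree.
func helpWithDefault(description, def string) string {
	return fmt.Sprintf("%s, default: %s", description, def)
}

func main() {
	fs := flag.NewFlagSet("driver", flag.ContinueOnError)
	prefix := fs.String(
		"driver-custom-metrics-prefix",
		defaultMetricsPrefix,
		helpWithDefault("prefix for the metrics", defaultMetricsPrefix),
	)
	// No arguments are passed, so the flag keeps its default value.
	if err := fs.Parse(nil); err != nil {
		panic(err)
	}

	fmt.Println(*prefix)
	fmt.Println(fs.Lookup("driver-custom-metrics-prefix").Usage)
}
```

The same helper would apply to the driver-prefix flag discussed in the next comment, where the help text states a default that differs from the constant.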
- driverAddress = flag.String("driver-address", "unix:///var/lib/cosi/cosi.sock", "driver address for the socket")
- driverPrefix = flag.String("driver-prefix", "", "prefix for COSI driver, e.g. <prefix>.scality.com")
+ driverAddress = flag.String("driver-address", defaultDriverAddress, "driver address for the socket file, default: unix:///var/lib/cosi/cosi.sock")
+ driverPrefix = flag.String("driver-prefix", defaultDriverPrefix, "prefix for COSI driver, e.g. <prefix>.scality.com, default cosi.scality.com")
- driverPrefix = flag.String("driver-prefix", defaultDriverPrefix, "prefix for COSI driver, e.g. <prefix>.scality.com, default cosi.scality.com")
+ driverPrefix = flag.String("driver-prefix", defaultDriverPrefix, "prefix for COSI driver, e.g. <prefix>.scality.com, default: cosi")
## Additional Resource

- [gRPC-Go Prometheus Metrics](https://github.com/grpc-ecosystem/go-grpc-prometheus)
- - [gRPC-Go Prometheus Metrics](https://github.com/grpc-ecosystem/go-grpc-prometheus)
+ - [gRPC-Go Prometheus Metrics](https://github.com/grpc-ecosystem/go-grpc-middleware)
Change the link; it currently points to the deprecated library.
srvMetrics := grpcprom.NewServerMetrics(
	grpcprom.WithServerHandlingTimeHistogram(
		grpcprom.WithHistogramBuckets([]float64{0.001, 0.01, 0.1, 0.3, 0.6, 1, 3, 6, 9, 20, 30, 60, 90, 120}),
Note that histograms can greatly increase cardinality.
If this runs on the client side, outside our control, it could be worth making these buckets configurable, so that users can easily change them if they want more or less cardinality, or if they want to target a specific range more precisely.
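As a rough sketch of what configurable buckets might involve, the snippet below parses a comma-separated list of upper bounds into a []float64 that could then be handed to grpcprom.WithHistogramBuckets in place of the hard-coded slice. The function parseBuckets and the idea of reading the list from a flag or environment variable are assumptions for illustration, not something this PR implements. Bounds are required to be strictly increasing, because Prometheus rejects histogram buckets that are not in increasing order.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// parseBuckets is a hypothetical helper (not from the PR). It converts a
// comma-separated string such as "0.001,0.01,0.1,1" into histogram bucket
// upper bounds and checks that they are strictly increasing.
func parseBuckets(s string) ([]float64, error) {
	parts := strings.Split(s, ",")
	buckets := make([]float64, 0, len(parts))
	for _, p := range parts {
		p = strings.TrimSpace(p)
		if p == "" {
			continue // tolerate stray commas and surrounding whitespace
		}
		v, err := strconv.ParseFloat(p, 64)
		if err != nil {
			return nil, fmt.Errorf("invalid bucket %q: %w", p, err)
		}
		if math.IsNaN(v) {
			return nil, fmt.Errorf("invalid bucket %q: not a number", p)
		}
		if n := len(buckets); n > 0 && v <= buckets[n-1] {
			return nil, fmt.Errorf("bucket %v is not greater than previous bucket %v", v, buckets[n-1])
		}
		buckets = append(buckets, v)
	}
	if len(buckets) == 0 {
		return nil, fmt.Errorf("no histogram buckets given")
	}
	return buckets, nil
}

func main() {
	// In a real driver the string would come from a flag or environment variable.
	buckets, err := parseBuckets("0.001, 0.01, 0.1, 1, 10")
	if err != nil {
		panic(err)
	}
	fmt.Println(buckets)
}
```

Fewer buckets mean fewer time series per label combination, which is the cardinality trade-off raised in the comment above.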
The old PR is #69
go.* dependency updates are responsible for 440 line changes.
This PR introduces metrics instrumentation for gRPC API calls and adds comprehensive testing and documentation updates. Key changes include:
These changes provide better observability and make it easier to monitor the driver’s performance and behavior in production environments.