feat(deployments): separate db, storage, and reports deployments #192

Merged: 17 commits, Oct 25, 2024

andrewazores
Member

Fixes #114 #185 #188

Includes #184 #189 #191

@andrewazores added the feat, safe-to-test, and breaking change labels on Sep 6, 2024
@tthvo
Member

tthvo commented Sep 20, 2024

The helm test failure seems related to a timeout waiting for the Cryostat main pod to become healthy (i.e. the old setup with 6 containers). Maybe increasing the timeout could help with this test flakiness?

helm-extra-args: --timeout=600s

@andrewazores
Member Author

andrewazores commented Sep 23, 2024

I think that's already quite a long timeout - it shouldn't take 10 minutes for things to get ready. Maybe if the network connection for the test runner is slow and it takes a long time to pull the container images, but even then, 10 minutes still seems like a lot.

I say we get #197 integrated and see if that doesn't fix the test reliability. If things are still timing out and we cannot find any other culprit then maybe increasing the timeout is the last resort.

@tthvo
Member

tthvo commented Sep 23, 2024

I think CI needs updating to add the --force option to helm, or we must wait for #200.

@tthvo
Member

tthvo commented Sep 24, 2024

I think CI needs updating to add the --force option to helm, or we must wait for #200.

Sorry, as per #201 (comment), this is not the correct behaviour of --force in Helm v3. I think we do need to rename the deployment for a seamless upgrade (which is what CI tests)...

@andrewazores marked this pull request as ready for review on September 27, 2024 17:01
Member

@ebaron ebaron left a comment

A few errors I encountered:

  • Using default chart values, I got a white screen when trying to view a recording in Grafana. Don't see any errors though, so not sure what's going on.
  • Occasionally, I get a CrashLoopBackOff on the Cryostat container with:
    /truststore does not exist; no certificates to import
    INFO exec -a "java" java -XX:MaxRAMPercentage=80.0 -XX:+UseParallelGC -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -XX:+ExitOnOutOfMemoryError -Dquarkus.http.host=0.0.0.0 -Djava.util.logging.manager=org.jboss.logmanager.LogManager -cp "." -jar /deployments/quarkus-run.jar 
    INFO running in /deployments
    __  ____  __  _____   ___  __ ____  ______ 
    --/ __ \/ / / / _ | / _ \/ //_/ / / / __/ 
    -/ /_/ / /_/ / __ |/ , _/ ,< / /_/ /\ \   
    --\___\_\____/_/ |_/_/|_/_/|_|\____/___/   
    2024-10-11 21:10:18,291 INFO  [io.und.websockets] (main) UT026003: Adding annotated server endpoint class io.cryostat.ws.MessagingServer for path /api/notifications
    2024-10-11 21:10:19,545 WARN  [io.agr.pool] (agroal-11) Datasource '<default>': FATAL: password authentication failed for user "cryostat"
    2024-10-11 21:10:19,603 WARN  [io.agr.pool] (agroal-11) Datasource '<default>': FATAL: password authentication failed for user "cryostat"
    2024-10-11 21:10:19,638 ERROR [io.qua.run.Application] (main) Failed to start application: java.lang.RuntimeException: Failed to start quarkus
    at io.quarkus.runner.ApplicationImpl.doStart(Unknown Source)
    at io.quarkus.runtime.Application.start(Application.java:101)
    at io.quarkus.runtime.ApplicationLifecycleManager.run(ApplicationLifecycleManager.java:119)
    at io.quarkus.runtime.Quarkus.run(Quarkus.java:71)
    at io.quarkus.runtime.Quarkus.run(Quarkus.java:44)
    at io.quarkus.runtime.Quarkus.run(Quarkus.java:124)
    at io.quarkus.runner.GeneratedMain.main(Unknown Source)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
    at java.base/java.lang.reflect.Method.invoke(Method.java:580)
    at io.quarkus.bootstrap.runner.QuarkusEntryPoint.doRun(QuarkusEntryPoint.java:62)
    at io.quarkus.bootstrap.runner.QuarkusEntryPoint.main(QuarkusEntryPoint.java:33)
    Caused by: org.flywaydb.core.internal.exception.FlywaySqlException: Unable to obtain connection from database: FATAL: password authentication failed for user "cryostat"
    ----------------------------------------------------------------------------------------------------
    SQL State  : 28P01
    Error Code : 0
    Message    : FATAL: password authentication failed for user "cryostat"
    
    at org.flywaydb.core.internal.jdbc.JdbcUtils.openConnection(JdbcUtils.java:71)
    at org.flywaydb.core.internal.jdbc.JdbcConnectionFactory.<init>(JdbcConnectionFactory.java:76)
    at org.flywaydb.core.FlywayExecutor.execute(FlywayExecutor.java:138)
    at org.flywaydb.core.Flyway.migrate(Flyway.java:164)
    at io.quarkus.flyway.runtime.FlywayRecorder.doStartActions(FlywayRecorder.java:136)
    at io.quarkus.deployment.steps.FlywayProcessor$startActions2099152139.deploy_0(Unknown Source)
    at io.quarkus.deployment.steps.FlywayProcessor$startActions2099152139.deploy(Unknown Source)
    ... 11 more
    Caused by: org.postgresql.util.PSQLException: FATAL: password authentication failed for user "cryostat"
    at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:711)
    at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:213)
    at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:268)
    at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:54)
    at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:273)
    at org.postgresql.Driver.makeConnection(Driver.java:446)
    at org.postgresql.Driver.connect(Driver.java:298)
    at io.agroal.pool.ConnectionFactory.createConnection(ConnectionFactory.java:225)
    at io.agroal.pool.ConnectionPool$CreateConnectionTask.call(ConnectionPool.java:580)
    at io.agroal.pool.ConnectionPool$CreateConnectionTask.call(ConnectionPool.java:561)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
    at io.agroal.pool.util.PriorityScheduledExecutor.beforeExecute(PriorityScheduledExecutor.java:75)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.base/java.lang.Thread.run(Thread.java:1583)
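
Reading the trace above from the bottom up: the innermost "Caused by" is the PostgreSQL driver's PSQLException, and SQL state 28P01 is PostgreSQL's invalid_password code, so the crash points at a credential mismatch between Cryostat and the database rather than at networking. Why the credentials differ is not established in this thread. A minimal sketch of pulling that summary out of such a log follows; the function name `summarize` and the abbreviated `SAMPLE_LOG` are illustrative assumptions, not part of Cryostat or its chart.

```python
import re

# Abbreviated excerpt of the log above: only the lines this sketch inspects.
SAMPLE_LOG = """\
2024-10-11 21:10:19,638 ERROR [io.qua.run.Application] (main) Failed to start application: java.lang.RuntimeException: Failed to start quarkus
Caused by: org.flywaydb.core.internal.exception.FlywaySqlException: Unable to obtain connection from database: FATAL: password authentication failed for user "cryostat"
SQL State  : 28P01
Caused by: org.postgresql.util.PSQLException: FATAL: password authentication failed for user "cryostat"
"""

# "Caused by: <exception class>: <message>"
CAUSE_RE = re.compile(r"^\s*Caused by: ([\w.$]+): (.*)$")
# "SQL State  : <code>" as printed by Flyway
STATE_RE = re.compile(r"^\s*SQL State\s*:\s*(\w+)\s*$")


def summarize(log: str) -> dict:
    """Return the last (innermost) 'Caused by' entry and the SQL state, if present."""
    causes = []
    sql_state = None
    for line in log.splitlines():
        cause = CAUSE_RE.match(line)
        if cause:
            causes.append((cause.group(1), cause.group(2)))
            continue
        state = STATE_RE.match(line)
        if state:
            sql_state = state.group(1)
    return {
        "root_cause": causes[-1] if causes else None,
        "sql_state": sql_state,
    }
```

Calling `summarize(SAMPLE_LOG)` yields the PSQLException as the root cause together with the 28P01 state, which is usually enough to tell an authentication failure apart from a connection failure.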
    

charts/cryostat/templates/NOTES.txt (review comment, outdated and resolved)
@andrewazores
Member Author

Using default chart values, I got a white screen when trying to view a recording in Grafana. Don't see any errors though, so not sure what's going on.

Hmm, looks like the port-forward/proxying isn't working properly. The browser goes to /grafana/, but the response it receives back is the index.html of cryostat-web, not Grafana. I've been testing installation on OpenShift with the optional features enabled (helm install cryostat --set authentication.openshift.enabled=true --set core.route.enabled=true ./charts/cryostat/) and the Grafana setup still works fine there. I'll take a closer look at why it's broken with the defaults/port-forward/oauth2-proxy.
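
One rough way to confirm the symptom described above without a browser is to compare the page titles served at / and /grafana/ through the port-forward. The sketch below is only an illustration: the helper names `page_title` and `fetch_title` are hypothetical, the base URL in the usage note is an assumption about the local port-forward, and an authenticating proxy in front of the deployment may redirect or reject unauthenticated requests.

```python
import re
import urllib.request
from typing import Optional


def page_title(html: str) -> Optional[str]:
    """Extract the text of the first <title> element, or None if there is none."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def fetch_title(url: str, timeout: float = 10.0) -> Optional[str]:
    """Fetch a URL and return its page title (follows redirects by default)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    return page_title(body)
```

For example, with a port-forward assumed to listen on http://localhost:8080, comparing `fetch_title("http://localhost:8080/")` with `fetch_title("http://localhost:8080/grafana/")` shows whether both paths are answered by the same page. Identical titles would be consistent with /grafana/ being served the cryostat-web index.html.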

Member

@ebaron ebaron left a comment

Looks good!

Labels: breaking change, feat, safe-to-test
Projects: Status: Done
Linked issues: [Task] Deploy oauth2-proxy in front of Cryostat, storage, db, jfr-datasource, and Grafana
3 participants