Any way to get pulse of inference containers on Azure? #19

Open
dthaler opened this issue Jun 5, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@dthaler
Collaborator

dthaler commented Jun 5, 2024

Is there any way to get a pulse on the inference containers on Azure? (Ask Patrick, or Michelle on GitHub?)

@dthaler
Collaborator Author

dthaler commented Jun 12, 2024

@pastorep @micya Scott suggested pinging you two. Any way to tell what nodes OrcaHello is actively monitoring? The status endpoint is not node specific, and the detections endpoint doesn't give any information on what is actively being monitored.

@micya
Member

micya commented Jun 12, 2024

> Is there any way to get a pulse on the inference containers on Azure? (Ask Patrick, or Michelle on GitHub?)

The inference system runs on the AKS cluster inference-system-AKS, in the resource group LiveSRKWNotificationSystem. The deployment guide at https://github.com/orcasound/aifororcas-livesystem/tree/main/InferenceSystem#deploying-an-updated-docker-build-to-azure-kubernetes-service is still accurate. Each location has its own namespace.
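Since each location has its own namespace, one rough way to take the cluster's pulse is to list pods across all namespaces and see which namespaces have something running. The sketch below assumes `kubectl` is installed and already authenticated against the cluster. The helper names (`fetch_pods`, `summarize_pods`, `running_namespaces`) are made up for illustration and are not part of the OrcaHello code. The result will also include system namespaces such as `kube-system`, so those need to be filtered out by hand.

```python
import json
import subprocess
from collections import Counter, defaultdict


def fetch_pods():
    """Return the pod list as a dict, using kubectl's JSON output.

    Assumes kubectl is on PATH and its current context points at the cluster.
    """
    result = subprocess.run(
        ["kubectl", "get", "pods", "--all-namespaces", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def summarize_pods(pod_list):
    """Count pods per namespace, grouped by pod phase (Running, Pending, ...)."""
    summary = defaultdict(Counter)
    for pod in pod_list.get("items", []):
        namespace = pod["metadata"]["namespace"]
        phase = pod.get("status", {}).get("phase", "Unknown")
        summary[namespace][phase] += 1
    return {namespace: dict(counts) for namespace, counts in summary.items()}


def running_namespaces(pod_list):
    """Return the sorted namespaces that have at least one Running pod."""
    summary = summarize_pods(pod_list)
    return sorted(ns for ns, counts in summary.items() if counts.get("Running", 0) > 0)
```

Calling `running_namespaces(fetch_pods())` on a machine with cluster access would list the namespaces that currently have a running pod. This only shows that a container is up, not that it is successfully processing audio for its location.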

> @pastorep @micya Scott suggested pinging you two. Any way to tell what nodes OrcaHello is actively monitoring? The status endpoint is not node specific, and the detections endpoint doesn't give any information on what is actively being monitored.

Not programmatically. Currently, we have a separate container image per location (refer to yaml files here). The code in all images is the same, but each image contains a different config file (config files here). I'm not sure which config goes with which image, but you could probably poke through the images in Azure Container Registry to figure it out.

Suggestions:

  1. Unify the Docker images into one and inject per-location config at runtime.
  2. Consider having each location deployment report its own location. This can be an endpoint that returns a string (either in the inference service or as a sidecar).
  3. Consider having each location deployment send a heartbeat. A monitoring solution could just subscribe to the heartbeats to figure out which locations are active. The heartbeat might also just send the location string.
  4. You may also consider a more comprehensive Kubernetes cluster monitoring solution. But since our usage is fairly simple and we treat our cluster as cattle, not pets, I suggest skipping this in favor of the heartbeat system proposed above.
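To make suggestions 1 to 3 a bit more concrete, here is a minimal sketch using only the Python standard library. It is not taken from the OrcaHello code base, and every name in it is an assumption for illustration: the `ORCA_LOCATION` environment variable, the `/location` path, the heartbeat payload shape, and the 300-second freshness window. The location is read from the environment at runtime (suggestion 1), served as a plain string (suggestion 2), and packaged into a heartbeat that a monitor can use to decide which locations are active (suggestion 3).

```python
import os
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


def load_location(env=os.environ):
    """Read the location injected at runtime (ORCA_LOCATION is a hypothetical name)."""
    return env.get("ORCA_LOCATION", "unknown")


def build_heartbeat(location, now=None):
    """Build a heartbeat payload carrying the location string and a timestamp."""
    return {"location": location, "timestamp": time.time() if now is None else now}


def active_locations(heartbeats, now, max_age_seconds=300):
    """Monitor side: return locations whose latest heartbeat is recent enough."""
    latest = {}
    for heartbeat in heartbeats:
        location = heartbeat["location"]
        previous = latest.get(location, float("-inf"))
        latest[location] = max(previous, heartbeat["timestamp"])
    return sorted(loc for loc, ts in latest.items() if now - ts <= max_age_seconds)


class LocationHandler(BaseHTTPRequestHandler):
    """Serve the location string at /location (hypothetical path)."""

    def do_GET(self):
        if self.path == "/location":
            body = load_location().encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)


def serve(port=8080):
    """Run the endpoint; could live in the inference service or in a sidecar."""
    HTTPServer(("", port), LocationHandler).serve_forever()
```

How the heartbeat is transported (a queue, an HTTP POST, a database row) is left open here. The point is only that once each deployment reports its own location, the monitor no longer needs to know which image carries which config.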
