Commit
Exporter: performance improvements for big workspaces (#3167)
* Exporter: performance improvements for big workspaces

  The first part of the performance improvements is parallel generation of resources. For each identified resource, the body is generated separately from other resources using `EXPORTER_RESOURCE_HANDLERS` goroutines (default is 50). This helps with processing references to other resources, although complex resources such as jobs still have performance problems that will be addressed in later commits. Generated bodies are sent to dedicated channels that are responsible for writing the code into files.

  Next steps:

  - reimplement the `resourceApproximation` structure to avoid linear search.
  - optimize reference search.

* Add error code handling
* Don't use os.Stat, just count resources
* More aggressive caching of some checks, such as users & user directories
* Rewrite resource approximation has/append
* Use dedicated channels only for some resources; everything else goes to a shared one
* Rewrite references lookup to avoid iteration when a direct lookup is possible

  Prefix lookup & case-insensitive lookup are still done by iterating over resources
* Reorder references so that the most expensive ones are not executed before direct lookup

  Also, for case-insensitive matching, first try a direct lookup before iterating
* Fix test of cluster import
* Start to emit notebooks & workspace files during the listing, without waiting for it to finish
* Further optimize emitting of notebooks/workspace files
* Introduce lightweight check for user existence
* Emit workspace objects from separate goroutines to avoid the workspace listing getting stuck on users lookup
* Reorganize the order of checks in the `Emit` function

  It now checks whether a service is enabled before performing other checks. This should decrease the number of lookups in the state approximation
* Move parallel workspace listing to the exporter implementation
* Fix test
* Another fix
* Another attempt to fix tests
* Small adjustments
* Control submissions to the default channel to avoid deadlock
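
The parallel generation described in the first item can be sketched roughly as below. This is a minimal illustration, not the exporter's actual code: the `resource` type and the `handlerCount`, `generateBody` and `generateAll` functions are hypothetical names, and only the `EXPORTER_RESOURCE_HANDLERS` variable and its default of 50 come from the commit message. A pool of goroutines generates bodies, and a single goroutine collects them (standing in for the file writer).

```go
package main

import (
	"fmt"
	"os"
	"sort"
	"strconv"
	"sync"
)

// resource is a hypothetical stand-in for an identified resource.
type resource struct {
	Type string
	ID   string
}

// handlerCount reads EXPORTER_RESOURCE_HANDLERS and falls back to 50
// when the variable is unset or not a positive integer.
func handlerCount() int {
	v, err := strconv.Atoi(os.Getenv("EXPORTER_RESOURCE_HANDLERS"))
	if err == nil && v > 0 {
		return v
	}
	return 50
}

// generateBody is a placeholder for the expensive per-resource generation.
func generateBody(r resource) string {
	return fmt.Sprintf("resource %q %q {}", r.Type, r.ID)
}

// generateAll fans resources out to `workers` goroutines and funnels the
// generated bodies through one channel to a single collecting goroutine.
func generateAll(resources []resource, workers int) []string {
	in := make(chan resource)
	out := make(chan string, workers)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := range in {
				out <- generateBody(r)
			}
		}()
	}

	// Single writer: the only goroutine that appends to `written`.
	var written []string
	done := make(chan struct{})
	go func() {
		defer close(done)
		for body := range out {
			written = append(written, body)
		}
	}()

	for _, r := range resources {
		in <- r
	}
	close(in)  // no more work for the handlers
	wg.Wait()  // all bodies have been sent
	close(out) // lets the writer finish
	<-done

	sort.Strings(written) // completion order is nondeterministic
	return written
}

func main() {
	rs := []resource{{"databricks_job", "a"}, {"databricks_cluster", "b"}}
	for _, body := range generateAll(rs, handlerCount()) {
		fmt.Println(body)
	}
}
```

Keeping a single consumer per output channel means no locking is needed around the write itself; the handlers only contend on the channels.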
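
The reference lookup change (direct lookup first, iteration only as a fallback for case-insensitive matching) might look something like the following. This is a sketch under assumptions: the `entry` and `index` types, the key layout, and the `find` signature are invented for illustration and are not the actual `resourceApproximation` implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// entry is a hypothetical record of an attribute value of a known resource.
type entry struct {
	Resource  string
	Attribute string
	Value     string
}

// index keeps a map for exact lookups and a slice for fallback scans.
type index struct {
	exact   map[string]entry
	entries []entry
}

// key builds the map key used for direct lookups (assumed layout).
func key(resource, attribute, value string) string {
	return resource + "|" + attribute + "|" + value
}

func newIndex() *index {
	return &index{exact: map[string]entry{}}
}

// add registers an entry in both the map and the scan list.
func (ix *index) add(e entry) {
	ix.exact[key(e.Resource, e.Attribute, e.Value)] = e
	ix.entries = append(ix.entries, e)
}

// find tries the map lookup first. Only when that misses and the caller
// asked for case-insensitive matching does it iterate over all entries.
func (ix *index) find(resource, attribute, value string, ignoreCase bool) (entry, bool) {
	if e, ok := ix.exact[key(resource, attribute, value)]; ok {
		return e, true
	}
	if !ignoreCase {
		return entry{}, false
	}
	for _, e := range ix.entries {
		if e.Resource == resource && e.Attribute == attribute &&
			strings.EqualFold(e.Value, value) {
			return e, true
		}
	}
	return entry{}, false
}

func main() {
	ix := newIndex()
	ix.add(entry{"databricks_user", "user_name", "Jane@Example.com"})

	_, exact := ix.find("databricks_user", "user_name", "Jane@Example.com", false)
	_, folded := ix.find("databricks_user", "user_name", "jane@example.com", true)
	_, miss := ix.find("databricks_user", "user_name", "jane@example.com", false)
	fmt.Println(exact, folded, miss)
}
```

With this layout, the common exact-match case costs one map access, and the linear scan is paid only for case-insensitive misses.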