nan min/max to handle possible nan ghosts #923
Conversation
📝 Walkthrough

The changes in this pull request modify the `PatchHierarchy` class so that `global_min` and `global_max` use `np.nanmin`/`np.nanmax`, ignoring possible NaN ghost values.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant PatchHierarchy
    participant Dataset
    User->>PatchHierarchy: Call global_min(qty)
    PatchHierarchy->>Dataset: Calculate np.nanmin(pd.dataset[:])
    Dataset-->>PatchHierarchy: Return minimum value
    PatchHierarchy-->>User: Return global minimum
    User->>PatchHierarchy: Call global_max(qty)
    PatchHierarchy->>Dataset: Calculate np.nanmax(pd.dataset[:])
    Dataset-->>PatchHierarchy: Return maximum value
    PatchHierarchy-->>User: Return global maximum
```
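The reason for the switch is that plain `min`/`max` are poisoned by NaN-filled ghost cells, while the `nan`-variants ignore them; a minimal sketch (the sample values are hypothetical, not from the PR):

```python
import numpy as np

# Patch data where ghost cells are filled with NaN (hypothetical values)
data = np.array([np.nan, 0.5, 2.0, -1.0, np.nan])

print(np.min(data))     # nan: plain min propagates the NaN ghosts
print(np.nanmin(data))  # -1.0: NaN entries are ignored
print(np.nanmax(data))  # 2.0
```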
```diff
+                    m = np.nanmax(pd.dataset[:])
                     first = False
                 else:
-                    m = max(m, pd.dataset[:].max())
+                    data_and_max = np.concatenate(([m], pd.dataset[:]))
+                    m = np.nanmax(data_and_max)
```
🛠️ Refactor suggestion

**Refactor to reduce code duplication with `global_min`.**

The `global_max` implementation duplicates the logic from `global_min`. Consider extracting a common utility function to handle both cases.
Here's a suggested refactoring:
```diff
+    def _global_minmax(self, qty, operation, time=None):
+        """Helper function for global min/max calculations
+
+        Args:
+            qty: The quantity to compute min/max for
+            operation: np.nanmin or np.nanmax
+            time: Optional timestamp
+
+        Returns:
+            The global min or max value
+
+        Raises:
+            ValueError: If all values are NaN
+        """
+        time = time or self._default_time()
+        values = []
+        for _, lvl in self.levels(time).items():
+            for patch in lvl.patches:
+                values.append(patch.patch_datas[qty].dataset[:])
+        all_values = np.concatenate(values)
+        if np.all(np.isnan(all_values)):
+            raise ValueError("All values are NaN")
+        return operation(all_values)
+
+    def global_min(self, qty, **kwargs):
+        return self._global_minmax(qty, np.nanmin, kwargs.get("time"))
+
+    def global_max(self, qty, **kwargs):
+        return self._global_minmax(qty, np.nanmax, kwargs.get("time"))
```
This refactoring:
- Eliminates code duplication
- Improves performance by avoiding repeated array concatenations
- Properly handles the all-NaN case
- Makes the code more maintainable
Committable suggestion skipped: line range outside the PR's diff.
```diff
+                    m = np.nanmin(pd.dataset[:])
                     first = False
                 else:
-                    m = min(m, pd.dataset[:].min())
+                    data_and_min = np.concatenate(([m], pd.dataset[:]))
+                    m = np.nanmin(data_and_min)
```
🛠️ Refactor suggestion
Consider optimizing the min calculation and handling edge cases.
The current implementation using array concatenation in each iteration is not optimal. Additionally, the case where all values are NaN should be handled explicitly.
Consider this more efficient implementation:
```diff
-                    m = np.nanmin(pd.dataset[:])
-                    first = False
-                else:
-                    data_and_min = np.concatenate(([m], pd.dataset[:]))
-                    m = np.nanmin(data_and_min)
+                    m = pd.dataset[:]
+                    first = False
+                else:
+                    m = np.append(m, pd.dataset[:])
+        if np.all(np.isnan(m)):
+            raise ValueError("All values are NaN")
+        return np.nanmin(m)
```
Committable suggestion skipped: line range outside the PR's diff.
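The explicit all-NaN check in the suggestion matters because `np.nanmin` itself does not raise on all-NaN input; it emits a `RuntimeWarning` and returns `nan`:

```python
import warnings

import numpy as np

all_nan = np.full(4, np.nan)

with warnings.catch_warnings():
    # Silence NumPy's "All-NaN axis encountered" RuntimeWarning
    warnings.simplefilter("ignore", RuntimeWarning)
    result = np.nanmin(all_nan)

print(np.isnan(result))  # True: no exception, just nan
```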
Actionable comments posted: 1
📜 Review details

Configuration used: `.coderabbit.yaml`
Review profile: CHILL

📒 Files selected for processing (1)

- `pyphare/pyphare/pharesee/hierarchy/hierarchy.py` (2 hunks)
🔇 Additional comments (1)

`pyphare/pyphare/pharesee/hierarchy/hierarchy.py` (1)

Lines 293-297: **Reduce code duplication with `global_min`.** The implementation duplicates the logic from `global_min`.
As suggested in a previous review, consider extracting a common utility function to handle both cases. This would:
- Eliminate code duplication
- Ensure consistent handling of edge cases
- Make the code more maintainable
The previous review comment provided a detailed implementation suggestion.
```diff
+                    m = np.nanmin(pd.dataset[:])
                     first = False
                 else:
-                    m = min(m, pd.dataset[:].min())
+                    data_and_min = np.concatenate(([m], pd.dataset[:].flatten()))
+                    m = np.nanmin(data_and_min)
```
🛠️ Refactor suggestion

**Consider optimizing the min calculation and handling edge cases.**

While using `np.nanmin` correctly handles NaN values, the current implementation has some performance concerns:

- Concatenating arrays in a loop is inefficient (O(n²) complexity).
- No explicit handling for the case where all values are NaN.
Consider this more efficient implementation:
```diff
-                    m = np.nanmin(pd.dataset[:])
-                    first = False
-                else:
-                    data_and_min = np.concatenate(([m], pd.dataset[:].flatten()))
-                    m = np.nanmin(data_and_min)
+                    values = pd.dataset[:]
+                    first = False
+                else:
+                    values = np.append(values, pd.dataset[:])
+        if np.all(np.isnan(values)):
+            raise ValueError("All values are NaN")
+        return np.nanmin(values)
```
Committable suggestion skipped: line range outside the PR's diff.
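One caveat about the suggested `np.append` loop: `np.append` copies the accumulated array on every call, so that loop is still quadratic. Collecting the chunks in a Python list and concatenating once is linear. A general NumPy pattern, not code from the PR:

```python
import numpy as np

rng = np.random.default_rng(0)
chunks = [rng.random(100) for _ in range(50)]

# Quadratic: each np.append reallocates and copies everything so far
acc = chunks[0]
for c in chunks[1:]:
    acc = np.append(acc, c)

# Linear: one allocation after collecting references
flat = np.concatenate(chunks)

print(np.array_equal(acc, flat))  # True: same result, fewer copies
```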
* nan min/max to handle possible nan ghosts
* flatten
Summary by CodeRabbit

- Updated min/max calculations in the `PatchHierarchy` class to ignore NaN values.