Begin merging production changes #197

ludgerpaehler · 2024-10-27T01:58:14Z

No description yet.

TODO:

Rewrite commits to actually be associated with Christian Lagemann (the author) @cl126162
Resolve merge conflicts

…ing rules.

…lder.

ReHoss

Sorry wrong button... Trying to abandon the "review".

ReHoss · 2024-10-27T17:17:52Z

hydrogym/firedrake/envs/cylinder/flow.py

-X, Y = np.meshgrid(xp, yp)
-DEFAULT_VEL_PROBES = [(x, y) for x, y in zip(X.ravel(), Y.ravel())]
-
-# Pressure probes (spaced equally around the cylinder)
 RADIUS = 0.5


RADIUS does not seem to be used anymore. It was used by previous pressure probes definition.

ReHoss · 2024-10-27T17:24:08Z

hydrogym/core.py

    self.initialize_state()

    self.reset()

    if config.get("restart"):
-      self.load_checkpoint(config["restart"])
+      self.load_checkpoint(config["restart"][0])


Why the checkpoint is an iterable now ?

ReHoss · 2024-10-27T17:42:11Z

hydrogym/core.py

@@ -391,6 +489,7 @@ def reset(self, t=0.0) -> Union[ArrayLike, Tuple[ArrayLike, dict]]:
    self.flow.reset(q0=self.q0, t=t)
    self.solver.reset()

+    # Previously: return self.flow.get_observations(), info


master still return self.flow.get_observations() only too.
(Linked to the question of the transition to Gymnasium )

ReHoss · 2024-10-27T18:06:06Z

hydrogym/firedrake/envs/cylinder/flow.py

@@ -242,7 +242,7 @@ def cyl_velocity_field(self):
    theta_lo = -0.5 * pi
    A_lo = ufl.conditional(
        abs(theta - theta_lo) < omega / 2,
-        pi / (2 * omega * self.rad**2) * ufl.cos(
+        -pi / (2 * omega * self.rad**2) * ufl.cos(


Critical fix that has not been considered in #195 ? @jcallaham

I don't know, did I get this wrong? I think this is separate from the fixes in #195

Hello @jcallaham , if you have a moment to sort this out I would be very grateful since I am writing my PhD thesis and I am about to launch experiments.
Thank you:)

I'm comparing with Appendix B in the referenced paper and it looks to me like the original was correct. @cl126162 @ludgerpaehler can anyone elaborate on this change? I implemented this as zero net mass flux, which is fairly typical of CFD actuation models, though that reference actually implemented as two independent actuators, which is not really great in terms of boundary conditions and physicality. I guess changing the sign would make it so that both jets are in blowing or suction at once, rather than in opposition (as in the ZNMF configuration).

thank you very much @jcallaham :)!

ReHoss · 2024-10-27T18:21:33Z

hydrogym/firedrake/envs/pinball/flow.py

@@ -13,6 +13,36 @@


 class Pinball(FlowConfig):
+  rad = 0.5


rad is already defined below, it could be good to pass an unused variables check (shipped with standard static checkers)

ReHoss · 2024-10-27T18:31:45Z

hydrogym/core.py

@@ -292,6 +298,43 @@ def solve(
            PDEBase: The state of the PDE at the end of the solve
        """
    for iter, t in enumerate(np.arange(*t_span, self.dt)):
+      iter = iter + start_iteration_value


It looks like solve and solve_multistep do almost the same thing. For a "public" release It may be better to merge them.:)

I agree - what's the distinction supposed to be here?

ReHoss · 2024-10-27T18:53:35Z

hydrogym/core.py

+      assert self.rewardLogCallback is not None,\
+        f"Error: If num_sim_substeps_per_actuation ({self.num_sim_substeps_per_actuation}) " \
+        "is set a reward callback function must be given, {self.rewardLogCallback}"
+      self.reward_aggreation_rule = env_config.get("actuation_config", {}).get(


what's the purpose of the reward_aggregation_rule for a gym interface where only instantaneous rewards/costs are considered? Is it the aggregations of the values in between decision times (times where the user hits step ) ?

(there is a typo in the variable name reward_aggreation_rule)

Could this all happen in some kind of Wrapper? All of this seems to add a lot of complexity to an otherwise straightforward class.

Wrappers can deal with those aggregation rules and also action scaling.

ReHoss · 2024-10-27T19:09:38Z

hydrogym/firedrake/envs/cavity/flow.py

-  MAX_CONTROL = 0.1
+  MAX_CONTROL_LOW = 0.0
+  MAX_CONTROL_UP = 0.03
+  CONTROL_SCALING = 1.0


what does CONTROL_SCALING now introduces?

Answer: It is a multiplicating scaling of the input action used in the step method.

Why is this used? Why wouldn't you just scale the input action externally to the physics model of the flow?

It also appears that the Pinball FlowConfig is missing a CONTROL_SCALING parameter, so it is not possible to step through the pinball environment with the current PR.

ReHoss · 2024-10-27T19:42:48Z

hydrogym/core.py

-    t = self.iter * self.solver.dt
+    action = action * self.flow.CONTROL_SCALING
+
+    if self.num_sim_substeps_per_actuation is not None and self.num_sim_substeps_per_actuation > 1:


If I understand correctly num_sim_substeps_per_actuation allows stretching the duration of the actions uniformly (actions given to the damped controller are piecewise constant). For this piecewise action signal, num_sim_substeps_per_actuation adjusts the length of the intervals on which the action is constant?

ReHoss · 2024-10-27T19:44:27Z

hydrogym/firedrake/envs/cavity/flow.py

@@ -11,6 +11,18 @@


 class Cavity(FlowConfig):
+  # Velocity probes


probs are used now and the old setting of A. Barbagallo, D. Sipp, P. J. Schmid - Closed-loop control of an open cavity flow using reduced-order models (2014) is abandoned?

Looks like it's still supported (as I think it should be), but not the default measurement.

Indeed, probes can be set optionally and the old setting can still be used.

ReHoss · 2024-10-27T20:01:00Z

hydrogym/core.py

+      self.iter += 1
+      self.t += self.dt
+      reward = self.get_reward()
+


Is the else statement above redundant since num_sim_substeps_per_actuation == 1 should be equivalent? (Not critical, and if things are in "production" it may be dangerous to modify the code too much...)

ReHoss · 2024-10-27T20:11:52Z

hydrogym/firedrake/envs/cylinder/flow.py

@@ -200,7 +196,9 @@ def render(self, mode="human", clim=None, levels=None, cmap="RdBu", **kwargs):


 class RotaryCylinder(CylinderBase):
-  MAX_CONTROL = 0.5 * np.pi
+  MAX_CONTROL_LOW = -1.0


Contradicts MAX_CONTROL recent update from #195

Yeah is this based on anything in particular? If not, having it just match the pinball would probably make sense.

ReHoss · 2024-10-27T20:18:59Z

hydrogym/firedrake/envs/cylinder/flow.py

+    else:
+      CL, CD = averaged_objective_values
+    # return np.square(CD) + lambda_value * np.square(CL)
+    return np.abs(CD) + lambda_value * np.abs(CL)


Could give details/references on this new loss at some point?

ReHoss · 2024-10-27T21:06:40Z

A brief summary (correct me if I am wrong)

Hydrogym update

Update the environments to accept a mean/average reward (evaluate_objective) when stretching the action signal (there is now a decision time vs. a physical time)
Implements a series of default probes now

Is reward aggregation now mandatory (otherwise an exception is raised) (Could it be defaulted to average?)

Cavity

The control bounds are updated
Probes are added to the cavity
Nothing removed, evaluate_objective extended to support aggregating objective values

CylinderBase

Move and reduce the number of probes
Add vorticity probes
Cylinder: sign fix for A_lo
RotaryCylinder: change bounds MAX_CONTROL (different from Jared's Fix BCs and geometry for pinball #195); change CONTROL_SCALING
New objective function that combines CD and CL with a penalty

Pinball

Add probes (no probes before)

Code review

Unused variables check
Type hinting?
solve and solve_multiple... seems redundant

ReHoss · 2024-10-27T21:11:47Z

hydrogym/core.py

+        "is set a reward callback function must be given, {self.rewardLogCallback}"
+      self.reward_aggreation_rule = env_config.get("actuation_config", {}).get(
+          "reward_aggreation_rule", None)
+      assert self.reward_aggreation_rule in [


A reward aggregation rule is implemented and mandatory now, a default could possibly be set for backward compat?

jcallaham

Lots of good stuff , but I think we'll have to do some cleanup before it's really release-ready. Let me know if I can help with any of this. Are these basically the changes that made the baseline RL training happen?

High-level comments:

Timesteppers don't use the actual time directly - that's only part of the ODE RHS, so in my opinion it makes more sense to continue having t be owned by flow. If needed for callbacks or controllers or whatever it can always be accessed by flow.t.
The reward aggregation stuff looks useful, but the source code is fairly confusing. I would say it should either be cleaned up and clearly documented or (preferably) moved into a gym.Wrapper unless this is typical as part of a gym.Env implementation.
I really like the idea of having the 1D PDEs as cheap environments, but these implementations are completely independent of everything else in Hydrogym, meaning they don't follow the PDEBase/TransientSolver/FlowEnv design. If the goal is a "production" release, I think it might be better to hold off on these and then later implement them as part of a full "torch" backend or something. Otherwise it seems like these could just be in a totally separate repo.
Generally my opinion is that the FlowConfig should be a physics model of the flow configuration and should not have anything related to reinforcement learning. That makes it so that the package could also be used for more traditional fluid dynamics and controls research. It should be possible to put all of that stuff either in some kind of callback or at the Env level.

jcallaham · 2024-10-31T15:14:04Z

hydrogym/core.py

@@ -272,12 +276,14 @@ def __init__(self, flow: PDEBase, dt: float = None):
    if dt is None:
      dt = flow.DEFAULT_DT
    self.dt = dt
+    self.t = 0.0


Currently the time is owned by the flow, not the solver. I guess you could do it the other way around, but since the "state" (velocity, pressure, etc) is owned by the flow class I figured it made more sense to have time there as well. Then it is advanced with flow.advance_time(dt) here. On the other hand, the solver doesn't directly depend on time, only the timestep, so my feeling is that it doesn't naturally belong there.

Actually I think the only place it's needed is for the "callbacks" or "controller", but now that time is a property of the flow it doesn't need to be there at all (or the callback could be passed flow.t instead of calculating it within the loop).

I agree, a solver object can be viewed as a mapping (implemented by a solve method) taking as input the current time $t$ (for non-autonomous systems), the current state $x_t$ and a time duration $T$ to return the solution $x_{t+T}$ $T$ times later. (this mapping is called a flow in maths). But the mapping in itself is fixed and independent of the time! Hence, initialising it with self.t = 0.0 may not be really needed.

Practically, I don't see any use of self.t in this class, so it seems safe to remove it?

jcallaham · 2024-10-31T15:14:36Z

hydrogym/core.py

@@ -292,6 +298,43 @@ def solve(
            PDEBase: The state of the PDE at the end of the solve
        """
    for iter, t in enumerate(np.arange(*t_span, self.dt)):
+      iter = iter + start_iteration_value


I agree - what's the distinction supposed to be here?

jcallaham · 2024-10-31T15:15:03Z

hydrogym/core.py

@@ -317,20 +360,25 @@ def step(self, iter: int, control: Iterable[float] = None, **kwargs):

  def reset(self):
    """Reset variables for the timestepper"""
-    pass
+    self.t = 0.0


See above - time should be owned by the flow, and the reset method for the flow can also reset the time.

jcallaham · 2024-10-31T15:17:00Z

hydrogym/core.py

    self.solver: TransientSolver = env_config.get("solver")(
        self.flow, **env_config.get("solver_config", {}))
    self.callbacks: Iterable[CallbackBase] = env_config.get("callbacks", [])
+    self.rewardLogCallback: Iterable[CallbackBase] = env_config.get(


Should use snake case for consistent style

jcallaham · 2024-10-31T15:21:09Z

hydrogym/firedrake/utils/io.py

+      # if ((iter + 1) * self.dt * 10e6) % (self.DT * 10e6) == 0:
+      if (iter + 1) % self.num_sim_substeps_per_actuation == 0:
+        self.data = np.array(self.data)
+        flow.reward_array = self.data


Is there another way to do this? The flow is a physics model and so RL-specific stuff should be abstracted

Basically transforming this class in a gym.Wrapper

jcallaham · 2024-10-31T15:39:18Z

hydrogym/firedrake/envs/step/flow.py

+                         q=None,
+                         qB=None,
+                         averaged_objective_values=None,
+                         return_objective_values=False):


Same comments on evaluate_objective changes for the cylinder

jcallaham · 2024-10-31T15:40:14Z

hydrogym/firedrake/flow.py

@@ -345,14 +355,20 @@ def velocity_probe(self, probes, q: fd.Function = None) -> list[float]:
    if q is None:
      q = self.q
    u = q.subfunctions[0]
-    return np.stack(u.at(probes)).flatten("F")
+
+    # probe_values = np.stack([u.at(probe) for probe in probes]).flatten("F")


Lots of commented out code here (and throughout the PR)

jcallaham · 2024-10-31T15:41:29Z

hydrogym/core.py

@@ -47,12 +49,13 @@ class PDEBase(metaclass=abc.ABCMeta):

  def __init__(self, **config):
    self.mesh = self.load_mesh(name=config.get("mesh", self.DEFAULT_MESH))
+    self.reward_lambda = config.get("reward_lambda", 0.0)


This doesn't seem to be used for anything. Also not related to the physics modeled by the PDEBase.

jcallaham · 2024-10-31T15:49:15Z

hydrogym/firedrake/envs/cavity/flow.py

-  MAX_CONTROL = 0.1
+  MAX_CONTROL_LOW = 0.0
+  MAX_CONTROL_UP = 0.03
+  CONTROL_SCALING = 1.0


Why is this used? Why wouldn't you just scale the input action externally to the physics model of the flow?

jcallaham · 2024-10-31T15:51:29Z

pyproject.toml

@@ -70,13 +70,21 @@ seaborn = "^0.12.1"
 [tool.yapf]
 based_on_style = "yapf"

+[tool.yapfignore]
+ignore_patterns = [
+    "hydrogym/experimental/*"


Why does this get a pass on formatting?

ReHoss · 2024-10-31T17:32:20Z

Lots of good stuff , but I think we'll have to do some cleanup before it's really release-ready. Let me know if I can help with any of this. Are these basically the changes that made the baseline RL training happen?

High-level comments:

Timesteppers don't use the actual time directly - that's only part of the ODE RHS, so in my opinion it makes more sense to continue having t be owned by flow. If needed for callbacks or controllers or whatever it can always be accessed by flow.t.

I agree.

The reward aggregation stuff looks useful, but the source code is fairly confusing. I would say it should either be cleaned up and clearly documented or (preferably) moved into a gym.Wrapper unless this is typical as part of a gym.Env implementation.

I think a gym.wrapper can do the job.

I really like the idea of having the 1D PDEs as cheap environments, but these implementations are completely independent of everything else in Hydrogym, meaning they don't follow the PDEBase/TransientSolver/FlowEnv design. If the goal is a "production" release, I think it might be better to hold off on these and then later implement them as part of a full "torch" backend or something. Otherwise, it seems like these could just be in a totally separate repo.

I agree!

Generally, my opinion is that the FlowConfig should be a physics model of the flow configuration and should not have anything related to reinforcement learning. That makes it so that the package could also be used for more traditional fluid dynamics and controls research. It should be possible to put all of that stuff either in some kind of callback or at the Env level.

I agree.

From an outside point of view and by reading @jcallaham comments, the new code typically looks like a fork devoted to research experiments and publication. Indeed, it does not totally fit in the hydrogym package design at this stage (where I assume the hydrogym package is intended for public usage). However, the fixes are rather easy, I think:

Creating a wrapper for the inter-decision time and the reward aggregation
Probes locations global variables can be stored somewhere or uniformised
Unify solve and solve_multistep
typing, formatting etc.

I could help with this task by the end of Nov.

In the end, the "feature" provided by this new code are the introduction of inter-decision time and reward aggregation, probes, a modification of the magnitudes of the control bounds + a sign fix in the dynamics.

It would be interesting to have an explanation of why those changes are needed and how they impact the system.

root and others added 6 commits April 2, 2024 01:52

first rllib modifications

707aac0

fix probes locations

6932f62

rm output

727beba

update probe positions

bb2688b

Add 1D environments

805fa5e

update environments

d8db8ad

ludgerpaehler mentioned this pull request Oct 27, 2024

Configuration files into .ini files #162

Open

ludgerpaehler added 12 commits October 27, 2024 13:26

Merge main into working branch for PR.

ad9e144

Clean up formatting.

94a22cf

Create an experimental directory for which we don't apply the formatt…

c690987

…ing rules.

Ignore ruff and yapf formatting for the experimental directory.

8069011

Move tensor files into the git lfs.

b167795

Refactor the experimental torch environments into the experimental fo…

8a6a0a2

…lder.

Remove torch folder as it was refactored into an experimental folder.

10e4260

Reformat flow description files to comply with Python design guidelines.

d719b15

Clean up to remove experimental features from the core

9df8e64

Remove unused assignment of t.

1f4f989

yapf formatting

9ac74bc

yapf formatting of the pressure probes.

df0cb03

ludgerpaehler self-assigned this Oct 27, 2024

ludgerpaehler added priority High-priority core feature Paper-Release python Pull requests that update Python code labels Oct 27, 2024

ludgerpaehler requested review from jcallaham and cl126162 October 27, 2024 16:25

ReHoss reviewed Oct 27, 2024

View reviewed changes

jcallaham requested changes Oct 31, 2024

View reviewed changes

		@@ -11,6 +11,18 @@


		class Cavity(FlowConfig):
		# Velocity probes

Begin merging production changes #197

Are you sure you want to change the base?

Begin merging production changes #197

Conversation

ludgerpaehler commented Oct 27, 2024 • edited Loading

ReHoss left a comment

Choose a reason for hiding this comment

ReHoss Oct 27, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ReHoss Oct 27, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ReHoss Oct 27, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ReHoss Oct 27, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ReHoss Oct 27, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ReHoss commented Oct 27, 2024 • edited Loading

Hydrogym update

Cavity

CylinderBase

Pinball

Code review

Choose a reason for hiding this comment

jcallaham left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ReHoss Oct 31, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ReHoss commented Oct 31, 2024 • edited Loading

ludgerpaehler commented Oct 27, 2024 •

edited

Loading

ReHoss Oct 27, 2024 •

edited

Loading

ReHoss Oct 27, 2024 •

edited

Loading

ReHoss Oct 27, 2024 •

edited

Loading

ReHoss Oct 27, 2024 •

edited

Loading

ReHoss Oct 27, 2024 •

edited

Loading

ReHoss commented Oct 27, 2024 •

edited

Loading

ReHoss Oct 31, 2024 •

edited

Loading

ReHoss commented Oct 31, 2024 •

edited

Loading