Camera FieldOfView can be NaN #8

Open
NoelFB opened this issue Jan 30, 2024 · 22 comments

@NoelFB
Contributor

NoelFB commented Jan 30, 2024

Error Log (1/29/2024 11:15:25 PM)
Call Stack:
System.ArgumentOutOfRangeException: fieldOfView ('NaN') must be greater than '0'. (Parameter 'fieldOfView')
Actual value was NaN.
   at System.ArgumentOutOfRangeException.ThrowLessEqual[T](T, T, String)
   at Celeste64.Camera.get_Projection() in /home/noel/Projects/Celeste64/Source/Graphics/Camera.cs:line 128
   at Celeste64.World.Render(Target) in /home/noel/Projects/Celeste64/Source/Scenes/World.cs:line 633
   at Celeste64.Game.Render() in /home/noel/Projects/Celeste64/Source/Game.cs:line 288
   at Foster.Framework.App.Tick()
   at Foster.Framework.App.Run(String, Int32, Int32, Boolean )
   at Foster.Framework.App.Run[T](String, Int32, Int32, Boolean )
   at Celeste64.Program.Main(String[]) in /home/noel/Projects/Celeste64/Source/Program.cs:line 23
Game Output:
Celeste 64 v.1.0.1
Foster: v0.1.14
Platform: Microsoft Windows 10.0.19045 (X64)
Framework: .NET 8.0.1
SDL: v2.28.5
OpenGL: v3.3.13399 Core Profile Forward-Compatible Context 15.201.1151.1008, AMD Radeon HD 6450
FMOD Bindings: v20218
FMOD: v20207
Loaded Bank: C:\Users\*\Desktop\Celeste64-win-x64\Content\Audio\Master.strings.bank
Loaded Bank: C:\Users\*\Desktop\Celeste64-win-x64\Content\Audio\Master.bank
Loaded Bank: C:\Users\*\Desktop\Celeste64-win-x64\Content\Audio\music.bank
Loaded Bank: C:\Users\*\Desktop\Celeste64-win-x64\Content\Audio\sfx.bank
Loaded Assets in 1758ms
Strawb Count: 20
Loaded Map '1' in 268ms
@NoelFB
Contributor Author

NoelFB commented Jan 30, 2024

The only place the camera's FOV is ever modified is here in Player.cs (through FOVMultiplier):

float targetFOV = Calc.ClampedMap(velocity.XY().Length(), MaxSpeed * 1.2f, 120, 1, 1.2f);
World.Camera.FOVMultiplier = Calc.Approach(World.Camera.FOVMultiplier, targetFOV, Time.Delta / 4);

The only way I could see NaN being assigned is if velocity is somehow NaN as well. Weird.
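
To illustrate how a NaN velocity could reach the FOV, here is a minimal sketch. `ClampedMap` and `Approach` below are hypothetical stand-ins written for this example (the real `Calc` implementations are not shown in this thread), and the `maxSpeed` parameter is an assumed placeholder. It only shows that a NaN speed survives both steps.

```csharp
using System;

// Hypothetical stand-ins for Calc.ClampedMap and Calc.Approach;
// the real implementations are not shown in this thread.
public static class NaNPropagationSketch
{
    // Remap val from [min, max] to [newMin, newMax], clamping at both ends.
    public static float ClampedMap(float val, float min, float max, float newMin, float newMax)
    {
        // Math.Clamp returns NaN when val is NaN, so NaN passes straight through.
        float t = Math.Clamp((val - min) / (max - min), 0f, 1f);
        return newMin + (newMax - newMin) * t;
    }

    // Move from toward target by at most amount.
    public static float Approach(float from, float target, float amount)
        => from > target
            ? MathF.Max(from - amount, target)
            // Any comparison with NaN is false, and MathF.Min returns NaN
            // when either argument is NaN.
            : MathF.Min(from + amount, target);

    // One simulated frame of the FOV update, given the player's XY speed.
    public static float NextFovMultiplier(float current, float speed, float maxSpeed, float delta)
    {
        float targetFov = ClampedMap(speed, maxSpeed * 1.2f, 120f, 1f, 1.2f);
        return Approach(current, targetFov, delta / 4f);
    }
}
```

Calling `NaNPropagationSketch.NextFovMultiplier(1f, float.NaN, 64f, 1f / 60f)` yields NaN under these assumptions, which is consistent with the velocity theory above.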

@SnipUndercover

SnipUndercover commented Jan 30, 2024

Gonna share some progress debugging this crash.

I managed to narrow down the NaN appearance in one crash instance to this block of code in StNormalUpdate:

// movement
{
    var velXY = velocity.XY();
    if (Controls.Move.Value == Vec2.Zero || tNoMove > 0)
    {
        // if not moving, simply apply friction
        float fric = Friction;
        if (!onGround)
            fric *= AirFrictionMult;
        // friction
        Calc.Approach(ref velXY, Vec2.Zero, fric * Time.Delta);
    }
    else if (onGround)
    {
        float max = MaxSpeed;
        // change max speed based on ground slope angle
        if (groundNormal != Vec3.UnitZ)
        {
            float slopeDot = 1 - Calc.Clamp(Vec3.Dot(groundNormal, Vec3.UnitZ), 0, 1);
            slopeDot *= Vec2.Dot(groundNormal.XY().Normalized(), targetFacing) * 2;
            max += max * slopeDot;
        }
        // trueMax is the max XY speed before applying analog stick magnitude
        float trueMax = max;
        // apply analog stick magnitude
        {
            float mag = Calc.ClampedMap(Controls.Move.Value.Length(), .4f, .92f, .3f, 1);
            max *= mag;
        }
        var input = RelativeMoveInput;
        // TODO: Solve this way better! Ugh I hate this!!
        // move lightly away from ledges by checking for no floor, and then sweeping in until we find floor
        // Please don't look at this code
        // if I had more time to solve this nicely I would do something else
        {
            var d = 4;
            if (input != Vec2.Zero &&
                !World.SolidRayCast(Position + new Vec3(input, 1) * d, -Vec3.UnitZ, 8, out var hit) &&
                !World.SolidRayCast(Position + new Vec3(0, 0, d), new Vec3(input, 0), d, out hit))
            {
                var left = Calc.AngleToVector(Calc.Angle(input) + 0.3f);
                var right = Calc.AngleToVector(Calc.Angle(input) - 0.3f);
                var count = 0;
                if (World.SolidRayCast(Position + new Vec3(left, 1) * d, -Vec3.UnitZ, 8, out hit))
                {
                    while (World.SolidRayCast(Position + new Vec3(left, 1) * d, -Vec3.UnitZ, 8, out hit) && count++ < 10)
                        left = Calc.AngleToVector(Calc.Angle(left) - 0.1f);
                    input = Calc.AngleToVector(Calc.Angle(left) + 0.1f);
                }
                else if (World.SolidRayCast(Position + new Vec3(right, 1) * d, -Vec3.UnitZ, 8, out hit))
                {
                    while (World.SolidRayCast(Position + new Vec3(right, 1) * d, -Vec3.UnitZ, 8, out hit) && count++ < 10)
                        right = Calc.AngleToVector(Calc.Angle(right) + 0.1f);
                    input = Calc.AngleToVector(Calc.Angle(right) - 0.1f);
                }
            }
        }
        // if travelling faster than our "true max" (ie. our max not accounting for analog stick magnitude),
        // then we switch into a slower decceleration to help the player preserve high speeds
        float accel;
        if (velXY.LengthSquared() >= trueMax * trueMax && Vec2.Dot(input, velXY) >= .7f)
            accel = PastMaxDeccel;
        else
            accel = Acceleration;
        // if our XY velocity is above the Rotate Threshold, then our XY velocity begins rotating
        // instead of using a simple approach to accelerate
        if (velXY.LengthSquared() >= RotateThreshold * RotateThreshold)
        {
            if (Vec2.Dot(input, velXY.Normalized()) <= SkidDotThreshold)
            {
                Facing = targetFacing = input;
                stateMachine.State = States.Skidding;
                return;
            }
            else
            {
                // Rotate speed is less when travelling above our "true max" speed
                // this gives high speeds less fine control
                float rotate;
                if (velXY.LengthSquared() > trueMax * trueMax)
                    rotate = RotateSpeedAboveMax;
                else
                    rotate = RotateSpeed;
                targetFacing = Calc.RotateToward(targetFacing, input, rotate * Time.Delta, 0);
                velXY = targetFacing * Calc.Approach(velXY.Length(), max, accel * Time.Delta);
            }
        }
        else
        {
            // if we're below the RotateThreshold, acceleration is very simple
            Calc.Approach(ref velXY, input * max, accel * Time.Delta);
            targetFacing = input.Normalized();
        }
    }
    else
    {
        float accel;
        if (velXY.LengthSquared() >= MaxSpeed * MaxSpeed && Vec2.Dot(RelativeMoveInput.Normalized(), velXY.Normalized()) >= .7f)
        {
            accel = PastMaxDeccel;
            var dot = Vec2.Dot(RelativeMoveInput.Normalized(), targetFacing);
            accel *= Calc.ClampedMap(dot, -1, 1, AirAccelMultMax, AirAccelMultMin);
        }
        else
        {
            accel = Acceleration;
            var dot = Vec2.Dot(RelativeMoveInput.Normalized(), targetFacing);
            accel *= Calc.ClampedMap(dot, -1, 1, AirAccelMultMin, AirAccelMultMax);
        }
        Calc.Approach(ref velXY, RelativeMoveInput * MaxSpeed, accel * Time.Delta);
    }
    velocity = velocity.WithXY(velXY);
}

velXY is being set to NaN somewhere here.

The person says they crash whenever they try to move around, but jumping works fine, and so does moving the camera. Dashing causes the game to freeze, both grounded and airborne. They use a keyboard, not an analog stick (I even completely commented out the AddLeftJoystick and AddDPad calls, and it still crashes).

Feels like Move.Value is somehow NaN?

@SnipUndercover

Good news: Move.Value does not contain NaN. A build with the patch below never logs that Controls.Move.Value has NaN.

diff --git a/Source/Game.cs b/Source/Game.cs
index 89c0fe8..a5fb4f9 100644
--- a/Source/Game.cs
+++ b/Source/Game.cs
@@ -113,6 +113,9 @@ public class Game : Module

        public override void Update()
        {
+               if (VectorHelpers.HasNaN(Controls.Move.Value))
+                       Log.Error($"{nameof(Controls.Move.Value)} contains NaN! Things are about to go very wrong! ({Controls.Move.Value})");
+
                // update top scene
                if (scenes.TryPeek(out var scene))
                {
diff --git a/Source/Helpers/VectorHelpers.cs b/Source/Helpers/VectorHelpers.cs
new file mode 100644
index 0000000..173d09a
--- /dev/null
+++ b/Source/Helpers/VectorHelpers.cs
@@ -0,0 +1,9 @@
+namespace Celeste64;
+public static class VectorHelpers
+{
+    public static bool HasNaN(this in Vec2 vec2)
+        => float.IsNaN(vec2.X) || float.IsNaN(vec2.Y);
+
+    public static bool HasNaN(this in Vec3 vec3)
+        => float.IsNaN(vec3.X) || float.IsNaN(vec3.Y) || float.IsNaN(vec3.Z);
+}

@NoelFB
Contributor Author

NoelFB commented Jan 31, 2024

Thanks for the investigation! This should narrow it down a lot ...

@SnipUndercover

Found the source of the NaN: apparently it's RelativeMoveInput. (So I was almost there. I considered checking RelativeMoveInput directly but decided against it, as I thought that since Controls.Move.Value is fine, RelativeMoveInput would also be fine.)

I added assertions to every parameter when calculating velXY, logging if it contains NaN. input is NaN when grounded; RelativeMoveInput (and subsequently accel) is NaN when airborne.
I'd take it home from there, but it's currently late; figured I'd share the development. I plan to make a PR in case I get to the bottom of this issue.

@SnipUndercover

Somehow the NaN surfaces from normalizing the camera's XY components?? This doesn't make any sense..

diff --git a/Source/Actors/Player.cs b/Source/Actors/Player.cs
index d0482d6..03950ae 100644
--- a/Source/Actors/Player.cs
+++ b/Source/Actors/Player.cs
@@ -611,7 +611,15 @@ public class Player : Actor, IHaveModels, IHaveSprites, IRidePlatforms, ICastPoi
                        if (Vec2.Dot(input, Vec2.UnitY) >= .985f)
                                input = Vec2.UnitY;

-                       return forward * input.Y + side * input.X;
+                       Vec2 ret = forward * input.Y + side * input.X;
+                       if (ret.HasNaN())
+                       {
+                               ret.AssertNotNaN($"{nameof(RelativeMoveInput)} contains NaN!");
+                               Log.Error($"{nameof(World.Camera.Forward)} = {World.Camera.Forward}");
+                               Log.Error($"{nameof(forward)} = {forward} (normalized)");
+                               Log.Error($"{nameof(side)} = {side} (normalized)");
+                       }
+                       return ret;
                }
        }

Running this yields these logs:

RelativeMoveInput contains NaN! (<NaN, NaN>)
Forward = <-0.10875643, -0.887422, -0.44794446>
forward = <0.12164313, -Infinity> (normalized)
side = <Infinity, NaN> (normalized)

This is not a one-off, here's what happens when the camera is not moved:

RelativeMoveInput contains NaN! (<NaN, NaN>)
Forward = <-3.9004448E-08, 0.89231783, -0.45140773>
forward = <-4.3711385E-08, Infinity> (normalized)
side = <-Infinity, NaN> (normalized)

@NoelFB
Contributor Author

NoelFB commented Feb 1, 2024

Hmm we do use a custom Normalize method, but all it does is check for x/y being 0 and return 0 in that case. Maybe we need to use an epsilon?

    public static Vector2 Normalized(this Vector2 vector)
    {
        if (MathF.Abs(vector.X) <= float.Epsilon && MathF.Abs(vector.Y) <= float.Epsilon)
            return Vector2.Zero;

        return Vector2.Normalize(vector);
    }

I'm not sure what else would be the cause... but that doesn't entirely make sense because the input values aren't both near zero (only X in the example above).

@Kalobi
Contributor

Kalobi commented Feb 1, 2024

Getting infinite/NaN values is definitely expected for very small values (Vector2.Normalize(new(float.Epsilon, 0)) produces <∞, NaN> for instance), but the values Snip's test subject is getting this for are definitely not in a range where that should happen (and it doesn't if I test with those values).

Note that even when this is behaving normally, float.Epsilon is definitely too small as a guard to fully avoid this. In my testing this starts happening when the vector components are somewhere in the 1e-23 range (for reference, float.Epsilon is roughly 1.4e-45).
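
A plausible explanation for the 1e-23 figure (an inference, not something stated in this thread): normalization squares each component, and 1e-23 squared is about 1e-46, which is below the smallest positive float and rounds to zero, so the length comes out as 0. The sketch below does the arithmetic by hand so it does not depend on `Vector2.Normalize`, and shows one possible guard on the squared length. The class name and the `MinLengthSquared` threshold are hypothetical choices for this example, not values from Celeste64 or Foster.

```csharp
using System;
using System.Numerics;

public static class NormalizeGuardSketch
{
    // Hypothetical threshold chosen for illustration only.
    public const float MinLengthSquared = 1e-12f;

    // Normalization with no guard, written out component by component.
    public static Vector2 NaiveNormalize(Vector2 v)
    {
        float lengthSquared = v.X * v.X + v.Y * v.Y; // tiny components underflow to 0 here
        float length = MathF.Sqrt(lengthSquared);
        return new Vector2(v.X / length, v.Y / length); // x/0 is Infinity, 0/0 is NaN
    }

    // Same computation, but vectors too short to normalize safely become zero.
    public static Vector2 GuardedNormalize(Vector2 v)
    {
        float lengthSquared = v.X * v.X + v.Y * v.Y;
        // Written as a negated >= so that a NaN length is also rejected.
        if (!(lengthSquared >= MinLengthSquared))
            return Vector2.Zero;
        float length = MathF.Sqrt(lengthSquared);
        return new Vector2(v.X / length, v.Y / length);
    }
}
```

Comparing against the squared length rather than testing each component against float.Epsilon means the guard triggers before the squaring step can underflow.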

@SnipUndercover

Could it be processor-dependent? The person is running on a relatively old CPU (AMD Phenom II X4 955), and the assembly generated for Vector2 operations uses AVX instructions. This sounds kind of crazy, but it's the only thing that comes to mind.

It might make sense as I've only seen two people experience this crash in the Celeste server.
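
One way to test a CPU-dependence theory like this is to have affected players report which SIMD instruction sets the runtime detects. The following is a minimal sketch; the `SimdReport` class is a hypothetical helper for this example, while the `IsSupported` and `IsHardwareAccelerated` properties are standard .NET APIs.

```csharp
using System.Numerics;
using System.Runtime.Intrinsics.X86;

public static class SimdReport
{
    // Summarize which SIMD features the runtime reports for the current CPU.
    // On non-x86 hardware the X86 IsSupported properties simply return false.
    public static string Describe()
        => $"Vector.IsHardwareAccelerated={Vector.IsHardwareAccelerated}, " +
           $"SSE2={Sse2.IsSupported}, SSE4.1={Sse41.IsSupported}, " +
           $"AVX={Avx.IsSupported}, AVX2={Avx2.IsSupported}";
}
```

Writing `SimdReport.Describe()` to the startup log next to the existing platform information would make it easy to correlate crash reports with missing AVX support.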

@SnipUndercover

Got bored and decided to make a visualization.
Yellow is input, green is expected and red is actual. (The red vector's angle with the -Y axis is not to scale, for demonstration's sake.)
[image: visualization of the input, expected and actual vectors]

@NoelFB
Contributor Author

NoelFB commented Feb 1, 2024

if we do our own Normal calculation does that work for them? what the hell

return vector / MathF.Sqrt(vector.X * vector.X + vector.Y * vector.Y);

@SnipUndercover

Will try tomorrow. 👍

@SnipUndercover

SnipUndercover commented Feb 2, 2024

For those who are following this issue:

It turns out my theory wasn't in fact so crazy. A .NET Runtime bug causes the JIT to emit wrong ASM instructions when calling Vector2.Normalize(Vector2) on CPUs which don't support AVX, making the normalization result incorrect. For example, on an AMD Phenom II X4 955, this causes the normalized vector's Y component to be +/- Infinity.

The solution here is to do a manual normalization.

I've forked Foster and made it fall back to manual normalization, if any component of the normalized vector is not finite.
I've also included some camera debug metrics in the top left if Celeste64 is built in debug mode.

I sent the built version to my tester (thank you Tanya!) and they confirmed it worked for them. I'll soon make a release on my fork including the fix.


EDIT: The release is out. I hope this'll help those affected until the .NET bug is fixed.
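
For readers who want to see the shape of such a workaround, here is a minimal sketch of a "fall back to manual normalization when the result is not finite" helper. It is not the code from the fork (which is not shown in this thread); the class and method names are hypothetical.

```csharp
using System;
using System.Numerics;

public static class SafeNormalizeSketch
{
    // Try the built-in normalization first; if any component of the result
    // is Infinity or NaN, discard it and recompute by hand.
    public static Vector2 Normalized(Vector2 v)
    {
        if (v.X == 0f && v.Y == 0f)
            return Vector2.Zero;

        Vector2 result = Vector2.Normalize(v);
        if (float.IsFinite(result.X) && float.IsFinite(result.Y))
            return result;

        return ManualNormalize(v);
    }

    // Scalar-only normalization that avoids the Vector2 operators entirely.
    public static Vector2 ManualNormalize(Vector2 v)
    {
        float length = MathF.Sqrt(v.X * v.X + v.Y * v.Y);
        return length > 0f
            ? new Vector2(v.X / length, v.Y / length)
            : Vector2.Zero;
    }
}
```

The finiteness check only catches results like the `<0.12164313, -Infinity>` seen in the logs above. A result that is wrong but finite would slip through, so always calling the manual path is the more conservative variant.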

@NoelFB
Contributor Author

NoelFB commented Feb 4, 2024

Wow, thanks for looking into this and figuring it out!

I guess at this point it's worth submitting the findings to the .NET runtime? We could add a hack to use manual normalization if it fails, but I don't really want to do that on our end if it's something that can be resolved upstream... unless the .NET team decides it's not worth fixing for some reason.

@SnipUndercover

@Popax21 said he'd make an issue soon, but not right now as he's currently pretty busy.
I'd rather let him do the talking as I'm not exactly very qualified to be talking about this stuff.

SnipUndercover pushed a commit to SnipUndercover/Celeste64 that referenced this issue Feb 4, 2024
@Popax21

Popax21 commented Feb 5, 2024

It would probably be a good idea to figure out the earliest .NET version in which this bug appears while I am still out of action. I'll try to prepare a writeup for a potential .NET bug report once I have the time for it.

@NoelFB
Contributor Author

NoelFB commented Feb 7, 2024

This might be related to issues seen on ARM64 Release Mode: #70 (comment)

@SnipUndercover

The fork fell a bit out of date, so I pulled the changes and published the 1.1.1 release + a few extra commits; remembering to publish new platforms as well. (unfortunately I don't have any machines to test the releases on, so fingers crossed)

@movercell

So this is CPU dependent? huh interesting(i have a phenom 2😭)

@SnipUndercover

[...] For example, on an AMD Phenom X4 955, this causes the normalized vector's Y component to be +/- Infinity.

Whoops, I meant to say "AMD Phenom II X4 955", but just noticed I forgot to put the "II" in.
Regardless, my Celeste64 fork should still make the game playable for you.

@Popax21

Popax21 commented Jun 26, 2024

After spending a few hours trying to chase down this bug in the JIT source code, I found out that it's already known & fixed: dotnet/runtime#96939.

Not sure if the fix is in the latest .NET 8 runtime, but it would definitely be worth a try to see if updating fixes this.
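
When checking whether a runtime update helps, it is useful to confirm which runtime a given build is actually running on. A minimal sketch follows; the `RuntimeReport` class is a hypothetical helper, while `RuntimeInformation` and `Environment.Version` are standard .NET APIs. It makes no claim about which runtime version contains the fix.

```csharp
using System;
using System.Runtime.InteropServices;

public static class RuntimeReport
{
    // Describe the runtime the process is running on, so that reports from
    // before and after an update can be told apart.
    public static string Describe()
        => $"{RuntimeInformation.FrameworkDescription} " +
           $"(version {Environment.Version}, {RuntimeInformation.ProcessArchitecture})";
}
```

The game output above already logs a "Framework" line, so this mainly matters for self-contained builds where the bundled runtime may differ from the one installed on the machine.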

@NoelFB
Contributor Author

NoelFB commented Jun 27, 2024

Yeah I think at this point it's worth it for me to just update everything and rebuild and see where we're at. I'll do that soon.
