Replies: 2 comments 1 reply
-
I've spent a lot of time debugging tonight. I made sure no Godot transforms or anything else are being used to update the physics simulation. I also made sure that conversion between byte/Vector3 and byte/Quaternion works flawlessly, and that the System.Numerics/Godot conversions do too. I did this by computing a checksum on the host before sending and on the client when receiving, and confirming that they always match. I also checked the packets in HxD to verify that my checksum wasn't lying to me. So this rules out the networking side completely.
I also tried removing the wheels, since I was still under the impression that the springs were the culprit, but there are still desyncs with just the two car boxes and the static floor. And again, these 3 objects are created in the same order; only the creation timing is different.
I then decided to print out the differences between host and client whenever I detect a desync. About 97% of the desyncs are in linear or angular velocity (on the box or any of the wheels). The other 3% are in position/rotation. The percentages are the same with or without the wheels/springs. I found that surprising, since I expected position to be what desyncs.
I'm still struggling to find a way to tackle this problem, especially since my main theory about the springs seems to be disproved. And I'm convinced that I'm setting the velocities correctly.
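The checksum check described above could look roughly like the following. This is a minimal Python sketch, not the poster's actual code: the 13-float layout, the function names, and the use of CRC32 are all assumptions made for illustration.

```python
import struct
import zlib

def pack_body_state(position, orientation, linear_velocity, angular_velocity):
    """Serialize 13 little-endian float32 values (hypothetical packet layout):
    3 position, 4 quaternion, 3 linear velocity, 3 angular velocity."""
    values = (*position, *orientation, *linear_velocity, *angular_velocity)
    return struct.pack("<13f", *values)

def unpack_body_state(payload):
    """Inverse of pack_body_state."""
    v = struct.unpack("<13f", payload)
    return v[0:3], v[3:7], v[7:10], v[10:13]

def checksum(payload):
    """CRC32 over the raw bytes; any stable hash would do."""
    return zlib.crc32(payload)

# Host side: serialize and checksum before sending.
state = ((1.5, 0.25, -3.0), (0.0, 0.0, 0.0, 1.0), (0.1, 0.0, 2.0), (0.0, 0.3, 0.0))
sent = pack_body_state(*state)
sent_crc = checksum(sent)

# Client side: deserialize, re-serialize, and compare checksums.
received = bytes(sent)  # stands in for the network transfer
restored = unpack_body_state(received)
round_trip_ok = checksum(pack_body_state(*restored)) == sent_crc
```

A matching checksum only shows that the bytes survived the trip and the conversions are lossless. It says nothing about whether both simulations were in the same internal state when the values were applied.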
-
Determinism requires that all interactions with the library happen in exactly the same order. Snapshotting body poses captures part of the state, but there's quite a bit more. You've noted the constraint state, but there's also any bookkeeping that may result in a different constraint order, since the solver result is order sensitive. This would include the content of the narrow phase's […]
Collecting that information is doable and not conceptually difficult, but... implementing it is tedious. It's been on the todo list for a while since it's pretty useful for a lot of stuff (like debug snapshots), but it's not there right now. (I wouldn't recommend waiting on me to implement it; I'm going to have very limited time until, like, July, and there are a bunch of things on the todo list.)
There are more specialized alternatives, but they get more conceptually complicated. Collision detection's per-pair results aren't sensitive to ordering, so you could in principle do something that guarantees only constraint equivalence.
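To make the order sensitivity concrete, here is a toy Python sketch. It is not bepuphysics's solver; the relaxation rule, the `solve` function, and the two constraints "A" and "B" are invented for illustration. Two constraints act on one shared value and are applied sequentially, so the result depends on which one runs last.

```python
def solve(order, targets, iterations=4, start=0.0):
    """Sequentially relax a single shared value toward each constraint's target.
    Each constraint moves the value halfway to its own target, so later
    constraints partially undo the work of earlier ones."""
    v = start
    for _ in range(iterations):
        for name in order:
            v += 0.5 * (targets[name] - v)
    return v

# Two conflicting constraints on the same value.
targets = {"A": 1.0, "B": -1.0}

result_ab = solve(["A", "B"], targets)  # B is applied last in every pass
result_ba = solve(["B", "A"], targets)  # A is applied last in every pass
```

With identical inputs and an identical iteration count, the two orders settle on different values. A snapshot that restores poses and velocities but lets the constraint order differ can therefore still diverge.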
-
Hello, thank you for making bepuphysics. I've been using it on and off for years for various things I do for fun. For the last 2 weeks I've been making a demo that basically amounts to a mini Rocket League car controller. I'm using `2.5.0-beta.23`.
To describe my scene, each car is just a box with 4 spheres below it, all attached to the box with springs (`LinearAxisServo` and `PointOnLineServo`) but at different offsets. Every car I spawn is identical, with no randomness in the construction or movement. I only have a "go forward" and a "spin in place" control to keep things simple for testing. Go forward applies a forward velocity impulse every tick, while spinning applies an angular one and lets the car do 360s around its center without moving at all (and the spheres below support it).
Into the meat of the issue: I'm using the same networking technique that Overwatch/Rocket League use, which is a form of "state synchronization". I'm very familiar with it since I was a hacker for Rocket League and worked closely with the devs to fix issues. I'll describe it briefly since it's really easy to understand, only difficult to implement.
The client will always run ahead of the server. For our sake right now, we will assume that's always the case since we're just trying to understand the technique. The client and server are essentially in perfect sync until an action happens. For example, let's say the client is on tick 4 and the server is on tick 1. Once the server completes tick 1, it sends the state to the client, which will be on tick 5 when it receives it (again, assuming perfect network stability for now). Once the client receives it, it checks its own snapshot of tick 1 and compares the results. If there is a difference, we snap everything back to where it was on tick 1 and then step 4 ticks until we're back at tick 5 (saving the new snapshots as we go). Now we're in perfect sync with the server again, assuming the simulation is deterministic. Here is a visual explanation from Psyonix of what I just described (timestamped 37:25 to 43:45): https://youtu.be/ueEmiDM94IE?feature=shared&t=2245
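The rollback-and-resimulate loop described above can be sketched as follows. This is a toy Python illustration with a made-up one-dimensional `step` function standing in for the physics tick; `PredictingClient` and its members are hypothetical names, and nothing here is bepuphysics or Godot API.

```python
def step(state, forward_input, dt=1.0 / 60.0):
    """Toy deterministic tick: state is (position, velocity)."""
    position, velocity = state
    if forward_input:
        velocity += 1.0          # "go forward" impulse
    velocity *= 0.5              # simple damping
    position += velocity * dt
    return (position, velocity)

class PredictingClient:
    def __init__(self, initial_state):
        self.tick = 0
        self.state = initial_state
        self.snapshots = {0: initial_state}  # state at the start of each tick
        self.inputs = {}                     # input used to step from tick t to t + 1

    def advance(self, forward_input):
        """Predict one tick ahead and remember both the input and the result."""
        self.inputs[self.tick] = forward_input
        self.state = step(self.state, forward_input)
        self.tick += 1
        self.snapshots[self.tick] = self.state

    def reconcile(self, server_tick, server_state):
        """Compare with the server's state; on mismatch, snap back and replay."""
        if self.snapshots[server_tick] == server_state:
            return False
        self.state = server_state
        self.snapshots[server_tick] = server_state
        for t in range(server_tick, self.tick):
            self.state = step(self.state, self.inputs[t])
            self.snapshots[t + 1] = self.state
        return True

# The server simulates tick 0 -> 1 from the true initial state.
initial = (0.0, 0.0)
server_tick_1 = step(initial, True)

# The client starts slightly wrong, runs ahead to tick 5, then gets corrected.
client = PredictingClient((0.0, 0.001))
for _ in range(5):
    client.advance(True)
corrected = client.reconcile(1, server_tick_1)

# Reference: what the authoritative simulation reaches at tick 5.
expected = initial
for _ in range(5):
    expected = step(expected, True)
```

In this toy, the corrected client lands exactly on the authoritative state because `step` depends only on the snapshot and the input. A real engine has additional internal state, so restoring only poses and velocities may not be enough to get the same guarantee.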
I have implemented this technique and it works great, except I'm finding that it isn't deterministic. Once things start moving, even when inputs are not changing (for example, holding "go forward" or holding nothing at all), the host and client disagree very slightly about where the car and wheels are and what their velocities are. This happens both when I do an exact comparison of the vectors/quaternions and when I do an approximate comparison with a custom epsilon. As I understand it, though, even an exact comparison should pass here.
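For reference, the two kinds of comparison mentioned above might be sketched like this in Python. The field names, the epsilon value, and the report format are assumptions for illustration only.

```python
import math
import struct

def bits_equal(a, b):
    """Exact comparison: identical float32 bit patterns.
    Note that this treats 0.0 and -0.0 as different."""
    return struct.pack("<f", a) == struct.pack("<f", b)

def approx_equal(a, b, epsilon=1e-5):
    """Approximate comparison with an absolute tolerance."""
    return math.fabs(a - b) <= epsilon

def diff_fields(host, client, epsilon=1e-5):
    """List every component whose bits differ, and whether it is still
    within epsilon: (field, index, host value, client value, within_epsilon)."""
    report = []
    for field in host:
        for i, (h, c) in enumerate(zip(host[field], client[field])):
            if not bits_equal(h, c):
                report.append((field, i, h, c, approx_equal(h, c, epsilon)))
    return report

host = {"position": (1.0, 2.0, 3.0), "linear_velocity": (0.5, 0.0, 0.0)}
client = {"position": (1.0, 2.0, 3.0), "linear_velocity": (0.5000001, 0.0, 0.0)}
report = diff_fields(host, client)
```

A report like this separates bit-level mismatches from differences large enough to exceed the tolerance, which helps show which fields drift first.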
For my demo, I'm running the simulation with no thread dispatcher. I also made sure I'm adding the ground and the cars (and by extension, the springs and spheres) in the same order. I even disallow everything from sleeping. The only difference is WHEN I spawn the client's car, since it doesn't connect instantly. But for some reason, the results are not 100% the same.
The way I create a snapshot and "snap" a car/wheel to a snapshot is like this:
The `ToVector3()` and `ToQuaternion()` overloads just convert between a `System.Numerics.Vector3`/quaternion and a `Godot.Vector3`/quaternion:
I doubt the compiler is truncating floats here. EDIT: I have confirmed that there is no truncation happening with these casts and no truncation happening with reads/writes to packet bytes.
I also ensured that my serialization for each snapshot is lossless and sent/received in the same order over the network.
I am even using the same .exe on the same machine so it should really be deterministic at this point. From my understanding, the timing shouldn't really matter if they're not colliding at the start. But even if I spawn the client's car at different times for the host/client, we would still need to have determinism after performing corrections, and I believe it's possible with bepuphysics.
So my only theory right now is that I am not actually applying snapshots correctly to bepuphysics.
My theories for why I'm doing it incorrectly are:
Any help is appreciated as I struggle to hunt down the culprit. I just need to know where to look to solve this problem.
If the problem happens to be on my own side with Godot or the technique, I'll post a reply.
I'd still like to know your opinion on whether snapshotting this way is completely safe. Thanks again :)