Our Game Was Broken For a Year 🔦
In this post, we will cover a bug that broke our game, how we found it, how we fixed it, and how we improved our testing practices so that it does not happen again.
A Bit of Context
If you are new here, you should know that lighting is a huge part of Project Ghost, as up to four players explore and loot dark, haunted mansions with nothing but their flashlights. Of course, we’ve dedicated a lot of time to making this mechanic feel good. However, we realized that the system was broken and had been for longer than we want to admit.
Furthermore, implementing multiplayer adds a lot of complexity to a game, and with complexity come bugs. You may have guessed it by now, but our game-breaking bug was in our multiplayer implementation. Therefore, we will give you an overview of our multiplayer architecture and how it works.
When players spawn into a game, our system selects one of them to be the host, and the host is in charge of the state of the world. In technical jargon, we would say that the host has state authority. Other players, which we call clients, connect to the host and receive updates about the world’s state through that host. However, anything that requires instant feedback, such as movement or (as we’ll get to soon) the direction of your flashlight’s light, gets applied locally and propagated to the state authority (the host).
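To make that flow concrete, here is a minimal, engine-agnostic sketch in Python. It is not Fusion code: the `Host` and `Client` classes and their method names are hypothetical, and the host is modelled as a separate object rather than as one of the players, purely to keep the example short.

```python
class Host:
    """State authority: owns the world state and pushes updates to clients."""

    def __init__(self):
        self.world = {}
        self.clients = []

    def submit(self, key, value):
        # Record the change, then replicate it to every connected client.
        self.world[key] = value
        for client in self.clients:
            client.on_update(key, value)


class Client:
    """A connected player who sees the world through the host's updates."""

    def __init__(self, name, host):
        self.name = name
        self.host = host
        self.world_view = {}
        host.clients.append(self)

    def on_update(self, key, value):
        self.world_view[key] = value

    def aim_flashlight(self, direction):
        key = ("flashlight_direction", self.name)
        # Apply locally first so the player gets instant feedback...
        self.world_view[key] = direction
        # ...then propagate the value to the state authority.
        self.host.submit(key, direction)


host = Host()
alice = Client("alice", host)
bob = Client("bob", host)
alice.aim_flashlight((0.0, 0.0, 1.0))
```

After the call, `bob.world_view` holds Alice’s flashlight direction, and it got there through the host rather than directly from Alice.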
What the Bug Is
After a year of development, we noticed that the direction the flashlight was pointing was not being replicated across clients properly. In other words, the client would see its flashlight in the right place, but the host would replicate it incorrectly to the other clients. This became obvious when we implemented light collision detection: although a client would see their light shining on an interactable, nothing would happen, because the state authority had placed the light somewhere else.
How We Identified It
At first, it only seemed like a small rounding error, perhaps some loss of precision in the decimals being passed through the network. But after many hours of troubleshooting, we noticed that if the client stood still without moving their light and the host moved around, the replicated light of the client on the host’s screen would also move. This meant the replicated light was somehow factoring in the host’s position. See the gif above for a visual representation of the bug.
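The symptom can be reproduced with a toy model. In the Python sketch below, we assume the light’s world position is simply a camera position plus an aim offset. That formula and the function names are simplifications made up for this illustration, not our actual lighting code.

```python
def light_world_position(camera_position, aim_offset):
    """Toy model: the lit point is the camera position plus an aim offset."""
    return tuple(c + o for c, o in zip(camera_position, aim_offset))


def buggy_host_replication(host_camera, client_aim_offset):
    # The bug: the host rebuilds the light position from ITS OWN camera
    # instead of the client's.
    return light_world_position(host_camera, client_aim_offset)


client_camera = (0.0, 1.5, 0.0)
aim_offset = (0.0, 0.0, 5.0)  # the client never moves or re-aims

# What the client sees on their own screen.
client_view = light_world_position(client_camera, aim_offset)

# What the host shows for that client as the host walks around.
host_views = [
    buggy_host_replication(host_camera, aim_offset)
    for host_camera in [(2.0, 1.5, 0.0), (4.0, 1.5, 0.0)]
]
```

`client_view` never changes, while every entry in `host_views` is different: the replicated light follows the host, which is exactly what we saw on screen.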
How to Catch It Sooner Next Time
First, the lighting system was the very first feature we implemented, almost a year ago, and we have learned a lot since then. This means that we made false assumptions in the initial implementation. This is partly due to poor documentation from Photon Fusion and the fact that it was a new architecture with a small support community.
Second, the bug only became obvious once we implemented light collision detection. Until then, the discrepancies were quite subtle. Therefore, all testers will now stream their game and watch each other’s streams to catch any subtle issues when testing. With this, we want everyone on the team to take ownership of testing features.
Another factor that made this bug hard to notice is that the behaviour becomes more noticeable the further the host is from the client. However, past a certain distance, the client’s light is out of camera view on the host’s screen, which makes it hard to diagnose. Furthermore, the client does not see the issue on their screen. It only occurs on the host’s screen. Therefore, we’ve added a step to our testing template to make sure that we cross-reference the state authority (host) and input authority (client) at different distances.
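The same cross-reference could in principle be automated. The Python sketch below is only an illustration of the idea: the `find_mismatches` helper, the tolerance, and the sample positions are all invented for this example and are not taken from our test suite.

```python
import math


def find_mismatches(samples, tolerance=0.01):
    """Return the distances at which host and client disagree on the light.

    Each sample is (host_to_client_distance, client_light_pos, host_light_pos).
    """
    return [
        distance
        for distance, client_pos, host_pos in samples
        if math.dist(client_pos, host_pos) > tolerance
    ]


# Made-up readings taken with the host standing at three distances.
samples = [
    (1.0, (0.0, 1.5, 5.0), (0.0, 1.5, 5.0)),
    (5.0, (0.0, 1.5, 5.0), (0.4, 1.5, 5.0)),
    (10.0, (0.0, 1.5, 5.0), (0.9, 1.5, 5.0)),
]
mismatches = find_mismatches(samples)
```

Comparing positions as numbers means a mismatch still gets flagged when the light is off-screen for the host.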
How to Fix It
To figure out the fix, we had to dig into the Fusion documentation to determine how to network a player’s commands without lag. Although it is a fairly standard operation, we got it wrong when transferring our code base from PUN (Photon’s previous network package) to Fusion.
To reiterate, clients were updating their light position locally, but when that position reached the host, the host recalculated it using its own camera instead of relying only on the client’s. Since the two players have different camera positions, the recalculation yielded a different light position. The fix was to apply the client’s inputs to their own game immediately, pass the resulting position on to the host, and make sure the host does not recalculate it. The host can then replicate the position to all the other clients without factoring in its own position. This is the code we used in Unity:
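Independently of Unity and Fusion, the idea behind the fix can be sketched in a few lines of Python. This is an illustration rather than our production code: the classes, the method names, and the "camera plus aim offset" formula are all assumptions made to keep the example small.

```python
def light_world_position(camera_position, aim_offset):
    """Toy model: the lit point is the camera position plus an aim offset."""
    return tuple(c + o for c, o in zip(camera_position, aim_offset))


class Client:
    def __init__(self, player_id, camera_position):
        self.player_id = player_id
        self.camera_position = camera_position
        self.light_position = None

    def aim(self, aim_offset, host):
        # 1. Apply the input locally, using the client's own camera.
        self.light_position = light_world_position(self.camera_position, aim_offset)
        # 2. Send the RESULT (not the raw input) to the state authority.
        host.receive_light_position(self.player_id, self.light_position)


class Host:
    def __init__(self, camera_position):
        self.camera_position = camera_position
        self.replicated_lights = {}

    def receive_light_position(self, player_id, position):
        # Store the value as-is. The host's camera is never consulted.
        self.replicated_lights[player_id] = position


host = Host(camera_position=(2.0, 1.5, 0.0))
client = Client("client_1", camera_position=(0.0, 1.5, 0.0))
client.aim((0.0, 0.0, 5.0), host)

before = host.replicated_lights["client_1"]
host.camera_position = (4.0, 1.5, 0.0)  # the host walks away
after = host.replicated_lights["client_1"]
```

Because the host only stores what the client computed, `before` and `after` are identical and both match `client.light_position`, no matter where the host stands.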
Further Reading
To find our fix and implement it in time for the end of our prototyping phase, we relied on Fusion’s documentation, especially this example project: Tanknarok. We encourage you to read more if you are interested!