My recent design changes towards networked multi-player mean I need to write some networking code. I wanted to learn as much as possible and keep total control over my game design, so I started from the ground up, using only what’s provided in the .NET API. I decided to use a standard TCP connection for prototyping, so as to avoid dealing with packet loss and connection instability at first. Although I aced my college intro-to-networking class and have written some odd network code here and there since then, I still encountered a number of issues which needed a bit of thought and experimentation to resolve.
I also studied a number of other proven FPS game networking architectures: Quake, Tribes, Halo, Unreal. Quake3 appears to be unique in that it tries to transmit the entire world state to clients in an atomic unit, while the others focus on updating various objects in the world asynchronously, which allows them to prioritize important data for high-frequency updates and de-prioritize unimportant data to save bandwidth. They vary in their data reliability promises, but all of them make strong concessions for unreliable data delivery and a few require some reliable data. They all share a common client/server architecture, where an authoritative server sends out the “official” world state and processes requests from clients but is free to ignore them if deemed invalid. This design concentrates the bandwidth requirement on the server for high player counts, versus decentralized peer-to-peer where all peers require reasonably high bandwidth. It also provides an authoritative game state to keep things in sync for fast-paced gameplay, without needing to consult all other peers to see if the current state is OK.
- Authoritative host, to allow fast processing, numerous connected clients, and dedicated servers
- Classification of network message types into reliability guarantees
  - Unreliable:
    - Object state replication
      - Full update sends all data fields in the object, including type info which allows the client to spawn a local copy
      - Delta update taken on a per-variable basis from previous send, not last client acknowledgement
      - Client will request a full update if it sees a delta update for an unknown object
    - Effects, one-time events which are non-vital to a client (e.g. explosions)
    - Client Requests, one-time requests from the client which are time-sensitive but not vital (e.g. fire weapon, pick up item)
  - Reliable:
    - Host Events, one-time events which are vital to a client (e.g. player spawn/death, object deletion, etc.)
    - Client Commands, one-time requests from the client which are not time-sensitive but shouldn’t be repeated (e.g. respawn player)
- Object relevance:
  - Binary decision evaluated on a per-object, per-player basis
  - Accelerated using spatial partitioning to find “nearby” objects for each player
  - Freshly relevant objects must be replicated in full; previously relevant objects can replicate only deltas in state
- Object prioritization:
  - Avoid sending updates for all relevant objects in every packet (they wouldn’t fit anyway)
  - Important objects get updated frequently, unimportant ones are updated infrequently but eventually
  - Importance evaluated
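
To make the full/delta replication scheme in the list above more concrete, here is a minimal sketch. It is written in Python for brevity rather than the C# of the actual project, and everything in it is an assumption for illustration: the `ReplicatedObject` and `ClientWorld` classes, the field set, the type ID, and the packet layout are all hypothetical and not taken from my real code.

```python
import struct

FULL, DELTA = 0, 1                 # hypothetical packet kinds
FIELDS = ("x", "y", "health")      # hypothetical replicated fields


class ReplicatedObject:
    """Host-side object that remembers what it last sent."""
    TYPE_ID = 7  # hypothetical type ID, lets the client spawn the right type

    def __init__(self, object_id, x=0.0, y=0.0, health=100.0):
        self.object_id = object_id
        self.x, self.y, self.health = x, y, health
        self._last_sent = {}  # field values as of the previous send

    def encode_full(self):
        # Full update: type info plus every field.
        values = [getattr(self, name) for name in FIELDS]
        self._last_sent = dict(zip(FIELDS, values))
        return struct.pack("<BIH3f", FULL, self.object_id, self.TYPE_ID, *values)

    def encode_delta(self):
        # Delta update: a bitmask of changed fields, then only those values.
        # Changes are measured against the previous send, not against an ack.
        mask, payload = 0, b""
        for bit, name in enumerate(FIELDS):
            value = getattr(self, name)
            if self._last_sent.get(name) != value:
                mask |= 1 << bit
                payload += struct.pack("<f", value)
                self._last_sent[name] = value
        return struct.pack("<BIB", DELTA, self.object_id, mask) + payload


class ClientWorld:
    """Client-side store of replicated objects."""

    def __init__(self):
        self.objects = {}                # object_id -> field dict
        self.full_update_requests = []   # IDs to request from the host

    def receive(self, packet):
        kind, object_id = struct.unpack_from("<BI", packet, 0)
        offset = 5
        if kind == FULL:
            type_id, *values = struct.unpack_from("<H3f", packet, offset)
            self.objects[object_id] = {"type_id": type_id, **dict(zip(FIELDS, values))}
            return
        if object_id not in self.objects:
            # Delta for an unknown object: ask the host for a full update.
            self.full_update_requests.append(object_id)
            return
        (mask,) = struct.unpack_from("<B", packet, offset)
        offset += 1
        for bit, name in enumerate(FIELDS):
            if mask & (1 << bit):
                (value,) = struct.unpack_from("<f", packet, offset)
                offset += 4
                self.objects[object_id][name] = value
```

Because deltas are taken from the previous send rather than the last acknowledgement, the host needs no per-client ack bookkeeping. The cost is that a dropped delta leaves the client stale on that field until it changes again or a full update arrives.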
Some of the issues I encountered and spent time on:
- Connection handshake: connecting to a socket makes no guarantee that the application on the other side is the correct one. My handshake sends an assembly build number to ensure that the client and server are running the same code. Clients are disconnected if they fail the handshake or cause trouble at any point.
- Connection ID: To keep track of who owns what and where to send prioritized information, a connection ID is generated for each new connection. This can be used to tag objects with the owning connection, so when that connection is dropped the appropriate items can be destroyed or otherwise handled.
- Object creation: To avoid centralizing all my code into giant switch statements, I make use of .NET Reflection to label types with IDs and register factory methods for each of them. This requires that the assemblies on the client and server match so that the reflection results correspond. It also allows for potential modularity for future expansion via add-on assemblies. This same mechanism is used for arbitrary client requests and host events (a player move request is implemented as a class with static methods EncodeEvent and DecodeEvent).
- Initial synchronization: A singular GameInfo object is created to centralize game logic, and a client is not “ready” until it has synched the GameInfo for the first time. Clients can then spawn PlayerController objects, which in turn can spawn Player objects. A PlayerController is used for local control and client input and its state is replicated only to the owning connection, while Player state is representative of the player in the world and is replicated to all clients. Objects are always spawned on the host and the client waits for them to synchronize before creating a local copy. The procedural level is not generated until the random seed is obtained via the GameInfo, and subsequent world modifications need to be synched before the client can present the level to the user; this could take some time on a map with heavy amounts of modification.
- Object lifetime: When a connection on the server is dropped due to error or a player quitting, the relevant player objects need to be cleaned up. The connection has a list of associated PlayerController objects which have been created by that client, which provides a starting point for destroying objects on the host. Object destruction is reliably communicated to clients, so they don’t end up with ghosts floating all around.
- Serialization mismatches: Copy/paste mistakes and incomplete refactoring left the read and write code out of step with each other. I refactored my serialization utility functions to operate in both read and write mode, so the same code will write and read the same data in the same order.
- Host-client internal transfer: My design has the host also acting as a client, so the two aspects can share objects rather than duplicating all objects on the host machine. It was originally connecting back to itself via a socket, but that was causing issues with initialization. I added functionality to transfer packets internally via memory instead of sockets and things got much better. This also means single-player mode can use the same code paths as multi-player without needing to actually connect with any sockets.
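
The read/write serialization idea from the list above is easiest to see in code. The following is only a sketch in Python (the real project is C#), and the `Archive` and `PlayerState` names, the field choices, and the wire format are all made up for illustration. The point is that each object has a single `serialize` method, and the archive's mode decides whether each call writes the value or reads it back.

```python
import io
import struct


class Archive:
    """Serializer that either reads or writes, depending on how it is built."""

    def __init__(self, data=None):
        # Passing bytes puts the archive in read mode; otherwise it writes.
        self.reading = data is not None
        self._stream = io.BytesIO(data if self.reading else b"")

    def _field(self, fmt, value):
        if self.reading:
            size = struct.calcsize(fmt)
            (value,) = struct.unpack(fmt, self._stream.read(size))
        else:
            self._stream.write(struct.pack(fmt, value))
        return value

    def int32(self, value=0):
        return self._field("<i", value)

    def float32(self, value=0.0):
        return self._field("<f", value)

    def string(self, value=""):
        raw = value.encode("utf-8")
        # The length prefix goes through the same read-or-write path.
        length = self._field("<H", len(raw))
        if self.reading:
            return self._stream.read(length).decode("utf-8")
        self._stream.write(raw)
        return value

    def getvalue(self):
        return self._stream.getvalue()


class PlayerState:
    """Hypothetical object with one serialize method for both directions."""

    def __init__(self, name="", health=0, x=0.0):
        self.name, self.health, self.x = name, health, x

    def serialize(self, ar):
        # Field order is defined exactly once, so read and write cannot drift.
        self.name = ar.string(self.name)
        self.health = ar.int32(self.health)
        self.x = ar.float32(self.x)
```

A round trip then looks like this:

```python
writer = Archive()
PlayerState("alice", 75, 12.5).serialize(writer)

copy = PlayerState()
copy.serialize(Archive(writer.getvalue()))
```

Adding or reordering a field only touches one line, which removes the class of bug where the reader and writer are edited separately.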
Upcoming problems which I anticipate consuming significant time:
- Moving from TCP to reliable UDP, which is an essential change to provide a high quality experience, and also because this is supported by the XNA networking system on Xbox. I plan to use the Lidgren library on Windows.
- Adding client prediction to avoid jerky motion of players and other objects.
- Regulating the bandwidth used by each connection, and the rate at which network data is generated (currently a full update 1 or 2 times per frame at 60 fps, way too much).
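
I have not settled on how to regulate bandwidth yet, but one approach I am considering is a per-connection byte budget combined with accumulated priorities. The sketch below is in Python rather than C#, and the `ConnectionScheduler` class, its parameters, and the importance values are hypothetical. Each object gains priority every send in proportion to its importance, the highest accumulated priorities are sent until the budget runs out, and a sent object's priority resets to zero.

```python
class ConnectionScheduler:
    """Chooses which object updates fit into each send for one connection."""

    def __init__(self, bytes_per_second, sends_per_second):
        # Fixed byte budget available to each outgoing send.
        self.budget_per_send = bytes_per_second // sends_per_second
        self._accumulated = {}  # object_id -> priority built up since last send

    def choose_updates(self, candidates):
        """candidates: list of (object_id, importance, size_in_bytes)."""
        # Every relevant object gains priority in proportion to its importance.
        for object_id, importance, _ in candidates:
            self._accumulated[object_id] = (
                self._accumulated.get(object_id, 0.0) + importance
            )

        # Consider the objects with the most accumulated priority first.
        ordered = sorted(
            candidates, key=lambda c: self._accumulated[c[0]], reverse=True
        )

        chosen, remaining = [], self.budget_per_send
        for object_id, _, size in ordered:
            if size <= remaining:
                chosen.append(object_id)
                remaining -= size
                # Sent objects start accumulating again from zero.
                self._accumulated[object_id] = 0.0
        return chosen
```

With a budget that fits only one update per send, an object of importance 1.0 and another of importance 0.3 would alternate roughly three to one: the low-importance object keeps accumulating until it overtakes the other, so it is updated infrequently but eventually.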
Resources: