介绍
其实不用继承 UObject 也可以生成一个 actor,
关键的点就是拿到 UWorld ,
所以只要从一个拥有 UWorld 的虚幻相关实例中传递它的 UWorld 给一个原生 C++ 类也可以.
示例代码
. . .
这篇文章网上找不到翻译, 我也没时间详细翻译, 大概总结一下吧.
本篇主要讲了把本系列之前五篇文章的技术应用到实战中处理客户端与服务器的的连接.
请看总结, 不明之处再看文中具体讲解.
First up we have the client state machine.
The client is in one of three states:
The goal is to create an abstraction on top of a UDP socket where our server presents a number of virtual slots for clients to connect to.
When a client requests a connection, it gets assigned to one of these slots.
If a client requests connection, but no slots are available, the server is full and the connection request is denied.
On the server, we have the following data structure:
1 | const int MaxClients = 64; |
这些问题需要用授权系统和加密系统来解决.
we no longer accept client connections immediately on connection request, instead we send back a challenge packet, and only complete connection when a client replies with information that can only be obtained by receiving the challenge packet.
为了防止攻击者利用我们的服务器当做DDos放大攻击的工具, 我们让客户端发的包比服务器发的包要大些.
We’ll add some unique random identifiers, or ‘salts’, to make each client connection unique from previous ones coming from the same IP address and port.
一旦彼此连接上之后, 就用 client salt 和 server salt 的异或值来标识彼此.
以上的防御措施让我们的服务器做到了 no longer able to be used as port of DDoS amplification attacks, and with a trivial xor based authentication, 但对于一个经验丰富的会抓包分析的攻击者来说, 还存在以下问题 :
To solve this, we need to get serious with cryptography to encrypt and sign packets so they can’t be read or modified by a third party.
原文标题 : Client Server Connection (How to create a client/server connection over UDP)
Hi, I’m Glenn Fiedler and welcome to Building a Game Network Protocol.
So far in this article series we’ve discussed how games read and write packets, how to unify packet read and write into a single function, how to fragment and re-assemble packets, and how to send large blocks of data over UDP.
Now in this article we’re going to bring everything together and build a client/server connection on top of UDP.
Developers from a web background often wonder why games go to such effort to build a client/server connection on top of UDP, when for many applications, TCP is good enough.
These days even web servers are transitioning to UDP via Google’s QUIC. If you still think TCP is good enough for time critical data in 2016, I encourage you to put that in your pipe and smoke it :)
The reason is that games send time critical data.
Why don’t games use TCP for time critical data? The answer is that TCP delivers data reliably and in-order, and to do this on top of IP (which is unreliable, unordered) it holds more recent packets hostage in a queue while older packets are resent over the network.
This is known as head of line blocking and it’s a huuuuuge problem for games. To understand why, consider a game server broadcasting the state of the world to clients 10 times per-second. Each client advances time forward and wants to display the most recent state it receives from the server.
But if the packet containing state for time t = 10.0 is lost, under TCP we must wait for it to be resent before we can access t = 10.1 and 10.2, even though those packets have already arrived and contain the state the client wants to display.
Worse still, by the time the resent packet arrives, it’s far too late for the client to actually do anything useful with it. The client has already advanced past 10.0 and wants to display something around 10.3 or 10.4!
So why resend dropped packets at all? BINGO! What we’d really like is an option to tell TCP: “Hey, I don’t care about old packets being resent, by they time they arrive I can’t use them anyway, so just let me skip over them and access the most recent data”.
Unfortunately, TCP simply does not give us this option :(
All data must be delivered reliably and in-order.
This creates terrible problems for time critical data where packet loss and latency exist. Situations like, you know, The Internet, where people play FPS games.
Large hitches corresponding to multiples of round trip time are added to the stream of data as TCP waits for dropped packets to be resent, which means additional buffering to smooth out these hitches, or long pauses where the game freezes and is non-responsive.
Neither option is acceptable for first person shooters, which is why virtually all first person shooters are networked using UDP. UDP doesn’t provide any reliability or ordering, so protocols built on top it can access the most recent data without waiting for lost packets to be resent, implementing whatever reliability they need in radically different ways to TCP.
But, using UDP comes at a cost:
UDP doesn’t provide any concept of connection.
We have to build that ourselves. This is a lot of work! So strap in, get ready, because we’re going to build it all up from scratch using the same basic techniques first person shooters use when creating their protocols over UDP. You can use this client/server protocol for games or non-gaming applications and, provided the data you send is time critical, I promise you, it’s well worth the effort.
The goal is to create an abstraction on top of a UDP socket where our server presents a number of virtual slots for clients to connect to:
When a client requests a connection, it gets assigned to one of these slots:
If a client requests connection, but no slots are available, the server is full and the connection request is denied:
Once a client is connected, packets are exchanged in both directions. These packets form the basis for the custom protocol between the client and server which is game specific.
In a first person shooter, packets are sent continuously in both directions. Clients send input to the server as quickly as possible, often 30 or 60 times per-second, and the server broadcasts the state of the world to clients 10, 20 or even 60 times per-second.
Because of this steady flow of packets in both directions there is no need for keep-alive packets. If at any point packets stop being received from the other side, the connection simply times out. No packets for 5 seconds is a good timeout value in my opinion, but you can be more aggressive if you want.
When a client slot times out on the server, it becomes available for other clients to connect. When the client times out, it transitions to an error state.
Let’s get started with the implementation of a simple protocol. It’s a bit basic and more than a bit naive, but it’s a good starting point and we’ll build on it during the rest of this article, and the next few articles in this series.
First up we have the client state machine.
The client is in one of three states:
Initially the client starts in disconnected.
When a client connects to a server, it transitions to the connecting state and sends connection request packets to the server:
The CRC32 and implicit protocol id in the packet header allow the server to trivially reject UDP packets not belonging to this protocol or from a different version of it.
Since connection request packets are sent over UDP, they may be lost, received out of order or in duplicate.
Because of this we do two things: 1) we keep sending packets for the client state until we get a response from the server or the client times out, and 2) on both client and server we ignore any packets that don’t correspond to what we are expecting, since a lot of redundant packets are flying over the network.
On the server, we have the following data structure:
const int MaxClients = 64;
class Server
{
int m_maxClients;
int m_numConnectedClients;
bool m_clientConnected[MaxClients];
Address m_clientAddress[MaxClients];
};
Which lets the server lookup a free slot for a client to join (if any are free):
int Server::FindFreeClientIndex() const
{
for ( int i = 0; i < m_maxClients; ++i )
{
if ( !m_clientConnected[i] )
return i;
}
return -1;
}
Find the client index corresponding to an IP address and port:
int Server::FindExistingClientIndex( const Address & address ) const
{
for ( int i = 0; i < m_maxClients; ++i )
{
if ( m_clientConnected[i] && m_clientAddress[i] == address )
return i;
}
return -1;
}
Check if a client is connected to a given slot:
bool Server::IsClientConnected( int clientIndex ) const
{
return m_clientConnected[clientIndex];
}
… and retrieve a client’s IP address and port by client index:
const Address & Server::GetClientAddress( int clientIndex ) const
{
return m_clientAddress[clientIndex];
}
Using these queries we implement the following logic when the server processes a connection request packet:
If the server is full, reply with connection denied.
If the connection request is from a new client and we have a slot free, assign the client to a free slot and respond with connection accepted.
If the sender corresponds to the address of a client that is already connected, also reply with connection accepted. This is necessary because the first response packet may not have gotten through due to packet loss. If we don’t resend this response, the client gets stuck in the connecting state until it times out.
The connection accepted packet tells the client which client index it was assigned, which the client needs to know which player it is in the game:
Once the server sends a connection accepted packet, from its point of view it considers that client connected. As the server ticks forward, it watches connected client slots, and if no packets have been received from a client for 5 seconds, the slot times out and is reset, ready for another client to connect.
Back to the client. While the client is in the connecting state the client listens for connection denied and connection accepted packets from the server. Any other packets are ignored.
If the client receives connection accepted, it transitions to connected. If it receives connection denied, or after 5 seconds hasn’t received any response from the server, it transitions to disconnected.
Once the client hits connected it starts sending connection payload packets to the server. If no packets are received from the server in 5 seconds, the client times out and transitions to disconnected.
While this protocol is easy to implement, we can’t use a protocol like this in production. It’s way too naive. It simply has too many weaknesses to be taken seriously:
Spoofed packet source addresses can be used to redirect connection accepted responses to a target (victim) address. If the connection accepted packet is larger than the connection request packet, attackers can use this protocol as part of a DDoS amplification attack.
Spoofed packet source addresses can be used to trivially fill all client slots on a server by sending connection request packets from n different IP addresses, where n is the number of clients allowed per-server. This is a real problem for dedicated servers. Obviously you want to make sure that only real clients are filling slots on servers you are paying for.
An attacker can trivially fill all slots on a server by varying the client UDP port number on each client connection. This is because clients are considered unique on an address + port basis. This isn’t easy to fix because due to NAT (network address translation), different players behind the same router collapse to the same IP address with only the port being different, so we can’t just consider clients to be unique at the IP address level sans port.
Traffic between the client and server can be read and modified in transit by a third party. While the CRC32 protects against packet corruption, an attacker would simply recalculate the CRC32 to match the modified packet.
If an attacker knows the client and server IP addresses and ports, they can impersonate the client or server. This gives an attacker the power to completely a hijack a client’s connection and perform actions on their behalf.
Once a client is connected to a server there is no way for them to disconnect cleanly, they can only time out. This creates a delay before the server realizes a client has disconnected, or before a client realizes the server has shut down. It would be nice if both the client and server could indicate a clean disconnect, so the other side doesn’t need to wait for timeout in the common case.
Clean disconnection is usually implemented with a disconnect packet, however because an attacker can impersonate the client and server with spoofed packets, doing so would give the attacker the ability to disconnect a client from the server whenever they like, provided they know the client and server IP addresses and the structure of the disconnect packet.
If a client disconnects dirty and attempts to reconnect before their slot times out on the server, the server still thinks that client is connected and replies with connection accepted to handle packet loss. The client processes this response and thinks it’s connected to the server, but it’s actually in an undefined state.
While some of these problems require authentication and encryption before they can be fully solved, we can make some small steps forward to improve the protocol before we get to that. These changes are instructive.
The first thing we want to do is only allow clients to connect if they can prove they are actually at the IP address and port they say they are.
To do this, we no longer accept client connections immediately on connection request, instead we send back a challenge packet, and only complete connection when a client replies with information that can only be obtained by receiving the challenge packet.
The sequence of operations in a typical connect now looks like this:
To implement this we need an additional data structure on the server. Somewhere to store the challenge data for pending connections, so when a challenge response comes in from a client we can check against the corresponding entry in the data structure and make sure it’s a valid response to the challenge sent to that address.
While the pending connect data structure can be made larger than the maximum number of connected clients, it’s still ultimately finite and is therefore subject to attack. We’ll cover some defenses against this in the next article. But for the moment, be happy at least that attackers can’t progress to the connected state with spoofed packet source addresses.
Next, to guard against our protocol being used in a DDoS amplification attack, we’ll inflate client to server packets so they’re large relative to the response packet sent from the server. This means we add padding to both connection request and challenge response packets and enforce this padding on the server, ignoring any packets without it. Now our protocol effectively has DDoS minification for requests -> responses, making it highly unattractive for anyone thinking of launching this kind of attack.
Finally, we’ll do one last small thing to improve the robustness and security of the protocol. It’s not perfect, we need authentication and encryption for that, but it at least it ups the ante, requiring attackers to actually sniff traffic in order to impersonate the client or server. We’ll add some unique random identifiers, or ‘salts’, to make each client connection unique from previous ones coming from the same IP address and port.
The connection request packet now looks like this:
The client salt in the packet is a random 64 bit integer rolled each time the client starts a new connect. Connection requests are now uniquely identified by the IP address and port combined with this client salt value. This distinguishes packets from the current connection from any packets belonging to a previous connection, which makes connection and reconnection to the server much more robust.
Now when a connection request arrives and a pending connection entry can’t be found in the data structure (according to IP, port and client salt) the server rolls a server salt and stores it with the rest of the data for the pending connection before sending a challange packet back to the client. If a pending connection is found, the salt value stored in the data structure is used for the challenge. This way there is always a consistent pair of client and server salt values corresponding to each client session.
The client state machine has been expanded so connecting is replaced with two new states: sending connection request and sending challenge response, but it’s the same idea as before. Client states repeatedly send the packet corresponding to that state to the server while listening for the response that moves it forward to the next state, or back to an error state. If no response is received, the client times out and transitions to disconnected.
The challenge response sent from the client to the server looks like this:
The utility of this being that once the client and server have established connection, we prefix all payload packets with the xor of the client and server salt values and discard any packets with the incorrect salt values. This neatly filters out packets from previous sessions and requires an attacker to sniff packets in order to impersonate a client or server.
Now that we have at least a basic level of security, it’s not much, but at least it’s something, we can implement a disconnect packet:
And when the client or server want to disconnect clean, they simply fire 10 of these over the network to the other side, in the hope that some of them get through, and the other side disconnects cleanly instead of waiting for timeout.
We now have a much more robust protocol. It’s secure against spoofed IP packet headers. It’s no longer able to be used as port of DDoS amplification attacks, and with a trivial xor based authentication, we are protected against casual attackers while client reconnects are much more robust.
But it’s still vulnerable to a sophisticated actors who can sniff packets:
This attacker can read and modify packets in flight.
This breaks the trivial identification based around salt values…
… giving an attacker the power to disconnect any client at will.
To solve this, we need to get serious with cryptography to encrypt and sign packets so they can’t be read or modified by a third party.
本篇主要讲了数据包的分包和重组问题, 到底数据包多大才好呢?是不是越大越好呢?包太大了怎么办呢?
请看总结, 不明之处再看文中具体讲解.
网络协议在动作游戏类型(FPS)中的典型特征就是一个持续发送的数据包,在两个方向上以稳定的速度如20或30包每秒发送。这些数据包都包含有不可靠的无序数据例如t时间内的世界状态;所以,当一个数据包丢失,重新发送它并不是特别有用。当重新发送的数据包到达时,时间t已经过去了。
所以这就是我们将要实现可靠性的现状。对于我们90%的数据包,仅丢弃并不再重新发送它会更好。对于10%或更少(误差允许范围内)的情况,我们确实需要可靠性,但这样的数据是非常罕见的,很少被发送而且比不可靠的数据的平均大小要小得多。这个使用案例适用于所有过去十五年来发布的AAA级的FPS游戏。
为实现数据包层级的应答,在每个包的前面添加如下的报头:
1 | struct Header |
这些报头元素组合起来以创建应答系统:
ack_bits 不仅是一个节省带宽的巧妙的编码,它同样也增加了信息冗余来抵御包的丢失。每个应答码要被发送32次。如果有一个包丢失了,仍然有其他31个包有着相同的应答码。从统计上来说,应答码还是非常有可能送达的。
但突发的传送数据包的丢失还是有可能发生的,所以重要的是要注意:
以我的经验,没有必要设计完善的应答机制。在一个极少丢应答码的系统上构建一个可靠性系统并不会增加什么大问题。
为实现这个应答系统,我们在发送方还需要一个数据结构来追踪一个数据包是否已经被应答,这样我们就可以忽略冗余的应答(每个包会通过 ack_bits
多次应答)。我们同样在接收方也还需要一个数据结构来追踪那些已经收到的包,这样我们就可以在数据包的报头填写ack_bits
的值。
1 | const int BufferSize = 1024; |
你在这可以看到的窍门是这个滚动的缓冲区是以序列号来作为索引的:
const int index =sequence % BufferSize;
当条目被顺序添加,就像一个被发送的队列,对插入所需要做的就是把这个序列缓冲区的值更新为新的序列号并且在该索引处重写这个数据:
1 | PacketData & InsertPacketData( uint16_t sequence ) |
原文标题 : Reliable Ordered Messages (How to implement reliable-ordered messages on top of UDP)
Hi, I’m Glenn Fiedler and welcome to Building a Game Network Protocol.
Many people will tell you that implementing your own reliable message system on top of UDP is foolish. After all, why reimplement TCP?
But why limit ourselves to how TCP works? But there are so many different ways to implement reliable-messages and most of them work nothing like TCP!
So let’s get creative and work out how we can implement a reliable message system that’s better and more flexible than TCP for real-time games.
A common approach to reliability in games is to have two packet types: reliable-ordered and unreliable. You’ll see this approach in many network libraries.
The basic idea is that the library resends reliable packets until they are received by the other side. This is the option that usually ends up looking a bit like TCP-lite for the reliable-packets. It’s not that bad, but you can do much better.
The way I prefer to think of it is that messages are smaller bitpacked elements that know how to serialize themselves. This makes the most sense when the overhead of length prefixing and padding bitpacked messages up to the next byte is undesirable (eg. lots of small messages included in each packet). Sent messages are placed in a queue and each time a packet is sent some of the messages in the send queue are included in the outgoing packet. This way there are no reliable packets that need to be resent. Reliable messages are simply included in outgoing packets until they are received.
The easiest way to do this is to include all unacked messages in each packet sent. It goes something like this: each message sent has an id that increments each time a message is sent. Each outgoing packet includes the start message id followed by the data for n messages. The receiver continually sends back the most recent received message id to the sender as an ack and only messages newer than the most recent acked message id are included in packets.
This is simple and easy to implement but if a large burst of packet loss occurs while you are sending messages you get a spike in packet size due to unacked messages.
You can avoid this by extending the system to have an upper bound on the number of messages included per-packet n. But now if you have a high packet send rate (like 60 packets per-second) you are sending the same message multiple times until you get an ack for that message.
If your round trip time is 100ms each message will be sent 6 times redundantly before being acked on average. Maybe you really need this amount of redundancy because your messages are extremely time critical, but in most cases, your bandwidth would be better spent on other things.
The approach I prefer combines packet level acks with a prioritization system that picks the n most important messages to include in each packet. This combines time critical delivery and the ability to send only n messages per-packet, while distributing sends across all messages in the send queue.
To implement packet level acks, we add the following packet header:
struct Header
{
uint16_t sequence;
uint16_t ack;
uint32_t ack_bits;
};
These header elements combine to create the ack system: sequence is a number that increases with each packet sent, ack is the most recent packet sequence number received, and ack_bits is a bitfield encoding the set of acked packets.
If bit n is set in ack_bits, then ack - n is acked. Not only is ack_bits a smart encoding that saves bandwidth, it also adds redundancy to combat packet loss. Each ack is sent 32 times. If one packet is lost, there’s 31 other packets with the same ack. Statistically speaking, acks are very likely to get through.
But bursts of packet loss do happen, so it’s important to note that:
If you receive an ack for packet n then that packet was definitely received.
If you don’t receive an ack, the packet was most likely not received. But, it might have been, and the ack just didn’t get through. This is extremely rare.
In my experience it’s not necessary to send perfect acks. Building a reliability system on top of a system that very rarely drops acks adds no significant problems. But it does create a challenge for testing this system works under all situations because of the edge cases when acks are dropped.
So please if you do implement this system yourself, setup a soak test with terrible network conditions to make sure your ack system is working correctly. You’ll find such a soak test in the example source code for this article, and the open source network libraries reliable.io and yojimbo which also implement this technique.
To implement this ack system we need a data structure on the sender side to track whether a packet has been acked so we can ignore redundant acks (each packet is acked multiple times via ack_bits. We also need a data structure on the receiver side to keep track of which packets have been received so we can fill in the ack_bits value in the packet header.
The data structure should have the following properties:
You might be thinking. Oh of course, hash table. But there’s a much simpler way:
const int BufferSize = 1024;
uint32_t sequence_buffer[BufferSize];
struct PacketData
{
bool acked;
};
PacketData packet_data[BufferSize];
PacketData GetPacketData( uint16_t sequence )
{
const int index = sequence % BufferSize;
if ( sequence_buffer[index] == sequence )
return &packet_data[index];
else
return NULL;
}
As you can see the trick here is a rolling buffer indexed by sequence number:
const int index = sequence % BufferSize;
This works because we don’t care about being destructive to old entries. As the sequence number increases older entries are naturally overwritten as we insert new ones. The sequence_buffer[index] value is used to test if the entry at that index actually corresponds to the sequence number you’re looking for. A sequence buffer value of 0xFFFFFFFF indicates an empty entry and naturally returns NULL for any sequence number query without an extra branch.
When entries are added in order like a send queue, all that needs to be done on insert is to update the sequence buffer value to the new sequence number and overwrite the data at that index:
PacketData & InsertPacketData( uint16_t sequence )
{
const int index = sequence % BufferSize;
sequence_buffer[index] = sequence;
return packet_data[index];
}
Unfortunately, on the receive side packets arrive out of order and some are lost. Under ridiculously high packet loss (99%) I’ve seen old sequence buffer entries stick around from before the previous sequence number wrap at 65535 and break my ack logic (leading to false acks and broken reliability where the sender thinks the other side has received something they haven’t…).
The solution to this problem is to walk between the previous highest insert sequence and the new insert sequence (if it is more recent) and clear those entries in the sequence buffer to 0xFFFFFFFF. Now in the common case, insert is very close to constant time, but worst case is linear where n is the number of sequence entries between the previous highest insert sequence and the current insert sequence.
Before we move on I would like to note that you can do much more with this data structure than just acks. For example, you could extend the per-packet data to include time sent:
struct PacketData
{
bool acked;
double send_time;
};
With this information you can create your own estimate of round trip time by comparing send time to current time when packets are acked and taking an exponentially smoothed moving average. You can even look at packets in the sent packet sequence buffer older than your RTT estimate (you should have received an ack for them by now…) to create your own packet loss estimate.
Now that we have the data structures and packet header, here is the algorithm for implementing packet level acks:
On packet send:
Insert an entry for for the current send packet sequence number in the sent packet sequence buffer with data indicating that it hasn’t been acked yet
Generate ack and ack_bits from the contents of the local received packet sequence buffer and the most recent received packet sequence number
Fill the packet header with sequence, ack and ack_bits
Send the packet and increment the send packet sequence number
On packet receive:
Read in sequence from the packet header
If sequence is more recent than the previous most recent received packet sequence number, update the most recent received packet sequence number
Insert an entry for this packet in the received packet sequence buffer
Decode the set of acked packet sequence numbers from ack and ack_bits in the packet header.
Iterate across all acked packet sequence numbers and for any packet that is not already acked call OnPacketAcked( uint16_t sequence ) and mark that packet as acked in the sent packet sequence buffer.
Importantly this algorithm is done on both sides so if you have a client and a server then each side of the connection runs the same logic, maintaining its own sequence number for sent packets, tracking most recent received packet sequence # from the other side and a sequence buffer of received packets from which it generates sequence, ack and ack_bits to send to the other side.
And that’s really all there is to it. Now you have a callback when a packet is received by the other side: OnPacketAcked. The main benefit of this ack system is now that you know which packets were received, you can build any reliability system you want on top. It’s not limited to just reliable-ordered messages. For example, you could use it to implement delta encoding on a per-object basis.
Messages are small objects (smaller than packet size, so that many will fit in a typical packet) that know how to serialize themselves. In my system they perform serialization using a unified serialize functionunified serialize function.
The serialize function is templated so you write it once and it handles read, write and measure.
Yes. Measure. One of my favorite tricks is to have a dummy stream class called MeasureStream that doesn’t do any actual serialization but just measures the number of bits that would be written if you called the serialize function. This is particularly useful for working out which messages are going to fit into your packet, especially when messages themselves can have arbitrarily complex serialize functions.
struct TestMessage : public Message
{
uint32_t a,b,c;
TestMessage()
{
a = 0;
b = 0;
c = 0;
}
template <typename Stream> bool Serialize( Stream & stream )
{
serialize_bits( stream, a, 32 );
serialize_bits( stream, b, 32 );
serialize_bits( stream, c, 32 );
return true;
}
virtual SerializeInternal( WriteStream & stream )
{
return Serialize( stream );
}
virtual SerializeInternal( ReadStream & stream )
{
return Serialize( stream );
}
virtual SerializeInternal( MeasureStream & stream )
{
return Serialize( stream );
}
};
The trick here is to bridge the unified templated serialize function (so you only have to write it once) to virtual serialize methods by calling into it from virtual functions per-stream type. I usually wrap this boilerplate with a macro, but it’s expanded in the code above so you can see what’s going on.
Now when you have a base message pointer you can do this and it just works:
Message message = CreateSomeMessage();
message->SerializeInternal( stream );
An alternative if you know the full set of messages at compile time is to implement a big switch statement on message type casting to the correct message type before calling into the serialize function for each type. I’ve done this in the past on console platform implementations of this message system (eg. PS3 SPUs) but for applications today (2016) the overhead of virtual functions is neglible.
Messages derive from a base class that provides a common interface such as serialization, querying the type of a message and reference counting. Reference counting is necessary because messages are passed around by pointer and stored not only in the message send queue until acked, but also in outgoing packets which are themselves C++ structs.
This is a strategy to avoid copying data by passing both messages and packets around by pointer. Somewhere else (ideally on a separate thread) packets and the messages inside them are serialized to a buffer. Eventually, when no references to a message exist in the message send queue (the message is acked) and no packets including that message remain in the packet send queue, the message is destroyed.
We also need a way to create messages. I do this with a message factory class with a virtual function overriden to create a message by type. It’s good if the packet factory also knows the total number of message types, so we can serialize a message type over the network with tight bounds and discard malicious packets with message type values outside of the valid range:
enum TestMessageTypes
{
TEST_MESSAGE_A,
TEST_MESSAGE_B,
TEST_MESSAGE_C,
TEST_MESSAGE_NUM_TYPES
};
// message definitions omitted
class TestMessageFactory : public MessageFactory
{
public:
Message Create( int type )
{
switch ( type )
{
case TEST_MESSAGE_A: return new TestMessageA();
case TEST_MESSAGE_B: return new TestMessageB();
case TEST_MESSAGE_C: return new TestMessageC();
}
}
virtual int GetNumTypes() const
{
return TEST_MESSAGE_NUM_TYPES;
}
};
Again, this is boilerplate and is usually wrapped by macros, but underneath this is what’s going on.
The algorithm for sending reliable-ordered messages is as follows:
On message send:
Measure how many bits the message serializes to using the measure stream
Insert the message pointer and the # of bits it serializes to into a sequence buffer indexed by message id. Set the time that message has last been sent to -1
Increment the send message id
On packet send:
Walk across the set of messages in the send message sequence buffer between the oldest unacked message id and the most recent inserted message id from left -> right (increasing message id order).
Never send a message id that the receiver can’t buffer or you’ll break message acks (since that message won’t be buffered, but the packet containing it will be acked, the sender thinks the message has been received, and will not resend it). This means you must never send a message id equal to or more recent than the oldest unacked message id plus the size of the message receive buffer.
For any message that hasn’t been sent in the last 0.1 seconds and fits in the available space we have left in the packet, add it to the list of messages to send. Messages on the left (older messages) naturally have priority due to the iteration order.
Include the messages in the outgoing packet and add a reference to each message. Make sure the packet destructor decrements the ref count for each message.
Store the number of messages in the packet n and the array of message ids included in the packet in a sequence buffer indexed by the outgoing packet sequence number so they can be used to map packet level acks to the set of messages included in that packet.
Add the packet to the packet send queue.
On packet receive:
Walk across the set of messages included in the packet and insert them in the receive message sequence buffer indexed by their message id.
The ack system automatically acks the packet sequence number we just received.
On packet ack:
Look up the set of messages ids included in the packet by sequence number.
Remove those messages from the message send queue if they exist and decrease their ref count.
Update the last unacked message id by walking forward from the previous unacked message id in the send message sequence buffer until a valid message entry is found, or you reach the current send message id. Whichever comes first.
On message receive:
Check the receive message sequence buffer to see if a message exists for the current receive message id.
If the message exists, remove it from the receive message sequence buffer, increment the receive message id and return a pointer to the message.
Otherwise, no message is available to receive. Return NULL.
In short, messages keep getting included in packets until a packet containing that message is acked. We use a data structure on the sender side to map packet sequence numbers to the set of message ids to ack. Messages are removed from the send queue when they are acked. On the receive side, messages arriving out of order are stored in a sequence buffer indexed by message id, which lets us receive them in the order they were sent.
This provides the user with an interface that looks something like this on send:
TestMessage message = (TestMessage) factory.Create( TEST_MESSAGE );
if ( message )
{
message->a = 1;
message->b = 2;
message->c = 3;
connection.SendMessage( message );
}
And on the receive side:
while ( true )
{
Message message = connection.ReceiveMessage();
if ( !message )
break;
if ( message->GetType() == TEST_MESSAGE )
{
TestMessage testMessage = (TestMessage) message;
// process test message
}
factory.Release( message );
}
Which is flexible enough to implement whatever you like on top of it.
嗨,我是格伦费德勒,欢迎来到创建一个游戏网络协议第五篇文章。
从上一篇文章到现在已经有很长一段时间了,上次我已经率先而且实现了余下的这一系列文章所需的源码并创建了开源库libyojimbo,是本系列文章所要描述的网络协议的一个质量有保证的的和经过单元测试的版本。
如果你想要有一个开源库来为自己在UDP之上实现可靠消息或是为了其他更多,看看libyojimbo。但是,如果你像我这样是想理解它具体是怎么工作的并且可能自己去实现它,阅读下去,因为我们将要从头到脚地去建立一个在UDP之上用来发送可靠有序消息的完整的系统!
很多人也许会跟你说,要在UDP之上实现你自己的可靠消息系统是愚蠢的。为什么要撰写你特有的简化版本的TCP?这些人深信,任何可靠性的实现不可避免地 最终会成为一个(简化的)TCP的重实现。
但也有很多不同的方法来在UDP之上实现可靠消息,各有不同的优势和劣势。TCP的方法并不是唯一的选择。事实上,我所了解到的大多数可靠有序信息的选择的原理和TCP并不相同。所以让我们为我们的目标发挥创造力并弄懂我们该如何充分利用我们的现状来实现一个比TCP更好 的可靠性系统。
网络协议在动作游戏类型(FPS)中的典型特征就是一个持续发送的数据包,在两个方向上以稳定的速度如20或30包每秒发送。这些数据包都包含有不可靠的无序数据例如t时间内的世界状态;所以,当一个数据包丢失,重新发送它并不是特别有用。当重新发送的数据包到达时,时间t已经过去了。
所以这就是我们将要实现可靠性的现状。对于我们90%的数据包,仅丢弃并不再重新发送它会更好。对于10%或更少(误差允许范围内)的情况,我们确实需要可靠性,但这样的数据是非常罕见的,很少被发送而且比不可靠的数据的平均大小要小得多。这个使用案例适用于所有过去十五年来发布的AAA级的FPS游戏。
可靠性的一个常用的方法是使用两种包类型:可靠有序的和不可靠的。你在众多网络库中都会看到这个方法。它基本的想法是,这个库不断重新发送可靠的数据包直到它的另一方接收到为止。这是一个最终看起来会有一点像TCP方式传输的可靠包的选择。这并没有很糟糕,但你也可以做得更好。
我更愿意去考虑的方法就是消息其实是更小的位包装元素,它知道如何使它们自己序列化。这就显得非常有意义了,因为按位打包的消息中,用于描述下个字节的前缀或者后缀的字节开销在大部分的情况下是不必需的(例如每个包中包含的许多小的消息)。被发送的消息会被放在一个队列并且每次一个包被发送时,发送队列中的一些消息就会被包含在外发的包中。这样一来,就没有可靠的数据包需要被重新发送了。可靠消息也只会包含在数据包里直到它们被接收。
要做到这样最简单的方法就是,把所有未应答的消息包含到每个被发送的包中。它是这样的:每个被发送的消息都有一个随每当一个消息被发送时递增的id。每个输出数据包包含起始消息id ,紧跟着的是n 个消息的数据。接收方不断发回最新收到消息的id给发送方作为一个应答信号,并且消息要当且仅当比最新的应答消息id要更新,才会被包含在数据包中。
这很简单也易于实现,但当你正在发送消息时如果突发一个很大的包丢失情况,你会遇到一个数据包大小的峰值,因为有很多未应答的消息。。正如在数据包分割和重组中讨论的需要按照MTU分割包的方式来发送大的数据包会增加丢包的情况。在高丢包率下你最不想做的就是增大包的规格并引起更多的包的丢失。这是一个潜在的无底洞。
你可以通过扩展系统来给每个包的消息数量n设置一个上限,来避免这种情况。但现在如果你有一个高数据包发送率(如每秒60包)你就要多次发送同样的消息直到你得到该消息的应答信号。如果的往返时间是100ms,每条消息在被应答之前将要平均被多余发送六次。也许你真的需要这些多余的发送数量因为你的消息是对时间极其敏感的,但在大多数情况下,你应该给队列里的其他消息分配合理的带宽。
我比较喜欢的方法是用一个优先次序系统整合每包的应答信号,这个系统检出n条最重要的消息并包含在每个包中。在散布的消息穿过所有在发送队列中的消息发送时,这样就把对时间敏感的递送与每包仅发送n条消息的能力联合起来了。
让我们行动起来实现它。 这种可靠性系统的基础是每个包的应答。
但为什么应答是在数据包层级而不是在消息层级呢?简要截说原因就是包的数量会远远少于消息的数量。假设每个包中有32或64条消息,显然让一个包含32或64条消息的包来应答会比让每个消息都分别应答要高效得多。
这样同样也增加了灵活性,因为你可以在数据包层级应答上构建其他可靠性系统,不仅仅是为了可靠有序的消息。例如,使用了数据包层级应答,你就知道哪一个时间先决的不可靠状态更新已结束,所以你可以轻易地构建一个系统,在一旦一个数据包所包含的最后一个状态更新已经应答时,停止发送不再改变的对象状态。
为实现数据包层级的应答,在每个包的前面添加如下的报头:
1 2 3 4 5 6 | struct Header { uint16_t sequence; uint16_t ack; uint32_t ack_bits; }; |
这些报头元素组合起来以创建应答系统:
ack_bits 不仅是一个节省带宽的巧妙的编码,它同样也增加了信息冗余来抵御包的丢失。每个应答码要被发送32次。如果有一个包丢失了,仍然有其他31个包有着相同的应答码。从统计上来说,应答码还是非常有可能送达的。
但突发的传送数据包的丢失还是有可能发生的,所以重要的是要注意:
以我的经验,没有必要设计完善的应答机制。在一个极少丢应答码的系统上构建一个可靠性系统并不会增加什么大问题。但对于在所有情况下来测试这个系统的工作将会成为很大的挑战,因为还要考虑应答码丢失的边界情况。
所以如果你自己实现这个系统的话,请设置一个浸泡测试来覆盖糟糕的网络情况,用来确保你的应答系统是在正确的工作,相关地,你的消息系统的执行实际上是在这些网络情况下可靠地而且有序的交付可靠有序消息。以我之见(并且我已经写了许多这样的系统的变式至少有十次了),这是确保正确行为的一个必要步骤。
你在这篇文章的示例源代码中会找到这样一个浸泡测试,它对patreon支持是有效的,并且也在开源网络库libyojimbo中。
为实现这个应答系统,我们在发送方还需要一个数据结构来追踪一个数据包是否已经被应答,这样我们就可以忽略冗余的应答(每个包会通过 ack_bits多次应答)。我们同样在接收方也还需要一个数据结构来追踪那些已经收到的包,这样我们就可以在数据包的报头填写ack_bits的值。
这个数据结构应该具有以下属性:
你可能会想。哦,当然,哈希表。但还有一个更简单的方法:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 | const int BufferSize = 1024; uint32_t sequence_buffer[BufferSize]; struct PacketData { bool acked; }; PacketData packet_data[BufferSize];
{ const int index = sequence % BufferSize; if ( sequence_buffer[index] == sequence ) return &packet_data[index]; else return NULL; } |
你在这可以看到的窍门是这个滚动的缓冲区是以序列号来作为索引的:
const int index =sequence % BufferSize;
这是可行的,因为我们并不在意旧条目破坏。随着序列号的递增,旧的条目也自然而然地随着我们插入了新条目而被重写。sequence_buffer[index]的值是用来测试该索引的条目是否实际上与你所搜寻的序列号相符。一个缓冲序列的值是0xFFFFFFFF 就表示一个空的条目并自然地对任何序列号查询返回NULL,没有任何其他(代码)分支。
当条目被顺序添加,就像一个被发送的队列,对插入所需要做的就是把这个序列缓冲区的值更新为新的序列号并且在该索引处重写这个数据:
1 2 3 4 5 6 | PacketData & InsertPacketData( uint16_t sequence ) { const int index = sequence % BufferSize; sequence_buffer[index] = sequence; return packet_data[index]; } |
但在接收端数据包以乱序到达并且有一部分丢失。在高得离谱的丢包率下(99%),我就会看到旧的序列缓冲区条目还存在,但是新条目的序列号已经超过了65535并且循环到达了旧条目之前,并且打破了我的应答逻辑(导致错误应答并打破了可靠性,这时发送方会真的认为对方已经接收到了一些东西但其实并不是…)
解决这个问题的办法是遍历上一个最高的插入序列与最新收到的插入序列之间的条目(如果它是更加新的话)并在缓冲区清除这些条目即都置为0xFFFFFFFF。现在,在一般情况下,插入操作是非常接近 时间常量的,但最糟的情况是,在先前最高的序列号和当前插入的序列号之间线性遍历的次数n等于缓冲区的长度。
在我们继续之前,我想指出,你可以用这个数据结构做更多事情而不仅是对于应答码。例如,你可以加入发送时间,来扩展每个包的数据:
1 2 3 4 5 | struct PacketData { bool acked; double send_time; }; |
有了这些信息你可以对往返时间通过做指数级的平滑取平均数做修正,最终得到合理的预期往返时间。你甚至可以看到在发送数据包的序列缓存区的数据包会比你RTT预计的(你现在应该已经收到了它们的应答码…)要旧,通过这个往返时间对还没有应答的包做判断,来决定创建你的数据包丢失预计。
现在我们来把注意力集中在数据包层级应答的实际算法上。该算法如下:
在数据包发送端:
在数据包接收端:
重要的一点是这个算法是在两端都可以执行的,所以如果你有一个客户端和一个服务端,然后每一方的连接运行着同样的逻辑,维护自己的序列号发送的数据包,跟踪最新从另一方收到的数据包序列#还有从一个序列缓冲区里接收到的数据包中生成sequence, ack 和ack_bits 来发送到另一方。
并且这真的就是和它有关的全部了。现在当一个数据包被另一方接收到时,你有一个回调:OnPacketAcked 。这个可靠性系统的关键就在于你得知道哪个数据包被接收,你可以在你想的媒介之上创建任何 可靠性系统。它不仅限于可靠有序的消息。例如,你可以用它确认哪个不可靠的状态更新已经完成了,用以实现基于每个物体的增量编码。
消息是小型的对象(比数据包大小要小,所以很多消息装配在一个典型的数据包中)并且知道如何将它们自己序列化。在我的系统里,它们使用一个统一的序列化函数来执行序列化。
这个序列化的函数是模板化的,所以你只要写它一次它就会处理读、写以及测量 。
是的。测量。我喜欢的一个技巧就是有一个虚拟流类叫做MeasureStream,如果你调用了序列化函数,它不参与任何的序列化,而只是测量可能被写入的比特数。这对于解决哪个消息要装载到你的数据包里,特别是当消息可以有任意复杂的序列化函数的情况时是特别有用的。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 | struct TestMessage : public Message { uint32_t a,b,c; TestMessage() { a = 0; b = 0; c = 0; } template < typename stream
> bool Serialize( Stream & stream ) { serialize_bits( stream, a, 32 ); serialize_bits( stream, b, 32 ); serialize_bits( stream, c, 32 ); return true ; } virtual SerializeInternal( WriteStream & stream ) { return Serialize( stream ); } virtual SerializeInternal( ReadStream & stream ) { return Serialize( stream ); } virtual SerializeInternal( MeasureStream & stream ) { return Serialize( stream ); } };
|
这里的技巧是桥接统一模板的序列化函数(所以你只需要写一次)与虚拟序列化方法,这通过从虚函数每个流类型中调入它。我通常用一个宏来打包这个引用,但它在上文的代码中这个宏已经被展开,所以你可以看到发生了什么。
现在,假设你有一个基于消息的指针可以让你做到这样并且它只是通过重载来工作:
1 2 | Message message = CreateSomeMessage(); message->SerializeInternal( stream ); |
另外一个就是如果你在编译时间知道了消息的完整组合,就可以为每个类型在被调入序列化函数之前实现一个关于消息类型转换为确切消息类型的大的switch语句。我在过去已经在控制台平台实现的这个消息系统这么做了(如PS3 SPUs),但对于现在(2016)的应用程序,虚函数的总开销是忽略不计的。
消息从一个基类派生,这个基类提供一个通用的接口例如序列化、消息的查询类型还有引用计数。引用计数是必要的,因为消息是通过指针传递的并且在应答之前不只是存储在消息发送队列,而且也存储在外发的数据包中,包本身是C++结构体。
这是一个策略,就是避免通过指针传递消息和数据包来复制数据。别的一些场景(理想的情况是在一个单独的线程)它们里面的数据包和消息会序列化到一个缓冲区。最终,当不再有对存在消息发送队列的消息的引用时(消息已经被应答)并且没有数据包包含保留在数据包发送队列里的消息,消息即是被销毁的。
我们也需要一种方式来创建消息。我用一个消息的工厂类来做这件事情,它有一个被复写的虚函数来根据类型创建一个消息。如果这个数据包工厂还知道消息类型的总数量就好了,那样我们就可以在网络上序列化一个消息类型,因为有严格的界限和在有效范围之外的消息类型值的包的恶意丢弃:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 | enum TestMessageTypes { TEST_MESSAGE_A, TEST_MESSAGE_B, TEST_MESSAGE_C, TEST_MESSAGE_NUM_TYPES }; // message definitions omitted class TestMessageFactory : public MessageFactory { public : Message Create( int type ) { switch ( type ) { case TEST_MESSAGE_A: return new TestMessageA(); case TEST_MESSAGE_B: return new TestMessageB(); case TEST_MESSAGE_C: return new TestMessageC(); } } virtual int GetNumTypes() const { return TEST_MESSAGE_NUM_TYPES; } }; |
再次重申,这是一个引用并且通常是被包裹在宏里面的,但下面要说明的就是它具体是怎么回事了。
现在让我们来着手于如何在应答系统中实现可靠有序消息的细节。
发送可靠有序消息的算法如下:
对于消息发送:
对于数据包发送:
对于数据包接收:
对于数据包应答:
对于消息接收:
总之,消息要保持被包含在数据包中直到这个数据包包含的消息得到应答。我们在发送者方使用一个数据结构来给消息标识的组合映射数据包序列号以便应答。当消息被应答时,要从发送队列中移除。对于接收方,以乱序到达的消息会被存储在一个序列缓冲区,并以消息id为索引,这个id会让我们以它们被发送的顺序接收它们。
在发送方,这为用户提供了一个像这样的接口:
1 2 3 4 5 6 7 8 | TestMessage message = (TestMessage) factory.Create( TEST_MESSAGE ); if ( message ) { message->a = 1; message->b = 2; message->c = 3; connection.SendMessage( message ); } |
还有在接收方:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | while ( true ) { Message message = connection.ReceiveMessage(); if ( !message ) break ; if ( message->GetType() == TEST_MESSAGE ) { TestMessage testMessage = (TestMessage*) message; // process test message } factory.Release( message ); } |
正如你所看到的,它已经是简单得不能再简单了。
如果这几个接口有引起你的兴趣,请看看我的新开源库 libyojimbo。
我希望你到现在为止对这个系列的文章是享受的请在 patreon上支持我的写作,并且我将更快写新的文章,再者你会在加州大学伯克利分校软件的开源许可证下获得这篇文章的示例源代码。谢谢你的支持!
即将到来:客户端与服务器的连接
在“创建一个游戏网络协议”的下一篇文章会展示你如何在UDP之上创建你自己的客户端/服务器连接层,它会实现挑战/响应,会在服务器上分配客户端插槽,当服务器爆满或检测超时就拒绝客户端的连接。
【版权声明】
原文作者未做权利声明,视为共享知识产权进入公共领域,自动获得授权;