This article analyzes the game networking module of UE4, first sketching the overall framework and then working down into the details.
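Concepts
Components of UE4 gameplay:
- World
  The game world. Everything in the game happens inside it, and a running game always has a World.
- Actor
  An entity in the game world (for example a house, a monster, or a hero).
- PlayerController
  The player's proxy in the game world.
- LocalPlayer
  Represents a local player; it corresponds to one client window region.
- NetConnection
  Represents one client-server network connection.
- NetDriver
  The network communication object (not an OS driver). It provides a UDP socket for communicating with the outside world and is the entry and exit point for the game's network data. It also manages the current NetConnections: a server-side NetDriver manages multiple NetConnections, while a client-side NetDriver manages a single NetConnection.
- Channel
  Each NetConnection has multiple channels, classified by function as follows:
  a. ControlChannel
  Exchanges control messages such as connection requests and disconnect requests. Each connection has exactly one channel of this type.
  b. VoiceChannel
  Exchanges voice chat data. Each connection has exactly one channel of this type.
  c. ActorChannel
  Replicates Actor data and carries RPCs. There are many instances of this type; each instance serves one replicated Actor.
The concepts of LocalPlayer and NetConnection are closely analogous to local and remote terminals on Linux.
[Figure: LocalPlayer and NetConnection compared with local and remote Linux terminals]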
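The ownership rules in the list above (a server driver holds many connections, a client driver holds one, and each connection holds one control channel, one voice channel, and any number of actor channels) can be sketched in a few lines. All type and member names below are hypothetical stand-ins invented for illustration; they are not the engine's classes.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical, simplified stand-ins; not the real UE4 types.
enum class SketchChannelKind { Control, Voice, Actor };

struct SketchChannel {
    SketchChannelKind Kind;
    int Index;  // position within the owning connection
};

struct SketchNetConnection {
    std::vector<SketchChannel> Channels;

    // Opens a channel. Control and Voice are limited to one instance each;
    // Actor channels may be opened any number of times.
    bool OpenChannel(SketchChannelKind Kind) {
        if (Kind != SketchChannelKind::Actor) {
            for (const SketchChannel& C : Channels) {
                if (C.Kind == Kind) return false;
            }
        }
        Channels.push_back({Kind, static_cast<int>(Channels.size())});
        return true;
    }
};

struct SketchNetDriver {
    bool bIsServer = false;
    std::vector<std::unique_ptr<SketchNetConnection>> Connections;

    // A client driver holds exactly one connection (to the server);
    // a server driver holds one per connected client.
    SketchNetConnection* AddConnection() {
        if (!bIsServer && !Connections.empty()) return nullptr;
        Connections.push_back(std::make_unique<SketchNetConnection>());
        return Connections.back().get();
    }
};
```

The sketch only models cardinality; sockets, ticking, and channel lifetimes are left out on purpose.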
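Data processing and control in network communication
Consider a logistics analogy first. There are two factories, M and N. Factory M has production lines A, B, and C; factory N has production lines X, Y, and Z; the products of A, B, and C feed X, Y, and Z respectively. A fleet of trucks (each with load capacity t) carries goods between M and N. Shipping products from M to N works as follows:
- Line A packs its product into boxes. If the product is large, it is split across several boxes, each labeled as one part of the product (part0, part1, ...). The boxes are loaded onto a truck.
- Lines B and C do the same.
- When a truck is full it leaves for N; in an emergency it leaves even if it is not full. Note that one truck may not be able to carry a whole product, in which case several trips are needed.
- When the truck arrives at N it is unloaded, and each box is handed to line X, Y, or Z according to its product label. If only part of a product from line A has arrived, X must wait for the next trip and keep collecting parts until the product is complete before processing it. The same applies to the products of B and C.
Note: we assume here that nothing is lost in transit.
The network data flow of a UE4 World is similar. UE4 defines its unit of network data as the Bunch, which plays the role of the box above; by the same analogy, a production line corresponds to a Channel and a truck to a packet.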
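The "split a large product across labeled boxes" step can be made concrete with a small sketch. The struct and functions below are hypothetical simplifications written for this article; only the flag names bPartial, bPartialInitial, and bPartialFinal echo the bunch header fields listed later, and the byte-based size limit is an assumption made to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified bunch: only the fields needed to show splitting.
struct SketchBunch {
    int ChIndex = 0;
    bool bPartial = false;
    bool bPartialInitial = false;
    bool bPartialFinal = false;
    std::vector<std::uint8_t> Data;
};

// Splits Payload into bunches of at most MaxBytes each (MaxBytes must be > 0).
// A payload that fits in one bunch is sent whole, with no partial flags set.
std::vector<SketchBunch> SplitIntoBunches(int ChIndex,
                                          const std::vector<std::uint8_t>& Payload,
                                          std::size_t MaxBytes) {
    std::vector<SketchBunch> Out;
    if (Payload.size() <= MaxBytes) {
        SketchBunch Whole;
        Whole.ChIndex = ChIndex;
        Whole.Data = Payload;
        Out.push_back(Whole);
        return Out;
    }
    for (std::size_t Offset = 0; Offset < Payload.size(); Offset += MaxBytes) {
        const std::size_t Len = std::min(MaxBytes, Payload.size() - Offset);
        const auto First = Payload.begin() + static_cast<std::ptrdiff_t>(Offset);
        SketchBunch Part;
        Part.ChIndex = ChIndex;
        Part.bPartial = true;
        Part.bPartialInitial = (Offset == 0);                   // "part0" label
        Part.bPartialFinal = (Offset + Len == Payload.size());  // last box
        Part.Data.assign(First, First + static_cast<std::ptrdiff_t>(Len));
        Out.push_back(Part);
    }
    return Out;
}

// Concatenates a run of partial bunches; returns false if the run does not
// start with an initial part and end with a final part.
bool Reassemble(const std::vector<SketchBunch>& Parts, std::vector<std::uint8_t>& Out) {
    Out.clear();
    if (Parts.empty() || !Parts.front().bPartialInitial || !Parts.back().bPartialFinal) {
        return false;
    }
    for (const SketchBunch& B : Parts) {
        Out.insert(Out.end(), B.Data.begin(), B.Data.end());
    }
    return true;
}
```

The receiving side in the analogy (line X waiting for all parts) corresponds to Reassemble() being called only once the final part has arrived.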
- Data sending flow
- Data receiving flow
Related source files:
Engine\Source\Runtime\Engine\Classes\Engine\Channel.h
Engine\Source\Runtime\Engine\Classes\Engine\ControlChannel.h
Engine\Source\Runtime\Engine\Classes\Engine\ActorChannel.h
Engine\Source\Runtime\Engine\Classes\Engine\VoiceChannel.h & .cpp
Engine\Source\Runtime\Engine\Classes\Engine\DataChannel.h & .cpp
Engine\Source\Runtime\Engine\Classes\Engine\NetConnection.h & .cpp
Engine\Source\Runtime\CoreUObject\Public\UObject\CoreNet.h
Engine\Source\Runtime\Online\OnlineSubsystemUtils\Classes\IpConnection.h
Engine\Source\Runtime\PacketHandlers\PacketHandler\Public\PacketHandler.h
Engine\Source\Runtime\PacketHandlers\PacketHandler\Private\PacketHandler.cpp // processes outgoing packets before they are sent and incoming packets after they are read
Engine\Source\Runtime\Engine\Private\PacketHandlers\StatelessConnectHandlerComponent.cpp
Engine\Source\Runtime\Engine\Public\PacketHandlers\StatelessConnectHandlerComponent.h // PacketHandler component for implementing a stateless (non-memory-consuming) connection handshake. Partially based on the Datagram Transport Layer Security protocol.
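Glossary:
- Bunch: a bunch produced by a Channel is an FOutBunch; a bunch fed into a Channel is an FInBunch.
- Packet: the unit of data read from or written to the socket.
- Open: a packet that has been sent but has not yet been acked is in the Open state.
Note: this section does not cover how the data is generated or parsed (that is, how the production lines make and consume their products). It is enough to know that the data being sent and received is defined as a Bunch (a box).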
Call stacks for receiving network data:
- Actor Channel stack
- Control Channel stack
- Voice Channel stack: TODO
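Function notes:
1. void UNetConnection::ReceivedRawPacket( void* InData, int32 Count )
Processes the received raw data InData. It first runs the data through the PacketHandler chain (for example, handshake handling); if data remains afterwards, it computes the valid packet length and calls ReceivedPacket(). The function also records packet traffic statistics.
2. void UNetConnection::ReceivedPacket( FBitReader& Reader );
Parses the data stream in the packet. An ack is handled as an ack; bunch data is dispatched to the owning channel; finally, acks are sent as needed.
3. void UChannel::ReceivedRawBunch( FInBunch & Bunch, bool & bOutSkipAck );
Checks whether the bunch is reliable and whether it arrived out of order. An out-of-order bunch is placed in a sorted queue to wait for the missing earlier bunches; otherwise ReceivedNextBunch() is called, which leads to ReceivedSequencedBunch() once the bunch is complete.
4. bool UChannel::ReceivedSequencedBunch( FInBunch& Bunch )
Processes one complete bunch by calling virtual void ReceivedBunch( FInBunch& Bunch ). Buffered bunches that are now in sequence are then processed in turn.
5. virtual void ReceivedBunch( FInBunch& Bunch );
Each UChannel subclass overrides this function with its own handling. The subclasses are analyzed separately later.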
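The ordering logic described for ReceivedRawBunch() and ReceivedSequencedBunch() can be illustrated with a standalone sketch: deliver a reliable bunch immediately if it is the next expected one, park it if it is early, drop it if it is a duplicate, and drain the parked bunches whenever a gap closes. The types are hypothetical, sequence wrap-around and partial bunches are ignored, and the member name InReliable is borrowed from the excerpt later in the article only as a label.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical, simplified incoming bunch.
struct SketchInBunch {
    int ChSequence;  // absolute reliable sequence number (wrap-around ignored here)
    bool bReliable;
    int Payload;
};

struct SketchReceiveChannel {
    int InReliable = 0;                    // last reliable sequence processed
    std::map<int, SketchInBunch> Pending;  // early reliable bunches, keyed by sequence
    std::vector<int> Delivered;            // payloads in the order they were processed

    // Stand-in for the per-channel virtual ReceivedBunch().
    void ReceivedBunch(const SketchInBunch& B) { Delivered.push_back(B.Payload); }

    void ReceivedRawBunch(const SketchInBunch& B) {
        if (!B.bReliable) {  // unreliable data is processed as it arrives
            ReceivedBunch(B);
            return;
        }
        if (B.ChSequence <= InReliable) return;  // duplicate: already processed
        if (B.ChSequence != InReliable + 1) {    // early: wait for the gap to close
            Pending.emplace(B.ChSequence, B);
            return;
        }
        InReliable = B.ChSequence;
        ReceivedBunch(B);
        // Drain any parked bunches that are now in sequence.
        for (auto It = Pending.find(InReliable + 1); It != Pending.end();
             It = Pending.find(InReliable + 1)) {
            InReliable = It->first;
            ReceivedBunch(It->second);
            Pending.erase(It);
        }
    }
};
```

Feeding sequences 1, 3, 2, 2 into this sketch delivers the payloads of 1, 2, and 3 in order and discards the repeated 2.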
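Call stack for sending network data:
Function notes:
1. FPacketIdRange UChannel::SendBunch( FOutBunch* Bunch, bool Merge );
Merges the bunch with a pending one, or splits a large bunch into several smaller bunches, and sends them. Returns the range of packets those bunches occupy, expressed as a PacketId interval.
2. int32 UChannel::SendRawBunch(FOutBunch* OutBunch, bool Merge);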
int32 UChannel::SendRawBunch(FOutBunch* OutBunch, bool Merge)
{
    // When all data since open is being resent, forward the bunch and skip the bookkeeping below.
    if ( Connection->bResendAllDataSinceOpen )
    {
        check( OpenPacketId.First != INDEX_NONE );
        check( OpenPacketId.Last != INDEX_NONE );
        return Connection->SendRawBunch( *OutBunch, Merge );
    }
    // Send the raw bunch.
    OutBunch->ReceivedAck = 0;
    int32 PacketId = Connection->SendRawBunch(*OutBunch, Merge);
    // Record the packet that carried the first bunch of a locally opened channel.
    if( OpenPacketId.First==INDEX_NONE && OpenedLocally )
        OpenPacketId = FPacketIdRange(PacketId);
    if( OutBunch->bClose )
        SetClosingFlag();
    return PacketId;
}
3. int32 UNetConnection::SendRawBunch( FOutBunch& Bunch, bool InAllowMerge );
Writes the bunch header and data into the output stream and calls WriteBitsToSendBuffer().
4. int32 UNetConnection::WriteBitsToSendBuffer()
Writes into the send buffer and may call FlushNet(). Returns the PacketId used for this write.
5. void UNetConnection::FlushNet();
Calls UIpConnection::LowLevelSend() to send the data.
6. void UIpConnection::LowLevelSend(void* Data, int32 CountBytes, int32 CountBits);
First lets the PacketHandler process the data (currently this prepends some information to the data). See void StatelessConnectHandlerComponent::Outgoing(FBitWriter& Packet).
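The relationship between writing to the send buffer and flushing can be sketched as follows: bunches accumulate in a buffer, a write that would overflow the current packet forces a flush first, and each flush hands one packet to the low-level send and advances the packet id. This is a hypothetical byte-oriented simplification (the engine works in bits and has more flush triggers); the class name, the Sent vector standing in for LowLevelSend(), and the size limit are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified connection that batches bunch bytes into packets.
struct SketchSendConnection {
    std::size_t MaxPacketBytes;
    int OutPacketId = 0;                          // id of the packet currently being filled
    std::vector<std::uint8_t> SendBuffer;         // contents of the packet being filled
    std::vector<std::vector<std::uint8_t>> Sent;  // stands in for the low-level send

    explicit SketchSendConnection(std::size_t InMaxPacketBytes)
        : MaxPacketBytes(InMaxPacketBytes) {}

    // Sends the current packet (if any) and starts a new one.
    void FlushNet() {
        if (SendBuffer.empty()) return;
        Sent.push_back(SendBuffer);
        SendBuffer.clear();
        ++OutPacketId;
    }

    // Appends one bunch (assumed to fit in an empty packet) and returns the id
    // of the packet that will carry it. Flushes first if it would not fit.
    int WriteToSendBuffer(const std::vector<std::uint8_t>& BunchBytes) {
        if (SendBuffer.size() + BunchBytes.size() > MaxPacketBytes) {
            FlushNet();
        }
        SendBuffer.insert(SendBuffer.end(), BunchBytes.begin(), BunchBytes.end());
        return OutPacketId;
    }
};
```

Returning the packet id from the write is what lets a channel remember which packet carried which bunch, which the reliability logic depends on.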
Valid packet format (laid out from bit 0 onward)
1. Data added by the PacketHandler
2. PacketId (18 bits, MAX_PACKETID)
3. IsAck (1 bit)
4. If IsAck is true:
   AckPacketId (18 bits)
   bHasServerFrameTime (1 bit)
5. Otherwise the data is a Bunch, with this header:
   Control (1 bit)
   bOpen (1 bit)
   bClose (1 bit)
   bDormant (1 bit)
   bReliable (1 bit)
   ChIndex (range MAX_CHANNELS)
   bHasPackageMapExports (1 bit)
   bHasMustBeMappedGUIDs (1 bit)
   bPartial (1 bit)
   bPartialInitial (1 bit)
   bPartialFinal (1 bit)
   ChType (range CHTYPE_MAX)
   BunchDataBits (range MAX_PACKET*8) // length of the bare bunch payload
6. The bare payload (the actual data)
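Because the format above is bit-packed rather than byte-aligned, a tiny bit writer and reader make the layout easier to picture. The sketch below round-trips only the first fields of an ack (PacketId, IsAck, AckPacketId) and stops there. The classes are hypothetical and far simpler than the engine's FBitWriter and FBitReader, and the 18-bit width is taken from the list above as an assumption rather than verified against a particular engine version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal bit stream; least significant bit is written first.
class SketchBitWriter {
public:
    void WriteBits(std::uint32_t Value, int NumBits) {
        for (int i = 0; i < NumBits; ++i) {
            Bits.push_back(((Value >> i) & 1u) != 0);
        }
    }
    const std::vector<bool>& GetBits() const { return Bits; }
private:
    std::vector<bool> Bits;
};

class SketchBitReader {
public:
    explicit SketchBitReader(const std::vector<bool>& InBits) : Bits(InBits) {}
    std::uint32_t ReadBits(int NumBits) {
        std::uint32_t Value = 0;
        for (int i = 0; i < NumBits; ++i) {
            if (Bits.at(Pos++)) Value |= (1u << i);  // at() throws on overrun
        }
        return Value;
    }
private:
    std::vector<bool> Bits;
    std::size_t Pos = 0;
};

// Width quoted in the format list above; treated here as an assumption.
constexpr int kPacketIdBits = 18;

struct SketchPacketHeader {
    std::uint32_t PacketId = 0;
    bool bIsAck = false;
    std::uint32_t AckPacketId = 0;  // only meaningful when bIsAck is true
};

void WriteAckHeader(SketchBitWriter& W, std::uint32_t PacketId, std::uint32_t AckPacketId) {
    W.WriteBits(PacketId, kPacketIdBits);
    W.WriteBits(1u, 1);  // IsAck = true
    W.WriteBits(AckPacketId, kPacketIdBits);
}

SketchPacketHeader ReadHeader(SketchBitReader& R) {
    SketchPacketHeader H;
    H.PacketId = R.ReadBits(kPacketIdBits);
    H.bIsAck = (R.ReadBits(1) != 0);
    if (H.bIsAck) {
        H.AckPacketId = R.ReadBits(kPacketIdBits);
    }
    return H;
}
```

A bunch header would be read the same way, one flag bit or bounded integer at a time, in the order given by the list.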
Summary of the transmission process:
[Figure: summary of the transmission process]
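Reliability over UDP
The previous section described the transmission flow but ignored the possibility of data being lost in transit. This section analyzes how reliable delivery is achieved.
Data identification:
- A Bunch is identified by ChIndex and ChSequence (the latter is used only when the bunch is reliable)
- A Packet is identified by PacketId (always increasing)
For a reliable bunch, the sender must receive an ack and the receiver must send one.
Related functions:
FOutBunch* UChannel::PrepBunch(FOutBunch* Bunch, FOutBunch* OutBunch, bool Merge);
void UNetConnection::ReceivedPacket( FBitReader& Reader );
void UNetConnection::ReceivedNak( int32 NakPacketId );
void UChannel::ReceivedNak( int32 NakPacketId );
This code has to handle the following problems:
- Packets that are lost or arrive out of order
- Bunches that are lost or received more than once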
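The sender's side of this contract can be sketched as a list of reliable bunches awaiting acknowledgement, each tagged with the id of the packet that carried it. An ack for a packet retires the bunches it carried; a nak causes them to be sent again in a new packet. The types and function names here are hypothetical and heavily simplified; in particular, real resends go back through the normal send path instead of just being relabeled.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical record of one reliable bunch that has been sent but not acked.
struct SketchOutRecord {
    int ChSequence;  // identifies the bunch on its channel
    int PacketId;    // packet that most recently carried it
};

struct SketchReliableSender {
    int NextPacketId = 0;
    std::vector<SketchOutRecord> OutRec;  // reliable bunches awaiting an ack

    // Sends one reliable bunch in its own packet; returns that packet's id.
    int Send(int ChSequence) {
        const int PacketId = NextPacketId++;
        OutRec.push_back({ChSequence, PacketId});
        return PacketId;
    }

    // The peer confirmed the packet: every bunch it carried is done.
    void ReceivedAck(int AckPacketId) {
        OutRec.erase(std::remove_if(OutRec.begin(), OutRec.end(),
                                    [AckPacketId](const SketchOutRecord& R) {
                                        return R.PacketId == AckPacketId;
                                    }),
                     OutRec.end());
    }

    // The packet was lost: resend every bunch it carried in a fresh packet.
    // Returns how many bunches were resent.
    int ReceivedNak(int NakPacketId) {
        int Resent = 0;
        for (SketchOutRecord& R : OutRec) {
            if (R.PacketId == NakPacketId) {
                R.PacketId = NextPacketId++;
                ++Resent;
            }
        }
        return Resent;
    }
};
```

Duplicates on the receiving side are then filtered by the channel sequence number, since a resent bunch keeps its ChSequence while its PacketId changes.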
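Q: Reliable delivery relies on sequence numbers, but MAX_CHSEQUENCE is only 1024. What happens when the sequence wraps?
A: The trick is to derive the absolute SequenceId of the bunch being processed from the SequenceId already received (an int32). For example, if the current SequenceId is 10994 and the incoming bunch carries sequence 512, the absolute id is 10752, chosen by the nearest-value rule: 10994 % 1024 = 754, and 754 - 512 = 242 < 1024/2, so SequenceId = 10994 - 242 = 10752.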
if ( Bunch.bReliable )
{
    if ( InternalAck )
    {
        // We can derive the sequence for 100% reliable connections
        Bunch.ChSequence = InReliable[Bunch.ChIndex] + 1;
    }
    else
    {
        // If this is a reliable bunch, use the last processed reliable sequence to read the new reliable sequence
        Bunch.ChSequence = MakeRelative( Reader.ReadInt( MAX_CHSEQUENCE ), InReliable[Bunch.ChIndex], MAX_CHSEQUENCE );
    }
}
Here MakeRelative() computes the absolute SequenceId from the wrapped id.
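The nearest-value rule from the answer above can be checked with a small standalone function. This is a sketch written for this article, not a copy of the engine's helper: the name MakeRelativeSketch and its exact formulation are assumptions, and it requires Max to be a power of two so that the wrap can be computed with a bit mask.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative helper (assumed name and formulation).
// Returns the value nearest to Reference that is congruent to Wrapped modulo Max.
// Max must be a power of two.
inline std::int32_t MakeRelativeSketch(std::int32_t Wrapped, std::int32_t Reference,
                                       std::int32_t Max) {
    // Shift the distance by Max/2 and wrap it into [0, Max) with a mask.
    // The casts keep the masking well defined when the distance is negative.
    const std::uint32_t Shifted = static_cast<std::uint32_t>(Wrapped - Reference + Max / 2);
    const std::int32_t Folded =
        static_cast<std::int32_t>(Shifted & static_cast<std::uint32_t>(Max - 1));
    // Undo the shift: Diff is the signed distance in [-Max/2, Max/2).
    const std::int32_t Diff = Folded - Max / 2;
    return Reference + Diff;
}
```

With the numbers from the text, MakeRelativeSketch(512, 10994, 1024) yields 10752. The same rule also resolves forward wraps: a wire value of 5 against a reference of 1020 resolves to 1029, not to 5.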
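Producing Bunch data
The sections above covered how bunches are sent and received. This section traces how bunches are produced, that is, how products are made on the production lines.
- Producing UControlChannel data
  See the DEFINE_CONTROL_CHANNEL_MESSAGE_XXX macros in DataChannel.h. This part is simple: direct serialization.
- Producing UActorChannel data
  bool UActorChannel::ReplicateActor() is responsible for generating an Actor's network data.
- Actor replication flags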
/** Structure to hold and pass around transient flags used during replication. */
struct FReplicationFlags
{
    union
    {
        struct
        {
            /** True if replicating actor is owned by the player controller on the target machine. */
            uint32 bNetOwner:1;
            /** True if this is the initial network update for the replicating actor. */
            uint32 bNetInitial:1;
            /** True if this is actor is RemoteRole simulated. */
            uint32 bNetSimulated:1;
            /** True if this is actor's ReplicatedMovement.bRepPhysics flag is true. */
            uint32 bRepPhysics:1;
            /** True if this actor is replicating on a replay connection. */
            uint32 bReplay:1;
        };
        uint32 Value;
    };
    FReplicationFlags()
    {
        Value = 0;
    }
};
- Code excerpt from ReplicateActor():
// ----------------------------------------------------------
// Replicate Actor and Component properties and RPCs
// ----------------------------------------------------------
#if USE_NETWORK_PROFILER
const uint32 ActorReplicateStartTime = GNetworkProfiler.IsTrackingEnabled() ? FPlatformTime::Cycles() : 0;
#endif
if (!bIsNewlyReplicationPaused)
{
    // The Actor
    WroteSomethingImportant |= ActorReplicator->ReplicateProperties(Bunch, RepFlags);
    // The SubObjects
    WroteSomethingImportant |= Actor->ReplicateSubobjects(this, &Bunch, &RepFlags);
    if (Connection->bResendAllDataSinceOpen)
    {
        if (WroteSomethingImportant)
        {
            SendBunch(&Bunch, 1);
        }
        MemMark.Pop();
        bIsReplicatingActor = false;
        return WroteSomethingImportant;
    }
    // Look for deleted subobjects
    for (auto RepComp = ReplicationMap.CreateIterator(); RepComp; ++RepComp)
    {
        if (!RepComp.Key().IsValid())
        {
            // Write a deletion content header:
            WriteContentBlockForSubObjectDelete(Bunch, RepComp.Value()->ObjectNetGUID);
            WroteSomethingImportant = true;
            Bunch.bReliable = true;
            RepComp.Value()->CleanUp();
            RepComp.RemoveCurrent();
        }
    }
}
- Algorithm classes and data structures responsible for replication
  Source files: DataReplication.h, DataReplication.cpp, RepLayout.h, RepLayout.cpp
- FRepChangedPropertyTracker
  Tracks changes to replicated properties; one per object. This class is used to store the change list for a group of properties of a particular actor/object. This information is shared across connections when possible.
- FRepLayout
  The replication layout of an object (the layout of the properties that need to be replicated over the network); one per class.
- FReplicationChangelistMgr
  One per object. It records the history of changes to the object's property values, and everything depends on it (this is the key piece; think of it as a change sampler).
  /**
   * FReplicationChangelistMgr manages a list of change lists for a particular replicated object that have occurred since the object started replicating
   * Once the history is completely full, the very first changelist will then be merged with the next one (freeing a slot)
   * This way we always have the entire history for join in progress players
   * This information is then used by all connections, to share the compare work needed to determine what to send each connection
   * Connections will send any changelist that is new since the last time the connection checked
   */
- FObjectReplicator
  There is one replicator instance per connection per object.
  /** FObjectReplicator
   * Generic class that replicates properties for an object.
   * All delta/diffing work is done in this class.
   * Its primary job is to produce and consume chunks of properties/RPCs:
   *
   * |----------------|
   * | NetGUID ObjRef |
   * |----------------|
   * |                |
   * | Properties...  |
   * |                |
   * | RPCs...        |
   * |                |
   * |----------------|
   * | </End Tag>     |
   * |----------------|
   *
   */
- FRepState
  The per-connection replication history of each object, stored inside the FObjectReplicator. FRepLayout operates on it but does not hold the data itself. (Compare the server synchronization design of Quake 3.)
- UPackageMapClient
  Maintains the mapping between objects and FNetworkGUIDs; each connection owns one.
  Maps objects and names to and from indices for network communication.
This code is fairly complex. This section only covers the key points; the details still require reading the source.
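The division of labor quoted above (one shared change history per object, one cursor per connection) can be sketched with plain integers standing in for properties. The compare against a shadow copy happens once per object, and each connection then only gathers the changelists it has not yet seen. Every name below is hypothetical, and the sketch leaves out history merging, conditions, and serialization.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical per-object change sampler: compares the live property values
// against a shadow copy once, and appends the changed indices to a history.
struct SketchChangelistMgr {
    std::vector<int> Shadow;                        // last sampled property values
    std::vector<std::vector<std::size_t>> History;  // one changelist per sample

    explicit SketchChangelistMgr(const std::vector<int>& Initial) : Shadow(Initial) {}

    // Current must have the same number of properties as the shadow copy.
    void Update(const std::vector<int>& Current) {
        std::vector<std::size_t> Changed;
        for (std::size_t i = 0; i < Shadow.size(); ++i) {
            if (Current[i] != Shadow[i]) {
                Changed.push_back(i);
                Shadow[i] = Current[i];
            }
        }
        if (!Changed.empty()) History.push_back(Changed);
    }
};

// Hypothetical per-connection state: remembers how much history was already sent.
struct SketchConnectionRepState {
    std::size_t HistoryCursor = 0;

    // Returns the union of all changelists added since the last call.
    std::set<std::size_t> CollectChanges(const SketchChangelistMgr& Mgr) {
        std::set<std::size_t> Out;
        for (; HistoryCursor < Mgr.History.size(); ++HistoryCursor) {
            const std::vector<std::size_t>& List = Mgr.History[HistoryCursor];
            Out.insert(List.begin(), List.end());
        }
        return Out;
    }
};
```

A connection that checks often sees small changelists; one that checks rarely, or joins late, receives the merged result of everything it missed without any extra comparison work.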
Open question:
When an Actor is replicated, how are its subobjects replicated, and how are mutual references between objects resolved?
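How a client connects to the server
The control messages below are used for the connection handshake between client and server. When reading the source, look for the point at which each one is sent.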
// message type definitions
DEFINE_CONTROL_CHANNEL_MESSAGE_TWOPARAM(Hello, 0, uint8, uint32); // initial client connection message
DEFINE_CONTROL_CHANNEL_MESSAGE_THREEPARAM(Welcome, 1, FString, FString, FString); // server tells client they're ok'ed to load the server's level
DEFINE_CONTROL_CHANNEL_MESSAGE_ONEPARAM(Upgrade, 2, uint32); // server tells client their version is incompatible
DEFINE_CONTROL_CHANNEL_MESSAGE_ONEPARAM(Challenge, 3, FString); // server sends client challenge string to verify integrity
DEFINE_CONTROL_CHANNEL_MESSAGE_ONEPARAM(Netspeed, 4, int32); // client sends requested transfer rate
DEFINE_CONTROL_CHANNEL_MESSAGE_THREEPARAM(Login, 5, FString, FString, FUniqueNetIdRepl); // client requests to be admitted to the game
DEFINE_CONTROL_CHANNEL_MESSAGE_ONEPARAM(Failure, 6, FString); // indicates connection failure
DEFINE_CONTROL_CHANNEL_MESSAGE_ZEROPARAM(Join, 9); // final join request (spawns PlayerController)
DEFINE_CONTROL_CHANNEL_MESSAGE_TWOPARAM(JoinSplit, 10, FString, FUniqueNetIdRepl); // child player (splitscreen) join request
DEFINE_CONTROL_CHANNEL_MESSAGE_ONEPARAM(Skip, 12, FGuid); // client request to skip an optional package
DEFINE_CONTROL_CHANNEL_MESSAGE_ONEPARAM(Abort, 13, FGuid); // client informs server that it aborted a not-yet-verified package due to an UNLOAD request
DEFINE_CONTROL_CHANNEL_MESSAGE_ONEPARAM(PCSwap, 15, int32); // client tells server it has completed a swap of its Connection->Actor
DEFINE_CONTROL_CHANNEL_MESSAGE_ONEPARAM(ActorChannelFailure, 16, int32); // client tells server that it failed to open an Actor channel sent by the server (e.g. couldn't serialize Actor archetype)
DEFINE_CONTROL_CHANNEL_MESSAGE_ONEPARAM(DebugText, 17, FString); // debug text sent to all clients or to server
DEFINE_CONTROL_CHANNEL_MESSAGE_TWOPARAM(NetGUIDAssign, 18, FNetworkGUID, FString); // Explicit NetworkGUID assignment. This is rare and only happens if a netguid is only serialized client->server (this msg goes server->client to tell client what ID to use in that case)
DEFINE_CONTROL_CHANNEL_MESSAGE_ONEPARAM(SecurityViolation, 19, FString); // server tells client that it has violated security and has been disconnected
DEFINE_CONTROL_CHANNEL_MESSAGE_TWOPARAM(GameSpecific, 20, uint8, FString); // custom game-specific message routed to UGameInstance for processing
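One way to keep the handshake messages straight while reading the source is to model the server side as a small state machine. The ordering used below (Hello, Challenge, Login, Welcome, Netspeed, Join) is inferred from the comments on the message definitions and is an assumption to verify against the engine version at hand; the enum and struct names are hypothetical, and None is a placeholder for "no reply", not a real message type.

```cpp
#include <cassert>

// Subset of the control messages, plus None as a "no reply" placeholder.
enum class HandshakeMsg { None, Hello, Challenge, Login, Welcome, Netspeed, Join, Failure };

enum class HandshakeState { AwaitHello, AwaitLogin, AwaitJoin, Joined, Failed };

// Hypothetical server-side view of the handshake for one connection.
struct SketchServerHandshake {
    HandshakeState Current = HandshakeState::AwaitHello;

    // Consumes one client message and returns the server's reply (or None).
    // Any message that is unexpected in the current state fails the handshake.
    HandshakeMsg OnClientMessage(HandshakeMsg In) {
        switch (Current) {
        case HandshakeState::AwaitHello:
            if (In == HandshakeMsg::Hello) {
                Current = HandshakeState::AwaitLogin;
                return HandshakeMsg::Challenge;  // server challenges the client
            }
            break;
        case HandshakeState::AwaitLogin:
            if (In == HandshakeMsg::Login) {
                Current = HandshakeState::AwaitJoin;
                return HandshakeMsg::Welcome;  // client may now load the level
            }
            break;
        case HandshakeState::AwaitJoin:
            if (In == HandshakeMsg::Netspeed) {
                return HandshakeMsg::None;  // rate request, no state change
            }
            if (In == HandshakeMsg::Join) {
                Current = HandshakeState::Joined;  // PlayerController would spawn here
                return HandshakeMsg::None;
            }
            break;
        case HandshakeState::Joined:
            return HandshakeMsg::None;  // handshake finished; ignored in this sketch
        case HandshakeState::Failed:
            break;
        }
        Current = HandshakeState::Failed;
        return HandshakeMsg::Failure;
    }
};
```

Messages such as Upgrade, JoinSplit, or PCSwap are left out; they would add branches to the same structure.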
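High-level replication strategy
This section describes the strategy that decides which objects are replicated to a client, and when. Replicating every object to every client is not feasible (only objects relevant to that client's player are replicated), and neither is replicating all data within a single frame, so replication has to be paced according to priority and update frequency.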
Relevant function flow (executed on the server):
void UNetDriver::TickFlush(float DeltaSeconds);
- int32 UNetDriver::ServerReplicateActors(float DeltaSeconds);
  Replication work is performed in this function. The key steps are:
  - ServerReplicateActors_BuildConsiderList( ConsiderList, ServerTickTime );
    Builds the list of Actors to consider for replication.
  - For each client Connection:
    - ServerReplicateActors_PrioritizeActors( Connection, ConnectionViewers, ConsiderList, bCPUSaturated, PriorityList, PriorityActors );
      Evaluates the NetPriority of the Actors in ConsiderList, sorts them, and puts the result in PriorityActors.
    - ServerReplicateActors_ProcessPrioritizedActors( Connection, ConnectionViewers, PriorityActors, FinalSortedCount, Updated );
      Iterates over PriorityActors and replicates each Actor (the main work happens in Channel->ReplicateActor()).
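The consider, prioritize, and process steps can be sketched for a single connection as: filter to relevant actors, sort by a score, then replicate in that order until the connection's budget for the tick runs out. Everything below is a hypothetical simplification. In particular, the score (NetPriority multiplied by the time since the last update) and the fixed per-actor byte cost are assumptions chosen for illustration, not the engine's actual formula.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical per-actor bookkeeping for one connection.
struct SketchActorInfo {
    int Id;
    float NetPriority;          // static importance of the actor
    float SecondsSinceLastRep;  // starvation term: grows while the actor waits
    bool bRelevant;             // relevant to this connection's viewer?
    int CostBytes;              // assumed cost of replicating it this tick
};

// Returns the ids of the actors replicated to one connection this tick.
std::vector<int> ReplicateForConnection(const std::vector<SketchActorInfo>& ConsiderList,
                                        int BudgetBytes) {
    // Step 1: keep only actors relevant to this connection.
    std::vector<SketchActorInfo> Prioritized;
    for (const SketchActorInfo& A : ConsiderList) {
        if (A.bRelevant) Prioritized.push_back(A);
    }
    // Step 2: highest score first (assumed score: priority * waiting time).
    std::stable_sort(Prioritized.begin(), Prioritized.end(),
                     [](const SketchActorInfo& L, const SketchActorInfo& R) {
                         return L.NetPriority * L.SecondsSinceLastRep >
                                R.NetPriority * R.SecondsSinceLastRep;
                     });
    // Step 3: replicate in order until the connection is saturated.
    std::vector<int> Updated;
    for (const SketchActorInfo& A : Prioritized) {
        if (A.CostBytes > BudgetBytes) break;  // the rest wait for a later tick
        BudgetBytes -= A.CostBytes;
        Updated.push_back(A.Id);
    }
    return Updated;
}
```

Because the waiting time feeds into the score, an actor skipped in one tick climbs the list in the next, which is how low-priority actors still get updated eventually.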
Key functions
bool UActorChannel::ReplicateActor();
bool FObjectReplicator::ReplicateProperties( FOutBunch & Bunch, FReplicationFlags RepFlags );