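I think my earlier posts have covered most of my private notes on the basics, so it is time to talk about hands-on work.
Let's start with my notes on compiling kbe on Ubuntu, because the official documentation for building kbe on Ubuntu has problems.
This follows the earlier post 游戏服务端常用架构 (Common Game Server Architectures).
The project ships an automated Python install script, but it still has quite a few small pitfalls.
In practice the script only does two essential things; everything else is optional:
- Configure the environment variables
- Install MySQL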
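Pointers and arrays are central to C/C++ programming and also among the harder parts to understand. Multi-level pointers and "multidimensional" arrays in particular leave many people lost. With the right approach, though, they can be understood as easily as single-level pointers and one-dimensional arrays.
Arrays and pointers are tied together by the subscript operator []: for an array declared as int a[3][2], the expression a[i][j] is equivalent to *(*(a+i)+j).
The dereference operator takes the pointer's current address value and the type it points to, accesses a contiguous block of memory (whose size is determined by the pointed-to type), interprets the contents of that block as that type, and yields an lvalue.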
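char str[] = {0, 1, 2, 3}; /* initialized with character codes (ASCII values) */
char * pc = &str[0]; /* pc points to str[0], i.e. 0 */
int * pi = (int *) pc; /* a pointer's "value" is an address (32 bits on a 32-bit system) */
Now pc and pi both point to str[0], but *pc is 0 (the character whose ASCII code is 0), while *pi is 50462976. Written in hexadecimal it is easier to see: 0x03020100 (the four bytes are 3, 2, 1, 0). The machine is little-endian and pi points to int, so dereferencing it reads 4 contiguous bytes and interprets them as an int.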
char a[3];
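The array has 3 elements of type char. If we want a pointer to this array, that is, if we want to assign the array name a to a pointer variable, what type should that pointer have? As mentioned earlier, an array name stands for the address of its first element, equivalent to &a[0]; since a[0] has type char, &a[0] has type char *.
char * p = a; // same as char * p = &a[0]
Both a and &a[0] denote the address of the first element, and if you print &a you will find that it too equals that address. Note the wording: &a is numerically equal to the address of the first element, but its type is not that of a pointer to the first element, so char *p = &a is a type mismatch.
As stated earlier, taking the address of an array name yields a pointer to the whole array, so &a has type char (*)[3] and the correct assignment is: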
char (*p)[3] = &a;
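Expressions such as a+1 and &a+1 are a common source of confusion, but everything clears up once the pointer types are known. a denotes the address of the first element and has type char *, so a+1 is the array's start address plus sizeof(char); &a has type char (*)[3] and stands for the whole array, so &a+1 is the start address plus sizeof(a). Now suppose we have the following two-dimensional array: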
char a[3][2];
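Since multidimensional arrays do not really exist, a[3][2] can be seen as a one-dimensional array of 3 elements, each of which is itself a one-dimensional array. In memory the array is indeed stored as one contiguous run, in this order (low addresses first): a[0][0], a[0][1], a[1][0], a[1][1], a[2][0], a[2][1]. (C and C++ always use this row-major order; column-major storage belongs to other languages such as Fortran.)
The array can be viewed one dimension at a time. In the first dimension, a[3][2] is a one-dimensional array with three elements, a[0], a[1] and a[2], each of which is in turn a one-dimensional array of two char elements. In the second dimension, a[0], a[1] and a[2] each act as the name of a "second-dimension" array. Take a[0]: the array it names has two char elements, and a[0] as an array name stands for the address of its first element, so a[0] has type char *. The same holds for a[1] and a[2]. a itself is the name of the first-dimension array and stands for the address of its first element; that element is an array of two chars, so a is a pointer to an array of two chars, that is, char (*)[2].
char (*p)[2] = a; // a is the name of the first-dimension array; its type is char (*)[2]
Likewise, taking the address of a yields the address of the whole array, whose type is a pointer to the array type, char (*)[3][2], so the following assignment is correct: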
char (*p)[3][2] = &a;
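int a[3][4] = {0,1,2,3,4,5,6,7,8,9,10,11};
p is a two-level pointer: it is first of all a pointer, and it points to an int *.
a is the name of a two-dimensional array: in an expression it decays to a pointer to an array of 4 ints.
So a and p have different types, and assigning a to p requires a cast.
Suppose we cast a and assign it to p:
p = (int**)a;
Since p is a two-level pointer, consider what happens when we evaluate **p.
First look at p: it points to a[0][0], so its value is the address of a[0][0].
Next look at *p: the type p points to is int *, which occupies 4 bytes on a 32-bit system. Following the dereference process described earlier, 4 contiguous bytes are read starting at the address held in p, so *p yields exactly the value of a[0][0], which is 0.
Finally look at **p: the program crashes. Of course it does, because it accesses memory at address 0, which the process is not allowed to touch. (Formally this is undefined behavior; a segmentation fault is the typical outcome.)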
char a[3][2][2];
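As before, picturing the logical memory layout helps, and the analysis mirrors the two-dimensional case. Seen from the first dimension, a[3][2][2] is a one-dimensional array with three elements a[0], a[1] and a[2], each of which is a "two-dimensional" array. As the name of the first-dimension array, a stands for the address of its first element, so it is a pointer to a two-dimensional array, of type char (*)[2][2]. Seen from the second dimension, a[0], a[1] and a[2] are the names of the second-dimension arrays and stand for the addresses of their first elements, so each is a pointer to a one-dimensional array, of type char (*)[2]. Seen from the third dimension, a[0][0], a[0][1], a[1][0], a[1][1], a[2][0] and a[2][1] are the names of the third-dimension arrays and stand for the addresses of their first elements, so each is a pointer to char, of type char *.
char (*p)[3][2][2] = &a; // taking the address of the array name gives a pointer to the whole array
char *p = "my name is chenyang.";
Multi-level pointers are commonly used as function parameters. The familiar declarations of main are an example: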
int main(int argc,char ** argv)
int main(int argc,char* argv[])
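argv receives the command-line arguments entered by the user. They are passed in as an array of strings, along these lines:
// simulate the arguments passed in by the user
void * get_memery(int size)
int get_memery(int** buf,int size)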
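IPC stands for Inter Process Communication. The main mechanisms are listed below (deliberately out of numeric order):
6. Shared memory (very practical; the two most common forms, mmap and System V shared memory, are covered later):
Shared memory maps a region of memory that other processes can access. The region is created by one process but can be accessed by many. It is the fastest form of IPC and was designed specifically because other IPC mechanisms are slow. It is usually combined with another mechanism, such as semaphores, to achieve both synchronization and communication between processes.
3. Semaphore (mainly used to synchronize processes or threads; System V semaphores are covered later):
7. Socket:
1. Anonymous pipe (pipe; a very primitive form of IPC):
2. Named pipe (named pipe or FIFO; a very primitive form of IPC):
4. Message queue (being phased out):
5. Signal: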
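An obvious benefit of communicating through shared memory is efficiency: processes read and write memory directly, with no data copying in between. Mechanisms such as pipes and message queues need four data copies between kernel and user space, whereas shared memory needs only two [1]: one from the input file into the shared region and one from the shared region to the output file. In practice, processes sharing memory do not unmap after every small read or write and then set the region up again for the next exchange. They keep the region until communication is finished, so the data stays in shared memory and is not written back to the file in the meantime. The contents are usually written back to the file only when the mapping is removed. This is why communication through shared memory is so efficient.
The Linux 2.2.x kernels support several kinds of shared memory: the mmap() system call, POSIX shared memory, and System V shared memory. Distributions such as Redhat 8.0 support mmap() and System V shared memory but do not yet implement POSIX shared memory. This article covers the principles and use of the mmap() system call and the System V shared memory API.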
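1. Telling pages apart in the page cache and swap cache: the physical pages of an accessed file all reside in the page cache or swap cache, and everything about a page is described by a struct page. struct page has a pointer field, mapping, that points to a struct address_space. Every page in the page cache or swap cache is identified by an address_space plus an offset.
2. How files correspond to address_space: once a file is opened, the kernel builds a struct inode for it in memory, whose i_mapping field points to an address_space. One file therefore corresponds to one address_space, and an address_space plus an offset identifies one page in the page cache or swap cache. To locate some data, the kernel can easily find the page from the file and the data's offset within it.
3. When a process calls mmap(), the kernel only adds a region of the requested size to the process address space and sets its access flags. No mapping from that region to physical pages is created yet, so the first access to the region triggers a page fault.
4. For a shared memory mapping, the page fault handler first looks for the target page (the physical page matching the address_space and offset) in the swap cache. If it is found, its address is returned directly. If not, the handler checks whether the page is in the swap area and, if so, swaps it in. If neither is the case, the handler allocates a new physical page and inserts it into the page cache. Finally the process page table is updated.
Note: when an ordinary file is mapped (not a shared mapping), the page fault handler first looks in the page cache for the page matching the address_space and data offset. If it is not found, the file data has not yet been read into memory, so the handler reads the page from disk and returns its address; the process page table is updated as well.
Note: a shared memory region can be regarded as a file in the special file system shm, whose mount point is on the swap area.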
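The advantages and disadvantages of shared memory are as follows.
Advantages: because the data is shared, it does not have to be transferred between processes. Each process accesses the memory directly, which makes programs faster. Unlike anonymous pipes, shared memory also does not require the communicating processes to be related as parent and child.
Disadvantages: shared memory provides no synchronization mechanism, so processes that communicate through it usually have to rely on some other means to synchronize with each other.
The mmap() system call lets processes share memory by mapping the same ordinary file. Once the file is mapped into the process address space, the process can access it like ordinary memory, without calling read(), write() and so on.
Note: mmap() was not designed solely for shared memory. It offers a way of accessing ordinary files that differs from the usual one, letting a process operate on a file as if reading and writing memory. POSIX and System V shared memory IPC, by contrast, exist purely for sharing. Implementing shared memory is nevertheless one of the main uses of mmap().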
void* mmap (void * addr , size_t len , int prot , int flags , int fd , off_t offset)
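The parameters of mmap() are not described in detail here; see the mmap() manual page for more information.
(1) Memory mapping backed by an ordinary file: suitable for any processes. A file must be opened or created first, and mmap() is called afterwards. A typical call sequence is:
fd=open(name, flag, mode);
ptr=mmap(NULL, len , PROT_READ|PROT_WRITE, MAP_SHARED , fd , 0);
Communicating through shared memory built on mmap() has a number of characteristics and caveats, which the examples illustrate.
(2) Anonymous memory mapping backed by a special file: suitable for related processes. Because of the parent-child relationship, the parent calls mmap() first and fork() afterwards.
After fork(), the child inherits the parent's address space including the anonymous mapping, and it also inherits the address returned by mmap(), so parent and child can communicate through the mapped region.
Unlike ordinarily inherited variables, which the child keeps as its own copy, the region at the address returned by mmap() is maintained jointly by parent and child.
In this case no file needs to be specified; setting the appropriate flag is enough. See example 2.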
int munmap(void * addr, size_t len)
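This call removes a mapping from the process address space. addr is the address returned by mmap() and len is the size of the mapped region. Once the mapping is removed, any access to the previously mapped address causes a segmentation fault.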
int msync (void * addr , size_t len, int flags)
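In general, changes a process makes to shared content in the mapped region are not written straight back to the disk file; the write-back is deferred, often until after munmap() is called. Calling msync() makes the file content on disk consistent with the content of the shared memory region.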
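Two examples of using mmap() follow.
The mmap() system call has many interesting aspects. The example below uses mmap() on an ordinary file for communication between processes, and we use it to show the characteristics and caveats of shared memory built on mmap().
Example 1 consists of two programs: map_normalfile1.c and map_normalfile2.c.
Compiling them produces the executables map_normalfile1 and map_normalfile2.
map_normalfile1 tries to open the ordinary file named on the command line, maps it into the process address space, and writes to the mapped region.
map_normalfile2 maps the file named on the command line into the process address space and then reads from the mapped region.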
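/*-------------map_normalfile1.c-----------*/
/*-------------map_normalfile2.c-----------*/
map_normalfile1.c first defines a people data structure. (A structure is used because data in a shared memory region usually has a fixed format agreed on by the communicating processes, so a structure is a representative choice.) map_normalfile1 opens or creates a file and sets its length to the size of 5 people structures. Starting at the address returned by mmap(), it then fills in 10 people structures. The process sleeps for 10 seconds to give other processes time to map the same file, and finally removes the mapping.
map_normalfile2.c simply maps a file, reads 10 people structures from the address returned by mmap(), prints the values it read, and removes the mapping.
After compiling the two programs into the executables map_normalfile1 and map_normalfile2, first run ./map_normalfile1 /tmp/test_shm in one terminal. The program prints: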
initialize over
umap ok
After map_normalfile1 prints initialize over and before it prints umap ok, run map_normalfile2 /tmp/test_shm in another terminal. It produces the following output (slightly tidied to save space):
name: b age 20;
name: c age 21;
name: d age 22;
name: e age 23;
name: f age 24;
name: g age 25;
name: h age 26;
name: I age 27;
name: j age 28;
name: k age 29;
If map_normalfile2 is run after map_normalfile1 has printed umap ok, the output is:
name: b age 20;
name: c age 21;
name: d age 22;
name: e age 23;
name: f age 24;
name: age 0;
name: age 0;
name: age 0;
name: age 0;
name: age 0;
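1. The final content of the mapped file never exceeds the file's initial size; mapping cannot change the size of the file.
2. The address range usable for communication is roughly, but not strictly, limited by the size of the mapped file. The opened file is truncated to the size of 5 people structures, yet map_normalfile1 initializes 10 of them. If map_normalfile2 is run at the right moment (after map_normalfile1 prints initialize over and before it prints umap ok), it prints the values of all 10 structures. This is discussed in detail later.
Note: in Linux, memory protection works at page granularity. Even if the mapped file is only one byte long, the process can access a full page starting at the address returned by mmap().
3. Once a file is mapped, accesses by the mapping process to the returned address are accesses to a memory region, temporarily decoupled from the file on disk. Operations on the mapped range take effect in memory; the contents reach the disk file after munmap() or when msync() is called, and what is written still cannot exceed the size of the file.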
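#include <semaphore.h>
int sem_init(sem_t *sem, int pshared, unsigned int value);
#include<stdio.h>
To use a semaphore across processes we bring in mmap. Lines 61 and 62 of the program open the two files that mmap will map. Unlike a typical call to open, the files are created here with permission 666. This matters: a file mapped for sharing must be readable and writable by the processes involved, otherwise they cannot communicate.
Compile command: gcc mmap_fork_sync.c -o mmap_fork_sync -pthread
Run it to get a feel for shared memory between parent and child processes: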
b@b-VirtualBox:~/tc/mmap_test$ ./mmap_fork_sync
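As the discussion of the example output already mentioned, Linux uses paged memory management. When mmap() maps an ordinary file, the process gains a new region in its address space whose size is given by the len argument of mmap(). Note that the process cannot necessarily access all of that new region. The size of the range it can validly access depends on the size of the mapped part of the file. Put simply, the smallest number of pages that can hold the mapped part of the file determines how much address space is validly accessible starting at the address returned by mmap(). Beyond that range, the kernel sends the process a different signal depending on how far out of range the access is.
#include <sys/mman.h>
As the comments in the program say, it is compiled in two versions that differ in the size of the mapped part of the file. The file size lies between one and two pages (pagesize*2-99). Version 1 maps the whole file. Version 2 maps what is left after subtracting one page from the file size, which is less than one page (pagesize-99). The program tries to access each page boundary, and both versions ask to map pagesize*3 bytes into the process address space.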
Version 1 prints:
pagesize is 4096
access page 1 over
access page 1 edge over, now begin to access page 2
access page 2 over
access page 2 over
access page 2 edge over, now begin to access page 3
Bus error // the mapped file covers two pages in the process address space, and the process is now trying to access the third page
Version 2 prints:
pagesize is 4096
access page 1 over
access page 1 edge over, now begin to access page 2
Bus error // the mapped file covers one page in the process address space, and the process is now trying to access the second page
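Conclusion: the mmap() system call is a convenient way to implement communication between processes, and its interface at the application level is very simple. The internal implementation touches on Linux memory management, file systems and more; looking at the key data structures involved helps deepen the understanding. A later part of this series covers the implementation of System V shared memory.
Now for System V shared memory.
As the name suggests, shared memory lets two unrelated processes access the same logical memory. It is a very effective way to share and pass data between two running processes. Memory shared between processes is usually placed in the same physical memory. Processes attach the same shared memory segment to their own address spaces, and every one of them can access addresses in it as if the memory had been allocated with the C function malloc. When one process writes to shared memory, the change is immediately visible to every other process that can access the same segment.
Shared memory itself provides no synchronization: before one process has finished writing to it, nothing automatically stops a second process from reading it. Some other mechanism is therefore usually needed to synchronize access to shared memory.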
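Linux provides a set of functions for working with shared memory. The first is shmget, which creates a shared memory segment. It needs the header:
#include <sys/shm.h>
Prototype: int shmget(key_t key, size_t size, int shmflg);
The first parameter, key (a non-zero integer), effectively names the shared memory segment. On success shmget returns a shared memory identifier (a non-negative integer) associated with key, for use in later shared memory calls; on failure it returns -1.
The identifier stands for a resource, and every access a program makes to shared memory is indirect. The program first calls shmget with a key, and the system produces the corresponding shared memory identifier (the return value of shmget).
The second parameter, size, gives the amount of memory to share, in bytes.
The third parameter, shmflg, holds the permission flags, which work like the mode argument of open. To create the segment when the one identified by key does not exist, OR the flags with IPC_CREAT. The permission bits mean the same as file permissions: for example, 0644 lets processes of the creating user read and write the segment, while processes of other users may only read it.
Once the segment exists, other processes can attach it to their own address space with shmat. Prototype: void *shmat(int shmid, const void *shmaddr, int shmflg);
shmid is the identifier returned by shmget; shmaddr and shmflg determine how the attach address is chosen. The return value is the address at which the segment is attached in the calling process, and the process can then read and write the segment through it.
shmdt detaches shared memory from the current process. Note that detaching does not delete the segment; it only makes it unavailable to the current process. Prototype: int shmdt(const void *shmaddr);
shmaddr is the pointer returned by shmat. The call returns 0 on success and -1 on failure.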
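Shared memory is the fastest form of inter-process communication, but it cannot solve its own synchronization problem (when a process may fetch data from shared memory and when it may not). Semaphores solve this easily. The following example shows how to use a semaphore to synchronize shared memory. Its main job is for writer to pass data to reader, with reader fetching the data only after writer has finished sending and blocking until then.
#include <sys/types.h>
#include <sys/types.h>
Open several terminals, run writer in each at the same time, and check whether reader reads the data correctly.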
writer :
[b@host 1105]$ ./writer
writer :
[b@host 1105]$ ./writer
reader :
[b@host 1105]$ ./reader
the NUM:22
the NUM:11
the NUM:55
the NUM:55
the NUM:5
the NUM:51
the NUM:9
the NUM:977
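For example, semaphores can be used to synchronize processes, because every operation on a semaphore is atomic.
Linux provides a set of functions for working with System V semaphores. The first is semget, which creates and opens a semaphore. It needs the header:
#include <sys/types.h>
Prototype: int semget(key_t key, int nsems, int semflg);
On success the function returns the semaphore identifier; on failure it returns -1. key is a key value, typically obtained by calling ftok. nsems is the number of semaphores to create; when only accessing an existing set it can be given as 0, and once the set is created its number of semaphores cannot be changed. As long as the set has not been removed, calling the function again for the same key returns the existing identifier rather than creating a new set.
semflg gives the read/write permissions of the semaphore. IPC_CREAT must be added when creating a semaphore. If IPC_CREAT | IPC_EXCL is specified and the semaphore already exists, creation fails.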
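semop changes the value of a semaphore. Prototype: int semop(int semid, struct sembuf *sops, size_t nsops);
semid is the semaphore identifier returned by semget. The sembuf structure is defined as follows:
struct sembuf {
    unsigned short sem_num; // 0 unless a set of semaphores is used
    short sem_op;  // the amount to change the semaphore by in one operation, usually one of two values:
                   // -1 for the P (wait) operation and +1 for the V (signal) operation
    short sem_flg; // usually SEM_UNDO, which makes the operating system track the semaphore
                   // and release it if the process terminates without releasing it
};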
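semctl controls semaphore information directly. Prototype: int semctl(int semid, int semnum, int cmd, ...);
If a fourth argument is present, it is usually a union semun, defined as follows:
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *arry;
};
semid is the same as in the previous function, and semnum selects the semaphore within the set. cmd is usually SETVAL or IPC_RMID. SETVAL initializes the semaphore to a known value, passed in the val member of union semun; its purpose is to set up the semaphore before it is first used. IPC_RMID removes a semaphore that is no longer needed.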
ipcs is a UNIX/Linux command that reports on the system's message queues, semaphores and shared memory.
ipcs -a lists all the IPC resources relevant to the current user. Sample output:
[b@host ~]$ ipcs -a
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x000004d1 32768 b 666 2052 0
0x000004d2 65537 b 666 2052 0
------ Semaphore Arrays --------
key semid owner perms nsems
------ Message Queues --------
key msqid owner perms used-bytes messages
ipcs -l lists the system limits
[b@host ~]$ ipcs -l
------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 4194303
max total shared memory (kbytes) = 1073741824
min seg size (bytes) = 1
------ Semaphore Limits --------
max number of arrays = 32000
max semaphores per array = 32000
max semaphores system wide = 1024000000
max ops per semop call = 500
semaphore max value = 32767
------ Messages: Limits --------
max queues system wide = 32000
max size of message (bytes) = 65536
default max size of queue (bytes) = 65536
ipcs -u lists the current usage
[b@host ~]$ ipcs -u
------ Shared Memory Status --------
segments allocated 2
pages allocated 2
pages resident 2
pages swapped 0
Swap performance: 0 attempts 0 successes
------ Semaphore Status --------
used arrays = 3
allocated semaphores = 3
------ Messages: Status --------
allocated queues = 0
used headers = 0
used space = 0 bytes
ipcs -t lists the most recent access times
[b@host ~]$ ipcs -t
------ Shared Memory Attach/Detach/Change Times --------
shmid owner attached detached changed
32768 b May 18 06:46:54 May 18 06:47:43 May 18 06:45:48
65537 b May 18 06:45:57 May 18 06:46:08 May 18 06:45:57
------ Semaphore Operation/Change Times --------
semid owner last-op last-changed
------ Message Queues Send/Recv/Change Times --------
msqid owner send recv change
The key points of state synchronization are:
Original title: State Synchronization (Keeping simulations in sync by sending state)
Hi, I’m Glenn Fiedler and welcome to Networked Physics.
In the previous article we discussed techniques for compressing snapshots.
In this article we round out our discussion of networked physics strategies with state synchronization, the third and final strategy in this article series.
What is state synchronization? The basic idea is that, somewhat like deterministic lockstep, we run the simulation on both sides but, unlike deterministic lockstep, we don’t just send input, we send both input and state.
This gives state synchronization interesting properties. Because we send state, we don’t need perfect determinism to stay in sync, and because the simulation runs on both sides, objects continue moving forward between updates.
This lets us approach state synchronization differently to snapshot interpolation. Instead of sending state updates for every object in each packet, we can now send updates for only a few, and if we’re smart about how we select the objects for each packet, we can save bandwidth by concentrating updates on the most important objects.
So what’s the catch? State synchronization is an approximate and lossy synchronization strategy. In practice, this means you’ll spend a lot of time tracking down sources of extrapolation divergence and pops. But other than that, it’s a quick and easy strategy to get started with.
Here’s the state sent over the network per-object:
struct StateUpdate
{
    int index;
    vec3f position;
    quat4f orientation;
    vec3f linear_velocity;
    vec3f angular_velocity;
};
Unlike snapshot interpolation, we’re not just sending visual quantities like position and orientation, we’re also sending non-visual state such as linear and angular velocity. Why is this?
The reason is that state synchronization runs the simulation on both sides, so it’s always extrapolating from the last state update applied to each object. If linear and angular velocity aren’t synchronized, this extrapolation is done with incorrect velocities, leading to pops when objects are updated.
While we must send the velocities, there’s no point wasting bandwidth sending (0,0,0) over and over while an object is at rest. We can fix this with a trivial optimization, like so:
void serialize_state_update( Stream & stream,
                             int & index,
                             StateUpdate & state_update )
{
    serialize_int( stream, index, 0, NumCubes - 1 );
    serialize_vector( stream, state_update.position );
    serialize_quaternion( stream, state_update.orientation );
    bool at_rest = stream.IsWriting() ? state_update.AtRest() : false;
    serialize_bool( stream, at_rest );
    if ( !at_rest )
    {
        serialize_vector( stream, state_update.linear_velocity );
        serialize_vector( stream, state_update.angular_velocity );
    }
    else if ( stream.IsReading() )
    {
        state_update.linear_velocity = vec3f(0,0,0);
        state_update.angular_velocity = vec3f(0,0,0);
    }
}
What you see above is a serialize function. It’s a trick I like to use to unify packet read and write. I like it because it’s expressive while at the same time it’s difficult to desync read and write. You can read more about them here.
Now let’s look at the overall structure of packets being sent:
const int MaxInputsPerPacket = 32;
const int MaxStateUpdatesPerPacket = 64;
struct Packet
{
    uint32_t sequence;
    Input inputs[MaxInputsPerPacket];
    int num_object_updates;
    StateUpdate state_updates[MaxStateUpdatesPerPacket];
};
First we include a sequence number in each packet so we can detect out of order, lost or duplicate packets. I recommend you run the simulation at the same framerate on both sides (for example 60Hz), and in this case the sequence number can work double duty as the frame number.
Input is included in each packet because it's needed for extrapolation. Like deterministic lockstep we send multiple redundant inputs so in the case of packet loss it's very unlikely that an input gets dropped. Unlike deterministic lockstep, if we don't have the next input we don't stop the simulation and wait for it, we continue extrapolating forward with the last input received.
Next you can see that we only send a maximum of 64 state updates per-packet. Since we have a total of 901 cubes in the simulation, we need some way to select the n most important state updates to include in each packet. We need some sort of prioritization scheme.
To get started each frame walk over all objects in your simulation and calculate their current priority. For example, in the cube simulation I calculate priority for the player cube as 1000000 because I always want it to be included in every packet, and for interacting (red cubes) I give them a higher priority of 100 while at rest objects have priority of 1.
Unfortunately if you just picked objects according to their current priority each frame you’d only ever send red objects while in a katamari ball and white objects on the ground would never get updated. We need to take a slightly different approach, one that prioritizes sending important objects while also distributing updates across all objects in the simulation.
You can do this with a priority accumulator. This is an array of float values, one value per-object, that is remembered from frame to frame. Instead of taking the immediate priority value for the object and sorting on that, each frame we add the current priority for each object to its priority accumulator value then sort objects in order from largest to smallest priority accumulator value. The first n objects in this sorted list are the objects you should send that frame.
You could just send state updates for all n objects but typically you have some maximum bandwidth you want to support like 256kbit/sec. Respecting this bandwidth limit is easy. Just calculate how large your packet header is and how many bytes of preamble in the packet (sequence, # of objects in packet and so on) and work out conservatively the number of bytes remaining in your packet while staying under your bandwidth target.
Then take the n most important objects according to their priority accumulator values and as you construct the packet, walk these objects in order and measure if their state updates will fit in the packet. If you encounter a state update that doesn’t fit, skip over it and try the next one. After you serialize the packet, reset the priority accumulator to zero for objects that fit but leave the priority accumulator value alone for objects that didn’t. This way objects that don’t fit are first in line to be included in the next packet.
The desired bandwidth can even be adjusted on the fly. This makes it really easy to adapt state synchronization to changing network conditions, for example if you detect the connection is having difficulty you can reduce the amount of bandwidth sent (congestion avoidance) and the quality of state synchronization scales back automatically. If the network connection seems like it should be able to handle more bandwidth later on then you can raise the bandwidth limit.
The priority accumulator covers the sending side, but on the receiver side there is much you need to do when applying these state updates to ensure that you don’t see divergence and pops in the extrapolation between object updates.
The very first thing you need to consider is that network jitter exists. You don’t have any guarantee that packets you sent nicely spaced out 60 times per-second arrive that way on the other side. What happens in the real world is you’ll typically receive two packets one frame, 0 packets the next, 1, 2, 0 and so on because packets tend to clump up across frames. To handle this situation you need to implement a jitter buffer for your state update packets. If you fail to do this you’ll have a poor quality extrapolation and pops in stacks of objects because objects in different state update packets are slightly out of phase with each other with respect to time.
All you do in a jitter buffer is hold packets before delivering them to the application at the correct time as indicated by the sequence number (frame number) in the packet. The delay you need to hold packets for in this buffer is a much smaller amount of time relative to interpolation delay for snapshot interpolation but it's the same basic idea. You just need to delay packets just enough (say 4-5 frames @ 60Hz) so that they come out of the buffer properly spaced apart.
Once the packet comes out of the jitter buffer how do you apply state updates? My recommendation is that you should snap the physics state hard. This means you apply the values in the state update directly to the simulation.
I recommend against trying to apply some smoothing between the state update and the current state at the simulation level. This may sound counterintuitive but the reason for this is that the simulation extrapolates from the state update so you want to make sure it extrapolates from a valid physics state for that object rather than some smoothed, total bullshit made-up one. This is especially important when you are networking large stacks of objects.
Surprisingly, without any smoothing the result is already pretty good:
As you can see it’s already looking quite good and barely any bandwidth optimization has been performed. Contrast this with the first video for snapshot interpolation which was at 18mbit/sec and you can see that using the simulation to extrapolate between state updates is a great way to use less bandwidth.
Of course we can do a lot better than this and each optimization we do lets us squeeze more state updates in the same amount of bandwidth. The next obvious thing we can do is to apply all the standard quantization compression techniques such as bounding and quantizing position, linear and angular velocity value and using the smallest three compression as described in snapshot compression.
But here it gets a bit more complex. We are extrapolating from those state updates so if we quantize these values over the network then the state that arrives on the right side is slightly different from the left side, leading to a slightly different extrapolation and a pop when the next state update arrives for that object.
The solution is to quantize the state on both sides. This means that on both sides before each simulation step you quantize the entire simulation state as if it had been transmitted over the network. Once this is done the left and right side are both extrapolating from quantized state and their extrapolations are very similar.
Because these quantized values are being fed back into the simulation, you’ll find that much more precision is required than snapshot interpolation where they were just visual quantities used for interpolation. In the cube simulation I found it necessary to have 4096 position values per-meter, up from 512 with snapshot interpolation, and a whopping 15 bits per-quaternion component in smallest three (up from 9). Without this extra precision significant popping occurs because the quantization forces physics objects into penetration with each other, fighting against the simulation which tries to keep the objects out of penetration. I also found that softening the constraints and reducing the maximum velocity which the simulation used to push apart penetrating objects also helped reduce the amount of popping.
With quantization applied to both sides you can see the result is perfect once again. It may look visually about the same as the uncompressed version but in fact we’re able to fit many more state updates per-packet into the 256kbit/sec bandwidth limit. This means we are better able to handle packet loss because state updates for each object are sent more rapidly. If a packet is lost, it’s less of a problem because state updates for those objects are being continually included in future packets.
Be aware that when a burst of packet loss occurs, like 1⁄4 of a second with no packets getting through (and it is inevitable that something like this will eventually happen), you will probably get a different result on the left and the right sides. We have to plan for this. In spite of all the effort we have made to ensure that the extrapolation is as close as possible (quantizing both sides and so on), pops can and will occur if the network stops delivering packets.
We can cover up these pops with smoothing.
Remember how I said earlier that you should not apply smoothing at the simulation level because it ruins the extrapolation? What we’re going to do for smoothing instead is calculating and maintaining position and orientation error offsets that we reduce over time. Then when we render the cubes in the right side we don’t render them at the simulation position and orientation, we render them at the simulation position + error offset, and orientation * orientation error.
Over time we work to reduce these error offsets back to zero for position error and identity for orientation error. For error reduction I use an exponentially smoothed moving average tending towards zero. So in effect, I multiply the position error offset by some factor each frame (eg. 0.9) until it gets close enough to zero for it to be cleared (thus avoiding denormals). For orientation, I slerp a certain amount (0.1) towards identity each frame, which has the same effect for the orientation error.
The trick to making this all work is that when a state update comes in you take the current simulation position and add the position error to that, and subtract that from the new position, giving the new position error offset which gives an identical result to the current (smoothed) visual position.
The same process is then applied to the error quaternion (using multiplication by the conjugate instead of subtraction) and this way you effectively calculate on each state update the new position error and orientation error relative to the new state such that the object appears to have not moved at all. Thus state updates are smooth and have no immediate visual effect, and the error reduction smoothes out any error in the extrapolation over time without the player noticing in the common case.
I find that using a single smoothing factor gives unacceptable results. A factor of 0.95 is perfect for small jitters because it smooths out high frequency jitter really well, but at the same time it is too slow for large position errors, like those that happen after multiple seconds of packet loss:
The solution I use is two different scale factors at different error distances, and to make sure the transition is smooth I blend between those two factors linearly according to the amount of positional error that needs to be reduced. In this simulation, having 0.95 for small position errors (25cms or less) while having a tighter blend factor of 0.85 for larger distances (1m error or above) gives a good result. The same strategy works well for orientation using the dot product between the orientation error and the identity matrix. I found that in this case a blend of the same factors between dot 0.1 and 0.5 works well.
The end result is smooth error reduction for small position and orientation errors combined with a tight error reduction for large pops. As you can see above you don’t want to drag out correction of these large pops, they need to be fast and so they’re over quickly otherwise they’re really disorienting for players, but at the same time you want to have really smooth error reduction when the error is small hence the adaptive error reduction approach works really well.
Even though I would argue the result above is probably good enough already, it is possible to improve the synchronization considerably from this point, for example to support a world with larger objects or more objects being interacted with. So let's work through some of those techniques and push this technique as far as it can go.
There is an easy compression that can be performed. Instead of encoding absolute position, if it is within a range of the player cube center, encode position as a relative offset to the player center position. In the common cases where bandwidth is high and state updates need to be more frequent (katamari ball) this provides a large win.
Next, what if we do want to perform some sort of delta encoding for state synchronization? We can but it’s quite different in this case than it is with snapshots because we’re not including every cube in every packet, so we can’t just track the most recent packet received and say, OK all these state updates in this packet are relative to packet X.
What you actually have to do is per-object update keep track of the packet that includes the base for that update. You also need to keep track of exactly the set of packets received so that the sender knows which packets are valid bases to encode relative to. This is reasonably complicated and requires a bidirectional ack system over UDP. Such a system is designed for exactly this sort of situation where you need to know exactly which packets definitely got through. You can find a tutorial on how to implement this in this article.
So assuming that you have an ack system, you know which packet sequence numbers got through. What you do then is per-state update write one bit if the update is relative or absolute, if absolute then encode with no base as before, otherwise if relative send the 16 bit sequence number per-state update of the base and then encode relative to the state update data sent in that packet. This adds 1 bit overhead per-update as well as 16 bits to identify the sequence number of the base per-object update. Can we do better?
Yes. It turns out that of course you're going to have to buffer on the send and receive side to implement this relative encoding and you can't buffer forever. In fact, if you think about it you can only buffer up a couple of seconds before it becomes impractical and in the common case of moving objects you're going to be sending the updates for the same object frequently (katamari ball) so practically speaking the base sequence will only be from a short time ago.
So instead of sending the 16 bit sequence base per-object, send in the header of the packet the most recent acked packet (from the reliability ack system) and per-object encode the offset of the base sequence relative to that value using 5 bits. This way at 60 packets per-second you can identify a state update with a base half a second ago. Any base older than this is unlikely to provide a good delta encoding anyway because it's old, so in that case just drop back to absolute encoding for that update.
Now let's look at the type of objects that are going to have these absolute encodings rather than relative. They're the objects at rest. What can we do to make them as efficient as possible? In the case of the cube simulation one bad result that can occur is that a cube comes to rest (turns grey) and then has its priority lowered significantly. If that very last update with the position of that object is missed due to packet loss, it can take a long time for that object to have its at rest position updated.
We can fix this by tracking objects which have recently come to rest and bumping their priority until an ack comes back for a packet they were sent in. Thus they are sent at an elevated priority compared with normal grey cubes (which are at rest and have not moved) and keep resending at that elevated rate until we know that update has been received, thus “committing” that grey cube to be at rest at the correct position.
And that’s really about it for this technique. Without anything fancy it’s already pretty good, and on top of that another order of magnitude improvement is available with delta compression, at the cost of significant complexity!