Performance Optimization: sleep, sched_yield, and Busy Waiting

I recently helped a company optimize its quantitative trading system, and came across code like this:

void xxxx::MonitorThread()
{
    while (m_running)
    {
        MonitorOrders();
        Sleep(0);
    }
}

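The order-monitoring thread calls Sleep(0). The design is an endless loop: process every order in the queue, then call Sleep(0) to give up the CPU so that other threads get a chance to run.
The whole system relies heavily on this Sleep(0) pattern. Is it a good fit?
As everyone knows, in quantitative trading speed is everything.
There is in fact another system call, sched_yield, that also gives up the CPU. Its man page describes it as follows: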

NAME
       sched_yield - yield the processor

SYNOPSIS
       #include <sched.h>

       int sched_yield(void);

DESCRIPTION
       sched_yield() causes the calling thread to relinquish the CPU.  The thread is moved to the end of the queue for its static priority and a new thread gets to run.

RETURN VALUE
       On success, sched_yield() returns 0.  On error, -1 is returned, and errno is set appropriately.

ERRORS
       In the Linux implementation, sched_yield() always succeeds.

CONFORMING TO
       POSIX.1-2001, POSIX.1-2008.

NOTES
       If the calling thread is the only thread in the highest priority list at that time, it will continue to run after a call to sched_yield().

       POSIX systems on which sched_yield() is available define _POSIX_PRIORITY_SCHEDULING in <unistd.h>.

       Strategic  calls to sched_yield() can improve performance by giving other threads or processes a chance to run when (heavily) contended resources (e.g., mutexes) have been released by the caller.  Avoid calling sched_yield() unnecessarily
       or inappropriately (e.g., when resources needed by other schedulable threads are still held by the caller), since doing so will result in unnecessary context switches, which will degrade system performance.

       sched_yield() is intended for use with real-time scheduling policies (i.e., SCHED_FIFO or SCHED_RR).  Use of sched_yield() with nondeterministic scheduling policies such as SCHED_OTHER is unspecified and very likely means your application
       design is broken.

In other words, if the calling thread is the only thread in the highest-priority list, it simply keeps running after calling sched_yield().
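Since the man page says sched_yield() is intended for real-time policies, it can be worth checking which policy a thread actually runs under before relying on it. The following is only a minimal sketch under that assumption: the helper names is_realtime_policy, caller_is_realtime and yield_if_realtime are made up for illustration, and only sched_getscheduler(), sched_yield() and the SCHED_* constants come from <sched.h>.

```c
#include <assert.h>
#include <sched.h>

/* Hypothetical helper: true for the policies the man page names as the
 * intended users of sched_yield(). */
static int is_realtime_policy(int policy)
{
    return policy == SCHED_FIFO || policy == SCHED_RR;
}

/* Hypothetical helper: query the policy of the calling process. */
static int caller_is_realtime(void)
{
    int policy = sched_getscheduler(0);   /* 0 means "the caller" */
    if (policy == -1)
        return 0;                         /* query failed: assume not real-time */
    return is_realtime_policy(policy);
}

/* Hypothetical helper: yield only where the behaviour is specified. */
static void yield_if_realtime(void)
{
    if (caller_is_realtime()) {
        int rc = sched_yield();
        assert(rc == 0);   /* the man page says Linux always succeeds */
        (void)rc;          /* keep NDEBUG builds warning-free */
    }
}
```

Under the default SCHED_OTHER policy, caller_is_realtime() returns 0, so yield_if_realtime() does nothing. Whether skipping the yield is the right choice depends on the application; the sketch only makes the man page's warning explicit.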
Let's compare the performance of sched_yield and sleep(0):

root@iZ2zefnvk8kwih8l62w90yZ:/data# more test.c
#include <sched.h>
#include <unistd.h>

int main(int argc, char **argv) {

    for (int i = 0; i < 100000; i++) {
        //sleep(0);
        sched_yield();
    }

    return 0;
}
root@iZ2zefnvk8kwih8l62w90yZ:/data# time ./test  

real    0m6.186s
user    0m0.092s
sys     0m0.460s
root@iZ2zefnvk8kwih8l62w90yZ:/data# time ./test

real    0m0.043s
user    0m0.012s
sys     0m0.031s

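That is 6.186 s for the sleep(0) build (first run) versus 0.043 s for the sched_yield() build (second run), or roughly 62 µs versus 0.43 µs per call. Where does the gap come from?
sleep(0) really does put the thread to sleep: the scheduler removes the task from the run queue (a red-black tree) and parks it until a timer wakes it up again, and that round trip is expensive. The roughly 62 µs per call is also consistent with Linux's default timer slack of 50 µs for normal threads, which lets the kernel delay the wakeup. sched_yield(), by contrast, leaves the thread runnable.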
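To reproduce the comparison without recompiling between runs, one could time both calls inside the same process. The following is a minimal sketch, assuming Linux with clock_gettime(CLOCK_MONOTONIC); the helper names call_sleep0, call_yield and avg_ns_per_call are made up here and are not part of the original test.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical wrappers so both calls share one function-pointer type. */
static void call_sleep0(void) { sleep(0); }
static void call_yield(void)  { sched_yield(); }

/* Average wall-clock cost of one call to fn, in nanoseconds. */
static double avg_ns_per_call(void (*fn)(void), int iters)
{
    struct timespec t0, t1;

    assert(iters > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iters; i++)
        fn();
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / iters;
}
```

Calling avg_ns_per_call(call_sleep0, 100000) and avg_ns_per_call(call_yield, 100000) from main and printing both values would be expected to show a gap of the same order of magnitude as the time output above, although the exact numbers depend on the kernel, the scheduler settings and the machine load.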
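What we really want is for the order-execution thread to keep running all the time if at all possible, which is effectively busy waiting. Let's see how the threaded I/O code in Redis 6.0 and later does its busy waiting: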

void *IOThreadMain(void *myid) {
    /* The ID is the thread number (from 0 to server.iothreads_num-1), and is
     * used by the thread to just manipulate a single sub-array of clients. */
    long id = (unsigned long)myid;
    char thdname[16];

    snprintf(thdname, sizeof(thdname), "io_thd_%ld", id);
    redis_set_thread_title(thdname);
    redisSetCpuAffinity(server.server_cpulist);
    makeThreadKillable();

    while(1) {
        /* Wait for start */
        for (int j = 0; j < 1000000; j++) {
            if (getIOPendingCount(id) != 0) break;
        }

        /* Give the main thread a chance to stop this thread. */
        if (getIOPendingCount(id) == 0) {
            pthread_mutex_lock(&io_threads_mutex[id]);
            pthread_mutex_unlock(&io_threads_mutex[id]);
            continue;
        }

        serverAssert(getIOPendingCount(id) != 0);

        /* Process: note that the main thread will never touch our list
         * before we drop the pending count to 0. */
        listIter li;
        listNode *ln;
        listRewind(io_threads_list[id],&li);
        while((ln = listNext(&li))) {
            client *c = listNodeValue(ln);
            if (io_threads_op == IO_THREADS_OP_WRITE) {
                writeToClient(c,0);
            } else if (io_threads_op == IO_THREADS_OP_READ) {
                readQueryFromClient(c->conn);
            } else {
                serverPanic("io_threads_op value is unknown");
            }
        }
        listEmpty(io_threads_list[id]);
        setIOPendingCount(id, 0);
    }
}

Specifically, the relevant part is:

        /* Wait for start */
        for (int j = 0; j < 1000000; j++) {
            if (getIOPendingCount(id) != 0) break;
        }

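In other words, the thread keeps the CPU busy until it sees a nonzero pending I/O count for its queue, or until the loop has run 1,000,000 times. If there is still no work after that, it briefly takes and releases its mutex, which gives the main thread a chance to stop it, and then starts over.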

So how do spin lock implementations handle this?

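  • Simply checking the spin lock over and over in a tight loop burns a lot of CPU.
  • Calling sleep(0) or sched_yield() on every failed attempt forces a ring 3 -> ring 0 transition, and possibly a context switch, so latency is high.
  • The best approach is to check the spin lock for a few rounds, then call sleep(0) or sched_yield() to switch out, and repeat.
    While checking the spin lock, you can use the _mm_pause intrinsic (the pause instruction). It tells the CPU that the code is in a spin-wait loop, so the CPU does not need to run the check at full speed, for example by keeping the pipeline completely full; this reportedly also saves about 4% in power. The instructions after pause may be delayed for a while, but the length of that delay is not under our control and is decided entirely by the CPU, so we must not rely on it or assume a particular value.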

Essentially, the pause instruction delays the next instruction's execution for a finite period of time. By delaying the execution of the next instruction, the processor is not under demand, and parts of the pipeline are no longer being used, which in turn reduces the power consumed by the processor.
The pause instruction can be used in conjunction with a Sleep(0) to construct something similar to an exponential back-off in situations where the lock or more work may become available in a short period of time, and the performance may benefit from a short spin in ring 3. It is important to note that the number of cycles delayed by the pause instruction may vary from one processor family to another. You should avoid using multiple pause instructions, assuming you will introduce a delay of a specific cycle count. Since you cannot guarantee the cycle count from one system to the next, you should check the lock in between each pause to avoid introducing unnecessarily long delays on new systems.

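According to Intel's documentation, this instruction is designed specifically for spin-lock scenarios. Judging by the documentation, its latency is quite high (I am not sure whether this latency is the delay before the next instruction executes).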

void _mm_pause (void)
#include <emmintrin.h>
Instruction: pause
CPUID Flags: SSE2
Provide a hint to the processor that the code sequence is a spin-wait loop. This can help improve the performance and power consumption of spin-wait loops.

Architecture    Latency    Throughput (CPI)
Skylake         140        140

The resulting code looks like this:

ATTEMPT_AGAIN:
  if (!acquire_lock())
  {
    /* Spin on pause max_spin_count times before backing off to sleep */
    for(int j = 0; j < max_spin_count; ++j)
    {
      /* pause intrinsic */
      _mm_pause();
      if (read_volatile_lock())
      {
        if (acquire_lock())
        {
          goto PROTECTED_CODE;
        }
      }
    }

    /* Pause loop didn't work, sleep now */
    Sleep(0);
    goto ATTEMPT_AGAIN;
  }
PROTECTED_CODE:
  get_work();
  release_lock();
  do_work();

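The author focuses on the design of high-concurrency, low-latency distributed systems. For technical questions, please contact Li Ge.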
