Preface

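It has been five years since I last wrote a purely technical article. Looking back, I still miss the days of exploring the ocean of computer technology with simple, passionate curiosity and digging into how things fundamentally work. This article covers the concept of threads, common patterns for using threads in C++, thread safety, locks, and a brief look through assembly at what C++ std::thread really is underneath.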

Why write about threads

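Threads are how modern operating systems implement multitasking at the lowest level. Whether in interviews or when designing concurrent workloads, countless programmers cannot avoid having to master them. They are also a reliable source of extremely painful debugging sessions when something goes wrong.
Newer languages such as Python and Go wrap threads in high-level abstractions and free developers from much of the work through their syntax and programming models. Still, in scenarios with extreme performance requirements you need to understand what threads really are, and from there how Python and Go wrap them, what side effects those abstractions bring, and how to avoid or tune around them.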

Concepts

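How different engineers see threads:
App developers: don't block the main (UI) thread. While a user is scrolling through Douyin, loading recommended videos must not stop the screen from scrolling, so the loading is done on a separate thread.
Web developers: What's a thread? No idea, but JavaScript has to load things dynamically anyway.
Backend developers: One network request maps to one thread, more or less? Never mind how many cores a machine has: if concurrency won't scale and the boss has money, add machines; if there's no money, the application is badly written, and that's not my problem.
The operating system's view:
A thread is the smallest unit of program execution: a single sequential flow of control within a program. It is one execution flow of a process and the basic unit of CPU scheduling and dispatch. A process can consist of many threads. Threads share the process's resources (address space, code, global variables, heap, open files), while each thread has its own stack, and therefore its own local variables. Because the code segment is shared, different threads can execute the same function. Threads are scheduled independently, so on a multi-core machine several threads can run at the same time.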

Basic thread usage in C++:

Creating threads to run a function

#include <iostream>
#include <string>
#include <thread>

void printMessage(const std::string& msg) {
    std::cout << msg << std::endl;
}

int main() {
    std::thread t1(printMessage, "Hello from thread 1");
    std::thread t2(printMessage, "Hello from thread 2");

    t1.join();
    t2.join();

    return 0;
}

// Output:
Hello from thread 2
Hello from thread 1
or:
Hello from thread 1
Hello from thread 2

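The code creates two threads, t1 and t2, each of which runs printMessage with "Hello from thread 1" or "Hello from thread 2" as its argument. The order of the two lines is not fixed, because the scheduler decides which thread runs first.
join() blocks the calling thread (here the main thread) until the target thread has finished. Calling join() on both t1 and t2 makes the main thread wait for them before continuing, so the program does not exit while they are still running.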

Running without blocking the current thread

If the current thread does not need to wait for the new thread to finish, use detach():

#include <chrono>
#include <iostream>
#include <thread>

void schedulingTask() {
    std::cout << "thread task begin!" << std::endl;
    std::this_thread::sleep_for(std::chrono::seconds(2));
    std::cout << "thread task finished!" << std::endl;
}

int main() {
    std::thread t(schedulingTask);
    
    // Detach the thread so it keeps running in the background
    std::cout << "created thread." << std::endl;
    t.detach();

    std::cout << "main thread continue." << std::endl;

    // Wait a while so the background thread has time to finish
    std::this_thread::sleep_for(std::chrono::seconds(3));
    
    std::cout << "main thread finished." << std::endl;

    return 0;
}

// Output:
created thread.
thread task begin!
main thread continue.
thread task finished!
main thread finished.

or:
created thread.
main thread continue.
thread task begin!
thread task finished!
main thread finished.

The code above runs the time-consuming schedulingTask on a new thread and calls detach() on it.
Observations from the output:
1. After detach(), the main thread keeps running while schedulingTask runs in the background until its 2-second task completes.
2. The order of "main thread continue." and "thread task begin!" varies between runs: once the thread is started, the operating system schedules the main thread and the background task independently, so the interleaving is not deterministic.

Verifying the relationship between threads and processes

Take the code above and remove the delay that waits for the background thread: std::this_thread::sleep_for(std::chrono::seconds(3));

int main() {
    std::thread t(schedulingTask);
    
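    // Detach the thread so it keeps running in the background
    std::cout << "created thread." << std::endl;
    t.detach();

    std::cout << "main thread continue." << std::endl;
    std::cout << "main thread finished." << std::endl;
    return 0;
}

// Output:
created thread.
main thread continue.
main thread finished.
thread task begin!

"thread task finished!" is never printed. The background thread was still running, but returning from main ended the process and took the thread with it. This shows that a thread is an execution flow of a process and cannot outlive it.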

Crash when neither join nor detach is called

int main() {
    std::thread t(schedulingTask);
    std::cout << "created thread." << std::endl;
    return 0;
}

Running the code above crashes:

企业微信20240810-223922.png

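The crash happens in the std::thread destructor. If a std::thread object is still joinable when it is destroyed, that is, neither join() nor detach() has been called on it, the C++ standard requires the destructor to call std::terminate(), which aborts the program. This applies whether or not the thread function has already returned.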

Thread safety in C++

A classic mutex example:

#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

int counter = 0; // shared resource
std::mutex mtx;  // mutex protecting counter

void incrementCounter() {
    for (int i = 0; i < 1000; ++i) {
        std::lock_guard<std::mutex> lock(mtx); // lock
        ++counter; // access the shared resource safely
    }
}

int main() {
    std::vector<std::thread> threads;
    // Create 10 threads
    for (int i = 0; i < 10; ++i) {
        threads.emplace_back(incrementCounter);
    }
    // Wait for all threads to finish
    for (auto& t : threads) {
        t.join();
    }
    std::cout << "Final counter value: " << counter << std::endl;
    return 0;
}
// Output:
Final counter value: 10000

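counter is a variable shared by all threads.
The mutex std::mutex mtx protects counter and prevents several threads from modifying it at the same time.
Locking and unlocking: std::lock_guard<std::mutex> manages the lock's lifetime automatically. The lock is released when the lock_guard goes out of scope, here at the end of each loop iteration.
std::thread is used to create 10 threads, each of which calls incrementCounter; join() waits for all of them to finish.
Output:
After the program runs you should see Final counter value: 10000, because each of the 10 threads increments the counter 1000 times.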

Besides std::mutex, the C++ standard library provides several other locks and synchronization primitives for different concurrency needs. Some common ones follow:

std::recursive_mutex: recursive lock

#include <iostream>
#include <thread>
#include <mutex>

std::recursive_mutex rec_mtx;
int counter = 0;

void incrementCounter(int depth) {
    if (depth > 0) {
        std::lock_guard<std::recursive_mutex> lock(rec_mtx);
        ++counter;
        incrementCounter(depth - 1); // recursive call locks rec_mtx again on the same thread
    }
}
int main() {
    std::thread t1(incrementCounter, 5);
    std::thread t2(incrementCounter, 5);

    t1.join();
    t2.join();
    std::cout << "Final counter value: " << counter << std::endl;
    return 0;
}

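Allows the same thread to lock the mutex multiple times without deadlocking, as long as it unlocks it the same number of times. This is useful when recursive calls need to lock the same resource.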

std::timed_mutex: timed lock

#include <iostream>
#include <thread>
#include <mutex>
#include <chrono>

std::timed_mutex tmtx;

void tryLockForWork() {
    if (tmtx.try_lock_for(std::chrono::milliseconds(100))) {
        std::cout << "Work completed by thread " << std::this_thread::get_id() << std::endl;
        tmtx.unlock();
    } else {
        std::cout << "Thread " << std::this_thread::get_id() << " couldn't lock, doing something else." << std::endl;
    }
}

int main() {
    std::thread t1(tryLockForWork);
    std::thread t2(tryLockForWork);
    t1.join();
    t2.join();
    return 0;
}

Similar to std::mutex, but a thread can try to acquire the lock for a limited time and give up if it does not succeed.

std::unique_lock: flexible lock management

#include <iostream>
#include <thread>
#include <mutex>
std::mutex mtx;

void printID(int id) {
    std::unique_lock<std::mutex> lock(mtx);
    std::cout << "Thread ID: " << id << std::endl;
    lock.unlock(); // can unlock early
    // do some work that does not need the lock
    lock.lock(); // lock again
    std::cout << "End of thread ID: " << id << std::endl;
}

int main() {
    std::thread t1(printID, 1);
    std::thread t2(printID, 2);
    t1.join();
    t2.join();
    return 0;
}

A more flexible lock manager than std::lock_guard: it supports deferred locking, early unlocking, and transferring ownership of the lock.

std::call_once and std::once_flag

#include <iostream>
#include <thread>
#include <mutex>
std::once_flag flag;

void initialize() {
    std::call_once(flag, []() {
        std::cout << "Initialized only once!" << std::endl;
    });
}

int main() {
    std::thread t1(initialize);
    std::thread t2(initialize);
    std::thread t3(initialize);
    t1.join();
    t2.join();
    t3.join();
    return 0;
}

Ensures that a function is executed exactly once, even when several threads try to execute it.

std::shared_mutex (introduced in C++17)

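Purpose: supports multiple readers and a single writer. Several threads can read the resource at the same time, but a write is exclusive. This suits workloads with many reads and few writes: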

#include <iostream>
#include <mutex>
#include <shared_mutex>
#include <thread>

std::shared_mutex sh_mtx;
int shared_resource = 0;

void reader() {
    std::shared_lock<std::shared_mutex> lock(sh_mtx);
    std::cout << "Read value: " << shared_resource << " by thread " << std::this_thread::get_id() << std::endl;
}

void writer() {
    std::unique_lock<std::shared_mutex> lock(sh_mtx);
    ++shared_resource;
    std::cout << "Written value: " << shared_resource << " by thread " << std::this_thread::get_id() << std::endl;
}

int main() {
    std::thread t1(reader);
    std::thread t2(reader);
    std::thread t3(writer);
    t1.join();
    t2.join();
    t3.join();
    return 0;
}

The essence of std::thread

Take the simplest possible example:

void printMessage(const std::string& msg) {
    std::cout << msg << std::endl;
}

int main() {
    std::thread t1(printMessage, "Hello from thread 1");
    t1.join();
    std::cout << "Main thread continue." << std::endl;
    return 0;
}

Set a breakpoint on std::thread t1(printMessage, "Hello from thread 1");

企业微信20240810-192618.png

0x100001eb4 is the constructor. Step into it with si in lldb:

企业微信20240810-192757.png

0x100002bcc is still part of thread creation. Keep stepping in with si in lldb and you reach a function named std::__1::__libcpp_thread_create:

企业微信20240810-192842.png

Step in once more with si in lldb:

企业微信20240810-192916.png

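Finally pthread_create appears, which shows that on this platform (libc++, as the std::__1 namespace indicates) std::thread is built on top of the C-level POSIX pthread API. Other standard library implementations may wrap a different native API; Microsoft's STL, for example, sits on Windows threads rather than pthread.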

Conclusion

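Later articles will continue with the topic of threads, for example how they combine with network sockets and I/O, and the coroutines and stackless coroutines found in newer languages.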

A Peek into C++ Threads