How Do You Create Efficient C++ Parallel Builds?

Author: Dori Exterman
Published on: April 10, 2021
Estimated reading time: 1 minute

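At CppCon 2019, Chandler Carruth, a principal software engineer at Google, gave a talk titled "There Are No Zero-Cost Abstractions" that examined the relationship between abstractions and their costs. Abstractions cost run time, build time, and human time. He explained clearly why using an arena memory allocator to lower run-time cost ended up raising build cost. He also noted that compilation is, in essence, a high-throughput distributed system. But how do we set up such a system in the first place? How do we distribute the work effectively? How do we create efficient parallel builds? Read on to find out.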

Know Yourself, Know Your Enemy

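What unit do you measure your build time in: seconds, minutes, or hours? A build that finishes in seconds is every programmer's dream. Just as we treat WTK figures as the only valid code-review metric, the time programmers spend drinking coffee while waiting for a build to finish is also a valid measure of productivity. I would like that time to stay under 5 minutes. How do we get there? Broadly, there are two approaches: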

Micro-optimization
Macro-optimization

Micro-Optimizations to Speed Up Builds

When you run a build from the command line with MSBuild, you may see the following message:

Building the projects in this solution one at a time. To enable parallel build, please add the "-m" switch.

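MSBuild parallel builds are enabled with the -m switch, which also lets you cap the number of parallel build processes. If you omit the switch, you get the message above. If you pass -m without a number, MSBuild defaults to the number of processors on the computer.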
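
As a rough illustration of that default, a build tool could pick its worker count along the following lines. This is a minimal sketch and not MSBuild's actual code: `default_job_count` is a hypothetical helper, and `std::thread::hardware_concurrency()` stands in for "the number of processors on the computer".

```cpp
#include <thread>

// Hypothetical helper: choose how many parallel build processes to run.
// requested == 0 models "-m" given without a number.
unsigned default_job_count(unsigned requested) {
    if (requested > 0) {
        return requested;  // an explicit value wins, e.g. -m:4
    }
    // hardware_concurrency() is only a hint and may return 0 when unknown.
    unsigned cores = std::thread::hardware_concurrency();
    return cores > 0 ? cores : 1;
}
```

Calling `default_job_count(4)` returns 4, while `default_job_count(0)` falls back to the detected core count, or to 1 if it cannot be detected.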

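For C++ projects in Visual Studio, parallel compilation has to be enabled in the project settings (Project Properties > C/C++ > General > Multi-processor Compilation, which corresponds to the /MP compiler option).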

If you build with Make, remember to pass the -j flag. It allows multiple independent tasks to run in parallel, which shortens build time.
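
To make the idea of "independent tasks running in parallel" concrete, here is a minimal sketch of a bounded worker pool, similar in spirit to what `make -j<jobs>` does for targets that do not depend on each other. It is not how Make is implemented; `run_parallel` is a hypothetical helper, and the sketch assumes the tasks are truly independent and do not throw.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: run independent tasks with at most `jobs` workers.
// Returns the number of tasks that were executed.
// Assumption: tasks do not throw and do not depend on one another.
std::size_t run_parallel(const std::vector<std::function<void()>>& tasks,
                         unsigned jobs) {
    if (jobs == 0) {
        jobs = 1;  // always keep at least one worker
    }
    std::atomic<std::size_t> next{0};  // index of the next unclaimed task
    std::atomic<std::size_t> done{0};  // number of finished tasks

    auto worker = [&]() {
        for (;;) {
            std::size_t i = next.fetch_add(1);  // claim one task
            if (i >= tasks.size()) {
                return;                         // nothing left to do
            }
            tasks[i]();
            done.fetch_add(1);
        }
    };

    std::vector<std::thread> pool;
    for (unsigned j = 0; j < jobs; ++j) {
        pool.emplace_back(worker);
    }
    for (auto& t : pool) {
        t.join();
    }
    return done.load();
}
```

With `jobs` set to 1 the tasks run one after another, which mirrors a build without -j. Raising `jobs` only helps while there are enough independent tasks to keep every worker busy.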

For parallel builds with CMake, see tip 15 in the blog post "Modern CMake Tips" (DRY is a principle I am personally fond of, and not only in software engineering).

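Precompiled headers can greatly speed up subsequent builds. During compilation, every file is parsed and reduced to an abstract syntax tree, which is an intermediate form of the parsed file. A precompiled header is itself such an intermediate form, holding headers that rarely change. As the name suggests, the parsing and compilation steps are skipped for precompiled headers, which reduces project build time.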
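
A precompiled header is typically just an ordinary header that gathers the stable includes in one place. The sketch below shows that layout in a single listing; the file names `pch.h` and `word_count.cpp` and the function `count_words` are hypothetical, chosen only for illustration.

```cpp
// --- pch.h (hypothetical file name): stable, rarely edited headers only ---
#include <map>
#include <string>
#include <vector>

// --- word_count.cpp (hypothetical): would start with #include "pch.h" ---
// Counts how often each word occurs. With a precompiled header, the cost of
// parsing <map>, <string> and <vector> is paid once when the header is built
// instead of once per source file.
std::map<std::string, int> count_words(const std::vector<std::string>& words) {
    std::map<std::string, int> counts;
    for (const auto& w : words) {
        ++counts[w];  // operator[] value-initializes missing entries to 0
    }
    return counts;
}
```

How the header gets precompiled depends on the toolchain. With CMake 3.16 or later, for example, `target_precompile_headers` can be used. Headers that change often should stay out of it, since every edit forces the precompiled header to be rebuilt.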

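The downside is that precompiled headers can sometimes backfire in distributed builds. A precompiled header is a single aggregate unit rather than many units that can be built in parallel, so that work cannot be parallelized. If the precompiled header has to be recompiled, it becomes a bottleneck that holds back the distributed build.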

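So try to keep the dependencies of each compilation unit simple and clear. Within a compilation unit, dependencies arise through class or struct references, function calls, and API calls into the corresponding headers (the standard system libraries, the STL, third-party libraries, and so on). Even a common header such as <iostream> brings in further headers indirectly.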
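
One common way to keep a header's dependencies small is to include only forward declarations in the header and move the heavy includes into the source file. The sketch below uses the standard `<iosfwd>` header for that purpose; the file names `point.h` and `point.cpp`, the `Point` struct, and `to_text` are hypothetical and shown in one listing for brevity.

```cpp
// --- point.h (hypothetical): the header only needs forward declarations ---
#include <iosfwd>  // declares std::ostream without pulling in <iostream>
#include <string>

struct Point {
    double x;
    double y;
};

// Declarations only need the name std::ostream, not its definition.
std::ostream& operator<<(std::ostream& os, const Point& p);
std::string to_text(const Point& p);

// --- point.cpp (hypothetical): full stream headers are included here only ---
#include <ostream>
#include <sstream>

std::ostream& operator<<(std::ostream& os, const Point& p) {
    return os << '(' << p.x << ", " << p.y << ')';
}

// Formats a point through the stream operator above.
std::string to_text(const Point& p) {
    std::ostringstream out;
    out << p;
    return out.str();
}
```

Every file that includes `point.h` now parses only `<iosfwd>` and `<string>`, while the cost of the full stream headers is confined to `point.cpp`.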

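Here is a small trick: I recommend a tool called include-what-you-use, which helps reduce header dependencies (see also the article on how to make use of include-what-you-use). The tool was originally built for the Google source tree and is still in alpha, but I have used it and found it genuinely handy.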

If you have installed include-what-you-use in D:\Tools\IWYU, the following shows how to use it in a CMake build:

cmake -H. -Bbuild -DCMAKE_CXX_INCLUDE_WHAT_YOU_USE="D:\Tools\IWYU\include-what-you-use.exe;-Xiwyu;any;-Xiwyu;iwyu;-Xiwyu;--driver-mode=cl" -DCMAKE_C_INCLUDE_WHAT_YOU_USE="D:\Tools\IWYU\include-what-you-use.exe;-Xiwyu;any;-Xiwyu;iwyu;-Xiwyu;--driver-mode=cl" -G "Ninja"

(Not using CMake? See our CMake-related blog posts.)

It produces warnings such as the following:

[2/72] Building CXX object CMakeFiles\mysecretproject\secret_vector_core.cpp.obj

../secret_vector_core.cpp should add these lines:

#include <corecrt_math.h> // for fabs, atan, sqrt

#include <corecrt_search.h> // for qsort

#include <vcruntime_string.h> // for memset

#include <cmath> // for pow

../secret_vector_core.cpp should remove these lines:

- #include <math.h> // lines 3-3

- #include <iostream> // lines 7-7

The full include-list for ../secret_vector_core.cpp:

#include <corecrt_math.h> // for fabs, atan, sqrt

#include <corecrt_search.h> // for qsort

#include <stdio.h> // for sprintf, NULL

#include <stdlib.h> // for free, malloc

#include <string.h> // for strlen, strncat

#include <vcruntime_string.h> // for memset

#include <cmath> // for pow

---

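Here are some tips for using include-what-you-use on Windows:

1. CMake supports include-what-you-use natively, but remember that if the source tree contains both C and C++ files, you need to set both options: CMAKE_CXX_INCLUDE_WHAT_YOU_USE and CMAKE_C_INCLUDE_WHAT_YOU_USE.

2. If you build with the Visual Studio compiler, you need to pass the --driver-mode=cl argument.

3. The generator must be Ninja, because the default Visual Studio generator on Windows does not support include-what-you-use.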

Macro-Optimizations to Speed Up Builds

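Before I get into any macro-optimizations for speeding up C or C++ builds, a disclosure: I am the CTO of Incredibuild, and our business is giving customers faster builds. We started out only helping customers cut compilation time. Today we offer solutions that accelerate compilation, testing, code analysis, simulation, and more, substantially shortening continuous integration cycles.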

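The first macro-optimization for faster builds is scaling beyond multi-core parallelism. Multi-core parallel processing is confined to a single machine, but what if we could distribute the work across a large number of networked computers? Tools such as Incredibuild do exactly that to improve performance.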

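The second macro-optimization for faster builds is to optimize every part of the continuous integration pipeline, whether the build runs on Azure DevOps or Jenkins. Setting up Jenkins parallel builds is as simple as configuring one controller (master) node and two or more agent nodes. Adding distributed build capability to a Jenkins node (with Incredibuild) can turn the build node into a supercomputer with hundreds of cores, drawing on idle CPUs in the local network or scaling out seamlessly into the public cloud. There is plenty of material online on setting up Jenkins parallel builds with both declarative and scripted (imperative) pipelines. Most continuous integration builds, however, involve several steps, for example: checking out a branch, running configuration checks, running the CMake configure step, running the CMake build, and even a quick smoke test to make sure the build is sound. You have to optimize every stage of the pipeline to speed up the build. On this point, I strongly recommend my earlier blog post on the shift-left strategy.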

Conclusion

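Preparation and practice are key to any lasting success, and programming is no exception. Every project should have technical leaders who understand the value of optimizing ordinary tasks such as builds. These tasks look trivial, yet they greatly affect developer productivity. Creating an efficient parallel build is not easy. If the build has become a development bottleneck, consider a distributed processing technology such as Incredibuild.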

Dori Exterman

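Dori Exterman is a software development expert and product strategist with 20 years of experience in the software development industry. As Chief Technology Officer (CTO) of Incredibuild, he directs the company's product strategy and is responsible for product vision, implementation, and technology partnerships. Before joining Incredibuild, Dori held a variety of technical and product development roles at software companies, focusing on system architecture, performance engineering, cutting-edge technologies, DevOps, release management, and C++. He is an expert and speaker on advanced technologies in development tools.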
