Explicit template instantiation - when is it used?

Question 1

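After a few weeks' break, I am trying to expand and extend my knowledge of templates with the book "Templates - The Complete Guide" by David Vandevoorde and Nicolai M. Josuttis, and what I am trying to understand at the moment is the explicit instantiation of templates.

I don't actually have a problem with the mechanism as such, but I can't imagine a situation in which I would want or need to use this feature. If anyone can explain that to me, I will be more than grateful.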

Question 2

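Directly copied from https://docs.microsoft.com/zh-cn/cpp/cpp/explicit-instantiation:

You can use explicit instantiation to create an instantiation of a templated class or function without actually using it in your code. Because this is useful when you are creating library (.lib) files that use templates for distribution, uninstantiated template definitions are not put into object (.obj) files.

(For instance, libstdc++ contains the explicit instantiation of std::basic_string<char,char_traits<char>,allocator<char> > (which is std::string), so every time you use functions of std::string, the same function code does not need to be copied into your objects. The compiler only needs to refer (link) those to libstdc++.)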
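As a rough illustration of the same idea on a small scale (this is not code from libstdc++ or from the Microsoft documentation), the sketch below explicitly instantiates a function template so that its code is emitted even though the defining file never calls it. The names twice, client and demo, and the file names in the comments, are assumptions made up for this example.

```cpp
#include <cassert>

// --- what would live in a header, e.g. twice.hpp (hypothetical name) ---
// Declaration only: includers cannot instantiate the template themselves.
template <typename T>
T twice(T value);

// --- what would live in the library source, e.g. twice.cpp ---
template <typename T>
T twice(T value) { return value + value; }

// Explicit instantiation definitions: code for these two types is emitted
// here even though nothing in this file uses them.
template int twice<int>(int);
template double twice<double>(double);

// --- what a client of the library would write ---
int client() {
    return twice(21);  // in a real build, resolved by the linker
}

// Usage sketch: call from main().
void demo() {
    assert(client() == 42);
    assert(twice(1.5) == 3.0);
}
```

In a real multi-file build, a client calling twice with any type other than int or double would get an undefined reference at link time, because no definition is visible to it.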

Question 3

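Use it if you define a template class that you only want to work for a couple of explicit types.

Put the template declaration in the header file, just like a normal class.

Put the template definition in a source file, just like a normal class.

Then, at the end of the source file, explicitly instantiate only the versions you want to be available.

Silly example: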

// StringAdapter.h
#include <string>

template<typename T>
class StringAdapter
{
     public:
         // const T* so that a string literal can be passed in
         StringAdapter(const T* data);
         void doAdapterStuff();
     private:
         std::basic_string<T> m_data;
};
typedef StringAdapter<char>    StrAdapter;
typedef StringAdapter<wchar_t> WStrAdapter;

Source:

// StringAdapter.cpp
#include "StringAdapter.h"

template<typename T>
StringAdapter<T>::StringAdapter(const T* data)
    :m_data(data)
{}

template<typename T>
void StringAdapter<T>::doAdapterStuff()
{
    /* Manipulate a string */
}

// Explicitly instantiate only the classes you want to be defined.
// In this case I only want the template to work with characters but
// I want to support both char and wchar_t with the same code.
template class StringAdapter<char>;
template class StringAdapter<wchar_t>;

Main:

#include "StringAdapter.h"

// Note: Main can not see the definition of the template from here (just the declaration)
//       So it relies on the explicit instantiation to make sure it links.
int main()
{
  StrAdapter  x("hi There");
  x.doAdapterStuff();
}

Question 4

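Explicit instantiation allows reducing compile times and object sizes

These are the major gains it can provide. They come from the following two effects, described in detail in the sections below:

- removing definitions from headers, to prevent build tools from rebuilding includers
- object redefinition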

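Remove definitions from headers

Explicit instantiation allows you to leave definitions in the .cpp file.

When the definition is in the header and you modify it, an intelligent build system recompiles all includers, which could be dozens of files, possibly making incremental recompilation after a single change unbearably slow.

Putting definitions in .cpp files does have the downside that external libraries can't reuse the template with their own new classes, but "Remove definitions from included headers but also expose templates as an external API" below shows a workaround.

See concrete examples below.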

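Object redefinition: understanding the problem

If you completely define a template in a header file, every single compilation unit that includes that header ends up compiling its own implicit copy of the template for every different template argument it uses.

This means a lot of useless disk usage and compilation time.

Here is a concrete example, in which both main.cpp and notmain.cpp implicitly define MyTemplate<int> because it is used in those files.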

main.cpp

#include <iostream>

#include "mytemplate.hpp"
#include "notmain.hpp"

int main() {
    std::cout << notmain() + MyTemplate<int>().f(1) << std::endl;
}

notmain.cpp

#include "mytemplate.hpp"
#include "notmain.hpp"

int notmain() { return MyTemplate<int>().f(1); }

mytemplate.hpp

#ifndef MYTEMPLATE_HPP
#define MYTEMPLATE_HPP

template<class T>
struct MyTemplate {
    T f(T t) { return t + 1; }
};

#endif

notmain.hpp

#ifndef NOTMAIN_HPP
#define NOTMAIN_HPP

int notmain();

#endif

GitHub upstream.

Compile and view the symbols with nm:

g++ -c -Wall -Wextra -std=c++11 -pedantic-errors -o notmain.o notmain.cpp
g++ -c -Wall -Wextra -std=c++11 -pedantic-errors -o main.o main.cpp
g++    -Wall -Wextra -std=c++11 -pedantic-errors -o main.out notmain.o main.o
echo notmain.o
nm -C -S notmain.o | grep MyTemplate
echo main.o
nm -C -S main.o | grep MyTemplate

Output:

notmain.o
0000000000000000 0000000000000017 W MyTemplate<int>::f(int)
main.o
0000000000000000 0000000000000017 W MyTemplate<int>::f(int)

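From man nm, we see that W means weak symbol, which GCC chose because this is a template function. The weak symbol means that the implicitly generated code for MyTemplate<int> was compiled into both files.

The reason it does not blow up at link time with multiple definitions is that the linker accepts multiple weak definitions and just picks one of them to put into the final executable.

The numbers in the output mean:

- 0000000000000000: address within the section. It is zero because templates are automatically put into their own section
- 0000000000000017: size of the code generated for them (0x17, i.e. 23 bytes)

We can see this a bit more clearly with: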

objdump -S main.o | c++filt

which ends in:

Disassembly of section .text._ZN10MyTemplateIiE1fEi:

0000000000000000 <MyTemplate<int>::f(int)>:
   0:   f3 0f 1e fa             endbr64 
   4:   55                      push   %rbp
   5:   48 89 e5                mov    %rsp,%rbp
   8:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
   c:   89 75 f4                mov    %esi,-0xc(%rbp)
   f:   8b 45 f4                mov    -0xc(%rbp),%eax
  12:   83 c0 01                add    $0x1,%eax
  15:   5d                      pop    %rbp
  16:   c3                      retq

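and _ZN10MyTemplateIiE1fEi is the mangled name of MyTemplate<int>::f(int), which c++filt decided not to unmangle in the section name.

So we see that a separate section is generated for every single method instantiation, and that each of them of course takes up space in the object files.

Solutions to the object redefinition problem

This problem can be avoided by using explicit instantiation and one of the following:

keep the definition in the hpp and add extern template in the hpp for the types which are going to be explicitly instantiated.

As explained at "using extern template (C++11)", extern template prevents a completely defined template from being instantiated by compilation units, except for our explicit instantiation. This way, only our explicit instantiation will be defined in the final objects: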

mytemplate.hpp
```
#ifndef MYTEMPLATE_HPP
#define MYTEMPLATE_HPP

template<class T>
struct MyTemplate {
    T f(T t) { return t + 1; }
};

extern template class MyTemplate<int>;

#endif
```
mytemplate.cpp
```
#include "mytemplate.hpp"

// Explicit instantiation required just for int.
template class MyTemplate<int>;
```
main.cpp
```
#include <iostream>

#include "mytemplate.hpp"
#include "notmain.hpp"

int main() {
    std::cout << notmain() + MyTemplate<int>().f(1) << std::endl;
}
```
notmain.cpp
```
#include "mytemplate.hpp"
#include "notmain.hpp"

int notmain() { return MyTemplate<int>().f(1); }
```
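Downsides:
- if you are a header-only library, you force external projects to do their own explicit instantiation. If you are not a header-only library, this solution is likely the best.
- if the template type is defined in your own project and is not a built-in like int, it seems that you are forced to add the include for it in the header; a forward declaration is not enough (see "extern template & incomplete types"). This increases header dependencies a bit.

move the definition to the cpp file and leave only the declaration in the hpp, i.e. modify the original example to be: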

mytemplate.hpp
```
#ifndef MYTEMPLATE_HPP
#define MYTEMPLATE_HPP

template<class T>
struct MyTemplate {
    T f(T t);
};

#endif
```
mytemplate.cpp
```
#include "mytemplate.hpp"

template<class T>
T MyTemplate<T>::f(T t) { return t + 1; }

// Explicit instantiation.
template class MyTemplate<int>;
```
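Downside: external projects can't use your template with their own types. Also, you are forced to explicitly instantiate all types. But maybe this is an upside, since then programmers won't forget.

keep the definition in the hpp and add extern template in every includer: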

mytemplate.cpp

#include "mytemplate.hpp"

// Explicit instantiation.
template class MyTemplate<int>;

main.cpp

#include <iostream>

#include "mytemplate.hpp"
#include "notmain.hpp"

// extern template declaration
extern template class MyTemplate<int>;

int main() {
    std::cout << notmain() + MyTemplate<int>().f(1) << std::endl;
}

notmain.cpp

#include "mytemplate.hpp"
#include "notmain.hpp"

// extern template declaration
extern template class MyTemplate<int>;

int notmain() { return MyTemplate<int>().f(1); }

Downside: all includers have to add the extern to their CPP files, which programmers will likely forget to do.

With any of those solutions, nm now contains:

notmain.o
                 U MyTemplate<int>::f(int)
main.o
                 U MyTemplate<int>::f(int)
mytemplate.o
0000000000000000 W MyTemplate<int>::f(int)

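So we see that only mytemplate.o has a compilation of MyTemplate<int>, as desired, while notmain.o and main.o don't, because U means undefined.

Remove definitions from included headers but also expose templates as an external API in a header-only library

If your library is not header-only, the extern template method will work, since the projects using it will just link to your object file, which will contain the object of the explicit template instantiation.

However, for header-only libraries, if you want to both:

- speed up your project's compilation
- expose headers as an external library API for others to use

then you can try one of the following: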

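- Interface header approach:
  - mytemplate.hpp: template definition
  - mytemplate_interface.hpp: template declaration only, matching the definitions from mytemplate.hpp, with no definitions
  - mytemplate.cpp: include mytemplate.hpp and make the explicit instantiations
  - main.cpp and everywhere else in the code base: include mytemplate_interface.hpp, not mytemplate.hpp
- Implementation header approach:
  - mytemplate.hpp: template definition
  - mytemplate_implementation.hpp: include mytemplate.hpp and add extern to every class that will be instantiated
  - mytemplate.cpp: include mytemplate.hpp and make the explicit instantiations
  - main.cpp and everywhere else in the code base: include mytemplate_implementation.hpp, not mytemplate.hpp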

Or, perhaps even better for multiple headers: create an intf/impl folder inside your includes/ folder and always use mytemplate.hpp as the name.

The mytemplate_interface.hpp approach looks like this:

mytemplate.hpp

#ifndef MYTEMPLATE_HPP
#define MYTEMPLATE_HPP

#include "mytemplate_interface.hpp"

template<class T>
T MyTemplate<T>::f(T t) { return t + 1; }

#endif

mytemplate_interface.hpp

#ifndef MYTEMPLATE_INTERFACE_HPP
#define MYTEMPLATE_INTERFACE_HPP

template<class T>
struct MyTemplate {
    T f(T t);
};

#endif

mytemplate.cpp

#include "mytemplate.hpp"

// Explicit instantiation.
template class MyTemplate<int>;

main.cpp

#include <iostream>

#include "mytemplate_interface.hpp"

int main() {
    std::cout << MyTemplate<int>().f(1) << std::endl;
}

Compile and run:

g++ -c -Wall -Wextra -std=c++11 -pedantic-errors -o mytemplate.o mytemplate.cpp
g++ -c -Wall -Wextra -std=c++11 -pedantic-errors -o main.o main.cpp
g++    -Wall -Wextra -std=c++11 -pedantic-errors -o main.out main.o mytemplate.o

Output:

Tested on Ubuntu 18.04.
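The mytemplate_implementation.hpp variant is not spelled out above. A minimal single-file sketch of how it might look follows; the comments mark which file each part is assumed to live in, and use_my_template and demo are hypothetical names added only so the sketch can be exercised.

```cpp
#include <cassert>

// --- mytemplate.hpp: complete template definition ---
template<class T>
struct MyTemplate {
    T f(T t) { return t + 1; }
};

// --- mytemplate_implementation.hpp: would #include "mytemplate.hpp", then add ---
// Explicit instantiation declaration: tells includers that the out-of-line
// code for MyTemplate<int> is provided by some other object file.
extern template struct MyTemplate<int>;

// --- mytemplate.cpp: would #include "mytemplate.hpp", then add ---
// Explicit instantiation definition: the one place where the code is emitted.
template struct MyTemplate<int>;

// --- main.cpp and other includers: would #include "mytemplate_implementation.hpp" ---
int use_my_template() {
    return MyTemplate<int>().f(1);
}

// Usage sketch: call from main().
void demo() {
    assert(use_my_template() == 2);
}
```

External users who want the template with their own types would keep including mytemplate.hpp directly, which carries no extern declarations.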

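C++20 modules

https://zh.cppreference.com/w/cpp/language/modules

I think this feature will provide the best setup going forward as it becomes available, but I haven't checked it yet because it is not yet available on my GCC 9.2.1.

You will still have to do explicit instantiation to get the speedup and disk savings, but at least we will have a sane solution for "Remove definitions from included headers but also expose templates as an external API" which does not require copying things around 100 times.

The expected usage (without the explicit instantiation; I am not sure what the exact syntax will look like, see "How to use template explicit instantiation with C++20 modules?") would be something along these lines: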

helloworld.cpp

export module helloworld;  // module declaration
import <iostream>;         // import declaration
 
export template<class T>
void hello(T t) {             // export declaration
    std::cout << t << std::endl;
}

main.cpp

import helloworld;  // import declaration
 
int main() {
    hello(1);
    hello("world");
}

and then compile as mentioned at https://quuxplusone.github.io/blog/2019/11/07/modular-hello-world/

clang++ -std=c++2a -c helloworld.cpp -Xclang -emit-module-interface -o helloworld.pcm
clang++ -std=c++2a -c -o helloworld.o helloworld.cpp
clang++ -std=c++2a -fprebuilt-module-path=. -o main.out main.cpp helloworld.o

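So from this we see that clang can extract the template interface + implementation into the magic helloworld.pcm, which must contain some LLVM intermediate representation of the source (see "How are templates handled in C++ module system?"), and which still allows template specialization to happen.

How to quickly analyze your build to see if it would benefit a lot from template instantiation

So, you have a complex project and want to decide whether template instantiation will bring significant gains without actually doing the full refactor?

The analysis below might help you decide, or at least select the most promising objects to refactor first while you experiment. It borrows some ideas from "My C++ object file is too big":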

# List all weak symbols with size only, no address.
find . -name '*.o' | xargs -I{} nm -C --size-sort --radix d '{}' |
  grep ' W ' > nm.log

# Sort by symbol size.
sort -k1 -n nm.log -o nm.sort.log

# Get a repetition count.
uniq -c nm.sort.log > nm.uniq.log

# Find the most repeated/largest objects.
sort -k1,2 -n nm.uniq.log -o nm.uniq.sort.log

# Find the objects that would give you the most gain after refactor.
# This gain is calculated as "(n_occurrences - 1) * size" which is
# the size you would gain for keeping just a single instance.
# If you are going to refactor anything, you should start with the ones
# at the bottom of this list. 
awk '{gain = ($1 - 1) * $2; print gain, $0}' nm.uniq.sort.log |
  sort -k1 -n > nm.gains.log

# Total gain if you refactored everything.
awk 'BEGIN{sum=0}{sum += $1}END{print sum}' nm.gains.log

# Total size. The closer total gain above is to total size, the more
# you would gain from the refactor.
awk 'BEGIN{sum=0}{sum += $1}END{print sum}' nm.log
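To make the gain formula concrete, here is a small sketch that restates "(n_occurrences - 1) * size" outside of awk. WeakSymbol, gain, total_gain and demo are hypothetical names; the 23-byte figure is the 0x17 size that nm reported for MyTemplate<int>::f(int) earlier, and the other numbers are invented for illustration.

```cpp
#include <cassert>
#include <vector>

// One line of nm.uniq.sort.log: how many object files contain the weak
// symbol, and the symbol's size in bytes.
struct WeakSymbol {
    long occurrences;
    long size;
};

// Bytes saved if only a single copy of the symbol were kept.
long gain(const WeakSymbol& s) {
    return (s.occurrences - 1) * s.size;
}

// Sum of the gains over all symbols, like the "total gain" awk step.
long total_gain(const std::vector<WeakSymbol>& symbols) {
    long sum = 0;
    for (const WeakSymbol& s : symbols) sum += gain(s);
    return sum;
}

// Usage sketch: call from main().
void demo() {
    // MyTemplate<int>::f(int): 23 bytes, present in main.o and notmain.o.
    assert(gain(WeakSymbol{2, 23}) == 23);
    // A symbol that appears only once gives no gain.
    assert(gain(WeakSymbol{1, 50}) == 0);
}
```

A symbol that is both large and repeated in many objects dominates the total, which is why the script sorts by gain rather than by size or count alone.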

Dream: a template compiler cache

I think the ultimate solution would be if we could build with:

g++ --template-cache myfile.o file1.cpp
g++ --template-cache myfile.o file2.cpp

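and then myfile.o would automatically reuse previously compiled templates across files.

This would mean zero extra effort for programmers, besides passing that extra CLI option to the build system.

A secondary bonus of explicit template instantiation: helping IDEs list template instantiations

I've found that some IDEs, such as Eclipse, cannot resolve "a list of all template instantiations used".

So, for example, if you are inside templated code and you want to find the possible values of the template, you have to find the constructor usages one by one and deduce the possible types one by one.

But on Eclipse 2020-03 I can easily list explicitly instantiated templates by doing a "Find all usages" (Ctrl + Alt + G) search on the class name, which points me, for example, from: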

template <class T>
struct AnimalTemplate {
    T animal;
    AnimalTemplate(T animal) : animal(animal) {}
    std::string noise() {
        return animal.noise();
    }
};

to:

template class AnimalTemplate<Dog>;

Here is a demo: https://github.com/cirosantilli/ide-test-projects/blob/e1c7c6634f2d5cdeafd2bdc79bcfbb2057cb04c4/cpp/animal_template.hpp#L15
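The linked demo is not reproduced here. As a self-contained sketch of the same setup, the block below pairs the template with a hypothetical Dog type (its noise() body is invented for this example) and the explicit instantiation that the IDE search lands on.

```cpp
#include <cassert>
#include <string>

// Hypothetical animal type; any class with a noise() member would do.
struct Dog {
    std::string noise() { return "woof"; }
};

template <class T>
struct AnimalTemplate {
    T animal;
    AnimalTemplate(T animal) : animal(animal) {}
    std::string noise() {
        return animal.noise();
    }
};

// The explicit instantiation gives tools one line that names a concrete
// type the template is used with.
template struct AnimalTemplate<Dog>;

// Usage sketch: call from main().
void demo() {
    AnimalTemplate<Dog> pet{Dog()};
    assert(pet.noise() == "woof");
}
```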

Another guerrilla technique that you can use outside of the IDE is to run nm -C on the final executable and grep for the template name:

nm -C main.out | grep AnimalTemplate

which directly points to the fact that Dog was one of the instantiations:

0000000000004dac W AnimalTemplate<Dog>::noise[abi:cxx11]()
0000000000004d82 W AnimalTemplate<Dog>::AnimalTemplate(Dog)
0000000000004d82 W AnimalTemplate<Dog>::AnimalTemplate(Dog)

Question 5

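It depends on the compiler model - apparently there is the Borland model and the CFront model. And then it also depends on your intention - if you are writing a library, you might (as alluded to above) explicitly instantiate the specializations you want.

The GNU C++ page discusses the models here: https://gcc.gnu.org/onlinedocs/gcc-4.5.2/gcc/Template-Instantiation.html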