Metaprogramming

2022-01-17 04:56:22

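Terminology

meta: an English prefix of Greek origin, usually rendered in Chinese as "元".

In logic, "meta-X" can be understood as a higher level of description about X that nevertheless still lies within the scope of X.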

meta-data

meta-function

meta-bank

meta-verse

meta-programming

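Why it exists

Why it was bound to appear: code has to be flexible enough to keep up with fast-changing requirements while still delivering performance.

Definition

Metaprogramming, in the compile-time sense used in this article, is a technique that manipulates program entities to compute, at compile time, the constants, types, and code needed at runtime.

The difference:

Ordinary code operates on data.

Metaprogramming operates on code: code as data.

If the essence of programming is abstraction, then metaprogramming is abstraction at a higher level.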

Metaprogramming is writing code that writes code.

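Uses

Numeric computation and type computation.

Improve runtime performance
Improve type safety

Language support

Programming languages come from roughly two lineages:

Starting from assembly: C, C++, Java
Starting from a mathematical model: Lisp, Julia

Lisp was the first language to treat code as data.

Metaprogramming facilities are now standard equipment in modern programming languages.

C++: Boost MPL, Facebook fatal (Facebook Template Library), Blitz++.

Julia: built into the language.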

C++

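C++ is a federation of languages that gathers the strengths of many. Even the creator of C++ has said, "I am merely familiar with it."

Five programming paradigms

Procedural
Object-oriented
Generic
Template metaprogramming
Functional

Of these, template metaprogramming is the hardest; some of the code reads like hieroglyphics.

With the release of C++20 and the direction the language is heading, these paradigms are merging and the boundaries between them are blurring. Mixed-paradigm code is the trend.

Procedural and object-oriented programming are the most basic paradigms and the foundation of C++; they must be mastered in any case.

If you develop ordinary applications that face end users directly, it is worth also studying generic and functional programming.

If you develop libraries for other programmers, a deep understanding of generic programming and template metaprogramming is essential for refining the library's interface and runtime efficiency.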

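Template metaprogramming

Templates were originally introduced to support generics and are the foundation of generic programming.

Later it was discovered, by accident, that templates can also be used for metaprogramming, and the template mechanism was proven to be Turing complete.

Thus template metaprogramming (TMP) was born.

Because metaprogramming with templates was never planned, the syntax is notoriously ugly.

The syntax is ugly, but the mechanism is powerful.

Building on templates, C++ has kept digging this metaprogramming hole deeper across several standards, and has made it prettier along the way.

Q: If templates could not be used for metaprogramming, how would C++ provide it?

A: C++ would have provided it some other way, because high performance is a goal C++ cannot give up.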

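Core idea

The basic program structures: sequence, selection, and loop.

Sequence: statements execute one after another in the order they appear
Selection: statements execute or not depending on a condition
Loop: statements repeat while a condition holds

Template metaprogramming can express all three, so it is Turing complete: in theory it can implement any computable algorithm.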
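As a rough sketch of how the three structures map onto templates: selection can be written as a partial specialization, a loop as recursive instantiation with a terminating specialization, and sequence as one definition that uses the result of the previous one. The names If, Sum, total, and Chosen are illustrative assumptions, not taken from any library.

```cpp
#include <type_traits>

// Selection: the primary template handles the "true" case, the partial
// specialization handles the "false" case.
template <bool Cond, typename Then, typename Else>
struct If {
    using type = Then;
};

template <typename Then, typename Else>
struct If<false, Then, Else> {
    using type = Else;
};

// Loop: recursive instantiation sums 1..N; the full specialization
// for 0 is the termination condition.
template <unsigned N>
struct Sum {
    static constexpr unsigned value = N + Sum<N - 1>::value;
};

template <>
struct Sum<0> {
    static constexpr unsigned value = 0;
};

// Sequence: each step uses the result of the previous one.
constexpr unsigned total = Sum<10>::value;           // step 1
using Chosen = If<(total > 50), long, short>::type;  // step 2

static_assert(total == 55, "1 + 2 + ... + 10 is 55");
static_assert(std::is_same<Chosen, long>::value, "55 > 50 selects long");
```

Everything here is resolved by the compiler; no instruction for the sum or the branch ends up in the executable.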

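Infrastructure

Operands

Template metaprogramming uses the static parts of the C++ language, so it cannot operate on variables, only on types and constants.

Input

Naming convention: _Ty for types, _Val for constants. This is the internal style of the MSVC standard library; names of this form are reserved for the implementation, so user code usually writes T and Val instead.

This convention is optional.

Output

Naming convention: type for types, value for constants. The _t and _v aliases are also used as wrappers.

This convention is mandatory.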

template< class T >
using remove_reference_t = typename remove_reference<T>::type;

template< class T >
inline constexpr bool is_class_v = is_class<T>::value;

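There is one more kind of output: code, usually in the form of code expansion.

Basic structure

Metaprogramming is built on templates, or more precisely on template specialization and recursion.

Categories

Value metaprogramming

Before C++11 this was implemented with recursive template instantiation, which is fairly complicated.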

template<unsigned int n>
struct Factorial {
	enum { value = n * Factorial<n - 1>::value };
};
template<>
struct Factorial<0> {
	enum { value = 1 };
};
int main() {
	Factorial<4>::value;
	return 0;
}

C++11 introduced constexpr, which gives another implementation.

template<unsigned int n>
struct Factorial {
	static constexpr int value = n * Factorial<n - 1>::value ;
};
template<>
struct Factorial<0> {
	static constexpr int value = 1;
};
int main() {
	Factorial<4>::value;
	return 0;
}

C++14 relaxed the rules for constexpr functions, which greatly simplifies the implementation.

template <typename T>
constexpr T Factorial(T x) {
	if (x <= 1) {
		return 1;
	}
	T s = 1;
	for (T i = 2; i <= x; i++) {
		s *= i;
	}
	return s;
}

int main() {
    static_assert(Factorial(4) == 24, "error");
    return 0;
}

Recursive implementation

constexpr int Factorial(unsigned int n) {
	if (n <= 1) {
		return 1;
	} else {
		return n * Factorial(n - 1);
	}
}

int main() {
	static_assert(Factorial(4) == 24, "error");
	return 0;
}

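constexpr: states that the entity it modifies can be computed at compile time and can be used as a constant.

On a variable:

The variable is a compile-time constant.

On a function:

If the arguments can be computed at compile time, the function can produce a compile-time value.

Otherwise, it behaves like an ordinary function.

On a constructor:

In C++11 the constructor may only initialize members through the member initializer list and its body must be empty; C++14 relaxed this restriction.

Objects created by the constructor can be used as constants.

Properties of constexpr:

It gives the compiler the confidence to evaluate and optimize the constexpr expression at compile time.
Compile-time evaluation happens only when the function arguments are constant expressions; otherwise the computation is left to run time.
constexpr applies to the function, not to its return value.
A constexpr function is implicitly inline.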
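The three uses of constexpr described above can be sketched together as follows. The names square, Point, side, grid, corner, and square_at_runtime are hypothetical and exist only for this illustration.

```cpp
// constexpr function: usable at compile time when the argument is a
// constant expression, and as an ordinary function otherwise.
constexpr int square(int x) {
    return x * x;
}

struct Point {
    int x;
    int y;
    // constexpr constructor: members are set in the initializer list
    // and the body is empty, which satisfies even the C++11 rules.
    constexpr Point(int px, int py) : x(px), y(py) {}
};

// constexpr variable: a compile-time constant, usable as an array bound.
constexpr int side = square(4);
int grid[side] = {};

// An object built by a constexpr constructor can be used in static_assert.
constexpr Point corner(3, 4);
static_assert(corner.x + corner.y == 7, "evaluated at compile time");

// The same function called with a run-time argument is evaluated at run time.
int square_at_runtime(int n) {
    return square(n);
}
```

The array bound and the static_assert only compile because square(4) and corner are constant expressions; square_at_runtime shows the fallback to ordinary run-time behavior.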

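Q: What is the difference between const and constexpr?

A: In C, const has exactly one meaning, "read only", so there is no confusion. C++ added a "constant" meaning on top of that, carried by the same const keyword, which led to some odd problems. C++11 split the "constant" meaning out and gave it to the new constexpr keyword.

Since C++11, the recommendation is to use constexpr wherever the "constant" meaning is intended and to use const only for the "read only" meaning.

constexpr makes value metaprogramming easier, but its range of application is limited. Its original purpose was to carry the "constant" meaning.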
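A small sketch of the "read only" versus "constant" distinction, under the assumption of a hypothetical read_sensor function that stands in for any value known only at run time:

```cpp
// Hypothetical run-time source of a value; not constexpr on purpose.
int read_sensor() {
    return 42;
}

// const: read-only after initialization, but the value is only
// known once the program runs.
const int reading = read_sensor();

// constexpr: the initializer must be a constant expression.
constexpr int buffer_size = 8;

int buffer[buffer_size] = {};        // OK: compile-time constant
// int bad[reading];                 // rejected: reading is not a constant expression
// constexpr int c = read_sensor();  // rejected: read_sensor() is not constexpr
```

Both variables are immutable, but only buffer_size can appear where the language requires a compile-time constant.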

Type metaprogramming

template <class _Ty>
struct remove_reference {
    using type                 = _Ty;
};

template <class _Ty>
struct remove_reference<_Ty&> {
    using type                 = _Ty;
};

template <class _Ty>
struct remove_reference<_Ty&&> {
    using type                 = _Ty;
};

template <class _Ty>
using remove_reference_t = typename remove_reference<_Ty>::type;

// The following declarations are equivalent: each one declares an int
int a;
remove_reference_t<int>   b;
remove_reference_t<int&>  c;
remove_reference_t<int&&> d;
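One way to check the equivalence claimed above is to compare the resulting types with std::is_same. This sketch uses the standard library's std::remove_reference_t from <type_traits> rather than the hand-written version, so that it compiles on its own.

```cpp
#include <type_traits>

// All three spellings collapse to plain int.
static_assert(std::is_same<std::remove_reference_t<int>,   int>::value,
              "int stays int");
static_assert(std::is_same<std::remove_reference_t<int&>,  int>::value,
              "int& becomes int");
static_assert(std::is_same<std::remove_reference_t<int&&>, int>::value,
              "int&& becomes int");

// Only the reference is stripped; cv-qualifiers are kept.
static_assert(std::is_same<std::remove_reference_t<const int&>, const int>::value,
              "const survives");
```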

Mixed metaprogramming

Compute the dot product of two arrays.

#include <iostream>
#include <array>
using namespace std;

template<typename T, std::size_t N>
struct DotProductT {
	static inline T result(const T* a, const T* b) {
		return (*a) * (*b) + DotProductT<T, N - 1>::result(a + 1, b + 1);
	}
};

template<typename T>
struct DotProductT<T, 0> {
	static inline T result(const  T*, const  T*) {
		return T{};
	}
};
template<typename T, std::size_t N>
auto dotProduct(std::array<T, N> const& x, std::array<T, N> const& y) {
	return DotProductT<T, N>::result(x.data(), y.data());
}

int main() {
	array<int, 3> A{1, 2, 3};

	auto x = dotProduct(A, A);
	cout << x << endl;
	return 0;
}

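Compile time: the recursive instantiation generates the code structure, the equivalent of an unrolled for loop.

Run time: the generated code executes and computes the result.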
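For comparison, the same unrolling can be sketched with a C++17 fold expression over std::index_sequence instead of recursive class templates. This is an alternative formulation, not the approach used in the listing above, and the names dotProductImpl and dotProductFold are assumptions made for this example.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// The pack I... is 0, 1, ..., N-1, so the fold expression expands to
// T{} + x[0]*y[0] + x[1]*y[1] + ... with no loop in the generated code.
template <typename T, std::size_t N, std::size_t... I>
constexpr T dotProductImpl(const std::array<T, N>& x, const std::array<T, N>& y,
                           std::index_sequence<I...>) {
    return (T{} + ... + (x[I] * y[I]));
}

template <typename T, std::size_t N>
constexpr T dotProductFold(const std::array<T, N>& x, const std::array<T, N>& y) {
    return dotProductImpl(x, y, std::make_index_sequence<N>{});
}

constexpr std::array<int, 3> A{1, 2, 3};
static_assert(dotProductFold(A, A) == 14, "1*1 + 2*2 + 3*3");
```

Because the function is constexpr, the whole product can also be computed at compile time when the inputs are constants, as the static_assert shows.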

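General conventions

For consistency, a returned value is named "value" and a returned type is named "type".

Practice has shown that in modern C++ the main use of metaprogramming is not compile-time numeric computation but type computation.

Conventions for type computation

Type computation falls into two categories:

Computing a new type
Checking whether a type satisfies some condition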

template< class T >
using remove_reference_t = typename remove_reference<T>::type;

template< class T >
inline constexpr bool is_class_v = is_class<T>::value;

To unify further, everything that returns "value" is changed to return "type", wrapped in a class template:

Before:

template <typename T> struct is_reference      { static constexpr bool value = false; };    
template <typename T> struct is_reference<T&>  { static constexpr bool value = true; };     
template <typename T> struct is_reference<T&&> { static constexpr bool value = true; };

After:

template <bool b>
struct bool_ { static constexpr bool value = b; };

template <typename T> struct is_reference      { using type = bool_<false>; };
template <typename T> struct is_reference<T&>  { using type = bool_<true>; };
template <typename T> struct is_reference<T&&> { using type = bool_<true>; };

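Callers of is_reference also use the name "type"; to access the boolean inside the result, write is_reference<T>::type::value.

This guarantees that callers of any type computation see "type" as the single return value.

The goal is to standardize metaprogramming code and make it more readable and interoperable.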
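A possible benefit of returning a type rather than a bare boolean is that the result can drive overload selection (tag dispatch). The sketch below repeats the bool_ and is_reference definitions inside a demo namespace so that it is self-contained; describe and is_ref_code are hypothetical helpers added for illustration.

```cpp
namespace demo {

template <bool b>
struct bool_ { static constexpr bool value = b; };

template <typename T> struct is_reference      { using type = bool_<false>; };
template <typename T> struct is_reference<T&>  { using type = bool_<true>; };
template <typename T> struct is_reference<T&&> { using type = bool_<true>; };

// Tag dispatch: the overload is chosen by the returned type.
constexpr int describe(bool_<true>)  { return 1; }  // reference
constexpr int describe(bool_<false>) { return 0; }  // not a reference

template <typename T>
constexpr int is_ref_code() {
    return describe(typename is_reference<T>::type{});
}

}  // namespace demo

static_assert(demo::is_reference<int&>::type::value, "int& is a reference");
static_assert(!demo::is_reference<int>::type::value, "int is not a reference");
static_assert(demo::is_ref_code<int&&>() == 1, "selects the bool_<true> overload");
```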

Assertions and contracts

Compile-time assertions

C++11 introduced the static_assert keyword.

static_assert(1 + 1 == 2, "error");

Compile-time contracts (constraints)

C++20: concept, requires

#include <iostream>
#include <type_traits>
using namespace std;

template<typename T>
concept Integral = is_integral_v<T>;

template<Integral T>
T Add(T a, T b) {
	return a + b;
}

template<typename T>
	requires Integral<T>
T Add2(T a, T b) {
	return a + b;
}

template<typename T>
T Add3(T a, T b) requires Integral<T> {
	return a + b;
}

Integral auto Add4(Integral auto a, Integral auto b) {
	return a + b;
}

int main() {
	Add(1, 2);
	// Add(1.1, 2.2); // error: 'Add': the associated constraints are not satisfied
	return 0;
}

Different parameters can also carry different constraints.

template<typename T>
concept Floating = std::is_floating_point_v<T>;

auto Add5(Integral auto a, Floating auto b) {
	return a + b;
}

template<typename T1, typename T2>
	requires Integral<T1> && Floating<T2>
double Add6(T1 a, T2 b) {
	return a + b;
}

Concepts replace most uses of C++11's enable_if.

Concepts make the code considerably clearer and make compiler error messages more direct.
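To see what concepts replace, here is a sketch of how the integral-only Add could have been written before C++20 with std::enable_if_t. The name AddLegacy is an assumption for this example, and constexpr is added only so the result can be checked at compile time.

```cpp
#include <type_traits>

// The defaulted second template parameter is well-formed only when T is
// an integral type; otherwise substitution fails and the overload is
// discarded from the candidate set.
template <typename T,
          typename = std::enable_if_t<std::is_integral<T>::value>>
constexpr T AddLegacy(T a, T b) {
    return a + b;
}

static_assert(AddLegacy(1, 2) == 3, "int is accepted");
// AddLegacy(1.1, 2.2);  // error: no matching function for call
```

The constraint is hidden in an extra template parameter and the diagnostic is a generic "no matching function", which is the readability problem the concept-based versions avoid.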

The four major features of C++20: concepts, ranges, coroutines, modules.

The concept syntax greatly lowers the difficulty of generic programming and metaprogramming.

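Syntax

Type parameters, template parameters, typedef/using, enum/static/constexpr, nested class members.

SFINAE (Substitution Failure Is Not An Error): when substituting template arguments fails, the candidate is dropped from overload resolution instead of causing a compile error.

C++11: enable_if, conditional

C++20: concept, requires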
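A common application of SFINAE is detecting whether a type supports some expression. The sketch below uses std::void_t (C++17) to test for a size() member; has_size is a hypothetical trait name invented for this example.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Primary template: used when the specialization below is not viable.
template <typename T, typename = void>
struct has_size : std::false_type {};

// Specialization: viable only if calling size() on a const T is valid.
// If the expression inside decltype is ill-formed, substitution fails
// silently and the primary template is used instead.
template <typename T>
struct has_size<T, std::void_t<decltype(std::declval<const T&>().size())>>
    : std::true_type {};

static_assert(has_size<std::vector<int>>::value, "vector has size()");
static_assert(has_size<std::string>::value, "string has size()");
static_assert(!has_size<int>::value, "int has no size()");
```

In C++20 the same check can be expressed with a requires expression, which is one reason concepts are listed as the successor above.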

A look at the foundation class of <type_traits>: integral_constant wraps a static constant of a given type.

template <class _Ty, _Ty _Val>
struct integral_constant {
    static constexpr _Ty value = _Val;

    using value_type = _Ty;
    using type       = integral_constant;

    constexpr operator value_type() const noexcept {
        return value;
    }
    
    // since c++14
    _NODISCARD constexpr value_type operator()() const noexcept {
        return value;
    }
};

template <bool _Val>
using bool_constant = integral_constant<bool, _Val>;

using true_type  = bool_constant<true>;
using false_type = bool_constant<false>;
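In practice, traits are usually written by deriving from true_type or false_type, which supplies value, type, and the conversion operators for free. A minimal sketch using the standard library's versions; is_raw_pointer and three are hypothetical names for this example.

```cpp
#include <type_traits>

// A custom trait: inheriting from false_type / true_type provides the
// value and type members described above.
template <typename T> struct is_raw_pointer     : std::false_type {};
template <typename T> struct is_raw_pointer<T*> : std::true_type {};

// integral_constant can also carry non-boolean values.
using three = std::integral_constant<int, 3>;

static_assert(is_raw_pointer<int*>::value, "int* is a pointer");
static_assert(!is_raw_pointer<int>::value, "int is not a pointer");
static_assert(three::value == 3, "static member");
static_assert(three{} == 3, "implicit conversion through operator value_type");
static_assert(three{}() == 3, "call operator, available since C++14");
static_assert(std::is_same<is_raw_pointer<int*>::type, std::true_type>::value,
              "::type names the underlying integral_constant");
```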

Julia

Numeric computation

JuMP ("Julia for Mathematical Programming")

using JuMP
using GLPK
model = Model(GLPK.Optimizer)
@variable(model, x >= 0)
@variable(model, 0 <= y <= 3)
@objective(model, Max, 12x + 20y)
@constraint(model, c1, 6x + 8y <= 100)
@constraint(model, c2, 7x + 12y <= 120)
print(model)
optimize!(model)

@show termination_status(model)
@show primal_status(model)
@show dual_status(model)
@show objective_value(model)
@show value(x)
@show value(y)
@show shadow_price(c1)
@show shadow_price(c2)

Output:

julia> 
Max 12 x + 20 y
Subject to
 c1 : 6 x + 8 y <= 100.0
 c2 : 7 x + 12 y <= 120.0
 x >= 0.0
 y >= 0.0
 y <= 3.0
termination_status(model) = MathOptInterface.OPTIMAL
primal_status(model) = MathOptInterface.FEASIBLE_POINT
dual_status(model) = MathOptInterface.FEASIBLE_POINT
objective_value(model) = 204.99999999999997
value(x) = 15.000000000000005
value(y) = 1.249999999999996
shadow_price(c1) = 0.24999999999999922
shadow_price(c2) = 1.5000000000000007

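Multiple dispatch

See this video: https://www.youtube.com/watch?v=SeqAQHKLNj4

Multiple dispatch can be used to implement metaprogramming, and it is Turing complete.

It can be seen as an enhanced version of C++ templates, and the Julia syntax is more elegant to write.

dispatch: choosing among different implementations of a same-named function according to the argument types

static dispatch: the choice is made on compile-time types

dynamic dispatch: the choice is made on run-time types

single dispatch: the choice is made on the type of the function's first argument

multiple dispatch: the choice is made on the types of all the function's arguments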

C++: multiple static dispatch + single dynamic dispatch

Julia: multiple dynamic dispatch
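The C++ half of that comparison can be illustrated with a short sketch: overload resolution looks at the static types of all arguments, while a virtual call looks at the dynamic type of the object alone. Shape, Circle, collide, and demo are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <string>

struct Shape {
    virtual ~Shape() = default;
    // Single dynamic dispatch: resolved at run time on the type of *this only.
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// Multiple static dispatch: overload resolution considers the
// compile-time types of both arguments.
inline std::string collide(const Shape&, const Shape&)   { return "shape/shape"; }
inline std::string collide(const Circle&, const Circle&) { return "circle/circle"; }

inline void demo() {
    Circle c;
    const Shape& s = c;  // static type Shape, dynamic type Circle
    assert(s.name() == "circle");              // dynamic type decides
    assert(collide(s, s) == "shape/shape");    // static types decide
    assert(collide(c, c) == "circle/circle");  // static types decide
}
```

In a language with multiple dynamic dispatch, as the text describes for Julia, the call corresponding to collide(s, s) would be selected from the run-time types of both arguments.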

References

https://zhuanlan.zhihu.com/p/138875601

https://zhuanlan.zhihu.com/p/378356824

https://max.book118.com/html/2017/0713/122000037.shtm

https://zhuanlan.zhihu.com/p/266086040

https://www.youtube.com/watch?v=SeqAQHKLNj4

https://zhuanlan.zhihu.com/p/105953560
