ARTS Week 23

Algorithm

本周的 LeetCode 题目为 50. Pow(x, n)

实现 pow(x, n) ,即计算 x 的 n 次幂函数(即,xn)。

输入:x = 2.00000, n = 10
输出:1024.00000

输入:x = 2.00000, n = -2
输出:0.25000
解释:2^-2 = (1/2)^2 = 1/4 = 0.25

为了使用二分法,需要首先保证 n 永远为正值,故需要对x的n次方进行转换。首先对 n 进行判断,若 n < 0,xn等于 pow(1/x, -n),当 n = 0或1 时,pow(x, n) 永远为 x,只有当 n>1 时,x的n次方就是计算 pow(x,n)。在真正计算xn函数calPow(double x, long n)中,若 n==0 则返回1,先通过递归得到xn/2的结果tmp,若n为偶数,那么最后结果为 tmp*tmp;若n为奇数,那么最后结果为 tmp*tmp*x,代码如下:

class Solution {
    public double myPow(double x, int n) {
        if (x == 0 || x == 1) {
            return x;
        }
        if (n > 0) {
            return calPow(x, n);
        } else if (n < 0) {
            return calPow(1/x, -n);
        } else {
            return 1;
        }
    }

    // 使用 long 类型:因为 绝对值(Integer.MIN_VALUE) > 绝对值(Integer.MAX_VALUE)
    public double calPow(double x, long n) {
        if (n == 0) {
            return 1;
        }
        double tmp = calPow(x, n/2);
        if (n % 2 == 0) {
            return tmp * tmp;
        } else {
            return tmp * tmp * x;
        }
    }
}

Review

本周 Review 的英文文章为:C语言运行的开销

作者介绍了当C语言库(libc)成为程序运行的瓶颈时该如何应对的方法。

作者通过对一个空程序使用 strace -tt 进行追踪,发现在其机器上执行之间长达 9.4ms,这完全是链接器和C语言运行时的开销

// strace output for `int main() { return 0; }`

00:32:27.539873 execve("./overhead", ["./overhead"], [/* 15 vars */]) = 0
00:32:27.541060 brk(0)                  = 0xa7b000
00:32:27.541466 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
00:32:27.541834 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
00:32:27.542199 fstat(3, {st_mode=S_IFREG|0644, st_size=36586, ...}) = 0
00:32:27.542438 mmap(NULL, 36586, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f4f640be000
00:32:27.542865 close(3)                = 0
00:32:27.543109 open("/usr/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
00:32:27.543455 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`\1\2\0\0\0\0\0"..., 832) = 832
00:32:27.543782 fstat(3, {st_mode=S_IFREG|0755, st_size=1984416, ...}) = 0
00:32:27.544162 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4f640bd000
00:32:27.544466 mmap(NULL, 3813200, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f4f63b03000
00:32:27.544796 mprotect(0x7f4f63c9c000, 2097152, PROT_NONE) = 0
00:32:27.545104 mmap(0x7f4f63e9c000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x199000) = 0x7f4f63e9c000
00:32:27.545503 mmap(0x7f4f63ea2000, 16208, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f4f63ea2000
00:32:27.545775 close(3)                = 0
00:32:27.546087 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4f640bc000
00:32:27.546386 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4f640bb000
00:32:27.546664 arch_prctl(ARCH_SET_FS, 0x7f4f640bc700) = 0
00:32:27.547373 mprotect(0x7f4f63e9c000, 16384, PROT_READ) = 0
00:32:27.548209 mprotect(0x7f4f640c7000, 4096, PROT_READ) = 0
00:32:27.548533 munmap(0x7f4f640be000, 36586) = 0
00:32:27.548949 exit_group(0)           = ?
00:32:27.549272 +++ exited with 0 +++

接着作者尝试使用内联汇编和 -ffreestanding -nostdlib 来进行测试检查:

// gcc -m32 -ffreestanding -nostdlib
void _start() {
    /* exit system call */
    asm("movl $1,%eax;"
        "xorl %ebx,%ebx;"
        "int  $0x80"
    );
}

// strace -tt output
// 00:41:38.884122 execve("./blank", ["./blank"], [/* 15 vars */]) = 0
// 00:41:38.884552 exit(0)                 = ?
// 00:41:38.884629 +++ exited with 0 +++

可见这个没有使用任何系统调用的程序只运行了 0.5ms。接着作者尝试了静态编译程序,运行时间大约降到了8ms,追踪得到的输出如下:

00:35:19.095150 execve("./main-static", ["./main-static"], [/* 15 vars */]) = 0
00:35:19.095973 uname({sys="Linux", node="vagrant-arch", ...}) = 0
00:35:19.096373 brk(0)                  = 0x16cf000
00:35:19.096724 brk(0x16d01c0)          = 0x16d01c0
00:35:19.097065 arch_prctl(ARCH_SET_FS, 0x16cf880) = 0
00:35:19.097387 readlink("/proc/self/exe", "/vagrant/level0/main-static", 4096) = 27
00:35:19.097779 brk(0x16f11c0)          = 0x16f11c0
00:35:19.098140 brk(0x16f2000)          = 0x16f2000
00:35:19.098539 fstat(0, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
00:35:19.098837 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8d5a7cb000
00:35:19.099192 read(0, "Collaboratively administrate emp"..., 4096) = 1398
00:35:19.099544 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
00:35:19.099893 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8d5a7ca000

最后,作者编写了一个最小libc库来使用 -ffreestanding -nostdlib 标志进行编译,运行时间降低到了1ms,接近glibc的运行时间的1/10,其只有五个系统调用:

00:53:41.882305 execve("./main-43.1", ["./main-43.1"], [/* 15 vars */]) = 0
00:53:41.882920 read(0, "Collaboratively administrate emp"..., 4096) = 1398
00:53:41.882989 read(0, "", 4096)       = 0
00:53:41.883006 write(1, "Collaboratively administrate emp"..., 1449) = 1449
00:53:41.883023 _exit(0)                = ?
00:53:41.883356 +++ exited with 0 +++

总结:如果你觉得程序or进程启动时间比较慢并且确实需要从库层面进行优化,那么你值得花时间尝试另一种libc的实现(如 musl libcDiet libc

Tip

在 C 语言回调函数的示例。所谓回调函数就是将一个函数当作是另一个函数的参数,最经典的应用的便是在排序算法中自定义实现比较大小的函数。下面是一个简单的示例,operate 函数分别调用 addmulti 函数来分别实现加法、乘法的效果。

#include <stdio.h>

int operate(int a, int b, int (* func) (int a, int b));
int add(int a, int b);
int multi(int a, int b);

int main()
{
    int a = 3, b = 4;
    printf("%d + %d = %d\n", a, b, operate(a, b, add));
    printf("%d * %d = %d\n", a, b, operate(a, b, multi));
    return 0;
}

int operate(int a, int b, int (* func) (int a, int b))
{
    return func(a, b);
}

int add(int a, int b)
{
    return a+b;
}

int multi(int a, int b)
{
    return a*b;
}

程序运行结果如下:

$ gcc test.c
$ ./a.out
3 + 4 = 7
3 * 4 = 12

Share

目前已经想好了一个非 ARTS 的技术类博文的选题,大致分为三部曲,目前正在努力写作中,加油!

上一篇:FR8016A-串口下载协议分析


下一篇:L1-043 阅览室 (20 分)(C语言)含大坑的注意点