Lua的函数调用和协程中,栈的变化情况
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
|
#include <stdio.h> #include <lua.h> #include <lauxlib.h> int luv_dumpstack(lua_State *L) {
int i, top;
printf ( "-------------------\n" );
printf ( "dumpstack: L=%p\n" , L);
top = lua_gettop(L);
printf ( "top: %d\n" , top);
for (i = 1; i <= top; ++i) {
printf ( "[%d][%s]: %s\n" ,
i,
luaL_typename(L, i),
luaL_tolstring(L, i, NULL));
lua_pop(L, 1);
}
printf ( "-------------------\n" );
return top;
} static int cont(lua_State *L) {
printf ( "stack after yield\n" );
luv_dumpstack(L);
lua_pushinteger(L, 21);
lua_pushinteger(L, 22);
lua_pushinteger(L, 23);
return lua_gettop(L);
} static int test_yield(lua_State *L) {
printf ( "stack before yield\n" );
luv_dumpstack(L);
lua_pushinteger(L, 1);
lua_pushinteger(L, 2);
lua_pushinteger(L, 3);
lua_pushinteger(L, 4);
lua_pushinteger(L, 5);
return lua_yieldk(L, 2, 0, cont);
} static int test_resume(lua_State *L) {
lua_State *L1 = lua_newthread(L);
lua_pushinteger(L1, 11);
lua_pushinteger(L1, 12);
lua_pushcfunction(L1, test_yield);
lua_pushinteger(L1, 13);
lua_pushinteger(L1, 14);
lua_pushinteger(L1, 15);
printf ( "stack before resume\n" );
luv_dumpstack(L1);
printf ( "resume: %d\n" , lua_resume(L1, L, 3));
printf ( "stack after resume\n" );
luv_dumpstack(L1);
lua_pushinteger(L1, 24);
lua_pushinteger(L1, 25);
printf ( "stack before second resume\n" );
luv_dumpstack(L1);
/* XXX notice even we pass 2, all values in stack (4,5,24,25)
* will passed to coroutine */
printf ( "resume: %d\n" , lua_resume(L1, L, 2));
printf ( "stack after second resume\n" );
luv_dumpstack(L1);
return 0;
} int main( void ) {
lua_State *L = luaL_newstate();
lua_pushcfunction(L, test_resume);
lua_call(L, 0, 0);
lua_close(L);
return 0;
} /* cc: libs+='-llua52' */ |
在Lua里面声明小数组的最好方法是什么?
在很早以前,就看到“闭包比表要快”的这么一个言论。一直没有验证过,只是心里就这么觉得了。所以自己第一次写的Lua网游,大量利用了闭包,最后估计还是有很严重的内存问题……在我的对象模型里面,对象构造函数通常是这样的:
1
2
3
4
5
6
7
8
9
10
11
12
|
function Object()
local t = {}
local object_state1
local object_state2
-- init ...
-- methods
function t:foo(...) return ... end
function t:bar(...) return ... end
-- return object
return t
end |
这个设计可以保证,在访问对象的函数的时候,速度可以达到最快——因为没有元表查询。但是,这样做恰好就违背了这样的原则:“在设计的初期,不要过早地考虑优化”。是的,因为任何优化都是有代价的,这里的代价理所当然就是内存了。
另外,在Lua-5.1中,我后来自己测试的结果是,表貌似还是比闭包要快一点,关键点是,这样貌似占用内存还小很多,自从这么摆了一道以后,反正至少对于“看见大括号就有点恐惧内存分配”的心理上是好过多了。不过至少对于闭包的大小和速度什么的心里头反而就没底了。
今天突然想到了一个叫做lua-vararg的lua库。这个库可以对vararg进行包装,提供对用户来说“比较自然”的vararg的体验——说白了,这就是用vararg去实现了元组(tuple)嘛。这个库很早以前就知道了,我还自己亲自改过,但是性能到底怎么样呢?心里没底,所以决定今天写个伪测试来看看效果。
首先,我们看一下Lua源代码里面表和闭包的描述:
560
561
562
563
564
565
566
567
568
569
570
|
typedef struct Table {
CommonHeader;
lu_byte flags; /* 1<<p means tagmethod(p) is not present */
lu_byte lsizenode; /* log2 of size of `node' array */
struct Table *metatable;
TValue *array; /* array part */
Node *node;
Node *lastfree; /* any free position is before this position */
GCObject *gclist;
int sizearray; /* size of `array' array */
} Table; |
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
|
#define ClosureHeader \ CommonHeader; lu_byte nupvalues; GCObject *gclist
typedef struct CClosure {
ClosureHeader;
lua_CFunction f;
TValue upvalue[1]; /* list of upvalues */
} CClosure; typedef struct LClosure {
ClosureHeader;
struct Proto *p;
UpVal *upvals[1]; /* list of upvalues */
} LClosure; typedef union Closure {
CClosure c;
LClosure l;
} Closure; |
OK,代码有了,那么第一个问题是:表和闭包分别占多大的大小呢?
额= =这个问题不好搞啊,看起来好复杂的样子,还得考虑对齐……额,不好搞的话,直接写个程序不就可以了吗?
1
2
3
4
5
6
7
8
9
|
#include "src/lobject.h" #include <stdio.h> int main( void ) {
printf ( "size of Table: %d\n" , sizeof (Table));
printf ( "size of Closure: %d\n" , sizeof (Closure));
return 0;
} |
恩,输出结果是32和24……(别打偶……)
好吧,闭包居然比表要小!恩,这是很正常的。而且,这里的表可是“裸表”哦,一点数据都没有的,而这个被声明的闭包是默认带上了一个upvalue的!
那么,对于有十个元素的表和闭包,大小又是怎么样的呢?注意到表保存数组元素用的是TValue指针,而闭包里面的upvalue(主要指C闭包)也同样是TValue,我们只需要分别给表和闭包的大小增加10个TValue的大小即可——额,闭包只需要加9,因为前面已经有个元素了,最后得到的大小是:112字节和96字节……恩~
好了,那么在大小方面,的确闭包(特指C闭包)是保存小数组的最好途径了,那么,除开大小,在速度上面,它们之间有什么变化吗?我们继续测试一下!首先声明一下,这里的测试对于真实的环境是没有意义的,通常来说,我们并不需要考虑像分配小数组之间的效率问题,这里的测试,只是在于检验“闭包比表快”这个说法罢了,这样的检验本身,可以当作是使用vararg模块的依据,对做出某些设计决策是有好处的。
我们准备测试四种情况下的速度情况
- 用C API创建一个表,向其中压入10个数字
- 用C API创建一个闭包,其中有10个upvalue
- 用Lua代码创建一个表,其中有10个数字
- 用Lua代码创建一个闭包,其中有10个upvalue
下面是测试结果:
1
2
3
4
5
6
7
8
9
10
|
test c table time: 2.35s c table memory: 5724 c closure time: 0.20s c closure memory: 6020 lua table time: 0.40s lua table memory: 5836 lua closure time: 1.51s lua closure memory: 4556 Hit any key to close this window... |
我们来分析一下这个数据,最简单的分析方法是排序,对于时间的顺序是c closure < lua table < lua closure < c table,对于内存剩余的顺序是lua closure < c table == lua table < c closure。
好奇怪的结果!!我们试着来分析一下吧!看起来内存里面有一项是相同的:即用c和用lua创建表,占用内存的大小是近似相同的。而用C创建闭包,占用内存最大,但是也大不了多少。这里面是什么情况呢?我们关闭一下垃圾回收试试?对了!原来是这样……关掉垃圾回收以后,内存的占用非常非常夸张,原来,这是垃圾回收以后的结果(顺便说一下,关闭垃圾回收,所有的测试都变慢了,不过相对结果不变)。
那么,内存的意义上就不大了。只能说是垃圾回收的策略不同罢了。我们来分析一下时间问题好了。
首先,这个数据实在是太奇怪了了:如果说C的API比Lua代码快,那么为什么c table是最慢的呢?如果说闭包比表快,那么为什么lua闭包不如lua的表呢?这是怎么回事呢?
很容易发现问题的是lua闭包,这是倒数第二慢的……因为Lua5.2出了一个新功能:在创建lua闭包的时候,会和之前缓存的进行比较,看看是新的闭包还是以前的,因为我们创建的闭包都是新的函数调用,显然和以前闭包的upvalue不可能一样,因此这个比较始终是会失败的……看来也只能是这个地方会占用时间了。当然了,关联upvalue、关闭upvalue等等都需要占据时间。lua闭包比较慢几乎是可以预见到的了。这一点我们不奇怪。
奇怪的是,为什么用C API写的新建表比lua代码还要慢呢?注意到,Lua代码是不需要再创建新的“整数”的,它们是直接从常量表(K表)载入的,所以pushnumber会拖慢一点速度,其次,lua_rawseti会比luaH_setint多做一些检查,除了这两点以外,还有一个很容易忽视的地方是API调用开销:因为Lua是在DLL里面的,因此API调用相对较慢,如果静态链接的话,对其他测试来说,结果影响不大,但是c table测试瞬间就快了20%——也许是因为每次推入新的包,c table测试比别的方法多了10个C API的原因吧……Lua里面没有一次给表设置多个值的方法,必须一个一个地设置,每次设置都必须压入新值,所以即使是相对较快的lua_rawseti,也架不住每次设置的两次API啊……当然,从这一点上也可以看出来,我们的测试是很不合理的,因为即使是API效率,都可以极大地影响测试结果。当然这里需要说明的是,即使是将API效率的原因去除,c table仍然是最慢的。
然后,c cloure和lua table就比较符合我们的期望了。看来的确是闭包会快一点啊……
这里只比较了创建的效率,对于取值的效率,大家可以自己去比较。但是从这里可以看出来,如果希望交换大量小数据块,那么closure的确是一个合理的选择:它体积较小,没有多余的功能(比如表支持的哈希查找或者增加大小或者独立的元表等等)。作为需要快速保存少量数据,lua-vararg的确是一个很好的选择——另外,尽量选择C实现的版本,如果无法使用C库,那就还是用表来实现吧……这样比较快……
最后,附上我自己写的vararg的C实现:https://gist.github.com/starwing/5893607
#define LUA_LIB
#include <lua.h>
#include <lauxlib.h> static lua_Integer posrelat(lua_Integer pos, size_t len) {
if (pos >= ) return pos;
else if (0u - (size_t)pos > len) return ;
else return (lua_Integer)len + pos + ;
} static int tuple(lua_State *L) {
int top, n = (int)lua_tointeger(L, lua_upvalueindex());
lua_Integer i, j;
switch (lua_type(L, )) {
case LUA_TNIL: /* as iterator */
i = lua_tointeger(L, ) + ;
if (i <= || i > n) return ;
lua_pushinteger(L, i);
lua_pushvalue(L, lua_upvalueindex(i + ));
return ;
case LUA_TSTRING: /* as length operator */
if (*lua_tostring(L, ) == '#') {
lua_pushinteger(L, n);
return ;
}
break;
case LUA_TNONE: /* get all varargs */
luaL_checkstack(L, n, "too many values");
for (i = ; i <= n; ++i)
lua_pushvalue(L, lua_upvalueindex(i+));
return n;
case LUA_TNUMBER: /* get/set a range */
i = posrelat(luaL_checkinteger(L, ), n);
j = posrelat(luaL_optinteger(L, , i), n);
if (i > j) return ;
n = (int)(j-i+);
luaL_checkstack(L, n, "too many values");
if ((top = lua_gettop(L)) <= ) { /* get */
for (; i <= j; ++i)
lua_pushvalue(L, lua_upvalueindex(i+));
}
else {
int idx;
lua_settop(L, top = n + );
for (idx = ; idx <= top; ++idx) {
lua_pushvalue(L, idx);
lua_replace(L, lua_upvalueindex(i+idx-));
}
}
return n;
}
return luaL_argerror(L, , "invalid argument");
} static int Lpack(lua_State *L) {
int n = lua_gettop(L);
if (n >= ) luaL_error(L, "too many values to pack");
lua_pushinteger(L, n);
lua_insert(L, );
lua_pushcclosure(L, tuple, n+);
return ;
} static int Lrange(lua_State *L) {
int n = lua_gettop(L) - ;
lua_Integer i, j;
if (n < ) return ;
i = posrelat(luaL_checkinteger(L, ), n);
j = posrelat(luaL_checkinteger(L, ), n);
if (i > j || j == ) return ;
if (j > n) luaL_checkstack(L, j-n, "range is too big");
lua_settop(L, j + );
return j-i+;
} static int Linsert(lua_State *L) {
int n = lua_gettop(L) - ;
lua_Integer i = posrelat(luaL_checkinteger(L, ), n);
if (i > n) {
luaL_checkstack(L, i-n, "index is too big");
lua_settop(L, i + );
lua_pushvalue(L, );
return i;
}
lua_pushvalue(L, );
lua_insert(L, i + );
return n + ;
} static int Lremove(lua_State *L) {
int n = lua_gettop(L) - ;
lua_Integer i = posrelat(luaL_checkinteger(L, ), n);
if (i <= n) {
lua_remove(L, i + );
--n;
}
return n;
} static int Lreplace(lua_State *L) {
int n = lua_gettop(L) - ;
lua_Integer i = posrelat(luaL_checkinteger(L, ), n);
if (i > n) {
luaL_checkstack(L, i-n, "index is too big");
lua_settop(L, i + );
lua_pushvalue(L, );
return i;
}
lua_pushvalue(L, );
lua_replace(L, i + );
return n;
} static int Lpush(lua_State *L) {
lua_pushvalue(L, );
return lua_gettop(L) - ;
} static int Lpop(lua_State *L) {
lua_pop(L, );
return lua_gettop(L);
} static int Ltake(lua_State *L) {
int n = lua_gettop(L) - ;
lua_Integer i = posrelat(luaL_checkinteger(L, ), n);
if (i > n) return ;
lua_pop(L, n-i);
return i;
} static int Ltail(lua_State *L) {
int n = lua_gettop(L) - ;
lua_Integer i = posrelat(luaL_checkinteger(L, ), n);
if (i > n) return ;
return n-i+;
} static int Lshift(lua_State *L) {
return lua_gettop(L) - ;
} static int Lmap(lua_State *L) {
int i, n = lua_gettop(L);
luaL_checkany(L, );
for (i = ; i <= n; ++i) {
lua_pushvalue(L, );
lua_pushvalue(L, i);
lua_call(L, , );
lua_replace(L, i);
}
return n-;
} static int Lfilter(lua_State *L) {
int i, n = lua_gettop(L);
luaL_checkany(L, );
for (i = ; i <= n; ++i) {
lua_pushvalue(L, );
lua_pushvalue(L, i);
lua_call(L, , );
if (!lua_toboolean(L, -)) {
lua_remove(L, i);
--i, --n;
}
lua_pop(L, );
}
return n-;
} static int Lreduce(lua_State *L) {
int i, n = lua_gettop(L);
luaL_checkany(L, );
if (n <= ) {
lua_call(L, n-, );
return ;
}
lua_pushvalue(L, );
lua_pushvalue(L, );
lua_pushvalue(L, );
lua_call(L, , );
for (i = ; i <= n; ++i) {
lua_pushvalue(L, );
lua_insert(L, -);
lua_pushvalue(L, i);
lua_call(L, , );
}
return ;
} static int Lunpack(lua_State *L) {
int i, n = lua_gettop(L);
for (i = ; i <= n; ++i) {
lua_pushvalue(L, i);
lua_call(L, , LUA_MULTRET);
}
return lua_gettop(L)-n;
} static int Lrotate(lua_State *L) {
int n = lua_gettop(L) - ;
lua_Integer i, c = luaL_checkinteger(L, ) % n;
#if LUA_VERSION_NUM >= 503
(void)i; /* unused */
lua_rotate(L, , (int)c);
#else
c = -c + ((c > ) ? n+ : );
if (c > ) luaL_checkstack(L, c-, "too many values");
for (i = ; i <= c; ++i)
lua_pushvalue(L, i);
#endif
return n;
} static int Lreverse(lua_State *L) {
int i, j, n = lua_gettop(L);
for (i = , j = n; i < j; ++i, --j) {
lua_pushvalue(L, i);
lua_pushvalue(L, j);
lua_replace(L, i);
lua_replace(L, j);
}
return n;
} static int Lrep(lua_State *L) {
int top = lua_gettop(L), n = top - ;
lua_Integer i, j, count = luaL_checkinteger(L, );
if (count <= ) return ;
if (n == ) {
luaL_checkstack(L, count, "too many values");
lua_settop(L, count+);
return count;
}
luaL_checkstack(L, n*(count-), "too many values");
for (i = ; i < count; ++i)
for (j = ; j <= top; ++j)
lua_pushvalue(L, j);
return n*count;
} LUALIB_API int luaopen_vararg(lua_State *L) {
luaL_Reg libs[] = {
{ "concat", Lunpack },
{ "unpack", Lunpack },
{ "filter", Lfilter },
{ "insert", Linsert },
{ "map", Lmap },
{ "pack", Lpack },
{ "pop", Lpop },
{ "push", Lpush },
{ "range", Lrange },
{ "reduce", Lreduce },
{ "remove", Lremove },
{ "rep", Lrep },
{ "replace", Lreplace },
{ "reverse", Lreverse },
{ "rotate", Lrotate },
{ "shift", Lshift },
{ "tail", Ltail },
{ "take", Ltake },
{ NULL, NULL }
};
#if LUA_VERSION_NUM >= 502
luaL_newlib(L, libs);
#else
luaL_register(L, "vararg", libs);
#endif
return ;
}
/* cc: flags+='-s -O3 -mdll -DLUA_BUILD_AS_DLL' libs+='-llua53'
* cc: output='vararg.dll' */
local _G = require "_G"
local assert = _G.assert
local pcall = _G.pcall
local print = _G.print
local select = _G.select
local type = _G.type local math = require "math"
local ceil = math.ceil
local huge = math.huge
local min = math.min local table = require "table"
local unpack = table.unpack or _G.unpack local vararg = require "vararg"
local pack = vararg.pack
local range = vararg.range
local insert = vararg.insert
local remove = vararg.remove
local replace = vararg.replace
local push = vararg.push
local concat = vararg.concat
local map = vararg.map
local rotate = vararg.rotate -- auxiliary functions---------------------------------------------------------- local values = {}
local maxstack
for i = , huge do
if not pcall(unpack, values, , ^i) then
local min, max = ^(i-), ^i
while min < max do
local mid = ceil((min+max)/)
if pcall(unpack, values, , mid) then
min = mid
else
max = mid-
end
end
maxstack = max
break
end
end
for i = , maxstack, do
values[i] = i
end local function tpack(...)
return {..., n=select("#", ...)}
end local function assertsame(v, i, j, ...)
local count = select("#", ...)
assert(count == j-i+, count..","..i..","..j)
for pos = , count do
assert(v[i+pos-] == select(pos, ...))
end
end local function asserterror(expected, f, ...)
local ok, actual = pcall(f, ...)
assert(ok == false, "error was expected")
assert(actual:find(expected, ), "wrong error, got "..actual)
end -- test 'pack' function -------------------------------------------------------- local function testpack(...)
local v = {...}
local n = select("#", ...)
local p = pack(...)
assertsame(v, , n, p())
assert(n == p("#"))
for i,pv in p do assert(v[i] == pv) end
for i = , n do
assert(v[i] == p(i))
if n > then
assert(v[i] == p(i-n-))
end
end
for i = , n, do
local j = i+
assertsame(v, i, j, p(i, j))
if n > j then
assertsame(v, i, j, p(i-n-, j-n-))
end
end
end testpack()
testpack({},{},{})
testpack(nil)
testpack(nil, nil)
testpack(nil, , nil)
testpack(unpack(values, , )) local ok, err = pcall(pack, unpack(values, , ))
if ok then -- Lua version
assert(type(err) == "function")
else -- C version
assert(ok == false and err == "too many values to pack")
end -- test 'range' function ------------------------------------------------------- local function testrange(n, ...)
local v = {...}
for c = , do
for i = , n, c do
local j = min(i+c-, n)
assertsame(v, i, j, range(i, j, ...))
local n = select("#", ...)
if n > then
assertsame(v, i, j, range(i-n-, j-n-, ...))
end
end
end
end local ok, err = pcall(range, , , ...)
if ok then -- Lua version
assert(err == nil)
else -- C version
assert(ok == false and err == "bad argument #1 to '?' (index out of bounds (0))")
end testrange()
testrange(, ,,,,,,,,,)
maxstack = -- use a smaller value
testrange(maxstack, unpack(values, , maxstack)) -- test other functions -------------------------------------------------------- assertsame({,,,,}, , , insert(, , ,,,))
assertsame({,,,,}, , , insert(,-, ,,,))
assertsame({,,nil,}, , , insert(, , ,))
assertsame({nil,nil,}, , , insert(, )) assertsame({,,,,}, , , replace(, , ,,,,))
assertsame({,,,,}, , , replace(,-, ,,,,))
assertsame({,,nil,}, , , replace(, , ,))
assertsame({nil,nil,}, , , replace(, )) assertsame({,,,,}, , , remove( , ,,,,,))
assertsame({,,,,}, , , remove(-, ,,,,,))
assertsame({,,nil,}, , , remove( , ,,nil,,))
assertsame({nil,nil,}, , , remove( , nil,nil,,))
assertsame({,,,,}, , , remove(, ,,,,)) assertsame({,,,,}, , , push(, ,,,))
assertsame({,,nil,}, , , push(, ,,nil))
assertsame({nil,nil,}, , , push(, nil,nil)) assertsame({,,,,}, , , rotate(, ,,,,))
assertsame({,,,,}, , , rotate(-, ,,,,))
assertsame({,,,,}, , , rotate(, ,,,,))
assertsame({,,,,}, , , rotate(-, ,,,,)) assertsame({,,,,,,,,}, , , concat(pack(,,),
pack(,,),
pack(,,))) -- test function errors and expectional conditions --------------------------- assertsame({"","","","",""}, , , map(tostring, ,,,,))
assertsame({"","","nil","" }, , , map(tostring, ,,nil,))
assertsame({"nil","nil","" }, , , map(tostring, nil,nil,))
assertsame({"","nil","nil" }, , , map(tostring, ,nil,nil)) asserterror("bad argument #2 to '[^']*' %(number expected, got no value%)", insert)
asserterror("bad argument #2 to '[^']*' %(number expected, got no value%)", insert, nil)
asserterror("bad argument #2 to '[^']*' %(number expected, got nil%)", insert, nil, nil)
--asserterror("bad argument #2 to '[^']*' %(index out of bounds %(0%)%)", insert, nil, 0)
assert(select('#', insert(nil, )) == ) asserterror("bad argument #2 to '[^']*' %(number expected, got no value%)", replace)
asserterror("bad argument #2 to '[^']*' %(number expected, got no value%)", replace, nil)
asserterror("bad argument #2 to '[^']*' %(number expected, got nil%)", replace, nil, nil)
--asserterror("bad argument #2 to '[^']*' %(index out of bounds %(0%)%)", replace, nil, 0)
assert(select('#', replace(nil, )) == ) asserterror("bad argument #1 to '[^']*' %(number expected, got no value%)", remove)
asserterror("bad argument #1 to '[^']*' %(number expected, got nil%)", remove, nil)
--asserterror("bad argument #1 to '[^']*' %(index out of bounds %(0%)%)", remove, 0)
assert(select('#', remove()) == ) assertsame({}, , , push())
assertsame({nil}, , , push(nil)) assertsame({}, , , concat())
asserterror("attempt to call a nil value", concat, nil) asserterror("bad argument #1 to '[^']*' %(value expected%)", map)
assertsame({}, , , map(nil))
asserterror("attempt to call a nil value", map, nil, nil)