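First, one concept needs explaining:
MAC (message authentication code)
A message authentication code (a keyed hash function) is a verification mechanism used in cryptography by two communicating parties, and a tool for guaranteeing the integrity of message data.
The HMAC construction was proposed by M. Bellare and co-authors. Its security rests on the underlying hash function, which is why it is also called a keyed hash function. A MAC is a value derived from a key and a message digest,
and can be used for data-origin authentication and integrity checking.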
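Signed cookies are built on this principle.
Data sent over the wire is usually base64-encoded. For background on base64, see
blog.xiayf.cn/2016/01/24/base64-encoding/
In short: base64 is not encryption. It is a data encoding that turns binary data into text so that the data meets the requirements of the transport protocol.
Python's standard library provides modules for this, such as base64 and binascii. Basic usage of base64 looks like this: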
>>> import base64
>>> s = b'asdasdas'
>>> s_64 = base64.b64encode(s)
>>> s_64
b'YXNkYXNkYXM='
>>> base64.b64decode(s_64)
b'asdasdas'
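To make the point that base64 is an encoding and not encryption more concrete, here is a small sketch. The sample bytes are arbitrary and chosen only for illustration. It shows that the encoded form is plain ASCII text and that anyone can reverse it without a key.

```python
import base64

raw = bytes([0, 255, 16, 128])        # arbitrary binary data, not valid text
encoded = base64.b64encode(raw)       # bytes made up of ASCII characters only

# Decoding needs no key at all, so base64 gives no secrecy.
decoded = base64.b64decode(encoded)

print(encoded.decode('ascii'))        # printable text, safe to put in a header
print(decoded == raw)                 # True
```

This is why a cookie that is only base64-encoded can be read and modified by the client, and why a signature is needed on top of it.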
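For MACs, Python has the hmac module. Basic usage is shown below. digestmod is passed explicitly because Python 3.8 and later require it (older versions defaulted to MD5), and the object is bound to h so that it does not shadow the hmac module:
>>> import hmac
>>> hmac.new(b'slat', digestmod='md5')
<hmac.HMAC object at 0x0000000004110B70>
>>> h = hmac.new(b'slat', digestmod='md5')
>>> h.update(b'asdas')
>>> h.digest()
b'\xe8\\\xb6\x11\x9dj\rY\x06I\x1f[\x06\xeb\xeb\xf3'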
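The property that makes a MAC useful for signing can be sketched as follows: the same key and message always give the same tag, while changing either one gives a different tag. The helper name make_tag and the choice of SHA-256 are assumptions made for this example, not something taken from Bottle.

```python
import hashlib
import hmac

key = b'slat'            # the key used in the session above
message = b'asdas'

def make_tag(key, message):
    """Return the HMAC-SHA256 tag of message under key, as hex digits."""
    return hmac.new(key, message, digestmod=hashlib.sha256).hexdigest()

tag = make_tag(key, message)

print(tag == make_tag(key, message))           # True: deterministic
print(tag == make_tag(key, b'asdaS'))          # False: message changed
print(tag == make_tag(b'other-key', message))  # False: key changed
```

Someone who does not know the key cannot produce a matching tag for a modified message, which is what a signed cookie relies on.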
Documentation
class HMAC(builtins.object)
| RFC 2104 HMAC class. Also complies with RFC 4231.
|
| This supports the API for Cryptographic Hash Functions (PEP 247).
|
| Methods defined here:
|
| __init__(self, key, msg=None, digestmod=None)
| Create a new HMAC object.
|
| key: key for the keyed hash object.
| msg: Initial input for the hash, if provided.
| digestmod: A module supporting PEP 247. *OR*
| A hashlib constructor returning a new hash object. *OR*
| A hash name suitable for hashlib.new().
| Defaults to hashlib.md5.
| Implicit default to hashlib.md5 is deprecated and will be
| removed in Python 3.6.
|
| Note: key and msg must be a bytes or bytearray objects.
|
| copy(self)
| Return a separate copy of this hashing object.
|
| An update to this copy won't affect the original object.
|
| digest(self)
| Return the hash value of this hashing object.
|
| This returns a string containing 8-bit data. The object is
| not altered in any way by this function; you can continue
| updating the object after calling this function.
|
| hexdigest(self)
| Like digest(), but returns a string of hexadecimal digits instead.
|
| update(self, msg)
| Update this hashing object with the string msg.
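With the basics understood, let's look at the Bottle source code for setting a signed cookie and for reading its value.
Setting a signed cookie
first_bottle.py
response.set_cookie('account', username, secret='salt')
Bottle implements set_cookie as follows: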
def set_cookie(self, name, value, secret=None, **options):
    if not self._cookies:
        self._cookies = SimpleCookie()
    if secret:
        value = touni(cookie_encode((name, value), secret))
    elif not isinstance(value, basestring):
        raise TypeError('Secret key missing for non-string Cookie.')
    if len(value) > 4096: raise ValueError('Cookie value to long.')
    self._cookies[name] = value
    for key, value in options.items():
        if key == 'max_age':
            if isinstance(value, timedelta):
                value = value.seconds + value.days * 24 * 3600
        if key == 'expires':
            if isinstance(value, (datedate, datetime)):
                value = value.timetuple()
            elif isinstance(value, (int, float)):
                value = time.gmtime(value)
            value = time.strftime("%a, %d %b %Y %H:%M:%S GMT", value)
        self._cookies[name][key.replace('_', '-')] = value
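self._cookies defaults to None. SimpleCookie inherits from BaseCookie, and BaseCookie inherits from dict, so for now self._cookies can be treated as a dictionary.
The "if secret" check tests whether a secret key was supplied. If so, the value is signed with cookie_encode and converted with touni, which rely on the following helpers: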
def tob(s, enc='utf8'):
    return s.encode(enc) if isinstance(s, unicode) else bytes(s)
def touni(s, enc='utf8', err='strict'):
    return s.decode(enc, err) if isinstance(s, bytes) else unicode(s)
tonat = touni if py3k else tob
touni returns a str. Under Python 3, Bottle sets unicode = str, so unicode(s) is equivalent to str(s).
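For readers on Python 3 only, the two helpers can be approximated without the unicode alias. This is a sketch for experimenting in a shell, not Bottle's actual code.

```python
def tob(s, enc='utf8'):
    """Return s as bytes (Python 3 stand-in for the helper above)."""
    return s.encode(enc) if isinstance(s, str) else bytes(s)

def touni(s, enc='utf8', err='strict'):
    """Return s as str (Python 3 stand-in for the helper above)."""
    return s.decode(enc, err) if isinstance(s, bytes) else str(s)

print(tob('salt'))      # b'salt'
print(touni(b'salt'))   # salt
print(touni(42))        # 42
```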
def cookie_encode(data, key):
    ''' Encode and sign a pickle-able object. Return a (byte) string '''
    msg = base64.b64encode(pickle.dumps(data, -1))
    sig = base64.b64encode(hmac.new(tob(key), msg).digest())
    return tob('!') + sig + tob('?') + msg
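pickle.dumps serializes the data and returns a bytes object, which is then base64-encoded to form msg. sig is the HMAC of msg computed with the key (a keyed hash, not encryption), also base64-encoded.
Finally the pieces are concatenated as '!' + sig + '?' + msg, and the signed cookie is ready. Note that msg holds the tuple (name, value), that is, both the cookie's key and its value.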
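The encoding step can be tried in isolation with the sketch below. It is a Python 3 approximation of cookie_encode rather than Bottle's code: the digest is named explicitly (md5 is an assumption, matching the old default described in the documentation above), and the user name 'alice' is a made-up value.

```python
import base64
import hmac
import pickle

def cookie_encode(data, key):
    """Serialize data and return b'!' + signature + b'?' + message."""
    msg = base64.b64encode(pickle.dumps(data, -1))
    # digestmod is spelled out because Python 3.8+ requires it (md5 assumed).
    mac = hmac.new(key.encode('utf8'), msg, digestmod='md5')
    sig = base64.b64encode(mac.digest())
    return b'!' + sig + b'?' + msg

cookie = cookie_encode(('account', 'alice'), 'salt')

# The base64 alphabet contains neither '!' nor '?', so the split is unambiguous.
sig, msg = cookie[1:].split(b'?', 1)
print(cookie.startswith(b'!'))              # True
print(pickle.loads(base64.b64decode(msg)))  # ('account', 'alice')
```

The second print shows that the message part is readable by anyone. The signature only protects against modification, not against reading.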
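Once setting a signed cookie is understood, reading its value is fairly simple.
Roughly: take the cookie value and split it at ? into sig and message. Compute a new sig as the HMAC of message under the secret and compare it with the sig taken from the cookie. If they match, the cookie has not been tampered with, so message is base64-decoded
and passed to pickle.loads, which returns the original tuple. Element [1] of that tuple is the value.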
def cookie_decode(data, key):
    ''' Verify and decode an encoded string. Return an object or None.'''
    data = tob(data)
    if cookie_is_encoded(data):
        sig, msg = data.split(tob('?'), 1)
        if _lscmp(sig[1:], base64.b64encode(hmac.new(tob(key), msg).digest())):
            return pickle.loads(base64.b64decode(msg))
Because an exclamation mark ! was prepended to the string when the cookie was set, the signature is sliced with sig[1:].
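The whole round trip, including what happens to a tampered cookie, can be sketched as below. This is a Python 3 approximation under stated assumptions: md5 is the assumed digest, the format check is a simplified stand-in for cookie_is_encoded (whose source is not shown here), sign is a hypothetical helper, and the values 'alice' and 'admin' are made up.

```python
import base64
import hmac
import pickle

def _lscmp(a, b):
    """Comparison helper as used by Bottle; also requires equal lengths."""
    return not sum(0 if x == y else 1 for x, y in zip(a, b)) and len(a) == len(b)

def sign(msg, key):
    """Base64-encoded HMAC of msg (md5 is an assumption for this sketch)."""
    return base64.b64encode(hmac.new(key, msg, digestmod='md5').digest())

def cookie_encode(data, key):
    msg = base64.b64encode(pickle.dumps(data, -1))
    return b'!' + sign(msg, key) + b'?' + msg

def cookie_decode(cookie, key):
    """Return the original object, or None if verification fails."""
    if not (cookie.startswith(b'!') and b'?' in cookie):
        return None
    sig, msg = cookie.split(b'?', 1)
    # sig[1:] drops the leading b'!' before comparing with a fresh signature.
    if _lscmp(sig[1:], sign(msg, key)):
        return pickle.loads(base64.b64decode(msg))
    return None

key = b'salt'
cookie = cookie_encode(('account', 'alice'), key)

# Forge a cookie: keep the old signature but swap in a different message.
forged_msg = base64.b64encode(pickle.dumps(('account', 'admin'), -1))
forged = cookie.split(b'?', 1)[0] + b'?' + forged_msg

print(cookie_decode(cookie, key))           # ('account', 'alice')
print(cookie_decode(forged, key))           # None
print(cookie_decode(cookie, b'wrong-key'))  # None
```

Note that pickle.loads is only reached after the signature has been verified, which matters because unpickling untrusted data is unsafe.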
def _lscmp(a, b):
    ''' Compares two strings in a cryptographically safe way:
        Runtime is not affected by length of common prefix. '''
    return not sum(0 if x==y else 1 for x, y in zip(a, b)) and len(a) == len(b)
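This function compares the two strings character by character, yielding 0 for each pair that matches and 1 for each pair that differs. If the strings match completely, every term is 0, so sum() returns 0 and "not 0" is True. Any mismatch makes the sum non-zero. Because zip() stops at the end of the shorter string, the result is only meaningful when the lengths are equal, hence the extra len(a) == len(b) check.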
Test code
>>> a='asdas'
>>> unicode(a)
Traceback (most recent call last):
File "<pyshell#1>", line 1, in <module>
unicode(a)
NameError: name 'unicode' is not defined
>>> b='asd'
>>> (0 if x==y else 1 for x,y in zip(a,b))
<generator object <genexpr> at 0x0000000003170200>
>>> sum((0 if x==y else 1 for x,y in zip(a,b)))
0
>>> s=zip(a,b)
>>> s
<zip object at 0x0000000003147948>
>>> for i in s:
        print(i)

('a', 'a')
('s', 's')
('d', 'd')
Why not simply compare the two strings with a == b?
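The docstring of _lscmp hints at the answer: a == b can stop at the first differing character, so its running time may reveal how long the matching prefix is, and an attacker could use that to guess a signature piece by piece. For the same purpose the standard library offers hmac.compare_digest (available since Python 3.3). The sketch below uses made-up signature values.

```python
import hmac

expected = b'3q2+7w=='   # hypothetical signature values, for illustration only
good = b'3q2+7w=='
bad = b'3q2+7x=='

print(hmac.compare_digest(expected, good))    # True
print(hmac.compare_digest(expected, bad))     # False
print(hmac.compare_digest(expected, b'3q2'))  # False: lengths differ
```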