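Implementing Simple Upload and Download with Wireshark and the OSS API Documentation

Background and Purpose

Developers whose programming language has no official SDK have to build their own.
Using Wireshark and the API documentation, this article implements simple upload and download requests as a basic example for developers who need to write their own SDK.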

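Preparation

Install Wireshark

Official site: https://www.wireshark.org/download.html
Pick the build that matches your platform, then download and install it.

Find the OSS API documentation

Official site: https://help.aliyun.com/document_detail/oss/api-reference/abstract.html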

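Set up the development environment

1. This article uses Python 2.7 with the requests library.

http://cn.python-requests.org/zh_CN/latest/

2. You need to activate OSS, own a bucket, and obtain an AccessKeyId and AccessKeySecret.

Hands-on

Based on the OSS API documentation, implement a simple upload and download in Python.

Upload

1. Start with the Put Object API documentation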

https://help.aliyun.com/document_detail/oss/api-reference/object/PutObject.html

Request syntax
PUT /ObjectName HTTP/1.1
Content-Length: ContentLength
Content-Type: ContentType
Host: BucketName.oss-cn-hangzhou.aliyuncs.com
Date: GMT Date
Authorization: SignatureValue
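
The Date header in the request syntax is an HTTP date expressed in GMT. A minimal sketch of producing one with the standard library follows. It is written for Python 3 (the article's own scripts target Python 2.7), http_gmt_date is a hypothetical helper name, and the fixed timestamp is an assumption chosen only so the result is reproducible.

```python
from email.utils import formatdate


def http_gmt_date(timestamp=None):
    """Return an HTTP-style GMT date string.

    timestamp is seconds since the epoch; None means "now".
    """
    return formatdate(timestamp, usegmt=True)


# A fixed timestamp gives a reproducible value; in a real request,
# call http_gmt_date() with no argument to use the current time.
print(http_gmt_date(1461678282))
```

The same string has to be used both in the Date header and in the signature calculation, so it is worth generating it once per request and reusing it.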

2. Build a similar HTTP request

BucketName is ali-beijing
Endpoint is oss-cn-beijing.aliyuncs.com
ObjectName is test.txt
Save the following code to a file and run it.

import requests
bucket = "ali-beijing"
objectname = "test.txt"
endpoint = "oss-cn-beijing.aliyuncs.com"
url = "http://%s.%s/%s" % (bucket, endpoint, objectname)
headers = {}
r = requests.put(url, data="hello", headers=headers)
print r.text
print r.status_code
print r.headers

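3. Capture packets with Wireshark while the script runs

Once the script finishes, stop the capture and inspect the request.
As shown in the figure:
[Figure: Wireshark capture window]

After stopping the capture, click "Protocol" (marked with a red box in the figure), find the HTTP request that was sent, then click "Analyze" -> "Follow TCP Stream" to see the full content of the HTTP request.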

The final HTTP request looks like this:

PUT /test.txt HTTP/1.1
Host: ali-beijing.oss-cn-beijing.aliyuncs.com
Content-Length: 5
User-Agent: python-requests/2.5.1 CPython/2.7.10 Darwin/15.0.0
Connection: keep-alive
Accept: */*
Accept-Encoding: gzip, deflate

hello

HTTP/1.1 403 Forbidden
Server: AliyunOSS
Date: Tue, 26 Apr 2016 10:01:20 GMT
Content-Type: application/xml
Content-Length: 279
Connection: keep-alive
x-oss-request-id: 571F3C704FF4F07A6A0080A6

<?xml version="1.0" encoding="UTF-8"?>
<Error>
  <Code>AccessDenied</Code>
  <Message>You have no right to access this object because of bucket acl.</Message>
  <RequestId>571F3C704FF4F07A6A0080A6</RequestId>
  <HostId>ali-beijing.oss-cn-beijing.aliyuncs.com</HostId>
</Error>

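Comparing this with the Put Object protocol shows that the request headers contain no Authorization, no Date, and no Content-Type. Because the bucket is private, a request without Authorization credentials cannot write to it, so signature information must be added.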
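
When a request is rejected, the XML error body shown above can also be read programmatically rather than by eye. This is a minimal sketch, assuming Python 3 and the standard xml.etree.ElementTree module; parse_oss_error is a hypothetical helper, and the sample body is copied from the 403 response above.

```python
import xml.etree.ElementTree as ET


def parse_oss_error(body):
    """Map each child element of <Error> to its text, e.g. Code and Message."""
    root = ET.fromstring(body)
    return {child.tag: (child.text or "") for child in root}


# Error body copied from the captured 403 response.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<Error>
  <Code>AccessDenied</Code>
  <Message>You have no right to access this object because of bucket acl.</Message>
  <RequestId>571F3C704FF4F07A6A0080A6</RequestId>
  <HostId>ali-beijing.oss-cn-beijing.aliyuncs.com</HostId>
</Error>"""

error = parse_oss_error(sample)
print(error["Code"])
print(error["Message"])
```

In the scripts in this article, the response body would be passed in instead of the sample, for example r.content from requests.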

4. Add the signature information described in the API documentation

Signature documentation:
https://help.aliyun.com/document_detail/oss/api-reference/access-control/signature-header.html

#coding=utf-8
import requests, datetime, hmac, httplib, hashlib
from email.utils import formatdate
from urllib import quote
from base64 import b64encode

class OssRequest():
    def __init__(self,  endpoint, AccessKeyId, AccessKeySecret, bucket):
        self.endpoint = endpoint
        self.AccessKeyId = AccessKeyId
        self.AccessKeySecret = AccessKeySecret
        self.bucket = bucket
        self.objectname = ""
        self.subresource = ""
        self.VERB = ""

    def format_oss_headers(self, headers=None):
        map = {}
        for header, value in headers.iteritems():
            header = header.lower()
            if header.startswith("x-oss-"):
                map.setdefault(header, []).append(value)
        parts = []
        for key in sorted(map):
            parts.append("%s:%s\n" % (key, ",".join(map[key])))
        return "".join(parts)

    def canonical_resource(self):
        resource = "/"
        if self.bucket:
            resource += self.bucket + "/"
        if self.objectname:
            resource += "%s" % self.objectname
        if self.subresource:
            resource += "?%s" % quote(self.subresource, "/")
        return resource

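    def sign(self, headers=None):
        if not headers:
            headers = {}
        # String to sign starts with VERB, Content-MD5, Content-Type and Date, one per line
        items = [self.VERB,
                 headers.get('Content-MD5', ''),
                 headers.get('Content-Type', ''),
                 headers.get('Date', '')]
        AuthString = "\n".join(str(item_) for item_ in items) + "\n"
        CanonicalizedOSSHeaders = self.format_oss_headers(headers)
        CanonicalizedResource = self.canonical_resource()
        AuthString = "".join((AuthString, CanonicalizedOSSHeaders, CanonicalizedResource))
        # HMAC-SHA1 over the string to sign, keyed with this instance's AccessKeySecret
        Signature = b64encode(hmac.new(self.AccessKeySecret, AuthString.encode("utf-8"), hashlib.sha1).digest())
        return Signature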

    def put(self, objectname):
        self.VERB = 'PUT'
        self.objectname = objectname
        url = "http://%s.%s/%s" % (self.bucket, self.endpoint, self.objectname)
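        # Content-Type as it appears in the capture of this request
        # (the standard MIME type for plain text would be text/plain)
        headers = {'Date' : formatdate(None, usegmt=True), 'Content-Type' : 'plain/text'}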
        Signature = self.sign(headers)
        headers['Authorization'] = 'OSS %s:%s' % (self.AccessKeyId, Signature)
        r = requests.put(url, data = "hello", headers=headers)
        print r.text
        print r.status_code
        print r.headers

if __name__ == "__main__":
    AccessKeyId = "replace with your own AccessKeyId"
    AccessKeySecret = "replace with your own AccessKeySecret"
    bucket = "ali-beijing"
    objectname = "test.txt"
    endpoint = "oss-cn-beijing.aliyuncs.com"
    a = OssRequest(endpoint, AccessKeyId, AccessKeySecret, bucket)
    a.put(objectname)
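
For readers working in Python 3, the signing step on its own can be sketched as two small functions. This is a sketch under stated assumptions rather than a reference implementation: string_to_sign and sign are hypothetical names, the key id and secret are placeholders, and the layout of the string to sign follows the listing above (VERB, Content-MD5, Content-Type, Date, then the x-oss- headers and the resource).

```python
import base64
import hashlib
import hmac


def string_to_sign(verb, date, resource, content_md5="", content_type="", oss_headers=""):
    """Assemble the text that gets signed.

    oss_headers is the already canonicalized "x-oss-...:value\\n" block (may be empty).
    """
    return "\n".join([verb, content_md5, content_type, date]) + "\n" + oss_headers + resource


def sign(secret, text):
    """Return base64(HMAC-SHA1(secret, text)) as a str."""
    digest = hmac.new(secret.encode("utf-8"), text.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder credentials: assumptions for illustration only.
key_id = "hypothetical-key-id"
secret = "hypothetical-secret"

text = string_to_sign("PUT", "Tue, 26 Apr 2016 13:44:42 GMT",
                      "/ali-beijing/test.txt", content_type="plain/text")
signature = sign(secret, text)
authorization = "OSS %s:%s" % (key_id, signature)
print(repr(text))
print(authorization)
```

Printing the string to sign with repr makes the newlines visible, which helps when comparing it against the headers seen in a capture.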

5. Run it again and observe with Wireshark

Capturing and inspecting as before shows that the upload succeeded.

PUT /test.txt HTTP/1.1
Host: ali-beijing.oss-cn-beijing.aliyuncs.com
Content-Length: 5
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.5.1 CPython/2.7.10 Darwin/15.0.0
Connection: keep-alive
Date: Tue, 26 Apr 2016 13:44:42 GMT
Content-Type: plain/text
Authorization: OSS testaliyun:1aUnxjJ4V/0+pTwzd7t9An3d10c=

hello

HTTP/1.1 200 OK
Server: AliyunOSS
Date: Tue, 26 Apr 2016 13:44:42 GMT
Content-Length: 0
Connection: keep-alive
x-oss-request-id: 571F70CA4FF4F07A6A022212
ETag: "5D41402ABC4B2A76B9719D911017C592"
x-oss-hash-crc64ecma: 11177612005948864433
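
The ETag in this response is the uppercase hexadecimal MD5 of the uploaded body "hello", which gives a cheap integrity check after an upload. A minimal sketch in Python 3 follows; etag_matches is a hypothetical helper, and it assumes the object was written with a single Put Object request, since the ETag of objects created in other ways may not be a plain MD5.

```python
import hashlib


def etag_matches(body, etag):
    """Compare the MD5 of body (bytes) with an ETag header value, ignoring quotes and case."""
    expected = hashlib.md5(body).hexdigest().upper()
    return etag.strip('"').upper() == expected


# ETag value copied from the captured 200 OK response.
print(etag_matches(b"hello", '"5D41402ABC4B2A76B9719D911017C592"'))
```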

Download

1. Check the Get Object API documentation

https://help.aliyun.com/document_detail/oss/api-reference/object/GetObject.html

GET /ObjectName HTTP/1.1
Host: BucketName.oss-cn-hangzhou.aliyuncs.com
Date: GMT Date
Authorization: SignatureValue
Range: bytes=ByteRange (optional)
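
The optional Range header uses the standard HTTP byte-range form, with zero-based, inclusive positions. A minimal sketch of building its value in Python 3 follows; range_header is a hypothetical helper, not part of any SDK.

```python
def range_header(first, last=None):
    """Build a Range header value; byte positions are zero-based and inclusive."""
    if first < 0 or (last is not None and last < first):
        raise ValueError("invalid byte range")
    if last is None:
        # Open-ended range: from `first` to the end of the object.
        return "bytes=%d-" % first
    return "bytes=%d-%d" % (first, last)


print(range_header(0, 1))  # the first two bytes of the object
print(range_header(2))     # everything from the third byte onward
```

To try it with a download request, the value would be added to the request headers under the key Range. The signing code in this article only signs the verb, Content-MD5, Content-Type, Date, x-oss- headers and the resource, so under that scheme the Range header should not change the signature.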

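2. Implement download on top of the working upload

Since uploading the object already works, only the following code needs to be added.

The code shared with the upload example is omitted here.
Add the following method below
def put(self, objectname):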

    def get(self, objectname):
        self.VERB = 'GET'
        self.objectname = objectname
        url = "http://%s.%s/%s" % (self.bucket, self.endpoint, self.objectname)
        headers = {'Date' : formatdate(None, usegmt=True)}
        Signature = self.sign(headers)
        headers['Authorization'] = 'OSS %s:%s' % (self.AccessKeyId, Signature)
        r = requests.get(url, headers=headers)
        print r.text
        print r.status_code
        print r.headers

When calling it, add a.get(objectname) below a.put(objectname).

3. Observe the capture

GET /test.txt HTTP/1.1
Host: ali-beijing.oss-cn-beijing.aliyuncs.com
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.5.1 CPython/2.7.10 Darwin/15.0.0
Connection: keep-alive
Date: Tue, 26 Apr 2016 14:16:32 GMT
Authorization: OSS testaliyun:ARRfi3zGoiGdrAjmM5lJ0o4LEBA=

HTTP/1.1 200 OK
Server: AliyunOSS
Date: Tue, 26 Apr 2016 14:16:32 GMT
Content-Type: plain/text
Content-Length: 5
Connection: keep-alive
x-oss-request-id: 571F78404FF4F07A6A023023
Accept-Ranges: bytes
ETag: "5D41402ABC4B2A76B9719D911017C592"
Last-Modified: Tue, 26 Apr 2016 13:44:42 GMT
x-oss-object-type: Normal
x-oss-hash-crc64ecma: 11177612005948864433
Cache-Control: max-age=86400

hello

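The above is a simple upload and download built from the API documentation.
The code is deliberately minimal: it does not retry on errors and does not handle uploading or downloading large files.
The main goal is to show how to use Wireshark and the API documentation to construct HTTP requests against the OSS interfaces.

Common Problems

1. Content-MD5 computed incorrectly

Take the message body "0123456789" and compute the Content-MD5 of this string.

The correct approach:
Put simply, the algorithm defined in the standard is:
1. Compute the MD5 digest as a raw byte array (128 bits).
2. Base64-encode those raw bytes (not the 32-character hexadecimal string).

Using Python as an example:
The correct code is: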
>>> import base64,hashlib
>>> hash = hashlib.md5()
>>> hash.update("0123456789")
>>> base64.b64encode(hash.digest())
'eB5eJF1ptWaXm4bijSPyxw=='

Note
Correct: hash.digest(), which returns the raw byte array (128 bits)
>>> hash.digest()
'x\x1e^$]i\xb5f\x97\x9b\x86\xe2\x8d#\xf2\xc7'

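A common mistake is to base64-encode the 32-character hexadecimal string instead.
For example, the wrong choice is hash.hexdigest(), which returns the printable 32-character hexadecimal string: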
>>> hash.hexdigest()
'781e5e245d69b566979b86e28d23f2c7'
Base64-encoding that wrong MD5 value gives:
>>> base64.b64encode(hash.hexdigest())
'NzgxZTVlMjQ1ZDY5YjU2Njk3OWI4NmUyOGQyM2YyYzc='
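
The interactive session above is Python 2. In Python 3 the body must be bytes and the base64 result is bytes as well, so a small wrapper helps. This is a minimal sketch; content_md5 is a hypothetical helper name.

```python
import base64
import hashlib


def content_md5(body):
    """Return the Content-MD5 header value for body (bytes).

    The raw 16-byte digest is base64-encoded, not the hex string.
    """
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


print(content_md5(b"0123456789"))
```

A correct value is always 24 characters long and ends in "==", whereas the mistaken hex-based encoding is 44 characters long, which makes the error easy to spot in a capture.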

2. Some headers are left out of the signature calculation

For example, headers starting with x-oss- are not included in the signature calculation.
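
A minimal Python 3 sketch of collecting those headers for the signature, mirroring format_oss_headers from the upload listing: names are lowercased, only x-oss- headers are kept, and the result is sorted by name with one "name:value" line each. canonicalized_oss_headers is a hypothetical name and the sample header values are made up for illustration.

```python
def canonicalized_oss_headers(headers):
    """Return the sorted "name:value\\n" block for every x-oss- header in headers."""
    picked = {}
    for name, value in headers.items():
        key = name.strip().lower()
        if key.startswith("x-oss-"):
            picked[key] = value.strip()
    return "".join("%s:%s\n" % (key, picked[key]) for key in sorted(picked))


# Illustrative request headers; only the two x-oss- entries take part.
headers = {
    "Date": "Tue, 26 Apr 2016 13:44:42 GMT",
    "x-oss-object-acl": "private",
    "X-OSS-Meta-Author": "alice",
}
print(repr(canonicalized_oss_headers(headers)))
```

The returned block goes between the Date line and the canonicalized resource in the string to sign. If it is empty, nothing is added.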

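3. Content-Type set incorrectly

The object was uploaded without the correct Content-Type, so browsers and other clients cannot preview or otherwise handle it based on Content-Type.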
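
One way to avoid this is to derive the Content-Type from the object name before uploading. This is a minimal sketch using Python's standard mimetypes module (Python 3). guess_content_type is a hypothetical helper, and falling back to application/octet-stream is an assumption about a sensible default, not something the OSS documentation is quoted on here.

```python
import mimetypes


def guess_content_type(object_name, default="application/octet-stream"):
    """Guess a Content-Type from the file extension, falling back to default."""
    content_type, _encoding = mimetypes.guess_type(object_name)
    return content_type or default


print(guess_content_type("test.txt"))
print(guess_content_type("photo.jpg"))
print(guess_content_type("blob.no-such-ext"))
```

Whatever value is sent in the Content-Type header must be the same value used in the string to sign, otherwise the signature check fails.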
