Multi-file upload over HTTP

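Enterprise applications frequently need HTTP-based file upload, and multipart is the format used to upload several files in one request. In Java development the HTTP message is sometimes built with a third-party library and sometimes assembled by hand.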

Before assembling the message by hand you need to understand its format. The HTTP request headers look like this:

POST /test/upload HTTP/1.1
Content-Type: multipart/form-data;boundary=0xKhTmLbOuNdArY-35123DE7-B577-4495-AA1E-092BB0CCFC64
charset: utf-8
Content-Length: 19520216
Host: 10.19.220.234:8080
Connection: Keep-Alive
User-Agent: Apache-HttpClient/4.2.5 (java 1.5)

The key line is:

Content-Type: multipart/form-data;boundary=0xKhTmLbOuNdArY-35123DE7-B577-4495-AA1E-092BB0CCFC64
Content-Type determines the type of the message body, and boundary is the string used to separate the individual parts.
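
To make the role of the boundary parameter concrete, here is a minimal sketch of how a receiver might pull it out of the Content-Type value. The class name `BoundaryParam` and its method are hypothetical, not part of any library mentioned in this post, and the parsing is simplified: it assumes the boundary value itself does not contain a `;`.

```java
import java.util.Locale;

final class BoundaryParam {

    /** Returns the boundary parameter of a multipart Content-Type value, or null if there is none. */
    static String extract(String contentType) {
        if (contentType == null
                || !contentType.toLowerCase(Locale.ENGLISH).startsWith("multipart/")) {
            return null;
        }
        // Parameters follow the media type and are separated by ';'.
        String[] params = contentType.split(";");
        for (int i = 1; i < params.length; i++) {
            String p = params[i].trim();
            // Parameter names are case-insensitive.
            if (p.toLowerCase(Locale.ENGLISH).startsWith("boundary=")) {
                String value = p.substring("boundary=".length()).trim();
                // The value may be written as a quoted string.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(extract(
                "multipart/form-data;boundary=0xKhTmLbOuNdArY-35123DE7-B577-4495-AA1E-092BB0CCFC64"));
    }
}
```

In the body, every delimiter line is this value prefixed with two hyphens, and the closing delimiter has two more hyphens appended.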

An example request body:

--0xKhTmLbOuNdArY-35123DE7-B577-4495-AA1E-092BB0CCFC64
Content-Disposition: form-data; name="messageId"
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 8bit

1395271935
--0xKhTmLbOuNdArY-35123DE7-B577-4495-AA1E-092BB0CCFC64
Content-Disposition: form-data; name="name"
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 8bit

image
--0xKhTmLbOuNdArY-35123DE7-B577-4495-AA1E-092BB0CCFC64
Content-Disposition: form-data; name="type"
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 8bit

3
--0xKhTmLbOuNdArY-35123DE7-B577-4495-AA1E-092BB0CCFC64
Content-Disposition: form-data; name="content"; filename="11.mp4"
Content-Type: application/octet-stream
Content-Transfer-Encoding: binary

....ftypisom....isomavc1..4.moov...lmvhd.....1...1.....X........(file content omitted)
--0xKhTmLbOuNdArY-35123DE7-B577-4495-AA1E-092BB0CCFC64
Content-Disposition: form-data; name="vimg"; filename="test.jpg"
Content-Type: application/octet-stream
Content-Transfer-Encoding: binary

......JFIF.............C..............
..
................. $.' ",#..(7),01444.'9=82<.342...C........(file content omitted)
--0xKhTmLbOuNdArY-35123DE7-B577-4495-AA1E-092BB0CCFC64--
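
For readers who want to assemble such a body by hand, the layout above can be sketched with the standard library alone. The class `MultipartBodySketch` and its method names are hypothetical, and the sketch makes simplifying assumptions: field names and text values are plain ASCII, the whole body fits in memory, and the Content-Transfer-Encoding headers seen in the capture are left out.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class MultipartBodySketch {
    private static final String CRLF = "\r\n";

    private final String boundary;
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    MultipartBodySketch(String boundary) {
        this.boundary = boundary;
    }

    /** Appends one text part: delimiter line, headers, blank line, value. */
    MultipartBodySketch addField(String name, String value) {
        writeAscii("--" + boundary + CRLF);
        writeAscii("Content-Disposition: form-data; name=\"" + name + "\"" + CRLF);
        writeAscii("Content-Type: text/plain; charset=US-ASCII" + CRLF);
        writeAscii(CRLF); // the blank line ends the part headers
        writeAscii(value);
        writeAscii(CRLF);
        return this;
    }

    /** Appends one file part; the raw bytes are written unchanged. */
    MultipartBodySketch addFile(String name, String filename, byte[] content) {
        writeAscii("--" + boundary + CRLF);
        writeAscii("Content-Disposition: form-data; name=\"" + name
                + "\"; filename=\"" + filename + "\"" + CRLF);
        writeAscii("Content-Type: application/octet-stream" + CRLF);
        writeAscii(CRLF);
        out.write(content, 0, content.length);
        writeAscii(CRLF);
        return this;
    }

    /** Writes the closing delimiter exactly once and returns the finished body. */
    byte[] finish() {
        writeAscii("--" + boundary + "--" + CRLF);
        return out.toByteArray();
    }

    // Characters outside US-ASCII would be replaced with '?' here.
    private void writeAscii(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        byte[] body = new MultipartBodySketch("0xKhTmLbOuNdArY-35123DE7-B577-4495-AA1E-092BB0CCFC64")
                .addField("messageId", "1395271935")
                .addFile("vimg", "test.jpg", new byte[] {(byte) 0xFF, (byte) 0xD8})
                .finish();
        System.out.println(body.length + " bytes");
    }
}
```

Because each `add` call writes its own delimiter line and `finish` writes only the closing one, a stray extra delimiter between parts cannot creep in. The length of the returned array is the value to send as Content-Length.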

To parse the multipart body I use Apache's commons-fileupload-1.3.jar. A demo follows:

		boolean isHaveData = ServletFileUpload.isMultipartContent(request);
		if (isHaveData) {
			FileItemFactory factory = new DiskFileItemFactory();
			ServletFileUpload upload = new ServletFileUpload(factory);
			try {
				List<?> items = upload.parseRequest(request);
				Iterator<?> iter = items.iterator();
				while (iter.hasNext()) {
					FileItem item = (FileItem) iter.next();
					if (item.isFormField()) {
						// handle an ordinary text form field
						String paramName = item.getFieldName();
						String paramValue = item.getString();
						map.put(paramName, paramValue);
					} else {
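						// handle an uploaded file; match on the form field name
						// (getName() would return the client-side file name instead)
						if("vimage".equalsIgnoreCase(item.getFieldName()) || "vimg".equalsIgnoreCase(item.getFieldName())){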
							vimg = item.get();
						} else {
							data = item.get();
						}
					}
				}
			} catch (FileUploadException e) {
				logger.error("FileUploadException : ", e);
			}
		}

The library code that checks the content type is:

    public static final String MULTIPART = "multipart/";

    public static final boolean isMultipartContent(RequestContext ctx) {
        String contentType = ctx.getContentType();
        if (contentType == null) {
            return false;
        }
        if (contentType.toLowerCase(Locale.ENGLISH).startsWith(MULTIPART)) {
            return true;
        }
        return false;
    }

One thing to watch out for: the source code behind upload.parseRequest(request) contains this:

                boolean nextPart;
                if (skipPreamble) {
                    nextPart = multi.skipPreamble();
                } else {
                    nextPart = multi.readBoundary();
                }
                if (!nextPart) {
                    if (currentFieldName == null) {
                        // Outer multipart terminated -> No more data
                        eof = true;
                        return false;
                    }
                    // Inner multipart terminated -> Return to parsing the outer
                    multi.setBoundary(boundary);
                    currentFieldName = null;
                    continue;
                }
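In other words, if there is nothing between two boundaries, the parser decides it has reached EOF and stops iterating. In practice, while assembling a message by hand I accidentally inserted an extra boundary between two parts, and every part after it could no longer be read. The problem puzzled me for a long time and looked bizarre, and only reading the source explained it.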
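
A mistake like this can be caught before the request is sent. The following is a minimal sketch of such a check for small test bodies; the class `BoundaryCheck` is hypothetical and does not reproduce the parser's logic. It only reports whether two delimiter lines occur with nothing but blank lines between them, and it assumes the body is small enough to hold in memory as a string.

```java
import java.nio.charset.StandardCharsets;

final class BoundaryCheck {

    /** Returns true if two delimiter lines are separated only by blank lines. */
    static boolean hasAdjacentDelimiters(byte[] body, String boundary) {
        // ISO-8859-1 maps every byte to one char, so binary content survives the conversion.
        String text = new String(body, StandardCharsets.ISO_8859_1);
        String delimiter = "--" + boundary;
        boolean previousWasDelimiter = false;
        for (String line : text.split("\r\n", -1)) {
            if (line.isEmpty()) {
                continue; // blank lines alone do not make a real part
            }
            boolean isDelimiter = line.equals(delimiter) || line.equals(delimiter + "--");
            if (isDelimiter && previousWasDelimiter) {
                return true;
            }
            previousWasDelimiter = isDelimiter;
        }
        return false;
    }

    public static void main(String[] args) {
        String bad = "--B\r\n--B\r\nContent-Disposition: form-data; name=\"a\"\r\n\r\n1\r\n--B--\r\n";
        System.out.println(hasAdjacentDelimiters(bad.getBytes(StandardCharsets.ISO_8859_1), "B"));
    }
}
```

Running it in a unit test against each hand-built body is cheaper than tracing missing fields on the server side.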

Using a third-party library avoids this trouble. I use Apache's httpmime-4.2.2.jar, and the test code is:

        MultipartEntity entity = new MultipartEntity(null, "0xKhTmLbOuNdArY-35123DE7-B577-4495-AA1E-092BB0CCFC64", null);
        File file = new File("E:\\ceshi\\11.flv");
        File file2 = new File("E:\\ceshi\\test.jpg");
        entity.addPart("messageId", new StringBody("1395271935"));
        entity.addPart("name", new StringBody("image"));
        entity.addPart("type", new StringBody("3"));
        entity.addPart("to", new StringBody("6000028403"));
        entity.addPart("from", new StringBody("6000608066"));
        entity.addPart("content", new FileBody(file));
        entity.addPart("vimage", new FileBody(file2));
        httpost.setEntity(entity);

The packet capture tool I used during testing was Wireshark. Its interface is shown below:

Figure 1: Wireshark interface

The captured TCP stream looks like this:

Figure 2: TCP stream

To sum up: multipart/form-data works well for uploading multiple files.