Large File Downloads in Django

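When Django serves a file download and the file is small, the straightforward approach is to build the entire content in memory and pass it to the response object in one go: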

from django.http import HttpResponse

def simple_file_download(request):
    # do something...
    with open("simplefile", "rb") as f:
        content = f.read()  # the whole file is held in memory
    return HttpResponse(content)

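If the file is very large, the simplest option is to let a static file server such as Apache or Nginx handle the download. Sometimes that is not possible: you may need to check the user's permissions, you may not want to expose the file's real location, or the content may be generated on the fly (for example, by merging several files).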

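The Django documentation notes that a response can be given an iterator so that data is streamed to the client. In current Django versions this requires StreamingHttpResponse, because a plain HttpResponse consumes the iterator immediately and holds the full content in memory.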

To write the iterator yourself, use yield:

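from django.http import StreamingHttpResponse

def read_file(filename, buf_size=8192):
    """Yield the file's content in chunks of at most buf_size bytes."""
    with open(filename, "rb") as f:
        while True:
            content = f.read(buf_size)
            if content:
                yield content
            else:
                break

def big_file_download(request):
    filename = "filename"
    # StreamingHttpResponse sends chunks as they are produced; a plain
    # HttpResponse would read the whole iterator into memory first.
    response = StreamingHttpResponse(read_file(filename))
    return response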
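The read loop above can also be written with the two-argument form of the built-in iter(). The following is a minimal sketch, not part of the original article; the name iter_chunks is made up for illustration, and an in-memory io.BytesIO stands in for a real file so the example runs on its own.

```python
import io
from functools import partial

def iter_chunks(fileobj, buf_size=8192):
    """Return an iterator over chunks of at most buf_size bytes (hypothetical helper)."""
    # iter(callable, sentinel) keeps calling fileobj.read(buf_size) and
    # stops as soon as a call returns the sentinel b"" (end of file).
    return iter(partial(fileobj.read, buf_size), b"")

sample = io.BytesIO(b"a" * 20000)
sizes = [len(chunk) for chunk in iter_chunks(sample)]
print(sizes)  # [8192, 8192, 3616]
```

The caller still owns the file object here, so it has to stay open until the response has been fully sent.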

Alternatively, use a generator expression. The following example of streaming a large CSV file comes from the Django documentation:

import csv

from django.http import StreamingHttpResponse

class Echo:
    """An object that implements just the write method of the file-like
    interface.
    """
    def write(self, value):
        """Write the value by returning it, instead of storing in a buffer."""
        return value

def some_streaming_csv_view(request):
    """A view that streams a large CSV file."""
    # Generate a sequence of rows. The range is based on the maximum number of
    # rows that can be handled by a single sheet in most spreadsheet
    # applications.
    rows = (["Row {0}".format(idx), str(idx)] for idx in range(65536))
    pseudo_buffer = Echo()
    writer = csv.writer(pseudo_buffer)
    response = StreamingHttpResponse((writer.writerow(row) for row in rows),
                                     content_type="text/csv")
    response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
    return response
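The Echo trick works because csv.writer's writerow() returns whatever the underlying write() call returns. The short sketch below shows this outside Django; it repeats the Echo class only so that the snippet runs on its own.

```python
import csv

class Echo:
    """File-like object whose write() hands the value straight back."""
    def write(self, value):
        return value

writer = csv.writer(Echo())
# writerow() formats the row, calls Echo.write() once, and returns its result.
line = writer.writerow(["Row 0", "0"])
print(repr(line))  # 'Row 0,0\r\n'
```

Each item produced by the generator expression in the view is therefore one complete, already formatted CSV line, so nothing is buffered.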

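Python also provides a file wrapper, wsgiref.util.FileWrapper, which turns a file-like object into an iterator. Its source looks roughly like this: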

class FileWrapper:
    """Wrapper to convert file-like objects to iterables"""

    def __init__(self, filelike, blksize=8192):
        self.filelike = filelike
        self.blksize = blksize
        if hasattr(filelike, 'close'):
            self.close = filelike.close

    # Legacy sequence-style iteration; newer Python versions drop this method.
    def __getitem__(self, key):
        data = self.filelike.read(self.blksize)
        if data:
            return data
        raise IndexError

    def __iter__(self):
        return self

    # Named next() in Python 2; Python 3 uses __next__().
    def __next__(self):
        data = self.filelike.read(self.blksize)
        if data:
            return data
        raise StopIteration

Usage:

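import os
# Newer Django versions no longer ship django.core.servers.basehttp.FileWrapper;
# the standard-library class is the same wrapper.
from wsgiref.util import FileWrapper

from django.http import StreamingHttpResponse

def file_download(request, filename):
    # filename must be validated first; never open a user-supplied path as-is.
    wrapper = FileWrapper(open(filename, 'rb'))
    response = StreamingHttpResponse(wrapper, content_type='application/octet-stream')
    response['Content-Length'] = os.path.getsize(filename)
    response['Content-Disposition'] = 'attachment; filename="%s"' % os.path.basename(filename)
    return response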

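StreamingHttpResponse is the class Django provides in place of HttpResponse for streamed data. For files opened in binary mode, its subclass FileResponse does the chunking itself.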

Downloading several files as a ZIP archive:

import tempfile
import zipfile
from wsgiref.util import FileWrapper

from django.http import StreamingHttpResponse

def send_zipfile(request):
    """
    Create a ZIP file on disk and transmit it in chunks of 8KB,
    without loading the whole file into memory. A similar approach can
    be used for large dynamic PDF files.
    """
    temp = tempfile.TemporaryFile()
    archive = zipfile.ZipFile(temp, 'w', zipfile.ZIP_DEFLATED)
    for index in range(10):
        filename = __file__  # Select your files here.
        archive.write(filename, 'file%d.txt' % index)
    archive.close()
    wrapper = FileWrapper(temp)
    response = StreamingHttpResponse(wrapper, content_type='application/zip')
    response['Content-Disposition'] = 'attachment; filename=test.zip'
    # After the archive is closed, the file position equals the archive size.
    response['Content-Length'] = temp.tell()
    temp.seek(0)
    return response
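The temp-file-plus-wrapper idea can be checked without Django. The following is a minimal sketch under stated assumptions: build_zip_chunks is a hypothetical helper, and the members are small in-memory byte strings rather than real files. It writes the archive to a temporary file, then confirms that the chunks from FileWrapper reassemble into a valid ZIP of the size reported by tell().

```python
import io
import tempfile
import zipfile
from wsgiref.util import FileWrapper

def build_zip_chunks(members, blksize=8192):
    """Write members (name -> bytes) to a temp ZIP; return (size, chunk iterator)."""
    temp = tempfile.TemporaryFile()
    with zipfile.ZipFile(temp, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in members.items():
            archive.writestr(name, data)
    size = temp.tell()  # position after closing the archive is the total size
    temp.seek(0)        # rewind so the wrapper reads from the start
    return size, FileWrapper(temp, blksize)

members = {"file%d.txt" % i: b"hello " * 1000 for i in range(10)}
size, chunks = build_zip_chunks(members)
payload = b"".join(chunks)
names = zipfile.ZipFile(io.BytesIO(payload)).namelist()
```

Joining the chunks is only done here to verify the result; in a view the iterator would be handed to the response unconsumed.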

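Either way, serving large downloads from Django itself is not a good idea. The better approach is to let Django do the permission check and then hand the actual download over to the static file server.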

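This relies on the sendfile mechanism: "A traditional web server handles a file download by first reading the file content into application memory and then sending that content to the client's browser. On today's high-traffic sites this consumes considerable server resources. sendfile is a high-performance network I/O method supported by modern operating systems: the kernel's sendfile call pushes file content directly into the network card's buffer, sparing the web server the cost of reading and writing the file and achieving 'zero-copy' transfer."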
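To make the mechanism concrete, here is a small sketch using Python's socket.sendfile(), which calls os.sendfile() where the platform supports it and otherwise falls back to ordinary send() calls. It is an illustration only, not how Nginx or Apache are implemented. The helpers send_file and demo are made-up names, and a local socket pair stands in for a client connection.

```python
import os
import socket
import tempfile
import threading

def send_file(sock, path):
    """Send the whole file at path over sock; return the number of bytes sent."""
    with open(path, "rb") as f:
        # The file content is not read into a Python-level buffer when the
        # kernel sendfile path is available.
        return sock.sendfile(f)

def demo(size=20000):
    """Push a temporary file through a local socket pair and read it back."""
    payload = os.urandom(size)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    sender, receiver = socket.socketpair()
    received = bytearray()

    def drain():
        # Read until the sender closes its end of the connection.
        while True:
            chunk = receiver.recv(65536)
            if not chunk:
                break
            received.extend(chunk)

    reader = threading.Thread(target=drain)
    reader.start()
    try:
        sent = send_file(sender, tmp.name)
    finally:
        sender.close()  # signals end of stream to the reader thread
        reader.join()
        receiver.close()
        os.unlink(tmp.name)
    return sent, bytes(received) == payload
```

Calling demo() returns the byte count reported by sendfile() and whether the received data matches what was written to the file.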

On Apache this requires the mod_xsendfile module; Nginx implements it through a feature called X-Accel-Redirect.

Nginx configuration:

# Will serve /var/www/files/myfile.tar.gz
# When passed URI /protected_files/myfile.tar.gz
location /protected_files {
	internal;
	alias /var/www/files;
}

Or:

# Will serve /var/www/protected_files/myfile.tar.gz
# When passed URI /protected_files/myfile.tar.gz
location /protected_files {
	internal;
	root /var/www;
}

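Note the difference between alias and root: alias replaces the matched location prefix with the given path, whereas root prepends its path to the full request URI.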
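The two mappings can be modelled in a few lines of Python. This is a deliberately simplified model of how Nginx resolves a prefix location, written only to illustrate the two config snippets above; resolve_root and resolve_alias are made-up names, and regex locations and other edge cases are ignored.

```python
def resolve_root(root, uri):
    """root: the full request URI is appended to the root directory."""
    return root.rstrip("/") + uri

def resolve_alias(location, alias, uri):
    """alias: the matched location prefix is replaced by the alias path."""
    if not uri.startswith(location):
        raise ValueError("URI does not match the location prefix")
    return alias + uri[len(location):]

uri = "/protected_files/myfile.tar.gz"
print(resolve_alias("/protected_files", "/var/www/files", uri))
# /var/www/files/myfile.tar.gz
print(resolve_root("/var/www", uri))
# /var/www/protected_files/myfile.tar.gz
```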

In Django:

response['X-Accel-Redirect'] = '/protected_files/%s' % filename

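With this setup, a request first reaches the Django view, where Django checks the user's permissions or does any other work. The view then returns a response whose X-Accel-Redirect header is /protected_files/filename. Nginx performs an internal redirect to that URI and serves the file /var/www/protected_files/filename itself: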

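import os

from django.contrib.auth.decorators import login_required
from django.http import HttpResponse

@login_required
def document_view(request, document_id):
    # Book is the application's own model; myBook is its file field.
    book = Book.objects.get(id=document_id)
    response = HttpResponse()
    name = book.myBook.name.split('/')[-1]
    response['Content-Type'] = 'application/octet-stream'
    response['Content-Disposition'] = 'attachment; filename="{0}"'.format(name)
    response['Content-Length'] = os.path.getsize(book.myBook.path)
    # The prefix must match the internal location configured in nginx.
    response['X-Accel-Redirect'] = '/protected_files/{0}'.format(book.myBook.name)
    return response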
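File names that contain spaces or non-ASCII characters need extra care in both headers. The sketch below builds the header values independently of Django; accel_redirect_headers is a made-up helper and the /protected_files/ prefix is taken from the Nginx examples above. It assumes the filename*=UTF-8'' form of Content-Disposition for the download name and percent-encodes the redirect URI, which recent Nginx versions are understood to expect. Check this against the Nginx version in use.

```python
from urllib.parse import quote

def accel_redirect_headers(relative_path, internal_prefix="/protected_files/"):
    """Build headers for an X-Accel-Redirect response (hypothetical helper)."""
    filename = relative_path.rsplit("/", 1)[-1]
    return {
        "Content-Type": "application/octet-stream",
        # filename*=UTF-8''... carries a percent-encoded UTF-8 name.
        "Content-Disposition": "attachment; filename*=UTF-8''" + quote(filename),
        # quote() leaves "/" intact, so sub-directories are preserved.
        "X-Accel-Redirect": internal_prefix + quote(relative_path),
    }

headers = accel_redirect_headers("books/my report.pdf")
print(headers["X-Accel-Redirect"])  # /protected_files/books/my%20report.pdf
```

In a Django view, each key and value would be copied onto the response object before it is returned.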

  

 

Published by

风君子
