Django File Downloads, Part 2

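When Django serves a file download and the file is small, the simplest approach is to build the entire content in memory and pass it to the response object in one go: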

 
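    from django.http import HttpResponse

    def simple_file_download(request):
        # do something...
        # Read the whole file into memory; the with block closes it again.
        with open("simplefile", "rb") as f:
            content = f.read()
        return HttpResponse(content)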

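When the file is very large, the simplest solution is to let a static file server such as Apache or Nginx handle the download. Sometimes, though, we need to restrict access by user permission, or do not want to expose the file's real location, or the large content is generated on the fly (for example, by merging several files). In those cases a plain static file server is not enough.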

The Django documentation mentions that you can pass an iterator to HttpResponse so that data is streamed to the client.

To write the iterator yourself, you can use yield:

 
    def read_file(filename, buf_size=8192):
        with open(filename, "rb") as f:
            while True:
                content = f.read(buf_size)
                if content:
                    yield content
                else:
                    break

    def big_file_download(request):
        filename = "filename"
        response = HttpResponse(read_file(filename))
        return response
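
The chunked reader can also be written with the two-argument form of iter(), which keeps calling a function until it returns a sentinel value. The following is only a minimal sketch of that idea; the name iter_file_chunks is made up for this illustration and is not part of Django.

```python
from functools import partial


def iter_file_chunks(filename, buf_size=8192):
    """Yield the file's bytes in pieces of at most buf_size bytes."""
    with open(filename, "rb") as f:
        # iter(callable, sentinel) calls f.read(buf_size) repeatedly and
        # stops as soon as it returns b"", i.e. at end of file.
        for chunk in iter(partial(f.read, buf_size), b""):
            yield chunk
```

It behaves like read_file: memory use is bounded by buf_size regardless of the file size, and the file is closed once the generator is exhausted.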

Or use a generator expression. Below is the example from the Django documentation for streaming a large CSV file:

 
    import csv

    # Python 2/3 compatibility import. django.utils.six was removed in
    # Django 3.0; on Python 3 this line can simply be dropped.
    from django.utils.six.moves import range
    from django.http import StreamingHttpResponse

    class Echo(object):
        """An object that implements just the write method of the file-like
        interface.
        """
        def write(self, value):
            """Write the value by returning it, instead of storing in a buffer."""
            return value

    def some_streaming_csv_view(request):
        """A view that streams a large CSV file."""
        # Generate a sequence of rows. The range is based on the maximum number of
        # rows that can be handled by a single sheet in most spreadsheet
        # applications.
        rows = (["Row {0}".format(idx), str(idx)] for idx in range(65536))
        pseudo_buffer = Echo()
        writer = csv.writer(pseudo_buffer)
        response = StreamingHttpResponse((writer.writerow(row) for row in rows),
                                         content_type="text/csv")
        response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
        return response
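
The Echo trick works because csv.writer.writerow() returns whatever the underlying write() call returns. The sketch below shows that in isolation, without Django; the helper name csv_lines is an assumption made for this illustration.

```python
import csv


class Echo:
    """File-like object whose write() hands the value straight back."""

    def write(self, value):
        return value


def csv_lines(rows):
    """Lazily turn each row into one formatted CSV line."""
    writer = csv.writer(Echo())
    # writerow() returns the result of Echo.write(), i.e. the formatted line.
    return (writer.writerow(row) for row in rows)
```

Each line is produced only when the consumer asks for it, which is what lets a streaming response send row after row without ever holding the whole CSV file in memory.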

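Python also provides a file wrapper (wsgiref.util.FileWrapper in the standard library) that turns a file-like object into an iterator: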

 
    class FileWrapper:
        """Wrapper to convert file-like objects to iterables"""

        def __init__(self, filelike, blksize=8192):
            self.filelike = filelike
            self.blksize = blksize
            if hasattr(filelike, 'close'):
                self.close = filelike.close

        def __getitem__(self, key):
            data = self.filelike.read(self.blksize)
            if data:
                return data
            raise IndexError

        def __iter__(self):
            return self

        # Python 2 iterator protocol; in Python 3 this method is named __next__.
        def next(self):
            data = self.filelike.read(self.blksize)
            if data:
                return data
            raise StopIteration
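
To see the wrapper at work outside of a view, here is a small sketch that feeds the standard library's wsgiref.util.FileWrapper an in-memory file and records the size of each block it yields. The helper name chunk_sizes is made up for this illustration.

```python
import io
from wsgiref.util import FileWrapper


def chunk_sizes(data, blksize=8192):
    """Return the length of every block FileWrapper yields for data."""
    wrapper = FileWrapper(io.BytesIO(data), blksize)
    # Iterating the wrapper calls read(blksize) until it returns b"".
    return [len(chunk) for chunk in wrapper]
```

For 20000 bytes of input and the default block size, the wrapper yields two full blocks of 8192 bytes followed by the 3616-byte remainder.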

Usage:

 
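    import os
    # The original post imported FileWrapper from django.core.servers.basehttp,
    # which is no longer available in recent Django versions; the class itself
    # lives in the standard library.
    from wsgiref.util import FileWrapper

    from django.http import HttpResponse

    def file_download(request, filename):
        wrapper = FileWrapper(open(filename, 'rb'))
        response = HttpResponse(wrapper, content_type='application/octet-stream')
        # Fixed: the original called os.path.getsize(path), but path is undefined.
        response['Content-Length'] = os.path.getsize(filename)
        response['Content-Disposition'] = 'attachment; filename=%s' % filename
        return response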

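Django also provides the StreamingHttpResponse class, which replaces HttpResponse when the body is a stream. In current Django versions HttpResponse consumes an iterator in full as soon as it receives one, so StreamingHttpResponse is the class to use when the content should really be sent piece by piece.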

Downloading as a compressed zip file:

 
    import os, tempfile, zipfile
    # Imported from django.core.servers.basehttp in the original post; the
    # standard library location works across Django versions.
    from wsgiref.util import FileWrapper

    from django.http import HttpResponse

    def send_zipfile(request):
        """
        Create a ZIP file on disk and transmit it in chunks of 8KB,
        without loading the whole file into memory. A similar approach can
        be used for large dynamic PDF files.
        """
        temp = tempfile.TemporaryFile()
        archive = zipfile.ZipFile(temp, 'w', zipfile.ZIP_DEFLATED)
        for index in range(10):
            filename = __file__  # Select your files here.
            archive.write(filename, 'file%d.txt' % index)
        archive.close()
        wrapper = FileWrapper(temp)
        response = HttpResponse(wrapper, content_type='application/zip')
        response['Content-Disposition'] = 'attachment; filename=test.zip'
        # After archive.close() the file position is the size of the archive.
        response['Content-Length'] = temp.tell()
        temp.seek(0)
        return response
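
The same idea can be tried without Django. The sketch below builds the archive in a temporary file and then yields it block by block; the function name zip_chunks and the choice to take a dict of archive name to bytes are assumptions made for this illustration.

```python
import tempfile
import zipfile


def zip_chunks(files, blksize=8192):
    """Yield a ZIP archive of files (archive name -> bytes) in blocks."""
    with tempfile.TemporaryFile() as temp:
        # Write the archive to disk first; closing it finalizes the ZIP
        # directory but leaves the temporary file itself open.
        with zipfile.ZipFile(temp, "w", zipfile.ZIP_DEFLATED) as archive:
            for arcname, data in files.items():
                archive.writestr(arcname, data)
        temp.seek(0)
        while True:
            block = temp.read(blksize)
            if not block:
                break
            yield block
```

Because the temporary file is opened in a with block inside the generator, it is closed and deleted once the last block has been read.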

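That said, handling large file downloads in Django itself is never a great idea. The best approach is to let Django do the permission check and then have the static file server handle the download.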

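This relies on the sendfile mechanism: "When a traditional web server handles a file download, it first reads the file contents into application memory and then sends that memory to the client browser. Under the heavy load of modern websites, this approach consumes more server resources. sendfile is a high-performance network I/O method supported by modern operating systems: the kernel's sendfile call can push file contents directly into the network card's buffer, avoiding the web server's overhead of reading and writing the file and achieving 'zero copy'."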
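
Python exposes the same mechanism through socket.sendfile(), which uses the operating system's sendfile call where it is available and falls back to ordinary send() otherwise. The following is only a sketch to make the idea concrete; the function name send_file_zero_copy is made up, and in the setup described in this post it is the web server, not Python, that makes this call.

```python
import socket


def send_file_zero_copy(sock, path):
    """Send the file at path over a connected, blocking stream socket.

    Returns the number of bytes sent.
    """
    with open(path, "rb") as f:
        # Where the OS supports it, the kernel copies from the file to the
        # socket without the data passing through this process.
        return sock.sendfile(f)
```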

On Apache this requires the mod_xsendfile module, while Nginx implements it through a feature called X-Accel-Redirect.

nginx configuration file:

 
    # Will serve /var/www/files/myfile.tar.gz
    # When passed URI /protected_files/myfile.tar.gz
    location /protected_files {
        internal;
        alias /var/www/files;
    }

Or:

 
    # Will serve /var/www/protected_files/myfile.tar.gz
    # When passed URI /protected_files/myfile.tar.gz
    location /protected_files {
        internal;
        root /var/www;
    }

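Note the difference between alias and root: alias replaces the matched location prefix with the given path, while root appends the full request URI to the given path.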
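
The two mappings can be modeled in a few lines of Python. This is a simplified sketch that only covers plain prefix locations such as the two shown in the configuration; the function names resolve_root and resolve_alias are made up for this illustration and are not part of nginx.

```python
def resolve_root(root, uri):
    """root: the full request URI is appended to the root path."""
    return root.rstrip("/") + uri


def resolve_alias(location, alias, uri):
    """alias: the matched location prefix is replaced by the alias path."""
    if not uri.startswith(location):
        raise ValueError("URI does not match the location prefix")
    return alias + uri[len(location):]
```

For the URI /protected_files/myfile.tar.gz, resolve_alias("/protected_files", "/var/www/files", uri) gives /var/www/files/myfile.tar.gz, while resolve_root("/var/www", uri) gives /var/www/protected_files/myfile.tar.gz, matching the comments in the two configurations.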

In Django:

 
    response['X-Accel-Redirect'] = '/protected_files/%s' % filename

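This way, when a request reaches the Django view, Django checks the user's permissions or does whatever else is needed, then hands nginx an internal redirect to the URL /protected_files/filename, and nginx takes care of serving the file /var/www/protected_files/filename: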

 
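    import os

    from django.contrib.auth.decorators import login_required
    from django.http import HttpResponse

    # Book is the application's own model; myBook is presumably a FileField.

    @login_required
    def document_view(request, document_id):
        book = Book.objects.get(id=document_id)
        response = HttpResponse()
        name = book.myBook.name.split('/')[-1]
        # Fixed: the header name is Content-Type, not Content_Type.
        response['Content-Type'] = 'application/octet-stream'
        # The original called name.encode('utf-8') here, which on Python 3
        # would put the bytes repr (b'...') into the header.
        response["Content-Disposition"] = "attachment; filename={0}".format(name)
        response['Content-Length'] = os.path.getsize(book.myBook.path)
        # The prefix must match the internal location configured in nginx
        # (the original used /protected/, which the configuration does not define).
        response['X-Accel-Redirect'] = "/protected_files/{0}".format(book.myBook.name)
        return response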
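
File names with non-ASCII characters need extra care in the Content-Disposition header. One common option, sketched below, is the filename* form defined in RFC 6266 and RFC 5987, which carries the name as percent-encoded UTF-8. The helper name content_disposition is made up for this illustration, and it is an alternative rather than what the view above does.

```python
from urllib.parse import quote


def content_disposition(name):
    """Build an attachment header value that survives non-ASCII file names."""
    # filename*=UTF-8''<percent-encoded UTF-8 bytes of the name>
    return "attachment; filename*=UTF-8''{0}".format(quote(name, safe=""))
```

In the view, the result would be assigned as response["Content-Disposition"] = content_disposition(name).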
