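Python: Advanced Series, Part 1: Commonly Used Third-Party Libraries

Web scraping

requests: fetches network resources over HTTP and is easier to use than the built-in urllib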

"""
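    A backend API built on the Django framework. It shows the handling logic the backend runs when a client uploads a file.
    To see how the front end submits a file, see: https://blog.csdn.net/wucong60/article/details/81289227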
"""

import io

import requests
from django.http import HttpRequest, JsonResponse

# Placeholder: the upstream endpoint is not given in the original post
UPLOAD_URL = ''

def post_cz_file(request: HttpRequest):
    file_obj = request.FILES.get('file')
    if file_obj is None:
        return JsonResponse({'error': 'no file uploaded'}, status=400)
    file_bytes = file_obj.read()  # for large files, iterate over file_obj.chunks() instead

    params = {
        'access_token': '',
        'file_id': '',
        'index': 0,
    }

    # Each entry is (filename, file-like object, content type)
    files = {
        'inputStream': ('file', io.BytesIO(file_bytes), 'image/png')
    }

    # Alternative: upload a file from local disk
    # files = {'inputStream': open('b.png', 'rb')}
    try:
        r = requests.post(UPLOAD_URL, params=params, files=files)
        if r.status_code != 200:
            print(r.text)
            raise Exception("Request failed, unknown error, code: %s" % r.status_code)
        result = r.json()
    except Exception as ex:
        print(ex)
        raise

    # JsonResponse expects a dict here; pass safe=False for other JSON types
    return JsonResponse(result)
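
The note about reading large files in chunks can be tried without Django. The sketch below uses only the standard library; `iter_chunks` and `copy_in_chunks` are hypothetical helper names that imitate what Django's `UploadedFile.chunks()` does, and `io.BytesIO` stands in for the uploaded file.

```python
import io


def iter_chunks(stream, chunk_size=64 * 1024):
    """Yield successive blocks from a binary file-like object."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def copy_in_chunks(src, dst, chunk_size=64 * 1024):
    """Copy src to dst one block at a time and return the byte count."""
    total = 0
    for chunk in iter_chunks(src, chunk_size):
        dst.write(chunk)
        total += len(chunk)
    return total


# Stand-in for an uploaded file of 150,000 bytes
source = io.BytesIO(b'x' * 150_000)
target = io.BytesIO()
copied = copy_in_chunks(source, target)
print(copied)
```

Only one block is held in memory at a time, which is the reason to prefer this over a single `read()` for big uploads.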

lxml: lets you use XPath to extract information from HTML elements
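
A minimal sketch of the XPath usage, assuming lxml is installed. The HTML snippet and the variable names are made up for illustration.

```python
from lxml import html

# Made-up page content for the example
PAGE = """
<html><body>
  <ul>
    <li><a href="/a">First</a></li>
    <li><a href="/b">Second</a></li>
  </ul>
</body></html>
"""

tree = html.fromstring(PAGE)

# XPath expressions return lists: attribute values and text nodes here
links = tree.xpath('//li/a/@href')
titles = tree.xpath('//li/a/text()')

print(links)
print(titles)
```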

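PIL: the Python Imaging Library, the de facto standard image-processing library on the Python platform. It is very powerful, yet its API is simple and easy to use. It is installed through its maintained fork, Pillow: pip install pillow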
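
To give a feel for the API, here is a small sketch, assuming Pillow is installed. It creates an image in memory instead of opening a file, so the sizes and colour are arbitrary example values.

```python
import io

from PIL import Image

# Create a 200x100 solid red image in memory (arbitrary example values)
img = Image.new('RGB', (200, 100), 'red')

# Resize to half the size; resize() returns a new image
thumb = img.resize((100, 50))

# Encode as PNG into a bytes buffer instead of writing to disk
buf = io.BytesIO()
thumb.save(buf, format='PNG')
png_bytes = buf.getvalue()

print(thumb.size, len(png_bytes))
```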

Library	Documentation
array	https://docs.python.org/2.7/library/array.html
cmath	https://docs.python.org/2.7/library/cmath.html
collections	https://docs.python.org/2.7/library/collections.html
copy	https://docs.python.org/2.7/library/copy.html
datetime	https://docs.python.org/2.7/library/datetime.html
dateutil	https://pypi.python.org/pypi/dateutils/0.6.6
functools	https://docs.python.org/2.7/library/functools.html
heapq	https://docs.python.org/2.7/library/heapq.html
itertools	https://docs.python.org/2.7/library/itertools.html
json	https://docs.python.org/2.7/library/json.html
math	https://docs.python.org/2.7/library/math.html
operator	https://docs.python.org/2.7/library/operator.html
pytz	https://pypi.python.org/pypi/pytz/2015.2
random	https://docs.python.org/2.7/library/random.html
re	https://docs.python.org/2.7/library/re.html
string	https://docs.python.org/2.7/library/string.html
time	https://docs.python.org/2.7/library/time.html
xml	https://docs.python.org/2.7/library/xml.html

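Library	Description	Documentation
ta-lib	TA-Lib is an open-source library for processing financial data and technical analysis	http://mrjbq7.github.io/ta-lib/
numpy	NumPy is an open-source numerical computing extension for Python. It provides many advanced numerical programming tools, such as matrix data types, vector processing, and precise computation routines, and is built for rigorous numerical work	http://www.numpy.org
scipy	SciPy is a convenient, easy-to-use Python toolkit designed for science and engineering. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more	http://www.scipy.org
pandas	pandas (Python Data Analysis Library) is a tool built on NumPy and created for data analysis tasks. It incorporates a number of libraries and standard data models, provides the tools needed to work efficiently with large datasets, and offers many functions and methods for processing data quickly and conveniently	http://pandas.pydata.org
anyjson	A wrapper that exposes the best available JSON implementation through a common interface	https://bitbucket.org/runeh/anyjson/src
graphviz	A graph-drawing tool that can render tree diagrams from DOT scripts	https://graphviz.gitlab.io/about/
lasagne	Python deep learning library	http://lasagne.readthedocs.org/en/latest/
seaborn	A statistical data visualization library	http://seaborn.pydata.org
requests	HTTP networking module	http://docs.python-requests.org
pycrypto	Python cryptography toolkit	https://www.dlitz.net/software/pycrypto/
beautifulsoup4	A popular Python package for parsing HTML and XML, widely used in scrapers	https://www.crummy.com/software/BeautifulSoup
xlrd	Extension for reading Excel files	https://xlrd.readthedocs.io/en/latest/
cvxopt	cvxopt is an optimization package for linear programming, quadratic programming, semidefinite programming, and similar problems	http://cvxopt.org/
gensim	gensim computes text similarity and depends on NumPy and SciPy, the two major Python scientific computing packages	http://radimrehurek.com/gensim/tutorial.html
matplotlib	matplotlib is probably the most widely used library for 2D plotting in Python. It makes it easy to visualize data and offers a variety of output formats	http://matplotlib.org/mpl_toolkits/index.html
statsmodels	statsmodels is a Python package that complements SciPy's statistical functionality, covering descriptive statistics and the estimation and inference of statistical models	http://statsmodels.sourceforge.net
theano	Python deep learning library	http://deeplearning.net/software/theano/
xlwt	Extension for writing Excel files	https://xlwt.readthedocs.io/en/latest/
openpyxl	A Python library for reading and writing Excel 2010 files	http://openpyxl.readthedocs.io/en/default/
quantLib-Python	A well-known financial computing library that makes it easy to evaluate many financial models and formulas	https://www.quantlib.org/
mysql-connector-python	The official MySQL driver	https://dev.mysql.com/doc/dev/connector-python/8.0/
wxpy	Automates some WeChat functions	https://github.com/youfou/wxpy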

https://www.cnblogs.com/welhzh/p/5972107.html

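numpy: the core data structure is the ndarray, plus common statistical functions

pandas: two-dimensional, report-style data management built around index, columns, and values

scipy: common scientific computing routines such as Fourier transforms and optimization algorithms

matplotlib: the basic data visualization package, providing basic plotting functions

seaborn: a higher-level data visualization package, providing statistical analysis functions and plotting methods.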
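
The ndarray and the index/columns/values layout mentioned above can be seen in a few lines. This is a small sketch assuming numpy and pandas are installed; the numbers, row labels, and column names are invented for the example.

```python
import numpy as np
import pandas as pd

# A 3x2 ndarray: three rows of observations, two columns (invented data)
prices = np.array([[10.0, 20.0],
                   [11.0, 19.0],
                   [12.0, 21.0]])

# Common statistics, computed per column
col_means = prices.mean(axis=0)
col_max = prices.max(axis=0)

# The same data as a DataFrame: row labels (index), column labels, values
df = pd.DataFrame(prices, index=['d1', 'd2', 'd3'], columns=['AAA', 'BBB'])

print(col_means, col_max)
print(df.index.tolist(), df.columns.tolist())
print(df.loc['d2', 'BBB'])  # label-based lookup of a single cell
```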

https://quant.pobo.net.cn/doc?name=api#%E9%99%84%E5%BD%95%E4%BA%8C-%E6%94%AF%E6%8C%81%E7%9A%84python%E7%AC%AC%E4%B8%89%E6%96%B9%E5%BA%93

Building web applications

Django: full-featured; this is the recommended choice

Flask: suited to small projects; an API can be up and running in 6-7 lines of code.
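
As a rough illustration of that line count, here is a minimal sketch assuming Flask is installed. The `/ping` route and its response body are made up for the example.

```python
from flask import Flask, jsonify

app = Flask(__name__)


@app.route('/ping')
def ping():
    # Hypothetical endpoint: returns a small JSON document
    return jsonify(message='pong')


if __name__ == '__main__':
    # Starts the development server; not meant for production use
    app.run()
```

Flask's built-in test client (`app.test_client()`) can call the route without starting a server, which is handy for quick checks.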

Scheduled jobs

apscheduler

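from apscheduler.schedulers.background import BackgroundScheduler

# corn_service and china_jci_service are the author's own modules (not shown here)
scheduler = BackgroundScheduler()

# Run every day at 06:00
scheduler.add_job(corn_service.execute, 'cron', hour=6, minute=0)

# Evidence-storage job: run every 5 minutes
scheduler.add_job(china_jci_service.execute, 'interval', seconds=300)

# Jobs only fire once the scheduler has been started
scheduler.start()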

Hashing (md5, sha1, etc.)

hashlib

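import hashlib

# MD5 digest (hashlib needs bytes in Python 3, so encode the string first)
sign = hashlib.md5('something'.encode('utf-8')).hexdigest()


# SHA-1 digest of raw bytes (file_bytes is file content read earlier, e.g. an upload)
sha1 = hashlib.sha1(file_bytes)
hash_value = sha1.hexdigest()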
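
For large files it is usually better to feed the hash incrementally than to hold all the bytes in memory. The following is a sketch using only the standard library; `md5_hex` and `sha1_hex_stream` are hypothetical helper names, and `io.BytesIO` stands in for a real file.

```python
import hashlib
import io


def md5_hex(text):
    """Return the MD5 hex digest of a string, encoded as UTF-8."""
    return hashlib.md5(text.encode('utf-8')).hexdigest()


def sha1_hex_stream(stream, chunk_size=64 * 1024):
    """Return the SHA-1 hex digest of a binary stream, read block by block."""
    digest = hashlib.sha1()
    for chunk in iter(lambda: stream.read(chunk_size), b''):
        digest.update(chunk)
    return digest.hexdigest()


data = b'abc' * 1000
streamed = sha1_hex_stream(io.BytesIO(data), chunk_size=256)

print(md5_hex('something'))
print(streamed == hashlib.sha1(data).hexdigest())
```

Both MD5 and SHA-1 are one-way digests and are no longer considered collision-resistant, so they fit checksums and request signatures better than anything security-critical.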

Deep copy vs. shallow copy

import copy
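
The difference shows up as soon as the copied object contains something mutable. A short sketch with invented data:

```python
import copy

original = {'name': 'config', 'tags': ['a', 'b']}

shallow = copy.copy(original)    # new dict, but the inner list is shared
deep = copy.deepcopy(original)   # new dict and a new inner list

# Mutate the nested list through the original
original['tags'].append('c')

print(shallow['tags'])  # reflects the change, because the list is shared
print(deep['tags'])     # unaffected, because the list was copied
```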

Database ORM framework: SQLAlchemy
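
A minimal sketch of what the ORM looks like, assuming SQLAlchemy 1.4 or newer is installed. The `User` model, its table name, and the sample rows are invented for the example, and an in-memory SQLite database is used so nothing is written to disk.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    """Hypothetical model mapped to a 'users' table."""
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    name = Column(String(50), nullable=False)


# In-memory SQLite database, created fresh on every run
engine = create_engine('sqlite:///:memory:')
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([User(name='alice'), User(name='bob')])
    session.commit()

    # Query through the model class instead of writing SQL by hand
    names = [u.name for u in session.query(User).order_by(User.name)]

print(names)
```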

wucong60
