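1. Some common operations
import PyPDF2
# These are the legacy camelCase names from PyPDF2 1.x; PyPDF2 3.x replaces them (PdfReader, reader.pages, ...)
with open('1.pdf', 'rb') as f:
    reader = PyPDF2.PdfFileReader(f)
    print(reader.getNumPages())  # total number of pages in the PDF
    print(reader.isEncrypted)  # whether the PDF is encrypted
    page = reader.getPage(1)  # get the second page (page indices start at 0)
    print(page.extractText())  # print the text of that page
    print(reader.getDocumentInfo())  # PDF metadata: creation date, author, title, etc.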
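Because getPage takes a zero-based index, getPage(1) returns the second page of the document, and the fourth page is getPage(3). A minimal sketch of that conversion follows; `page_index` is a hypothetical helper name used only for illustration, not part of PyPDF2, and it does nothing beyond the arithmetic and a bounds check.

```python
def page_index(page_number: int, num_pages: int) -> int:
    """Convert a 1-based page number to the 0-based index getPage expects.

    Hypothetical helper for illustration; not part of PyPDF2.
    """
    if not 1 <= page_number <= num_pages:
        raise ValueError(f"page {page_number} is outside 1..{num_pages}")
    return page_number - 1


# The fourth page of a 10-page document lives at index 3.
print(page_index(4, 10))
```

The result could then be passed along as reader.getPage(page_index(4, reader.getNumPages())).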
2. Splitting a document
from PyPDF2 import PdfFileReader, PdfFileWriter
pdf_reader = PdfFileReader("3.pdf")
# Write each page of 3.pdf to its own single-page file
for page in range(pdf_reader.getNumPages()):
    pdf_writer = PdfFileWriter()  # a fresh writer per page, so each output holds exactly one page
    pdf_writer.addPage(pdf_reader.getPage(page))
    with open(f'fengechuli {page} .pdf', "wb") as out:
        pdf_writer.write(out)
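The split loop names its outputs with the f-string 'fengechuli {page} .pdf', so the spaces around the page index are part of each file name, and numbering starts at 0. The sketch below only lists the names that loop would produce for a given page count; `split_output_names` is a hypothetical helper, and no PDF files are read or written.

```python
def split_output_names(num_pages: int) -> list:
    """List the file names the split loop writes, one per page index.

    Hypothetical helper for illustration; it mirrors the f-string used
    in the split step and touches no files.
    """
    return [f"fengechuli {page} .pdf" for page in range(num_pages)]


for name in split_output_names(3):
    print(repr(name))
```

Any later step that reads these files back has to use exactly the same pattern, spaces included.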
3. Merging documents
from PyPDF2 import PdfFileReader, PdfFileWriter
pdf_writer = PdfFileWriter()
# Merge the first two single-page files produced by the split step
for i in range(2):
    pdf_reader = PdfFileReader(f'fengechuli {i} .pdf')
    for page in range(pdf_reader.getNumPages()):
        pdf_writer.addPage(pdf_reader.getPage(page))
# Write the merged result once, after all pages have been added
with open("4.pdf", "wb") as out:
    pdf_writer.write(out)
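The merge loop above hard-codes two input files. If every file from the split step were merged instead, a plain alphabetical sort of the names would place 'fengechuli 10 .pdf' before 'fengechuli 2 .pdf', so the page number has to be compared as an integer. A minimal sketch, assuming the names follow the split pattern exactly; `page_number_key` is a hypothetical helper and the sample names are made up for the demonstration.

```python
import re


def page_number_key(filename: str) -> int:
    """Extract the page index embedded in a 'fengechuli N .pdf' file name.

    Hypothetical helper for illustration; assumes the naming pattern
    used by the split step.
    """
    match = re.fullmatch(r"fengechuli (\d+) \.pdf", filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return int(match.group(1))


# Sample names, deliberately out of order.
names = ["fengechuli 10 .pdf", "fengechuli 2 .pdf", "fengechuli 0 .pdf"]
ordered = sorted(names, key=page_number_key)
print(ordered)
```

The ordered list could then replace range(2) as the sequence of files fed to the writer.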
4. Rotating pages
from PyPDF2 import PdfFileReader, PdfFileWriter
pdf_reader = PdfFileReader("4.pdf")
pdf_writer = PdfFileWriter()
page = pdf_reader.getPage(0).rotateClockwise(90)  # first page, 90 degrees clockwise
pdf_writer.addPage(page)
page = pdf_reader.getPage(1).rotateCounterClockwise(270)  # second page, 270 degrees counterclockwise
pdf_writer.addPage(page)
with open("5.pdf", "wb") as out:
    pdf_writer.write(out)
rotateClockwise(90): rotate the page 90 degrees clockwise
rotateCounterClockwise(270): rotate the page 270 degrees counterclockwise
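A 270 degree counterclockwise turn leaves a page in the same orientation as a 90 degree clockwise turn, so both pages in the example above end up rotated the same way. The sketch below shows the arithmetic only; `clockwise_equivalent` is a hypothetical helper, not a PyPDF2 function, and it assumes angles are multiples of 90 as PDF page rotation requires.

```python
def clockwise_equivalent(angle: int, counterclockwise: bool = False) -> int:
    """Return the clockwise rotation in [0, 360) equivalent to the given turn.

    Hypothetical helper for illustration; not part of PyPDF2.
    """
    if angle % 90 != 0:
        raise ValueError("rotation angle must be a multiple of 90")
    # A counterclockwise turn is a negative clockwise turn.
    signed = -angle if counterclockwise else angle
    # Python's % always returns a non-negative result for a positive modulus.
    return signed % 360


print(clockwise_equivalent(90))                          # 90
print(clockwise_equivalent(270, counterclockwise=True))  # 90
```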
5. Reordering pages
For example, reversing the page order:
from PyPDF2 import PdfFileReader, PdfFileWriter
pdf_reader = PdfFileReader("4.pdf")  # the file whose pages will be reordered
pdf_writer = PdfFileWriter()
# Walk the page indices from last to first
for page in range(pdf_reader.getNumPages() - 1, -1, -1):
    pdf_writer.addPage(pdf_reader.getPage(page))
with open("5.pdf", "wb") as out:
    pdf_writer.write(out)
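Reordering comes down to choosing the sequence of page indices passed to getPage. The sketch below reproduces the reversed sequence from the loop above and adds a check that a custom order uses every page exactly once; both function names are hypothetical and no PDF is involved.

```python
def reversed_order(num_pages: int) -> list:
    """Page indices from last to first, as produced by range(n - 1, -1, -1)."""
    return list(range(num_pages - 1, -1, -1))


def is_valid_order(order: list, num_pages: int) -> bool:
    """True if `order` uses every index 0..num_pages-1 exactly once."""
    return sorted(order) == list(range(num_pages))


order = reversed_order(4)
print(order)                         # [3, 2, 1, 0]
print(is_valid_order(order, 4))      # True
print(is_valid_order([0, 0, 1], 3))  # False: page 0 repeated, page 2 missing
```

Any list that passes the check could be iterated in place of the range in the loop above to produce a different page order.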
More posts on Excel, PPT, web scraping, artificial intelligence, and related topics will follow. Stay tuned.