8. IO

Reading and writing files
Character encoding
StringIO
BytesIO

1. Reading files

def readfile(filepath):
    return open(filepath, 'r')


print(readfile('test8_1.py').read())
Output:
def readfile(filepath):
    return open(filepath, 'r')


print(readfile('test8_1.py').read())

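The open() call inside readfile opens a file: the first argument is the file path and the second, 'r', means read mode. read() returns the file's contents as a single string. Once you are done with a file you should close it with close() to release its resources, because an open file object holds operating-system resources and the number of files the OS allows to be open at the same time is limited. Note that the example above never calls close() itself.
If the path passed in does not exist, open() raises FileNotFoundError (a subclass of OSError):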

Traceback (most recent call last):
  File "E:/python/project/test8/test8_1.py", line 5, in <module>
    print(readfile('test8_11.py').read())
  File "E:/python/project/test8/test8_1.py", line 2, in readfile
    return open(filepath, 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'test8_11.py'

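If an exception is raised while the file is in use, a close() call placed after that point is never reached. Use try...finally to make sure close() always runs: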

def readfile(filepath):
    return open(filepath, 'r')

t = None
try:
    t = readfile('test8_1.py')

    print(t.read())
finally:
    if t:
        t.close()
        print('close')
Output:
def readfile(filepath):
    return open(filepath, 'r')

t = None
try:
    t = readfile('test8_1.py')

    print(t.read())
finally:
    if t:
        t.close()
        print('close')

close

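This is essentially the same pattern as in Java, but in Python the with statement can replace the explicit cleanup in finally; the file is closed automatically when the block exits, even if an exception is raised: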

with open('test8_1.py', 'r') as f:
    print(f.read())
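
To check that with really closes the file, one can inspect the file object's closed attribute inside and after the block. The sketch below is only an illustration; the temporary file and the variable names are assumptions, chosen so it runs without test8_1.py.

```python
import os
import tempfile

# Create a small throwaway file so the sketch is self-contained.
fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
with open(path, 'w', encoding='utf-8') as f:
    f.write('hello')

with open(path, 'r', encoding='utf-8') as f:
    content = f.read()
    closed_inside = f.closed  # still open inside the block

closed_after = f.closed       # closed automatically when the block exits

os.remove(path)
print(content, closed_inside, closed_after)  # hello False True
```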

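There are several ways to read a file; they differ as follows:
read(): reads the entire file at once; suitable when the file is small;
read(size): reads at most size characters per call (bytes in binary mode); suitable when the file size is unknown and the file may be large;
readline(): reads one line per call;
readlines(): reads everything at once and returns a list of lines; handy for small line-oriented files such as configuration files.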

with open('test8_1.py', 'r') as f:
    for l in f.readlines():
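        print(l.strip())  # strip leading/trailing whitespace, including the trailing '\n'

Output:
with open('test8_1.py', 'r') as f:
for l in f.readlines():
print(l.strip())  # strip leading/trailing whitespace, including the trailing '\n'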
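
The read(size) variant from the list above can be sketched as follows. This is a minimal illustration, not part of the original notes: the helper name read_in_chunks, the chunk size of 4, and the temporary sample file are all assumptions made so the snippet runs on its own. It also shows that a file object can be iterated line by line without calling readlines().

```python
import os
import tempfile


def read_in_chunks(path, size=4):
    """Yield the file's text in pieces of at most `size` characters (hypothetical helper)."""
    with open(path, 'r', encoding='utf-8') as f:
        while True:
            chunk = f.read(size)
            if chunk == '':  # an empty string signals end of file
                break
            yield chunk


# Create a small throwaway file so the sketch does not depend on test8_1.py.
fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
with open(path, 'w', encoding='utf-8') as f:
    f.write('line one\nline two\n')

chunks = list(read_in_chunks(path, size=4))

# Iterating over the file object yields one line at a time.
with open(path, 'r', encoding='utf-8') as f:
    lines = [line.rstrip('\n') for line in f]

os.remove(path)
print(chunks)
print(lines)
```

Iterating over the file object reads lazily, so for large files it is usually a lighter choice than readlines(), which builds the whole list in memory.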

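The examples above all read text files. When no encoding argument is given, open() uses a platform-dependent default encoding, which is not necessarily UTF-8. To read binary files such as images or videos, open them in 'rb' mode: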

f = open('test.jpg', 'rb')
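
In 'rb' mode, read() returns bytes rather than str. A minimal sketch, assuming a small temporary file filled with the 8-byte PNG signature as sample data (the file and the variable names are made up for illustration and stand in for test.jpg):

```python
import os
import tempfile

# Write some sample binary data to a throwaway file.
fd, path = tempfile.mkstemp(suffix='.bin')
os.close(fd)
with open(path, 'wb') as f:
    f.write(b'\x89PNG\r\n\x1a\n')  # the PNG file signature, used only as sample bytes

# Read the first 4 bytes back in binary mode.
with open(path, 'rb') as f:
    header = f.read(4)

os.remove(path)
print(type(header).__name__, header)  # bytes b'\x89PNG'
```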

2. Writing files

with open('testwirte.txt', 'w') as f:
    f.write('Hi,python!')

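Pass 'w' or 'wb' to open() to write a text file or a binary file, respectively.
Without a with statement you have to call f.close() yourself. Writes are buffered, so the data is only guaranteed to have been written out completely once close() has been called; until then part of it may still sit in memory and can be lost. Using with everywhere is both more convenient and safer.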
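
The two write modes can be compared side by side. The sketch below is an illustration under assumptions: it uses a temporary file instead of a fixed name, and the variable names are invented. It also shows that 'w' truncates an existing file before writing.

```python
import os
import tempfile

fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)

# 'w' opens for text writing and truncates any existing content.
with open(path, 'w', encoding='utf-8') as f:
    f.write('first version')
with open(path, 'w', encoding='utf-8') as f:
    f.write('Hi,python!')
with open(path, 'r', encoding='utf-8') as f:
    text = f.read()

# 'wb' expects bytes, so the string is encoded explicitly.
with open(path, 'wb') as f:
    f.write('Hi,python!'.encode('utf-8'))
with open(path, 'rb') as f:
    raw = f.read()

os.remove(path)
print(text)  # Hi,python!
print(raw)   # b'Hi,python!'
```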

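3. Character encoding
To read text that is not in the default encoding, pass the encoding argument to open(). If the encoding given does not match the file's actual encoding, a UnicodeDecodeError may be raised. For example, reading a GBK-encoded file (named iso8859-1.txt here) as UTF-8: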

with open('iso8859-1.txt', 'r', encoding='utf-8') as f:
    print(f.read())
Output:
Traceback (most recent call last):
  File "E:/python/project/test8/test8_3.py", line 2, in <module>
    print(f.read())
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python35\lib\codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb2 in position 0: invalid start byte

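The same exception is raised when the file contains bytes that are invalid in the chosen encoding. You can pass the errors argument to ignore such bytes and avoid the exception: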

with open('iso8859-1.txt', 'r', encoding='utf-8', errors='ignore') as f:
    print(f.read())
Output:
Եļ

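However, because the encoding still does not match, what comes back is garbage. The real fix is to read and write with the matching encoding: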

with open('iso8859-1.txt', 'r', encoding='gbk', errors='ignore') as f:
    print(f.read())
Output:
测试的文件
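
The whole round trip can be reproduced without the original iso8859-1.txt. The sketch below is an illustration under assumptions: it writes the same sample text to a temporary file as GBK, reads it back with the matching encoding, and then shows the mismatch failing.

```python
import os
import tempfile

fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)

# Write the sample text using GBK.
with open(path, 'w', encoding='gbk') as f:
    f.write('测试的文件')

# Reading with the matching encoding returns the original text.
with open(path, 'r', encoding='gbk') as f:
    text = f.read()

# Reading the same bytes as UTF-8 fails, because 0xb2 is not a valid start byte.
try:
    with open(path, 'r', encoding='utf-8') as f:
        f.read()
    failed = False
except UnicodeDecodeError:
    failed = True

os.remove(path)
print(text, failed)  # 测试的文件 True
```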

4. StringIO
StringIO reads and writes strings in memory, using the same interface as a file.

from io import StringIO

f = StringIO()
f.write('hi')
f.write(' ')
f.write('python')
print(f.getvalue())  # get everything written so far
Output:
hi python

To read from a StringIO, you can pass the content in when creating it and then read it like a file:

from io import StringIO

f = StringIO('a!\nb!\nc!')
while True:
    s = f.readline()
    if s == '':
        break
    print(s.strip())

Output:
a!
b!
c!
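
One detail worth knowing when mixing the two styles: after write(), the stream position sits at the end, so a following read() returns an empty string until you seek(0). A minimal sketch (the variable names are assumptions for illustration):

```python
from io import StringIO

f = StringIO()
f.write('a!\nb!\nc!')

before = f.read()  # position is at the end after writing, so nothing is read
f.seek(0)          # rewind to the start of the buffer

# A StringIO can be iterated line by line, just like a file object.
lines = [line.strip() for line in f]

print(repr(before))  # ''
print(lines)         # ['a!', 'b!', 'c!']
```

getvalue() is unaffected by the position and always returns the full buffer.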

5. BytesIO
StringIO works with strings (str), while BytesIO works with binary data (bytes):

from io import BytesIO

f1 = BytesIO()
f1.write('中国'.encode('UTF-8'))
print(f1.getvalue())
Output:
b'\xe4\xb8\xad\xe5\x9b\xbd'
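
As with StringIO, a BytesIO can be created from existing data and then read like a file. The sketch below is only an illustration (the variable names are assumptions); it reads the bytes shown above back in two steps and decodes them to recover the original string.

```python
from io import BytesIO

# Initialise the buffer with the UTF-8 bytes of '中国'.
f = BytesIO(b'\xe4\xb8\xad\xe5\x9b\xbd')

first = f.read(3)  # the first 3 bytes encode one character in UTF-8
rest = f.read()    # everything that remains

text = (first + rest).decode('utf-8')
print(first.decode('utf-8'), text)  # 中 中国
```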
