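Two approaches
- load: reads the whole file in one call. The JSON objects must be separated by "," and the file must start with "[" and end with "]", i.e. it is a single JSON array.
- loads: called inside a for loop to parse the file one line at a time. Use it when each line holds one JSON object and there is no "," between them.
The code for each approach follows.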
import json
import pandas as pd
loads
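records = []
with open(r'../weibo/weibo-users.json', 'r', encoding='utf-8') as f:
    for line in f:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        records.append(json.loads(line))  # one JSON object per line
# DataFrame.append was removed in pandas 2.0, so build the frame once at the end
df = pd.DataFrame(records)
print(df.head())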
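As an alternative to the explicit loop, pandas can read line-delimited JSON directly with read_json and lines=True. The sketch below uses a small in-memory sample (the field names and values are made up for illustration, not taken from the weibo file) so it runs without any file on disk.

```python
import io
import pandas as pd

# Assumption: two made-up records, one JSON object per line, no commas between them
sample_lines = '{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n'

# lines=True tells pandas to treat every line as a separate JSON object
df_lines = pd.read_json(io.StringIO(sample_lines), lines=True)
print(df_lines)
```

For a real file, the path can be passed in place of the StringIO object.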
load
with open(r'../data/Tweets.json', 'r', encoding='utf-8') as f:
    data = json.load(f)  # parse the whole file as one JSON document
df = pd.DataFrame(data)
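When it is not known in advance which of the two layouts a file uses, one option is to look at the first non-blank character. The helper below, records_from_text, is a hypothetical name introduced here for illustration, and the sample strings are invented. It is a minimal sketch that assumes the input is either a JSON array or one object per line.

```python
import json
import pandas as pd

def records_from_text(text):
    """Parse a JSON array, or one JSON object per line, into a list of dicts."""
    stripped = text.lstrip()
    if stripped.startswith("["):
        # Whole text is a single JSON array: the "load" layout
        return json.loads(stripped)
    # Otherwise parse each non-blank line separately: the "loads" layout
    return [json.loads(line) for line in stripped.splitlines() if line.strip()]

# Assumption: made-up sample data in both layouts
array_text = '[{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]'
lines_text = '{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n'

df_array = pd.DataFrame(records_from_text(array_text))
df_lines = pd.DataFrame(records_from_text(lines_text))
print(df_array.equals(df_lines))
```

Both layouts end up as the same list of dicts, so the resulting DataFrames match.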
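If the JSON has the structure below, passing the parsed result straight to pd.DataFrame raises a ValueError, because the top level mixes a dict ("status") with a list ("result"):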
{
    "status": {
        "statuscode": 200,
        "statusmessage": "Everything OK"
    },
    "result": [{
        "id": 22,
        "club_id": 16182
    }, {
        "id": 23,
        "club_id": 16182
    }, {
        "id": 24,
        "club_id": 16182
    }, {
        "id": 25,
        "club_id": 16182
    }, {
        "id": 26,
        "club_id": 16182
    }, {
        "id": 27,
        "club_id": 16182
    }]
}
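Solution:
import json
import pandas as pd
with open('json_file.json', 'r', encoding='utf-8') as f:
    data = json.load(f)
# Build the DataFrame only from the list stored under "result"
df = pd.DataFrame(data["result"])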
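If a field from the "status" block is also wanted as a column, pandas.json_normalize can flatten the nested structure in one call. The sketch below parses a shortened copy of the JSON above from a string, so it runs without json_file.json. Keeping statuscode as a column is an illustrative choice, not something the original solution does.

```python
import json
import pandas as pd

# Shortened copy of the nested JSON shown above (two records instead of six)
raw = '''
{
    "status": {"statuscode": 200, "statusmessage": "Everything OK"},
    "result": [{"id": 22, "club_id": 16182}, {"id": 23, "club_id": 16182}]
}
'''
data = json.loads(raw)

# Same idea as the solution above: use only the list under "result"
df = pd.DataFrame(data["result"])

# Alternative: flatten "result" and carry status.statuscode along on each row
flat = pd.json_normalize(data, record_path="result", meta=[["status", "statuscode"]])
print(flat)
```

Here record_path selects the list to expand into rows, and meta names the nested value to repeat on every row.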
Reference: ValueError