names : array-like, default None
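List of column names to use for the result. If the data file has no header row, pass header=None explicitly. By default, duplicates are not allowed in this list unless mangle_dupe_cols=True is set. (mangle_dupe_cols was deprecated in pandas 1.5 and removed in 2.0, so this exception only applies to older versions.)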
The examples below read a file, train.csv, whose first rows look like this:
Age | Gender | Education | EducationField | MaritalStatus | Income | OverTime |
37 | Male | 4 | Life Sciences | Divorced | 5993 | No |
54 | Female | 4 | Life Sciences | Divorced | 10502 | No |
34 | Male | 3 | Life Sciences | Single | 6074 | Yes |
39 | Female | 1 | Life Sciences | Married | 12742 | No |
28 | Male | 3 | Medical | Divorced | 2596 | No |
24 | Female | 1 | Medical | Married | 4162 | Yes |
29 | Male | 5 | Other | Single | 3983 | No |
36 | Male | 2 | Medical | Married | 7596 | No |
33 | Female | 4 | Medical | Married | 2622 | No |
import pandas as pd
1.1 names=['new_0','new_1','new_2','new_3','new_4','new_5','new_6'], header not specified
data = pd.read_csv('./train.csv',
names=['new_0','new_1','new_2','new_3','new_4','new_5','new_6']
)
print(data.head(5))
Output:
new_0 new_1 new_2 new_3 new_4 new_5 new_6
0 Age Gender Education EducationField MaritalStatus Income OverTime
1 37 Male 4 Life Sciences Divorced 5993 No
2 54 Female 4 Life Sciences Divorced 10502 No
3 34 Male 3 Life Sciences Single 6074 Yes
4 39 Female 1 Life Sciences Married 12742 No
1.2 header=None, names=['new_0','new_1','new_2','new_3','new_4','new_5','new_6']
data = pd.read_csv('./train.csv',
header=None,
names=['new_0','new_1','new_2','new_3','new_4','new_5','new_6']
)
print(data.head(5))
Output:
new_0 new_1 new_2 new_3 new_4 new_5 new_6
0 Age Gender Education EducationField MaritalStatus Income OverTime
1 37 Male 4 Life Sciences Divorced 5993 No
2 54 Female 4 Life Sciences Divorced 10502 No
3 34 Male 3 Life Sciences Single 6074 Yes
4 39 Female 1 Life Sciences Married 12742 No
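The calls in 1.1 and 1.2 return the same result because, when names is passed and header is left unset, pandas behaves as if header=None. The original header line is therefore kept as data row 0. The sketch below illustrates this, and also shows header=0 as one way to discard the original header line. It uses a small in-memory CSV as a stand-in for train.csv; the comma separator and the variable names (csv_text, new_names, and so on) are assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for the first lines of train.csv
# (assumes the real file is comma-separated).
csv_text = (
    "Age,Gender,Education,EducationField,MaritalStatus,Income,OverTime\n"
    "37,Male,4,Life Sciences,Divorced,5993,No\n"
    "54,Female,4,Life Sciences,Divorced,10502,No\n"
    "34,Male,3,Life Sciences,Single,6074,Yes\n"
)
new_names = [f"new_{i}" for i in range(7)]

# names only: header is treated as None, so the original header line is data row 0.
only_names = pd.read_csv(io.StringIO(csv_text), names=new_names)

# Explicit header=None: same behaviour as above.
explicit_none = pd.read_csv(io.StringIO(csv_text), header=None, names=new_names)

# header=0: line 0 is consumed as the header and its labels are replaced by new_names.
replaced = pd.read_csv(io.StringIO(csv_text), header=0, names=new_names)

print(only_names.equals(explicit_none))
print(only_names.shape, replaced.shape)
print(replaced.head())
```

With header=0 the numeric columns also keep a numeric dtype, since the text labels from the original header line no longer end up among the values.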
1.3 header=2, names=['new_0','new_1','new_2','new_3','new_4','new_5','new_6']
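With header=2, line 2 of the file (counting from 0) is used as the column-name row and the DataFrame starts from line 3, but the names given in names override the column names taken from line 2.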
data = pd.read_csv('./train.csv',
header=2,
names=['new_0','new_1','new_2','new_3','new_4','new_5','new_6']
)
print(data.head(5))
Output:
new_0 new_1 new_2 new_3 new_4 new_5 new_6
0 34 Male 3 Life Sciences Single 6074 Yes
1 39 Female 1 Life Sciences Married 12742 No
2 28 Male 3 Medical Divorced 2596 No
3 24 Female 1 Medical Married 4162 Yes
4 29 Male 5 Other Single 3983 No
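To make the header=2 case easier to check, here is a minimal sketch on an in-memory CSV (a hypothetical stand-in for train.csv, assumed to be comma-separated). It compares the column labels with and without names, and shows that skipping the first three lines with skiprows=3 and header=None is one equivalent way to get the same frame. The variable names are illustrative only.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for the first lines of train.csv.
csv_text = (
    "Age,Gender,Education,EducationField,MaritalStatus,Income,OverTime\n"
    "37,Male,4,Life Sciences,Divorced,5993,No\n"
    "54,Female,4,Life Sciences,Divorced,10502,No\n"
    "34,Male,3,Life Sciences,Single,6074,Yes\n"
    "39,Female,1,Life Sciences,Married,12742,No\n"
)
new_names = [f"new_{i}" for i in range(7)]

# header=2 alone: line 2 (0-based) supplies the column labels, as strings.
header_only = pd.read_csv(io.StringIO(csv_text), header=2)

# header=2 plus names: line 2 is still consumed, but its labels are overridden.
with_names = pd.read_csv(io.StringIO(csv_text), header=2, names=new_names)

# Alternative: skip lines 0-2 outright and treat the rest as header-less data.
with_skip = pd.read_csv(
    io.StringIO(csv_text), skiprows=3, header=None, names=new_names
)

print(list(header_only.columns))
print(with_names)
print(with_names.equals(with_skip))
```

In both variants, lines 0 to 2 of the file never appear in the data, so the first row of the result is the one starting with 34.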