Copyright notice: this article is an original work by the author; for questions about the relevant versions, contact the author at [email protected]. https://blog.csdn.net/weixin_42600072/article/details/88807440
import xgboost
from numpy import loadtxt
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
Load the dataset: Pima Indians diabetes patients
dataset = loadtxt('./data/pima-indians-diabetes.csv', delimiter=',')
# Split into features (X) and labels (Y)
X = dataset[:,0:8]
Y = dataset[:,8]
Split into training and test sets
seed = 6 # fix the random seed so every split is reproducible
test_size = 0.2 # fraction of the data held out as the test set
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=test_size, random_state=seed)
Train the XGBoost model
model = XGBClassifier()
Note: running XGBClassifier().fit() kills the kernel outright; the cause is unknown for now. Perhaps the machine is a bit old and runs out of memory?
model.fit(X_train, y_train)
Make predictions and report the accuracy
y_pred = model.predict(X_test)
predictions = [round(value) for value in y_pred] # XGBClassifier.predict already returns class labels, so the rounding is just a safeguard
accuracy = accuracy_score(y_test, predictions)
print('Accuracy: %.2f%%' % (accuracy * 100.0))
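For readers who want to see what the last two lines compute, here is a plain-Python equivalent of the rounding and accuracy steps. The probability and label values below are made up purely for illustration; they are not output from the model above:

```python
# Hypothetical predicted probabilities and true labels (made-up values)
y_pred = [0.12, 0.87, 0.55, 0.33, 0.91]
y_test = [0, 1, 1, 0, 0]

# Round each probability to the nearest class label (0 or 1)
predictions = [round(p) for p in y_pred]

# Accuracy = fraction of predictions that match the true labels
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print('Accuracy: %.2f%%' % (accuracy * 100.0))  # prints "Accuracy: 80.00%"
```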
---------------------------------------------------------------------------
XGBoostError                              Traceback (most recent call last)
<ipython-input-6-a4a337ba7f6f> in <module>
----> 1 y_pred = model.predict(X_test)
      2 predictions = [round(value) for value in y_pred]
      3 accuracy = accuracy_score(y_test, predictions)
      4 print('Accuracy: %.2f%%' % (accuracy * 100.0))

C:\ProgramData\Anaconda3\lib\site-packages\xgboost\sklearn.py in predict(self, data, output_margin, ntree_limit, validate_features)
    767         if ntree_limit is None:
    768             ntree_limit = getattr(self, "best_ntree_limit", 0)
--> 769         class_probs = self.get_booster().predict(test_dmatrix,
    770                                                  output_margin=output_margin,
    771                                                  ntree_limit=ntree_limit,

C:\ProgramData\Anaconda3\lib\site-packages\xgboost\sklearn.py in get_booster(self)
    185         """
    186         if self._Booster is None:
--> 187             raise XGBoostError('need to call fit or load_model beforehand')
    188         return self._Booster
    189

XGBoostError: need to call fit or load_model beforehand
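The error itself is easy to explain: because fit() crashed the kernel before it could finish, the internal booster was never created, so predict() refuses to run. The guard pattern that produces this message can be sketched in plain Python; the class and names below are illustrative, not xgboost's actual implementation:

```python
class TinyModel:
    """Illustrative model with the same fit-before-predict guard."""

    def __init__(self):
        self._booster = None  # populated only after a successful fit()

    def fit(self, X, y):
        # Stand-in for real training: remember the majority class
        self._booster = round(sum(y) / len(y))
        return self

    def predict(self, X):
        if self._booster is None:
            # Same situation as the traceback above: no trained state yet
            raise RuntimeError('need to call fit or load_model beforehand')
        return [self._booster for _ in X]

model = TinyModel()
try:
    model.predict([[1, 2]])  # called before fit(), so the guard fires
except RuntimeError as e:
    print(e)  # prints "need to call fit or load_model beforehand"

model.fit([[1], [2], [3]], [0, 1, 1])
print(model.predict([[4]]))  # prints "[1]"
```

So the fix here is not in predict() at all: the fit() step has to complete successfully (or a saved model has to be loaded) before prediction is possible.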