Python 3.6 + PyCharm

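Everyone says the dataset is too large to fetch on the fly, so you have to download it first. Extracting it really did take me a long time.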

The download links and everything else are all in this respected blogger's post: https://blog.csdn.net/zcf1784266476/article/details/70821417

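I don't know whether the big pile of data preprocessing code at the start was provided by the course or written by the blogger, but it is impressive. (It also looks like a lot of trouble. I barely know anything myself: I can just about read it, but I would have no idea how to change it.)

The blogger's post seems to be missing a piece at the beginning, though, namely the data extraction step. I filled it in by following https://blog.csdn.net/u013698770/article/details/54645326: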

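import os

num_classes = 10  # notMNIST has ten letter classes, A through J

def maybe_extract(filename, force=False):
    # Strip up to two extensions (e.g. ".tar.gz") to get the extracted folder name.
    root = os.path.splitext(os.path.splitext(filename)[0])[0]
    # Each class is a subdirectory of root; `force` is unused because the
    # archive is assumed to be extracted already.
    data_folders = [os.path.join(root, d) for d in sorted(os.listdir(root))
                    if os.path.isdir(os.path.join(root, d))]
    if len(data_folders) != num_classes:
        raise Exception('Expected %d folders, one per class. Found %d instead.' % (num_classes, len(data_folders)))
    return data_folders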


# The archives are already extracted, so pass the folder paths directly.
train_filename = 'E:\\pycharm\\notMNIST_large'
train_folders = maybe_extract(train_filename)
test_filename = 'E:\\pycharm\\notMNIST_small'
test_folders = maybe_extract(test_filename)
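
The folder check in maybe_extract can be tried out without the real dataset. The following is a minimal sketch, assuming a throwaway directory built with tempfile. The helper names list_class_folders and build_fake_dataset and the stray README.txt file are made up for illustration and do not come from either of the posts linked above.

```python
import os
import tempfile

def list_class_folders(root, expected_classes):
    """Return the sorted class subdirectories of root, checking their count."""
    folders = [os.path.join(root, d) for d in sorted(os.listdir(root))
               if os.path.isdir(os.path.join(root, d))]
    if len(folders) != expected_classes:
        raise ValueError('expected {} class folders, found {}'.format(
            expected_classes, len(folders)))
    return folders

def build_fake_dataset(root, class_names):
    """Create one empty subdirectory per class, plus a stray file to ignore."""
    for name in class_names:
        os.makedirs(os.path.join(root, name))
    with open(os.path.join(root, 'README.txt'), 'w') as f:
        f.write('not a class folder')

# Hypothetical stand-in for the extracted notMNIST folder.
with tempfile.TemporaryDirectory() as tmp:
    build_fake_dataset(tmp, list('ABCDEFGHIJ'))
    demo_folders = [os.path.basename(p) for p in list_class_folders(tmp, 10)]
    print(demo_folders)
```

Plain files in the root are skipped by the os.path.isdir filter, so only the class directories count toward the check.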

The main part is really just the call to logistic regression, and the call itself is very simple.

import pickle
import numpy as np
from sklearn.linear_model import LogisticRegression

image_size = 28  # notMNIST images are 28x28 pixels
size = 100       # number of training samples to fit on

with open('E:\\pycharm.pickle', 'rb') as f:
    data = pickle.load(f)
# Flatten each image into a row vector: scikit-learn expects 2-D input
# of shape (n_samples, n_features).
train_dt = data['train_dataset']
train_dt = train_dt.reshape(train_dt.shape[0], image_size * image_size)
train_lb = data['train_label']
test_dt = data['test_dataset']
test_dt = test_dt.reshape(test_dt.shape[0], image_size * image_size)
test_lb = data['test_label']

def train_linear_logistic(tdata, tlabel):
    # The L1 penalty needs a solver that supports it, such as liblinear.
    model = LogisticRegression(C=1.0, penalty='l1', solver='liblinear')
    print('training set size is {}'.format(size))
    model.fit(tdata[:size, :], tlabel[:size])
    print('testing model')
    y_out = model.predict(test_dt)
    print('accuracy with {} samples is {}'.format(size, np.mean(y_out == test_lb)))

train_linear_logistic(train_dt, train_lb)
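
The two array operations in the code above, flattening each image into a row and scoring the predictions, can be checked on tiny made-up arrays. This is a minimal sketch, assuming NumPy is installed. The 2x2 "images" and the label values are invented for illustration.

```python
import numpy as np

# Three fake 2x2 "images" stacked into a 3-D array of shape (3, 2, 2).
images = np.arange(12, dtype=np.float32).reshape(3, 2, 2)

# Flatten each image into one row: (n_samples, height * width).
flat = images.reshape(images.shape[0], -1)
print(flat.shape)

# Accuracy is the fraction of predictions that match the true labels.
y_true = np.array([0, 1, 2, 1])
y_pred = np.array([0, 1, 1, 1])
accuracy = np.mean(y_pred == y_true)
print(accuracy)
```

Taking the mean of the boolean comparison gives the same number as summing the matches and dividing by the count.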

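At first I got this error:

Reshape your data either using array.reshape(-1, 1)

scikit-learn raises this message when an estimator is given a 1-D array where it expects a 2-D array of shape (n_samples, n_features).

It only worked after I added another ')' after the print call. Supposedly the cause has to do with the array. I'll leave it at that for now.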
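
For context, the reshape message quoted above offers two ways to turn a 1-D array into the 2-D shape that scikit-learn expects. This is a minimal sketch of both, assuming NumPy is installed and using made-up values.

```python
import numpy as np

values = np.array([3.0, 1.0, 4.0, 1.0])  # 1-D, shape (4,)

# Four samples with a single feature each: shape (4, 1).
as_column = values.reshape(-1, 1)

# One sample with four features: shape (1, 4).
as_row = values.reshape(1, -1)

print(values.shape, as_column.shape, as_row.shape)
```

A single flattened image passed to predict would need the second form, since it is one sample with many features.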

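Also, my first plt.show() displays its figure, but the second one never shows up no matter what I try.

Another problem: displaying images through IPython did not work. I will have to look into that properly another day.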

