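Hello, it's Saturday today. I wonder how many girls are missing me. Haha.
Back to the point. This question comes from thinking like a beginner.
Take a dataset X of shape (1000, 64). Does running PCA on the whole thing give the same result as splitting it into two or more parts and running PCA on each part? If not, why not? How big is the difference, and why?
As before, let's explore with the MNIST dataset as the example:
1 - PCA on the whole dataset and its results, 64 -> 20
Singular values: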
[567.0065665 542.25185421 504.63059421 426.11767608 353.33503278
325.82036568 305.26157987 281.16033046 269.06977886 257.8239478
226.3187942 221.51478853 198.33066914 195.70009822 177.97288431
174.46075724 168.72640164 164.15235888 148.22422881 139.8223383 ]
Explained variance share: 0.8942989517025197
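The post does not show the code behind these numbers, so here is a minimal sketch of the whole-dataset step using only NumPy. The helper `pca_fit` is a hypothetical name, and the random matrix is a stand-in for the real data (an assumption), so the printed values will not match the ones above.

```python
import numpy as np

def pca_fit(X, n_components):
    """Hypothetical helper: PCA via SVD of the mean-centered data."""
    mean = X.mean(axis=0)
    _, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    ratio = S**2 / np.sum(S**2)  # variance share of every component
    return mean, Vt[:n_components], S[:n_components], ratio[:n_components].sum()

rng = np.random.default_rng(0)
# Stand-in data (assumption): 1000 samples, 64 correlated features.
X = rng.normal(size=(1000, 10)) @ rng.normal(size=(10, 64))
X = X + 0.5 * rng.normal(size=(1000, 64))

mean, components, singular_values, share = pca_fit(X, 20)
X_reduced = (X - mean) @ components.T  # 64 -> 20

print(singular_values)  # 20 largest singular values of the centered data
print(share)            # variance share kept by the 20 components
```

If scikit-learn is used instead, `sklearn.decomposition.PCA` exposes the same quantities as `singular_values_` and `explained_variance_ratio_`.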
2 - Split into two parts and run PCA on each part; both are 20-dimensional after PCA
[494.16894798 449.86043083 361.65617118 248.75442367 238.56963133
219.44600126 190.14159973 182.20945787 174.00574178 156.07163314
148.54921821 138.91401288 132.29034605 121.18280729 116.76818737
110.18550831 101.17695759 95.48735832 88.62803916 87.32415621]
Explained variance share: 0.9199960043548181
[454.72837686 387.92491221 319.67202851 276.08273363 248.36528251
208.91128358 194.45330919 182.89837591 170.67360347 156.77394418
151.9585882 141.42100262 135.60230213 130.54275122 127.97025433
112.22707956 108.0753686 103.78329442 99.07857098 97.99463294]
Explained variance share: 0.9017003841600361
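The split experiment can be sketched the same way. This is only an illustration: `pca_fit` is a hypothetical helper, the data is a random stand-in (an assumption), and splitting by row order into two halves of 500 rows is also an assumption, since the post does not say how the split was made. Each half is centered on its own mean and gets its own axes.

```python
import numpy as np

def pca_fit(X, n_components):
    """Hypothetical helper: PCA via SVD of the mean-centered data."""
    mean = X.mean(axis=0)
    _, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    ratio = S**2 / np.sum(S**2)
    return mean, Vt[:n_components], S[:n_components], ratio[:n_components].sum()

rng = np.random.default_rng(0)
# Stand-in data (assumption): 1000 samples, 64 correlated features.
X = rng.normal(size=(1000, 10)) @ rng.normal(size=(10, 64))
X = X + 0.5 * rng.normal(size=(1000, 64))

results = []
for part in np.array_split(X, 2):  # two parts of 500 rows each
    mean, components, singular_values, share = pca_fit(part, 20)
    reduced = (part - mean) @ components.T  # this part in its own 20 axes
    results.append((reduced, singular_values, share))
    print(singular_values)
    print(share)
```

One thing to keep in mind when reading the numbers: singular values of the centered data grow with the number of rows, so the values from a 500-row half are not directly comparable with those from all 1000 rows. The variance shares are.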
Overall view
Look at the differences in X, as absolute differences:
[[1.31244167e+01 4.36752741e+00 5.03112333e+00 ... 1.04615884e+00
2.28185268e+00 5.36106528e+00]
[8.84453712e+00 1.10787771e+00 5.04179155e+00 ... 2.74136365e+00
4.39718268e+00 1.14849456e+00]
[1.09177845e+01 1.26699499e-02 6.99048209e+00 ... 2.07857394e+00
2.99868658e+00 2.30527202e+00]
...
[1.11176016e+01 1.59416532e+01 3.37943944e+00 ... 5.21226348e+00
3.04954629e+00 1.83212308e+00]
[1.11372977e+01 1.22892474e+01 8.83083935e+00 ... 1.04313776e+00
2.36339129e+00 2.38198290e-02]
[7.60063192e+00 4.04736222e+00 4.34867805e+00 ... 1.58014905e-01
7.77644982e+00 3.76430875e+00]]
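A matrix like the one above could be produced roughly as follows. Again this is a sketch under assumptions: `pca_fit` is a hypothetical helper, the data is a random stand-in, and `XX` / `X_parts` are names chosen here for the whole-dataset and the stacked per-part projections.

```python
import numpy as np

def pca_fit(X, n_components):
    """Hypothetical helper: returns the mean and the top principal axes."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, n_components):
    """Fit PCA on X and project X onto its own top axes."""
    mean, components = pca_fit(X, n_components)
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
# Stand-in data (assumption): 1000 samples, 64 correlated features.
X = rng.normal(size=(1000, 10)) @ rng.normal(size=(10, 64))
X = X + 0.5 * rng.normal(size=(1000, 64))

XX = pca_transform(X, 20)  # whole dataset, one PCA
# Each half gets its own PCA; stack the results back in row order.
X_parts = np.vstack([pca_transform(p, 20) for p in np.array_split(X, 2)])

abs_diff = np.abs(XX - X_parts)  # element-wise absolute difference
print(abs_diff)
```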
Now look at the signed differences between the whole-dataset XX[i] and the split X[i], that is XX[i] - X[i]:
[ 3.56096291 21.52902529 -8.76773335 7.32404604 -6.94593806
-14.69827875 3.32651229 6.47221225 -0.02380591 -4.90829268
-0.55484534 2.84993604 -2.70771926 6.35663403 0.78859426
3.26912966 0.23222904 0.23445374 4.6073553 2.36956838]
[16.14207373 21.48994462 -4.26359793 -2.48269522 -1.68815802 1.77106465
-1.54286287 12.83694219 0.56913882 -1.9535757 -2.24191185 3.07127108
-4.47903486 -3.35420378 5.04365275 -0.58886118 -3.79776338 -0.40613234
-1.97151561 3.09692294]
[-12.58111082 0.03908066 -4.50413543 9.80674126 -5.25778004
-16.4693434 4.86937515 -6.36472994 -0.59294473 -2.95471698
1.68706651 -0.22133504 1.7713156 9.71083781 -4.25505849
3.85799083 4.02999242 0.64058608 6.57887091 -0.72735456]
[ 14.7961332 16.28413111 -6.7647892 1.36242249 -9.29640102
-18.09023954 2.00035581 2.94950406 1.23273672 -7.17750491
-3.66377601 5.1385072 -3.7531872 4.80955955 -2.93783748
3.27712621 -1.97418567 -0.10553849 -1.28421136 -1.02145653]
[22.49121131 10.71358422 -6.11988302 -7.25299299 -3.13691144 5.81681726
-3.09347511 15.19176067 -0.4990786 -3.00478096 1.65769038 0.25374448
-1.35045248 -0.68152649 1.28574377 2.86078425 1.32846106 1.95248632
0.2208832 -1.30050615]
[ -7.69507811 5.57054689 -0.64490618 8.61541548 -6.15948958
-23.90705681 5.09383092 -12.24225661 1.73181532 -4.17272395
-5.32146639 4.88476271 -2.40273472 5.49108604 -4.22358124
0.41634195 -3.30264672 -2.05802481 -1.50509456 0.27904963]
[ 5.64839667 18.79495119 -7.06889209 11.12006236 1.8788549
-14.31910144 2.11111015 -1.88758557 -2.74297214 5.46409076
-3.67492337 7.1324154 1.92404562 -1.99110216 0.48296671
-0.93767825 3.20106433 0.77706224 3.78907047 -3.30854926]
[15.22792683 18.5155614 -6.002477 -4.64166054 -1.06318825 -0.14042794
-5.993878 2.56868882 8.15210627 8.18421121 -3.75171648 -1.32540499
7.8124144 1.45459602 8.04908732 -2.47005895 4.82957793 0.61750721
2.40766548 0.29896154]
[ -9.57953017 0.27938979 -1.06641509 15.7617229 2.94204315
-14.1786735 8.10498816 -4.45627438 -10.89507841 -2.72012045
0.07679311 8.45782039 -5.88836878 -3.44569817 -7.56612061
1.5323807 -1.6285136 0.15955503 1.38140499 -3.60751081]
The differences feel fairly large.
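One possible contributor to such large coordinate differences is that the sign of each principal axis is arbitrary, so the same direction may come out flipped in two separate fits. The sketch below (hypothetical helper `pca_fit`, random stand-in data, first 500 rows taken as one part, all assumptions) flips the part's axes to agree in sign with the whole-dataset axes before subtracting. It makes no claim about how much this helps on the real data; the axes themselves can still differ.

```python
import numpy as np

def pca_fit(X, n_components):
    """Hypothetical helper: returns the mean and the top principal axes."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

rng = np.random.default_rng(0)
# Stand-in data (assumption): 1000 samples, 64 correlated features.
X = rng.normal(size=(1000, 10)) @ rng.normal(size=(10, 64))
X = X + 0.5 * rng.normal(size=(1000, 64))

first = X[:500]                     # one of the two parts
mean_w, comp_w = pca_fit(X, 20)     # fit on the whole dataset
mean_h, comp_h = pca_fit(first, 20) # fit on the part only

XX = (first - mean_w) @ comp_w.T      # part's rows in whole-dataset axes
X_half = (first - mean_h) @ comp_h.T  # part's rows in its own axes

# Axes are unit vectors, so the row-wise dot product is the cosine
# between the j-th whole-dataset axis and the j-th part axis.
cosines = np.sum(comp_w * comp_h, axis=1)
signs = np.where(cosines < 0, -1.0, 1.0)
X_half_aligned = X_half * signs  # undo arbitrary sign flips

print(np.abs(cosines))  # near 1: same direction; near 0: different axis
print(np.mean(np.abs(XX - X_half)), np.mean(np.abs(XX - X_half_aligned)))
```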
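But overall the eigenvalues do not differ much, and the explained variance shares are acceptable too.
So that will do. However much things change, the essence stays the same??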
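As for the "why" in the opening question, a short worked identity may help. The symbols here are introduced only for this note: the two parts have sizes $n_1, n_2$ (with $n = n_1 + n_2$), means $\mu_1, \mu_2$, and scatter matrices $S_1, S_2$; $\mu$ and $S$ are the mean and scatter matrix of the whole dataset.

```latex
\begin{aligned}
S   &= \sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^{\top}, \qquad
S_k  = \sum_{i \in \text{part } k} (x_i - \mu_k)(x_i - \mu_k)^{\top}, \\
S   &= S_1 + S_2 + \frac{n_1 n_2}{n}\,(\mu_1 - \mu_2)(\mu_1 - \mu_2)^{\top}.
\end{aligned}
```

PCA on the whole dataset uses the eigenvectors of $S$, while each part uses the eigenvectors of its own $S_k$ after centering on its own $\mu_k$. These generally differ, so the coordinates need not match even when the variance shares are similar. The squared singular values of the centered data are the eigenvalues of the scatter matrix, which is also why they grow with the number of rows.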
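If you have related questions, you can join the QQ group to discuss; there is no WeChat group.
QQ group: 868373192
Speech, image, and video deep learning group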