Lecture 5: Training versus Testing
Recap and Preview
Two Central Questions
1. make sure that $E_{out}(g)$ is close enough to $E_{in}(g)$;
2. make $E_{in}(g)$ small enough.
Fun Time
Data size: how large do we need?
One way to use the inequality $P[|E_{in}(g) - E_{out}(g)| > \epsilon] \le 2M\exp(-2\epsilon^2 N)$ is to pick a tolerable difference $\epsilon$ as well as a tolerable BAD probability $\delta$, and then gather data with size $N$ large enough to achieve those tolerance criteria. Let $\epsilon = 0.1$, $\delta = 0.05$, and $M = 100$.
What is the data size needed?
1. 215  2. 415  3. 615  4. 815
Explanation
Solving $2M\exp(-2\epsilon^2 N) \le \delta$ for $N$ gives $N \ge \frac{1}{2\epsilon^2}\ln\frac{2M}{\delta} = 50\ln 4000 \approx 415$, so the answer is 2.
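The required sample size can be computed directly. A minimal sketch (the variable names are mine, not from the lecture):

```python
import math

# Hoeffding + union bound: P[BAD] <= 2*M*exp(-2*eps^2*N) <= delta
# Solving for N gives: N >= ln(2*M/delta) / (2*eps^2)
eps, delta, M = 0.1, 0.05, 100
N = math.ceil(math.log(2 * M / delta) / (2 * eps ** 2))
print(N)  # 415
```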
Effective Number of Lines
Uniform Bound
The union bound $P[\text{BAD}] \le \sum_{m=1}^{M} P[\text{BAD for } h_m]$ adds up the BAD probabilities as if the events were disjoint. But for similar hypotheses $h_1 \approx h_2$, $E_{out}(h_1) \approx E_{out}(h_2)$ and, for most $D$, $E_{in}(h_1) = E_{in}(h_2)$, so their BAD events overlap heavily and the union bound is over-estimating. To merge the overlapping parts, we group similar hypotheses into kinds.
Many Lines
Effective Number of Hypotheses
1 input: 2 kinds of lines
2 inputs: 4 kinds
3 inputs: at most 8 kinds (only 6 when the three inputs are collinear)
4 inputs: at most 14 kinds ($< 2^4 = 16$)
So we can replace the infinite $M$ with the effective number of kinds, which is at most $2^N$ for $N$ inputs.
Fun Time
What is the effective number of lines for five inputs $\in R^2$?
1. 14  2. 16  3. 22  4. 32
Explanation
There are $2^5 = 32$ labelings in total. With the five inputs in convex position (as in the lecture's picture): five O's: 1 kind; four O's and one X: 5 kinds; three O's and two X's: $C_5^3 = 10$ in total, but only the 5 with the two X's adjacent can be produced by a line. Since O and X are symmetric, the effective number of kinds is $2 \times (1 + 5 + 5) = 22$.
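The counts for 4 and 5 inputs can be verified by brute force. The sketch below is my own code, not from the lecture; it tests every labeling of points in convex position for strict linear separability, using the fact that a max-margin separating line's direction is either the vector between two input points or the normal of such a vector, so only those candidate directions need checking:

```python
from itertools import permutations, product

def separable(pts, labels):
    """Exact strict linear-separability test for small 2-D integer point sets."""
    pos = [p for p, y in zip(pts, labels) if y == 1]
    neg = [p for p, y in zip(pts, labels) if y == -1]
    if not pos or not neg:
        return True  # one-class labelings are trivially realizable
    # Candidate directions: point differences and their normals contain the
    # max-margin direction whenever the two classes are strictly separable.
    cands = []
    for p, q in permutations(pts, 2):
        dx, dy = q[0] - p[0], q[1] - p[1]
        cands += [(dx, dy), (-dy, dx)]
    return any(
        max(a * x + b * y for x, y in neg) < min(a * x + b * y for x, y in pos)
        for a, b in cands
    )

# Points on a parabola are in convex position, like the lecture's examples.
counts = {}
for n in (4, 5):
    pts = [(k, k * k) for k in range(n)]
    counts[n] = sum(separable(pts, lab) for lab in product([-1, 1], repeat=n))
print(counts)  # {4: 14, 5: 22}
```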
Growth Function
$m_H(N) = \max_{x_1, \dots, x_N} |H(x_1, \dots, x_N)| \le 2^N$
Growth Function for Positive Rays
$m_H(N) = N + 1$
Growth Function for Positive Intervals
$m_H(N) = C_{N+1}^2 + 1 = \frac{1}{2}N^2 + \frac{1}{2}N + 1$
Growth Function for Convex Sets
$m_H(N) = 2^N$: we call those $N$ inputs 'shattered' by $H$.
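The 1-D growth functions are easy to confirm by enumerating dichotomies on concrete inputs. A sketch under my own naming and parametrization (positive interval taken as $l < x \le r$):

```python
def ray_dichotomies(xs):
    """Labelings realizable by positive rays h(x) = sign(x - a) on sorted xs."""
    cuts = [xs[0] - 1] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1]
    return {tuple(1 if x > c else -1 for x in xs) for c in cuts}

def interval_dichotomies(xs):
    """Labelings realizable by positive intervals: +1 iff l < x <= r."""
    cuts = [xs[0] - 1] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1]
    return {
        tuple(1 if l < x <= r else -1 for x in xs)
        for i, l in enumerate(cuts) for r in cuts[i:]
    }

xs = [1, 2, 3, 4, 5]  # N = 5 distinct inputs
n_rays = len(ray_dichotomies(xs))            # N + 1 = 6
n_intervals = len(interval_dichotomies(xs))  # N^2/2 + N/2 + 1 = 16
print(n_rays, n_intervals)
```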
Fun Time
Consider positive and negative rays as $H$, which is equivalent to the perceptron hypothesis set in 1D. The hypothesis set is often called 'decision stump' to describe the shape of its hypotheses. What is the growth function $m_H(N)$?
1. N  2. N+1  3. 2N  4. $2^N$
Explanation
Rays in the positive or negative direction each contribute $N - 1$ non-trivial dichotomies, plus the all-X and all-O cases: $2 \times (N - 1) + 2 = 2N$, so the answer is 3.
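The $2N$ count can be checked by enumeration as well; a small sketch of my own, modeling a stump as $h(x) = s \cdot \mathrm{sign}(x - a)$:

```python
def stump_dichotomies(xs):
    """Labelings realizable by decision stumps h(x) = s * sign(x - a)."""
    cuts = [xs[0] - 1] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1]
    return {
        tuple(s * (1 if x > c else -1) for x in xs)
        for s in (1, -1) for c in cuts
    }

growth = {n: len(stump_dichotomies(list(range(n)))) for n in range(2, 8)}
print(growth)  # every value equals 2 * n
```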
Break Point
If $m_H(N)$ is polynomial in $N$, we can use it to replace $M$.
If no $k$ inputs can be shattered by $H$, call $k$ a break point for $H$.
If $k$ is a break point, then $k+1, k+2, k+3, \dots$ are also break points; what matters is the minimum break point $k$.
Fun Time
Consider positive and negative rays as $H$, which is equivalent to the perceptron hypothesis set in 1D. As discussed in an earlier quiz question, the growth function is $m_H(N) = 2N$. What is the minimum break point for $H$?
1. 1  2. 2  3. 3  4. 4
Explanation
The 'ray of hope' for positive and negative rays is 3: $m_H(3) = 6 < 2^3 = 8$, so no 3 inputs can be shattered, while 2 inputs still can ($m_H(2) = 4 = 2^2$).
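The minimum break point is just the smallest $k$ where the growth function falls below $2^k$; a one-loop check:

```python
# Minimum break point for decision stumps: smallest k with m_H(k) = 2k < 2^k
k = 1
while 2 * k >= 2 ** k:
    k += 1
print(k)  # 3
```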
Summary
Lecture summary
If $m_H(N)$ is polynomial in $N$, we can use it to replace $M$, lowering the upper bound for the hypothesis set.
The minimum break point determines the degree of that polynomial.
Recap and Preview
two questions: $E_{out}(g) \approx E_{in}(g)$, and $E_{in}(g) \approx 0$
Effective Number of Lines
at most 14 through the eye of 4 inputs
Effective Number of Hypotheses
at most $m_H(N)$ through the eye of $N$ inputs
Break Point
when $m_H(N)$ becomes 'non-exponential'
References
Machine Learning Foundations (机器学习基石), Hsuan-Tien Lin (林轩田)