1. Some of the problems below are best addressed using a supervised learning algorithm, and the others with an unsupervised learning algorithm. Which of the following would you apply supervised learning to? (Select all that apply.) In each case, assume some appropriate dataset is available for your algorithm to learn from. (Answer: C, D)
A. Examine a large collection of emails that are known to be spam, to discover whether there are sub-types of spam mail.
B. Take a collection of 1000 essays written on the US economy, and find a way to automatically group these essays into a small number of groups of essays that are somehow "similar" or "related".
C. Given historical data of children's ages and heights, predict children's height as a function of their age.
D. Given 50 articles written by male authors and 50 articles written by female authors, learn to predict the gender of a new manuscript's author (when the identity of this author is unknown).
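What makes options C and D supervised is that each training example comes with a label (a height, an author's gender). A minimal sketch of option C as a regression problem, using hypothetical (age, height) pairs in place of the historical dataset:

```python
import numpy as np

# Hypothetical (age, height) pairs standing in for the historical data
# in option C. The heights are the labels, which is what makes this a
# supervised learning problem.
ages = np.array([2.0, 4.0, 6.0, 8.0, 10.0])      # years
heights = np.array([86.0, 102.0, 115.0, 127.0, 138.0])  # cm

# Fit height as a linear function of age (ordinary least squares).
slope, intercept = np.polyfit(ages, heights, deg=1)

# Predict the height of a new child from age alone.
predicted = slope * 7.0 + intercept
```

Options A and B, by contrast, have no labels to learn from: the algorithm must discover structure (sub-types, groups) on its own, which is clustering, an unsupervised task.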
2. Let f be some function such that f(θ0, θ1) outputs a number. For this problem, f is some arbitrary/unknown smooth function (not necessarily the cost function of linear regression, so f may have local optima). Suppose we use gradient descent to try to minimize f(θ0, θ1) as a function of θ0 and θ1. Which of the following statements are true? (Check all that apply.) (Answer: A, C)
A. If the first few iterations of gradient descent cause f(θ0, θ1) to increase rather than decrease, then the most likely cause is that we have set the learning rate α to too large a value.
B. Setting the learning rate α to be very small is not harmful, and can only speed up the convergence of gradient descent.
C. If θ0 and θ1 are initialized at the global minimum, then one iteration will not change their values.
D. No matter how θ0 and θ1 are initialized, so long as α is sufficiently small, we can safely expect gradient descent to converge to the same solution.
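Statements A and C can be seen directly by running the update rule θj := θj − α · ∂f/∂θj on a toy smooth function. The quadratic below is an assumed stand-in for the arbitrary f in the question, chosen because its gradient is easy to write down:

```python
# Gradient descent on the toy function f(t0, t1) = t0**2 + t1**2,
# an assumed stand-in for the arbitrary smooth f in the question.

def f(t0, t1):
    return t0**2 + t1**2

def grad(t0, t1):
    # Gradient of f: (df/dt0, df/dt1) = (2*t0, 2*t1).
    return 2 * t0, 2 * t1

def descend(t0, t1, alpha, steps):
    # Simultaneous update: tj := tj - alpha * df/dtj.
    for _ in range(steps):
        g0, g1 = grad(t0, t1)
        t0, t1 = t0 - alpha * g0, t1 - alpha * g1
    return t0, t1

# A: with alpha too large (here 1.5), each step overshoots the minimum
# and f increases rather than decreases.
print(f(*descend(1.0, 1.0, alpha=1.5, steps=5)) > f(1.0, 1.0))  # True

# C: initialized at the global minimum (0, 0), the gradient is zero,
# so one iteration leaves the parameters unchanged.
print(descend(0.0, 0.0, alpha=0.1, steps=1))  # (0.0, 0.0)
```

Note that B fails for the opposite reason: a very small α is safe but slow, since each step moves the parameters only a tiny amount. D fails because, for an f with multiple local optima, different initializations can converge to different minima.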