We want to use a simple feed-forward neural network to classify images from a dataset consisting of images with rotated versions of these four lowercase letters: 'b', 'd', 'p', and 'q'. All rotation angles are equally common in the dataset. Given one image, our network has to determine which one of the four letters it is: the output will be a 4-way softmax.
12 input images from this dataset are shown below:
Because the letters are rotated, the network will have trouble distinguishing between several of the letters. Which ones will the network have trouble with?
The network will have trouble distinguishing 'b' from 'q', and it will have trouble distinguishing 'd' from 'p'.
A 'b' can be seen as a 'q' that is rotated by 180 degrees, and a 'd' can be seen as a 'p' that has been rotated 180 degrees. If we allow for all degrees of rotation, and we do not know what the rotation is a test-time, then there are two possible answers for every test image. Sometimes datasets can exhibit such significant variations that the problem itself becomes ill-posed.
The network will have trouble distinguishing 'b' from 'd', and it will have trouble distinguishing 'p' from 'q'.
The network will have trouble distinguishing 'b' from 'd' and 'q'.
The network will have trouble distinguishing 'q' from 'b' and 'd'.