### Loss Function

Suppose we want to solve an image classification problem, for example deciding whether an image shows a cow or a cat. The machine learning algorithm assigns an unclassified image a score for each class and uses those scores to decide which class the image belongs to. One of the key parts of a classification algorithm is the design of its loss function.

**Mapping image pixels to a confidence score for each class**

Assume a training set:

\[(x_i,y_i)\]

\(x_i\) is the image and \(y_i\) is the corresponding class label.

\(i \in \{1, \dots, N\}\) means the training set contains N images.

\(y_i \in \{1, \dots, K\}\) means there are K image categories.

**A score function then maps each image \(x_i\) to a score for each of the K classes:**

\[f(x_i,W,b)=W\cdot x_i+b\]

In the above function, each image \(x_i\) is flattened into a 1-dimensional vector.

If an image is 32x32 pixels with 3 channels, \(x_i\) will be a 1-dimensional vector of length **D=32x32x3=3072**.

The parameter matrix **W** has size **[KxD]** and is often called the weights.

**b** has size **[Kx1]** and is often called the bias vector.

In this way, **W** evaluates \(x_i\)’s confidence scores for all **K** categories at the same time: each row of **W** produces the score for one class.
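To make the shapes concrete, here is a minimal NumPy sketch of the score function. It is only an illustration under assumptions: the image is random data standing in for a real 32x32 RGB image, K=2 is chosen to match the cow/cat example, and the names `score`, `image`, and `predicted_class` are hypothetical, as is the small random initialization of **W**.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 2             # number of classes (assumed: cow and cat)
D = 32 * 32 * 3   # flattened image length, 3072

# Random stand-in for one 32x32 image with 3 channels (not real data).
image = rng.integers(0, 256, size=(32, 32, 3)).astype(np.float64)

x = image.reshape(D, 1)                  # flatten to a [D x 1] column vector
W = 0.01 * rng.standard_normal((K, D))   # weights, [K x D] (arbitrary init)
b = np.zeros((K, 1))                     # bias vector, [K x 1]


def score(x, W, b):
    """Linear score function f(x, W, b) = W x + b."""
    return W @ x + b


scores = score(x, W, b)                  # [K x 1], one score per class

# argmax returns a 0-based index, while the text numbers classes 1..K.
predicted_class = int(np.argmax(scores))

print(scores.shape)  # (2, 1)
```

A single matrix product yields all K scores at once, which is why the weights are stored as one **[KxD]** matrix rather than as K separate vectors.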