Representation

The algorithms we are going to create need to handle many different kinds of data. The problems we can currently solve with machine learning usually involve image, text, or audio inputs.

To handle these types of data, we convert them into formats that we can easily manipulate using mathematics and computers. A common strategy we'll see is to turn our input data into a vector.

Vectorization
The process of converting data into vectors

For example, given a black and white image that is \( n \) pixels wide and \( m \) pixels tall, where each pixel's value lies in \( [0,1] \), we can turn it into a vector in \( [0,1]^{n \cdot m} \) by mapping the value of the pixel in the \( i \)-th row and \( j \)-th column to the \( (i \cdot n + j) \)-th position in the vector, counting rows, columns, and positions from 0. We can already start doing meaningful things with this representation: the 0 vector is an image that is completely black, and the vector with every entry equal to one is a pure white image, so the size of the entries gives us a measure of how bright an image is.
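The index mapping described here can be sketched in a few lines of Python. This is only an illustration: the function name `vectorize` and the choice to store the image as a list of rows are assumptions made for the example, not something fixed by the text.

```python
def vectorize(image):
    """Flatten an image given as m rows of n pixel values into one list.

    The pixel in row i and column j (both counted from 0) ends up
    at position i * n + j of the returned vector.
    """
    m = len(image)       # number of rows (image height)
    n = len(image[0])    # number of columns (image width)
    vector = [0.0] * (m * n)
    for i, row in enumerate(image):
        for j, value in enumerate(row):
            vector[i * n + j] = value
    return vector


# A tiny made-up image: 3 pixels wide (n = 3), 2 pixels tall (m = 2).
image = [
    [0.0, 0.5, 1.0],
    [0.25, 0.75, 0.0],
]
vector = vectorize(image)
```

With \( n = 3 \), the pixel in row 1 and column 2 lands at position \( 1 \cdot 3 + 2 = 5 \), the last entry of the vector.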

Thus we could write a very simple function to compute the brightness of an image by using the length (Euclidean norm) of the input vector. Alternatively, given two images (vectors) of the same size, we could measure the length of their difference: if it is 0 the images are the same, and the larger the length of the difference, the more the two images differ.
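As a rough sketch of both ideas, assuming "length" means the Euclidean norm and that images are already flattened into lists of values in \( [0,1] \), we might write the following. The names `length`, `brightness`, and `difference` are chosen for this example only.

```python
import math


def length(v):
    """Euclidean length of a vector given as a list of numbers."""
    return math.sqrt(sum(x * x for x in v))


def brightness(image_vector):
    """Use the length of the image vector as a crude brightness score."""
    return length(image_vector)


def difference(u, v):
    """Length of the difference between two image vectors of equal size."""
    if len(u) != len(v):
        raise ValueError("images must have the same number of pixels")
    return length([a - b for a, b in zip(u, v)])


black = [0.0, 0.0, 0.0, 0.0]   # a 2x2 image that is completely black
white = [1.0, 1.0, 1.0, 1.0]   # a 2x2 image that is completely white
```

Note that the raw length grows with the number of pixels, so this score is only comparable between images of the same size. Dividing by the square root of the number of pixels would be one way to normalize it.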

When we're dealing with supervised learning, we are trying to predict properties of new data after our algorithm has seen lots of data along with their true properties. The example we spoke of previously was object categorization.

We know that our algorithm has access to a training set beforehand. Now that we're equipped with knowledge of vectorization, we could first fix a finite set of categories \( C \subseteq \mathbb{N}_0 \), where each number corresponds to a specific object category; perhaps we've mapped \( 3 \) to the category of fruit. Now suppose we are given a 1280x720 image of an apple, vectorized as \( a \in \mathbb{R}^{1280 \cdot 720} \). Then one of the labeled data points in our training set would be \( (a, 3) \).
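To make the shape of such a data point concrete, here is a minimal sketch. The pixel values are placeholders rather than a real photo of an apple, and the names `CATEGORIES` and `training_set` are hypothetical; only the mapping of \( 3 \) to fruit comes from the example above.

```python
WIDTH, HEIGHT = 1280, 720

# Category numbers mapped to names; only 3 -> "fruit" is taken from the text.
CATEGORIES = {3: "fruit"}

# Placeholder for the vectorized apple image: one value per pixel.
# A real image would carry actual pixel intensities here.
a = [0.5] * (WIDTH * HEIGHT)

# A labeled data point pairs the image vector with its category number.
training_set = [(a, 3)]

vector, label = training_set[0]
category_name = CATEGORIES[label]
```

A full training set would hold many such pairs, each with a vector of the same length and a label drawn from \( C \).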

Thus, given this object categorization problem, we have a concrete way to represent the kind of data we'll be working with while creating our learning algorithms.