From beginning to end:
Take a dataset, e.g. images organized into folders (the folder name eventually becomes each image's "label")
For each folder, convert its images into a 3D array (image index, x, y) [with x and y being the pixel dimensions of each image] and normalize each image's pixel values
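A minimal sketch of the normalization step. The actual image reading (e.g. via imageio.imread on each file in the folder) is replaced here by a random stand-in array, and pixel_depth=255 assumes 8-bit grayscale images — both are assumptions, not part of the notes:

```python
import numpy as np

def normalize_stack(images, pixel_depth=255.0):
    # Center pixel values around 0 and scale to roughly [-0.5, 0.5].
    return (images.astype(np.float32) - pixel_depth / 2) / pixel_depth

# Stand-in for one folder's images; in practice each (x, y) slice
# would come from reading an image file on disk.
fake_folder = np.random.randint(0, 256, size=(10, 28, 28))
dataset = normalize_stack(fake_folder)  # shape (image index, x, y)
```

Normalizing to a small, zero-centered range helps the optimizer later on.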
"Pickle" the 3D arrays into 1 file each
Organize into "data" (i.e. the image arrays) and "labels" (i.e. what each image is, e.g. "A") - this requires loading the data from the "pickled" files and randomizing it
- Choose a set size (# of images) and calculate how many images X are needed from each folder/class
- Create a new "merged" 3D array containing data for X images from each folder/class; at the same time, create a companion 1D array containing the label for each image
- Randomize the merged array and corresponding label array - this is the train_dataset and train_labels for the model (i.e. "X" and "y")
- Repeat the three steps above to create validation and test datasets and labels
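The merge-and-shuffle steps above can be sketched as follows (function and variable names are my own; the per-class arrays here are stand-ins for those loaded from the pickled files):

```python
import numpy as np

def merge_and_shuffle(class_arrays, images_per_class, seed=0):
    """class_arrays: one (n_i, x, y) array per class; label = list index."""
    x, y = class_arrays[0].shape[1:]
    n = len(class_arrays) * images_per_class
    data = np.zeros((n, x, y), dtype=np.float32)
    labels = np.zeros(n, dtype=np.int32)
    for label, arr in enumerate(class_arrays):
        start = label * images_per_class
        data[start:start + images_per_class] = arr[:images_per_class]
        labels[start:start + images_per_class] = label
    # Shuffle data and labels with the same permutation so they stay aligned.
    perm = np.random.default_rng(seed).permutation(n)
    return data[perm], labels[perm]

# Stand-in for 3 classes loaded from pickles; class i is filled with value i
# so we can verify data/label alignment after shuffling.
classes = [np.full((30, 28, 28), i, dtype=np.float32) for i in range(3)]
train_dataset, train_labels = merge_and_shuffle(classes, images_per_class=20)
```

Calling it again on the remaining images per class would give the validation and test splits.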
Train the model - e.g. use sklearn LogisticRegression()
- do we need to set parameters?
- multi_class='multinomial' in order to use cross-entropy as the loss function
- solver='newton-cg' or solver='lbfgs' (both of which support multi_class='multinomial')
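A hedged sketch of the fit call with random stand-in data. Note that multi_class='multinomial' was the course-era way to request cross-entropy loss; recent scikit-learn versions deprecate that parameter because 'lbfgs' already fits a multinomial model for multiclass targets:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Random stand-in for train_dataset / train_labels (3 classes).
rng = np.random.default_rng(0)
train_dataset = rng.normal(size=(300, 28, 28))
train_labels = rng.integers(0, 3, size=300)

# sklearn expects 2D input: flatten each (28, 28) image to a 784-vector.
X = train_dataset.reshape(len(train_dataset), -1)

# 'lbfgs' uses the multinomial (cross-entropy) formulation for multiclass y;
# on older sklearn you would also pass multi_class='multinomial'.
model = LogisticRegression(solver='lbfgs', max_iter=200)
model.fit(X, train_labels)
```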
- Comparing the Udacity tutorial with the documentation for LogisticRegression(), is it correct that it does all of the following:
- Sets up a linear model (y=Wx+b)
- Applies softmax to convert computed values of y to probabilities
- Compares the computed values of y with the actual values of y (one-hot encoded) using cross-entropy
- Tweaks the model's parameters using an optimization algorithm to minimize the loss as defined in relation to cross-entropy
- is it using stochastic gradient descent here? (taking many many tiny steps instead of fewer big ones)
- loss over the entire set or a small random sample?
- and the concepts of momentum (running average of gradients and moving in that direction vs just the current gradient assessment)
- and learning rate decay (i.e. gradually lowering how much we change the weights at each step as training progresses)
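The linear-model / softmax / cross-entropy pipeline in the bullets above can be checked numerically. With all-zero W and b (made-up numbers), softmax yields uniform probabilities over 3 classes, so the cross-entropy loss is -log(1/3) = log(3):

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, one_hot):
    # Average negative log-probability assigned to the true class.
    return -(one_hot * np.log(probs)).sum(axis=-1).mean()

# Linear model y = Wx + b for one 4-pixel "image" and 3 classes.
x = np.array([0.5, -0.2, 0.1, 0.8])
W = np.zeros((3, 4))
b = np.zeros(3)
logits = W @ x + b
probs = softmax(logits)              # uniform: [1/3, 1/3, 1/3]
target = np.array([0.0, 1.0, 0.0])   # one-hot encoded true label
loss = cross_entropy(probs[None, :], target[None, :])
print(round(float(loss), 4))  # 1.0986, i.e. log(3)
```

Training then tweaks W and b to push this loss down.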
Test the model
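Testing reduces to scoring the held-out set; a sketch with synthetic, well-separated stand-in classes (score() returns plain accuracy on the test set):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Two easily separable toy classes standing in for flattened test images.
X_train = np.vstack([rng.normal(-2, 1, (100, 20)), rng.normal(2, 1, (100, 20))])
y_train = np.array([0] * 100 + [1] * 100)
X_test = np.vstack([rng.normal(-2, 1, (40, 20)), rng.normal(2, 1, (40, 20))])
y_test = np.array([0] * 40 + [1] * 40)

model = LogisticRegression(solver='lbfgs').fit(X_train, y_train)
accuracy = model.score(X_test, y_test)  # fraction of correct predictions
```

The same call on the validation split would guide any parameter tuning before the final test-set evaluation.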