
Artificial Neural Networks: Matrix Form (Part 5)

To actually implement a multilayer perceptron learning algorithm, we do not want to hard-code the update rule for each individual weight. Instead, we can formulate both feedforward propagation and backpropagation as a series of matrix multiplications, which makes the implementation compact, general, and far easier to maintain. This formulation is also what leads to the impressive performance of neural nets: offloading matrix multiplications to a graphics card allows for massive parallelization and training on large amounts of data.
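
To get a feel for what this looks like in practice, here is a minimal sketch of a feedforward pass written as matrix multiplies in NumPy. The layer sizes, the sigmoid activation, and the variable names are illustrative assumptions on my part, not the exact code from the tutorial:

import numpy as np

def sigmoid(z):
    # Element-wise logistic activation.
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative network: 4 inputs, 5 hidden units, 3 outputs.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((5, 4))  # hidden-layer weights
b1 = np.zeros((5, 1))             # hidden-layer biases
W2 = rng.standard_normal((3, 5))  # output-layer weights
b2 = np.zeros((3, 1))             # output-layer biases

x = rng.standard_normal((4, 1))   # a single input, as a column vector

# Each layer's activations come from one matrix multiply, a bias add,
# and an element-wise nonlinearity -- no per-weight loops anywhere.
a1 = sigmoid(W1 @ x + b1)
a2 = sigmoid(W2 @ a1 + b2)

Processing a whole batch is then just a matter of widening x into a matrix whose columns are individual examples, which is exactly where the GPU parallelism pays off.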

This tutorial will cover how to build a neural network that uses matrices. It builds heavily on the theory in Part 4 of this series, so make sure you understand the math there!


Artificial Neural Networks: Mathematics of Backpropagation (Part 4)

Up until now, we haven't utilized any of the expressive nonlinear power of neural networks: all of our simple one-layer models corresponded to a linear model such as multinomial logistic regression. These one-layer models had a simple derivative. We only had one set of weights, which fed directly into our output, and it was easy to compute the derivative with respect to those weights. However, what happens when we want to use a deeper model? What happens when we start stacking layers?
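
To make that "simple derivative" concrete: assuming, for this sketch, a one-layer softmax model with cross-entropy loss $E$, weights $W$, input $x$, prediction $\hat{y}$, and one-hot target $y$ (a standard setup, though the notation here is mine rather than Part 4's), the entire gradient collapses to a single outer product:

$$\frac{\partial E}{\partial W} = (\hat{y} - y)\, x^{\top}$$

Once layers are stacked, the weights no longer feed directly into the output, and this one-step shortcut disappears.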


Linux snippets: sorting a file, ignoring the header

When working with large data files that have a header, it is sometimes more efficient to sort the file first so that a streaming algorithm can be used for evaluation. You may also simply want to sort your data by some key for organization and readability. Either way, a lot of data preparation involves doing something with a delimited file while preserving the position and contents of its header.

Here is a short example that sorts a tab-delimited file with a header by its first field:

(head -n 1 data.tsv && tail -n +2 data.tsv | sort -k1,1 -t$'\t') > data_sorted.tsv
What this command does is spawn a subshell that runs everything in the parentheses, and then redirects the output to a second file. Within the parentheses, we first get the header (head -n 1). Then we run another command that takes everything except the header (tail -n +2) and pipes it to the sort utility. The arguments to sort include the field to sort by (-k1,1, which starts and ends the sort key at the first field; plain -k1 would sort from the first field through the end of each line) and a delimiter (-t$'\t', where the shell's $'\t' quoting produces a tab character; you can also paste a literal tab between quotes by typing Ctrl-V followed by Tab). You could substitute whatever routine you want for sort.
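
The same pattern generalizes to other keys and sort modes. For example, this variant sorts numerically on the second field instead (assuming, purely for illustration, that the second column of data.tsv holds numbers):

(head -n 1 data.tsv && tail -n +2 data.tsv | sort -k2,2n -t$'\t') > data_sorted.tsv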

ML Primers

During my first year of studying machine learning, I read a lot of papers, book chapters, and online tutorials. I like to learn by example. For me, theory paired with an implementation is the best way to learn a topic in machine learning. However, nearly every tutorial I've come across has a lot of one and little of the other. The ones that include both are usually presented at such a high level that they are of little use to someone trying to fully understand the topic at hand, or their code isn't even public!

Over the past few years, I've collated a lot of material from different sources to create a set of "primers" that are introductions to a wide array of machine learning topics. They include a theoretical foundation paired with a short tutorial with accompanying code written in Python. After going through one of these primers, you should have enough basic knowledge to begin reading papers about a particular subject without feeling too lost.

I'm currently in the process of posting these primers. Most of my original code was written in MATLAB, but it's not free and not everyone has access to it through their university or workplace, so I'm translating it to Python. The primers are aimed at an audience familiar with calculus and computer science so as not to "dumb down" any material, but I try to avoid using undefined terms or concepts.

I hope you find them helpful! If you have any issues with the code or material, feel free to leave a comment.