Predicting Unpredictable Probabilities

Deep Neural Networks and Why They Rule the World (Mostly)

A significant portion of machine learning approaches are linear in nature: they look at the training data observations and try to find the slope and intercept of the line that best fits that data. For example, let's assume we have a training data set shaped something like this:

It’s like an ink-blot test, but for data nerds. What shapes do you see?

While there are some outlying observations, the general trend in this data is clear: as x increases, y increases. A linear model is therefore an appropriate approach for inference. The prediction, visualized, might look something like this:
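To make "find the slope and intercept" concrete, here is a minimal sketch of ordinary least squares in pure Python. The data points are invented for illustration, standing in for the scatter plot above:

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # closed-form least-squares solution for a single feature
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) \
            / sum((x - x_mean) ** 2 for x in xs)
    intercept = y_mean - slope * x_mean
    return slope, intercept

# made-up points that trend up and to the right, roughly y = x + 1 with noise
xs = [0, 1, 2, 3, 4, 5]
ys = [0.9, 2.1, 2.9, 4.2, 5.0, 6.1]
m, b = fit_line(xs, ys)
print(round(m, 2), round(b, 2))   # → 1.03 0.96
```

The fitted line recovers something close to the underlying trend despite the noise, which is exactly the behavior the plot above is meant to show.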

This is art, not science. Zero points awarded for pointing out my line is incorrect.

This is all well and good, so long as your data have a relatively straightforward distribution like this. In the real world, however, for really complex problems and predictions, the data are rarely that cooperative. Take, for example, the scenario where your data look like this:

OK, this one is definitely a bird. Or a seahorse, maybe?

You can still use linear models to find the “best fit” from a mathematical perspective. It might look something like this:

Again, no points for making the line more accurate. Stop judging me! :)

In all likelihood, however, this model is going to perform very poorly in the real world. When you run into these kinds of data, what you really need are machine learning models that can learn more complex functions, and at the end, give you something that looks more like this from a prediction standpoint:
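One way to see the problem quantitatively is to fit the best possible straight line to clearly curved data and measure how much error is left over. A toy sketch in pure Python (the U-shaped data below is invented for illustration):

```python
# Least-squares fit on data that is symmetric about its center: the "best"
# line comes out flat, leaving essentially all of the variance unexplained.
xs = [i / 10 for i in range(41)]      # 0.0 .. 4.0
ys = [(x - 2) ** 2 for x in xs]       # a U-shaped curve, symmetric about x = 2

n = len(xs)
x_mean, y_mean = sum(xs) / n, sum(ys) / n
slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) \
        / sum((x - x_mean) ** 2 for x in xs)
intercept = y_mean - slope * x_mean

# mean squared error of the fitted line vs. the raw variance of y
mse = sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / n
var = sum((y - y_mean) ** 2 for y in ys) / n
print(abs(slope) < 1e-6, mse / var > 0.99)   # flat line, ~no variance explained
```

The line is mathematically the best line available, and it still tells you almost nothing about the data, which is the gap a more flexible model has to fill.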

Same data, much better predictions.

There are two primary ways I know of to approach this: Decision Trees and Neural Networks. For really, really complicated problems like images and language, Neural Networks currently rule the world, and for standard tabular data, they’re getting there. So how do they work, and what makes them so powerful?

At a basic level, Neural Networks are very similar to our old friend the Perceptron. They start by working through the data and generating a linear separator.

Neural Network Step 1: Draw a bad line.

When the error passes a certain threshold, the Neural Network stops, marks its spot, and pivots in the direction that reduces the error, once again generating a linear separator, but this time starting from that point. Visualized, it looks something like this:

Neural Network Step 2: Draw another bad line.

This process repeats as many times as needed to achieve a predictive model with an arbitrarily good fit. A simple example on our current image might look something like this:
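This "many small lines" picture maps nicely onto how a modern network with ReLU activations actually behaves: a one-hidden-layer ReLU network computes a piecewise-linear function, and each hidden unit contributes one potential kink where the slope changes. Here is a hand-built toy in pure Python; the weights are picked by hand for illustration, not learned:

```python
def relu(z):
    """ReLU activation: pass positive values through, clamp negatives to zero."""
    return max(0.0, z)

def tiny_net(x):
    # hidden layer: 3 ReLU units, each one "turns on" at a different x
    h = [relu(x - 0.0), relu(x - 1.0), relu(x - 2.0)]
    # output layer: a weighted sum of the active pieces
    w_out = [1.0, -1.5, 1.0]
    return sum(w * a for w, a in zip(w_out, h))

# The function is a straight line between kinks at x = 0, 1, 2,
# with slopes 1.0, then -0.5, then 0.5 on the three segments.
print([round(tiny_net(x), 2) for x in [0.5, 1.5, 2.5]])   # → [0.5, 0.75, 0.75]
```

Adding more hidden units adds more segments, which is exactly the "draw another line starting from that point" step described above.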

OK, definitely a bird.

And since you can repeat this process, continually smoothing the curves to better and better fit the data, you can eventually end up with a predictive model that looks like the “perfect” function we envisioned earlier.

Given enough time, data, and compute power, almost any data distribution can be modeled to an arbitrary degree of accuracy — this intuition is what the universal approximation theorem formalizes.
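In practice, that "repeat until it fits" loop is gradient descent: nudge the parameters a little in whichever direction reduces the error, over and over. A minimal sketch on the simplest possible model — recovering a line's slope and intercept from toy data (values invented):

```python
# Toy data lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

m, b = 0.0, 0.0        # start from a deliberately bad line
lr = 0.05              # learning rate: how big each nudge is
for _ in range(2000):
    # gradients of mean squared error with respect to m and b
    grad_m = sum(2 * (m * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (m * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    m -= lr * grad_m   # step downhill on the error surface
    b -= lr * grad_b

print(round(m, 2), round(b, 2))   # → 2.0 1.0
```

Real networks do the same thing with millions of parameters instead of two, which is where the computational expense discussed below comes from.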

This comes with challenges, of course. The most obvious is the risk of overfitting, though there are ways to address that; if I can find time, I'll put together a post specifically on regularization and how it works for Neural Networks and other machine learning approaches. The other big challenge is cost: all of this line drawing, eventually converging on super smooth, accurate functions, can be extremely computationally expensive. So Neural Networks aren't always the best choice for every machine learning problem — they need a lot of data, the more features the better, and explainability, while doable, is more complex than with some other approaches. Still, they are the de facto standard for some of the hardest challenges in machine learning today, and they're unlikely to be knocked off that perch any time soon.




Data Science & Cloud nerd with a passion for making complex topics easier to understand. All writings and associated errors are my own doing, not work-related.
