There are 11 types of lies. (In binary…)

When I announced my blog project on LinkedIn, I did so with the following post:

I’m a man of my word.

I’ve since completed nine hours of coursework in my M.S. in Health Data Science Program, and have learned a significant number of things in the process. However, one point that has been driven home for me over and over again is that you cannot trust statistical models. Well, that’s not exactly right. It is more accurate to say “you should not blindly trust statistical models.” Early on in one of my first classes, I put together a…


Understanding Results vs. Predicting the Future

I recently completed a summer course as part of my Master’s in Health Data Science program that focused on Inferential Modeling. It was a really informative course that opened my eyes to a whole side of data science that I had never really been exposed to in my previous work. Since I assume there are readers in the same boat, I will try to explain it as I understand it today, where it fits, and some of the cool capabilities it enables.

In predictive analytics / machine learning, another term for “prediction” is “inference.”…


Invoking Intentional Inefficiency to Improve Inference

In a previous post I briefly touched on the problem of overfitting, loosely defined as a machine learning model memorizing its training data set: it predicts that data with high accuracy but performs poorly when presented with new data, a phenomenon known as variance. The post discussed the Random Forest approach using bootstrap aggregation to address this issue, but it raised the question: “Why does intentionally producing lower-quality data sets and averaging across their results produce better predictions?”

Reality, it turns out, is messy, so intentionally introducing…
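As a rough illustration of the bootstrap aggregation idea mentioned in this excerpt, here is a minimal Python sketch. It is not a Random Forest: the base model is a simple nearest-neighbor lookup, and the data points and function names are made up for the example. Each model is trained on a resampled (and therefore "lower-quality") copy of the training data, and the predictions are averaged.

```python
import random
from statistics import mean


def bootstrap_sample(data, rng):
    """Draw len(data) points from data with replacement."""
    return [rng.choice(data) for _ in data]


def nearest_neighbor_predict(train, x):
    """Predict y for x by copying the y of the closest training x."""
    closest = min(train, key=lambda point: abs(point[0] - x))
    return closest[1]


def bagged_predict(train, x, n_models=50, seed=0):
    """Average nearest-neighbor predictions over bootstrap resamples."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    predictions = [
        nearest_neighbor_predict(bootstrap_sample(train, rng), x)
        for _ in range(n_models)
    ]
    return mean(predictions)


# Hypothetical (x, y) observations, invented for illustration only.
train = [(1, 2.3), (2, 3.6), (3, 6.5), (4, 7.4), (5, 10.4)]

single = nearest_neighbor_predict(train, 3.2)  # one model, memorizes one point
bagged = bagged_predict(train, 3.2)            # average over many resamples
print(single, bagged)
```

The single model simply repeats the y value of whichever training point is closest, noise included. The bagged version blends the answers from many slightly different training sets, which is the mechanism that tends to reduce variance.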


Deep Neural Networks and Why They Rule the World (Mostly)

A significant portion of machine learning approaches are linear in nature — i.e. they take a look at training data observations and try to find the slope and intercept of the line that best fits that data. For example, let’s assume we have a training data set that has a shape something like this:

It’s like an ink-blot test, but for data nerds. What shapes do you see?

While there are some outlying observations, the general trend for this data is that as x increases, y increases as well. Therefore, a linear model is an appropriate approach for inference. …
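To make "find the slope and intercept of the line that best fits that data" concrete, here is a minimal sketch using ordinary least squares, one common definition of "best fit." The data points are invented for illustration and are not the data from the post.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical observations with a roughly upward trend.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

Once the slope and intercept are known, predicting y for a new x is just `slope * x + intercept`.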


Finishing the Cornell Machine Learning Certificate

I recently completed the Cornell University Machine Learning Certificate program.

Pomp and Circumstance and All That Jazz

Coincidentally, at the same time I was completing the last course in this certificate program, at work I was also steeped in some intensive AI/ML training for Google Cloud SMEs. The cherry on top was I was (am) smack dab in the middle of my summer Inferential Modeling class as part of the Master’s program in Health Data Science. To say last week was a little… intense… would be an understatement. …


Building vs. Using Machine Learning Approaches

I’m about 30% through my summer course on Inferential Modeling, and in the latest set of lectures, I’m running into topics that I learned and wrote algorithms for as part of the Cornell Machine Learning Certificate program. The difference between the approaches is striking, and I’m absolutely loving it. In the Cornell program, you learn enough theory and mathematical principles to be able to code, from scratch, an entire machine learning algorithm, and over the course of two weeks, you might build a handful of functions that end up accomplishing a single task. …


Defining “Real” Data Science and AI/ML Expertise

What is a data scientist? I have been reading for years about the difficulty of defining exactly what qualifies someone to be one. In broad strokes, you need appropriate degrees of statistical knowledge, coding skills, and domain expertise. Quantifying those areas in any meaningful way (i.e. a definition that can be generally agreed upon) in the current data science landscape turns out to be next to impossible.

What I think of when I read a “real” data science post.

In some cases, a data scientist needs to have a Ph.D…


A Whirlwind Tour of the Past Few Weeks of AI/ML Learning

Things are getting busy — end of spring semester, beginning of summer semester in my M.S. Health Data Science Program, and since my last Cornell-specific update, I’ve finished two more classes and just have one to go which starts next week. That, plus some exciting related news at work (more later, perhaps) has meant a lot of time learning and not as much time available for writing — a quality problem to have, indeed! …


Why Good Data Engineers Will Always Have Work, Part Deux

Over the summer I’m taking an Inferential Modeling course using R. Our first assignment was to do some basic regression modeling on a public data set (link). I thought I would be all clever and code my R script to ingest the file as a first step, rather than download the static file and read it in locally — teacher’s pet and all, showing off my mad R skills.

I should have known…

If you download the file today, you’ll note that the column names are all nice and…


Health Data Science Statistics and Analytical Programming Final Project

For my Statistics and Analytical Programming final project, the goals were similar to my Python course final, but with different caveats and requirements. For starters, there were no requirements about reading in data from a certain number of different file formats. The only requirement on the data side was finding an interesting data set and doing some analysis on it. Therefore, I was able to take the work I had started here using public data on BigQuery, export it to GitHub, and just start chunking away at it. On the flip…

Jason Eden

Data Science, Big Data, & Cloud nerd with a focus on healthcare & a passion for making complex topics easier to understand. All thoughts are mine & mine alone.
