Sep 21 · Reduce, Reuse, and Recycle? · Recognizing Rewards and Risks of Redeploying Prejudiced Pretrained Ptransformers
In a couple of previous posts I’ve talked about the level of data crunching that has to happen in order to train a Transformer Large Language Model (LLM). This turns out to be an…
Large Language Models · 4 min read

Sep 13 · Tokenization Level 2 (or so…) · Demystifying the Magic Behind Large Language Models
When I first started reading/watching people in the field talking about complex encoding and decoding systems for NLP (including transformers and various other neural network architectures), if I’m being honest, it sounded (still sounds?) like magic. …
Tokenization · 4 min read

Aug 28 · Hello Large Language Models · An Initial Foray into Bloom using Google Colab
As part of my Master’s program Capstone project, I get to explore how to implement Large Language Models (LLMs). …
Bloom · 3 min read

Aug 28 · Deal Part Deux: A Simple Explanation · When Overthinking Produces the Simplest Explanation
In my previous post on the Monty Hall Problem, I walked through the logic and statistics behind the 66% probability of winning by switching your answer when given the option to do so. Math, examples, and code stuffs — including examples of tests I…
Monty Hall Problem · 4 min read

Aug 27 · Let’s Make a Deal! · Making Sense of the Monty Hall Problem
The Monty Hall Problem is one of those proof points that illustrates how bad humans (myself included) are at statistical thinking. The quick version is this: you are playing a game with three options, often illustrated as doors. …
Monty Hall Problem · 6 min read
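The roughly two-thirds advantage of switching that both Monty Hall posts discuss is easy to check empirically. Here's a minimal simulation sketch (an illustrative stand-in, not the code from either post):

```python
import random

def play(switch: bool, trials: int = 100_000) -> float:
    """Simulate the Monty Hall game and return the fraction of wins."""
    wins = 0
    for _ in range(trials):
        prize = random.randrange(3)    # door hiding the prize
        choice = random.randrange(3)   # contestant's first pick
        # Host opens a door that is neither the pick nor the prize.
        opened = next(d for d in range(3) if d != choice and d != prize)
        if switch:
            # Switch to the one remaining unopened door.
            choice = next(d for d in range(3) if d != choice and d != opened)
        wins += (choice == prize)
    return wins / trials

# Switching wins about 2/3 of the time; staying wins about 1/3.
print(f"switch: {play(True):.3f}  stay: {play(False):.3f}")
```

Running it a few times makes the counterintuitive result concrete: the host's forced reveal is what moves the odds in favor of switching.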

Published in MLearning.ai · Jun 14 · My Personal Chinese Room · The Blurring Line Between Biological and Digital Intelligence
If you’re not familiar with the Chinese room thought exercise, here’s the tl;dr — if a computational program can take an input and produce the exact same output that a knowledgeable, typical human might produce, is it actually thinking? …
Artificial Intelligence · 4 min read

Jun 3 · A Quick Lesson in Overhead · When Going Faster Slows You Down
Usually when building a machine learning algorithm from scratch (like we do in the Cornell Machine Learning program), you want to employ vectorized / matrixed math operations everywhere you possibly can in order to speed up performance. The reason for this is that it’s…
Performance Testing · 4 min read
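The trade-off this post gestures at — fast compiled operations win on big inputs, but their fixed overhead can dominate on tiny ones — can be sketched in plain Python by pitting an interpreted loop against the C-implemented `sum()` built-in (standing in here for a vectorized call; this is an illustrative sketch, not the benchmark from the post):

```python
import timeit

def loop_sum(xs):
    """Interpreted Python loop: no setup cost, but slow per element."""
    total = 0
    for x in xs:
        total += x
    return total

tiny = [1, 2]                  # overhead-dominated workload
big = list(range(100_000))     # throughput-dominated workload

for label, data in [("tiny", tiny), ("big", big)]:
    t_loop = timeit.timeit(lambda: loop_sum(data), number=1_000)
    t_builtin = timeit.timeit(lambda: sum(data), number=1_000)
    # On the big list the C-level sum() wins decisively; on the tiny
    # list, per-call overhead narrows or erases its advantage.
    print(f"{label}: loop={t_loop:.4f}s  builtin={t_builtin:.4f}s")
```

The exact timings vary by machine, but the shape of the result is the point: "faster" primitives only pay off once the work per call outweighs the cost of making the call.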

Jun 1 · A Token Gesture · Natural Language Processing — Defining Tokens
I have what is now an embarrassing confession to make. Before I started deeply exploring the math, statistics, and theory behind data science, I had a notion that predictive analytics — i.e. taking tabular data and using it to predict the future — was…
NLP · 4 min read

May 17 · Another Milestone: Coding Courses Complete · Rounding a Corner in my Master’s Degree Journey
I recently submitted the final exam for the High Performance Computing course in my Master’s Degree program in Health Data Science. While this represents completion of 70% of my overall coursework (assuming I pass, fingers crossed…) this was the final code-specific data…
Data Science · 2 min read

Published in MLearning.ai · Apr 30 · And the Winner Is… · A Clear Answer to Which Machine Learning Approach is the Best
A commonly asked question among those studying machine learning for the first time, especially after they have reviewed several different machine learning algorithms, is “Which one is the best one?” or alternatively, “How do I know which approach to…
Data Science · 5 min read