## Life is Like a Box of Samples, You Never Know What You’re Gonna Get

Good ol’ Forrest Gump. He is right: you never know what you’re going to get, and that is why we use resampling techniques in data science! In this post, I will review two popular resampling techniques for predictive models and give examples of how to implement them in R. But first, let’s review the basics. What does “resample” mean? Often we have only one dataset to examine and use to build prediction models. Let’s call this dataset our sample. Resampling…

## Using a Forest Plot to Display Regression Results

How many times have you sat through a presentation and stared blankly at a table of regression results? If you have been to my presentations, it has been many, many times. I was thinking about a better and more intuitive way to present regression results that also gives a sense of uncertainty. In this post, I show how to visualize OLS regression results via a forest plot. The nice thing about forest plots is that they…

## Visualizing Body Mass Index by Percent of the Federal Poverty Level

It is often assumed that low-income populations have worse health outcomes. This is correct for many health outcomes, but for one, it isn’t as clear: Body Mass Index (BMI). BMI doesn’t show a clear association with income. Given that we are in the midst of an obesity epidemic, and BMI is the primary measure of obesity, this is quite interesting. I wanted to take a look at some recent data to visualize the association. I created…

## Not Feeling So Great? A Random Forest Analysis of Demographics and Health

Each of us has an intuition about our own health and understands when we are feeling great, or not so great. Indeed, self-perceived health status even predicts mortality. I started to wonder how well basic demographics, such as age, gender, and income, predict self-perceived health status. We know demographics strongly correlate with lots of other health outcomes, but I have never seen them used to predict self-perceived health status. In this post I use a Random Forest classifier to predict…