Hypothesis Writing and Interpretation for Data Science Explained

dolcikey
7 min read · Aug 18, 2020
Image caption: Statistics is a crucial tool for data scientists.

In data science, statistical hypothesis testing is one of the most important tools we use. Generally, we want to know whether a measure we have has a statistically significant effect on our target, and if it does, we may want to use that feature in a machine learning model. Statistical testing is also useful for feature selection, especially when we have many features and are unsure which ones really matter to our target variable (the variable of interest, i.e. what we are looking to predict).
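To make "statistically significant effect on our target" a bit more concrete, here is a minimal sketch of one way to check it. Everything in it is an assumption for illustration: the numbers are made up, the helper name `permutation_p_value` is hypothetical, the 0.05 threshold is just the conventional choice, and a permutation test is only one of many tests you could pick. It splits the target by a binary feature and asks how often a random reshuffle of the labels produces a gap in means at least as large as the one we observed.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Hypothetical helper for illustration, not taken from any library.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffling the pooled values simulates "the feature does not matter".
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            extreme += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up target values, split by a hypothetical binary feature.
target_with_feature = [12.1, 13.4, 11.8, 14.0, 13.1, 12.7, 13.9, 12.5]
target_without_feature = [10.2, 11.0, 9.8, 10.9, 10.4, 11.3, 10.1, 10.7]

alpha = 0.05  # assumed significance threshold
p_value = permutation_p_value(target_with_feature, target_without_feature)

print(f"p-value: {p_value:.4f}")
if p_value < alpha:
    print("The feature looks worth considering for the model.")
else:
    print("No evidence that the feature affects the target.")
```

A permutation test is used here mainly because it needs nothing beyond the standard library and makes few assumptions about the data. In practice you would pick the test that matches your data types and sample sizes.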

Before we run the statistical tests themselves, we need to know what we are testing. This is where the hypothesis testing setup comes into play. Sometimes this is very intuitive, and sometimes not so much, especially in a bootcamp environment where you are learning all of this in a day or two.

In this article, we are going to focus on how to write the hypotheses out. If you want to know which test to use when, I will link an amazing reference by Jagandeep Singh here. You’re welcome.

Let’s write some hypothesis tests!

Null Hypothesis

The null hypothesis is often written as H0. The best way to remember how to write the null hypothesis is that it’s a glass-half-empty sort of statement. It is the Eeyore of data science. There is…
