Physics is an experimental science in which we make measurements and try to make sense of the data taken. Because of natural variations in the phenomenon and because of the measurement techniques, the numbers we collect generally vary. We therefore need to draw conclusions by looking at an ensemble of data points. To do so, we look for patterns and we use statistics.
In this class, we will not assume any prior knowledge of statistics and we will not require detailed statistical calculations. Instead, we will focus on a few important concepts and on using statistical tools to visualize our dataset and compute its important properties for us.
For example, here is a data set of the ages of students in an online course (the data are made up, just an example). Each student is marked as attending either the World Campus (WC) or the University Park (UP) campus.
Age (years) | Campus (WC or UP) |
---|---|
19 | up |
20 | up |
17 | up |
17 | up |
18 | up |
65 | up |
26 | up |
25 | wc |
40 | wc |
42 | wc |
65 | wc |
30 | wc |
27 | wc |
Given this set of data, it looks like the UP students are younger than the WC students, but it would be nice to be a bit more precise. Instead of computing all the statistics ourselves, we will use tools to do the work for us.
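To make "a bit more precise" concrete, here is a minimal Python sketch of the kind of calculation a statistics tool does behind the scenes. It is not part of StatKey and is not required for this class; the names `ages` and `summarize` are made up for this illustration.

```python
from statistics import mean, median

# Ages from the table above, grouped by campus.
ages = {
    "up": [19, 20, 17, 17, 18, 65, 26],
    "wc": [25, 40, 42, 65, 30, 27],
}

def summarize(values):
    """Return (count, mean, median) for a list of numbers."""
    return len(values), mean(values), median(values)

for campus, values in ages.items():
    count, avg, mid = summarize(values)
    print(f"{campus}: n={count}, mean={avg:.1f}, median={mid}")
```

For the UP group the median (19) is well below the mean (26) because the single 65-year-old student pulls the mean upward. The WC group has a mean of about 38.2 and a median of 35.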
In this class we will primarily use StatKey, a free JavaScript application that is used with the textbook "Statistics: Unlocking the Power of Data".
The first visual tool we will learn is the box plot, shown below.
In the video below, I will show how to create this image and what it means. On the right side, next to the image, is a list of statistical properties of the data. You do not need to know how to compute these numbers, but you should have a basic understanding of what they mean.
The box plot is a way to visualize all this information in one graph.
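A box plot is drawn from the five-number summary: minimum, first quartile (Q1), median, third quartile (Q3), and maximum. The sketch below computes these for the UP ages using Python's standard `statistics` module. Treat it as an illustration only: different tools use slightly different conventions for quartiles, so the Q1 and Q3 values reported by StatKey may not match exactly. The function name `five_number_summary` is an assumption made for this example.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    # The default "exclusive" method is one convention among several.
    q1, q2, q3 = quantiles(values, n=4)
    return min(values), q1, q2, q3, max(values)

# Ages of the University Park (UP) students from the table above.
up_ages = [19, 20, 17, 17, 18, 65, 26]
print(five_number_summary(up_ages))
```

With this convention, the UP ages give a minimum of 17, Q1 of 17, median of 19, Q3 of 26, and maximum of 65. The box spans Q1 to Q3, with a line at the median.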
The box plot also displays outliers. These are points that lie far outside the box, whose width is the interquartile range (Q3 - Q1) covering the middle half of the data. They are represented by an asterisk.
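One common convention for "far" is the 1.5 × IQR rule: a point is flagged if it lies more than 1.5 times the interquartile range below Q1 or above Q3. The sketch below applies that rule to the UP ages. Both the 1.5 multiplier and the quartile method are assumptions here, and the function name `find_outliers` is made up for this example. A borderline point can be flagged under one convention and not another.

```python
from statistics import quantiles

def find_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the Q1..Q3 box."""
    q1, _, q3 = quantiles(values, n=4)  # default "exclusive" quartile method
    iqr = q3 - q1                       # interquartile range
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Ages of the University Park (UP) students from the table above.
up_ages = [19, 20, 17, 17, 18, 65, 26]
print(find_outliers(up_ages))
```

For the UP group, Q1 = 17 and Q3 = 26, so the fences sit at 3.5 and 39.5, and the 65-year-old student is the only point flagged.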
In the following video, I will show how to create a box plot graph. You will need to do this in your lab.
Things to Remember
In the next video, I will show how outliers can have a large impact on data and why it is sometimes better to exclude them from the analysis.