Data Science - Statistics Percentiles

25%, 50% and 75% - Percentiles

A percentile is a number at or below which a given percentage of the values in a data set fall.

Let us explain it with some examples, using Average_Pulse.

  • The 25% percentile of Average_Pulse means that 25% of all the training sessions have an average pulse of 100 beats per minute or lower. If we flip the statement, it means that 75% of all the training sessions have an average pulse of 100 beats per minute or higher.
  • The 75% percentile of Average_Pulse means that 75% of all the training sessions have an average pulse of 111 beats per minute or lower. If we flip the statement, it means that 25% of all the training sessions have an average pulse of 111 beats per minute or higher.
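
The heading also names the 50% percentile, which is the median: half of the values lie at or below it. The short sketch below computes all three percentiles at once. The array sample_pulse is a made-up set of values used only for illustration; it is not the tutorial's health data set.

```python
import numpy as np

# Hypothetical average-pulse values (illustration only, not the real data set)
sample_pulse = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120])

# np.percentile accepts a list of percentages and returns one value per entry
q25, q50, q75 = np.percentile(sample_pulse, [25, 50, 75])

print(q25, q50, q75)

# The 50% percentile is the same number as the median
print(q50 == np.median(sample_pulse))
```

For these nine evenly spaced values, the three percentiles come out as 90, 100 and 110, and the 50% percentile matches the median.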

Task: Find the 10% percentile for Max_Pulse

The following example shows how to do it in Python:

Example

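import numpy as np

# full_health_data is assumed to be the pandas DataFrame loaded earlier in the tutorial
Max_Pulse = full_health_data["Max_Pulse"]

# Compute the value at or below which 10% of the Max_Pulse observations fall
percentile10 = np.percentile(Max_Pulse, 10)
print(percentile10)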
  • Max_Pulse = full_health_data["Max_Pulse"] - Isolates the variable Max_Pulse from the full health data set.
  • np.percentile(Max_Pulse, 10) - Computes the 10% percentile of Max_Pulse.

The 10% percentile of Max_Pulse is 120. This means that 10% of all the training sessions have a Max_Pulse of 120 or lower.
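
One way to sanity-check this reading of a percentile is to count how many values actually lie at or below it. The sketch below does that on a made-up array, example_max_pulse, holding the 100 integers from 101 to 200. These values are an assumption chosen to make the arithmetic easy to follow, so the result differs from the 120 reported for the real data set.

```python
import numpy as np

# Hypothetical Max_Pulse values: the integers 101, 102, ..., 200 (illustration only)
example_max_pulse = np.arange(101, 201)

# 10% percentile, using NumPy's default linear interpolation
p10 = np.percentile(example_max_pulse, 10)

# Fraction of values that are at or below the percentile
share_at_or_below = np.mean(example_max_pulse <= p10)

print(p10)
print(share_at_or_below)
```

Here the 10% percentile is roughly 110.9, and exactly 10 of the 100 values (101 to 110) lie at or below it. With small data sets or many tied values the share is usually only approximately 10%.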
