D1.5 Determine the impact of adding or removing data from a data set on a measure of central tendency, and describe how these changes alter the shape and distribution of the data.

Activity 1: Median and Mean


Pose the following statements to the students and let them explore them in teams. Afterwards, pair up the teams so that they can share their findings. 

  • Even if a large number is part of the set of data, it will not affect the median. Show an example.
  • Would this be the case for the mean? Why?
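Teams that want to check their reasoning with a quick calculation could use something like the sketch below. It relies on Python's standard `statistics` module; the data values are invented for illustration and do not come from the source.

```python
from statistics import mean, median

# Hypothetical data set (values invented for illustration).
data = [2, 3, 5, 6, 9]

# Same set, but the largest value is replaced by a very large number.
large_replaced = [2, 3, 5, 6, 100]

# Same set, with a very large number added as an extra value.
large_added = data + [100]

for label, values in (
    ("original", data),
    ("largest value replaced by 100", large_replaced),
    ("100 added", large_added),
):
    print(f"{label}: mean = {mean(values):.1f}, median = {median(values)}")
```

Comparing the three lines of output lets students see how far each measure moves when a large value is part of the set.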

Source: Making Math Meaningful, Marian Small, p. 575.

Activity 2: Mean 


Ask students to choose a mean for a given situation, then imagine a data set that would produce that mean.

Example

Present students with the following scenario:

At a field hockey tournament, six friends scored an average of four goals each. How many goals might each friend have scored? Explain your set of numbers and compare it with a peer's.

Facilitate a mathematical exchange to highlight the different strategies of the students and share their learning.

This type of exercise helps students understand what a mean is and reveals how well they understand the concept.

Variation: Incorporate averaging into decimal number lessons by replacing 4 goals with 4.5 goals, for example.

Source: translated from L'@telier - Ressources pédagogiques en ligne (atelier.on.ca).

Explore and discuss with students the effect of adding one more data value (another player's goal count), such as 10, 1, or 4, and how this addition would affect the graph of the data.
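The effect of each added value can be checked with a short calculation. The sketch below assumes one possible set of six goal counts with a mean of 4 (the list `goals` is hypothetical; any six values that total 24 would work) and recomputes the mean after adding 10, 1, or 4.

```python
from statistics import mean

# Hypothetical goal counts for the six friends (invented for illustration).
# A mean of 4 over six players means the total must be 6 * 4 = 24.
goals = [1, 2, 4, 4, 6, 7]
assert sum(goals) == 6 * 4

print(f"original mean: {mean(goals):.2f}")

# Add one extra data value at a time and recompute the mean.
for extra in (10, 1, 4):
    new_mean = mean(goals + [extra])
    print(f"add {extra}: mean becomes {new_mean:.2f}")
```

A value above the mean pulls the mean up, a value below it pulls the mean down, and a value equal to the mean leaves it unchanged.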

Activity 3: Mean


Have students create two data sets such that they both have the same mean, and in one set the values are close to the mean and in the other set the data values are far from the mean; for example: 5; 5.7; 6.3 compared to 1.3; 5.8; 9.9.

Which set does the mean best describe and why?

The set whose values are close to the mean. For the other set, the mean alone would give no sense of how spread out the data are.
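The comparison can be made concrete by computing the mean and the range of each example set. This is a minimal sketch using the two sets given above; the variable names are only illustrative.

```python
from statistics import mean

# The two example sets from the activity.
close_set = [5, 5.7, 6.3]
spread_set = [1.3, 5.8, 9.9]

for name, values in (("close to the mean", close_set),
                     ("far from the mean", spread_set)):
    # Range = largest value minus smallest value.
    value_range = max(values) - min(values)
    print(f"{name}: mean = {mean(values):.2f}, range = {value_range:.1f}")
```

Both sets share the same mean, but the range shows how differently the values are spread around it.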

Source: Making Math Meaningful, Marian Small, p. 575.

Activity 4: Median and Mean


Present the following data set to students.

Number of Students in Groups

Group A   Group B   Group C   Group D   Group E   Group F
3         5         4         7         5         6

  • The mean number of students in the six groups is 5. Do a think-pair-share by asking students what the mean of 5 represents.

    Students can see that if we combine all the groups and share the total of 30 students evenly among the 6 groups, each group gets 5 students. A mean of 5 does not necessarily mean that every group has 5 students. Some groups may have more or fewer.

  • The median of this set is 5: ordered from least to greatest (3, 4, 5, 5, 6, 7), the two middle values are both 5. Do another think-pair-share by asking students what the median of 5 represents.

  • Does the mean of 5 appropriately represent the number of students in the groups?

    Yes, because every group has close to 5 students, and two groups have exactly 5.

  • Does the median of 5 appropriately represent the number of students in the groups?

    Yes, because half of the groups have 5 or fewer students, half have 5 or more, and every group is within 2 students of the median.

  • If a group of 30 students were added, and group A, B, or D were removed, how would this affect the mean? the median? the graph? Why?

    If a group of 30 students were added, the median would not change: it would still be 5. The mean, however, would rise from 5 to about 8.6 (60 ÷ 7). A mean of 8.6 would no longer be representative of the number of students in the groups, since six of the seven groups have 7 or fewer students.

Point out that when one or two extreme values are added to a data set, the median remains a particularly good measure of central tendency because extreme values have little or no effect on it. This is not the case with the mean. Discuss this with students to highlight what the median and the mean each represent.
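The changes discussed in this activity can be verified with a short script. The sketch below uses the group sizes from the table and treats each change separately (adding a group of 30, or removing Group A, B, or D), which is only one possible reading of the question; the helper `describe` is a hypothetical name introduced for illustration.

```python
from statistics import mean, median

# Group sizes from the table "Number of Students in Groups".
groups = {"A": 3, "B": 5, "C": 4, "D": 7, "E": 5, "F": 6}
sizes = list(groups.values())

def describe(label, values):
    """Print the mean and median of a list of group sizes."""
    print(f"{label}: mean = {mean(values):.2f}, median = {median(values)}")

describe("six groups", sizes)
describe("group of 30 added", sizes + [30])

# Remove one group at a time and recompute both measures.
for removed in ("A", "B", "D"):
    remaining = [n for name, n in groups.items() if name != removed]
    describe(f"Group {removed} removed", remaining)
```

Students can compare how much each measure moves in each case and relate that to how the graph would change.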

Source: adapted from Making Math Meaningful, Marian Small, p. 575.

Activity 5: Mode, Median and Mean


Present students with the following data set:

Number of Minutes of Exercise in a Day

Student A   Student B   Student C   Student D   Student E   Student F
90          90          90          90          90          30

  • Which measure of central tendency would be most appropriate to describe this data set: mode, median or mean? Why?

    The mode, 90, since it occurs five times out of six. The median is also appropriate, since 90 is the middle value of the ordered data. The mean of 80 is not the best indicator of how many minutes students exercise, since no student exercised for 80 minutes.

  • How does Student F's value of 30 affect the data set?

    It affects the mean. Because 30 is much lower than the other values, which are all 90 minutes, it lowers the mean and gives the false impression that students typically exercised for 80 minutes in a day.

  • If the data from student F were removed, how would this affect the mode, median and mean?
  • If we added another student who exercised for 90 minutes, 30 minutes, or 5 minutes, what impact would this have on the mode, median and mean?
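These what-if questions can be explored by recomputing the three measures for each change. The sketch below uses the exercise data from the table and Python's standard `statistics` module (`multimode` requires Python 3.8 or later); the helper `summarize` is a hypothetical name introduced for illustration.

```python
from statistics import mean, median, multimode

# Minutes of exercise for Students A to F, from the table.
minutes = [90, 90, 90, 90, 90, 30]

def summarize(label, values):
    """Print the mode(s), median, and mean of a list of values."""
    print(f"{label}: mode = {multimode(values)}, "
          f"median = {median(values)}, mean = {mean(values):.1f}")

summarize("original data", minutes)

# Student F is the last entry, so dropping it removes the value 30.
summarize("Student F removed", minutes[:-1])

# Add one extra student at a time.
for extra in (90, 30, 5):
    summarize(f"student with {extra} minutes added", minutes + [extra])
```

Students can use the output to discuss which measures stay fixed and which shift as values are removed or added.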

Source: adapted from Making Math Meaningful, Marian Small, p. 575.