D1.1 Explain why percentages are used to represent the distribution of a variable for a population or sample in large sets of data, and provide examples.

Activity 1: Percentage Analysis


Present the following situation to the students.

The Sleep Academy collected data on average sleep duration from 10 000 young people aged 9 to 13, and then from 10 000 young people aged 14 to 17. The two relative frequency tables below present this data. What conclusions can you draw from comparing the two tables? Do relative frequencies expressed as percentages make the distribution of the data easier to understand than frequencies expressed as absolute values?

Average Sleep Time of 9-13 Year-Olds

| Number of Hours of Sleep | Relative Frequency |
| --- | --- |
| Between 6 and 7 hours | 1 % |
| Between 7 and 8 hours | 4 % |
| Between 8 and 9 hours | 15 % |
| Between 9 and 10 hours | 33 % |
| More than 10 hours | 47 % |

Average Sleep Time of 14-17 Year-Olds

| Number of Hours of Sleep | Relative Frequency |
| --- | --- |
| Between 6 and 7 hours | 9 % |
| Between 7 and 8 hours | 25 % |
| Between 8 and 9 hours | 33 % |
| Between 9 and 10 hours | 18 % |
| More than 10 hours | 15 % |

Explanations and Justifications

Comparing the two tables shows that young people aged 9 to 13 and young people aged 14 to 17 do not have the same sleep habits. In the 9 to 13 group, 47% sleep more than 10 hours, compared with only 15% of the 14 to 17 group. Only 1% of 9 to 13 year-olds sleep 6 to 7 hours, compared with 9% of 14 to 17 year-olds. Similarly, 25% of 14 to 17 year-olds sleep 7 to 8 hours, about 6 times the 4% of 9 to 13 year-olds who do. In the first table, the percentages rise with each increase in the number of hours of sleep, which suggests that young people aged 9 to 13 need a lot of sleep. In the second table, the percentages rise across the first three categories (9%, 25%, 33%) but drop considerably in the last two (18%, 15%). We can therefore assume that young people aged 9 to 13 need more sleep than young people aged 14 to 17. Finally, expressing the frequencies as percentages made it possible to analyze and compare the data quickly.
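
Because each group contains 10 000 respondents, every percentage in the tables can be converted back into a number of people. The following is a minimal Python sketch of that conversion, offered as an illustration only. The category labels (such as `"6-7 h"`) and the names `percent_to_count`, `pct_9_13`, and `pct_14_17` are hypothetical shorthand, not part of the original activity.

```python
# Relative frequencies (in percent) copied from the two tables above.
# The short labels are hypothetical shorthand for the table categories.
SAMPLE_SIZE = 10_000  # each age group has 10 000 respondents

pct_9_13 = {"6-7 h": 1, "7-8 h": 4, "8-9 h": 15, "9-10 h": 33, "10+ h": 47}
pct_14_17 = {"6-7 h": 9, "7-8 h": 25, "8-9 h": 33, "9-10 h": 18, "10+ h": 15}


def percent_to_count(percent, sample_size=SAMPLE_SIZE):
    """Convert a whole-number percentage of the sample into a head count."""
    return percent * sample_size // 100


counts_9_13 = {label: percent_to_count(p) for label, p in pct_9_13.items()}
counts_14_17 = {label: percent_to_count(p) for label, p in pct_14_17.items()}

# Print the two groups side by side, one category per line.
for label in pct_9_13:
    print(f"{label}: {counts_9_13[label]} vs {counts_14_17[label]} people")
```

With equal group sizes, one percentage point corresponds to 100 people in either group, which is why the percentages alone are enough to compare the two distributions.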

Ask students to explain why percentage frequency is used to represent the distribution of a variable from a population in large data sets.

Invite students to explain their answer with examples.

Possible Solutions

Frequencies expressed as percentages are well-suited to representing variables in a large data set. For example, in a study, 45 000 individuals were asked how they reduce stress: 29 344 people said they get physically active and 5678 people said they meditate. This data is much easier to interpret as percentages: 65% of people get physically active and 13% meditate to reduce stress.
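
As a worked check of these figures, each frequency can be divided by the total number of respondents and multiplied by 100, then rounded to the nearest whole percent:

\[\frac{29\ 344}{45\ 000} \times 100 \approx 65.2 \approx 65\ \% \qquad\qquad \frac{5\ 678}{45\ 000} \times 100 \approx 12.6 \approx 13\ \%\]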

Source: translated from En avant, les maths!, 7e année, ML, Données, p. 11-12.

Activity 2: Comparing Frequency and Relative Frequency Data


A municipality in Ontario is looking to invest in a community project in the coming year. In order to meet the needs of its population, the municipality surveyed 25 000 individuals and asked them: "What community project do you think should be implemented in the coming year to meet the growing needs of the population?" The two tables below summarize the results of the survey.

| Proposed Community Projects | Frequency |
| --- | --- |
| Build a community center for youth aged 12 to 18 | 5674 |
| Create a community vegetable garden | 7594 |
| Set up a local food bank | 2846 |
| Organize a market for local businesses | 6508 |
| Equip public washrooms with accessible toilets in all parks | 2378 |
| Total | 25 000 |

Choosing a Community Project to Implement

| Proposed Community Projects | Frequency | Relative Frequency (Fractions) | Relative Frequency (Decimal Numbers) | Relative Frequency (%) |
| --- | --- | --- | --- | --- |
| Build a community center for youth aged 12 to 18 | 5674 | \(\frac{5\ 674}{25\ 000}\) | 0.23 | 23 % |
| Create a community vegetable garden | 7594 | \(\frac{7\ 594}{25\ 000}\) | 0.30 | 30 % |
| Set up a local food bank | 2846 | \(\frac{2\ 846}{25\ 000}\) | 0.11 | 11 % |
| Organize a market for local businesses | 6508 | \(\frac{6\ 508}{25\ 000}\) | 0.26 | 26 % |
| Equip public washrooms with accessible toilets in all parks | 2378 | \(\frac{2\ 378}{25\ 000}\) | 0.10 | 10 % |
| Total | 25 000 | \(\frac{25\ 000}{25\ 000}\) | 1.00 | 100 % |
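
The three forms of relative frequency can be derived from the frequency column. The following Python sketch is one possible way to do so, assuming rounding to two decimal places and to the nearest whole percent. The shortened project names and the helper `relative_frequency` are hypothetical and introduced only for illustration.

```python
# Survey counts copied from the frequency table above.
# Project names are shortened (hypothetical labels) for readability.
frequencies = {
    "Community center for youth": 5674,
    "Community vegetable garden": 7594,
    "Local food bank": 2846,
    "Market for local businesses": 6508,
    "Accessible toilets in parks": 2378,
}
total = sum(frequencies.values())  # number of respondents


def relative_frequency(count, total):
    """Return the relative frequency as (fraction text, decimal, percent)."""
    fraction = f"{count}/{total}"
    decimal = round(count / total, 2)      # rounded to hundredths
    percent = round(100 * count / total)   # rounded to a whole percent
    return fraction, decimal, percent


table = {
    project: relative_frequency(count, total)
    for project, count in frequencies.items()
}

# Print one row per project: fraction, decimal number, percentage.
for project, (fraction, decimal, percent) in table.items():
    print(f"{project:<28} {fraction:>11} {decimal:>5.2f} {percent:>4} %")
```

Summing the percent column is a quick sanity check: the rounded percentages should add up to 100, or very close to it when rounding errors accumulate.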

Ask students to analyze the data in the two different tables. Ask them questions such as the following:

  • What do you notice when you compare the data in the two tables?
  • Which of the two tables do you think helps you better analyze the data? Why?
  • How can this data help you make decisions?

Example of a Response 

The first table presents frequencies, whereas the second table presents relative frequencies.

In the first table, the municipality counted the number of people who chose each project. The frequency distribution gives the number of individuals in each category and an overall idea of how the data is distributed; for example, 7594 people voted for a community vegetable garden, the highest value in the table. However, absolute values are difficult to compare, as they do not show the proportion of each category relative to the whole.

The second table shows the relative frequency of each category, expressed as a fraction, a decimal number, and a percentage. This makes the distribution of the data easier to interpret, because every value is expressed relative to the same whole (100%); for example, 7594 people say they want to start a community vegetable garden, while 6508 people say they want to organize a market for local businesses. These results are easier to interpret as percentages: 30% (almost \(\frac{1}{3}\)) of respondents want to start a community vegetable garden and 26% (approximately \(\frac{1}{4}\)) want to organize a market for local businesses.

Comparing frequencies, 468 more people prefer that the municipality set up a food bank than equip public washrooms with accessible toilets in all parks. By that measure, the food bank appears clearly more popular. However, only one percentage point separates the food bank (11%) from the accessible toilets (10%). Relative to the whole, the two choices are therefore about equally popular.
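
The contrast between the two ways of comparing can be checked with a short calculation. The following Python sketch is an illustration only; the variable names and the helper `to_percent` are hypothetical, and the counts are taken from the survey table.

```python
# Counts from the survey table (25 000 respondents in total).
food_bank = 2846
accessible_toilets = 2378
total = 25_000


def to_percent(count, total):
    """Relative frequency as a whole-number percentage."""
    return round(100 * count / total)


# Gap expressed as a number of people.
difference_in_people = food_bank - accessible_toilets

# Gap expressed in percentage points, using the rounded table values.
difference_in_points = to_percent(food_bank, total) - to_percent(accessible_toilets, total)

# Gap in percentage points before any rounding.
exact_gap = 100 * (food_bank - accessible_toilets) / total

print(difference_in_people, difference_in_points, round(exact_gap, 1))
```

Before rounding, the gap is closer to 1.9 percentage points, so the rounded table values slightly understate it. Either way, it remains small relative to the 25 000 respondents.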

In conclusion, relative frequency data helps to make better-informed predictions. For example, looking at the data, it is possible to predict that the municipality in Ontario will create a community vegetable garden in the coming year, since this project received the largest share of the votes (30%).