
This chapter introduces you to statistics – the study of collecting, organizing, analyzing, and interpreting data. You will learn about statistical questions, representative values like mean and median, data visualization using dot plots and double bar graphs, and how to analyze real-world data patterns.
Of Questions and Statements
Understanding Statistical Thinking
- Your teacher tells you that they are meeting two of their childhood friends this evening: one is 5 feet tall and the other is 6 feet tall.
- You might guess that the 5-foot-tall person is a woman and the 6-foot-tall person is a man.
- There is a chance you are wrong, but experience tells us that 5-foot-tall men and 6-foot-tall women are rare.
- This is a simple example of statistical thinking.
Examples of Statistical Statements
We regularly come across statements like:
- “Jemimah’s batting has been very consistent over the past year. We can expect a century from her in tomorrow’s match.”
- “I take about 15 minutes to cycle from school to home.”
- “I think my pen might last for 2 more weeks; it is time to get a new one soon.”
- “The population of their village has reduced by about 100 in the last decade.”
- “Since I started to eat fruits and vegetables more frequently, I am able to run 2 km more each day.”
- “David spends about 7 hours daily in the school.”
What is a Statistical Statement?
- A statistical statement is a claim or summary about some phenomenon, expressed in terms of numerical values, proportions, probabilities, or predictions.
What is a Statistical Question?
- A statistical question is a question that can be answered by collecting data.
- Example: “How tall are Grade 7 students in our school?” is a statistical question because not all Grade 7 students have the same height, but we can collect data, analyze it, and make conclusions.
- Example: “Typically, are onions costlier in Yahapur or Wahapur?” is also a statistical question because prices vary over time.
Question: Which of the following are statistical questions?
(a) What is the price of a tennis ball in India?
- Solution: Not a statistical question because it expects a single fixed answer rather than data collection and analysis.
(b) How old are the dogs that live on this street?
- Solution: Yes, a statistical question because different dogs have different ages and data needs to be collected.
(c) What fraction of the students in your class like walking up a hill?
- Solution: Yes, a statistical question because we need to collect data from students to find the fraction.
(d) Do you like reading?
- Solution: Not a statistical question because it expects a simple yes/no answer from one person.
(e) Approximately how many bricks are in this wall?
- Solution: Can be a statistical question if we estimate by measuring and analyzing patterns.
(f) Who was the best bowler in the match yesterday?
- Solution: Can be a statistical question if we use data like wickets taken, runs conceded, economy rate, etc.
(g) What was the rainfall pattern in Barmer last year?
- Solution: Yes, a statistical question because it requires collecting and analyzing rainfall data over months.
Definition of Statistics
- The term statistics refers to the study of collecting, organizing, analyzing, interpreting, and presenting data.
Representative Values
Comparing Cricket Performances
Question: The runs scored by Shubman and Yashasvi in a cricket series are given in the table below. Who do you think performed better?
| Match | Match 1 | Match 2 | Match 3 | Match 4 |
|---|---|---|---|---|
| Shubman | 0 | 17 | 21 | 90 |
| Yashasvi | 67 | 55 | 18 | 35 |
Different perspectives:
- Shreyas says: “Both their performances are similar since Yashasvi scored more in the first and second matches, whereas Shubman scored more in the third and fourth.”
- Vaishnavi says: “I think Shubman performed better because he scored the highest number of runs in a match — 90!”
- Shreyas says: “No! Yashasvi batted better since the total number of runs he made is 175, while Shubman made only 128.”
- Vaishnavi says: “Oh! Also, Yashasvi’s batting is more consistent — the difference between his maximum score and minimum score is lower.”
Another series comparison:
| Match | Match 1 | Match 2 | Match 3 | Match 4 | Match 5 |
|---|---|---|---|---|---|
| Shubman | 23 | 07 | 10 | 52 | 18 |
| Yashasvi | 26 | 53 | 02 | – | 15 |
- Vaishnavi says: “Here, Shubman performed better since his total is 110 runs, while Yashasvi’s total is 96 runs.”
- Shreyas says: “But Yashasvi made 96 runs in 4 matches and Shubman made 110 runs in 5 matches.”
Can a Single Number Represent a Group?
- It is often not simple to compare two groups of numbers and clearly say that one is better than the other.
- Can we represent Shubman’s or Yashasvi’s batting in this series with one number?
- The total is one way, but if the group sizes are different, then the total may not be an appropriate measure to compare.
Introducing the Average (Arithmetic Mean)
- A representative number for the group can be found by balancing out the highs and lows.
- We can add up the runs scored in all the matches and divide the total by the number of matches played.
- We call this value the average or arithmetic mean of the given data.
Formula:
Average number of runs scored by a player in a match
= [Total runs scored by the player in all the matches] ÷ [Number of matches played]
Calculations:
- Average number of runs scored by Shubman in a match = 110 ÷ 5 = 22 runs.
- Average number of runs scored by Yashasvi in a match = 96 ÷ 4 = 24 runs.
- In this series, Yashasvi’s average number of runs is higher than Shubman’s.
Definition of Mean
Mean = Sum of all the values in the data ÷ Number of values in the data
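The definition can be sketched in a few lines of Python. The function name `mean` and the variable names are illustrative choices, not part of the chapter. The scores are those of the second series, with the match Yashasvi did not play left out.

```python
def mean(values):
    """Return the arithmetic mean: sum of the values divided by their count."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Runs from the second series; Yashasvi did not play Match 4,
# so that match is simply absent from his list.
shubman = [23, 7, 10, 52, 18]
yashasvi = [26, 53, 2, 15]

shubman_mean = mean(shubman)    # 110 runs over 5 matches
yashasvi_mean = mean(yashasvi)  # 96 runs over 4 matches
print(shubman_mean, yashasvi_mean)
```

Dividing by the number of matches actually played is what makes the two players comparable even though their group sizes differ.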
Average as Fair-Share
Question: Shreyas and 4 of his friends have collected the following numbers of guavas: 3, 8, 10, 5, and 4. Parag and 5 of his friends have collected the following numbers of guavas: 5, 4, 6, 3, 4, and 8. Each group will share their guavas equally amongst themselves. In which group will each member get a bigger share of guavas?
Solution:
- Shreyas’s group has collected 3 + 8 + 10 + 5 + 4 = 30 guavas.
- Each member of Shreyas’s group gets 30 ÷ 5 = 6 guavas.
- Parag’s group has collected 5 + 4 + 6 + 3 + 4 + 8 = 30 guavas.
- Each member of Parag’s group gets 30 ÷ 6 = 5 guavas.
- The members of Shreyas’s group get 1 more guava each than the members of Parag’s group.
Question: Vaishnavi tracks the number of Hibiscus flowers blooming in her garden each day. The data for the last few days is 2, 7, 9, 4, 3. What is the average number of Hibiscus flowers blooming per day in Vaishnavi’s garden?
Solution:
- The average = (the total number of Hibiscus flowers bloomed) ÷ (number of days)
- = (2 + 7 + 9 + 4 + 3) ÷ 5
- = 25 ÷ 5
- = 5.
- On average, 5 Hibiscus flowers bloom daily.
Historical Note
- In ancient Indian mathematics, one of the terms used for the Arithmetic Mean is samamiti (mean measure): ‘sama’ means equal.
- Some terms used for the Arithmetic Mean in Indian texts include:
- samarajju (mean measure of a line segment) by Brahmagupta (628 CE)
- samīkaraṇa (levelling, equalising) by Mahāvīrācārya (850 CE)
- sāmya (equality, impartiality, equability towards) by Śrīpati (1039 CE)
- samamiti (mean measure) by Bhāskarācārya (1150 CE) and Gaṇeśa (1545 CE)
- The terminology shows that ancient Indian scholars perceived the Arithmetic Mean as the ‘common’ or ‘equalising’ value that is a representative measure of a collection of values.
Figure it Out
Question 1: Shreyas is playing with a bat and a ball. He counts the number of times he can bounce the ball on the bat before it falls to the ground. The data for 8 attempts is 6, 2, 9, 5, 4, 6, 3, 5. Calculate the average number of bounces of the ball that Shreyas is able to make with his bat.
Solution:
- Total bounces = 6 + 2 + 9 + 5 + 4 + 6 + 3 + 5 = 40
- Number of attempts = 8
- Average = 40 ÷ 8 = 5 bounces
Question 2: Try the activity above on your own. Collect data for 7 or more attempts and find the average.
Question 3: Identify a flowering plant in your neighbourhood. Track the number of flowers that bloom every day over a week during its flowering season. What is the average number of flowers that bloomed per day?
Question 4: Two friends are training to run a 100 m race. Their running times over the past week are given in seconds — Nikhil: 17, 18, 17, 16, 19, 17, 18; Sunil: 20, 18, 18, 17, 16, 16, 17. Who on average ran quicker?
Solution:
- Nikhil’s total time = 17 + 18 + 17 + 16 + 19 + 17 + 18 = 122 seconds
- Nikhil’s average = 122 ÷ 7 ≈ 17.43 seconds
- Sunil’s total time = 20 + 18 + 18 + 17 + 16 + 16 + 17 = 122 seconds
- Sunil’s average = 122 ÷ 7 ≈ 17.43 seconds
- Both have the same average time, so neither ran quicker on average.
Question 5: The enrolment in a school during six consecutive years was as follows: 1555, 1670, 1750, 2013, 2040, 2126. Find the mean enrolment in the school during this period.
Solution:
- Total enrolment = 1555 + 1670 + 1750 + 2013 + 2040 + 2126 = 11154
- Mean enrolment = 11154 ÷ 6 = 1859 students
Know Your Onions!
Question: The table shows the monthly price of onions, in rupees per kilogram (kg), at two towns. Where are onions costlier, according to you?
| Month | Yahapur | Wahapur |
|---|---|---|
| January | 25 | 19 |
| February | 24 | 17 |
| March | 26 | 23 |
| April | 28 | 30 |
| May | 30 | 38 |
| June | 35 | 35 |
| July | 39 | 52 |
| August | 43 | 60 |
| September | 49 | 42 |
| October | 56 | 39 |
| November | 59 | 53 |
| December | 44 | 42 |
Different perspectives:
- Khushboo: ‘I think Wahapur is costlier because it has the highest price of ₹60.’
- Nafisa: ‘I added the prices of all months in each location – Yahapur’s total is 458, whereas Wahapur’s total is 450.’
- Vishal: ‘Wahapur is costlier since it has 3 numbers in the 50s.’
- Sampat: ‘I compared the prices in each month in both locations. Prices in Yahapur are higher for 7 months, prices in Wahapur are higher for 4 months, and the prices are the same for 1 month. So, I feel Yahapur is costlier.’
- Jithin: ‘I noticed that the difference between the highest and lowest prices in Yahapur is 59 – 24 = 35, and in Wahapur it is 60 – 17 = 43.’
Ways to Describe and Compare Data
- Data can be described and compared by referring to its:
- Minimum value
- Maximum value
- Average value
- Sum total of all its values
- Difference between the maximum and minimum values (called the range)
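As a rough sketch, the five descriptions listed here can be computed together for the onion prices. The helper name `describe` and the dictionary keys are hypothetical names chosen for this illustration.

```python
def describe(values):
    """Summarise a data set by its minimum, maximum, mean, total and range."""
    total = sum(values)
    return {
        "minimum": min(values),
        "maximum": max(values),
        "mean": total / len(values),
        "total": total,
        "range": max(values) - min(values),
    }


# Monthly onion prices (rupees per kg), January to December.
yahapur = [25, 24, 26, 28, 30, 35, 39, 43, 49, 56, 59, 44]
wahapur = [19, 17, 23, 30, 38, 35, 52, 60, 42, 39, 53, 42]

yahapur_summary = describe(yahapur)
wahapur_summary = describe(wahapur)
for town, summary in (("Yahapur", yahapur_summary), ("Wahapur", wahapur_summary)):
    print(town, summary)
```

Each perspective in the discussion corresponds to one entry of the summary, which is why different students can reach different conclusions from the same table.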
Dot Plots
- Dot plots show data points as dots on a line, helping us visualize variability and patterns in data.
- Each dot represents one data value.
- The horizontal line shows the range of values in the data.
- The number of dots stacked vertically above a value gives the number of times that value occurs.
Features of dot plots:
- Does this visualization capture all the data presented in the tables earlier? Yes.
- Looking at it, can we tell the price of onions in Yahapur in the month of January? No, because the dot plot loses the original (month-wise) sequence of the values.
- However, it allows us to group the data however we wish and makes it easier to observe the variation in the data, where and how the data is clustered or spread out.
- We can easily see that the prices in Wahapur are more spread out than those in Yahapur.
- It is also easy to spot the highest and lowest values.
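A dot plot can be approximated in plain text. The sketch below turns the plot on its side: values run down the page and the dots for each value extend to the right. The function name `dot_plot_lines` and the use of `*` as the dot symbol are choices made for this illustration.

```python
from collections import Counter


def dot_plot_lines(values):
    """Build a sideways text dot plot: one row per integer value, one '*' per occurrence."""
    counts = Counter(values)
    return [
        f"{value:>3} | " + "*" * counts[value]
        for value in range(min(values), max(values) + 1)
    ]


# Monthly onion prices in Wahapur (rupees per kg).
wahapur = [19, 17, 23, 30, 38, 35, 52, 60, 42, 39, 53, 42]

wahapur_plot = dot_plot_lines(wahapur)
print("\n".join(wahapur_plot))
```

Rows with no stars are prices that never occurred. Keeping them in the output preserves the gaps, so clusters and spread stay visible, but the month in which each price occurred is lost, just as in the drawn dot plot.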
Question: Find the average price of onions at Yahapur and Wahapur.
Solution:
- Average price in Yahapur = 458 ÷ 12 = 38.17 rupees per kg
- Average price in Wahapur = 450 ÷ 12 = 37.50 rupees per kg
Data Can Spark Curiosity
- A statement such as, “The price of onions is ₹35 per kilo”, may not trigger any further questions.
- But looking at variations in data can spark one’s curiosity:
- Do the seasons affect the price of onions?
- Where are these two locations? Are they close to each other or far apart?
- What are the factors that determine the price of onions?
- How much do onion prices vary across shops in the same area?
- What other commodities might have similar patterns?
- How do the price fluctuations impact farmers, consumers, and the industry?
Key Insight: Observing and trying to make sense of data can reveal interesting things. It can also trigger our curiosity in different directions.
Averages Around Us
- The Arithmetic Mean is frequently used in statistics, mathematics, experimental sciences, economics, sociology, sports, biology, and diverse disciplines as a representative of data.
- It is popular partly because the definition of the arithmetic mean is simple and easy to understand.
Examples of statements involving averages:
- “The average rainfall per day in Jharkhand in the month of July is 37.2 mm.”
- “My scooty’s average mileage this year is about 45 kilometers per liter.”
- “Wheat yield averages 4.7 tonnes per hectare in Punjab vs. 2.9 tonnes per hectare in Bihar.”
- “Smartphone users check their phone 58 times a day on average.”
- “An average Indian citizen generates 0.45 kg of waste per day.”
- “3126 is the average number of Indian long films released annually between 2017 and 2024.”
Outliers and Medians
Does the Average Always Give a Reasonable Summary?
Height of a Family:
- Yaangba’s family: 169 cm, 173 cm, 155 cm, 165 cm, 160 cm, 164 cm.
- Poovizhi’s family: 170 cm, 173 cm, 165 cm, 118 cm, 175 cm.
Question: Find the average height of each family. Can we say that Yaangba’s family is taller than Poovizhi’s family?
Solution:
- Average height of Yaangba’s family = (169 + 173 + 155 + 165 + 160 + 164) ÷ 6 = 986 ÷ 6 = 164.3 cm
- Average height of Poovizhi’s family = (170 + 173 + 165 + 118 + 175) ÷ 5 = 801 ÷ 5 = 160.2 cm
- Although most members in Poovizhi’s family are taller, their family’s average height is less because one child is much younger and not as tall as the rest of the family.
- Their average height, 160.2 cm, is less than the heights of 4 out of 5 members.
- Here, the average doesn’t seem to represent the data very well.
Introducing the Median
- Another way to find a representative value is to sort the data and pick the number in the middle.
- This number is called the Median.
Finding the median height of Poovizhi’s family:
- Sort the heights: 118, 165, 170, 173, 175
- The middle number in this sorted data is 170.
- Therefore, the median height is 170 cm.
Finding the median height of Yaangba’s family:
- Sort the heights: 155, 160, 164, 165, 169, 173
- Since the median is the number in the middle, it will have an equal number of values less than it and greater than it.
- This data does not have a single middle number because it has an even number of values (6).
- In such cases, we take the average of the two middle numbers in the sorted data.
- Therefore, the median height of Yaangba’s family is (164 + 165) ÷ 2 = 164.5 cm.
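The two cases (odd and even number of values) can be sketched as one small function. The name `median` and the variable names are illustrative and not taken from the chapter.

```python
def median(values):
    """Return the middle value of the sorted data.

    With an even number of values there is no single middle value,
    so the mean of the two middle values is returned instead.
    """
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


poovizhi = [170, 173, 165, 118, 175]        # 5 values: one middle value
yaangba = [169, 173, 155, 165, 160, 164]    # 6 values: two middle values

poovizhi_median = median(poovizhi)
yaangba_median = median(yaangba)
print(poovizhi_median, yaangba_median)
```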
Comparison:
| Family | Mean | Median |
|---|---|---|
| Yaangba | 164.3 cm | 164.5 cm |
| Poovizhi | 160.2 cm | 170 cm |
- In Yaangba’s data, mean and median are close to each other in the absence of any outlier.
- In Poovizhi’s data, because of the outlier (118 cm), the mean is much lower than the median.
Question: In this case, does the median represent the heights of the families better than the average?
- Solution: Yes, the median represents the heights better, especially for Poovizhi’s family where there is an outlier.
What is an Outlier?
- In Poovizhi’s family, the height of the youngest child (118 cm) is quite different from the heights of the rest of the family.
- We call such a value an outlier.
- Outliers are values which significantly deviate from the rest of the values in the data.
Question: Find the mean and median in Poovizhi’s data without the outlier value 118. What change do you notice?
Solution:
- Without 118, the sorted heights are 165, 170, 173, 175
- Mean = (170 + 173 + 165 + 175) ÷ 4 = 683 ÷ 4 = 170.75 cm
- Median = (170 + 173) ÷ 2 = 171.5 cm
- Now the mean and median are close to each other and better represent the data.
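The comparison with and without the outlier can be checked with Python's standard `statistics` module. This is only a sketch; the variable names are illustrative.

```python
from statistics import mean, median

# Heights of Poovizhi's family in cm; 118 is the outlier.
heights = [170, 173, 165, 118, 175]
without_outlier = [h for h in heights if h != 118]

with_stats = (mean(heights), median(heights))
without_stats = (mean(without_outlier), median(without_outlier))

print("with outlier:    mean, median =", with_stats)
print("without outlier: mean, median =", without_stats)
```

Removing the single low value moves the mean by more than 10 cm, while the median moves by only 1.5 cm.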
Are You a Bookworm?
Question: After the summer vacation, a class teacher asked his class how many short stories they had read. Each student answered the number of stories read: 6, 30, 8, 2, 5, 12, 40, 10, 5, 8, 1. Find the mean and median number of short stories read. Before calculating them, can you guess whether the mean will be less than or greater than the median?
Solution:
- Sort the data: 1, 2, 5, 5, 6, 8, 8, 10, 12, 30, 40
- Total = 1 + 2 + 5 + 5 + 6 + 8 + 8 + 10 + 12 + 30 + 40 = 127
- Mean = 127 ÷ 11 = 11.55 stories
- Median = The middle value (6th value in sorted list) = 8 stories
- The mean is greater than the median because of the outliers 30 and 40.
- The median value 8 means that at least half of the class members have read 8 or more stories.
Question: Which of the values would you consider an outlier?
- Solution: 30 and 40 are outliers because they are significantly higher than the rest of the data.
Question: Find the mean and median in the absence of the outliers. What change do you notice?
- Solution: Without outliers 30 and 40:
- Data: 1, 2, 5, 5, 6, 8, 8, 10, 12
- Mean = (1 + 2 + 5 + 5 + 6 + 8 + 8 + 10 + 12) ÷ 9 = 57 ÷ 9 = 6.33 stories
- Median = 5th value = 6 stories
- Now the mean and median are much closer to each other.
Key Insight
- The average may not always be an appropriate representative of data that has outliers.
- A very high or a very low outlier can significantly impact the sum, thus affecting the average.
- In these cases, the median was not affected much by the outliers.
Are We on the Same Page?
Question: Do you read newspapers? Have you noticed how many pages a newspaper has on different days of the week?
The list below shows the number of pages for a particular newspaper from Monday to Sunday: 16, 18, 20, 22, 26, 16, 10.
Solution:
- Sorted data: 10, 16, 16, 18, 20, 22, 26
- Mean = (16 + 18 + 20 + 22 + 26 + 16 + 10) ÷ 7 = 128 ÷ 7 = 18.29 pages
- Median = 4th value = 18 pages
Observations on Mean and Median
- When the data is more balanced or uniformly spread out, the mean and the median appear to be close to each other.
- When the outlier is on the lower end, the mean appears to shift in that direction, i.e., mean < median.
- When the outlier is on the higher end, the mean appears to shift in that direction, i.e., mean > median.
Question: Discuss the effect on the mean and median when outliers are present on both sides. You may take some example data to examine and explain this.
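One way to explore this is to take a small data set and add outliers to it. The numbers below are hypothetical example data invented for this sketch; they do not come from the chapter.

```python
from statistics import mean, median

# Hypothetical example data: five closely spaced values.
core = [20, 21, 22, 23, 24]

datasets = {
    "no outlier": core,
    "low outlier": [2] + core,
    "high outlier": core + [42],
    "outliers on both sides": [2] + core + [42],
}

summary = {name: (mean(data), median(data)) for name, data in datasets.items()}
for name, (m, med) in summary.items():
    print(f"{name}: mean = {m:.2f}, median = {med}")
```

In this particular example the two outliers are equally far from the centre, so their effects on the mean cancel and the mean equals the median again. If one outlier were much farther out than the other, the mean would shift towards it, while the median would remain 22.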
Measures of Central Tendency
- Mean and Median are called measures of central tendency, i.e., the tendency of the values to pile up around a particular value.
- In other words, they represent the ‘centre’ of the data.
Of Ends and the Essence
- As we have just seen, the mean and the median can give different perspectives on the data.
- As part of analyzing data, it can also be valuable to look at the variability in the given data, i.e., its extremes (minimum and maximum values).
How Tall is Your Class?
Question: Suppose you are asked the question, “How tall is your class?” What would you say?
The table below shows the heights of students in a Grade 5 class in centimeters.
| Group | Heights (cm) |
|---|---|
| Boys | 147, 135, 130, 154, 128, 135, 134, 158, 155, 146, 146, 142, 140, 141, 144, 145, 150 |
| Girls | 143, 136, 150, 144, 154, 140, 145, 148, 156, 150, 150 |
Analysis using dot plots:
Whole class:
- Mean = 144.5 cm
- Median = 145 cm
Boys:
- Mean = 142.94 cm
- Median = 144 cm
Girls:
- Mean = 146.9 cm
- Median = 148 cm
Question: What can we infer from the dot plots and the central tendency measures?
Solution:
- The boys’ heights are more spread out and are between 128 and 158 cm.
- The girls’ heights lie between 136 and 156 cm.
- Both the tallest and shortest in the class are boys.
- Yet, the boys’ average height is less than the whole class average, and also less than the girls’ average height.
- We can say girls are taller than boys in this class. Of course, this doesn’t mean every girl is taller than every boy!
- For boys’ heights, mean < median (142.94 < 144) indicating a small influence of values on the lower side.
- For girls’ heights too, mean < median (146.9 < 148) indicating a small influence of values on the lower side.
Question: How many students are taller than the class’ average height?
Question: How many boys are taller than the class’ average height?
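Both counting questions can be answered directly from the table of heights. The following is a minimal sketch with illustrative variable names.

```python
# Heights in cm, copied from the table.
boys = [147, 135, 130, 154, 128, 135, 134, 158, 155,
        146, 146, 142, 140, 141, 144, 145, 150]
girls = [143, 136, 150, 144, 154, 140, 145, 148, 156, 150, 150]

whole_class = boys + girls
class_mean = sum(whole_class) / len(whole_class)

# Count the heights that are strictly greater than the class mean.
taller_students = sum(1 for h in whole_class if h > class_mean)
taller_boys = sum(1 for h in boys if h > class_mean)

print(class_mean, taller_students, taller_boys)
```

Subtracting the number of boys from the number of students gives the number of girls taller than the class mean.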
How Long is a Minute?
Question: Two groups of children were asked to estimate the length of 1 minute. They start by closing their eyes and then open when they think 1 minute has passed. The dot plots show after how many seconds the children opened their eyes. Discuss how well both the groups fared at this activity. Describe and compare the variability in data and their central tendency.
Group A:
- Mean = 58.21 seconds
- Median = 60 seconds
Group B:
- Mean = 59.28 seconds
- Median = 59.5 seconds
Observations:
- Group A has estimates more spread out compared to Group B.
- Both groups have mean and median close to 60 seconds (1 minute), showing they estimated reasonably well.
- Group B’s estimates are more clustered around 60 seconds.
Zero Median Runs Scored!
Question: In a cricket match, can a team’s median runs scored by a player be 0 but the team’s total score be 407/10?
Example from a match:
- In the 2nd test match, England scored 407/10 (Zak Crawley 19, Ben Duckett 0, Ollie Pope 0, Joe Root 22, Harry Brook 158, Ben Stokes 0, Jamie Smith 184, Chris Woakes 5, Brydon Carse 0, Josh Tongue 0, Shoaib Bashir 0, and 19 extras).
- Solution:
- Runs scored by players (excluding extras): 19, 0, 0, 22, 158, 0, 184, 5, 0, 0, 0
- Sorted: 0, 0, 0, 0, 0, 0, 5, 19, 22, 158, 184
- Median = 6th value = 0
- Total runs = 407 – 19 (extras) = 388
- Average = 388 ÷ 11 = 35.27 runs
- So yes, the median can be 0 even when the team’s total is high!
Zero vs. No Value
- Suppose a player scores 57, 13, 0, 84, —, 51, 27 in a series.
- Notice that the player played Match 3 and scored 0 runs whereas the player did not play Match 5.
- So, we consider the total number of matches to be 6 and not 7.
- We calculate their average runs scored per match as (57 + 13 + 0 + 84 + 51 + 27) ÷ 6.
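The difference between a score of 0 and a missing score can be made explicit in code. In this sketch, Python's `None` stands for the match that was not played; that representation is an assumption made for the illustration.

```python
# None marks Match 5, which the player did not play; 0 is a real score.
scores = [57, 13, 0, 84, None, 51, 27]

played = [s for s in scores if s is not None]
average = sum(played) / len(played)          # divide by 6 matches played

# Treating the missing match as a score of 0 would divide by 7 instead
# and understate the player's average.
wrong_average = sum(played) / len(scores)

print(len(played), round(average, 2), round(wrong_average, 2))
```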
Example:
- Sita has a mango tree in her backyard. The number of mangoes the tree gave every month over the last year, from January to December, is: 0, 0, 8, 24, 41, 16, 5, 0, 0, 0, 0, 0 respectively.
- If we want to find the mean or median number of mangoes per month, it would be appropriate to consider only the (summer) months when mangoes are expected to grow.
A Mean Foot
- In the early 1500s in Europe, the basic unit of land measurement was the rod, defined as 16 feet long.
- At that time, a foot meant the length of a human foot! But foot sizes vary, so whose foot could they measure?
- To solve this, 16 adult males were asked to stand in a line, toe to heel, and the length of that line was considered the 16-foot rod.
- After the rod was determined, it was split into 16 equal sections, each representing the measure of a single foot.
- In essence, this was the arithmetic mean of the 16 individual feet, even though the term ‘mean’ was not mentioned anywhere.
Figure it Out
Question 1: Find the median of onion prices in Yahapur and Wahapur.
Solution:
- Yahapur prices: 25, 24, 26, 28, 30, 35, 39, 43, 49, 56, 59, 44
- Sorted: 24, 25, 26, 28, 30, 35, 39, 43, 44, 49, 56, 59
- Median = (35 + 39) ÷ 2 = 37 rupees per kg
- Wahapur prices: 19, 17, 23, 30, 38, 35, 52, 60, 42, 39, 53, 42
- Sorted: 17, 19, 23, 30, 35, 38, 39, 42, 42, 52, 53, 60
- Median = (38 + 39) ÷ 2 = 38.5 rupees per kg
Question 2: Sanskruti asked her class how many domestic animals and pets each had at home. Some of the students were absent. The data values are: 0, 1, 0, 4, 8, 0, 0, 2, 1, 1, 5, 3, 4, 0, 0, 10, 25, 2, —, 2, 4. Find the mean and median. How would you describe this data?
Solution:
- Values (excluding —): 0, 1, 0, 4, 8, 0, 0, 2, 1, 1, 5, 3, 4, 0, 0, 10, 25, 2, 2, 4
- Total = 0+1+0+4+8+0+0+2+1+1+5+3+4+0+0+10+25+2+2+4 = 72
- Number of values = 20
- Mean = 72 ÷ 20 = 3.6 animals
- Sorted: 0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 4, 4, 4, 5, 8, 10, 25
- Median = (2 + 2) ÷ 2 = 2 animals
- Description: Most students have 0 to 4 animals. The value 25 is an outlier. The mean is affected by this outlier, making the median a better representative.
Question 3: Rintu takes care of a date-palm tree farm in Habra. The heights of the trees (in feet) are: 50, 45, 43, 52, 61, 63, 46, 55, 60, 55, 59, 56, 56, 49, 54, 65, 66, 51, 44, 58, 60, 54, 52, 57, 61, 62, 60, 60, 67. Fill the dot plot, and mark the mean and median. How would you describe the heights of these palm trees? Can you think of quicker ways to find the mean? How many trees are shorter than the average height?
Solution:
- Total = 50+45+43+52+61+63+46+55+60+55+59+56+56+49+54+65+66+51+44+58+60+54+52+57+61+62+60+60+67 = 1621 feet
- Number of trees = 29
- Mean = 1621 ÷ 29 ≈ 55.90 feet
- Sorted data: 43, 44, 45, 46, 49, 50, 51, 52, 52, 54, 54, 55, 55, 56, 56, 57, 58, 59, 60, 60, 60, 60, 61, 61, 62, 63, 65, 66, 67
- Median = 15th value = 56 feet
- Description: Heights are fairly evenly distributed between 43 and 67 feet, with most trees between 50 and 62 feet.
- Trees shorter than average height (55.90): 43, 44, 45, 46, 49, 50, 51, 52, 52, 54, 54, 55, 55 = 13 trees
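One possible quicker way to find the mean is to guess a convenient value near the centre and add up only the small differences from it. The guess of 55 and the name `assumed_mean` are choices made for this sketch; any guess gives the same final answer.

```python
# Heights of the date-palm trees in feet.
heights = [50, 45, 43, 52, 61, 63, 46, 55, 60, 55, 59, 56, 56, 49, 54,
           65, 66, 51, 44, 58, 60, 54, 52, 57, 61, 62, 60, 60, 67]

assumed_mean = 55  # a convenient guess, not the true mean

# Differences from the guess are small numbers that are easy to add.
deviations = [h - assumed_mean for h in heights]
quick_mean = assumed_mean + sum(deviations) / len(heights)

shorter = sum(1 for h in heights if h < quick_mean)
print(sum(deviations), round(quick_mean, 2), shorter)
```

The differences are small positive and negative numbers that largely cancel, so the addition is much lighter than summing 29 two-digit heights.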
Question 4: The daily water usage from a tap was measured. The usage in liters for the first few days is: 5.6, 8, 3.09, 12.9, 6.5, 12.1, 11.3, 20.5, 7.4.
(a) Can the mean or median daily usage lie between 25 and 30? Justify your claim using the meaning of mean and median.
Solution:
- No, because all values are less than 21. The mean is the average of all values, so it must be less than the maximum value (20.5). The median is the middle value when sorted, so it also cannot be greater than the maximum value.
(b) Can the mean or median be less than the minimum value or greater than the maximum value in a data set?
Solution:
- No, the mean and median always lie within the range of the data (between minimum and maximum).
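The claim can be checked on the water usage data with the standard `statistics` module. A check on one data set is an illustration, not a proof; the variable names are illustrative.

```python
from statistics import mean, median

# Daily water usage in liters.
usage = [5.6, 8, 3.09, 12.9, 6.5, 12.1, 11.3, 20.5, 7.4]

lowest, highest = min(usage), max(usage)
usage_mean = mean(usage)
usage_median = median(usage)

# Both representative values should lie between the extremes.
within_range = (lowest <= usage_mean <= highest
                and lowest <= usage_median <= highest)
print(round(usage_mean, 2), usage_median, within_range)
```

The general reason is that the total can never exceed (number of values) × maximum or fall below (number of values) × minimum, and the median is itself a data value or the mean of two data values.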
Question 5: The weights of a few newborn babies are given in kg. Fill the dot plot provided below. Analyze and compare this data.
| Boys | 3.5 | 4.1 | 2.6 | 3.2 | 3.4 | 3.8 |
| Girls | 4.0 | 3.1 | 3.4 | 3.7 | 2.5 | 3.4 |
Solution:
- Boys: Mean = (3.5 + 4.1 + 2.6 + 3.2 + 3.4 + 3.8) ÷ 6 = 20.6 ÷ 6 = 3.43 kg
- Sorted: 2.6, 3.2, 3.4, 3.5, 3.8, 4.1; Median = (3.4 + 3.5) ÷ 2 = 3.45 kg
- Girls: Mean = (4.0 + 3.1 + 3.4 + 3.7 + 2.5 + 3.4) ÷ 6 = 20.1 ÷ 6 = 3.35 kg
- Sorted: 2.5, 3.1, 3.4, 3.4, 3.7, 4.0; Median = (3.4 + 3.4) ÷ 2 = 3.4 kg
- Both groups have similar mean and median, indicating newborn boys and girls have comparable weights.
Question 6: The dot plots of heights of another section of Grade 5 students of the same school are shown below. Can you share your observations? What can we infer from the dot plots and the central tendency measures?
Whole class:
- Mean = 141.21 cm
- Median = 142.5 cm
Boys:
- Mean = 142.05 cm
- Median = 143 cm
Girls:
- Mean = 140.14 cm
- Median = 140 cm
Observations:
- In this section, boys are taller than girls on average (opposite of the first section).
- Heights are more evenly distributed compared to the first section.
Question: Compare the heights of the two sections. Share your observations.
Solution:
- First section had girls taller than boys; second section has boys taller than girls.
- This shows that we cannot generalize about heights based on gender from just one or two classes.
Question 7: The weights of some sumo wrestlers and ballet dancers are: Sumo wrestlers: 295.2 kg, 250.7 kg, 234.1 kg, 221.0 kg, 200.9 kg. Ballet dancers: 40.3 kg, 37.6 kg, 38.8 kg, 45.5 kg, 44.1 kg, 48.2 kg. Approximately how many times heavier is a sumo wrestler compared to a ballet dancer?
Solution:
- Average weight of sumo wrestlers = (295.2 + 250.7 + 234.1 + 221.0 + 200.9) ÷ 5 = 1201.9 ÷ 5 = 240.38 kg
- Average weight of ballet dancers = (40.3 + 37.6 + 38.8 + 45.5 + 44.1 + 48.2) ÷ 6 = 254.5 ÷ 6 = 42.42 kg
- Times heavier = 240.38 ÷ 42.42 = approximately 5.7 times
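The same comparison can be written as a ratio of two means. This is a small sketch with illustrative variable names.

```python
# Weights in kg.
sumo = [295.2, 250.7, 234.1, 221.0, 200.9]
ballet = [40.3, 37.6, 38.8, 45.5, 44.1, 48.2]

sumo_mean = sum(sumo) / len(sumo)
ballet_mean = sum(ballet) / len(ballet)

# How many times heavier an average sumo wrestler is than an average dancer.
ratio = sumo_mean / ballet_mean
print(round(sumo_mean, 2), round(ballet_mean, 2), round(ratio, 1))
```

Because the groups have different sizes (5 and 6), comparing the totals directly would be misleading; comparing the means avoids that.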
Visualising Data
Clubbing the Columns
- Earlier, we looked at the monthly onion prices in Yahapur and Wahapur using tables and dot plots.
- Two column graphs for this data can be combined into a single graph by drawing the bars side by side.
- This is called a clustered column graph or double column graph.
- We use different colors to clearly separate the data from the two places.
Question: What is the scale used in this graph?
- Solution: Scale used is 1 unit = 10 rupees.
Question: Is it now easier to compare month-wise prices in both places?
- Solution: Yes, the relative heights of the bars tell us where onions are costlier in each month.
Note:
- The dots and slanted lines within the bars help people who have difficulty distinguishing colors. They are also useful when the graph is printed in greyscale (black-and-white).
10…9…8…7…6…5…4…3…2…1…Take Off!
Graph: Number of worldwide rocket launches by companies/space agencies (2021, 2022, 2023).
Organizations shown: SpaceX, CASC, Roscosmos, Arianespace, Rocket Lab, United Launch Alliance, ISRO, Galactic Energy, Expace, Other.
Two-Step Process to Understand Graphs
Step 1: Identify what is given
- Notice how the graph is organized, what scale is used, and what patterns the data shows.
- For each organization, the numbers of rocket launches for the years 2021, 2022, and 2023 are shown as three adjacent bars.
- The scale used is 1 unit length = 20 rockets.
- The ‘Others’ category indicates multiple organizations worldwide that are clubbed together to keep the graph short.
- Note that months are shown in order for time-series data, whereas in this case, a change in the order of organizations does not affect the meaning.
Step 2: Infer from what is given
- Analyze and interpret each of your observations.
- We can say that the USA, China, and Russia are the leading rocket launching countries in the given time period.
- SpaceX launched about twice the number of rockets in 2022 compared to 2021, and about 35 more rockets in 2023 compared to 2022.
- The number of rockets launched by Arianespace decreased every year.
- United Launch Alliance launched more rockets in 2022 than in 2021. They launched fewer rockets in 2023 than in both the years 2022 and 2021.
- Other organizations launched about 25 rockets in 2023.
Question: Identify which of the following statements can be justified using this data.
(a) All organizations launched more rockets than the previous years.
- Solution: False, the number of rockets launched by Arianespace decreased every year.
(b) Only an organization from the USA launched more than 50 rockets in a single year.
- Solution: True, SpaceX (USA) launched more than 50 rockets.
(c) The total number of rockets launched by France in all 3 years is less than 40.
- Solution: True, Arianespace (France) launched fewer than 40 rockets in total.
(d) The average number of rockets launched by CASC in these 3 years is around 40.
- Solution: Need to estimate from graph; appears to be around 40.
(e) ISRO launched more rockets than Galactic Energy in these 3 years.
- Solution: Need to compare from graph.
(f) Russia launched more than 60 rockets in these 3 years.
- Solution: Roscosmos (Russia) appears to have launched more than 60 total.
Question: List the organizations that have consistently launched more rockets every year.
- Solution: SpaceX and Rocket Lab.
Question: Estimate the total number of rockets launched worldwide in 2023.
- Solution: Adding up all organizations’ 2023 launches gives an estimate between 200 and 400 (option b).
Summer and Winter at the Same Time
Data: Monthly hours of daylight in two cities (City 1 and City 2).
The data shows the monthly hours of daylight (i.e., the Sun is at least partly above the horizon) in these two cities over the year.
Graph: Average daily sunshine hours in two cities.
Step 1: Identify what is given
- The horizontal line shows the months of the year.
- The vertical line shows the average daylight hours per day, using the scale 1 unit = 5 hours.
- The month of June has the maximum value for City 1 and the minimum value for City 2.
Step 2: Infer from what is given
- The average number of daylight hours per day in City 1 increases from January, reaching a maximum of about 17–18 hours in June. It then decreases, reaching a minimum of about 6 hours in December.
- The average number of daylight hours per day in City 2 decreases from January, reaching a minimum of about 9 hours in June. It then increases, reaching a maximum of about 15 hours in December.
- The maximum and minimum values in City 1 are more extreme than those of City 2.
- In June, City 1 experiences daylight for about 3/4 of the full day (24 hours), whereas during December-January, it only experiences daylight for about 1/4 of the full day.
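The fractions of the day quoted above can be checked with a short calculation. This is a minimal sketch that assumes readings of 18 hours for June and 6 hours for December; the graph only gives approximate values, so these two numbers are illustrative.

```python
from fractions import Fraction

HOURS_IN_DAY = 24

# Approximate readings for City 1 (assumed values: the text says
# "about 17-18 hours" in June and "about 6 hours" in December).
june_daylight = 18
december_daylight = 6

# Fraction of the full day that has daylight.
june_fraction = Fraction(june_daylight, HOURS_IN_DAY)
december_fraction = Fraction(december_daylight, HOURS_IN_DAY)

print(june_fraction)      # 3/4
print(december_fraction)  # 1/4
```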
Question: Does this give some idea of where these two cities are located?
Solution:
- City 1 and City 2 are located away from the Equator in the Northern and Southern hemispheres, respectively.
- City 1 is Helsinki, Finland, and City 2 is Wellington, New Zealand.
- In June, the Northern Hemisphere is tilted towards the Sun, resulting in longer daylight hours; it is summer there.
- Meanwhile, the Southern Hemisphere is tilted away from the Sun, leading to shorter days; it is winter there.
- The inverted seasonal daylight pattern is due to the cities’ location in opposite hemispheres.
- The large variation in the data is because they are away from the Equator.
Interesting Fact: During summer near the poles, one can see the Sun even at midnight! This is called the Midnight Sun.
All it Takes is a Minute
Graph: Runs scored per over in a cricket match (double bar graph for two teams).
- The horizontal line lists the overs starting from 1.
- The vertical line indicates the runs scored in each over.
- The scale used for the runs per over is 1 unit = 5 runs.
- The circles shown on top of the bars indicate that a wicket fell in that over.
Question: Answer the following questions based on the graph:
1. Can we tell who batted first? Who won the match?
- Solution: We cannot tell who batted first or who won just by looking at runs per over. We need cumulative runs.
2. How many runs did the blue team score in over 12?
- Solution: Read from the graph (approximately 5 runs).
3. In which over did the red team score the least number of runs?
- Solution: Identify the shortest red bar (over with minimum height).
4. Is it easy to tell the target set by the team batting first?
- Solution: No, because the graph shows runs per over, not cumulative runs.
Figure it Out
Question 1: The following infographic shows the speeds of a few animals in air, on land, and in water. Can we call this graph a bar graph?
(a) What is the scale used in this graph?
- Solution: Scale is 16 km/h per unit.
(b) What did you find interesting in this infographic? What do you want to explore further?
(c) Identify a pair of creatures where one’s speed is about twice that of the other.
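- Solution: Look for a pair whose speeds are in a ratio close to 2. Note that the cheetah (96 km/h) and the ostrich (64 km/h) do not qualify: 96 ÷ 64 = 1.5, so the cheetah is about one and a half times as fast as the ostrich, not twice.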
(d) Can we say that a sailfish is about 4 times faster than a humpback whale? Can we say that a sailfish is the fastest aquatic animal in the world?
- Solution: Sailfish (109 km/h) ÷ humpback whale (26 km/h) ≈ 4.2, so yes, the sailfish is about 4 times as fast. The sailfish is the fastest aquatic animal shown in this infographic, but we cannot generalize to the whole world without more data.
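A quick way to test a claim like "about twice" or "about 4 times" is to divide the larger speed by the smaller one. The sketch below uses only the four speeds quoted in the solutions above; the function name `times_as_fast` is just an illustrative choice.

```python
# Speeds in km/h, as quoted in the solutions above.
speeds = {"cheetah": 96, "ostrich": 64, "sailfish": 109, "humpback whale": 26}

def times_as_fast(faster, slower):
    """Return how many times as fast `faster` is compared with `slower`."""
    return speeds[faster] / speeds[slower]

cheetah_vs_ostrich = times_as_fast("cheetah", "ostrich")
sailfish_vs_whale = times_as_fast("sailfish", "humpback whale")

print(round(cheetah_vs_ostrich, 1))  # 1.5
print(round(sailfish_vs_whale, 1))   # 4.2
```

A ratio near 2 means "about twice as fast"; a ratio of 1.5 means "one and a half times as fast".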
Question 2: Preyashi asked her students, ‘If you were to get a super power to become aquatic (water-borne), aerial (air-borne), or spaceborne, which one would you choose?’ The responses are shown below (w = aquatic, a = aerial, s = spaceborne, n = none). Draw a double-bar graph comparing how both grades chose each option. Choose an appropriate scale.
Grade 5: w, a, a, a, w, n, s, a, n, w, a, a, a, a, a, w, w, s, a, a, n, w, a, a, n
Grade 9: n, w, s, a, s, w, s, s, a, a, w, s, s, a, s, a, n, w, s, s, a, w, a, w, a
Solution:
- Count for Grade 5: Aerial (a) = 13, Aquatic (w) = 6, Spaceborne (s) = 2, None (n) = 4 (total 25)
- Count for Grade 9: Aerial (a) = 8, Aquatic (w) = 6, Spaceborne (s) = 9, None (n) = 2 (total 25)
- Draw a double bar graph with these counts.
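Tallying 25 responses by hand is error-prone, so it helps to check the counts mechanically. The sketch below copies the two response lists from the question and counts each letter; the `LABELS` mapping and the `tally` helper are illustrative names, not part of the textbook.

```python
from collections import Counter

# Responses copied from the question (w = aquatic, a = aerial,
# s = spaceborne, n = none).
grade5 = "w, a, a, a, w, n, s, a, n, w, a, a, a, a, a, w, w, s, a, a, n, w, a, a, n"
grade9 = "n, w, s, a, s, w, s, s, a, a, w, s, s, a, s, a, n, w, s, s, a, w, a, w, a"

LABELS = {"w": "Aquatic", "a": "Aerial", "s": "Spaceborne", "n": "None"}

def tally(responses):
    """Count how many students chose each option."""
    counts = Counter(responses.split(", "))
    return {LABELS[code]: counts[code] for code in LABELS}

grade5_counts = tally(grade5)
grade9_counts = tally(grade9)

print(grade5_counts)  # {'Aquatic': 6, 'Aerial': 13, 'Spaceborne': 2, 'None': 4}
print(grade9_counts)  # {'Aquatic': 6, 'Aerial': 8, 'Spaceborne': 9, 'None': 2}
```

Each grade's counts add up to 25, which matches the number of responses listed. Since the largest count is 13, a scale of 1 unit = 2 students would be one reasonable choice for the double-bar graph.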
Question 3: The temperature variation over two days in different months in Jodhpur, Rajasthan, is given below. Draw a double-bar graph. Use the scale 1 unit = 4°C. Can you guess which two months these days might belong to?
Day 1: 20°C, 18°C, 16°C, 20°C, 26°C, 34°C, 30°C, 24°C (at 12 am, 3 am, 6 am, 9 am, 12 pm, 3 pm, 6 pm, 9 pm)
Day 2: 37°C, 34°C, 30°C, 33°C, 37°C, 43°C, 42°C, 39°C
Solution:
- Day 1 appears to be from a cooler month (like January or February).
- Day 2 appears to be from a hotter month (like May or June).
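Before drawing the graph, each temperature has to be converted into a bar height using the given scale of 1 unit = 4°C. The sketch below does that conversion and also computes each day's mean temperature as one way to compare the two days; the variable names are illustrative.

```python
SCALE = 4  # degrees Celsius per unit length, as given in the question

times = ["12 am", "3 am", "6 am", "9 am", "12 pm", "3 pm", "6 pm", "9 pm"]
day1 = [20, 18, 16, 20, 26, 34, 30, 24]
day2 = [37, 34, 30, 33, 37, 43, 42, 39]

# Bar height (in units) = temperature / scale.
day1_heights = [t / SCALE for t in day1]
day2_heights = [t / SCALE for t in day2]

for time, h1, h2 in zip(times, day1_heights, day2_heights):
    print(f"{time:>5}: Day 1 = {h1} units, Day 2 = {h2} units")

# Mean temperature of each day, as a single representative value.
day1_mean = sum(day1) / len(day1)
day2_mean = sum(day2) / len(day2)
print(day1_mean, day2_mean)  # 23.5 36.875
```

For example, the 12 am bar for Day 1 is 20 ÷ 4 = 5 units tall, and the 3 pm bar for Day 2 is 43 ÷ 4 = 10.75 units tall.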
Question 4: The following clustered-bar graph shows the number of electric vehicles registered in some states every year from 2022 to 2024.
(a) Mark the corresponding bars for Gujarat and Delhi on the bar graph.
(b) Notice how the graph is organized, what scale is used, and what patterns the data shows.
(c) How would you describe the change for various states between 2022 and 2024?
(d) Approximately how many more registrations did Assam get in 2023 compared to 2022?
(e) By about how many times did the registrations in West Bengal increase from 2022 to 2024?
(f) Is this statement correct: ‘There were very few new registrations in Uttarakhand in 2023 and 2024, as the increase was small’?
Data Detective
Telling Tall Tales
Question: Following are the dot plots of heights of boys (in blue) and girls (in orange) of Grades 6, 7 and 8 (in that order) of two different schools. What do you notice? Share your observations.
School A:
- Grade 6: Mean (boys) = 134.8 cm, Mean (girls) = 137.78 cm
- Grade 7: Mean (boys) = 141.8 cm, Mean (girls) = 141.83 cm
- Grade 8: Mean (boys) = 149.35 cm, Mean (girls) = 147.81 cm
School B:
- Grade 6: Mean (boys) = 149.84 cm, Mean (girls) = 150.2 cm
- Grade 7: Mean (boys) = 156.14 cm, Mean (girls) = 156.14 cm
- Grade 8: Mean (boys) = 156.14 cm, Mean (girls) = 156.83 cm
Observations:
- In School A, girls are taller than boys on average in Grade 6 and only marginally taller in Grade 7 (141.83 cm vs. 141.8 cm), but boys are taller on average in Grade 8.
- In School B, girls are slightly taller than boys on average in Grade 6, the means are equal in Grade 7, and girls are slightly taller again in Grade 8.
- On average, School B students are considerably taller than School A students in every grade.
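These observations come from subtracting one mean from another. The sketch below uses the mean heights listed above to compute the girls-minus-boys gap within each school and the School B minus School A gap for each grade; the dictionary layout and the helper name `gaps` are illustrative.

```python
# Mean heights in cm, copied from the lists above: grade -> (boys, girls).
school_a = {6: (134.8, 137.78), 7: (141.8, 141.83), 8: (149.35, 147.81)}
school_b = {6: (149.84, 150.2), 7: (156.14, 156.14), 8: (156.14, 156.83)}

def gaps(school):
    """Girls' mean minus boys' mean for each grade, rounded to 2 decimals."""
    return {grade: round(girls - boys, 2) for grade, (boys, girls) in school.items()}

gap_a = gaps(school_a)
gap_b = gaps(school_b)

# School B mean minus School A mean: grade -> (boys' gap, girls' gap).
between_schools = {
    grade: (round(school_b[grade][0] - school_a[grade][0], 2),
            round(school_b[grade][1] - school_a[grade][1], 2))
    for grade in school_a
}

print(gap_a)
print(gap_b)
print(between_schools)
```

A positive gap within a school means the girls' mean is higher; a negative gap means the boys' mean is higher. The between-school gaps are positive for every grade, which is what the last observation says.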
Questions you might wonder:
- “Why is there a considerable difference in heights in the same grades across these two schools?”
- “Where are these schools located?”
- “How tall are students in Grades 6 to 8 in my school?”
- “What is the average height of all Grade 6 boys and girls?”
Heights of Boys and Girls in India (1989-2019)
The table shows the average heights of boys and girls (in centimeters) across ages 5 to 19 in the years 1989, 1999, 2009, and 2019.
Question: Spend sufficient time observing the data presented in this table. Share your findings with the class.
Prompts:
- Changes in the heights of boys or girls of a certain age from 1989 to 2019.
- The heights of boys vs. girls at different ages in a particular year.
- Changes in height between successive ages in boys and girls in 2019.
Question: Which of the following statements can be justified using the data?
1. The average heights of both boys and girls at every age increased from 1989 to 2019.
- Solution: True, all heights show an increase over time.
2. The average height of 13-year-old girls in 1989 is more than the average height of 14-year-old girls in 2009.
- Solution: 13-year-old girls in 1989 = 143.2 cm; 14-year-old girls in 2009 = 148 cm. False.
3. The average height of 15-year-old boys in 2019 is more than the average height of 16-year-old boys in 1989.
- Solution: 15-year-old boys in 2019 = 159 cm; 16-year-old boys in 1989 = 158.9 cm. True.
4. All girls aged 13 are taller than all girls aged 11.
- Solution: False, this is about average heights, not all individuals.
5. Throughout the age period 5 to 19, the average boy’s height is more than the average girl’s height.
- Solution: False, at ages 10-13, girls are taller than or equal to boys.
6. Boys keep growing even beyond age 19.
- Solution: Likely true, but we cannot confirm this from the data, which covers only ages 5 to 19.