Unit 2 Discussion
This discussion is based on the article Misleading Statistics & Data News Examples For Misuse of Statistics by Mona Lebied. Make sure to read this article before starting the discussion.
Statistics are a way of summarizing large data sets and making sense of them. Statistical results allow us to make decisions and test our preconceived opinions. While this makes statistics a powerful tool, it also means improper use can lead to misunderstanding data and making incorrect decisions. When people are trying to convince others that their arguments are the correct ones, they will use statistics to support their side. When winning is more important than the truth, they may intentionally present incorrect results and apply methodologies improperly.
The article discusses some of the ways statistics can be used improperly to mislead others into believing one side over another. It not only explains some common ways of doing this, it also gives some real-life examples. Read the article linked above and then answer the discussion topic questions.
For this discussion, find one example where someone is misrepresenting data through improper use of statistics to support their viewpoint. Since identifying the misleading use of statistics can be difficult and tricky, you will use fact-checking websites to find your example. Search for a case where a reliable source has identified someone misusing a statistic to mislead. Do not try to identify an example of misuse yourself.
The example you give should not simply be faulty analysis due to lack of information or skill. There should be an element of intentionally painting a misleading picture. You can find these examples in cases where there's a conflict of viewpoints and one side is pushing its own narrative. Categorize the example you found according to the six categories described in the article. Since the hard work of identifying an attempt to mislead is already done by the fact-checking source, you will be evaluated on your selection of an example and subsequent analysis of the issue.
When responding to others, please only discuss whether you think the analysis was indeed misrepresentative or not. Do not discuss the overall claim that is being made or whether you disagree with it. We are only concerned with the analysis methods, not the actual assertions made. Misleading statistics are intentionally used to push narratives; we are not here to argue anything but statistics.
Please use the template below in your answers, so everyone can easily follow your answers to all the questions (using the template below is part of the requirements; you will lose points if you don't follow the template or if you skip portions of what is being asked).
What is the title of the article? (Include a link to the article too)
What is the claim being made? Copy/paste or summarize the claim.
If it's a visual (like a chart), then please include the chart in the post. You can usually copy and paste images into the post. Include as much of the information here as possible so all readers can see what you are describing without having to visit the link and search for the problem.
What is the statistical analysis proposed to support this claim? Copy/paste their description of how they arrived at their results.
Which category does this issue fall under? Why?
Justify your selection and your identification of the issue. For example, if you say faulty polling, you need to cite the improper wording. If you say data fishing, you need to show they are looking for correlations without a proper hypothesis.
What would be a better analysis to evaluate the situation? Describe how to fix the faults in this analysis or suggest a different approach.
————————————-
SAMPLE POST
Cuneyt Altinoz posted Nov 5, 2020 8:48 AM
To help you further understand what is expected from the discussion topic, I decided to post sample responses from former students as the situation calls for it. You should first read my guidelines post for the details of what you need to do, and then you can use these sample posts for ideas. I hope you find them helpful.
What is the title of the article?
The statistic I am presenting was originally presented by the Georgia Department of Public Health on their daily COVID-19 status report, and the title of the report is Georgia Overall Covid-19 Status. Within this report was a bar graph that showed the top five counties with the greatest number of confirmed COVID-19 cases over a 14-day period. The graph has since been taken off the website due to backlash over its accuracy.
Website: https://dph.georgia.gov/covid-19-daily-status-report
What is the claim being made?
Governor Brian Kemp reopened Georgia on April 24, citing a downward trend of COVID-19 cases being reported by the Georgia Department of Health. Georgians have been relying on this data, such as the number of cases, hospitalizations and deaths in their areas, to determine whether or not it is safe for them to go out.
A bar chart shown below on the Georgia Department of Public Health's website appeared to show good news and claimed that new confirmed cases in the counties with the most infections had dropped every single day for the past two weeks from April 28th to May 9th (COVID-19 Status Report, n.d.).
What is the statistical analysis proposed to support this claim?
The statistical analysis proposed that this bar graph showed a downward trajectory for the top five Georgia counties with the greatest number of cases of coronavirus from April 28th to May 9th. According to spokeswoman Candice Broce, the x-axis was set up that way to show descending values to more easily demonstrate peak values and counties on those dates (COVID-19 Status Report, n.d.).
This graph was supposed to support Governor Brian Kemp's narrative that it is safe to reopen Georgia and safe for the public to go out, because he sees a downward trend in COVID-19 cases over a 14-day period; however, it was intentionally painting a misleading picture.
Why is this analysis faulty (use categories from the misleading statistics article)? Which category does this issue fall under? Why?
The graph appears to show a steady decline in cases for the counties. However, on close examination, the dates are not in chronological order on the x-axis. Instead, the days are ordered from the highest number of cases on the left to the lowest number of cases on the right, regardless of the date. This bar graph also did not keep the counties in the same position each day. Both choices confused readers.
This graph falls under the category described as misleading data visualization. Insightful graphs include very basic but essential elements, such as COVID cases per day. However, this graph should be viewed with a grain of salt, because the dates do not follow chronological order. Georgia Department of Public Health spokeswoman Nancy Nydam said the issue was due to incorrect sorting logic that did not consider the date of the confirmed cases (COVID-19 Status Report, n.d.).
The visualization in this bar graph misleads the public because its producers ignored convention and manipulated the x-axis. The conventional way of organizing the x-axis is to start at the earliest date on the left and proceed to the right in chronological order. When the dates are not in chronological order, small differences become deceptive and therefore play more on people's prejudices than on their rationality.
What would be a better analysis to evaluate the situation? Describe how to fix the faults in this analysis or suggest a different approach.
The faults could be fixed by listing the dates on the x-axis in chronological order and keeping the counties in the same position for each day. By following the standard conventions for visual models, graph producers can avoid misleading readers. Start with the earliest date on the left of the x-axis. Following convention applies to pie charts as well. People are conditioned to look at pie charts as equaling 100% of the data; don't mislead people by only giving readers a slice of the pie's data.
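As a rough illustration of the fix described above, here is a minimal Python sketch. The counts and county names are made up for the example (they are not the Georgia figures), and the function names are hypothetical. It contrasts ordering the days by case count, which manufactures a "decline", with ordering them by date, and it keeps the counties in a fixed order within each day.

```python
from collections import defaultdict
from datetime import date

# Made-up counts for two hypothetical counties; NOT the Georgia figures.
records = [
    (date(2020, 5, 1), "County A", 40),
    (date(2020, 5, 1), "County B", 30),
    (date(2020, 5, 2), "County A", 60),
    (date(2020, 5, 2), "County B", 50),
    (date(2020, 5, 3), "County A", 20),
    (date(2020, 5, 3), "County B", 35),
]

def daily_totals(rows):
    """Sum the cases across counties for each day."""
    totals = defaultdict(int)
    for day, _county, cases in rows:
        totals[day] += cases
    return dict(totals)

def order_by_count(rows):
    """Days from most to fewest cases: always looks like a decline."""
    totals = daily_totals(rows)
    return sorted(totals, key=lambda d: totals[d], reverse=True)

def order_by_date(rows):
    """Days in chronological order: the conventional x-axis."""
    return sorted(daily_totals(rows))

def bars_for_day(rows, day):
    """One day's bars, with counties always in the same (alphabetical) order."""
    return sorted((county, cases) for d, county, cases in rows if d == day)

print([d.isoformat() for d in order_by_count(records)])
print([d.isoformat() for d in order_by_date(records)])
```

With these invented numbers, the count-ordered axis puts May 2 first even though it is the middle day, so a reader skimming left to right would see a drop that never happened in that sequence.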
This graph could not support the reopening of Georgia, because a sound analysis would reflect that 38,721 cases had been confirmed by the Department of Health as of May 19. Based on the numbers recorded, cases are not necessarily going down but remaining steady (COVID-19 Status Report, n.d.).
Two major discrepancies appear to be around April 22 to April 24 (day of reopening) where the department initially reported a drop in new cases from nearly 900 to around 600 cases, and back up to about 700 cases. In reality, the state saw a significant drop from about 900 cases to around 400 cases, then a spike back to nearly 1,000 cases before another significant drop, according to the numbers documented by the Department of Health’s website (COVID-19 Status Report, n.d.).
Reference:
COVID-19 Status Report. (n.d.). Retrieved June 26, 2020, from https://dph.georgia.gov/covid-19-daily-status-report
——————————–
Unit 2 discussion
Kerry Bruso posted Nov 4, 2020 3:42 PM
What is the title of the article?
The article is entitled "The United Nations Report on American Poverty Is Just Plain Wrong," published June 28, 2018 by Daniel J. Mitchell. The link is posted below.
https://fee.org/articles/the-united-nations-report-on-american-poverty-is-just-plain-wrong/
What is the claim being made?
Mitchell is focusing on the United Nations' 2018 report on child poverty throughout the world. This report states the United States has an astronomically high number of children living in poverty compared to the rest of the world. Additionally, Mitchell acknowledges an article from The Washington Post by Valerie Strauss (2018) that points to reports in recent years by different organizations in which the United States has turned up at or near the top on issues such as poverty rates.
Mitchell goes on to quote material from the United Nations' 2018 report on child poverty.
The United States ... has the highest youth poverty rate in the Organization for Economic Cooperation and Development (OECD) ... The consequences of neglecting poverty ... The United States has one of the highest poverty ... levels among the OECD countries ... the shockingly high number of children living in poverty in the United States demands urgent attention. ... About 20 per cent of children live in relative income poverty, compared to the OECD average of 13 per cent.
What is the statistical analysis proposed to support this claim?
Mitchell states it is true the United States does have the highest child poverty rate across the OECD countries. But Mitchell also knows this is a misuse of statistics. He references a chart from the November 2017 publication from the OECD. The chart below depicts what the OECD is actually basing its child poverty rates off of.
Mitchell believes this chart actually depicts relative poverty instead of measuring child poverty. The US looks bad in the chart because the US has a higher standard of living and a much higher median income than other countries.
Which category does this issue fall under?
This issue does not fall perfectly into any one category; it falls into two. It first fits into misleading data visualization. The actual issue is based on the 2018 United Nations study on child poverty. The chart Mitchell provided in his article points to the misleading data visualization the UN provides in its study. Additionally, the chart explains why the United States has the highest child poverty rate. The UN used a misleading statistic. The US has a high income compared to the other countries. It is misleading for the UN to assume 50% of the median income would imply poverty. The median income is not consistent across all countries. This high threshold makes the US look far worse than the other countries, which have a lower median income.
After we learn of the misleading data, we can then see the UN presented a flawed correlation: income to child poverty. Median income does not necessarily relate to poverty level. The misuse of statistics is that the UN never acknowledges that countries do not all have the same median income level. This leads to the misleading statement that the United States has the largest child poverty rate. The primary visual is misleading and the correlation is incorrect.
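To make the distinction between a relative threshold and a fixed threshold concrete, here is a small Python sketch. The incomes are invented for two hypothetical countries, and the function names and the fixed line of 15 are assumptions for illustration only. This is not the UN's or the OECD's actual calculation. It only shows how a cutoff set at 50% of each country's own median can rank countries differently than a single fixed cutoff.

```python
from statistics import median

def relative_poverty_rate(incomes, fraction=0.5):
    """Share of incomes below a fraction of this group's own median."""
    threshold = fraction * median(incomes)
    return sum(1 for x in incomes if x < threshold) / len(incomes)

def absolute_poverty_rate(incomes, line):
    """Share of incomes below one fixed line applied to every group."""
    return sum(1 for x in incomes if x < line) / len(incomes)

# Invented incomes (arbitrary units) for two hypothetical countries.
high_median_country = [10, 20, 24, 60, 70, 80, 90, 100]  # median 65
low_median_country = [10, 12, 14, 16, 18, 20, 22, 24]    # median 17

for name, incomes in [("high", high_median_country), ("low", low_median_country)]:
    print(name,
          relative_poverty_rate(incomes),
          absolute_poverty_rate(incomes, line=15))
```

In this toy data the high-median country has the higher relative rate but the lower rate against the fixed line, which is the kind of reversal the post is describing.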
What would be a better analysis to evaluate the situation?
It is evident the improper use of median income is where the United Nations went wrong. In order for the statistic to be properly presented, the median income needs to be normalized. By applying the same median income to every country, the UN counted as poor many children who are above the poverty line, because the median income threshold was too high for the United States.
A better analysis to evaluate child poverty across the globe would be to calculate the poverty levels for each country based on its own median income. Additionally, only counting the children who fall below the poverty line will ensure the child poverty rate accurately depicts that particular country's child poverty rate. A proper data visualization would have each country on the x-axis and the number of children that fall below each country's poverty line on the y-axis. These changes to the data analysis will properly convey the percentage of children who suffer from poverty in each country.
References
Lebied, M. (2018, August 8). Misleading Statistics & Data News Examples For Misuse of Statistics. https://www.datapine.com/blog/misleading-statistics-and-data/.
Mitchell, D. J. (2018, June 25). The United Nations Report on American Poverty Is Just Plain Wrong. https://fee.org/articles/the-united-nations-report-on-american-poverty-is-just-plain-wrong/.
————-
UNIT 2 DISCUSSION
Sheriska Willie posted Nov 5, 2020 8:29 PM
What is the title of the article?
The statistics I will be presenting are from the Statistical Analysis Handbook, 2018 edition, which discusses a report on declining teenage pregnancy. The report shows the declining teenage pregnancy rates in Orkney, off the north coast of Scotland.
Website: http://www.statsref.com/HTML/index.html?misuse_and_abuse_of_statistics.html
http://news.bbc.co.uk/2/hi/uk_news/magazine/8486221.stm
What is the claim being made?
In January 2010, BBC journalist Michael Blastland highlighted reports of declining teenage pregnancy rates in Orkney, off the north coast of Scotland, and showed them to be highly misleading. Scotland itself is high in the international league. Health workers in Orkney tried something new. They began talking to young people about sex in terms of relationships, not only mechanics. They also made condoms easily available, because in a small community the shopkeeper might just be your auntie. Blastland showed two graphs. The first appears to show a halving of the teenage pregnancy rate between 1994 and 2006, following an intensive program of education and support:
However, the reports omitted data for the intervening years, and as we know from stock market and many other types of data, rates of change depend very heavily on your start and end date. The data in this case is quite cyclical, and choosing 2006 rather than, say 2007, provides a completely misleading picture, as the graph below demonstrates.
What is the statistical analysis proposed to support this claim?
The statistical analysis proposed by this line graph to support the claim is that what had gone down, briefly, went back up, just as what jumps up often tends to come back down in smaller communities like Orkney. From 2000 to 2003 there was a huge increase in the teenage pregnancy rates, which later decreased in 2006.
Which category does this issue fall under?
This bar graph shows increases and decreases in the number of teenagers who are pregnant across the different years. This is not a radical explanation, because some years more are recorded and some years fewer. The years were also not labeled with a title, which was a little confusing.
The graph also falls under the category of misleading data visualization. The data in this case is quite cyclical, and choosing 2006 rather than, say, 2007 provides a completely misleading picture of the data. This is an interpretation of short-term data which gives the wrong conclusion. It also shows pregnancies per thousand women and not pregnancies per thousand teenage women. This was mostly caused by random fluctuations in data.
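The effect of picking an end year can be sketched in a few lines of Python. The yearly rates below are invented to mimic a noisy, up-and-down series (they are not the Orkney figures), and percent_change is a hypothetical helper written for this illustration.

```python
def percent_change(series, start, end):
    """Percent change in the rate between two chosen years."""
    return 100 * (series[end] - series[start]) / series[start]

# Invented rates per thousand for a small community; NOT the Orkney data.
rates = {1994: 40, 1998: 25, 2002: 45, 2006: 20, 2007: 38}

print(percent_change(rates, 1994, 2006))  # compare first year with a low year
print(percent_change(rates, 1994, 2007))  # compare first year with the next year
```

With these made-up values, stopping at 2006 shows the rate halving, while stopping one year later shows almost no change. That is why plotting every intervening year gives a fairer picture than reporting only two endpoints.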
What would be a better analysis to evaluate the situation?
From this report, the faults can be fixed by ensuring that the correct information is always recorded and when new cases arise it needs to be recorded as well. Sex education programs can also be enforced for the public and in schools to try and lessen the teenage pregnancy rate.
Reference
Blastland, Michael. The bumps in a falling teenage pregnancy rate, January 29th, 2010, from http://news.bbc.co.uk/2/hi/uk_news/magazine/8486221.stm.