For some time now I have wanted to explore data using some of the advanced charting tools available on the Internet. I’ve looked at quite a few options, including Tableau and some GIS (Geographic Information System) programs such as ArcGIS and Instant Atlas. Most of these were expensive or had complex, cumbersome interfaces with a steep learning curve.
When I first looked at the Google Public Data Explorer I was impressed by its “higher order” use of CSV files to store the data, XML to describe it and HTML to render the output. Using CSV files saves a lot of time when entering new data, as you only need to change the file when you have new figures to add. However, the need to use separate sets of variables is counter-intuitive to my training in statistics, where all the data resides in one file. The motion charts, and their ability to convey meaning by displaying change over time, were of great interest to me, but in the end my real-world data set (diabetes rates from Statistics Canada) was limited and did not lend itself to display in this format. Instead I used Google Charts, which allows you to code the characteristics of your chart and enter the data directly in the HTML file. This is much simpler in the short term. But this tool may not be a good choice in the long run, as adding data means editing the code rather than calling up new CSV data files as the Public Data Explorer does.
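One possible way to soften that trade-off is to keep the figures in a CSV file and generate the chart's data array from it, rather than retyping numbers into the HTML. The sketch below is only an illustration under stated assumptions: the function name `csv_to_chart_rows` is hypothetical, the CSV layout (one row per province, one column per year) is assumed, and the sample values are invented placeholders, not Statistics Canada figures. The resulting JSON is in the header-row-plus-data-rows shape that, as far as I know, Google Charts' `google.visualization.arrayToDataTable` accepts.

```python
import csv
import io
import json

# Placeholder CSV text. The province names and numbers are invented for
# illustration only and are NOT Statistics Canada figures.
SAMPLE_CSV = """Province,2005,2007
Province A,1.0,2.0
Province B,,3.5
"""


def csv_to_chart_rows(csv_text):
    """Convert CSV text into a header row followed by data rows.

    Empty cells become None, so a value that was not published stays a
    gap instead of being plotted as zero.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = [header]
    for record in reader:
        if not record:
            continue  # skip blank lines
        label, values = record[0], record[1:]
        rows.append([label] + [float(v) if v.strip() else None for v in values])
    return rows


chart_rows = csv_to_chart_rows(SAMPLE_CSV)

# JSON text that could be pasted or templated into the page's script.
chart_json = json.dumps(chart_rows)
print(chart_json)
```

With something like this, adding a new year would mean adding a column to the CSV file and regenerating the array, which keeps the chart code itself untouched.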
Using the basic bar chart option I was able to plot the overall rate of diabetes in nine provinces for five years (2005, 2007, 2008, 2009 and 2010). According to the data source, information for some provinces was not considered reliable enough to be published. No explanation was provided for why data were not available for 2006.
Image 1 (below) shows these results in a static bar chart format.
Image 1: Canadian diabetes rates (source: Statistics Canada)
(I also learned that it is not easy to display a Google Chart in WordPress. To see the interactive elements of the chart, click here for the dynamic version.)
The chart shows that rates of diabetes are higher on the east coast and that they have been increasing in recent years (although there is a decrease in 2010). According to the Public Health Agency of Canada’s 2011 report on diabetes, there are a number of associated risk factors, including genetics, being overweight, not exercising, certain ethnic origins and increasing age. These tend to explain the “how” but not the “why”. For example, why does someone who knows they are genetically at risk for diabetes also not exercise or eat properly, which increases their risk? Changing these types of behaviours requires a deeper understanding of why they are taking place.
In the next installment I will explore how we can use other sources of information to help contextualize the numbers represented in the bar chart, which may lead to a better understanding of the ways in which behaviours can be changed.
Note: April 3rd – The follow-up post, “Patients and providers: identifying a diabetes dialogue gap?” is now available.