Thursday, September 12, 2013

How To Use The New Census Data

It involves holding your nose:

The only way Statscan was able to publish the 2011 NHS data was by remarkably dropping its quality standard by doubling the acceptable global non-response rate.

And accepting incomplete questionnaires:

What exactly did Statscan consider an acceptable response to the 2011 NHS questionnaire? Apparently, enumerators had been instructed to accept the long form with as few as 10 of 84 questions answered.  What the standard was for the 2011 NHS and how it differed from previous long-form Census are important questions that, to date, have not been answered. Given that StatsCan lowered the data quality standard by doubling the acceptable non-response rate, one suspects a similarly dramatic change to the acceptable questionnaire completeness rate for the 2011 NHS relative to prior year long-form Census.

And, as far as I know, we are still talking about data at the level of Census subdivisions (CSDs), that is, municipalities. At this level:

One way to demonstrate the dramatic difference this change in data quality’s had is by looking at the published 2011 NHS data for Census subdivisions (CSDs). Of 5252 CSDs in the 2011 NHS, data for 3439 was fit for release, while data for 1813 was suppressed, almost all for non-response, using the post-2011 NHS data quality standard.  However, using the pre-2011 NHS data quality standard, only 994 of those 5252 CSDs, or 19%, would have been fit for release, while 4258 would have been suppressed. By comparison, of 5418 CSDs in the 2006 long-form Census, data for 4534, or 84%, was fit for release, while data for only 884 was suppressed.
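
To put the three sets of counts on the same footing, here is a quick sketch that recomputes the release rates from the figures quoted above. The labels and the `release_rate` helper are my own naming, not anything from Statscan; the only inputs are the released and suppressed CSD counts in the passage.

```python
# CSD counts as quoted in the passage: (released, suppressed).
# The dictionary labels are illustrative names, not official Statscan terms.
counts = {
    "2011 NHS, post-2011 standard": (3439, 1813),
    "2011 NHS, pre-2011 standard": (994, 4258),
    "2006 long-form Census": (4534, 884),
}


def release_rate(released, suppressed):
    """Share of CSDs whose data was fit for release."""
    return released / (released + suppressed)


rates = {label: release_rate(*pair) for label, pair in counts.items()}

for label, rate in rates.items():
    released, suppressed = counts[label]
    total = released + suppressed
    print(f"{label}: {released} of {total} CSDs released ({rate:.0%})")
```

This gives roughly 65% under the standard actually used for the 2011 NHS, against 19% under the old standard and 84% for the 2006 long form.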

Below that, down to the census tract and dissemination area level, data still has not been released.

