Indexed reports

Revision as of 12:20, 6 November 2009

An indexed report is intended to put data into context by indicating whether individual items of information are over- or under-represented compared to the norm.

MAST indexed reports display not only a measure, but also an index bar for each data point in chart view.

Mosaic Profiles with indexing

In the initial version of MAST, the only type of indexing implemented compares the backgrounds of people involved in crashes with the corresponding population base.

Reports can only show indexing under the following circumstances:

Understanding Indices

Indices are expressed with a base value of 100: that is, an index value of 100 indicates that the corresponding data point is exactly representative of the underlying population, neither larger nor smaller than would be expected. Values over 100 indicate that a data point is over-represented, while values under 100 indicate that it is under-represented.

When interpreting index values, the following techniques should always be used:

  • Filter out Z because Mosaic Group Z represents postcodes which were missing from STATS19 returns, and these distort the index calculation. You can apply a filter to the Mosaic category to do this.
  • Look carefully at the value axis because sample size makes a big difference to the significance of indices - an index of 115 on a sample of 1,000 would probably be significant, but an index of 180 on a sample of 20 would probably be unreliable.
  • Don't use a Crash Location filter unless you have also applied a matching Home filter because the profile of people crashing in a small area is unlikely to be similar to that of people living in a much larger area - so doing this will probably distort the index calculation.
  • If you apply a Home filter, use multiple selection filters sparingly, and always uncheck the NK and Unknown boxes because eccentric collections of areas, and missing postcodes included in the index calculation, are likely to give highly unpredictable results.
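
The point about sample size can be illustrated with a little arithmetic. The sketch below is not part of MAST: the function name and the 10% population share are hypothetical, chosen only to show how far the index moves when a single record is reclassified into a group while the sample total stays fixed.

```python
def index_shift_per_record(sample_total, population_share):
    """Change in index value caused by one extra record in a group.

    Hypothetical helper, not a MAST function. It assumes the sample
    total stays fixed and that population_share is the group's share
    of the population base, as a fraction between 0 and 1.
    """
    if sample_total <= 0 or not 0 < population_share <= 1:
        raise ValueError("sample_total must be positive and population_share in (0, 1]")
    # One record is 1/sample_total of the sample; dividing by the
    # population share and scaling by 100 gives the index change.
    return (1 / sample_total) / population_share * 100


# Assumed population share of 10% for the group (illustrative only).
small_sample_shift = index_shift_per_record(20, 0.10)
large_sample_shift = index_shift_per_record(1000, 0.10)

print(f"Sample of 20:    one record moves the index by {small_sample_shift:.1f} points")
print(f"Sample of 1,000: one record moves the index by {large_sample_shift:.1f} points")
```

Under these assumptions a single record shifts the index by about 50 points on a sample of 20, but by only about 1 point on a sample of 1,000, which is why a high index on a small sample should be treated with caution.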


Index calculation algorithm

The method of calculation is as follows:

( {Value of data point} / {Value of all data points} ) / ( {Population corresponding to data point} / {Total population} ) * 100

Index calculation example

Group H drivers constitute 223 out of a total sample of 1,328 (16.8%)

Group H residents constitute 11,278 out of a total local population of 76,589 (14.7%)

The index value for the H data point is 114 ( 16.8% / 14.7% * 100 )
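
The calculation above can be sketched in a few lines of Python. This is only an illustration of the formula as stated on this page, not MAST's own code; the function and parameter names are hypothetical.

```python
def index_value(value, total_value, population, total_population):
    """Index of a data point relative to its population base (100 = representative).

    Hypothetical helper following the formula on this page:
    (value / total_value) / (population / total_population) * 100
    """
    if total_value <= 0 or population <= 0 or total_population <= 0:
        raise ValueError("totals and population must be positive")
    sample_share = value / total_value                 # e.g. Group H share of drivers
    population_share = population / total_population   # e.g. Group H share of residents
    return sample_share / population_share * 100


# Figures from the Group H example above.
group_h_index = index_value(223, 1328, 11278, 76589)
print(round(group_h_index))  # 114
```

Working from the unrounded shares rather than the rounded percentages gives the same result of 114 here.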

Future expansion of indexing functionality

It is intended that future versions of MAST will apply the concept of indexing much more widely. More rigorous features to ensure statistical significance are also being planned.
