Data Classification

This week, we learned about different data classification methods, how each one is calculated, and how and when to apply them in ArcMap. The methods covered were equal interval, natural breaks (the ArcGIS default), quantile, and standard deviation.

The equal interval method divides the data range into classes of equal width. In our census data, each class spans 15.83 percentage points. The main advantage is a set of classes with easily understood divisions; the major disadvantage is that some classes can end up with few or no values. For example, the two middle classes in our census tract data (31.68-47.50 and 47.51-63.33) have no visible presence on the map, which wastes classes that could otherwise distinguish among the tracts.
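To make the empty-class problem concrete, here is a minimal Python sketch of how equal interval breaks can be computed. The percentages are made up for illustration (they are not our census tract values), the function names are my own, and ArcGIS may round or label its breaks differently.

```python
def equal_interval_breaks(values, n_classes):
    """Return the upper bound of each class when the range is split evenly."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes + 1)]


def count_per_class(values, breaks):
    """Count how many values fall in each class (upper bounds are inclusive)."""
    counts = [0] * len(breaks)
    last = len(breaks) - 1
    for v in values:
        for i, upper in enumerate(breaks):
            # The last class catches the maximum even if rounding nudges it.
            if v <= upper or i == last:
                counts[i] += 1
                break
    return counts


# Hypothetical "percent over 65" values with one extreme outlier.
pct_over_65 = [0.0, 4.5, 6.0, 8.0, 9.5, 11.0, 12.5, 14.0, 15.0, 18.0, 22.0, 80.0]

breaks = equal_interval_breaks(pct_over_65, 5)
counts = count_per_class(pct_over_65, breaks)
print(breaks)
print(counts)
```

With skewed data like this, one high value stretches the range, so the middle classes end up empty while most values pile into the first class.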

Quantile classification places the same number of values in each class. This eliminates the empty-class problem seen in our equal interval map and reveals more about the distribution of senior citizens in Miami-Dade County. The central and southern parts of the county have the lowest share of people over 65, while the areas hugging the coast have clusters of older folk. This pattern makes sense because the center and south of the county are mostly agricultural and nursery land, which requires maintenance by people with a high degree of physical fitness and heat tolerance, characteristics usually associated with younger people. The clusters of older people are in the more urban parts of the county.
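As a rough illustration of the quantile idea, the sketch below sorts the values and slices them into equally sized groups. The data and function name are hypothetical, and this simple slicing ignores details such as tied values, which real GIS software has to handle.

```python
def quantile_classes(values, n_classes):
    """Split sorted values into n_classes groups of (nearly) equal size."""
    ordered = sorted(values)
    n = len(ordered)
    classes = []
    for i in range(n_classes):
        start = i * n // n_classes
        end = (i + 1) * n // n_classes
        classes.append(ordered[start:end])
    return classes


# Hypothetical "percent over 65" values, not the actual census data.
pct_over_65 = [0.0, 4.5, 6.0, 8.0, 9.5, 11.0, 12.5, 14.0, 15.0, 18.0, 22.0, 80.0]

classes = quantile_classes(pct_over_65, 4)
sizes = [len(c) for c in classes]
upper_breaks = [c[-1] for c in classes]
print(sizes)
print(upper_breaks)
```

Every class gets three values here, so no class is empty. The trade-off is that class widths become uneven: in this made-up data the top class stretches from 18.0 all the way to 80.0.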

The standard deviation method classifies data based on how far each value is from the mean, measured in standard deviations. It highlights how much the values vary from the average. Most of the tracts on the map fall within one standard deviation of the mean, so when the goal is to figure out how many older people live in different parts of the county, it is not the most useful representation. It does a good job of highlighting variability and extremes in the data, but frankly most of the other methods are decent at showing the extremes as well.
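Here is a small sketch of the calculation behind this method, assuming classes one standard deviation wide and the population standard deviation (both are my assumptions; ArcGIS offers other interval sizes). The values are invented for illustration.

```python
import math
import statistics


def z_scores(values):
    """Express each value as a number of standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population SD; an assumption of this sketch
    return [(v - mean) / sd for v in values]


def sd_class(z):
    """Class index: -1 means between 1 SD below and the mean, 0 means between the mean and 1 SD above."""
    return math.floor(z)


# Hypothetical "percent over 65" values, not the actual census data.
pct_over_65 = [0.0, 4.5, 6.0, 8.0, 9.5, 11.0, 12.5, 14.0, 15.0, 18.0, 22.0, 80.0]

scores = z_scores(pct_over_65)
sd_classes = [sd_class(z) for z in scores]
within_one = sum(1 for z in scores if abs(z) <= 1)
print(sd_classes)
print(within_one, "of", len(pct_over_65), "values are within one SD of the mean")
```

In this example 11 of the 12 values land within one standard deviation of the mean, so the map would mostly show two classes plus one extreme, which matches the impression that this method is better at flagging outliers than at describing the bulk of the data.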

Finally, the natural breaks method seeks to minimize the variance within classes and maximize the variance between classes. It identifies "natural" groupings inherent in the data, which makes it excellent for data with significant gaps or clusters, since the class boundaries adapt to the distribution of the dataset. This can reveal patterns that are not visible with the other methods, although it may conceal smaller variations within the clusters. In my opinion, the natural breaks method is the most legible in terms of class differentiation.
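The objective described above can be sketched with a brute-force search: try every way of cutting the sorted values into classes and keep the one with the smallest total within-class squared deviation. This is only an illustration of the goal on a tiny made-up dataset, not the optimized algorithm ArcGIS uses, and the function names are my own.

```python
from itertools import combinations


def within_class_ss(group):
    """Sum of squared deviations of a group from its own mean."""
    mean = sum(group) / len(group)
    return sum((v - mean) ** 2 for v in group)


def natural_breaks_bruteforce(values, n_classes):
    """Find the split of sorted values that minimizes within-class variation.

    Tries every possible set of cut points, so it is only practical
    for very small datasets.
    """
    ordered = sorted(values)
    n = len(ordered)
    best_cost, best_groups = None, None
    for cuts in combinations(range(1, n), n_classes - 1):
        edges = (0,) + cuts + (n,)
        groups = [ordered[a:b] for a, b in zip(edges, edges[1:])]
        cost = sum(within_class_ss(g) for g in groups)
        if best_cost is None or cost < best_cost:
            best_cost, best_groups = cost, groups
    return best_groups


# Hypothetical values with three obvious clusters.
sample = [2.0, 3.0, 4.0, 20.0, 21.0, 22.0, 60.0, 61.0]
groups = natural_breaks_bruteforce(sample, 3)
print(groups)
```

The class boundaries fall in the gaps between the clusters, which is the behavior that makes natural breaks maps easy to read when the data is clumpy.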

My favorite personal touch to these map layouts was the graduated color symbology. I intentionally used a red-to-gray color ramp, with red for the lowest percentage of people over 65 and gray for the highest. This was a bit of an inside joke to myself, but it also gives the map an attractive look and a meaning most readers can appreciate.
