Data Quality

To assess your map's data quality, first identify where your data came from. Railyard's guidelines consider two dimensions of data quality: source quality and level of detail.

Data Source

Data source refers to where your data comes from. For example, if you are making a map of the US, you could use official US Census data, or you could use OpenStreetMap (OSM). The source of your data largely determines the quality of the data in your map.

Methodology

Methodology refers to how you processed your data. For example, if you are using OSM data, did you use the population counts/positions as is, or did you augment that data with government data to distribute the population more accurately? The methodology you used will also help determine the quality of the data in your map.
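
As an illustration of the augmentation approach mentioned above, here is a minimal sketch of spreading an official population total across smaller cells in proportion to OSM-derived density weights. The function name, the cell identifiers, and the numbers are all hypothetical, and proportional allocation is only one possible way to do this; it is not necessarily the method any particular tool uses.

```python
def distribute_population(total, weights):
    """Split an official total across cells in proportion to their weights.

    `total` is a population count from a government source (assumed).
    `weights` maps a cell id to a non-negative density weight, for
    example one derived from OSM building density (assumed).
    """
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    return {cell: total * w / weight_sum for cell, w in weights.items()}


# Hypothetical municipality with three cells and made-up weights.
municipal_total = 12000
osm_density = {"cell_a": 3.0, "cell_b": 1.0, "cell_c": 2.0}

allocation = distribute_population(municipal_total, osm_density)
print(allocation)
```

Because the total still comes from an official source and only its spatial distribution relies on OSM, data processed this way would fall under the OSM-augmented category rather than the pure OSM one.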

Source Quality

The benchmark for source quality is the vanilla Subway Builder game, which uses real US Census data from 2023. The classifications for source quality are as follows:

| Source Quality | Description | Examples |
| --- | --- | --- |
| high-data-quality | Official government sources (on par with vanilla maps) | LODES, INSEE |
| medium-data-quality | OSM data augmented by government or official data | Using municipal level gov't data but augmenting that with OSM density to distribute residents |
| low-data-quality | Pure OSM unvetted source data | Using OSM population counts/positions as is |
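
The classification above can be sketched as a small decision function. This is only an illustration: the function name and its two boolean parameters are assumptions made for this example, not part of Railyard or Subway Builder, and the table does not cover data that uses neither kind of source, so the sketch rejects that case.

```python
from enum import Enum


class SourceQuality(Enum):
    HIGH = "high-data-quality"
    MEDIUM = "medium-data-quality"
    LOW = "low-data-quality"


def classify_source_quality(uses_government_data: bool, uses_osm_data: bool) -> SourceQuality:
    """Map the kinds of sources used to a source quality label (hypothetical helper)."""
    if uses_government_data and uses_osm_data:
        # OSM data augmented by government or official data.
        return SourceQuality.MEDIUM
    if uses_government_data:
        # Official government sources only, on par with vanilla maps.
        return SourceQuality.HIGH
    if uses_osm_data:
        # OSM counts/positions used as is.
        return SourceQuality.LOW
    # Assumption: the table does not describe this case, so refuse to guess.
    raise ValueError("no recognized data source given")


print(classify_source_quality(True, False).value)
print(classify_source_quality(True, True).value)
print(classify_source_quality(False, True).value)
```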

Examples of Government Sources

Note

If you use a tool that pulls from real government sources, such as rslurry's US Demand Data Generator (the tool used in the guide to creating custom US maps), your data is considered high quality because the tool uses official US LODES data.

Level of Detail

The level of detail of your data is determined by the density of the dots on your map (how spread out they are). The classifications for level of detail are as follows:

| Level of Detail | Description | Examples |
| --- | --- | --- |
| high-detail | Dots at the density of the minimal official statistical area | Census blocks |
| medium-detail | Dots at the density of higher level statistical areas | Census tracts, broad meshes |
| low-detail | Dots at the density of entire neighborhoods/municipalities | Low-resolution OSM data |
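
The level-of-detail table can likewise be expressed as a simple lookup. The unit names used as keys below are hypothetical identifiers chosen for this sketch; only the three labels come from the table above.

```python
# Hypothetical unit names mapped to the level-of-detail labels from the table.
LEVEL_BY_UNIT = {
    "census_block": "high-detail",
    "census_tract": "medium-detail",
    "broad_mesh": "medium-detail",
    "neighborhood": "low-detail",
    "municipality": "low-detail",
}


def classify_level_of_detail(unit: str) -> str:
    """Return the level-of-detail label for a statistical unit (hypothetical helper)."""
    try:
        return LEVEL_BY_UNIT[unit]
    except KeyError:
        raise ValueError(f"unknown statistical unit: {unit!r}") from None


print(classify_level_of_detail("census_block"))
print(classify_level_of_detail("census_tract"))
print(classify_level_of_detail("municipality"))
```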