AGS delivers an extensive range of the highest quality demographic data products and the Mosaic™ lifestyle segmentation system. All databases are derived from superior source data and the most sophisticated, refined, and proven methodologies.
- Core Demographics
- Income by Age
- Consumer Spending
- Retail Potential
- Businesses and Employees
- MOSAIC Segmentation
- Crime Risk
The estimates and projections database includes a wide range of core demographic variables for the current year and 5-year projections, covering five broad topic areas: population, households, income, labor force, and dwellings. With a foundation of the Experian household-level databases and over fifteen years of experience in demographic forecasting, AGS offers the highest quality demographic estimates in the marketplace today.
Since the 2005 update, we have been steadily refining our base population and household models so that they more accurately incorporate changes to the postal delivery counts. The improvement will be most noticeable in new growth areas.
We fully incorporate the Census Bureau’s American Community Survey (ACS) results. The ACS is an annual survey which over the course of the next few years will result in a national rolling estimates database which will be the replacement for the decennial SF3 sample database. The ACS results at the county scale are an excellent means of tracking demographic attributes over the course of the decade. These, however, will need to be fully supplemented over time with the detail available from the Experian household level files in order to provide block group estimates over the coming decade.
Methodology and Data Sources
- Census tabulations from 1990, 2000 and most recently, the release of the 2010 Census
- The Census Bureau’s American Community Survey (ACS) results. The ACS is an annual survey which over the course of several years will result in a national rolling estimates database which is eventually intended to replace the decennial SF3 sample database.
- USPS and commercial source ZIP+4 level delivery statistics.
- Census Bureau estimates and projections of population characteristics at various levels of geographic detail, including the latest estimates of population at the city level.
- Bureau of Labor Statistics estimates and projections of employment by industry and occupation at the county level.
- Medicare eligible population counts at the ZIP code level, including population by sex and 5-year age cohorts, provided by the Health Care Financing Administration of Social Security. These counts provide a very accurate local count of the population aged 65 and higher.
- Internal Revenue Service statistics on tax filers and year-to-year migration.
- The Census Bureau’s Current Population Survey, which provides detailed demographic breakdowns and enables a thorough longitudinal analysis of demographic trends.
- Experian’s INSOURCE database, a household level credit and demographic database which covers the vast majority of households.
Current Year Estimates and 5 Year Projections
Householder under 25 years
Less than $10,000
$10,000 to $14,999
$15,000 to $19,999
$20,000 to $24,999
$25,000 to $34,999
$35,000 to $39,999
$40,000 to $49,999
$50,000 to $59,999
$60,000 to $74,999
$75,000 to $99,999
$100,000 to $124,999
$125,000 to $149,999
$150,000 to $199,999
$200,000 and over
Householder 25 to 34 years
Householder 35 to 44 years
Householder 45 to 54 years
Householder 55 to 64 years
Householder 65 to 74 years
Householder 75 and over
The Consumer Spending database covers most major household expenditures in a multi-level hierarchical classification. Expenditures can be expressed either as aggregate expenditure or per household expenditure for any geographic level from the block group to national.
The major categories represented are:
- Food and Beverages
- Household Operations
- Household Furnishings/Equipment
- Health Care
- Personal Care
- Tobacco Products
- Miscellaneous Expenses
- Cash Contributions
- Personal Insurance
The retail potential database consists of average household and total market potential estimates by each of sixty-eight retail store types. The store types are based on the NAICS classification and are listed below:
44111 New Car Dealers
44112 Used Car Dealers
44121 Recreational Vehicle Dealers
44122 Motorcycle and Boat Dealers
44131 Auto Parts and Accessories
44132 Tire Dealers
44211 Furniture Stores
44221 Floor Covering Stores
44229 Other Home Furnishing Stores
44311 Appliances and Electronics Stores
44312 Computer Stores
44313 Camera and Photography Stores
44411 Home Centers
44412 Paint and Wallpaper Stores
44413 Hardware Stores
44419 Other Building Materials Stores
44421 Outdoor Power Equipment Stores
44422 Nursery and Garden Stores
44511 Grocery Stores
44512 Convenience Stores
44521 Meat Markets
44522 Fish and Seafood Markets
44523 Fruit and Vegetable Markets
44529 Other Specialty Food Markets
44531 Liquor Stores
44611 Pharmacy and Drug Stores
44612 Cosmetics and Beauty Stores
44613 Optical Goods Stores
44619 Other Health and Personal Care Stores
44711 Gasoline Stations with Convenience Stores
44719 Gasoline Stations without Convenience Stores
44811 Men’s Clothing Stores
44812 Women’s Clothing Stores
44813 Children’s and Infants’ Clothing Stores
44814 Family Clothing Stores
44815 Clothing Accessory Stores
44819 Other Apparel Stores
44821 Shoe Stores
44831 Jewelry Stores
44832 Luggage Stores
45111 Sporting Goods Stores
45112 Hobby, Toy, and Game Stores
45113 Sewing and Needlecraft Stores
45114 Musical Instrument Stores
45121 Book Stores
45122 Record, Tape, and CD Stores
45211 Department Stores
45291 Warehouse Superstores
45299 Other General Merchandise Stores
45321 Office and Stationery Stores
45322 Gift and Souvenir Stores
45331 Used Merchandise Stores
45391 Pet and Pet Supply Stores
45392 Art Dealers
45393 Mobile Home Dealers
45399 Other Miscellaneous Retail Stores
45411 Mail Order and Catalog Stores
45421 Vending Machines
45431 Fuel Dealers
45439 Other Direct Selling Establishments
7211 Hotels and Other Travel Accommodations
7212 RV Parks
7213 Rooming and Boarding Houses
7221 Full Service Restaurants
7222 Limited Service Restaurants
7223 Special Food Services and Catering
7224 Drinking Places
Methodology and Data Sources
The primary data sources used in the construction of the database include:
- Current year AGS Consumer Expenditure Estimates
- 2002 Census of Retail Trade, Merchandise Line Sales
- Census Bureau Monthly Retail Trade
The Census of Retail Trade presents a table known as the Merchandise Line summary, which relates approximately 120 merchandise lines (e.g. hardware) to each of the store types. For each merchandise line, the distribution of sales by store type can be computed, yielding a conversion table which apportions merchandise line sales by store type.
The AGS Consumer Expenditure database was re-computed to these merchandise lines by aggregating both whole and partial categories, yielding, at the block group level, a series of merchandise line estimates which are consistent with the AGS Consumer Expenditure database.
These two components were then combined in order to derive estimated potential by store type. The results were then compared to current retail trade statistics to ensure consistency and completeness.
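The apportionment step described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the two merchandise lines, and all dollar figures are hypothetical and are not taken from the Census of Retail Trade or from the AGS databases.

```python
from collections import defaultdict


def line_to_store_shares(line_sales_by_store):
    """Convert sales per merchandise line and store type into shares per line."""
    shares = {}
    for line, by_store in line_sales_by_store.items():
        total = sum(by_store.values())
        shares[line] = {store: sales / total for store, sales in by_store.items()}
    return shares


def store_potential(line_spending, shares):
    """Apportion merchandise line spending for one area across store types."""
    potential = defaultdict(float)
    for line, spending in line_spending.items():
        for store, share in shares[line].items():
            potential[store] += spending * share
    return dict(potential)


# Hypothetical national sales by merchandise line and store type.
national = {
    "hardware": {"44413 Hardware Stores": 60.0, "44411 Home Centers": 40.0},
    "paint": {"44412 Paint and Wallpaper Stores": 50.0, "44411 Home Centers": 50.0},
}
shares = line_to_store_shares(national)

# Hypothetical block group spending by merchandise line.
block_group = {"hardware": 1000.0, "paint": 400.0}
potential = store_potential(block_group, shares)
```

Because the shares for each merchandise line sum to one, the store type potentials add up to the same total as the merchandise line spending they were derived from.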
Business & Employees
BusinessCounts is a geographic summary database of business establishments, employment, and occupation. The core BusinessCounts data, which now utilizes the industry standard InfoUSA database as its primary source data, includes data to the major SIC group with detailed establishment types. The database is available at the block group level and higher, including all standard geographic aggregations.
BusinessCounts is a vital addition to residential demographic data, in that the success of many business establishments is dependent upon not only the residential population, but also the working population during the daytime. Based primarily on the InfoUSA business database and supplemented by various public data sources, BusinessCounts provides a clear look at the range and size of establishments and their employees within any geographic area.
BusinessCounts is a geographic summary database of business establishments and employees for nearly ten million businesses and one hundred and thirty million employees. The database is available for all standard levels of geography including block group.
BusinessCounts is a geographic compilation of the InfoUSA business list, supplemented by occupational data from the Bureau of Labor Statistics and the County Business Patterns program. The primary variables available include:
- Total – Establishments, Employees
- Size – Establishments by size
- Occupation – Employment by occupation
- Major Industry – Establishments, Employees
- NAICS – Establishments, Employees by 3 and 4 digit
Methodology and Data Sources
The core source is the InfoUSA Business Database, which is built from a careful integration of commercial databases, compiled white and yellow page directory data, city directories, corporate annual reports, and securities filings. The BusinessCounts file is current to January 2008.
In years past, a different data source was used by AGS to compile this database, and users should review the notes at the end of this document that outline the type and scope of the impacts of the change in source data. The primary changes that will be noted by users include:
- The ability to release establishment level data for use in mapping applications, with selection based on company name, SIC, geographic area, and company size
- A greatly expanded number of establishments, many of which are small and unclassified, but nevertheless reflect changes in the corporate landscape
- Improved SIC coding at establishments which include more than one major industrial group
- Reduced duplication of records – and subsequent over counting of employment – at companies which contain multiple legal entities at the same address
The database has been thoroughly cleansed for address consistency and geocoded. Virtually all records within the database are geocoded, although in some cases with less positional accuracy than others.
MOSAIC is a geodemographic segmentation system developed by Experian and marketed in over twenty countries worldwide. Each of the nearly one-quarter million block groups was classified into one of sixty segments on the basis of a wide range of demographic characteristics. The basic premise of geodemographic segmentation is that people tend to gravitate towards communities with other people of similar backgrounds, interests, and means. MOSAIC is linked to the systems in other nations through the Global MOSAIC classification, which consists of fourteen market segments found in every modernized country.
MOSAIC is one of over twenty neighborhood classification systems built by Experian staff, whose international segmentation experience stretches back over twenty years. Along with the international experience applied in this product, some of the most experienced geodemographers in North America were involved with the development of MOSAIC. During the product refinement process, MOSAIC was compared to other clustering systems in a variety of tests. The MOSAIC assignments are updated annually by incorporating updated AGS demographics into the segmentation model, ensuring that the assignment is as accurate as possible given shifts in local area demographics.
The resulting segmentation system consists of sixty segments which are presented as twelve separate groups:
- A Affluent Suburbia
- B Upscale America
- C Small-town Success
- D Blue-collar Backbone
- E American Diversity
- F Metro Fringe
- G Remote America
- H Aspiring Contemporaries
- I Rural Villages & Farms
- J Struggling Societies
- K Urban Essence
- L Varying Lifestyles
CrimeRisk is a block group and higher level geographic database consisting of a series of standardized indexes for a range of serious crimes against both persons and property. It is derived from an extensive analysis of several years of crime reports from the vast majority of law enforcement jurisdictions nationwide. The crimes include murder, rape, robbery, assault, burglary, larceny, and motor vehicle theft. These categories are the primary reporting categories used by the FBI in its Uniform Crime Report (UCR), with the exception of Arson, for which data is very inconsistently reported at the jurisdictional level.
In accordance with the reporting procedures used in the UCR reports, aggregate indexes have been prepared for personal and property crimes separately, as well as a total index. While this provides a useful measure of the relative “overall” crime rate in an area, it must be recognized that these are unweighted indexes, in that a murder is weighted no more heavily than a purse snatching in the computation. For this reason, caution is advised when using any of the aggregate index values.
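A small sketch shows why an unweighted total index calls for caution. It assumes, purely for illustration, that an index is the local rate expressed relative to a national rate of 100; the source does not state the exact index formula, and all counts below are hypothetical.

```python
def crime_index(local_count, local_pop, national_count, national_pop):
    """Index of 100 means the local rate equals the national rate (assumed scale)."""
    local_rate = local_count / local_pop
    national_rate = national_count / national_pop
    return 100.0 * local_rate / national_rate


# Hypothetical national counts for two crime types.
NATIONAL_POP = 1_000_000
NATIONAL = {"murder": 50, "larceny": 20_000}

# Two hypothetical areas with the same population and the same total count,
# but a very different mix of crime types.
AREA_POP = 10_000
area_a = {"murder": 10, "larceny": 90}
area_b = {"murder": 0, "larceny": 100}


def total_index(area):
    """Unweighted total: every incident counts the same regardless of type."""
    return crime_index(sum(area.values()), AREA_POP,
                       sum(NATIONAL.values()), NATIONAL_POP)


total_a = total_index(area_a)
total_b = total_index(area_b)
murder_a = crime_index(area_a["murder"], AREA_POP, NATIONAL["murder"], NATIONAL_POP)
murder_b = crime_index(area_b["murder"], AREA_POP, NATIONAL["murder"], NATIONAL_POP)
```

Under these assumptions the two areas receive identical total indexes even though their murder indexes differ sharply, which is the reason to read the individual crime indexes alongside any aggregate value.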
The primary source of CrimeRisk was a careful compilation and analysis of the FBI Uniform Crime Report databases. On an annual basis, the FBI collects data from each of about 16,000 separate law enforcement jurisdictions at the city, county, and state levels and compiles these into its annual Uniform Crime Report (UCR). The latest national crime report can be obtained either from the FBI web site in Adobe Portable Document (PDF) format or can be ordered directly from the FBI. While useful, the UCR provides detailed data only for the largest cities, counties, and metropolitan areas.
The original analysis was undertaken by obtaining detailed jurisdictional level data for the years 1990 through 1996, which were supplemented with 1999 preliminary UCR statistics at the State level and for cities and metropolitan areas where those have been released. AGS now uses UCR data from 1998-2006. The preliminary 2007 release data was used to balance the models to the latest available data.
A considerable effort was made to correct a number of problems that are prevalent within the FBI databases, including:
- The standardization of jurisdictional names: the FBI does not employ Census Bureau codes in its databases and the jurisdictional names contain numerous typographical errors and format discrepancies which needed to be manually corrected
- Reporting by individual jurisdictions can be inconsistent from year to year: data for some jurisdictions is missing for one or more years, and these gaps required handling
- Reporting for some crime types is inconsistent between jurisdictions. The FBI handles this by simply suppressing the statistics entirely for those areas. This primarily affects the rape category for Illinois, where statistics are suppressed for all but the largest jurisdictions. These missing values were handled via the modeling process, in which rape estimates were prepared for these jurisdictions by using a model which related rape incidence to other crime types
- The standardization of the database to account for jurisdictional overlaps. For example, the California Highway Patrol has jurisdiction over only state and Interstate highways in urban areas.
- Crime rates in general have been declining over the past several years, so it was necessary to adjust the historical data to reflect current crime rates.
Once this correction and standardization effort was completed, the database consisted of a time series of six years of data covering:
- All cities and towns which have their own police agency.
- All cities and towns where policing for the local jurisdiction is contracted to a higher level agency but which tracks statistics separately.
- A record for each county, which covers the population not covered by either of the two cases above. This is normally either a County Sheriff (or equivalent) or a State level jurisdiction, which reports incidence of crime by county (e.g. in New York, the State Trooper).
The initial models were undertaken using a subset of this database. In the smallest cities, a single murder will have a profound effect on the crime rate per 100,000 population that would severely distort the resulting models. Cities with fewer than 2,500 people were reassigned to their parent counties for the purpose of the analysis. A wide range of 1990 Census and current year demographic attributes was extracted from AGS’ databases for the remaining areas (approximately 8,500 separate “jurisdictions”). This database was then used as the primary modeling database and was used later for scaling purposes.
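The small-city problem and the reassignment rule can be illustrated with a short sketch. The record layout, field names, and figures are hypothetical; only the 2,500 population threshold and the per 100,000 rate come from the text above.

```python
def rate_per_100k(crimes, population):
    """Crime rate expressed per 100,000 population."""
    return crimes * 100_000 / population


def reassign_small_cities(cities, county_remainders, min_pop=2500):
    """Fold cities below min_pop into a copy of their parent county record."""
    counties = {name: dict(rec) for name, rec in county_remainders.items()}
    kept = []
    for city in cities:
        if city["population"] < min_pop:
            parent = counties[city["county"]]
            parent["population"] += city["population"]
            parent["murders"] += city["murders"]
        else:
            kept.append(city)
    return kept, counties


# Hypothetical records: one very small city and one larger city.
cities = [
    {"name": "Smallville", "county": "Example County", "population": 800, "murders": 1},
    {"name": "Midtown", "county": "Example County", "population": 40_000, "murders": 2},
]
county_remainders = {"Example County": {"population": 19_200, "murders": 1}}

small_city_rate = rate_per_100k(1, 800)
kept, counties = reassign_small_cities(cities, county_remainders)
county_rate = rate_per_100k(counties["Example County"]["murders"],
                            counties["Example County"]["population"])
```

In this example a single murder gives the 800-person city a rate of 125 per 100,000, while the same incident folded into the county record produces a far more moderate rate.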
Each of the seven crime types was modeled separately, using an initial range of about 65 socio-economic characteristics taken from the 2000 Census and AGS’ current year estimates. Separate models were constructed for each of the nine Census regions (e.g. New England, East North Central, Pacific) in order to account for regional differences in crime rates and the demographic characteristics that underlie them. The models constructed typically accounted for over 85% of the variance in crime rates at this “jurisdiction” level, although it should be noted that the results for property crimes were generally more reliable than for personal crimes.
The results of these models were then applied to the block group level using the same demographic attributes compiled at the block group level. The resulting estimates were then scaled to match the master database of 8,500 jurisdictions. For cities, the block groups within each city were scaled to match the city total. For areas outside of these cities (or for smaller centers), results were scaled to match the county total after adjusting for those cities scaled separately.
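The scaling step can be sketched as a simple proportional adjustment. This assumes that block group estimates are multiplied by a single factor so that they sum to the control total; the source does not spell out the exact scaling method, and the identifiers and figures are hypothetical.

```python
def scale_to_total(estimates, control_total):
    """Proportionally scale modeled estimates so they sum to the control total."""
    raw_total = sum(estimates.values())
    if raw_total == 0:
        # Nothing to scale against; return the estimates unchanged.
        return dict(estimates)
    factor = control_total / raw_total
    return {key: value * factor for key, value in estimates.items()}


# Hypothetical modeled burglary counts for the block groups in one city.
modeled = {"bg1": 30.0, "bg2": 50.0, "bg3": 20.0}

# Hypothetical reported city total from the master jurisdiction database.
city_total = 120.0

scaled = scale_to_total(modeled, city_total)
```

Proportional scaling preserves the relative ranking of block groups produced by the model while forcing agreement with the reported jurisdiction total.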
The final crime rate estimates were then weighted by population and aggregated to the national totals.