most-common-name/adjusted-name-combinations-matrix: 3

This directory contains the code and data behind the story "Dear Mona, What’s The Most Common Name In America?"

The main script is most-common-name.R.

There are four input files:

  • state-pop.csv - Total population and Hispanic population by state.
  • surnames.csv - Data on surnames from the U.S. Census Bureau, including a breakdown by race/ethnicity.
  • aging-curve.csv - Data from the Social Security Administration on the chances that someone born in the decade shown was still alive in 2013: http://www.ssa.gov/oact/NOTES/as120/LifeTables_Tbl_7.html
  • adjustments.csv - Taken directly from Lee Hartman's article: http://mypage.siu.edu/lhartman/johnsmith.html.

And five output files:

  • adjusted-name-combinations-list.csv - Adjusted estimates for the most common full names.
  • adjusted-name-combinations-matrix.csv - The same data as adjusted-name-combinations-list.csv, in matrix form. These are the estimates presented in the second (and final) table of the article.
  • independent-name-combinations-by-pop.csv - Matrix of estimates for the 100 most common first names by the 100 most common surnames. These were calculated using independent odds and are displayed in the first table of the article.
  • new-top-firstNames.csv - Final estimated ranking of top first names.
  • new-top-surnames.csv - Final estimated ranking of top surnames.
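
As a rough illustration of what "independent odds" and "matrix form" mean in the output files above, here is a minimal sketch in Python. It is not the code in most-common-name.R. It assumes that independence means the share of people with a given full name equals the first-name share times the surname share, and the function names, the shares and the population figure are all invented for the example.

```python
def independent_estimates(first_shares, surname_shares, population):
    """Build (first, surname, estimate) rows, treating first names and
    surnames as statistically independent (an assumption of this sketch)."""
    rows = []
    for first, p_first in first_shares.items():
        for surname, p_surname in surname_shares.items():
            # Expected head count = P(first) * P(surname) * population.
            rows.append((first, surname, p_first * p_surname * population))
    return rows


def list_to_matrix(rows):
    """Pivot list-form rows into {first_name: {surname: estimate}}."""
    matrix = {}
    for first, surname, estimate in rows:
        matrix.setdefault(first, {})[surname] = estimate
    return matrix


# Hypothetical shares and population, chosen only to keep the arithmetic easy.
first_shares = {"John": 0.02, "Mary": 0.01}
surname_shares = {"SMITH": 0.01, "JOHNSON": 0.005}
population = 1_000_000

rows = independent_estimates(first_shares, surname_shares, population)
matrix = list_to_matrix(rows)
print(matrix["John"]["SMITH"])
```

With these made-up inputs, the John/SMITH estimate is 0.02 × 0.01 × 1,000,000 = 200 people, up to floating-point rounding. The adjusted files differ from this by definition, since they apply the corrections from adjustments.csv instead of relying on independence alone.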

Data license: CC Attribution 4.0 License · Data source: fivethirtyeight/data on GitHub · About: simonw/fivethirtyeight-datasette

Column       Value
rowid        3
Unnamed: 0   44
FirstName    John
SMITH        18715.2588143273
JOHNSON      12576.6238204357
WILLIAMS     14458.1874809372
BROWN        9491.51836762503
JONES        10511.9336476099
GARCIA       4175.73777549329
RODRIGUEZ    3561.59668036704
MILLER       12369.1993434729
MARTINEZ     4084.62759372798
DAVIS        10736.8236342261
HERNANDEZ    (blank)
LOPEZ        (blank)
GONZALEZ     (blank)
WILSON       8311.00720172512
ANDERSON     8349.66905631504
THOMAS       7910.78066433961
TAYLOR       7503.30428117198
LEE          6801.60239746484
MOORE        7491.87125931207
JACKSON      5159.56639804023
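
Some cells in the row above appear to be blank, so anything that reads the matrix has to allow for missing values. The sketch below, in Python rather than the R of the original script, shows one way to do that. The comma-separated layout, the three-column excerpt, the blank HERNANDEZ cell and the helper name read_matrix are assumptions made for illustration; only the two numbers are taken from the row shown here.

```python
import csv
import io

# Hypothetical excerpt: the real file has more columns and rows, and the
# blank HERNANDEZ cell is an assumption about how the row above reads.
sample = """FirstName,SMITH,JOHNSON,HERNANDEZ
John,18715.2588143273,12576.6238204357,
"""


def read_matrix(text):
    """Parse matrix-form CSV text into {first_name: {surname: value or None}}."""
    matrix = {}
    for record in csv.DictReader(io.StringIO(text)):
        first = record.pop("FirstName")
        # Blank or missing cells become None instead of raising ValueError.
        matrix[first] = {
            surname: (float(value) if value and value.strip() else None)
            for surname, value in record.items()
        }
    return matrix


matrix = read_matrix(sample)

# Rank the surnames that have an estimate for "John", largest first.
ranked = sorted(
    ((surname, value) for surname, value in matrix["John"].items() if value is not None),
    key=lambda pair: pair[1],
    reverse=True,
)
print(ranked)
```

Mapping blanks to None rather than 0 keeps "no estimate" distinct from "an estimate of zero people", which matters if the values are later summed or ranked.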