Jacob Van Leeuwen
Brigham Young University
NBER Working Papers and Publications
September 2019: Combining Family History and Machine Learning to Link Historical Records
with , , : w26227
A key challenge for research on many questions in the social sciences is that it is difficult to link historical records in a way that allows investigators to observe people at different points in their life or across generations. In this paper, we develop a new approach that relies on millions of record links created by individual contributors to a large, public, wiki-style family tree. First, we use these “true” links to inform the decisions one needs to make when using traditional linking methods. Second, we use the links to construct a training data set for use in supervised machine learning methods. We describe the procedure we use and illustrate the potential of our approach by linking individuals across the 100% samples of the US decennial censuses from 1900, 1910, and 1920. We...