Household Matching Training

In January 2000, a staff representative from each side of the Board observed the Bureau's Before Follow-Up (BFU) clerical matching training at the National Processing Center in Jeffersonville, IN.  This training is key to the success of the Accuracy and Coverage Evaluation (ACE) survey.  Accurately matching results from the census and the survey in sampled areas is critical to the Bureau's Dual-System Estimation.  The Bureau, accordingly, endeavored to make the training as authentic as possible.  Throughout, the trainees sat in the same rooms and at the same computers that will be used when the actual matching process begins (the Bureau refers to this as the "production" as opposed to the "training" stage).  Moreover, the software package that will be used in production was used in training, and the block clusters the trainees worked on were drawn from available Sacramento, CA and Columbia, SC dress rehearsal data.

The following is intended as an overview of the household matching guidelines and training.  It should not be considered an exhaustive reference.

Address Listing and Coding

During the ACE listing phase, listers canvassed neighborhoods within the ACE sample blocks and recorded address information in Independent Listing Books (ILBs).  This information will be entered into a mega-computer and compared to an extract of the Decennial Master Address File (DMAF).  Upon completion of the computer matching, four categories of addresses will emerge: matched, possibly matched, unlinked/non-matched census addresses, and unlinked/non-matched ACE addresses.  The computer-generated results will then be entered into a database management system called the Housing Unit Matching Review and Coding System (HUMaRCS) for clerical matching.  (HUMaRCS was developed by the Gunnison Group working in tandem with the Bureau's Decennial Statistical Studies Division.)  During this operation, known as Before Follow-Up (BFU) matching, all ACE and census addresses that were not computer-matched in the previous operation will be reviewed and assigned one of the following clerical match codes:


M The ACE and census addresses match.
P The ACE and census addresses possibly match.  There is not enough information to assign a match with confidence.
NI The ACE address does not match to a census address.
NE The census address does not match to an ACE address.
DI The ACE address is a possible duplicate with another ACE address.
DE The census address is a possible duplicate with another census address.
RV The match status is not clear.  A review by a Technician or Analyst is needed to resolve status before sending to the field.

Once all of the block cluster addresses are clerically coded, they will be sent to the next ACE operation, Housing Unit Field Follow-up (HUFF).  During HUFF, ACE interviewers will visit addresses to obtain additional information that will assist the clerical matchers in resolving the P, NI, NE, DI, and DE addresses.  The operation will then return to the clerical matcher for After Follow-up Housing Unit (AFUHU) coding.  This operation will result in the creation of the Preliminary Enhanced List (PEL) of all ACE and census housing units at the time of the housing unit follow-up operation in the ACE sample area.
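The codes and the follow-up routing described above can be summarized in a short sketch.  This is only an illustration of the text, not HUMaRCS itself: the names MatchCode, FOLLOW_UP_CODES, and next_step are hypothetical, and the handling of M-coded addresses is an assumption, since the text names only the codes that field follow-up helps resolve.

```python
from enum import Enum


class MatchCode(Enum):
    """Clerical match codes as listed in the BFU guidelines."""
    M = "ACE and census addresses match"
    P = "ACE and census addresses possibly match"
    NI = "ACE address does not match a census address"
    NE = "census address does not match an ACE address"
    DI = "ACE address is a possible duplicate of another ACE address"
    DE = "census address is a possible duplicate of another census address"
    RV = "status unclear; Technician or Analyst review needed"


# Codes the text says HUFF interviewers help the clerical matchers resolve.
FOLLOW_UP_CODES = {
    MatchCode.P, MatchCode.NI, MatchCode.NE, MatchCode.DI, MatchCode.DE,
}


def next_step(code: MatchCode) -> str:
    """Hypothetical routing helper: what a coded address needs next."""
    if code is MatchCode.RV:
        # RV must be resolved before the address is sent to the field.
        return "technician/analyst review"
    if code in FOLLOW_UP_CODES:
        return "field follow-up (HUFF)"
    # Assumption: matched addresses need no field resolution.
    return "no field resolution listed"
```

For example, next_step(MatchCode.NI) returns "field follow-up (HUFF)", while an RV address is held for review first.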

Next comes the Person Phase, which will establish a list of persons who resided in the housing units listed on the Enhanced List, or EL.  (Before the Person Phase begins, large block subsampling will be used to reduce the number of listed housing units on the PEL.)  ACE interviewers will then conduct Computer Assisted Personal Interviews (CAPI) with the people living at each of the addresses on the EL.  Using information collected during CAPI, a computer file containing ACE person data will be compared with a census file containing person information collected for Census Day.  The computer matching employs the Bureau's Primary Selection Algorithm (PSA).  Clerical matchers will then review ACE and census person information that the computer is unable to match (as in the Housing Unit BFU stage) and identify any ACE or census person requiring additional field information.  At this point, such cases will be sent to the field for follow-up before being returned to the clerical matchers in the After Follow-up (AFU) matching phase.

Housing Unit matching will be revisited after the Person Phase is completed, because the Housing Unit Phase will have been conducted before the inventory of census housing units is final.  As a result, a final housing unit processing will be necessary in order to produce housing unit coverage estimates.  Once again, the computer-clerical matcher pattern will be followed: the information will be fed into the computer, which will identify those housing units that were added to or deleted from the DMAF; a clerical review will be done on all ACE and census housing units identified in the computer processing operation; and Final Housing Unit follow-up interviews will be conducted, the results of which will be fed back to the clerical matchers for After Follow-up (AFU) matching.  Those housing units that are found to exist will be assigned the appropriate match code and those found not to have existed will be deleted.  The ACE Housing Unit file will then be updated.

Contractor

Science Applications International Corporation (SAIC) was awarded the clerical matching training contract by the Bureau.  SAIC staff is well versed in the particulars of the software, having conducted similar training for the Dress Rehearsals.  While the clerical matchers (a.k.a. "clerks") are temporary decennial staff, the technicians and analysts who make up the supervisory and quality check levels in the process are career professionals with years of Bureau experience.  They have all previously completed intensive matching training and conducted the actual matching of the Dress Rehearsal data.

Software

The HUMaRCS software divides the screen into roughly equal, scrollable thirds.  The top third contains those ACE and census addresses of a single block cluster that the computer linked/matched; the middle third contains unlinked/unmatched ACE addresses; and the bottom third contains unlinked/unmatched census addresses.  Clerks review unmatched cases and attempt to identify matches.  The software allows users to display images of both the ACE and census map-spotted maps to assist them in their work.  (Computer matches are not usually reviewed.)

The matching software is advanced and easy to operate, but has little flexibility.  Standard features on Windows-compatible software, such as resizable frames or "undo" options, are not available in HUMaRCS.  Bureau staff noted that this inflexibility is by design, in order to minimize error and maximize uniformity in matching.  This is a sensible precaution.

Matching Principles

Clerical matching operates on certain principles, the most important being what might be called the principle of non-contradiction.  That is, clerks are encouraged to link as many ACE and census addresses as possible - even if the information in the two addresses is not identical - as long as the information in the addresses to be linked is not expressly contradictory.  In short, the address matching guidelines can be loosely summarized as follows:

AGREE if: The information in two addresses is identical or contains only minor spelling differences.

NON-CONTRADICTORY if: The information is supplied for only one address; the information supplied is partially complete in one or both addresses and that which is present in one address does not contradict information supplied for the other; or differences in information result from misspelling of the householder's last name.

CONTRADICTORY if: Information from the two addresses cannot be classified into either the agree or non-contradictory category.

NO MATCH FOUND if: No possible match for either the ACE or census address is found.

Some examples: 415 Osborne Dr. and 415 Osbourn would, according to the guidelines, agree.  123 Main Ave. and 123 N. Main would be considered non-contradictory, if no other version of Main exists in the block cluster.  456 W. Frank St. and 456 E. Frank St. would be considered contradictory if both versions of Frank Street exist in the cluster.
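The three examples above can be reproduced with a minimal sketch of a pairwise address comparison.  This is not the Bureau's procedure or the HUMaRCS logic: the Address fields, the similar and compare helpers, and the 0.8 similarity threshold are all hypothetical, and the sketch assumes that a missing street suffix does not prevent agreement while a direction supplied for only one address does.  It also ignores the block-cluster context (whether other versions of a street exist in the cluster), which the guidelines take into account.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Address:
    number: str
    name: str
    direction: Optional[str] = None  # e.g. "N", "W"
    suffix: Optional[str] = None     # e.g. "Dr", "Ave"


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Treat two strings as the same when they differ only slightly.

    The threshold is an assumed value standing in for "minor spelling
    differences"; the guidelines do not define one.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def compare(ace: Address, census: Address) -> str:
    """Classify a pair of addresses under the simplified rules above."""
    # Different house numbers or clearly different street names conflict.
    if ace.number != census.number or not similar(ace.name, census.name):
        return "CONTRADICTORY"
    # Both addresses supply a direction (or a suffix) and the values differ.
    pairs = ((ace.direction, census.direction), (ace.suffix, census.suffix))
    for a, c in pairs:
        if a and c and a.lower() != c.lower():
            return "CONTRADICTORY"
    # A direction supplied for only one address: nothing conflicts,
    # but the two addresses are not identical either.
    if bool(ace.direction) != bool(census.direction):
        return "NON-CONTRADICTORY"
    # Assumption: a missing suffix alone does not prevent agreement.
    return "AGREE"
```

Under these assumptions, compare(Address("415", "Osborne", suffix="Dr"), Address("415", "Osbourn")) returns "AGREE", and the W. Frank St. and E. Frank St. pair returns "CONTRADICTORY".  The NO MATCH FOUND outcome is not a pairwise result; it would apply when no candidate address remains to compare against.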

Objective

The goal in the BFU stage is to produce a good address list for the follow-up interviewer.  Board staff did not observe the Bureau pushing clerks to unequivocally match addresses.  Rather, the clerks were instructed to match addresses only if they have a strong sense that they match.  The Bureau, according to the trainer, is counting on the clerks growing progressively comfortable over time - i.e., as they gain experience and familiarity, the clerks will grow more adept at resolving unmatched cases.

Presidential Members' Position

Based on the training observations, the Presidential Members of the Board do not believe there is cause to be concerned about an inordinate number of addresses being definitively but inaccurately matched.  The problem, in fact, may be a plethora of P-coded addresses, at least initially.  But given the importance of accuracy, we believe that the Bureau is right to err on the side of an abundance of P-coded as opposed to M-coded addresses.

Congressional Members' Position

The Congressional Members of the Board have concerns that attempts to match addresses are subject to a certain level of ambiguity, even with total fidelity to the matching guidelines.  Given the sensitivity of the Dual System Estimation procedure planned by the Bureau, small errors in matching relatively few households in the sample could magnify into large errors when projecting the results to larger regions or the entire country.
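For readers unfamiliar with dual-system estimation, the way an estimate responds to the number of matches can be illustrated with the textbook two-list (capture-recapture) estimator, in which the population estimate equals the census count times the survey count divided by the number of matches.  This is a simplified sketch and not the Bureau's production procedure, which this report does not detail; the function name and all figures below are made up for illustration.

```python
def dual_system_estimate(census_count: int, survey_count: int, matched: int) -> float:
    """Textbook two-list estimate: census_count * survey_count / matched."""
    if matched <= 0:
        raise ValueError("matched must be positive")
    return census_count * survey_count / matched


# Made-up figures for a single sampled area, for illustration only.
baseline = dual_system_estimate(1000, 950, 900)
# The same area with 1 percent of the true matches missed (891 instead of 900).
fewer_matches = dual_system_estimate(1000, 950, 891)

# Relative change in the estimate caused by the missed matches.
relative_change = fewer_matches / baseline - 1
print(f"estimate changes by {relative_change:.2%}")
```

In this basic form, missed matches raise the estimate and false matches lower it, roughly in proportion to the matching error.  How such errors carry through to larger areas depends on the full estimation design, which the sketch does not model.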

To cite one example, multi-unit structures with poorly marked unit designations (often found in dense urban areas) are extremely difficult to match with certainty.  Units could easily be mismatched during address listing, with two different apartments in the same building being given the same unit designation (e.g., Apt. A).  Conceivably, the Bureau's computer matching could code those two different apartments as a match, prior to clerical review.  As noted above, computer matches are rarely reviewed by clerks or analysts.

At this time, the Board does not have sufficient information to assess the risk of this and other potential errors in matching, particularly since the Bureau has refused to provide the Dress Rehearsal evaluation report on the Primary Selection Algorithm.  Without further information and analysis, we cannot affirm the integrity of the household matching procedure.


Quality Check

DSSD has produced detailed quality assurance guidelines.  In the early production stages, 100 percent of clerical matching will be checked by analysts and/or technicians.  After that, work will be reviewed on a sample basis.

Conclusion

The training provided for clerical-level staff was well executed and familiarized the staff with the Bureau's guidelines for housing unit matching.

Given the importance of matching to the integrity of the Accuracy and Coverage Evaluation (ACE) survey, the Board plans to send observers to the Person Matching training, scheduled for the fall, and transmit the resulting observations in a future report.