
U.S. CENSUS MONITORING BOARD

A Guide to Statistical Adjustment:

How it Really Works.

Report to Congress

May 23, 2001

U.S. Census Monitoring Board

Congressional Members

4700 Silver Hill Road

P.O. Box 610

Suitland, MD 20752

(301) 457-5080

www.cmbc.gov

email: feedback@cmbc.gov


TRANSMITTAL LETTER

Report to Congress

CMBC 72-424


Table of Contents
Letter of Transmittal
Executive Summary
Census 2000 & the Accuracy and Coverage Evaluation
Conclusion
Appendix A
Appendix B

Census Monitoring Board, Congressional Members

Executive Summary

With the 2000 census now complete, every indication is that the Census Bureau and its dedicated staff have produced the most accurate census in the nation's history, and did so without the use of statistical adjustment, a methodology that would have adjusted census figures at every level of geography using a post-enumeration survey. Important information gleaned from past enumerations, together with the implementation of a series of "best practices," resulted in a net undercount of no more than 1.18 percent, the lowest undercount ever calculated. Even more important, African Americans, Latinos, American Indians living on reservations, and Asians each saw their undercount rates reduced by more than half. Hence, important progress was made without the use of statistical adjustment on two problems previously believed intractable, the overall net undercount and the differential undercount, both of which posed serious questions of social justice to the census.

Despite the success of the 2000 census, supporters of statistical adjustment nevertheless continue to advocate for its implementation as a means of increasing the accuracy of the census and for its use in redistricting and the allocation of government funds and services. Many thoughtful parties continue a sincere debate over these issues.

On March 1, 2001, the Census Bureau's Executive Steering Committee overseeing the post-census evaluation phase, the Accuracy and Coverage Evaluation (A.C.E.), recommended against adjusting the count. In doing this after months of study, the respected career statisticians of the Census Bureau moved the debate over the use of statistical adjustment a step toward closure. The Committee simply found too little evidence to support the use of these statistically questionable estimates and instead recommended using unadjusted data for redistricting.

In a letter to Secretary of Commerce Donald L. Evans, the acting director of the Census Bureau, William Barron, Jr., wrote, "The Committee reached this recommendation because it is unable, based on the data and other information currently available, to conclude that the adjusted data are more accurate for use in redistricting." Former Clinton Administration Census Bureau Director Kenneth Prewitt called the decision a "scientifically defensible and prudent thing to do." (New York Times, 3/2/2001)

We know that the census fails to count every person living in the United States. Many hundreds of thousands of people are missed in each census.

Historically, many of these undercounted have resided in predominantly low-income, immigrant, minority, or rural communities. There is clearly agreement among all census shareholders on the need to focus attention on the problems of undercounts, especially the "differential" undercount, the differential between the accuracy of the census for Whites contrasted with the census accuracy for minority communities.

This is not a recent phenomenon, however. Evidence of undercounts can be found as far back as the first census in 1790. The differential undercount was first discovered in 1940, when a disparity between the census count for African American males and military enlistment records became apparent.

For sixty years, the Census Bureau has tried to rectify the problem of undercounts by evaluating census results through the use of independent, outside data to check census counts for quality and accuracy. One of the most useful external measuring tools developed by the Census Bureau, demographic analysis, has allowed the Bureau to compare the results of the census at the national level to an independent set of data derived from administrative records such as birth and death certificates.

In addition to the demographic analysis, for the past several censuses, the Bureau has relied on a post-enumeration survey to evaluate the census for accuracy. Controversy arose when the Bureau, with what were the best of intentions, proposed using the post-enumeration survey for a very different purpose — as a method of addressing the problem of undercounts through statistical adjustment.

This is the first in a series of reports to Congress, developed by the Congressional Census Monitoring Board, that will analyze and discuss the A.C.E. methodology and the data from the 2000 Census. These issues have serious implications not only for this census and how its data are applied but also in how we can refine our methodology and practices in order to improve the accuracy of the next decennial census.

The Congressional Members of the U.S. Census Monitoring Board are the first shareholders to report to the Congress after the completion of the census on the complex issue of statistical adjustment methodology. Our analysis shows that each of the four stages of the statistical adjustment process contributes to an adjustment methodology that, in the end, raises serious concerns. The four stages are:

· Post-Stratification — the creation of demographic subgroups.

· Dual system estimation — comparing census counts with the A.C.E. survey data.

· Adjustment factors — producing a numerical ratio for each subgroup to reflect either an undercount or an overcount.

· Synthetic estimation — applying that numerical ratio to create adjusted counts.

Through this analysis, the Board has identified several questionable assumptions or approaches upon which statistical adjustment methodology relies and which are key to understanding the problems adjustment poses for a fair and accurate census.


· Despite the diverse racial and ethnic makeup of communities in the United States, the methodology assumes certain similarities in behavior within demographic groups, while ignoring critical sociological distinctions such as income, literacy and educational levels.

· When counts differ between the actual census data and the A.C.E. survey results, the methodology assumes the A.C.E. counts are always correct.

· Adjustment factors used to address undercounts would add millions of "virtual" people who were not contacted during the actual census, and would eliminate from the census figures hundreds of thousands of people who chose to participate. Preliminary analysis in 2000 indicates that approximately one million people who returned their census forms would have their records nullified if statistical adjustment were employed, an ironic reward for cooperating with the census. Many of the nullified would come from groups that defy commonly held notions of the overcounted, such as Asian children and African American and American Indian homeowners.

· With statistical adjustment, most of the data used to adjust census figures in a particular state, city or town are from other states, other counties and other cities. This "synthetic estimation" process would apply individual data taken from distant, disparate parts of the country to other geographic locations. For example, it could adjust the profiles of some Mississippians with data taken from people in Alaska.

· The A.C.E. survey is the largest post-enumeration study ever done and is a useful evaluative tool to judge the overall accuracy and effectiveness of Census 2000. Despite its record size, however, survey data cannot be reliably used to actually adjust census counts at the block or neighborhood level with statistical validity as its supporters argue.

For all these reasons and more, there is no solid scientific consensus behind the use of statistical adjustment to reduce census overcounts and undercounts.

Taken together, the success of Census 2000 and the Census Bureau's recommendation not to use adjusted data for the purposes of redistricting offer an opportunity for census shareholders to not only discuss strategies for the future, but more importantly, to critically evaluate the methodology of statistical adjustment before any further decisions regarding adjusted data are made.

Because the Census Bureau plans to continue analyzing the A.C.E. survey data to determine the advisability of using adjusted data for purposes other than political redistricting, the Congressional Members of the Monitoring Board believe this report and its findings are crucial to the continuing dialogue. A future decision to produce adjusted data based on the A.C.E. results could affect the levels of government funding communities will receive as well as the American people's confidence in their public institutions.

Reasonable people, including leading statisticians, have reviewed the statistical adjustment methodology and have raised legitimate questions. This report attempts to put that methodology into perspective and add to what will be further debate and discussion on how all of us as shareholders can continue to work together to ensure the most accurate census possible.


Census 2000 & the Accuracy and Coverage Evaluation

Evaluative tools are nothing new for the Census Bureau. The Bureau has used many independent standards to measure the accuracy of the census ever since the detection of a differential undercount in 1940. For Census 2000, the Bureau's post-enumeration survey was incorporated in the Accuracy and Coverage Evaluation, or A.C.E. (based on an earlier evaluative methodology that was developed for the 1980 and 1990 censuses).

What is new is a proposal to use the A.C.E. for statistical adjustment — an elaborate mathematical process based on comparing the results of the Census 2000 enumeration with this extensive survey. In making such comparisons, the proponents of this process hope to produce an estimate of the true population and allocate population at all levels of geography — states, counties, cities, towns, and neighborhoods. It is this process to adjust the official census for every level of geography that is unprecedented — and the subject of legitimate scientific debate.

Why has statistical adjustment been such a topic of debate between statisticians? When it comes to an accurate count of the population, experts disagree on whether an estimate based on the A.C.E. survey, or the actual enumeration, is closer to the truth about the population. They also disagree as to whether statistical adjustment can actually improve the accuracy of the census data at all or could, in fact, make the problem worse. At the same time, no one in the scientific community believes statistical adjustment would totally eliminate the undercount.

It's important to first understand what the A.C.E. does and does not do. The A.C.E. does not determine how many real people are missed by the census. The A.C.E. produces an estimate of the population of the United States for every level of geography by comparing, or matching, the results of an independent survey to the results of the census. Because statistical adjustment does not involve real people, it cannot eliminate the undercount or "correct" all of the error in the census.

How does this process work?

The Bureau's approach is to break down the population by post-stratum, or by sub-groups of the population. This allows the Bureau to organize the population into manageable demographic categories. Central to this categorizing approach, however, is the assumption (the "homogeneity assumption") that these subgroups share similar characteristics and a similar potential of being counted or missed by the census. The Bureau then uses the A.C.E. survey to determine the accuracy of the census enumeration for each of these post-strata groups.

The Census Bureau determines whether there are overcounts or undercounts for these post-strata groups using dual system estimation. This method compares the results of the survey to the census enumeration at the block cluster level (the smallest unit of census geography). It is from this comparison that adjustment factors are created to "adjust" the census. The results of the A.C.E. survey-Census 2000 comparison are extrapolated and weighted to determine estimates for overcounts and undercounts for each of the post-strata at all levels of geography, from block to states. This final process of synthetic estimation then "reformulates" the census count based on the Bureau's estimate of population.

To better understand how this is done, we need to consider in greater detail each of the four major steps of this process.

First Step: Post-Strata Groups and Presumptions of Homogeneity

The Bureau, in attempting to describe the diverse communities living in the United States, tries to define post-strata groups based on assumptions about their homogeneity. These assumptions are rooted in the Bureau's beliefs or theories about the probable similarities of people in different subgroups of the population. The Bureau attempts to group people into subgroups, post-strata, based on the potential of these persons to be counted by the census. The 2000 A.C.E. initially planned to create adjustment factors for 448 post-stratum groups.1

These groups are formed out of 64 basic post-strata groups that reflect Race/Hispanic Origin; Tenure (owner, non-owner); Type of Enumeration Area (TEA); Return Rate (Census Questionnaire); and Region for Non-Hispanic White. The 64 post-stratum groups are then sub-divided by seven Age/Sex groups to create the 448 post-stratum groups.
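
The arithmetic behind the 448 groups can be sketched as a simple cross-classification. The sketch below is illustrative only: the 64 basic groups are represented by a placeholder index, and the seven age/sex labels are an assumption inferred from the age bands this report mentions, not the Bureau's published definitions.

```python
from itertools import product

# 64 basic post-strata groups, identified here only by a placeholder index.
basic_groups = range(1, 65)

# ASSUMPTION: seven age/sex groups inferred from the age bands the report
# mentions; the actual A.C.E. definitions may differ.
age_bands = ["18-29", "30-49", "50+"]
age_sex_groups = ["Under 18"] + [
    f"{sex}, age {band}" for band in age_bands for sex in ("Male", "Female")
]

# Cross-classify every basic group with every age/sex group.
post_strata = list(product(basic_groups, age_sex_groups))

planned = len(post_strata)   # 64 * 7
collapsed = 32               # post-strata merged for small sample size (footnote 1)
final = planned - collapsed

print(planned, final)
```

Under these assumptions the planned count is 448 and the count after collapsing is 416, matching the figures given in the text and in footnote 1.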

While this approach seems thorough, much is left out.2 This methodology makes assumptions about the potential undercount of millions of people living in an increasingly diverse nation; yet it is based on assumptions that don't reflect a number of potentially consequential characteristics.

For example, the Bureau does not record income, literacy, or education levels, or other important sociological characteristics. It is reasonable to suggest that a person's educational level might be a factor in his or her decision to return the census form. Clearly, these post-strata results contain statistical assumptions about thousands of different racial and ethnic communities living in the United States based on limited assumptions.


1 Based on a pre-specified minimum sample size of 100 persons per post-stratum, the Bureau opted to "collapse" 32 post-strata, creating an adjustment based on 416 post-strata.

2 This chart has simplified several complicated details of the post-stratification design for the purposes of presentation. Not all demographic (race/ethnicity) subgroups are as finely stratified or treated comparably. That is, while some demographic groups are stratified by a "full set" of variables including age, sex, region, tenure, and other characteristics, other groups are not comparably divided. For instance, there are only two final post-strata groups, Owner and Non-Owner, for each age group in several Race/Ethnicity subgroups: Asian Americans; Native Hawaiians or Pacific Islanders; American Indians on Reservation; and American Indians off Reservation.


Figure 1

Census 2000 ACE Post-Stratum Groups

Figure 2

Census 2000 ACE Post-Stratum Groups

· For instance, the Bureau will assign the same statistical undercount rate to two different individuals in the same post-stratum, e.g., Asian Females, Non-Owners, age 18-29. Yet one may be an immigrant from China's Fujian province who has been in this country for less than eight months, lives in a linguistically isolated community, and works in an urban garment district, while the other may be an American citizen of Japanese descent whose family has lived in the United States for 100 years and who is a third-generation college graduate.

· The Bureau will assume that a man who is living in the Bronzeville area of Chicago earning less than $20,000 a year has the same undercount rate as the CEO of a Fortune 100 company living in Washington, D.C. How can this be? Again, the methodology sees them only as Black Males, Home Owners, Low (Mail) Return Rate, age 50+.

· The Bureau will assign the same undercount rate for an American Indian living in an isolated area on the Pine Ridge Reservation in South Dakota and earning less than $7,000 a year, as it will for one of the members of the Mashantucket Pequot Tribe in Connecticut, an Eastern tribe with considerable financial resources, and a resident living in one of the small closely-knit villages on the Zuni Reservation in New Mexico. Why? Because they are American Indians on Reservation, Home Owners, Females, age 50+.

The Bureau must make broad assumptions in order to create post-stratum groups that are of sufficient size or number in the sample to make reliable estimates.
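
The examples above can be restated as a small sketch of how a grouping key discards information. The `Person` fields, the `post_stratum` function, and the two sample individuals are hypothetical and heavily simplified (type of enumeration area, return rate, and region are omitted); this is not the Bureau's classification code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    race_origin: str
    tenure: str          # "Owner" or "Non-Owner"
    sex: str
    age: int
    years_in_us: int     # recorded here, but ignored by the grouping
    education: str       # recorded here, but ignored by the grouping


def age_band(age: int) -> str:
    if age < 18:
        return "Under 18"
    if age < 30:
        return "18-29"
    if age < 50:
        return "30-49"
    return "50+"


def post_stratum(p: Person) -> tuple:
    """Hypothetical, simplified key: only these four traits decide the group."""
    return (p.race_origin, p.tenure, p.sex, age_band(p.age))


recent_immigrant = Person("Asian", "Non-Owner", "Female", 24, 0, "unknown")
third_generation = Person("Asian", "Non-Owner", "Female", 27, 27, "college")

same_group = post_stratum(recent_immigrant) == post_stratum(third_generation)
print(post_stratum(recent_immigrant), same_group)
```

Because length of residence and education never enter the key, both individuals land in the same group and would receive the same adjustment factor.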

Second Step: Dual System Estimation

The next step in the Bureau's process presents another set of issues. For each of the post-stratum groups, the Bureau must determine the appropriate adjustment factor using dual system estimation (DSE) — the comparison of the A.C.E. survey to the census enumeration. This comparison provides a coverage estimate. In other words, it creates an overall depiction of how many people for a particular post-stratum group might have been overcounted or undercounted.

How does this work? The dual system estimate operates by comparing the results of the census (all of the persons counted in a sample census block cluster) to the results of the A.C.E. survey. Some people will be found and matched in both the census and the survey. Some will be found in the census and not in the survey. And finally, some will be found in the survey and not in the census. Using the results of these comparisons and making certain assumptions, one could make an estimate of the population.
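
One way to see how the three outcomes combine into an estimate is the textbook capture-recapture (Lincoln-Petersen) formula, sketched below. This is a simplified stand-in, not the Bureau's production estimator, which involves further terms that this report does not spell out. The function name and the counts are hypothetical.

```python
def dual_system_estimate(matched: int, census_only: int, survey_only: int) -> float:
    """Textbook capture-recapture (Lincoln-Petersen) estimate of a population.

    ASSUMPTION: simplified form; it presumes the census and the survey are
    independent and that every record is classified correctly.
    """
    if matched <= 0:
        raise ValueError("at least one matched person is required")
    census_total = matched + census_only
    survey_total = matched + survey_only
    return census_total * survey_total / matched


# Hypothetical counts for one post-stratum in the sample block clusters.
matched, census_only, survey_only = 90, 10, 6

estimate = dual_system_estimate(matched, census_only, survey_only)
coverage_factor = estimate / (matched + census_only)

print(round(estimate, 1), round(coverage_factor, 3))
```

In this form the estimate depends entirely on the match rate, so any error in matching, or any dependence between the two counts, flows directly into the result.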

The Bureau organizes the results of this comparison by post-stratum. Suppose, for instance, that in a sample census block cluster the census recorded eight persons for post-stratum 055 (White Female Owner, Large Metropolitan Areas, Low Return Rate, Northeast Region, Age 30-49), while the A.C.E. found seven persons in that block cluster. The eighth person, who is "not in the survey," would eventually be considered part of the "overcount." The Bureau makes such a comparison for each block cluster in the sample, automatically assuming the A.C.E. data are correct.

The Bureau chooses to believe that this DSE design will yield an unbiased estimate of the coverage (both the net undercount and overcount). However, this requires all other important assumptions underlying the methodology to hold true, including the independence and accuracy of the survey.

The Question of Independence

The Bureau ultimately assumes that the census and the survey are conducted independently and that neither influences the outcome of the other.3

But consider this. The A.C.E. survey included approximately 314,000 housing units, or less than one percent of the nation's approximately 120 million housing units. This survey was the largest post-enumeration survey ever taken to evaluate the accuracy and coverage of a decennial census.4 The Census Bureau sent A.C.E. interviewers to each of those 314,000 households to ask the same questions as the census, in addition to other questions, in order to obtain independent data for each household in the randomly selected blocks. This methodology again assumes that the prior census did not compromise or "condition" the subsequent survey, perhaps by making respondents suspicious or resentful at being asked the same set of questions twice. However, this is a presumption that is open to question,5 especially given polls that show widespread and heightened concern about personal privacy among Americans.

Figure 3

Dual System Estimation


3 Theory aside, there is debate regarding how the census and survey interact in the real world. The Bureau does evaluate the effects of "conditioning" as a part of "correlation bias." Historically, the Bureau has ultimately ruled such conditioning out as a major contributor of bias or error in the dual system estimation process.

4 This survey of 314,000 housing units was roughly contained in 11,000 randomly selected A.C.E. block clusters (the P-sample). (Census blocks are the smallest units of census geography, enclosing only a few households, and are below the census tract level.) A block cluster may be one block, part of a block, or two or more blocks that are joined together. The A.C.E. blocks, randomly chosen, correspond geographically to the census sample blocks (the E-sample).

The Question of Accuracy

When matching data, the Bureau operates on the presumption that the source of any "erroneous" information must be the census results, not the A.C.E. survey. This is true even in cases in which the census taker was someone from the community who looked and sounded like the residents, and successfully secured the cooperation of someone who might otherwise have refused to participate in a government survey such as the A.C.E.

In contrast, A.C.E. survey takers, often from outside the community, may have been unable to secure cooperation. In other cases, A.C.E. survey takers could walk away with a different set of answers even after a reconciliation process to verify both the census and initial A.C.E. survey answers. Nevertheless, even in these cases, the Bureau's adjustment methodology will consider the original census enumeration as an "erroneous census enumeration."6

Yet, it is known that there is error in the A.C.E. process. Errors can be made in carrying out A.C.E. and in the census. In 1990, for example, the Bureau estimated that 45 percent of the undercount was actually processing error in the PES, not undercount.7 Other researchers believe that this error could have been higher.

Third Step: Adjustment Factors

In the next phase, the Census Bureau takes the results from dual system estimation to estimate accuracy and coverage, determining an overcount or undercount for each post-stratum group observed in the roughly 11,000 sample block clusters. These estimates become the adjustment factors.


5 At any given point in the A.C.E. process, error and bias — even the smallest amounts of error — can affect the results.

6 There is some debate as to whether the A.C.E. survey operation includes the reconciliation process. During this reconciliation phase A.C.E. survey workers are sent out to verify census and A.C.E. answers and determine which may be correct. The Congressional Members of the U.S. Census Monitoring Board have interpreted this reconciliation phase to be an integral part of the A.C.E. survey process, a quality control for the A.C.E. data. However, there are some, including representatives from the Census Bureau, who consider the reconciliation phase as a distinct operation, separate from the actual survey operation (when the A.C.E. data is initially collected).

7 U.S. Department of Commerce, Bureau of the Census, Committee on Adjustment of Postcensal Estimates (CAPE Committee), Assessment of Accuracy of Adjusted Versus Unadjusted 1990 Census Base for Use in Intercensal Estimates (Washington, DC, 7 August 1992), 15.


Figure 4

Comparing the Census to the ACE Survey

"Erroneous Enumeration in the Census for 1 Asian Male, Age Under 18"

Consider one example. Suppose the census enumerated 800 White Male Renters in Large Metropolitan Areas, age 18-29, and that the A.C.E. found 808 White Male renters in large metropolitan areas in those same block clusters. According to the post-enumeration survey methodology, the Census Bureau will determine that there is an approximate one-percent undercount for this post-stratum. This one-percent undercount translates to an adjustment factor of 1.01 to be applied to the population of White Male renters in large metropolitan areas, age 18-29, for the entire nation.

If, for example, the census population for this post-stratum is three million, approximately 30,000 people would be added through adjustment in a somewhat random fashion throughout all large metropolitan areas. In other words, based on a sample of 800 people, the Bureau assumes an adjustment for three million people throughout the country.

The process can also work in reverse. Take a similar one-percent overcount for another population group such as Black Female Owners in medium-sized metropolitan areas, age 50+. If the census recorded 606 persons for this post-stratum, and the A.C.E. survey found only 600, the dual system estimate would yield a one-percent overcount for this category. This methodology, in turn, would yield an adjustment factor of 0.99 that would be applied to the entire subgroup population living in medium-sized metropolitan areas across the United States. (As adjustment factors are applied to individual census records, the Bureau uses a "controlled rounding" process to ensure that they are tabulated to reflect whole persons.)

As this example shows, statistical adjustment is not just about adding people. Statistical adjustment also subtracts or nullifies people from the census count. Why is this? To achieve a correct net estimate for the entire population and for each of the subgroups, the Bureau must both add and subtract population from the census count. Based on post-strata adjustment factors of less than one, roughly one million counted persons would have been eliminated from the population total by statistical adjustment based on the 2000 A.C.E.
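
The two worked examples above (808 versus 800, and 600 versus 606) can be reproduced with a few lines of arithmetic. The sketch follows the report's simplified presentation, in which the factor is the A.C.E. count divided by the census count. The function names are hypothetical, and plain rounding stands in for the Bureau's "controlled rounding," whose details are not given here.

```python
from fractions import Fraction


def adjustment_factor(ace_count: int, census_count: int) -> Fraction:
    """Ratio used in the report's simplified examples: A.C.E. count / census count."""
    return Fraction(ace_count, census_count)


def adjusted_population(census_population: int, factor: Fraction) -> int:
    # Plain rounding stands in for the Bureau's "controlled rounding",
    # whose details the report does not give.
    return round(census_population * factor)


undercount_factor = adjustment_factor(808, 800)   # 101/100, i.e. 1.01
overcount_factor = adjustment_factor(600, 606)    # 100/101, roughly 0.99

added = adjusted_population(3_000_000, undercount_factor) - 3_000_000

print(float(undercount_factor), round(float(overcount_factor), 2), added)
```

Exact fractions are used only to keep the arithmetic free of floating-point noise; the 30,000 added persons follow directly from applying the 1.01 factor to a post-stratum of three million.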

Regarding the first attempt to adjust the 1990 census, Secretary of Commerce Robert A. Mosbacher cited this issue of subtraction as a significant concern. He said an adjustment would "add over six million unidentified people to the census while subtracting over 900,000 people who were actually identified and counted."8 In 1990, the Census Bureau would have ultimately subtracted about 1.4 million persons from the census9 by statistical adjustment. Moreover, many of the people eliminated or nullified through statistical adjustment would not reflect commonly held demographic notions as to the characteristics of the "overcounted."

The Bureau's 2000 A.C.E. pre-specification documents refer to the adjustment for overcount as "imputing a negative" value for the record of an overcounted person.10 The Census Bureau does not remove overcounted person records from the census; all reported census information is preserved in the census files. However, the Bureau does use a procedure to eliminate the overcount, in effect nullifying the count of persons who chose to participate in the census. The Census Bureau, upon identifying a person as an overcount, will add, or "impute," a person with the same characteristics and a negative value to, in effect, eliminate or cancel out the overcounted person.


8 Secretary of Commerce Robert A. Mosbacher, Notice of Final Decision, "Adjustment of the 1990 Census for Overcounts and Undercounts of Population and Housing," Federal Register 56, no. 140 (22 July 1991): 33584. Microfiche. Secretary Mosbacher's comments regard the original 1990 Post Enumeration Survey (PES) that had 1,392 post-strata. According to the original adjustment using the 1,392 post-strata, 919,000 people would have been subtracted. However, the discovery of a processing error, as well as other problems in the PES, caused the Bureau to revise the 1990 PES adjustment. According to Howard Hogan in "The 1990 Post-Enumeration Survey: Operations and Results," "the revisions gave 357 post-strata rather than 1,392. The restratification was most successful in avoiding the very small sample sizes, which had led to high variances…"

9 The 1.4 million persons eliminated reflect the Bureau's second attempt to adjust the census using a post-enumeration survey (PES) with 357 post-strata. Further explanation of the 1990 revisions can be found in Hogan, "The 1990 Post-Enumeration Survey: Operations and Results."

10 In 1990, the Census Bureau commonly referred to the process of eliminating overcount as "subtraction" or "deletion." This issue of subtraction raised concern and became a focus of discussion during the planning of Census 2000 and the A.C.E. Internal memoranda from the Census Bureau indicate a conscious decision to refer to the process as "imputing a negative," and not to refer to this process as "subtraction" or "deletion." The methodology did not change from 1990 to 2000.


According to the Bureau's Census 2000 Decision Memorandum No. 97:

If any overcounts are estimated for a particular post-stratum for the Census 2000 Accuracy and Coverage Evaluation Survey (A.C.E.), the census counts for this particular group must be corrected to reflect the estimated overcount . . . These records will then be assigned a weight of -1 and included in the census data files . . . the negative weights will be added to the census counts to incorporate the estimated overcount in the final results.11

In other words, the effect of imputing a negative value for the count of a person who chose to participate in the census, in reality, "nullifies" or "negates" the record of a real person.
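
The mechanics of "imputing a negative" can be sketched as weighted tabulation. The record layout, the `tabulate` function, and the counts below are hypothetical illustrations of the idea described in the memorandum, not the Bureau's file format: original records are kept, and offsetting records with a negative weight are appended.

```python
from collections import defaultdict

# Hypothetical post-stratum label and counts, echoing the overcount example.
STRATUM = "Black Female Owners, medium metro, age 50+"

# Every enumerated person keeps a record with weight +1.
records = [{"post_stratum": STRATUM, "weight": 1} for _ in range(606)]

# Estimated overcount of 6: add six imputed records with weight -1.
# No original record is deleted.
imputed = [{"post_stratum": STRATUM, "weight": -1} for _ in range(6)]


def tabulate(rows):
    """Sum weights by post-stratum to get the published count."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["post_stratum"]] += row["weight"]
    return dict(totals)


unadjusted = tabulate(records)[STRATUM]
adjusted = tabulate(records + imputed)[STRATUM]
print(unadjusted, adjusted, len(records))
```

All 606 original records remain on file, yet the tabulated count falls to 600, which is the sense in which six participating persons are "nullified."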

How many persons living in America might be "nullified" if statistical adjustment were implemented for the 2000 census?

The preliminary results of the A.C.E. for 2000,12 consistent with the results in 1990, show that approximately one million people who filled out a census questionnaire would have had their records nullified in order for the adjustment methodology to work properly. It is not just Whites who would be eliminated or nullified through adjustment. Many Asian children would also be eliminated or nullified in this process. Many Black, Hispanic, Asian and Native Hawaiian homeowners would also be likely to be eliminated.13 These "nullified" persons represent the elimination of real people who did their civic duty by returning their census forms.


11 U.S. Department of Commerce, Bureau of the Census, Overcounts for the Census 2000 Accuracy and Coverage Evaluation Survey, Census 2000 Decision Memorandum No. 97, Prepared by Howard Hogan, Bureau of the Census (Washington, DC, 2000).

12 The Census Bureau's Executive Steering Committee on A.C.E. Policy (ESCAP) released the recommendation to Secretary of Commerce Donald L. Evans not to adjust Census 2000 for the purposes of political redistricting on 1 March 2001. Further information and analysis is available through the Census Bureau's website www.census.gov.

13 A full table for the 448 Post-Strata is available as an appendix to this document.


Figure 5

Nullifying the Census Count

For example, to determine a total net undercount of the African-American population of 2.17 percent, the Bureau would need to add and nullify hundreds of thousands of records within the census count. The Bureau, using the A.C.E. methodology to adjust the census, would add approximately 845,000 records to the census for African-Americans to adjust for the undercount. In contrast, the Bureau would negate or nullify approximately 100,000 records for those post-strata in the African-American community where they estimated overcounts. These post-strata included approximately 35,000 African-American men and women living in large metropolitan areas, over 50 years of age, who own their own homes. The Bureau identified an overcount for these individuals, while at the same time identifying their younger children and neighbors as undercounted.

A similar phenomenon would occur for practically every community in the United States.
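Taking the approximate figures quoted above at face value, the net and gross effects of the adjustment on the African-American count can be worked out directly. This is a simple arithmetic sketch; the function names are illustrative, and the inputs are the rounded figures from the text rather than official tabulations.

```python
# Approximate figures quoted in the text for the African-American post-strata.
records_added = 845_000      # added in post-strata with an estimated undercount
records_nullified = 100_000  # negated in post-strata with an estimated overcount


def net_adjustment(added, nullified):
    """Net change to the count once additions and negations are combined."""
    return added - nullified


def gross_changes(added, nullified):
    """Total records added or negated, regardless of direction."""
    return added + nullified


print(net_adjustment(records_added, records_nullified))  # prints 745000
print(gross_changes(records_added, records_nullified))   # prints 945000
```

The net figure conceals the negations: roughly 945,000 records would be touched to produce a net change of roughly 745,000.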

Fourth Step: The Geographic Consequences of Synthetic Estimation

The final part of the process, synthetic estimation, allows the Bureau to produce an estimate of population beyond the observed persons counted in the randomly selected sample blocks.

How is this done? The adjustment factors, determined from dual system estimation, are extrapolated or weighted to higher levels of geography in order to determine estimates for overcount and undercount for each of the post-strata. According to a report by the Chief of the Decennial Statistical Studies Division, Howard Hogan, the "distribution of the estimated undercount geographically below the post-stratum level was done by multiplying the post-strata adjustment factors by census counts for each post-stratum in each block in the census."14 These adjustment factors are applied to the post-strata for the entire population of the United States (every single census record or person) to determine the adjusted population for every state, county, city, town, census tract, and census block. Homogeneity is assumed while diversity is downplayed.
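The multiplication Hogan describes can be sketched in a few lines. The post-stratum labels, adjustment factors, and block counts below are invented for illustration and do not come from the A.C.E. or the 1990 PES; the sketch assumes only that each block's count for a post-stratum is multiplied by that post-stratum's single national factor.

```python
from collections import defaultdict

# Hypothetical adjustment factors by post-stratum (a factor above 1 adds
# persons, a factor below 1 removes them). Values are invented.
adjustment_factors = {
    "renters_18_29": 1.05,
    "owners_50_plus": 0.98,
}

# Hypothetical census counts keyed by (block, post-stratum).
census_counts = {
    ("block_1", "renters_18_29"): 40,
    ("block_1", "owners_50_plus"): 100,
    ("block_2", "renters_18_29"): 200,
    ("block_2", "owners_50_plus"): 50,
}


def synthetic_estimates(counts, factors):
    """Multiply each block's post-stratum count by that post-stratum's
    factor, then sum the adjusted counts within each block."""
    adjusted = defaultdict(float)
    for (block, post_stratum), count in counts.items():
        adjusted[block] += count * factors[post_stratum]
    return dict(adjusted)


estimates = synthetic_estimates(census_counts, adjustment_factors)
for block, total in sorted(estimates.items()):
    print(block, round(total, 1))
```

In this sketch the same factor is applied to a post-stratum in every block, wherever the block is located. That is the homogeneity assumption: nothing in the calculation reflects conditions in the block itself.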

Why are such inferences made? The only possible direct measure for overcounts and undercounts is for those housing units in the randomly selected census blocks included in the A.C.E. For the millions of people living in the housing units and hundreds of thousands of census blocks that were not included in the A.C.E. survey, the Bureau makes inferences about the accuracy and coverage through synthetic estimation even though these persons have not been directly measured.

Once again analysis from 1990 reinforces these points, as Secretary Mosbacher observed: "the decisions about which places gain people and which lose people are based on statistical conclusions drawn from the sample survey. The additions and deletions in any particular community are often based largely on data gathered from communities in other states."15 Thus the data that are used to adjust for the undercount for any given state, county, city or town are almost entirely from other states, counties, cities and towns.

This means that data were shared by dramatically different communities in 1990. This remains true for 2000.16

· Some people in New York City must be adjusted with data from Boston, Chicago, Dallas, Detroit, Houston, Los Angeles, Miami and Seattle.

· Some people in Midland, Texas must be adjusted with data from not only Amarillo and Longview (both mid-size cities in Texas) but with data from Monroe, Louisiana; Cedar Rapids, Iowa; Stillwater, Oklahoma; Durham, North Carolina; and New Haven, Connecticut.

· Some people living in Cleveland, Ohio must be adjusted with data from San Diego, Memphis, Saint Louis, Charlotte and Denver.


14 Howard Hogan, "The 1990 Post-Enumeration Survey: Operations and Results," Journal of the American Statistical Association Vol. 88, No. 423 (1993): 1052.

15 Secretary of Commerce Robert A. Mosbacher, Notice of Final Decision, "Adjustment of the 1990 Census for Overcounts and Undercounts of Population and Housing," Federal Register 56, no. 140 (22 July 1991): 33584. Microfiche.

16 In fact, the A.C.E. design intensifies this problem since, unlike the 1990 PES post-strata, there is no A.C.E. provision that all data for the demographic post-strata should be regionally subdivided. (Only White Owner A.C.E. data are subdivided by region.)




· Some people in Leesburg, Virginia must be adjusted with data from Newark, Delaware; Coeur d'Alene, Idaho; Alamogordo, New Mexico; and Florence, South Carolina.

· Some people in Janesville, Wisconsin must be adjusted with data from Las Cruces, New Mexico; Port Arthur, Texas; and Ogden, Utah.

· Some people in Visalia, California must be adjusted with data from Valdosta, Georgia; Lawrence, Kansas; Portland, Maine; and Kalamazoo, Michigan.

· American Indians living on the Gila River Reservation in Arizona must be adjusted with data from the Cheyenne River Indian Reservation in South Dakota; the Minnesota Chippewa Tribe; the Oneida Nation in New York; and the Viejas Reservation in California.

· Some people living in D'Iberville, Mississippi must be adjusted with data from Kenai, Alaska; Cabot, Arkansas; Boonville, Indiana; and Spring Creek, Nevada.

· Some people in Cody, Wyoming must be adjusted with data from Toccoa, Georgia; Plattsmouth, Nebraska; and Middlebury, Vermont.




· Some people in Nashua, New Hampshire must be adjusted with data from Bloomington, Minnesota; Henderson, Nevada; Gresham, Oregon; and Reading, Pennsylvania.

· Some people living in Aurora, Illinois must be adjusted with data from Arvada, Colorado; Clearwater, Florida; Lowell, Massachusetts; Erie, Pennsylvania; and Clarksville, Tennessee.

· Some people living in Valley City, North Dakota must be adjusted with data from Gardendale, Alabama; Aiea, Hawaii; and Absecon, New Jersey.

The Census Bureau refers to this process as "borrowing strength." This means that the data used to describe persons living in one area are more than likely not a direct reflection of an undercount or overcount for the persons who actually live in that area.




Conclusion

Needless to say, some experts regard these issues of "borrowing strength," synthetic estimation, and the assumptions used to create the post-strata groups with considerable skepticism. Serious questions have been raised with regard to the accuracy of adjusted data.

Just as different physicians have different ideas about the correct course of medical treatments for a specific condition, so too do experts differ on how best to reduce undercounts. As with medical therapies, there may be more than one "treatment" to improve the accuracy of the census. Clearly, questions about the accuracy of adjusted data have not been resolved. Furthermore, few would disagree that statistical adjustment has its own side effects. It eliminates the count of real people. Assumptions required to satisfy the methodology may not reflect the "true" population.

In summary, we have learned several lessons from the success of the 2000 census, particularly with regard to reducing the differential undercount. One lesson is that an intensified enumeration is a valuable tool in producing census accuracy. Adopting the A.C.E. synthetic estimation process, however, appears to move us farther away from the very enumeration practices that have been shown effective, and substitutes a probability mechanism for adjusting the count that remains methodologically questionable. Further scientific evaluation of the A.C.E. is imperative before any decision to act on adjusted data. That much is owed to the real persons who chose to participate in the census.

