Office of Management and Budget
PRIMER ON PERFORMANCE MEASUREMENT
(Revised February 28, 1995)
This "primer" defines several performance measurement terms, outlines areas or functions where performance measurement may be difficult, and provides examples of different types of performance measures.
I. Definition of Terms
No standard definitions currently exist. In this primer, the definitions of output and outcome measures are those set out in GPRA. Input measures and impact measures are not defined in GPRA. As GPRA is directed at establishing performance goals and targets, the definitions are prospective in nature. Variations or divisions of these definitions can be found in other Federal programs as well as non-Federal measurement taxonomies. For example, a measurement effort which retrospectively reports on performance might define "input" as resources consumed, rather than resources available. The nomenclature of measures cannot be rigidly applied; one agency's output measure (e.g., products produced) could be another agency's input measure (e.g., products received).
OUTCOME MEASURE
GPRA Definition: An assessment of the results of a program compared to its intended purpose.
Outcome measurement cannot be done until the results expected from a program or activity have been first defined. As such, an outcome is a statement of basic expectations, often grounded in a statute, directive, or other document. (In GPRA, the required strategic plan would be a primary means of defining or identifying expected outcomes.)
Outcome measurement also cannot be done until a program (of fixed duration) is completed, or until a program (which is continuing indefinitely) has reached a point of maturity or steady state operations.
While outcomes are the preferred measure, they are often not susceptible to annual measurement. (For example, an outcome goal that sets a target of collecting, by 2005, 94 percent of all income taxes owed annually cannot be measured, as an outcome, until that year.) Also, managers are more likely to manage primarily against outputs than against outcomes.
- The measurement of incremental progress toward a specific outcome goal is sometimes referred to as an intermediate outcome. (Using the example above, a target of collecting 88 percent of taxes owed in 2002 might be characterized as an intermediate outcome.)
OUTPUT MEASURE
GPRA Definition: A tabulation, calculation, or recording of activity or effort that can be expressed in a quantitative or qualitative manner.
The GPRA definition of output measure is very broad, covering all performance measures except input, outcome or impact measures. Thus it covers output, per se, as well as other measures.
- Strictly defined, output is the goods and services produced by a program or organization and provided to the public or to other programs or organizations.
- Other measures include process measures (e.g., paperflow, consultation), attribute measures (e.g., timeliness, accuracy, customer satisfaction), and measures of efficiency or effectiveness.
- Output may be measured either as the total quantity of a good or service produced, or as the quantity of only those goods or services with certain attributes (e.g., number of timely and accurate benefit payments).
Some output measures are developed and used independent of any outcome measure.
All outputs can be measured annually or more frequently. The number of output measures will generally exceed the number of outcome measures.
In GPRA, both outcome and output measures are set out as performance goals or performance indicators.
- GPRA defines a performance goal as a target level of performance expressed as a tangible, measurable objective, against which actual performance can be compared, including a goal expressed as a quantitative standard, value, or rate.
e.g., A goal might be stated as "Improve maternal and child health on tribal reservations to meet 95 percent of the national standards for healthy mothers and children by 1998". (Note that this goal would rely on performance indicators (see below) to be measured effectively.)
- GPRA defines a performance indicator as a particular value or characteristic used to measure output or outcome.
e.g., Indicators for the maternal and child health goal above might include morbidity and mortality rates for this population cohort, median infant birth weights, percentages of tribal children receiving full immunization shot series, frequency of pediatric checkups, etc.
- Performance goals which are self-measuring do not require separate indicators.
e.g., A performance goal stating that the FAA would staff 300 airport control towers on a 24 hour basis in FY 1996.
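As a minimal sketch of how a self-measuring goal compares a target with actual performance, the Python fragment below models a performance goal as a target value. The class name, field names, and the actual counts passed in are hypothetical and are not drawn from GPRA; only the 300-tower target comes from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerformanceGoal:
    """A target level of performance against which actual performance is compared.

    Hypothetical structure for illustration only.
    """
    description: str
    target: float
    higher_is_better: bool = True

    def is_met(self, actual: float) -> bool:
        # Compare actual performance with the target in the right direction.
        if self.higher_is_better:
            return actual >= self.target
        return actual <= self.target


# Target taken from the FAA example; the actual values below are made up.
towers_goal = PerformanceGoal(
    description="Airport control towers staffed on a 24 hour basis in FY 1996",
    target=300,
)

print(towers_goal.is_met(300))
print(towers_goal.is_met(295))
```

Because the goal is itself a countable quantity, no separate indicator is needed: the count of staffed towers is compared directly with the target.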
IMPACT MEASURE
Definition: These are measures of the direct or indirect effects or consequences resulting from achieving program goals. An example of an impact measure is the comparison of actual program outcomes with estimates of the outcomes that would have occurred in the absence of the program.
Measuring program impact often is done by comparing program outcomes with estimates of the outcomes that would have occurred in the absence of the program.
- One example of measuring direct impact is to compare the outcome for a randomly assigned group receiving a service with the outcome for a randomly assigned group not receiving the service.
If the impacts are central to the purpose of a program, these effects may be stated or included in the outcome measure itself.
- Impacts can be indirect, and some impacts are often factored into cost-benefit analyses. An outcome goal might be to complete construction of a large dam; the impact of the completed dam might be reduced incidence of damaging floods, additional acreage converted to agricultural use, and increased storage of clean water supplies, etc.
The measurement of impact is generally done through special comparison-type studies, and not simply by using data regularly collected through program information systems.
INPUT MEASURE
Definition: Measures of what an agency or manager has available to carry out the program or activity: i.e., achieve an outcome or output. These can include: employees (FTE), funding, equipment or facilities, supplies on hand, goods or services received, work processes or rules. When calculating efficiency, input is defined as the resources used.
Inputs used to produce particular outputs may be identified through cost accounting. In a less detailed correlation, significant input costs can be associated with outputs by charging them to the appropriate program budget account.
Often, a physical or human resource base (e.g., land acreage, square footage of owned buildings, number of enrollees) at the start of the measurement period is characterized as an input.
- Changes to the resource base (e.g., purchase of additional land) or actions taken with respect to the resource base (e.g., modernize x square footage, convert y enrollees to a different plan) are classified as outputs or outcomes.
AN EXAMPLE OF OUTCOME, OUTPUT, IMPACT, AND INPUT MEASURES FOR A HYPOTHETICAL DISEASE ERADICATION PROGRAM:
Outcome: Completely eradicate tropical spastic paraparesis (which is a real disease transmitted by human-to-human contact) by 2005
Outputs: 1.) Confine incidence in 1996 to only three countries in South America, and no more than 5,000 reported cases. (Some would characterize this step toward eradication as an intermediate outcome.)
2.) Complete vaccination against this retrovirus in 84 percent of the Western hemispheric population by December 1995.
Inputs: 1.) 17 million doses of vaccine
2.) 150 health professionals
3.) $30 million in FY 1996 appropriations
Impact: Eliminate a disease that affects 1 in every 1,000 people living in infested areas, which is progressively and completely disabling, and with annual treatment costs of $1,600 per case.
AN EXAMPLE OF OUTCOME, OUTPUT, IMPACT, AND INPUT MEASURES FOR A JOB TRAINING PROGRAM:
Outcome: 40 percent of welfare recipients receiving job training are employed three months after receiving job training.
Output: Annually provide job training and job search assistance to 1 million welfare recipients within two months of their initial receipt of welfare assistance.
Input: $300 million in appropriations
Impact: Job training increases the employment rate of welfare recipients from 30 percent (the employment level of comparable welfare recipients who did not receive job training) to 40 percent (the employment rate of those welfare recipients who did receive job training).
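The arithmetic behind the job training impact can be sketched as follows. The 40 percent and 30 percent rates, the 1 million recipients, and the $300 million appropriation come from the example above. Combining them into a count of additional employed recipients and a cost per additional employed recipient is an illustrative assumption (it treats the rates as applying uniformly to everyone trained in a year), not a calculation made in this primer. All function and variable names are hypothetical.

```python
def impact_points(rate_with_program_pct: int, rate_without_program_pct: int) -> int:
    """Impact as the difference, in percentage points, between the outcome
    with the program and the estimated outcome without it."""
    return rate_with_program_pct - rate_without_program_pct


# Figures from the job training example.
TRAINED_RATE_PCT = 40        # employment rate of recipients who received training
COMPARISON_RATE_PCT = 30     # employment rate of comparable recipients who did not
RECIPIENTS_TRAINED = 1_000_000
APPROPRIATION_DOLLARS = 300_000_000

points = impact_points(TRAINED_RATE_PCT, COMPARISON_RATE_PCT)

# Assumption: the rates apply uniformly to all recipients trained in the year.
additional_employed = RECIPIENTS_TRAINED * points // 100
cost_per_additional_employed = APPROPRIATION_DOLLARS / additional_employed

print(points)                        # 10 percentage points
print(additional_employed)           # 100000
print(cost_per_additional_employed)  # 3000.0
```

Note that the outcome (40 percent employed) and the impact (10 percentage points attributable to the program) are different quantities; only the second requires an estimate of what would have happened without the program.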
AN EXAMPLE OF OUTCOME, OUTPUT, IMPACT, AND INPUT MEASURES FOR A TECHNOLOGY PROGRAM:
Outcome: Orbit a manned spacecraft around Mars for 30 days in 2010 and return crew and retrieved Martian surface and subsurface material safely to Earth.
Output: (For FY 2007) Successfully complete a 900 day inhabited flight test of the Mars Mission Module in lunar orbit in the third quarter of CY 2007.
Input: Delivery of 36 EU-funded Mars Surface Sample Return probes from the Max Planck Institute in Germany.
Impact: A comprehensive understanding of the biochemical, physical and geological properties of the Martian surface and subsurface to a 35 meter depth. Detection of any aerobic or anaerobic life forms (including non-carbon based, non-oxygen dependent forms) in the Martian surface crust.
AN EXAMPLE OF OUTCOME, OUTPUT, IMPACT, AND INPUT MEASURES FOR AN ENVIRONMENTAL RESOURCES PROGRAM:
Outcome: Restore the 653,000 hectare Kolbyduke Paleoarctic Biome Reserve to a pre-Mesolithic state, and preserve it in that state.
Output: (In FY 2002) Eradication of all non-native plants from 51,000 hectares, for a cumulative eradication of non-native plants from 38 percent of the Reserve.
Input: (In FY 2002) Donation of 22,000 volunteer workhours from four wildlife organizations.
Impact: The protection of this biome as one of three internationally-designated Paleoarctic biomes and perpetuating it as a research site for studies of the pre-historic ecological equilibrium.
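The relationship between the annual output and the cumulative figure in the example above can be checked with a short calculation. This is a sketch only: it assumes the 38 percent figure is exact (it is presumably rounded), and the variable names are hypothetical.

```python
RESERVE_HECTARES = 653_000      # total area of the Reserve (from the outcome)
FY2002_HECTARES = 51_000        # area cleared in FY 2002 (from the output)
CUMULATIVE_PCT = 38             # cumulative share cleared through FY 2002

# Area cleared through FY 2002, treating 38 percent as exact.
cumulative_hectares = RESERVE_HECTARES * CUMULATIVE_PCT // 100

# Area that must already have been cleared before FY 2002.
prior_hectares = cumulative_hectares - FY2002_HECTARES

fy2002_share_pct = round(FY2002_HECTARES / RESERVE_HECTARES * 100, 1)
prior_share_pct = round(prior_hectares / RESERVE_HECTARES * 100, 1)

print(cumulative_hectares)   # 248140
print(prior_hectares)        # 197140
print(fy2002_share_pct)      # 7.8
print(prior_share_pct)       # 30.2
```

Read this way, the FY 2002 output is an annual increment of roughly 8 percent of the Reserve toward the long-term outcome of restoring all of it.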
II. Complexities of Measurement
A. FUNCTIONAL AREAS. Some types of programs or activities are particularly difficult to measure.
Basic Research, because often:
- likely outcomes are not calculable (can't be quantified) in advance;
- knowledge gained is not always of immediate value or application;
- results are more serendipitous than predictable;
- there is a high percentage of negative determinations or findings;
- the unknown cannot be measured.
- (Applied research, applied technology, or the "D" in R&D is more readily measurable because it usually is directed toward a specific goal or end.)
Foreign Affairs, especially for outcomes, to the extent that:
- the leaders and electorate of other nations properly act in their own national interest, which may differ from that of the United States (e.g., Free Territory of Memel does not agree with US policy goal of reducing US annual trade deficit with Memel to $1 billion);
- US objectives are stated as policy principles, recognizing the impracticality of their universal achievement;
- goal achievement relies mainly on actions by other countries (e.g., by 1999, Mayaland will reduce the volume of illegal opiates being transhipped through Mayaland to the US by 65 percent from current levels of 1,250 metric tons).
Policy Advice, because often:
- it is difficult to calculate the quality or value of the advice;
- advice consists of presenting competing views by different parties with different perspectives;
- policy advice may be at odds with the practicalities of political advice.
Block Grants, to the extent that:
- funds are not targeted to particular programs or purposes;
- the recipient has great latitude or choice in how the money will be spent;
- there is little reporting on what the funds were used for or what was accomplished.
B. BY TYPE OF MEASURE. Some measures are harder to measure than others. Some of the difficulties include:
For outcome, output, and impact measures
- Direct Federal accountability is lessened because non-Federal parties (other than those under a procurement contract) are responsible for the administration or operation of the program.
- The magnitude and/or intrusiveness of the performance reporting burden.
- The nature and extent of performance validation or verification requires a substantial effort.
- Individual accountability or responsibility is diffuse.
For outcome measures
- Timetable or dates for achievement may be sporadic.
- Achievement often lags by several years or more after the funds are spent.
- Results frequently are not immediately evident, and can be determined only through a formal program evaluation.
- Accomplishment is interrupted because of intervening factors, changes in priorities, etc.
- Changing basepoints can impede achievement (e.g., recalculation of eligible beneficiaries).
- Achievement depends on a major change in public behavior.
- The outcome is for a cross-agency program or policy, and assigning relative contributions or responsibilities to individual agencies is a complex undertaking.
For output measures
- Equal-appearing outputs are not always equal (e.g., the time and cost of overhauling one type of jet engine can be very different from another type of jet engine).
- It may be difficult to weight outputs to allow different (but similar appearing) outputs to be combined in a larger aggregate.
- Many efficiency and effectiveness measures depend on agencies having cost accounting systems and the capability to allocate and cumulate costs on a unit basis.
For impact measures
- Impacts are often difficult to measure.
- A large number of other variables or factors contribute to or affect the impact, and these can be difficult to separate out when determining causality.
- Federal funding or Federal program efforts are of secondary or even more marginal significance to the achieved outcome.
- Determining the impact can be very expensive, and not commensurate with the value received from a policy or political standpoint.
- Holding a manager accountable for impacts can be a formidable challenge.
For input measures
- The measurement itself should not be complicated, but the alignment of inputs with outputs can be difficult.
III. Emphasized Measures in GPRA
A. GPRA emphasizes the use and reporting of performance measures that managers use to manage. There are several reasons for this emphasis:
GPRA increases the accountability of managers for producing results.
The emphasis also underscores that these measures are central to an agency's capacity and approach for administering programs and conducting operations.
- Because of this, the amount of additional resources to develop and improve performance measurement and reporting systems should be rather limited.
- The conundrum is that agencies requesting large amounts of additional resources would be conceding either that their programs were not being managed, or were being managed using an inappropriate or poor set of performance measures.
B. As output measures are more readily and easily developed than outcome measures, more of these are expected initially in the GPRA-required performance plans, but agencies should move toward increasing the number and quality of outcome measures.
IV. Selected Examples of Various Types of Performance Measures
Please Note: For the purpose of these examples:
Some of the outcome measures are much more narrowly defined than would otherwise be appropriate or expected.
Some of the outcome measures are not inherently measurable, and would require use of supplementary performance indicators to set specific performance targets and determine whether these were achieved.
Some measures include several aspects of performance. Italics are used to feature the particular characteristic of that example.
Many of the examples of output measures are process or attribute measures.
"TRADITIONAL" PRODUCTION OR DELIVERY TYPE MEASURES
Output: Manufacture and deliver 35,000 rounds of armor-piercing 120 mm projectile shells in FY 1997.
Outcome: Produce sufficient 120 mm armor-piercing projectiles to achieve a 60 day combat use supply level by 1999 for all Army and Marine Corps tank battalions.
Output: Process 3.75 million payment vouchers in FY 1995.
Outcome: Ensure that 99.5 percent of payment vouchers are paid within 30 days of receipt.
Output: Update earnings records for 45 million employee contributors to Social Security Trust Fund.
Outcome: Ensure that all earnings records are posted and current within 60 days of the end of the previous quarter.
Output: Provide meals and temporary shelter for up to 18 months for 35,000 homeless individuals following the Short Beach tsunami disaster.
Outcome: Maintain a capacity to provide, nationally, meals and temporary shelter for an indefinite period for up to 100,000 individuals who are homeless as a result of major disasters.
Workload (Not otherwise categorized)
Output: Annually inspect 3,200 grain elevators.
Outcome: Through periodic grain elevator inspection, reduce the incidence of grain dust explosions resulting in catastrophic loss or fatalities to zero.
Output: Issue 90 day national temperature and precipitation forecasts every six weeks.
Outcome: Provide users of meteorological forecasts with advance information sufficiently updated to be useful for agricultural, utility, and transportation planning.
Output: Store a minimum of 3.5 million barrels of petroleum stock.
Outcome: Petroleum stocks shall be maintained at a level sufficient to provide a 60 day supply at normal daily drawdown.
Output: Operate all tactical fighter aircraft simulator training facilities at not less than 85 percent of rated capacity.
Outcome: Ensure optimized operation of all simulator facilities to provide all active duty tactical fighter aircraft pilots with a minimum of 80 hours of simulator training every 12 months.
Output: All Corps of Engineers locks on the Showme River basin shall be operational during at least 22 of every consecutive 24 hours.
Outcome: Ensure no significant delays in commercial traffic transiting through the Showme River basin system.
Maintenance and Repair Intervals
Output: All out-of-service aircraft requiring unscheduled repairs shall be repaired within 72 hours.
Outcome: The Forest Service will maintain 90 percent of its 135 firefighting aircraft in an immediately deployable status during forest fire season.
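A percentage target applied to a fixed fleet translates into a whole number of aircraft. The sketch below works out that number for the outcome above; the function name is hypothetical, and rounding up is an assumption about how a fractional aircraft would be treated.

```python
def minimum_deployable(fleet_size: int, required_pct: int) -> int:
    """Smallest whole number of aircraft that meets the percentage target.

    Uses integer ceiling division so that a fractional result rounds up.
    """
    return (fleet_size * required_pct + 99) // 100


FLEET_SIZE = 135     # firefighting aircraft (from the outcome above)
REQUIRED_PCT = 90    # share to be kept immediately deployable

needed = minimum_deployable(FLEET_SIZE, REQUIRED_PCT)
max_out_of_service = FLEET_SIZE - needed

print(needed)              # 122
print(max_out_of_service)  # 13
```

Under these assumptions, the 72 hour repair output supports the outcome only so long as no more than 13 aircraft are out of service at any one time during fire season.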
Output: Not more than 1.25 percent of 120 mm armor-piercing projectiles shall be rejected as defective.
Outcome: No armor-piercing ammunition projectiles fired in combat shall fail to explode on impact.
Mean Failure rates
Output: Premature Space Shuttle main engine shutdown shall not occur more than once in every 200 flight cycles.
Outcome: Space Shuttle shall be maintained and operated so that 99.95 percent of all flights safely reach orbit.
Output: The initial monthly estimate of the previous month's value of exports shall be within one percent of the revised final value.
Outcome: All preliminary, periodic estimates of economic activity shall be within three percent of the final value.
Output: Not more than four percent of initial determinations of the monthly entitled benefit amount shall be incorrectly calculated.
Outcome: (Not commonly measured as an outcome.)
Output: Not more than 2.5 percent of individuals seeking information will subsequently re-request the same information because the initial response was incomplete.
Outcome: (Not commonly measured as an outcome.)
Customer Satisfaction Levels (Output and outcome measures may often be indistinguishable.)
Output: In 1998, at least 75 percent of individuals receiving a service will rate the service delivery as good to excellent.
Outcome: At least 90 percent of recipients will rate the service delivery as good to excellent.
Output: Adjudicative decision on all claim disallowances will be made within 120 days of appeal hearings.
Outcome: Provide every claimant with timely dispositive determination on claims filed.
Adherence to schedule
Output: Operate 95 percent of all passenger trains within 10 minutes of scheduled arrival times.
Outcome: Provide rail passengers with reliable and predictable train service.
Output: 98 percent of notices to the Department of Transportation of navigational hazards will result both in an on-site inspection of the hazard and Notice to Mariners within 48 hours of receipt of the notice.
Outcome: Ensure prompt response to potential public safety concerns in the navigation of coastal and off-shore waters.
EFFICIENCY AND EFFECTIVENESS MEASURES
Output: Annual transaction costs/production costs/delivery of service costs projected on a per unit basis. Produce 35,000 rounds of armor-piercing ammunition at a cost of $17.75 per round.
Outcome: (Not commonly measured as an outcome.)
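A per-unit efficiency measure of this kind relates a total cost to a quantity of output. The sketch below uses the 35,000 rounds and $17.75 per round from the example; the function names are hypothetical, and the total cost is derived here rather than stated in the primer.

```python
from decimal import Decimal


def total_cost(units: int, cost_per_unit: Decimal) -> Decimal:
    """Total production cost implied by a per-unit cost target."""
    return cost_per_unit * units


def cost_per_unit(total: Decimal, units: int) -> Decimal:
    """Per-unit cost: resources used divided by units of output produced."""
    return total / units


ROUNDS = 35_000
TARGET_COST_PER_ROUND = Decimal("17.75")

implied_total = total_cost(ROUNDS, TARGET_COST_PER_ROUND)
print(implied_total)                         # 621250.00
print(cost_per_unit(implied_total, ROUNDS))  # 17.75
```

Decimal arithmetic is used so that dollar amounts are not subject to binary floating-point rounding. As the primer notes elsewhere, a measure like this depends on a cost accounting system that can allocate the resources used to each unit of output.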
Output: In FY 1999, not more than 7,000 in-patients in military hospitals will be readmitted, post discharge, for further treatment of the same illness diagnosed at the time of initial admission.
Outcome: Annually, initial treatment will be therapeutically successful for 85 percent of all hospital admissions.
OTHER TYPES OF MEASURES
Milestone and activity schedules
Output: Complete 85 percent of required flight-worthiness testing for Z-2000 bomber by July 30, 1999.
Outcome: The Z-2000 bomber will be flight-certified and operational by December 1, 2000.
Output: Imaging cameras on Generation X observational satellite will have resolution of 0.1 arc second.
Outcome: Generation X observational satellite will successfully map 100 percent of the terrain of six Jovian moons to a resolution of 100 meters.
Status of conditions
Output: In 1995, repair and maintain 1,400 pavement miles of Federally-owned highways to a rating of "good".
Outcome: By 2000, 35 percent of all Federally-owned highway pavement miles shall be rated as being in good condition.
Output: Provide doses of vaccine to 27,000 pre-school children living on tribal reservations.
Outcome: 100 percent of children living on tribal reservations will be fully immunized before beginning school.