Extreme Measures

alaurent@govexec.com

Government has gone measurement crazy. Washington's crawling with consultants eager to help agencies organize their outcomes, mend their metrics and balance their scorecards. Congress has gone grade happy. Nary an agency plan hits the street before Congress assesses, evaluates, ranks and reports on it. Once rule-bound federal agencies are now becoming ruler-bound, thanks largely to the 1993 Government Performance and Results Act.

By now, everyone working for Uncle Sam knows the "Results Act Rag":

First, you get your mission clear,
Then, lend stakeholders your ear.
Mix up a strategic plan;
Set annual goals wherever you can.
Measure progress; learn from mistakes.
And to keep performance humming,
communicate, communicate, communicate.

Many agencies have learned the planning refrain, but most still haven't mastered goal-setting and performance-monitoring. The process has broken down in part because many agencies lack adequate management systems. Information systems are dated, can't communicate and aren't yet providing results-oriented program data. Human resources management focuses more on how current workers are classified and paid than on getting and keeping enough people with the right skills doing the right things to achieve performance goals. Though agencies tied their fiscal 1999 funding requests to performance goals, their budgets aren't yet written to reflect results achieved for dollars spent. Few agencies' financial systems account well enough for costs to enable managers to predict what an extra million dollars would buy in E. coli outbreaks prevented, elderly people kept from poverty, or earthquake damage avoided.

Management failings notwithstanding, agencies are marching smartly along to the performance tune. Goaded by the General Accounting Office and criticized by Congress, agencies are surveying customers, collecting baseline data and measuring performance as best they can. Still, their efforts are coming under fire as:

  • Too timid, because they focus on pedestrian outputs rather than lofty outcomes. GAO rapped the Social Security Administration, for example, for setting goals for the number of claims processed and the number of beneficiaries served without discussing the effect of social security on reducing poverty among the elderly. House Majority Leader Dick Armey, R-Texas, nicked the Environmental Protection Agency for focusing on outputs, such as numbers of inspections, while omitting outcome goals for its enforcement activities.
  • Too sketchy, because few agencies have good enough performance data to set accurate measurement baselines or reliably register progress. The Health Care Financing Administration's Medicaid and other financial reports rely on the 50 states and the District of Columbia to collect information, but different jurisdictions define data categories differently so the information is inconsistent. Occupational Safety and Health Administration officials admit they still don't have indicators to measure progress toward a number of agency goals. The Immigration and Naturalization Service can't chart progress in meeting its 1999 goal of increasing detention space for illegal immigrants because of problems with data showing how many immigrants are detained and for how long.
  • Too disconnected, because many agencies have failed to clearly link goals and outcomes, resources and performance. For example, while the Food and Drug Administration includes product safety and efficacy in its strategic goal for drug reviews, none of its six performance targets for that goal mention those concerns. Instead, performance targets focus on speeding up reviews. Though both the Federal Aviation Administration and IRS have won freedom from personnel rules they said constrained operations, neither identified human resources obstacles or needs in their performance plans.

Measure Mania

Performance measurement falls prey to deeper problems, as well. In the same way that obsession with the scale can lead to anorexia and bulimia in people, measurement mania in organizations can foster cheating, employee abuse and, perversely, decreased overall performance. In federal agencies, dysfunctional performance measurement can also end in violations of rules and laws. For a cautionary tale, one need look no further than the recent experience of the IRS.

Pressed by GAO and Congress to close the growing gap between tax receipts and taxes owed, and threatened by a push to contract out tax collection, IRS Commissioner Margaret Milner Richardson announced in 1994 that tax compliance had become an IRS priority.

IRS used statistical measures to gauge field office productivity and to set goals in a variety of areas, including tax enforcement. Regional commissioners and district directors were held personally accountable for improving tax collections to close the gap, which had grown from $76 billion in 1981 to $127 billion by 1992. IRS headquarters exhorted regional collections managers to improve productivity by using statistics to set goals and motivate employees. Also in 1994, IRS became a Results Act pilot agency, redoubling its interest in measuring performance. Use of statistics spurred hot competition among regions and districts to move up in the nationwide rankings by improving collections and other enforcement numbers such as property seizures and tax liens.

Regional and district offices apportioned shares of their numeric goals to divisions and branches, just the sort of cascading of goals throughout the organization the Results Act encourages. "In many cases, it was the revenue officers themselves who initiated group goal-setting because it gave them a baseline against which they measured their effectiveness," according to the August 1998 report of a special review panel appointed by IRS Commissioner Charles Rossotti. IRS' fiscal 1997 business review noted that competition over field office ratings was a highly effective productivity motivator--better even than targets set in the agency's strategic plan.

The trouble was that both an internal agency policy statement and the 1988 Taxpayer Bill of Rights prohibit IRS from using enforcement results to evaluate collections employees or to create production goals and quotas. Congress and the IRS itself have long been understandably cautious about setting goals and quotas for tax collectors. While the compliance push of the early 1990s was accompanied by repeated warnings against using statistics to evaluate people or develop quotas, "there is an almost instinctive need to share those organizational goals at all levels of the organization," Rossotti's reviewers found. That need to share ultimately ran afoul of IRS rules.

Inexorably, given the pressure from headquarters, production numbers displaced other concerns in some field offices. The Arkansas-Oklahoma district collection division, for example, rose in the rankings from the middle range of 33 divisions to No. 3 in less than three years, largely because its agents made eight times more property seizures per person than the national average. The Arkansas-Oklahoma collection division chief, who initially was rewarded for rising in the rankings, later was admonished when investigators found he had overemphasized seizures as a collection tool, wrongly used statistics in manager ratings, and inappropriately set goals for groups of revenue officers.

As the IRS toiled diligently to close the tax gap, congressional Republicans assimilated a lesson of the 1996 presidential campaign: Attacking taxes wins votes. When a flat-tax proposal brought dark horse Republican contender Steve Forbes unexpected support, other candidates adopted it. In September 1997, Sen. William Roth, R-Del., a Results Act author, hauled IRS officials before the Senate Finance Committee for three days of testimony featuring anguished citizens' tales of despair, ruin and even suicide induced by over-enthusiastic tax collectors. Republicans, already primed for tax code assaults and searching for 1998 congressional campaign issues, seized on the hearings to whip up public outcry. IRS' field office performance statistics and ranking of regions and districts became a focus of controversy.

In the hearings' wake, acting IRS Commissioner Michael Dolan was forced to apologize for taxpayer abuses, shut down the compliance effort, end virtually all use of collection statistics and then fall on his sword. Dolan, a career IRS executive, resigned soon after the hearings to give new Commissioner Rossotti freedom to bring on his own senior team. Twelve IRS managers, many of whom had earlier won accolades for using statistics to spur collections, received official reprimands. Investigations spawned by the hearings continued through the end of 1998.

Rossotti immediately received broad legislative authority to bring in 40 new managers at higher than normal salaries and restructure the agency. He quickly adopted a new mission statement emphasizing customer service, integrity and fairness, and he promised to end abuses. The 1998 IRS Restructuring and Reform Act, signed into law in July, created an IRS oversight board of six private-sector representatives, the Treasury Secretary, the IRS Commissioner and an employee representative to approve strategic, budget and reorganization plans and to review operations and senior management compensation. Rossotti is reorganizing IRS by taxpayer group after more than 40 years of functional stovepipes.

Out of Balance

In the 1990s, the IRS compliance effort in many ways followed the formula for managing for results under the GPRA framework: IRS listened closely to members of Congress, its most powerful stakeholders, and set a strategic goal--closing the tax gap--based on legislators' input. IRS managers were consulted about performance goals to achieve the strategic objective. The larger goal was translated into annual performance targets in regions and districts. Numerical goals quickly filtered to the front line, often at employees' initiative. Performance was eagerly and assiduously measured. Program managers used multiple, useful measures to gauge output. Offices competed; collections increased. Nevertheless, the performance measurement effort turned out to be a key contributing factor in the agency's worst crisis in more than a decade.

IRS' experience illustrates a danger inherent in performance measurement: While what's being measured is getting done, what isn't being measured is getting short shrift. "It is common knowledge among employees and first-line managers that 'increased voluntary compliance' equals 'dollars per hour.' You get what you measure," wrote one employee on a survey conducted by GAO in March about the use of enforcement statistics (GGD-99-11). Focusing on the performance of easily measured activities, such as dollars collected or tax liens filed, crowds out less easily measured service dimensions, such as fairness, courtesy and quality of work.

At IRS, the focus on enforcement measures squelched concerns about misusing goals and brought increasing pressure on managers, some of whom sweated their staffers, who in turn squeezed laggard taxpayers harder than may have been appropriate. Now IRS is moving to bring its measures back into sync. Rossotti is imposing new performance measures to balance business results, customer satisfaction and employee satisfaction. "We have struggled because our focus on enforcement was out of balance with formal policies that prohibited the use of those statistics as a basis for evaluating front-line employees," he told executives at the annual IRS business meeting in September. "Confusion was common and we made mistakes. We're now turning this around."

The business results measures will assess quantity using numbers of cases closed or time per closing but not data reflecting enforcement outcomes such as penalties imposed, dollars collected or numbers of liens filed.

While IRS' measurement campaign brought a short-term bump in enforcement, the agency paid a long-term price in stakeholder support. Other enforcement agencies also have struggled to balance enforcement measurements. In June, for example, legislators voted to prohibit the Occupational Safety and Health Administration from imposing enforcement quotas on its inspectors. Until 1995, when an OSHA rule halted the practice, supervisors evaluated inspectors based on citations issued, penalties assessed and inspections conducted. Currently, OSHA measures how well inspectors promote voluntary compliance and participate in partnerships to reduce workplace injuries.

The potential for unbalanced measures looms elsewhere, as well. The Food and Drug Administration currently is performing a balancing act between speeding up drug reviews and ensuring pharmaceutical safety. FDA relies increasingly on drug review application fees paid by pharmaceutical firms. Drug application fees provided almost 10 percent of FDA's fiscal 1997 budget and financed about 8 percent of agency staff. Almost 40 percent of FDA's 1997 budget for processing human drug applications came from user fees. Results-based management would dictate a balanced focus on making sure drugs found to be safe and effective are quickly approved. But FDA's GPRA performance measures focus chiefly on review times.

Limited Control

It's increasingly clear that few GPRA performance plans can meet the requirements of the law and the expectations of Congress and OMB while still seeming reasonable and realistic to the agencies that drafted them. Many agencies' chief difficulty is that they don't completely control the outcomes they seek. EPA, for example, sets safe drinking water standards, but states oversee local water suppliers' compliance. FDA aims to reduce teen smoking and illness from food contamination, goals that are affected by many factors beyond the agency's control. Though the Food Safety and Inspection Service is completely remaking its meat inspection process to account for new forms of contamination, it still can't control whether consumers cook their meat thoroughly or not.

In December, GAO recommended strategies agencies can use to overcome their inherent limitations (GGD-99-16):

  • Select outcome goals over which the agency has some control and that represent both long-term and near-term goals believed to contribute to ultimate benefits. For example, the National Highway Traffic Safety Administration's outcome goal is to reduce motor vehicle crashes leading to death and injuries, but it measures seat belt usage as a contributor to that goal.
  • Redefine the scope of strategic goals to focus on the agency's actual activities. Thus, OSHA for 1999 set a goal of reducing by 3 percent three of the most prevalent workplace injuries and illnesses, and reducing injuries and illnesses in five "high hazard" industries, rather than setting dramatic targets for reducing all workplace injuries and maladies.
  • Break out goals for target populations for which the agency has different expectations. OSHA's "high-hazard" industry goals help correct for a possible increase in overall injury rates over time as a result of increased employment in hazardous industries.
  • Correct statistics to reflect the effect of external factors. NHTSA measures the ratio of fatalities per mile driven to adjust for the fact that when more miles are driven, more crashes are likely.
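The last strategy, adjusting a raw count for exposure, can be illustrated with a little arithmetic. The sketch below is only an illustration: the function name, the per-100-million-miles scale and both years of figures are assumptions made up for the example, not NHTSA data or NHTSA's actual method.

```python
def fatality_rate(fatalities: int, vehicle_miles: float,
                  per_miles: float = 100_000_000) -> float:
    """Return fatalities per `per_miles` miles driven (hypothetical helper)."""
    if vehicle_miles <= 0:
        raise ValueError("vehicle_miles must be positive")
    return fatalities / vehicle_miles * per_miles


# Hypothetical figures for illustration only; not real NHTSA statistics.
year_1 = {"fatalities": 40_000, "vehicle_miles": 2.0e12}
year_2 = {"fatalities": 42_000, "vehicle_miles": 2.4e12}

rate_1 = fatality_rate(**year_1)
rate_2 = fatality_rate(**year_2)

# The raw count rises between the two years, yet the exposure-adjusted
# rate falls because miles driven grew faster than fatalities did.
print(f"raw change:  {year_2['fatalities'] - year_1['fatalities']:+d} fatalities")
print(f"rate change: {rate_2 - rate_1:+.2f} per 100 million miles")
```

With these made-up numbers, a measure based on the raw count would show performance slipping, while the rate-based measure would show improvement. That gap is the external factor the adjustment is meant to remove.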

Useful News

The most important test of results-based performance measures is whether program managers use them. Heretofore, they have not. In June 1997, GAO reported that only about a third of federal managers received the types of performance information that would allow them to measure whether programs were achieving their intended results (GGD-97-109). Only 38 percent reported receiving enough performance information to measure the efficiency and quality of their operations.

"We're going to have to figure out a way collectively to infuse program managers with a desire to get the information and use it," G. Edward DeSeve, deputy director for management at the Office of Management and Budget, said during a June Results Act conference in Washington. "If GPRA is not valuable to the program manager, [if] he or she uses something else to manage, and if there's a different set of measures that are used by senior management in evaluating the program managers, then we have a disconnect. And if they can't get the information, and therefore they can't use it, we have a disconnect."

The Government Performance Project unearthed little evidence that program managers currently are receiving and using much more results-based performance information than they did in 1997. In most agencies studied, collection of performance data was at an early stage and data quality still was questionable. Technology systems at most agencies still aren't wholly linked, so managers often must use several systems to get program and financial information, which rarely is timely in any case. An exception is the Veterans Health Administration, overall high scorer for the project in managing for results. VHA is putting in place a decision support system to give managers and health care staff data on patterns of patient care and health outcomes. The data will be used to analyze resource use and the cost of services.

More commonly, where performance data is available, it's primarily about program outputs, not outcomes. The Social Security Administration, for example, used information about busy signals and caller waiting times to turn around its toll-free telephone service so 95 percent of callers now get through on their first attempt. The Patent and Trademark Office is realigning staffing and funding to cut patent processing time from 16 to 12 months and trademark processing time from nearly 18 to 13 months by 2003. But performance goals and measures at the two agencies fail to reflect the quality of service provided or the ultimate goal of providing it. PTO customers say errors are routine on PTO documents at the current processing rate and could increase if cases move faster.

There is reason for more optimism about agencies' efforts to link resources and performance. PTO is using activity-based costing to set and adjust its fees based on the actual cost of processing applications for trademarks and patents. VHA is using the Veterans Equitable Resource Allocation (VERA) system to distribute appropriated funds among hospitals based on numbers of veterans served. It's a gutsy approach, given how politically important a VA hospital can be to a member of Congress.

In 1997, VERA's first year of operation, nine VHA integrated service networks (regional groupings of hospitals and clinics) lost funding while the remaining 13 gained. VHA also is funding community clinics instead of building more hospitals after discovering veterans more commonly need outpatient or nursing home care than hospital stays.

One sure way to get managers' attention focused on performance is to hold them personally accountable for it. IRS certainly succeeded at this during its enforcement push, but the goals managers were motivated to achieve turned out to be the wrong ones. VHA, on the other hand, is giving directors of its integrated service networks unprecedented freedom to allocate resources in exchange for tough personal performance agreements tied directly to agency strategic goals.

Getting reliable performance data and motivating managers to use it will become more important in the coming years. After a slow start, legislators have become broadly and increasingly deeply attentive to performance. The Congressional Research Service reported in December that the 105th Congress passed twice as many laws containing performance provisions as did the 104th. Most early congressional interest in results came from agency oversight committees, but appropriations committees now have caught up and are putting price tags on performance measures.

GPP report card

Managing for Results

Management Grades

Agency    Grade
VHA       A
FAA       B
FEMA      B
FDA       B
FNS       B
FSIS      B
IRS       B
PTO       B
SSA       B
Customs   C
FHA       C
EPA       C
HCFA      C
OSHA      C
INS       C

Rating Criteria

  • Presence of a strategic, results-oriented plan.
  • Involvement of stakeholders in developing and evaluating the strategic plan.
  • Development and use of indicators and evaluative data by which progress toward results can be measured.
  • Availability and use of these measures for policy-making, management and evaluation of the agency's progress toward its goals.
  • Clear communication of results to citizens, elected officials and other stakeholders.

Best Practices

Measure performance related to all strategic goals; balance performance measures. IRS learned the hard way not to measure enforcement at the expense of fairness, courtesy and quality.

Use strategic goals to hold people personally accountable for results. The Veterans Health Administration gives directors of regional groupings of hospitals and clinics unprecedented freedom to allocate resources in exchange for tough personal performance agreements tied to agency strategic goals.

Develop solid baseline information and continue collecting useful, accurate data to measure performance at all levels of the organization. The Patent and Trademark Office has won accolades for clearly linking employee performance to agency goals, though those goals primarily focus on reducing processing times.

Get people who run programs to use performance information to manage. PTO is using activity-based costing to assign costs to activities based on resources the units consume, forcing managers to make better use of people and resources to achieve their goals.

Match resources and goals. VHA's resource allocation system is forcing hard choices about hospitals to account for changing veteran demographics.