Perfecting Performance

"In some sense, what we've been doing with GPRA is living on the supply side," says Paul Posner, GAO's managing director for strategic issues. "We've been developing measures and plans and trying to build linkages [to budgets]. We have to move to the demand side-how are we going to increase the demand for this information." OMB assigns its grades to entire departments. If one agency in the department is failing, it drags down the entire department's score. The grades are intended not only to shame agencies into better performance, but to drive funding decisions. "As programs learn to link performance and cost, they can set targets in their annual performance plan in line with their budget request," the administration says in its budget proposal. "This helps gain support for their request and holds them accountable to achieve the targets." More than most agencies, EPA seems to have moved from the supply side of GPRA to the demand side. EPA, along with the Transportation Department and the Small Business Administration, earned the only yellow lights for performance budgeting in the president's budget. The others all got red lights. While the demand for performance budgeting is growing, creating useful measures presents a challenge for some agencies. The Centers for Medicare and Medicaid Services uses a number of output-based measures. For instance, the agency relies on health insurance companies to process Medicare claims and communicate program changes to health care providers. They do so via Web sites, letters and toll-free hot lines. But the companies are judged mainly on process-how quickly they answer the phone, how fast they answer letters from health care providers-rather than quality. The agency has trouble assessing the reliability of information being given to doctors and other health care providers. The struggle for most agencies is to come up with a healthy mix of output measures-for example, whether the product or service is delivered on time-and outcome measures, such as how the service affects the recipient. "Managers typically manage to output goals," says Wholey. "That is fine if there is a sensible link between output and outcomes. I don't want to sound as if output is bad and outcomes are good. You should have a mix of the two. If you think back to getting children immunized, that is an output-how many kids got shots in their arms. But result is a known outcome-a reduction in disease."
Federal agencies are starting to get the hang of performance planning. Now comes the hard part-figuring out the price tag.

Let's start with the obvious: Washington is a budget-driven town. Civil servants, military brass and trade association lobbyists spend most of January devouring every little leak about spending priorities and program funding. They are never satisfied, even with all the juicy nuggets of information that journalists provide them. A Lexis/Nexis search of The Washington Post and The New York Times between Jan. 1 and Feb. 4, when the Bush administration released its budget proposal, produced more than 300 related references. Once the budget actually comes out, policy wonks and lobbyists scurry all over Capitol Hill, defending the budget's funding allocations, if not trying to get more money.

What's less obvious is how program performance and management factor into the equation. There is hardly the same level of hype or interest when agencies and departments deliver their Government Performance and Results Act reports to Congress. Where are the front-page stories about the Health and Human Services Department's annual performance plan? For that matter, when is the press conference?

A cynic would argue that performance rarely is considered during the budget process. Rather, performance, results and management are nothing more than afterthoughts. The point is not lost on President Bush. Here's what the administration had to say in its management agenda, released last August: "Everyone agrees that scarce federal resources should be allocated to programs and managers that deliver results. Yet in practice, this is seldom done because agencies rarely offer convincing accounts of the results their allocations purchase. There is little reward, in budgets or in compensation, for running programs efficiently. And once money is allocated to a program, there is no requirement to revisit the question of whether the results obtained are solving problems the American people care about."

The cynicism continues: "Agency performance measures tend to be ill-defined and not properly integrated into agency budget submissions and the management and operation of agencies.... After eight years of experience [under GPRA], progress toward the use of performance information for program management has been discouraging."

Is the process really that abysmal? Perhaps not.

According to the General Accounting Office, agencies are getting better at integrating performance and budget. In fiscal 2002, more than 75 percent of agencies were able to show a direct link between the two, GAO reported in January (GAO-02-236). That's up from 40 percent in fiscal 1999.

"The agencies in our review continued to show the capacity for meeting a basic requirement of GPRA: to 'prepare an annual performance plan covering each program activity set forth in the budget,'" GAO reported. "In addition, these agencies continued to show progress in translating plan-budget linkages into budgetary terms, thus indicating the performance consequences of their budget proposals." GAO's analysis considered compliance with the 1993 Results Act only. The Office of Management and Budget, in its assessment for the president's management agenda and the 2003 budget, considered a broader array of factors that included how well agencies define their measures and whether they use activity-based costing.

If the agencies studied in this year's Federal Performance Report are any example, the truth about how well agencies have succeeded in fulfilling GPRA's intent lies somewhere between GAO's findings and the administration's viewpoint. Of the six agencies reviewed, some, such as the Environmental Protection Agency and the Federal Aviation Administration, are getting better at managing for results. Meanwhile, others, such as the Centers for Medicare and Medicaid Services, struggle to create performance-based cultures.

For the most part, the agencies studied in this year's report have had only mixed results. At the Social Security Administration and the Internal Revenue Service, for example, senior officials are trying to create cultures of accountability. IRS Commissioner Charles Rossotti is tying bonuses for executives to their achievement of performance goals. Bonuses for managers are tied to similar agreements. At Social Security, a majority of employees surveyed by GAO report they are committed to helping the agency achieve its strategic goals and using performance information to set priorities. At the same time, GAO found that only 23 percent of managers at Social Security report having enough decision-making authority to accomplish the agency's strategic goals.

Further, some agencies in this year's report, most notably CMS and the Immigration and Naturalization Service, are having a tough time finding the right balance between outcome and output-related measures. INS, for instance, is having trouble assessing program effectiveness because the agency has yet to develop goals and measures that can "objectively capture and describe performance results," according to GAO.

Supply and Demand

"In some sense, what we've been doing with GPRA is living on the supply side," says Paul Posner, GAO's managing director for strategic issues. "We've been developing measures and plans and trying to build linkages [to budgets]. We have to move to the demand side: How are we going to increase the demand for this information?"

The Bush administration hopes to do that by setting, and insisting agencies follow, a management agenda. One of the five pillars of the president's management agenda is better integration of budget and performance. That goal is further outlined in the fiscal 2003 budget proposal, which tells agencies to start using performance measures to develop policies, make budget decisions and improve everyday program management. As part of the budget process, OMB is holding departments accountable for how successful they are in meeting this goal, along with the four other areas of the agenda: human capital, e-government, financial management and procurement reform. OMB plans to review agencies in these five areas on a quarterly basis and publish a scorecard in the budget every year. Departments failing to meet the criteria get a red light, mixed results get a yellow light and success earns a green light. Standards for success in performance budgeting include:

  • A streamlined, clear, integrated agency plan that establishes outcome goals, output targets, and resources requested in the context of past results.
  • Budget, staff and program activities that are aligned to support program targets.
  • Full cost accounting that is used for program activities. The cost of outputs and programs is integrated with performance in budget requests.
  • Program effectiveness that is documented. Analyses show how program outputs and policies affect desired outcomes. The agency systematically applies performance data to its budget and can demonstrate how program results inform budget decisions.

OMB assigns its grades to entire departments. If one agency in a department is failing, it drags down the entire department's score. The grades are intended not only to shame agencies into better performance, but to drive funding decisions. "As programs learn to link performance and cost, they can set targets in their annual performance plan in line with their budget request," the administration says in its budget proposal. "This helps gain support for their request and holds them accountable to achieve the targets."

Though the process still is in its infancy, it creates a healthy dialogue about program performance, according to OMB officials and some agency managers. In developing the 2003 budget proposal, OMB staff held extended conversations with agency staff about integrating performance and budget, according to Marcus Peacock, associate program director for natural resources at OMB. Peacock serves as point person for the performance and budgeting component of the management agenda. There was a lot of give and take, and OMB sometimes asked agencies to revise their budget proposals more than once, Peacock says. While agency documents, such as strategic plans, were the starting point for grading, OMB staff also considered external documents, including GAO reports and academic studies.

A Health and Human Services Department manager credits OMB for its effort to thoroughly assess program performance, but says the process needs to be refined. Mostly, she says, OMB and agency staff struggled to find the right balance between performance and other factors, including a program's social or economic importance. Joseph Wholey, professor of public administration at the University of Southern California, says the process at least puts performance budgeting on the right track. A former senior adviser for performance and accountability at GAO, Wholey says the administration's approach lets agencies that can show tangible benefits and effective management improve their standing during budget negotiations.

For example, the National Weather Service got an increase in funding because it set specific targets: increasing hurricane warning time by two hours by 2005 and doubling tornado "lead time" to 22 minutes by 2015. The agency got an A for managing for results in last year's Federal Performance Report, partly because its performance measures are focused on outcomes and tied to the overall mission. The weather service also generates reliable performance data at all levels. The administration also bolstered the Labor Department's Job Corps budget by 5 percent, adding funding for four new training centers and raising teachers' pay, because the department showed clearly that program participants "get jobs, keep them and increase earnings over their lifetimes."

"What performance budgeting does is allow decision-makers to have more information at their disposal," says John Kamensky, director of PricewaterhouseCoopers' managing for results practice. He was deputy director of the Clinton administration's National Partnership for Reinventing Government. "It means decisions can be made on something other than anecdotes. Some agencies deliberately don't collect data because they think they can get better budget results based on anecdotes."

True Performers

More than most agencies, EPA seems to have moved from the supply side of GPRA to the demand side. EPA, along with the Transportation Department and the Small Business Administration, earned the only yellow lights for performance budgeting in the president's budget. The others all got red lights.

In 1997, EPA embarked on an effort to go beyond GPRA's requirement that agencies develop annual performance plans. Agency officials believed the legislation provided an opportunity to truly change internal operations. The fiscal 1999 budget was the first in which EPA linked funding to overall environmental goals.

"If you looked at our pre-1999 budgets, they were arranged by program areas. You would have seen a request for the research and development office all by itself," explains David Ziegele, director of the Office of Planning, Analysis and Accountability. "Then, if you kept looking, you would see how much the air office got, and water office, and solid waste office."

But environmental protection hardly can be dealt with in a stovepipe fashion. Programs in the solid waste office affect the water and air offices. Research and development contributes to every part of the agency. So EPA officials restructured the budget to tie funds to broad environmental goals, such as improving air quality. From there, they set smaller, more quantifiable objectives, such as reducing the amount of chemical pollutants released into the air.

The refined budget format allows the agency to tie funding to a specific set of outcomes, regardless of which division has lead responsibility. For example, under the clean air goal, EPA's budget now includes a line item for research and development, allowing agency officials to see how much R&D contributes to improved air quality.

The new format also means EPA must think through performance measures before making budget allocations. In the spring, the agency starts developing priorities for the next fiscal year. Officials look at past performance and take a mid-year snapshot in an effort to develop goals. They also consider goals for the next five to 10 years. The process is not limited to agency staff: State and tribal government officials are brought into the fold. EPA relies on these jurisdictions for 94 percent of its information about pollution and the quality of natural resources, and states are heavily involved in enforcement. Once the priorities are set, agency leaders start figuring out the budget request.

"Prior to GPRA, goals were not an explicit part of the process," says Ziegele. "Frequently, the budget discussions focused more on bright ideas people would try to sell to the bosses. We're still looking for innovative ideas, but sooner or later the questions come up about what the results will be."

The process is not easy. Ziegele says the agency had to overcome two main hurdles: getting the planning and budget offices to work together on establishing specific goals, and creating a performance-based culture. The agency solved the first problem partly by reorganizing the two offices to report to the same boss instead of two different ones. The longer-term, and perhaps more challenging, problem is changing the culture. Doing so requires a focus on outcome-based measures, says Ziegele. He cites a subtle yet significant change in the enforcement program: Typically, the agency issued a year-end news release detailing the number of enforcement actions it had taken. But last year, the release highlighted the millions of pounds of pollution the agency had helped remove from the environment.

Nonetheless, performance budgeting has not paid off in terms of huge dollar increases for EPA; politics still controls the purse strings. Additionally, appropriations committees on Capitol Hill are not as enamored with the revised budgeting process as are authorizing committees. Frank Cushing, clerk for the House VA, HUD and Independent Agencies Appropriations Subcommittee, says most members want to look at how specific programs are doing, not necessarily broad goals.

"In terms of trying to meet GPRA, EPA has done a commendable job," he says. "Of the 20-some agencies under our jurisdiction, they have done better than anyone else at creating a budget request that is in a format consistent with GPRA." But, he adds, "If you look at their budget request, you can look at a program and find it under 10 or more goals. It's hard to see what they are getting at."

Still, Ziegele defends performance budgeting, saying it allows the agency to better explain and defend its programs.

Setting Targets

Even EPA, despite its success in managing for results, has trouble developing reliable measures. Nature doesn't always cooperate with an annual budget process. It can take decades for the results of pollution cleanups to show. Another huge complication is that EPA lacks data that would enable the agency to measure its achievements. The agency has a wealth of information about air quality because of its extensive, long-standing network of pollution monitors, but it lacks adequate data about solid waste, pesticides and other types of environmental problems since it must rely on inconsistent reporting from private industry and the states. The agency's inspector general noted last year that the inability to measure progress in some programs leads EPA to fall back on process goals such as counting the inspections it conducts or the permits it issues instead of looking for true environmental results.

Further complicating the situation are pressures from outside stakeholders. For the IRS' Rossotti, the pressure came from Congress, which passed a law in 1998 preventing his agency from measuring the amount of money revenue agents collect from noncompliant taxpayers. Lawmakers forced that change after complaints, many of which GAO later determined to be unfounded, poured into the Senate Finance Committee alleging that performance measures for enforcement were driving agents to abuse taxpayers. A few years later, members of Congress took a look at the agency's dwindling collection statistics and complained that the IRS had gone soft on tax cheats.

Rossotti has also run into trouble trying to come up with reliable data for the most basic measure of his agency's performance: whether people pay all of the taxes they owe. In January, Rossotti announced that the IRS would conduct random audits of 50,000 taxpayers, mostly by analyzing data on a computer or by corresponding with taxpayers. About 2,000 taxpayers would have face-to-face audits. The random audits would help the IRS determine where people were having trouble complying with the tax code and where they were deliberately cheating. Then the IRS could educate the public about problem areas and target traditional audits at real cheaters. The random audits would also give the agency a statistically valid sample to estimate the amount of taxes that people are and aren't paying.

The announcement drew a firestorm of criticism from taxpayer advocates and tax practitioners, who say the heavy-handed random audits of the past, last conducted in 1988, put too many innocent taxpayers through excruciating examinations. Compared with the last time the agency tried to resurrect random audits, when the IRS planned to conduct 150,000 of them in 1994, Rossotti's new plan is far less intrusive. But he still may face new legal restrictions if opponents of the random auditing plan influence lawmakers.

"So far, what we've been dealing with are measures that are necessary to run the organization, that are operational in nature, and which have some value in stressing organizational performance," he says. "But let's say we answer more phone calls, and we do more audits, and we have the right answers to phone calls, and we collect overdue accounts; those are all measures that are important and we have them in our performance plan. But they're not really the ultimate measure. The ultimate measure of the IRS is: Are people paying all the taxes that are due and are people satisfied with the service they get from the IRS. Those are the ultimate measures. That's what we call strategic measures. We don't have those measures yet."

While the demand for performance budgeting is growing, creating useful measures presents a challenge for some agencies. The Centers for Medicare and Medicaid Services uses a number of output-based measures. For instance, the agency relies on health insurance companies to process Medicare claims and communicate program changes to health care providers via Web sites, letters and toll-free hot lines. But the companies are judged mainly on process, such as how quickly they answer the phone and how fast they respond to letters from health care providers, rather than on quality. The agency has trouble assessing the reliability of the information being given to doctors and other health care providers.

The struggle for most agencies is to come up with a healthy mix of output measures, such as whether a product or service is delivered on time, and outcome measures, such as how the service affects the recipient. "Managers typically manage to output goals," says Wholey. "That is fine if there is a sensible link between output and outcomes. I don't want to sound as if output is bad and outcomes are good. You should have a mix of the two. If you think back to getting children immunized, that is an output: how many kids got shots in their arms. But the result is a known outcome: a reduction in disease."

Baby Steps

Creating this mix takes time and patience, says Wholey. In the end, though, developing useful measures and linking them to the budget helps agencies tell their stories to policy-makers and the public. EPA's Ziegele says that becomes especially important in defending budget allocations.

"A lot of the work we do is controversial," he says. "The more we can explain why we are doing certain things, the more support we can get for our budget and our work. It elevates the debate. At least people are talking about substantive issues."


Rating Criteria
  • Does the agency have a clearly articulated statement of its mission, and does it understand how its activities drive mission success?
  • Does this mission understanding extend vertically throughout the organization?
  • Are the measures of success focused (at least in part) on outcomes?
  • Are the measures related to the mission and goals as reflected in the strategic plan?
  • Are the performance data reliable?
  • Are appropriate measures reported to individuals at different levels of the organization, and to external stakeholders?
  • Are performance measures used to influence and/or inform resource allocation decisions?
  • Is there any relationship between organizational performance and individual or group incentives to contribute to organizational performance?