by:   Bill Vesely, Science Applications International Corp.

Joseph Fragola, Science Applications International Corp.

Michael Stamatelatos, NASA Headquarters

[Editor Note: This letter to the editor was originally published in Volume 38 Issue 1 of Hazard Prevention (now Journal of System Safety) in 1Q 2002. The letter has been reformatted, but the text is unchanged.]


The following is a response to Jack Crawford’s article entitled “Opinion: What’s Wrong with the Numbers? A Questioning Look at Probabilistic Risk Assessment.” The article appeared in the third quarter 2001 issue of JSS.

In a recent opinion article in the third quarter 2001 issue of Journal of System Safety, Jack Crawford criticizes Probabilistic Risk Assessment (PRA) for various shortcomings. We would like to respond to those criticisms, and we have organized our responses according to the basic questions that Mr. Crawford poses, and the answers he submits.

Question 1: To what extent does PRA encompass the main causes of accidents?

Mr. Crawford presents examples in which a PRA did not identify the causes of failures or accidents that occurred. The examples he presents are functional failures, in which the system operated but did not function as required.

We respond that it is true that most PRAs do not model functional failures. It is useful here to distinguish two types of failure: failure to operate and failure to function. A failure to operate occurs when the system fails to run, as defined by an insufficient number of components running; failure to run also includes failure of a component to start. In contrast, a failure to function occurs when the system fails to perform as required: all the components of the system operate or run, but the system still fails its function.

An example of a functional failure is an air conditioning system that operates but fails to perform its function of cooling a control room. Another example is a spray system that operates but fails to wash the air of radioactive particles. Still another example is a booster rocket that operates but does not have sufficient thrust to lift the spacecraft to the desired orbit.

Most PRAs only model failure to operate. In particular, a PRA that uses only component failure rates models failure to operate. A well-constructed, failure-to-operate PRA is a credible and useful model if properly applied. It is applicable to a system that has been operating and functioning, or one that is a reproduction of a system that has operated and functioned. When a PRA models a new system that has not operated, the PRA needs to clearly indicate which failure is being modeled. This is not a deficiency in the basic PRA method but a deficiency in communicating the assumptions and boundaries of the analysis.

A PRA can also model functional failures. Many nuclear power plant PRAs include functional failures. For example, response times of safety systems and operators are modeled. Safety system performance is modeled as a function of temperature, flow, pressure and components that have failed. Steam binding of pumps and water-hammer effects on water lines are modeled. Another performance feature modeled is the dynamic forces on a valve when it attempts to close under accident pressures and flows. [Refs. 1-4]

When functional failures are modeled, performance must be modeled. Performance modeling involves defining output performance variables and their relation to input variables and conditions. Performance modeling also involves modeling continuous variables, not just the usual binary variables (fail or no fail) that are modeled in an operational model. When performance and functional failures are modeled, the analyst must obtain and interpret system performance requirements, and gather and model performance characteristics that have been measured in tests. The functional failure models that are developed are then incorporated into the PRA, as they are in nuclear power plant PRAs.
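The distinction drawn above can be sketched in code. The following is our illustration, not a model from the letter or from any cited PRA: a functional failure is evaluated on a continuous performance variable that depends on input conditions, rather than on a binary fail/no-fail component state. The performance model and all numbers are hypothetical.

```python
import random

random.seed(1)

REQUIRED_OUTPUT = 100.0  # performance requirement (arbitrary units)

def delivered_output(temperature, flow):
    """Illustrative performance model: output is a continuous function
    of input conditions, as in a functional-failure analysis."""
    return 140.0 - 0.8 * temperature + 0.5 * flow

def functional_failure_probability(n_trials=100_000):
    """Monte Carlo estimate of the probability that the system operates
    but still fails to meet its performance requirement."""
    failures = 0
    for _ in range(n_trials):
        # Input conditions vary even when every component "operates."
        temperature = random.gauss(60.0, 10.0)
        flow = random.gauss(25.0, 5.0)
        if delivered_output(temperature, flow) < REQUIRED_OUTPUT:
            failures += 1
    return failures / n_trials

p_ff = functional_failure_probability()
print(f"estimated functional-failure probability: {p_ff:.3f}")
```

In an operational model, the components above would each be assigned a binary failure probability; here the failure event is defined entirely by the continuous output falling short of the requirement.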

Thus, the fact that the PRAs examined by Mr. Crawford did not include functional failures is not an indictment of PRA methodology. It merely indicates a lack of communication of assumptions and constraints of the PRAs. When PRAs are compared using applicable failure definitions and accounting for quantified PRA uncertainties, the results are generally consistent with history. These comparisons are made by properly comparing probability predictions with experience.

When probability predictions are compared with experience, detailed causes generally cannot be compared. Instead, patterns of occurrence and categories of causes are compared. This is the standard approach in comparing any probability prediction with experience. The U.S. Nuclear Regulatory Commission (USNRC) has issued various reports on making such comparisons, including comparing precursors to accidents. [Ref. 5,6] Precursor occurrences, because of their higher probabilities, can be meaningfully compared with PRA predictions, using available data to obtain statistical conclusions with confidence.

The comparison of detailed causes is not really a statistically meaningful comparison, even if applicable system failure types are compared. Generally, when there is only one failure occurrence, no conclusions can be reached on comparisons between causes due to the sparseness of data. One needs to carry out valid statistical tests to reach valid conclusions. The author’s conclusions may be interesting conjectures, but they have no statistical basis.

Regarding observed failures, Mr. Crawford states in the beginning of his article that he rarely sees “random” failures modeled in a PRA. “Random” in PRA and in reliability modeling does not mean “having no cause.” “Random” means that the pattern of failure times can be modeled by a probability distribution. All failures have causes. However, a PRA is concerned with modeling the probability distribution of failures. Fitting a probability distribution to failure time occurrences is a standard statistical procedure.
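The statistical procedure mentioned above can be shown concretely. This is a hedged sketch with invented failure times, assuming an exponential model; the point is only that "random" means the failure times are described by a fitted distribution, not that the failures are causeless.

```python
import math

# Illustrative failure times in hours (hypothetical data).
failure_times = [120.0, 340.0, 95.0, 410.0, 230.0, 180.0, 510.0, 75.0]

# Maximum-likelihood estimate of the failure rate for an exponential model:
# rate = n / (total observed time to failure).
rate_hat = len(failure_times) / sum(failure_times)
mttf_hat = 1.0 / rate_hat  # mean time to failure

# Probability of surviving a 200-hour mission under the fitted model.
p_survive_200h = math.exp(-rate_hat * 200.0)

print(f"rate = {rate_hat:.5f}/h, MTTF = {mttf_hat:.1f} h, "
      f"P(survive 200 h) = {p_survive_200h:.3f}")
```

Every one of the eight failures behind these times had a physical cause; the distribution describes only the pattern of their occurrence.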

The author goes on to say that a gap in PRA is caused by assumptions that become invalid or are invalidated. A PRA, as with any model, can only be meaningfully interpreted within the bounds of its assumptions. Using a PRA or any model with invalid assumptions is not the fault of the PRA or model. Instead, it is the fault of the application. It is important that the PRA clearly state its assumptions and bounds. If these become invalid or are invalidated, then the PRA is not applicable. The USNRC, as part of its risk management plan, reviews nuclear power plants to ensure that the assumptions of the PRA are valid. If they are not, actions are taken so that the assumptions become valid. [Ref. 7,8]

Mr. Crawford further criticizes PRAs for their lack of accounting for people’s perceptions of risk. PRAs do not account for perceptions of risk. Instead, they evaluate technical risk, including the probabilities and measurable consequences of accidents. The decision-maker must then use these results as one of the inputs in making decisions. A PRA does not make decisions. It provides input for decision-making. When changes are made, the assumptions and bounds of the PRA are no longer valid. Other risks may be incurred, as the article notes. However, if these are not in the bounds of the original PRA, the PRA is no longer applicable and needs to be modified.

Mr. Crawford notes that after accident free periods, complacency may set in. Overconfidence may breed shortcuts and underdesigns. Thus, he indicates that the PRA is no longer valid. This again reflects the fact that the PRA assumptions and bounds are no longer valid. To ensure that the risk remains low, the assumptions and bounds of the PRA need to be valid. This is part of an effective risk management plan, as is conducted by the USNRC.

Finally, the author criticizes PRAs for not modeling management effects. PRAs implicitly include management effects if the probabilities and failure rate data that are used are plant-specific. It is true that PRAs generally do not explicitly model management effects. However, this does not mean that the results evaluated by a PRA are erroneous. An effective risk management program includes understanding the PRA and ensuring that the assumptions and bounds are maintained to lower risk. A safety culture and risk culture need to have an understanding of the risks, and PRA can be a significant tool in developing this understanding. Within an effective risk management program, a PRA can be used to ensure low risk, if it is complemented with management actions that consider risk.


Question 2: Can statistical inference take us forward from the past to the future?

Mr. Crawford argues that to extrapolate collected data, conditions of stability must exist in the collected data. He also argues that the recorded data depend on the conditions under which they were collected. He points out that prediction therefore requires applying judgment and knowledge to the available data.

We respond that Mr. Crawford is right that data cannot be naively extrapolated. The conditions under which the data were collected, and their stability, need to be considered. It is true that most databases do not ensure the stability of their data, and many do not clearly identify the conditions under which the data were collected. When data are used, information must be included that accounts not only for statistical uncertainties, but also for the variability in conditions.

When Mr. Crawford discusses numbers, he references only point values with no associated uncertainties. With PRA results, we cannot be confident in a point value for a probability, such as a failure probability of 0.0035. We can only be confident in intervals, or ranges, of probabilities. These probability intervals are often very wide because of uncertainties in the data, as well as in the modeling. In well-performed PRAs, the uncertainty range for a risk result is often a factor of 10 or larger where there are gaps in knowledge.
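A standard way such intervals are represented in PRA practice is a lognormal distribution summarized by an "error factor." The following sketch is our illustration (the error factor convention and all numbers are assumptions, not taken from the letter), using a point estimate like the 0.0035 above with a factor-of-10 uncertainty range.

```python
import math

# Assumed convention: error factor EF = 95th percentile / median
#                                     = median / 5th percentile.
median = 3.5e-3      # illustrative point estimate
error_factor = 10.0  # a factor-of-10 uncertainty range

p05 = median / error_factor
p95 = median * error_factor

# For a lognormal, EF = exp(1.645 * sigma), where sigma is the standard
# deviation of ln(X); the mean exceeds the median when uncertainty is wide.
sigma = math.log(error_factor) / 1.645
mean = median * math.exp(sigma ** 2 / 2.0)

print(f"90% interval: [{p05:.2e}, {p95:.2e}], mean = {mean:.2e}")
```

Note that the 90% interval spans two orders of magnitude, and the mean is nearly three times the median: reporting only the point value would hide most of what the analysis actually says.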

A critical component of PRA is the uncertainty and sensitivity analyses that accompany the calculations. A PRA that reports only a point value is inadequate. A decision-maker who uses only a point value is a misinformed decision-maker. Recorded data can be used to predict the future only if knowledge and judgment, as well as a thorough uncertainty and sensitivity analysis, accompany that prediction.

Mr. Crawford argues that most statistical analyses assume independence of failures. He also argues that the dependency approaches typically used in PRAs, such as common-cause failure beta factors, are arbitrary. Many statistical analyses do assume independence of failures. However, most good PRAs include dependency models, such as beta factor models. Beta factor models and other associated common-cause failure models are not arbitrary. Significant effort by the USNRC has resulted in databases for beta factors and other common-cause parameters that are used to estimate dependent failure probabilities. [Ref. 9,10]

Beta factors and other common-cause failure parameters are empirical parameters that are estimated from actual occurrences of dependent failures. They are used to estimate the probabilities of dependent failures occurring in similar conditions. When these dependent failure probabilities are identified as significant contributors to PRA risks, more specific analyses are carried out to identify causes and protective measures against such dependencies.

The author is correct in saying that the account accompanying a PRA result is at least as important as the number and associated uncertainty range. No PRA result should be taken at face value.

Question 3: How much force does the mathematical theory of probability add to the probability statement?

Mr. Crawford argues that the definitiveness of a PRA prediction is a delusion. He also gives references on the meaninglessness of precise probability statements.

We respond that the mathematical theory of probability determines the laws for combining probabilities. It says nothing about the precision or believability of the probabilities that are obtained. Again, we can’t be confident in any PRA result that is quoted as an exact point value — only in probability intervals. The uncertainty and sensitivity analyses that accompany the PRA results, as well as the modeling and assumption descriptions, are what give the PRA results credibility.

Question 4: If the numbers generated by a PRA do not represent probabilities of future events, are they still useful? If so, for what?

Mr. Crawford argues that the numbers are useful for identifying risk contributors and for identifying risky situations. He also says that he learns by digging for answers to the questions raised by the numbers.

We respond that PRA numbers are useful if interpreted within the bounds of the analysis and the associated uncertainties. However, PRA provides more than just the final numbers. The relative importance of the contributors is prioritized, which helps to focus attention and actions. These relative importances generally have smaller associated uncertainties than the final absolute numbers.
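The prioritization of contributors mentioned above is typically done with importance measures. The following is a hedged sketch (the cut sets and probabilities are invented) of a Fussell-Vesely-style ranking: each basic event is scored by the fraction of total risk coming from minimal cut sets that contain it.

```python
# Hypothetical minimal cut sets and their probabilities.
cut_sets = {
    ("pump_A", "pump_B"): 2.0e-5,
    ("valve_C",): 8.0e-5,
    ("power_bus", "pump_A"): 1.0e-5,
}
total_risk = sum(cut_sets.values())

def fussell_vesely(event):
    """Fraction of total risk from cut sets containing the event."""
    involved = sum(p for events, p in cut_sets.items() if event in events)
    return involved / total_risk

events = {e for events in cut_sets for e in events}
ranking = sorted(events, key=fussell_vesely, reverse=True)
for e in ranking:
    print(f"{e:10s} FV = {fussell_vesely(e):.2f}")
```

Such a ranking is a ratio of contributions, so much of the common uncertainty cancels, which is why relative importances are generally more robust than the absolute bottom-line numbers.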

The risk uncertainties and risk sensitivities identified by the PRA offer information as valuable as the final numbers. These uncertainties and sensitivities identify focal points for gathering additional information, or for instituting actions to control the sensitivities. The qualitative results of the PRA are as valuable as the quantitative results. The qualitative results show the relationships among the contributors that lead to risk. They also show the nature of the contributors, and their redundancies and diversities. Viewing a PRA as providing only final numbers misses most of the useful information it contains.


References

  1. U.S. Nuclear Regulatory Commission (USNRC). “Severe Accident Risks for Five U.S. Nuclear Power Plants,” NUREG-1150, June 1989.
  2. Harper, F.T. et al. “Evaluation of Severe Accident Risks: Quantification of Major Input Parameters,” NUREG/ CR-4551, June 1991.
  3. Kelly, D.L. et al. “Assessment of ISLOCA: Risk Methodology and Application to a Westinghouse Four- Loop Ice Condenser Plant,” NUREG/ CR-5774, April 1992.
  4. USNRC. “Accident Source Terms for Light Water Nuclear Power Plants,” NUREG-1465, February 1995.
  5. Belles, R.J. et al. “Precursors to Potential Severe Core Damage Accidents, A Status Report,” NUREG/CR-4674, June 1995.
  6. Johnson, J.W. and Dale M. Rasmuson. “The USNRC’s Accident Sequence Program: An Overview and Development of a Bayesian Approach to Estimate Core Damage Frequency Using Precursor Information,” Reliability Engineering and System Safety, pp. 205-216, Vol. 53, No. 2, 1996.
  7. Camp, A.L. et al. “The Risk Management Implications of NUREG-1150: Methods and Results,” NUREG/CR- 5263, August 1989.
  8. Caruso, M.L. et al. “An Approach to Using Risk Assessment in Risk Informed Decisions on Plant Specific Changes to the Licensing Basis,” Reliability Engineering and System Safety, Special Issue on Developments in Risk-Informed Decision Making for Nuclear Power Plants, pp. 231-242, Vol. 63, No. 3, 1999.
  9. Mosleh, A. et al. “Guidelines on Modeling Common Cause Failures in Probabilistic Risk Assessment,” NUREG/CR-5485, November 1998.
  10. USNRC. “Common Cause Failure Data Collection and Analysis System,” NUREG/CR-6268, June 1998.

