Thursday, March 5, 2026

Fail Positions

Introduction

In the hierarchy of critical decisions a process engineer makes during the design of a water or wastewater treatment facility, few specifications have as immediate a safety impact as fail positions. While pump curves and pipe sizing dictate the system’s efficiency during normal operation, the fail position of control valves and actuators dictates the system’s behavior during a catastrophe. A power outage, a severed signal cable, or a loss of plant air supply can instantly transform a routine flow control loop into a potential flood, a biological process washout, or a dangerous pressure surge event.

Despite the high stakes, fail positions are often treated as an afterthought in P&IDs or copied from previous project specifications without adequate hydraulic analysis. It is estimated that nearly 20% of control valve specifications in municipal retrofits contain ambiguous or hydraulically dangerous failure mode requirements. For example, specifying a discharge control valve to “Fail-Closed” to prevent backflow might seem logical, but if that closure happens instantaneously during a power loss while the column of water is still moving, the resulting water hammer could rupture the header.

This article addresses the engineering logic required to specify fail positions correctly for municipal and industrial water systems. We will move beyond the basic definitions of Fail-Open (FO), Fail-Closed (FC), and Fail-Last (FL) to explore the nuances of spring-return versus battery backup, the implications of process coupling, and the integration of failure modes into reliable plant automation strategies. Whether you are designing a new headworks facility or troubleshooting a problematic lift station, understanding the physics and risks associated with these positions is mandatory for ensuring operational resilience.

How to Select / Specify

Selecting the correct fail positions requires a multi-disciplinary approach, blending process engineering, hydraulic modeling, and instrumentation design. The decision cannot be made in isolation; it must consider what happens upstream and downstream of the device when energy is lost.

Duty Conditions & Operating Envelope

The primary driver for selection is the process consequence of flow continuation versus flow stoppage. Engineers must evaluate the system state under “loss of utility” conditions.

  • Criticality of Flow: In a cooling water loop for an ozone generator, loss of flow could lead to equipment damage; therefore, a Fail-Open position is typically required. Conversely, in a chemical feed system dosing sodium hypochlorite, a Fail-Open condition could lead to a toxic release or overdosing violation, mandating a Fail-Closed position.
  • System Pressure Dynamics: Consider the pressure differential ($dP$) across the valve during the failure event. Spring-return actuators must be sized to close (or open) the valve against the maximum shutoff pressure, not just the operating pressure. If a valve is specified as Fail-Closed, the spring torque must be sufficient to seat the valve fully against the line pressure without hydraulic assist.
  • Variable Flow Regimes: For valves operating in highly variable flow regimes (e.g., stormwater management), a Fail-Last position might be preferable to prevent sudden hydraulic shocks, provided the system can tolerate the flow rate existing at the moment of failure for a short duration until manual intervention occurs.

Materials & Compatibility

The method of achieving a fail position influences material selection, particularly regarding the actuator housing and internal components.

  • Spring Fatigue and Corrosion: Mechanical springs are the most reliable method for achieving fail positions. However, in wastewater environments rich in hydrogen sulfide ($H_2S$), spring housings must be properly sealed. If the spring chamber is vented to the atmosphere, the ingress of corrosive gases can lead to stress corrosion cracking of the spring over time.
  • Battery and Capacitor Constraints: For electric actuators where mechanical springs are impractical (e.g., large multi-turn gate valves), battery backup systems or supercapacitors are used to drive the valve to a fail position. These components have temperature sensitivities. If the valve is installed outdoors in a northern climate, lithium-ion battery performance degrades significantly below freezing, potentially preventing the valve from reaching its safety position.

Hydraulics & Process Performance

The speed at which a valve travels to its fail position is often more critical than the position itself. This is where hydraulic modeling becomes essential.

  • Water Hammer Mitigation: A Fail-Closed isolation valve on a pump discharge can induce severe water hammer if it closes faster than the pipeline's critical period, the pressure wave round-trip time $2L/a$ (where $L$ is the pipe length and $a$ is the wave speed). Engineers must specify not just “Fail-Closed” but “Fail-Closed with adjustable closure speed,” or use a two-stage closure strategy if the hardware supports it (e.g., hydraulic damping).
  • Pump Dead-Heading: If a control valve downstream of a positive displacement pump fails closed while the pump is still coasting down (or has backup power), the pressure spike can burst piping. In such applications, a Fail-Open position or a pressure relief bypass is mandatory.
  • Cavitation Risks: If a valve fails to a partially open position (common in poorly maintained spring systems), it may operate continuously in a high-velocity cavitation zone, destroying the trim and valve body within hours.
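
The closure-speed concern above can be put into rough numbers. The following is a minimal sketch, not a substitute for a transient model: it computes the critical period $2L/a$ and, as an assumption not stated in the article, uses the Joukowsky relation ($\Delta P = \rho a \Delta v$) as an upper-bound estimate of the surge from a closure faster than that period. The pipe length, wave speed, and velocity are hypothetical values chosen for illustration.

```python
def critical_period_s(pipe_length_m: float, wave_speed_m_s: float) -> float:
    """Round-trip travel time of the pressure wave, 2L/a, in seconds."""
    return 2.0 * pipe_length_m / wave_speed_m_s


def joukowsky_surge_kpa(velocity_change_m_s: float,
                        wave_speed_m_s: float,
                        density_kg_m3: float = 1000.0) -> float:
    """Upper-bound pressure rise in kPa for a closure faster than 2L/a."""
    return density_kg_m3 * wave_speed_m_s * velocity_change_m_s / 1000.0


def closure_is_rapid(closure_time_s: float,
                     pipe_length_m: float,
                     wave_speed_m_s: float) -> bool:
    """True if the fail stroke finishes within the critical period."""
    return closure_time_s <= critical_period_s(pipe_length_m, wave_speed_m_s)


# Hypothetical force main: 1500 m long, 1000 m/s wave speed, 2 m/s velocity.
LENGTH_M, WAVE_SPEED_M_S, VELOCITY_M_S = 1500.0, 1000.0, 2.0

print(critical_period_s(LENGTH_M, WAVE_SPEED_M_S))        # 3.0 s
print(joukowsky_surge_kpa(VELOCITY_M_S, WAVE_SPEED_M_S))  # 2000.0 kPa
print(closure_is_rapid(0.5, LENGTH_M, WAVE_SPEED_M_S))    # True
print(closure_is_rapid(15.0, LENGTH_M, WAVE_SPEED_M_S))   # False
```

A fail stroke slower than $2L/a$ reduces the surge but does not eliminate it, so a result of `False` here is a screening result only.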

Installation Environment & Constructability

The physical size and weight of actuators designed for specific fail positions can impact facility design.

  • Actuator Size: Single-acting pneumatic actuators (spring-return) are significantly larger and heavier than double-acting actuators because the air cylinder must overcome both the valve torque and the opposing spring force during normal operation, while the spring alone must be strong enough to drive the valve on failure. This requires increased clearance around the piping and stronger pipe supports to handle the eccentric load.
  • Power Dependency: Pneumatic systems are inherently better suited for “Fail-Safe” operations because a simple air tank can provide reserve energy, or the spring can provide mechanical energy. Electric actuators require battery packs, which add bulk and require accessibility for regular replacement.

Reliability, Redundancy & Failure Modes

Engineers must analyze the reliability of the “energy storage” mechanism that powers the fail transition.

  • Mechanical vs. Electrical Energy: A compressed mechanical spring has a very high Mean Time Between Failures (MTBF) and is considered a passive failure protection. A battery backup system is active; it relies on charge circuits, battery health, and switching logic. For ultra-critical applications (e.g., influent isolation to prevent dry-weather overflow), mechanical spring return is the preferred engineering standard.
  • Fail-Last (Fail-in-Place): This is the default failure mode for many electric motor actuators without battery backup. While acceptable for non-critical modulating loops (like an aeration basin air control valve), it is unacceptable for safety shutoff valves. Engineers must explicitly state if Fail-Last is not permitted.

Controls & Automation Interfaces

The control system must be aware that a device has entered its fail position.

  • Loss of Signal vs. Loss of Power: Specifications should distinguish between loss of the 4-20mA control signal and loss of motive power (air/electricity).
    • Loss of Signal: The valve can be programmed to go to a specific percentage (e.g., 50%), Fail-Open, Fail-Closed, or Fail-Last using the actuator’s onboard intelligence.
    • Loss of Power: The valve is physically forced to its mechanical fail position (if equipped).
  • Feedback Loops: The position feedback signal (4-20mA out) should ideally be powered by a separate source or the backup battery so SCADA can confirm the valve actually reached its fail position.
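
The distinction between loss of signal and loss of power can be sketched as a small decision function. This is an illustrative sketch only: the names `FailConfig` and `resolve_action` are hypothetical, the 3.6 mA threshold for a broken 4-20mA loop is an assumed value, and the "go to a specific percentage" option is omitted for brevity.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    FOLLOW_SIGNAL = "follow control signal"
    FAIL_OPEN = "fail open"
    FAIL_CLOSED = "fail closed"
    FAIL_LAST = "fail last"


@dataclass(frozen=True)
class FailConfig:
    on_signal_loss: Action      # configured in the actuator electronics
    on_power_loss: Action       # fixed by the spring or stored-energy hardware
    signal_low_ma: float = 3.6  # assumed threshold for a broken 4-20 mA loop


def resolve_action(power_ok: bool, signal_ma: float, cfg: FailConfig) -> Action:
    """Pick the valve response for the current power and signal state."""
    # Loss of motive power wins: the electronics cannot act without it.
    if not power_ok:
        return cfg.on_power_loss
    # Power is present but the control signal has dropped out.
    if signal_ma < cfg.signal_low_ma:
        return cfg.on_signal_loss
    return Action.FOLLOW_SIGNAL


cfg = FailConfig(on_signal_loss=Action.FAIL_LAST,
                 on_power_loss=Action.FAIL_CLOSED)
print(resolve_action(True, 12.0, cfg).value)   # follow control signal
print(resolve_action(True, 0.0, cfg).value)    # fail last
print(resolve_action(False, 12.0, cfg).value)  # fail closed
```

Keeping the two responses as separate fields mirrors the point that they should be specified independently rather than folded into one "fail position" entry.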

Maintainability, Safety & Access

The choice of fail position mechanism dictates the long-term O&M burden.

  • Stored Energy Hazards: Spring-return actuators contain powerful compressed springs. Maintenance personnel must be trained on the specific hazards of disassembling these units. Specifications should require “Captured Spring” designs where the spring cartridge is a sealed unit, preventing accidental release during maintenance.
  • Battery Maintenance: If electric fail-safe actuators are specified, a preventive maintenance (PM) schedule for battery replacement (typically every 2-3 years) must be established. The plant design should include easy access to these actuators; they should not be located 20 feet in the air without a catwalk.

Lifecycle Cost Drivers

While spring-return pneumatic actuators often have a higher initial installation cost due to the need for a compressed air system (compressors, dryers, tubing), their lifecycle cost is often lower than electric fail-safe actuators in large plants due to extreme longevity and repairability. Electric fail-safe actuators have higher OPEX related to battery replacements and electronic board failures.

Comparison Tables

The following tables provide a structured comparison of fail-position technologies and an application matrix to assist engineers in matching the failure logic to the process requirement.

Table 1: Fail-Safe Technology Comparison

Comparison of Actuator Technologies for Achieving Fail Positions

| Technology Type | Fail Mechanism | Best-Fit Applications | Limitations & Considerations | Typical Maintenance |
|---|---|---|---|---|
| Pneumatic Spring-Return | Mechanical Spring | Rapid-response isolation, modulation, hazardous areas (explosion-proof by design). | Requires instrument air system. Larger physical footprint. Spring fatigue over very long cycles. | Low. Diaphragm/seal replacement every 5-7 years. Air quality maintenance is critical. |
| Electric with Battery Backup | DC Motor + Battery | Remote sites without air supply. Large multi-turn valves (gates/sluices). | Battery degradation in temperature extremes. “Sleep” failures where battery dies undetected. Slower operation. | High. Battery replacement every 2-3 years. Regular function testing required. |
| Electric with Supercapacitor | Stored Electrical Charge | Small quarter-turn valves, precise control loops. | Limited torque capability compared to batteries/springs. Short hold-up time. | Moderate. Capacitors last longer than batteries (10+ years) but require electronic monitoring. |
| Electro-Hydraulic | Hydraulic Accumulator / Spring | Very high torque/thrust large valves. Critical pump check valves. | High complexity and CAPEX. Potential for hydraulic oil leaks. | Moderate-High. Oil changes, seal checks, accumulator nitrogen charge verification. |
| Fail-Last (Lock-in-Place) | Self-Locking Gear / Brake | Aeration air control, flow splitting, non-critical modulation. | Not a safety position. Does not protect against upstream/downstream catastrophe. | Low. Standard actuator maintenance. |

Table 2: Application Fit Matrix for Fail Positions

Recommended Fail Positions by Water/Wastewater Application

| Application / Location | Recommended Fail Position | Primary Engineering Logic | Safety/Risk Implication |
|---|---|---|---|
| Influent Pump Station Isolation | Fail-Open (Typically) | Prevents backup into collection system or basement flooding during power loss. | Risk: Plant flooding. Note: Some designs prefer Fail-Closed to protect downstream biological processes, utilizing overflow weirs instead. |
| RAS/WAS Flow Control | Fail-Last or Fail-Open | Maintaining biomass circulation is critical. Stopping RAS can kill the biology in the reactor. | Fail-Closed risks clarifier blanket rising and solids washout. |
| Chemical Feed (Chlorine/Acid) | Fail-Closed | Prevent toxic chemical release and overdosing into the environment. | Safety Critical. Must prevent gravity feed or siphoning when pump stops. |
| Filter Backwash Supply | Fail-Closed | Prevents draining the clearwell or backwash tank uncontrolled. | Preserves treated water inventory for fire protection or restart. |
| Aeration Blower Discharge (Surge) | Fail-Open | Protects blower from surge conditions if the downstream header is blocked. | Equipment Protection. Prevents catastrophic blower damage. |
| Final Effluent Discharge | Fail-Open | Ensures treated water can leave the plant; prevents hydraulic backup. | Regulatory compliance usually prioritizes hydraulic throughput over potential short-term quality dips during emergencies. |

Engineer & Operator Field Notes

Specifying the fail position is only step one. Ensuring it works in the field requires rigorous commissioning and maintenance protocols. Here are insights from the field regarding the lifecycle of safety actuation.

Commissioning & Acceptance Testing

The Factory Acceptance Test (FAT) and Site Acceptance Test (SAT) are the best opportunities to verify fail positions safely, before the process depends on them.

  • The “Pull the Plug” Test: Do not rely on software simulation to test fail positions. Physically disconnect the power to the electric actuator or the air supply to the pneumatic actuator. The valve must move to its specified position immediately.
  • Timing Verification: Measure the time it takes to travel from 100% open to the fail position.
    • Pro Tip: If a pneumatic Fail-Closed valve slams shut in 0.5 seconds on an 18-inch line, you have a water hammer generator. Adjust the exhaust restrictors (speed controls) on the solenoid valve to slow the spring return stroke to a safe duration (e.g., 10-20 seconds).
  • Reset Logic: Verify how the system recovers when power/air is restored. Does the valve stay in the fail position until acknowledged, or does it immediately hunt for the control signal? Automatic resetting can be dangerous if the process conditions haven’t stabilized.

PRO TIP: Inspect the Solenoid Vent

For pneumatic spring-return actuators, the solenoid valve venting mechanism is the critical path. Ensure the solenoid vent port is fitted with a bug screen or a breather/muffler. Mud daubers and insects love these ports; a clogged vent prevents the air from escaping, meaning the spring cannot close the valve during a failure event.
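
The reset-logic question raised in the commissioning list above (hold the fail position until acknowledged, or resume immediately) can be illustrated with a simple latch. This is a sketch of one possible policy, assuming an operator acknowledgement input; the class name `FailLatch` is hypothetical and not taken from any PLC or actuator product.

```python
class FailLatch:
    """Holds a valve in its fail position until an operator acknowledges."""

    def __init__(self) -> None:
        self.latched = False

    def update(self, utility_ok: bool, operator_ack: bool) -> bool:
        """Return True if the valve may follow the control signal again."""
        if not utility_ok:
            # Power or air is lost: latch the failure.
            self.latched = True
        elif self.latched and operator_ack:
            # Utilities are back AND the operator has acknowledged.
            self.latched = False
        return utility_ok and not self.latched


latch = FailLatch()
print(latch.update(utility_ok=True, operator_ack=False))   # True
print(latch.update(utility_ok=False, operator_ack=False))  # False
print(latch.update(utility_ok=True, operator_ack=False))   # False
print(latch.update(utility_ok=True, operator_ack=True))    # True
```

The third call shows the point of the latch: restored power alone does not release the valve, which avoids an automatic reset into unstable process conditions.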

Common Specification Mistakes

Engineers often introduce ambiguity into contract documents that leads to change orders or operational risks.

  • Ambiguous “Fail Safe” Terminology: Using the term “Fail Safe” without defining it. To a process engineer, “Safe” might mean the valve opens to relieve pressure. To a containment engineer, “Safe” might mean the valve closes to stop a spill. Always specify Fail-Open (FO) or Fail-Closed (FC) explicitly.
  • Ignoring Air Supply Failure: In pneumatic systems, specifications sometimes address power loss but forget air compressor failure. A “Fail-Last” pneumatic valve (double-acting) will drift if the air supply pressure drops gradually. Use an air lock-up valve (pilot-operated check valve) if position retention is critical on air loss.
  • Oversizing Springs: Specifying an excessive safety factor for the spring (e.g., closing against 2x design pressure) results in massive actuators that may not fit in the pipe gallery. Stick to realistic worst-case differential pressures.

O&M Burden & Strategy

Operators must treat the fail mechanism as a distinct component requiring maintenance.

  • Partial Stroke Testing: For valves that remain static for months (e.g., emergency isolation valves), the probability of the valve sticking (stiction) is high. Implement “Partial Stroke Testing” (PST) in the SCADA logic. This automatically moves the valve 10% and back to prove the spring and actuator are functional without disrupting the process.
  • Spring Inspection: Listen to the spring during operation. A “crunching” sound indicates internal corrosion or broken coils.
  • Battery Management: For electric fail-safe units, label every actuator with the “Battery Install Date” and “Next Replacement Date” clearly on the housing. Don’t rely solely on the BMS (Battery Management System) alarm.
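
As a rough illustration of how a Partial Stroke Test result might be evaluated, the sketch below checks logged position feedback against a 10% stroke. The tolerance, time limit, and function name `pst_passed` are assumptions for illustration, and the samples are assumed to be in time order.

```python
from typing import Sequence, Tuple


def pst_passed(samples: Sequence[Tuple[float, float]],
               start_pct: float = 100.0,
               stroke_pct: float = 10.0,
               tolerance_pct: float = 2.0,
               max_time_s: float = 15.0) -> bool:
    """Evaluate a partial stroke test from (time_s, position_pct) samples.

    Passes if the valve reaches the partial-stroke target within
    max_time_s and afterwards returns to its starting position.
    """
    target = start_pct - stroke_pct
    reached = [t for t, pos in samples if pos <= target + tolerance_pct]
    if not reached or reached[0] > max_time_s:
        return False  # valve stuck, or too slow to reach the target
    t_reached = reached[0]
    returned = [t for t, pos in samples
                if t > t_reached and pos >= start_pct - tolerance_pct]
    return bool(returned)


healthy = [(0, 100.0), (2, 96.0), (4, 91.0), (6, 95.0), (8, 100.0)]
stuck = [(0, 100.0), (5, 99.0), (10, 99.0), (15, 99.0)]
print(pst_passed(healthy))  # True
print(pst_passed(stuck))    # False
```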

Troubleshooting Guide

  • Symptom: Valve fails to close fully on loss of air.
    Root Cause: Spring fatigue, excessive packing friction, or debris in the valve seat.
    Check: Disconnect the actuator from the valve stem. Does the actuator stroke fully? If yes, the problem is the valve body (debris/corrosion). If no, the spring is compromised or the vent is plugged.
  • Symptom: Electric valve oscillates (hunts) upon restoration of power.
    Root Cause: The PID loop in the PLC winds up during the power outage (integral windup).
    Fix: Program the PID loop to freeze or reset the integral term when the “Drive Fault” or “Power Loss” status is active.
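
The integral windup fix above can be sketched with a simple PI loop that freezes its integral term while a fault flag is active. This is a minimal illustration, not PLC code: the class name, gains, and output limits are hypothetical, and a real implementation would use the controller vendor's own PID block options.

```python
class PIController:
    """PI loop whose integral term is frozen while the drive reports a fault."""

    def __init__(self, kp: float, ki: float,
                 out_min: float = 0.0, out_max: float = 100.0) -> None:
        self.kp = kp
        self.ki = ki
        self.out_min = out_min
        self.out_max = out_max
        self.integral = 0.0

    def _clamp(self, value: float) -> float:
        return max(self.out_min, min(self.out_max, value))

    def update(self, setpoint: float, measured: float, dt_s: float,
               drive_fault: bool = False) -> float:
        error = setpoint - measured
        if not drive_fault:
            # Integrate only while the actuator can actually respond.
            self.integral = self._clamp(self.integral + self.ki * error * dt_s)
        return self._clamp(self.kp * error + self.integral)


frozen = PIController(kp=1.0, ki=0.5)
naive = PIController(kp=1.0, ki=0.5)
for _ in range(60):  # 60 s outage with the process 10 units below setpoint
    frozen.update(50.0, 40.0, 1.0, drive_fault=True)
    naive.update(50.0, 40.0, 1.0, drive_fault=False)

print(frozen.integral)  # 0.0   (no windup during the outage)
print(naive.integral)   # 100.0 (wound up to the output limit)
```

When power returns, the frozen loop resumes from its pre-outage state, while the naive loop starts at its output limit and is likely to overshoot before settling.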

Design Details / Calculations

Correctly sizing the actuator for a specific fail position requires calculating torque at critical points in the stroke.

Sizing Logic & Methodology

Sizing a spring-return actuator is more complex than a double-acting one because the spring is at its weakest when it is most extended (the end of the fail stroke).

  1. Determine Valve Torque ($T_v$): This includes seating torque, running torque, and breakaway torque. It is influenced by the differential pressure and fluid media (sludge vs. clean water).
  2. Determine Spring Torque ($T_s$):
    • Start of Spring ($T_{start}$): Torque generated when the spring is fully compressed (beginning of the fail stroke). This is the strongest point.
    • End of Spring ($T_{end}$): Torque generated when the spring is extended (end of the fail stroke). This is the weakest point.
  3. Determine Air Torque ($T_a$): The torque the air cylinder produces to compress the spring and move the valve.
  4. The Critical Rule:
    • For Fail-Closed: $T_{end}$ must be greater than the Valve Seating Torque. If the spring is too weak at the end of the stroke, the valve won’t seal tight.
    • For Fail-Open: $T_{end}$ must be greater than the Valve Breakout Torque (if flow tends to close the valve) or Running Torque.

“A common error is sizing based on the ‘average’ spring torque. You must size based on the ‘End of Spring’ torque to ensure the valve completes its safety function.”
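
The end-of-spring rule can be checked with a few lines of arithmetic. The sketch below covers the Fail-Closed case only and applies the 1.25 safety factor quoted in the specification checklist; the torque values are hypothetical and the function name `fail_stroke_ok` is illustrative.

```python
def fail_stroke_ok(spring_torque_nm: float,
                   required_torque_nm: float,
                   safety_factor: float = 1.25) -> bool:
    """True if the spring torque covers the required torque with margin."""
    return spring_torque_nm >= safety_factor * required_torque_nm


# Hypothetical Fail-Closed actuator and valve, torques in N·m.
T_START = 900.0   # spring fully compressed (start of fail stroke)
T_END = 500.0     # spring extended (end of fail stroke)
SEATING = 480.0   # valve seating torque at maximum differential pressure

t_average = (T_START + T_END) / 2.0  # 700.0, the misleading "average"

print(fail_stroke_ok(t_average, SEATING))  # True  (700 >= 600, wrong basis)
print(fail_stroke_ok(T_END, SEATING))      # False (500 <  600, correct basis)
```

The same actuator passes on average torque and fails on end-of-spring torque, which is the sizing error the quote warns about.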

Specification Checklist

When writing the equipment specification (Division 40 or 43), ensure these items are explicitly defined:

  • Failure Condition: Define action upon Loss of Electrical Power, Loss of Control Signal, and Loss of Supply Media (Air/Hydraulic) independently.
  • Manual Override: Specify if a manual handwheel is required. Note: Handwheels on spring-return actuators must be declutchable or designed so the handwheel doesn’t spin dangerously when the spring fires.
  • Speed Control: “Actuator shall be equipped with adjustable flow control valves to regulate the speed of the failure stroke to prevent water hammer.”
  • Safety Factor: “Actuator sizing shall include a minimum 1.25 safety factor above maximum valve seating torque at the End-of-Spring position.”
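
One way to keep this checklist from being skipped is to record each valve's fail requirements as structured data and check them for gaps. The sketch below is illustrative only: the `FailSpec` fields, the tag `FCV-101`, and the restriction of responses to FO/FC/FL (a simplification that leaves out "go to a set percentage") are all assumptions.

```python
from dataclasses import dataclass
from typing import List

EXPLICIT_POSITIONS = {"FO", "FC", "FL"}


@dataclass(frozen=True)
class FailSpec:
    tag: str
    on_power_loss: str
    on_signal_loss: str
    on_supply_media_loss: str
    manual_override_defined: bool
    fail_stroke_speed_control: bool
    end_of_spring_safety_factor: float

    def problems(self) -> List[str]:
        """List checklist items that are missing or ambiguous."""
        issues = []
        for field_name in ("on_power_loss", "on_signal_loss",
                           "on_supply_media_loss"):
            value = getattr(self, field_name)
            if value not in EXPLICIT_POSITIONS:
                issues.append(f"{field_name}: '{value}' is not FO, FC, or FL")
        if not self.manual_override_defined:
            issues.append("manual override requirement not stated")
        if not self.fail_stroke_speed_control:
            issues.append("no speed control on the failure stroke")
        if self.end_of_spring_safety_factor < 1.25:
            issues.append("end-of-spring safety factor below 1.25")
        return issues


vague = FailSpec("FCV-101", "Fail Safe", "FL", "FC", True, False, 1.1)
for issue in vague.problems():
    print(issue)
```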

Standards & Compliance

  • ISA-5.1: Instrumentation Symbols and Identification. Defines the standard markings for P&IDs:
    • FO: Fail Open
    • FC: Fail Closed
    • FL/FIP: Fail Last / Fail in Place
    • FI: Fail Indeterminate (Avoid using this)
  • AWWA C541 / C542: Standards for hydraulic and pneumatic cylinder and vane-type actuators (C541) and electric motor actuators (C542) for valves and slide gates. These standards provide baseline testing and cycle life requirements.
  • NEC (NFPA 70): Wiring requirements for fail-safe circuits, particularly regarding separation of power and control in critical fail-safe loops.

FAQ Section

What is the difference between Fail-Safe and Fail-Secure?

While often used interchangeably, “Fail-Safe” generally implies the system moves to a condition that protects personnel and equipment (e.g., opening a relief valve). “Fail-Secure” typically implies the system moves to a state that prevents loss of product or unauthorized access (e.g., closing a tank outlet). In water/wastewater, “Fail-Safe” is the dominant term, but engineers should specifically use Fail-Open (FO) or Fail-Closed (FC) to avoid confusion.

Can I use a double-acting pneumatic actuator for a fail position?

Yes, but it requires an external air accumulation tank and a solenoid valve arrangement that routes the stored air to the appropriate side of the piston upon signal loss. While feasible, this adds complexity and potential leak points compared to a passive spring-return design. It is generally reserved for very large valves where springs are impractical.

How do I select the fail position for a 3-way diverting valve?

For 3-way valves, “Fail-Open” or “Fail-Closed” is ambiguous. You must specify the failure path. For example, “Fail to Port A” (Recycle) or “Fail to Port B” (Discharge). Common practice in sludge lines is to Fail to Recycle to prevent discharging untreated sludge, whereas pump protection bypasses should Fail to Bypass.

What is the typical lifespan of a spring-return actuator vs. battery backup?

A quality pneumatic spring-return actuator can last 15-25 years with seal replacements every 5-7 years. The spring itself rarely fails if the housing is sealed. Electric actuators with battery backup typically require battery replacement every 2-3 years, and the electronic charging boards have a typical service life of 10-15 years. As a result, the mechanical spring typically offers a lower total cost of ownership (TCO) over 20 years.

Why is Fail-Last (Fail-in-Place) risky for control valves?

Fail-Last is risky because it assumes the process conditions at the moment of failure are stable. If a flow control valve fails in place at 10% open during a low-flow period, and then a storm event causes flow to increase, that restriction becomes a bottleneck causing upstream flooding. Fail-Last is generally acceptable only for large open basins (like aeration) where level changes are slow.

Does a Fail-Open valve require more torque than a Fail-Closed valve?

Not necessarily, but the sizing logic differs. It depends on the valve type and flow direction. For a butterfly valve, dynamic torque caused by fluid velocity tends to close the valve. Therefore, a Fail-Open spring must be strong enough to overcome this hydraulic force (dynamic torque) to push the valve open against the flow. Sizing must account for the specific hydrodynamic characteristics of the valve trim.

Conclusion

Key Takeaways for Engineers

  • Safety Over Process: When selecting a fail position, prioritize personnel safety and equipment protection (preventing surges/bursts) over process efficiency.
  • Physics Wins: Mechanical springs are inherently more reliable than electronic battery backups. For critical isolation, prefer pneumatic spring-return.
  • End-of-Spring Torque: Always size spring-return actuators based on the torque available at the end of the spring stroke, not the beginning.
  • Speed Kills: A Fail-Closed valve that slams shut instantly causes water hammer. Specify adjustable speed controls on the failure stroke.
  • Context Matters: A valve is part of a system. Analyze the impact of the fail position on the pump, the pipe rating, and the downstream biological process.

Specifying fail positions is one of the most critical responsibilities of the design engineer. It requires looking beyond the “normal” operating state shown on the P&ID and envisioning the worst-case scenario. The correct selection—Fail-Open, Fail-Closed, or Fail-Last—acts as the final line of defense against flooding, biological upsets, and catastrophic pipe failures.

For municipal and industrial applications, the trend is moving toward verifiable safety. This means favoring mechanical spring-return systems for critical isolation, integrating position feedback that is independent of main power, and conducting rigorous physical loss-of-power and loss-of-air tests during commissioning. By treating the fail position as a distinct, engineered operating mode rather than a default checkbox, utilities can ensure their systems remain resilient even when the lights go out.



source https://www.waterandwastewater.com/fail-positions/
