The problem of system objectives and allocation of function in computer control of a basic
oxygen furnace has been discussed by Crawley (1968). He gives as an example of a
function which is more appropriately allocated to man the interpretation of the noise signal
given by a bomb thermocouple. The point is also made, however, that the optimal allocation
of function changes continuously as technology progresses.
The difficulty of automating certain functions is further illustrated by the work of
Ketteringham, O’Brien and Cole (Ketteringham, O’Brien and Cole, 1970; Ketteringham and
O’Brien, 1974) on the scheduling of soaking pits using a man-computer interactive system.
Problems of providing all the necessary information to the computer and of complexity in the
decision-making make it difficult to automate this function, but it is possible to provide
computer facilities which, by storage of large amounts of information, the use of predictive
models and the provision of appropriate displays, greatly assist the operator to make
decisions. The work involved simulation using a realistic interface and actual steel works
schedulers. The problem is very similar to that of air traffic control, which has been
extensively investigated in human factors.
The importance of organizational and social factors is shown by the work of Engelstad
(1970), who investigated these in a paper mill. The work revealed that individuals tended to
operate too much in isolation and to have goals which were not necessarily optimal as far
as the system was concerned. It sought to encourage communication by treating the control
room as an information and control centre, which all concerned should use, and by designing
jobs to give greater variety, responsibility, learning opportunity and wholeness. Other
workers too have commented on the losses which can occur in continuous flow processes,
if there are poor relationships between the men controlling the process at different points.
Other studies have been concerned with task analysis, fault administration, displays,
selection and training, and human error. These topics are considered below.
14.6 Allocation of Function
As already emphasized, human factors has at least as important a role to play in matters of
system design such as allocation of function as in those of detailed design. The classic
approach to allocation of function is to list the functions which machines perform well and
those which men perform well and to use this as a guide. One of the original lists was
compiled by Fitts (1962) and such a list is often referred to as a ‘Fitts’ list’. However, this
approach needs some qualification. As Fitts pointed out later, the functions which should be
allocated to man are not so much those which he is good at as those which it is best from
the system point of view that he perform, which is slightly different. The question of
motivation is also important. De Jong and Koster (1971) have given a similar type of list
showing the functions which man is motivated to perform. Further accounts of allocation of
function are given by H.E. Price (1985) and Kantowitz and Sorkin (1987). Particularly
important in relation to loss prevention are functions concerned with fault administration,
fault diagnosis, plant shut-down and malfunction detection.
14.7 Information Display
Once the task has been defined, it is possible to consider the design of displays. Information
display is an important problem, which is intensified by the increasing density of information
in modern control rooms. The traditional display is the conventional control panel. Computer
graphics now present the engineer with a more versatile display facility, offering scope for
all kinds of display for the operator, but it is probably fair to say that he is somewhat uncertain
what to do with it.
The first thing which should be emphasized is that a display is only a means to an end, the
end being improved performance by the operator in executing some control function. The
proper design of this function in its human factors aspects is more important than the details
of the display itself. Some types of display which may be provided are listed in Table 14.5.
The provision of displays which the operator deliberately samples with a specific object in
view is only part of the problem. It is important also for the display system to cater both for
his characteristic of acquiring information ‘at a glance’ and for his requirement for information
redundancy. There is a need, therefore, for the development of displays which allow the
operator to make a quick and effortless survey of the state of the system. As already
described, the operator updates his knowledge of system state by predicting forward, using
a mental model of the process and sampling key readings to check that he is on the right
lines. He needs a survey display to enable him to do this.
This need still exists even where other facilities are provided. Facilities such as alarm
systems are based on a ‘management by exception’ approach, which is essential if the large
amounts of information are to be handled. But when the exceptional condition has been
detected, the operator must deal with it and for this he needs knowledge of the state of the
process, which a survey display provides. A display of system state also allows the operator
to use his ability to recognize patterns. This aspect is considered below.
Table 14.5 Some displays for the process operator
Displays of flow and mimic diagrams
Displays of current measurements, other variables, statuses (other variables include
indirect measurements, valve positions, etc.)
Displays of trends of measurements, other variables, statuses
Displays of control loop parameters
Displays of alarms
Displays of reduced data (e.g. histograms, quality control charts, statistical parameters)
Displays of system state (e.g. mimic diagrams, “status array”, “surface” and polar plots)
Displays for manual control (e.g. predictive displays)
Displays for alarm analysis
Displays for sequential control
Displays for scheduling and game-playing
Displays for valve sequencing
Displays for protective system checking
Displays of maloperation
Displays for malfunction detection
Command displays
Figure 14.4 Control panel in a chemical plant (courtesy of Kent Instruments Ltd)
14.7.1 Regular instrumentation
It will be apparent from the foregoing that the conventional control panel has certain virtues.
The panel shown in Figure 14.4 is typical of a modern control panel. The conventional panel
does constitute a survey display, in which the instruments have spatial coding, from which
the operator can obtain information at a glance and on which he can recognize patterns.
These are solid advantages not to be discarded lightly. This is only true, however, if the
density of information in the panel is not allowed to become too great. The advantages are
very largely lost if it becomes necessary to use dense blocks of instruments which are
difficult to distinguish individually.
An important individual display is the chart recorder. A trend record has many advantages
over an instantaneous display. As the work of Crossman, Cooke and Beishon (1964) shows,
it assists the operator to learn the signal characteristics and facilitates his information
sampling. Both Attwood (1970) and B. West and Clark (1974) found that recorders are useful
to the operator in making coarse adjustments of operating point, while the latter authors also
noted the operator’s use of recorders in handling fault conditions. Anyakora and Lees
(1972a) have pointed out the value of recorders in enabling the operator to learn the signal
characteristics and so recognize instrument malfunctions.
14.7.2 Computer consoles
The computer console presents a marked contrast to conventional instrumentation. This is
illustrated with particular starkness in Figure 14.5(a), which shows the console of the original
ICI direct digital control (DDC) computer. Some of the features of ergonomic importance in
this panel are: (1) specific action is required to obtain a display; (2) there is no spatial coding
and the coding of the information required has to be remembered or looked up; (3) only one
variable is displayed at a time; (4) only the instantaneous value of the variable is displayed;
and (5) the presentation is digital rather than analogue.
This is a revolutionary change in the operator’s interface. While there is, among engineers,
a general awareness that the change is significant, its detailed human factors implications
are not so well appreciated. The subsequent refinement of process computer consoles and
the introduction of computer graphics have somewhat mitigated these features. A more
modern computer console is shown in Figure 14.5(b). Moreover, the facilities which the
computer offers in functions such as alarm monitoring or sequential control are powerful new
aids to the operator.
It remains true, however, that the transition from the conventional panel to the computer
console involves some serious human factors losses, particularly in information display. This
does not necessarily mean that the change should not be made. Conventional panels are
expensive and lose much of their advantage if the information density becomes too high.
Computers offer some very worthwhile additional facilities. But the change should at least
be made in the full awareness of its implications and with every effort to restore in the new
system the characteristic advantages of the old.
Figure 14.5(a) Control panel of the direct digital control (DDC) computer on ICI's ammonia
soda plant at Fleetwood, 1961 (courtesy of Ferranti Ltd and Imperial Chemical Industries)
Figure 14.5(b) Control panel and display of Foxboro Intelligent Automation Computer
System (Foxboro Company Ltd)
14.8 Alarm Systems
As stated in the previous chapter, the progress made in the automation of processes under
normal conditions has focused attention on the monitoring and handling of abnormal or fault
conditions. It is the function of the control system to prevent, if possible, the development of
conditions which will lead to plant shut-down, but to carry out the shut-down if necessary.
The responsibility for averting shut-down conditions falls largely to the operator. The
principal automatic aid provided to assist him is the alarm system. Alarm systems are an
extremely important but curiously neglected and often unsatisfactory aspect of process
An alarm system is a normal feature of conventional control systems. If a process variable
exceeds specified limits or if an equipment is not in a specified state, an alarm is signalled.
Both audible and visual signals are used. Accounts of alarm systems include those by E.
Edwards and Lees (1973), Andow and Lees (1974), Hanes (1978), Swain and Guttman
(1983), Schellekens (1984), Shaw (1985) and the Center for Chemical Process Safety
(CCPS) (1993/14). There have also been investigations of the operation of alarm systems,
as described below.
14.8.1 Basic alarm systems
The traditional equipment used for process alarms is a lightbox annunciator. This consists
of a fascia, or array, of separate small rectangular coloured glass panels, behind each of
which there is a lamp which lights up if the alarm is active. Each panel is inscribed with an
alarm message. The panels are colour coded, usually with red being assigned to the highest
priority. This visual display is complemented by a hooter. When a new alarm occurs, the
hooter sounds and the fascia light flashes until the operator acknowledges receipt by
pressing a button. The panel then remains lit until the alarm condition is cleared by operator
action or otherwise, as described below.
There are a number of variations on this basic system. An account of some of these is given
by Kortlandt and Kragt (1980a). One is a two-level hierarchical arrangement in which there
is a central fascia and a number of local fascias, the light on the former indicating the number
of the local fascia on which the alarm has occurred. These authors also describe hierarchical
systems with additional levels. Another arrangement is a mimic panel with alarms located at
the relevant points on the flow diagram.
In computer-controlled systems alarms are generally printed out on hard copy in the time
sequence in which they occur and are also displayed on the VDU. On the latter there are
numerous display options, some of which are detailed by Kortlandt and Kragt. One is the
use of a dedicated VDU in which the alarms come up in time sequence, as on the printer.
Another is a group display in which all the alarms on an item of equipment are shown with
the active alarm(s) highlighted. Another is a mimic display akin to the hardwired mimic panel.
The occurrence of an alarm is normally accompanied by an audible warning in the form of
the sounding of the hooter and a visual warning in the form of a flashing light. On a computer
system the sound of the printer serves as another form of audible signal. The operator
‘accepts’ the alarm by depressing the appropriate pushbutton.
Annunciator sequences are given in ISA Std S18.1: 1975 Annunciator Sequences and
Specifications, which recognizes three types: (1) automatic reset (A), (2) manual reset (M)
and (3) ringback (R). Automatic reset returns to the normal state automatically once the
alarm has been acknowledged and the process variable has returned to its normal state.
Manual reset is similar except that return to the normal state requires operation of the
manual reset pushbutton. Ringback gives a warning, audible and/or visual, that the process
condition has returned to normal.
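The three sequence types can be sketched as a small state machine. This is a simplified illustration of the behaviour described in the paragraph above, not a rendering of the ISA standard itself; the class, state and method names are assumptions made for the example.

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"              # lamp off, hooter silent
    ALARM = "alarm"                # lamp flashing, hooter sounding
    ACKNOWLEDGED = "acknowledged"  # lamp steady, hooter silent
    RINGBACK = "ringback"          # condition cleared, operator notified


class Annunciator:
    """One alarm point following sequence 'A', 'M' or 'R' (simplified)."""

    def __init__(self, sequence="A"):
        if sequence not in ("A", "M", "R"):
            raise ValueError("sequence must be 'A', 'M' or 'R'")
        self.sequence = sequence
        self.state = State.NORMAL
        self.abnormal = False

    def update(self, abnormal):
        """Feed in the current process condition (True = out of limits)."""
        self.abnormal = abnormal
        if abnormal and self.state in (State.NORMAL, State.RINGBACK):
            self.state = State.ALARM
        else:
            self._settle()

    def acknowledge(self):
        """Operator presses the accept pushbutton."""
        if self.state is State.ALARM:
            self.state = State.ACKNOWLEDGED
            self._settle()

    def reset(self):
        """Operator presses the reset pushbutton (sequences M and R only)."""
        if self.sequence in ("M", "R") and not self.abnormal:
            if self.state in (State.ACKNOWLEDGED, State.RINGBACK):
                self.state = State.NORMAL

    def _settle(self):
        # Once acknowledged and back within limits: A clears by itself,
        # R signals the return to normal, M waits for the reset button.
        if self.state is State.ACKNOWLEDGED and not self.abnormal:
            if self.sequence == "A":
                self.state = State.NORMAL
            elif self.sequence == "R":
                self.state = State.RINGBACK
```

With sequence M the point stays lit after the variable has recovered until `reset()` is called, which is the only difference from sequence A in this sketch.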
There are alarms on the trips in the Safety Interlock System (SIS). Operation of a trip is
signalled by an alarm. The SIS alarm system, like the rest of the SIS, should be separate
from the basic process control system (BPCS).
In a conventional instrument system, the hardware used to generate an alarm consists of a
sensor, a logic module and the visual display. The sensor and the logic elements may be
separate or may be combined in an alarm switch. Such combined switches are referred to
as flow switches, level switches, and so on.
In a computer-based system there are two approaches which may be used. In one the
computer receives from the sensor an analogue signal to which the program applies logic
to generate the alarm. In the other the signal enters in digital form. The latter can provide a
cheaper system, but reduces the scope for detection of instrument malfunction.
Some basic requirements for an alarm are that it attract the attention of the operator, that
it be readily identifiable and that it indicate unambiguously the variable which has gone out
of limit.
14.8.2 Alarm system features
As just described, an alarm system is a normal feature of process computer control. The
scanning of large numbers of process variables for alarm conditions is a function very
suitable for a computer. Usually the alarm system represents a fairly straightforward
translation of a conventional system on to the computer. Specified limits of process variables
and states of equipment are scanned and resulting alarms are displayed on a typewriter,
necessarily in time order, and also on a VDU, where time order is one of a number of alarm
display options.
The process computer has enormous potential for the development of improved alarm
systems, but also brings with it the danger of excess. There is first the choice of variables
which are to be monitored. It is no longer necessary for these to be confined to the process
variables measured by the plant sensors. In addition, ‘indirect’ or ‘inferred’ measurements
calculated from one or more process measurements may be utilized. This considerably
increases the power of the alarm system.
Then there are a number of different types of alarm which may be used. These include
absolute alarms, relative, deviation or set-point alarms, instrument alarms and rate-of-change alarms, in which the alarm limits are, respectively, absolute values of the process
variable, absolute or proportional deviations of the process variable from the loop set-point,
zero or full-scale readings of the instrument and rate of change of the process variable.
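The four alarm types can be illustrated by a single check on one scanned variable. This is a minimal sketch; the `AlarmLimits` fields, the function name and the numerical limits in the usage note are hypothetical, and a real system would also need filtering and dead bands.

```python
from dataclasses import dataclass


@dataclass
class AlarmLimits:
    low: float            # absolute low limit
    high: float           # absolute high limit
    max_deviation: float  # allowed deviation from the loop set-point
    max_rate: float       # allowed rate of change, units per second
    zero: float           # instrument zero reading
    full_scale: float     # instrument full-scale reading


def evaluate_alarms(value, previous, dt, set_point, limits):
    """Return the names of the alarm types raised by one scan."""
    active = []
    if value < limits.low or value > limits.high:
        active.append("absolute")
    if abs(value - set_point) > limits.max_deviation:
        active.append("deviation")
    # A reading pinned at zero or full scale suggests an instrument fault.
    if value <= limits.zero or value >= limits.full_scale:
        active.append("instrument")
    if abs(value - previous) / dt > limits.max_rate:
        active.append("rate_of_change")
    return active
```

For example, with limits of 20 to 80, a permitted deviation of 5 and a permitted rate of 2 per second, a reading that moves from 50 to 56 in one second at a set-point of 50 raises the deviation and rate-of-change alarms but not the absolute alarm.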
The level at which the alarm limits are set is another important feature. Several sets of alarm
limits may be put on a single variable to give different degrees of alarms such as early
warning, action or danger alarms. The alarms so generated may be ordered and displayed
in various ways, particularly in respect of the importance of the variable and the degree of alarm.
The conventional alarm system, therefore, is severely limited by hardware considerations
and is relatively inflexible. The type of alarm is usually restricted to an absolute alarm. The
computer-based alarm system is potentially much more versatile.
The alarm system, however, is frequently one of the least satisfactory features of the control
system. The most common defect is that there are too many alarms and that they stay active
for too long. As a result, the system tends to become discredited with the operator, who
comes to disregard many of the alarm signals and may even disable the devices which
signal the alarms.
Computer-based alarm systems also have some faults peculiarly their own. It is fatally easy
with a computer to have a proliferation of types and degrees of alarm. Moreover, the most
easily implemented displays, such as time-ordered alarms on a typewriter or a VDU, are
inferior to conventional fascias in respect of aspects such as pattern recognition.
The main problem in alarm systems is the lack of a clear design philosophy. Ideally, the
alarm system should be designed on the basis of the information flow in the plant and the
alarm instrumentation selected and located to maximize the information available for control,
bearing in mind instrumentation reliability considerations. In fact, an alarm system is often a
collection of subsystems specified by designers of particular equipment with the addition of
some further alarms. An alarm system is an aid for the operator. An important but often
neglected question is therefore what action he is required to take when an alarm is signalled.
There are also specific problems which cause alarms to be numerous and persistent. One
is the confusion of alarms and statuses. A status merely indicates that an equipment is in a
particular state, e.g. that an agitator is not running. An alarm, by contrast, indicates that an
equipment is in a particular state and should be in a different one, e.g. that an agitator is not
running but should be. On most plants there are a number of statuses which need to be
displayed, but there is frequently no separate status display and so the alarm display has to
be used. As a result, if a whole section of plant is not in use, a complete block of alarms may
be permanently up on a display, even though these are not strictly alarm conditions. The
problem can be overcome by the use of separate types of display, e.g. yellow for alarms,
white for statuses, in conventional systems and by similar separate displays in computer
systems, but often this is not done.
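The distinction between an alarm and a status can be expressed as a comparison of the actual state with the required state. The sketch below uses the agitator example from the text; the data structure and the convention that `None` means "no requirement" (for instance, plant section not in use) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EquipmentSignal:
    name: str
    running: bool
    # Required state; None means nothing is required of the equipment,
    # e.g. because its section of the plant is not in use.
    required_running: Optional[bool] = None


def display_channel(signal):
    """Route a signal to the 'alarm' display or the 'status' display."""
    if signal.required_running is not None and signal.running != signal.required_running:
        return "alarm"
    return "status"
```

A stopped agitator in an idle section is then shown as a status, and only a stopped agitator that should be running appears on the alarm display.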
A somewhat similar problem is the relation of the alarms to the state of the process. A
process often has a number of different states and a signal which is an alarm in one state,
e.g. normal operation, is not a genuine alarm in another, as with, say, start-up or
maintenance. It may be desirable to suppress certain alarms during particular states. This
can be done relatively easily with a computer but not with a conventional system. It should
be added, however, that suppression of alarms needs to be done with care, each case being
considered on its merits.
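State-dependent suppression can be sketched as a lookup table from process state to the alarms suppressed in that state. The state names and alarm tags below are hypothetical; in line with the caution above, each entry is meant to be the result of a case-by-case decision rather than a blanket rule.

```python
# Alarms suppressed in each process state (hypothetical tags).
SUPPRESSED_IN_STATE = {
    "normal_operation": frozenset(),
    "start_up": frozenset({"LOW_FEED_FLOW", "LOW_REACTOR_TEMP"}),
    "maintenance": frozenset(
        {"LOW_FEED_FLOW", "LOW_REACTOR_TEMP", "AGITATOR_STOPPED"}
    ),
}


def alarms_to_display(active, process_state):
    """Return the active alarms not suppressed in the given state.

    An unknown state raises KeyError, so that a configuration error
    cannot silently hide alarms.
    """
    suppressed = SUPPRESSED_IN_STATE[process_state]
    return sorted(set(active) - suppressed)
```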
On most plants there is an element of sequential control, e.g. start-up. As long as no fault
occurs, control of the sequence is usually straightforward, but the need to allow for faults at
each stage of the sequence can make sequential control quite complex. With sequential
operations, therefore, the sequential control and alarm systems are scarcely separable. A
computer is particularly suitable for performing sequential control.
The state of the art in alarm system design is not satisfactory, therefore. In conventional
systems this may be ascribed largely to the inflexibility of the hardware, but the continuance
of the problem in computer systems suggests that there are also deficiencies in design
The process computer provides the basis for better alarm systems. It makes it possible to
monitor indirect measurements and to generate different types and degrees of alarm, to
distinguish between alarms and statuses, and to adapt the alarms to the process state. It is
also possible to provide more sophisticated facilities such as analysis of alarms, as
described below. However, there is scope for great improvement in alarm systems even
without the use of such advanced facilities.
14.8.3 Alarm management
It will be apparent from the foregoing that in many systems some form of alarm management
is desirable. Alarm management is discussed by the CCPS (1993/14). Approaches to the
problem include (1) alarm prioritization and segregation, (2) alarm suppression and (3) alarm
handling in sequential operations.
Alarms may be ranked in priority. The CCPS suggests a four level system of prioritization in
which critical alarms are assigned to Level 1 and important but noncritical alarms to Level
2, and so on. The alarms are then segregated by level.
With regard to alarm suppression, the CCPS describes two methods. One is conditional
suppression, which may be used where an alarm does not indicate a dangerous situation
and where it is a symptom which can readily be deduced from the active alarms. The other
is flash suppression, which involves omitting the first stage in the alarm annunciation
sequence, namely the sounding of the hooter and flashing of the fascia lamp. Instead, the
alarm is shown illuminated, as if already acknowledged.
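Prioritization with segregation, and conditional suppression, can be sketched as two small functions. This is an illustration under stated assumptions, not the CCPS procedure: the alarm tags, the `implied_by` mapping (symptom to the alarm from which it can be deduced) and the `dangerous` set are all hypothetical.

```python
from collections import defaultdict


def segregate_by_level(priorities, active):
    """Group active alarm tags by priority level (1 = critical)."""
    groups = defaultdict(list)
    for tag in sorted(active):
        groups[priorities[tag]].append(tag)
    return dict(groups)


def conditionally_suppress(active, implied_by, dangerous):
    """Drop alarms that are non-dangerous symptoms of another active alarm."""
    shown = set()
    for tag in active:
        cause = implied_by.get(tag)
        if cause in active and tag not in dangerous:
            continue  # deducible from an active alarm and not dangerous
        shown.add(tag)
    return shown
```

If a feed pump stops and low feed flow follows, only the pump alarm is shown, unless low feed flow has been marked as dangerous in its own right, in which case both remain.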
As already discussed, sequential operations such as plant start-up tend to activate some
alarms. With a computer-based system, arrangements may be made for appropriate
suppression of alarms during such sequences.
Alarm management techniques need to be approached with caution, taking account of the
overall information needs of the operator, of the findings of research on the operation of
alarm systems and of other factors such as the fact that an alarm which is less important in
one situation may be more so in another.
14.8.4 Alarm system operation
Studies of the operation of process alarm systems have been conducted by Seminara,
Gonzalez and Parsons (1976) and Kragt and co-workers (Kortlandt and Kragt, 1980a,b;
Kragt, 1983, 1984a,b; Kragt and Bonten, 1983; Kragt and Daniels, 1984).
The work of Seminara, Gonzalez and Parsons (1976) was a wide-ranging study of various
aspects of control room design. On alarms, two features found by these authors are of
particular interest. One is that in some cases the number of alarms was some 50-100 per
shift, and in one case 100 in an hour. The other is the proportion of false alarms, for which
operator estimates ranged from ‘occasional’, through 15% and 30%, up to 50%.
Kortlandt and Kragt (1980a,b) studied five different control room situations, by methods
including questionnaire and observation, two of the principal investigations being in the
control rooms of a fertilizer plant and a high pressure polyethylene plant. On both plants the
authors identified two confusing features of the alarms. One was the occurrence of
oscillations in which the measured values moved back and forth across the alarm limit. The
other was the occurrence of clusters of alarms. The number of alarms registered on the two
plants was as follows:
[Table: columns "Fertilizer plant" and "Polyethylene plant"; row labels "Observation period (h)", "Signals occurring during clusters or oscillations" and "Signals from clusters or oscillations"; footnote "Estimated from analysis of the data". The numerical entries are missing.]
The authors suggest that oscillations can be overcome by building in hysteresis, which is
indeed the normal approach, and that clusters may be treated by grouping and suppression
of alarms. The intervals between successive signals, disregarding oscillations and clusters,
were analysed and found to fit a log-normal distribution. The response of the operators to
the alarms was as follows:
[Table: columns "Fertilizer plant" and "Polyethylene plant"; row labels "Signal followed by action", "Action followed by signal" and "No action (%)". The numerical entries are missing.]
Thus about half the alarms were actually feedback cues on the effects of action taken by the
operators, who in many cases would have been disturbed not to receive such a signal.
Further evidence for this is given by the fact that on the fertilizer plant 55% of the alarm
signals were anticipated. On this plant, the operators judged the importance of the signals
as follows: important 13%, less important 36%, not important 43%, and unknown 8%.
Kragt (1984a) has described an investigation of the operator’s use of alarms in a computer-controlled plant. The main finding was that sequential information presentation is markedly
inferior to simultaneous presentation.
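The hysteresis remedy for oscillating alarms mentioned above can be sketched for a high alarm as follows. The alarm is raised when the reading exceeds the limit but is cleared only when the reading has fallen below the limit by a dead band; the class name and the numerical values are assumptions for illustration.

```python
class HighAlarmWithHysteresis:
    """High alarm that clears only below (limit - deadband)."""

    def __init__(self, limit, deadband):
        self.limit = limit
        self.deadband = deadband
        self.active = False

    def update(self, value):
        """Process one reading and return whether the alarm is active."""
        if not self.active and value > self.limit:
            self.active = True
        elif self.active and value < self.limit - self.deadband:
            self.active = False
        return self.active


def count_new_alarms(readings, limit, deadband):
    """Count transitions from inactive to active over a series of readings."""
    alarm = HighAlarmWithHysteresis(limit, deadband)
    count = 0
    for value in readings:
        was_active = alarm.active
        if alarm.update(value) and not was_active:
            count += 1
    return count
```

For a signal hovering around a limit of 100, a dead band of 2 prevents a new alarm each time the reading dips slightly below the limit and rises again; a dead band of zero reproduces the oscillating behaviour.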
14.9 Fault Administration
As already stated, it is the function of the control system to avert the development of
conditions which may lead to shut-down, but if necessary to execute shut-down. Generally,
there are automatic trip systems to shut the plant down, but the responsibility of avoiding
this situation if at all possible falls to the operator.
Fault administration can be divided into three stages: fault detection, fault diagnosis and
fault correction or shut-down. For the first of these functions the operator has a job aid in the
form of the alarm system, while fault correction in the form of shut-down is also frequently
automatic, but for the other two functions, fault diagnosis and fault correction less drastic
than shut-down, he is largely on his own.
14.9.1 Fault detection
The alarm system represents a partial automation of fault detection. The operator still has
much to do, however, in detecting faults. This is partly a matter of the additional sensory
inputs such as vision, sound and vibration which the operator possesses. But it is also partly
due to his ability to interpret information, to recognize patterns and to detect instrument
14.9.2 Fault diagnosis
Once the existence of some kind of fault has been detected, the action taken depends on
the state of the plant. If it is in a safe condition, the next step is diagnosis of the cause. This
is usually left to the operator. The extent of the diagnosis problem may vary considerably
with the type of unit. It has been suggested, for example, that whereas on a crude distillation
unit the problem is quite complex, on a hydrotreater it is relatively simple (Duncan and Gray,
There are various ways in which the operator may approach fault diagnosis. Several workers
have observed that an operator frequently seems to respond only to the first alarm which
comes up. He associates this with a particular fault and responds using a rule-of-thumb.
This is an incomplete strategy, although it may be successful in quite a high proportion of
cases, especially where a particular fault occurs repeatedly.
An alternative approach is pattern recognition from the displays on the control panel. The
pattern may be static or dynamic. The static pattern is obtained by instantaneous
observation of the displays, like a still photograph. The operator then tries to match this
pattern with model patterns or templates for different faults. Duncan and Shepherd (1975b)
have developed a technique for training in fault diagnosis in which some operators use this
method. The alternative, and more complex, dynamic pattern recognition involves matching
the development of the fault over a period of time.
Another approach is the use of some kind of mental decision tree in which the operator
works down the paths of the tree, taking particular branches, depending on the instrument
readings. Duncan and Gray (1975a) have used this as the basis of an alternative training
technique. Yet another method is the active manipulation of the controls and observation of
the displays to determine the reaction of the plant to certain signals. Closely related to this
is the situation where no fault has been detected, but the operator is already controlling the
process when he observes some unusual feature and continues his manipulation to explore
this condition.
Whatever approach is adopted by the operator, fault diagnosis in the control room is very
dependent on the instrument readings. It is therefore necessary for the operator to check
whether the instruments are correct. The problem of checking to detect malfunctions is
considered later, but attention is drawn here to its importance in relation to fault diagnosis.
These different methods of fault diagnosis have important implications for aspects such as
displays and training. The conventional panel assists the recognition of static patterns,
whereas computer consoles generally do not. Chart recorders aid the recognition of dynamic
patterns and instrument faults. The question of training for fault diagnosis is considered later.
Fault diagnosis is not an easy task for the operator. There is scope, therefore, for computer
aids, if these can be devised. Some developments on these lines are described below.
14.9.3 Fault correction and shutdown
When, or possibly before, a fault has been diagnosed, it is usually possible to take some
corrective action which does not involve shutting the plant down. In some cases, the fault
correction is trivial, but in others, such as operating a complex sequence of valves, it is not.
Operating instructions are written for many of these activities, but otherwise this is a
relatively unexplored area. Fault correction is one of the activities for which interlocks may
be provided. Conventional interlocks were described in the previous chapter and
developments in computer software interlocks are outlined below.
Some fault conditions, however, require plant shut-down. Although fault administration has
been described in terms of successive stages of detection, diagnosis and correction, in
emergency shut-down usually little diagnosis is involved. The shut-down action is triggered
directly when it is detected that a critical process limit has been passed.
There are differing philosophies on the problem of allocation of responsibility for shutting
down the plant under fault conditions. In some plants the operator deals with fault conditions
with few automatic aids and is thus required to assure both safety and economic operation.
In others, automatic protective systems are provided to shut the plant down if it is moving
close to an unsafe condition and the operator thus has the economic role of preventing the
development of conditions which will cause shut-down.
In a plant without protective systems the operator is effectively given the duty of keeping the
plant running if he can, but shutting it down if he must. This tends to create in his mind a
conflict of priorities. Usually he will try to keep the plant running if he possibly can and, if
shut-down becomes necessary, he may tend to take action too late. Mention has already
been made of the human tendency to gamble on beating the odds. There are numerous
case histories which show the dangers inherent in this situation (Lees, 1976b).
The alternative approach is the use of automatic protective systems to guard against serious
hazards in the plant. The choice is made on the basis of quantitative assessment of the
hazards. This philosophy assigns to the operator the essentially economic role of keeping
the plant running.
Although the use of protective systems is rapidly increasing, the process operator usually
retains some responsibility for safe plant shut-down. There are a number of reasons for this.
In the first place, although high integrity protective systems with 2/3 voting are used on
particularly hazardous processes (R.M. Stewart, 1971), the majority of trip systems do not
have this degree of integrity. The failure rate of a simple 1/1 trip system has been quoted
as 0.67 faults/year (Kletz, 1974d).
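The difference between a 1/1 trip and 2/3 voting can be illustrated by a minimal sketch in Python. The function names, the high-limit convention and the numerical values are assumptions made for illustration only and are not taken from the references cited.

```python
def one_out_of_one_trip(reading, trip_limit):
    """Simple 1/1 trip: a single channel decides (illustrative only)."""
    return reading > trip_limit


def two_out_of_three_trip(readings, trip_limit):
    """2/3 voting: trip only if at least two of three channels agree.

    Assumes a high-limit trip; the names are hypothetical.
    """
    if len(readings) != 3:
        raise ValueError("expected exactly three channel readings")
    votes = sum(1 for r in readings if r > trip_limit)
    return votes >= 2


# One channel reading high does not trip the 2/3 system,
# but the same reading would trip a 1/1 system on its own.
single_fault = [101.0, 99.0, 98.0]
genuine_excursion = [101.0, 99.0, 102.0]
```

With voting, a single channel which fails high causes no spurious trip, and a single channel which fails low does not prevent a genuine trip.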
Protective systems have other limitations which apply even to high integrity systems. One is
that it is very difficult to foresee and design for all possible faults, particularly those arising
from combinations of events. It is true, of course, that even if a process condition arises from
an unexpected source a protective system will usually handle it safely. But there remains a
residual of events, usually of low probability, against which there is no protection, either
because they were unforeseen or because their probability was estimated as below the
designer’s cut-off level.
Another problem is that a protective system is only partially effective against certain types
of fault, particularly failure of containment. In such an event the instrumentation can initiate
blowdown, shut-off and shut-down sequences, but while this may reduce the hazardous
escape of materials, it does not eliminate it.
Yet another difficulty is that many hazards occur not during steady running but during normal
start-up and shut-down or during the period after a trip and start-up from that condition. A
well designed protective system caters, of course, for these transitional regimes as well as
continuous operations. Nevertheless, this remains something of a problem area. Even with
automatic protective systems, therefore, the process operator tends to retain a residual
safety function. His effectiveness in performing this is discussed later.
14.10 Malfunction Detection
Another aspect of the administration of fault conditions by the control system is the detection
of malfunctions, particularly incipient malfunctions in plant equipment and instruments.
These malfunctions are distinguished from alarms in that, although they constitute a fault
condition of some kind, they have not as yet given rise to a formal alarm.
Malfunction detection activities are not confined to the control system, of course. Monitoring
of plant equipment by engineers, as described in Chapter 19, is a major area of work, usually
independent of the control system. Insofar as the control system does monitor malfunctions,
however, this function is primarily performed by the operator. The contribution of the
computer to malfunction detection is considered later.
Detection of instrument malfunction by the operator has been investigated by Anyakora and
Lees (1972a). In general, malfunction may be detected either from the condition of an
instrument or its performance. Detection from condition is illustrated by observation on the
plant of a leak on the impulse line to a differential pressure transducer or of stickiness on a
control valve. Detection from performance is exemplified by observation in the control room
of an excessively noise-free signal from the transducer or of an inconsistency between the
position of a valve stem and the measured flow for the valve. Most checks on instrument
condition require the operator to visit the equipment and use one of his senses to detect the
fault. Most performance checks can be made from the control room by using instrument
displays and are based on information redundancy.
Some of the ways in which the operator detects malfunction in instruments are illustrated by
considering how he uses for this purpose one of his principal detection aids, the chart
recorder. Some typical chart records are shown in Figure 14.6. The operator
detects error in such signals by utilizing some form of redundant information and making a
comparison. Some types of redundant information are (1) a priori expectations, (2) past
signals of the instrument, (3) duplicate instruments, (4) other instruments, and (5) control valve
position.
Thus it may be expected a priori that an instrument reading will not go ‘hardover’ to zero or
full scale, that it will give a ‘live’ rather than a ‘dead’ zero, that it will exhibit a certain noise
level, that its rate of change will not exceed a certain value and that it is free to move within
the full scale of the instrument. On the basis of such expectations, the operator might
diagnose malfunction in the signals shown in Figures 14.6(b-f). However, the firmness of
such expectations may vary with the plant operating conditions. For example, during
start-up, zero readings on some instruments may be correct.
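Checks of this a priori kind can be sketched as follows. This is a minimal illustration, not a method given in the source; the function name, the thresholds and the labels returned are all hypothetical.

```python
def a_priori_checks(samples, span_low, span_high, max_step, min_noise):
    """Return the a priori expectations violated by a window of readings.

    samples   : recent readings, oldest first
    span_low  : bottom of instrument scale (assumed zero end)
    span_high : top of instrument scale
    max_step  : largest credible change between successive samples
    min_noise : smallest credible peak-to-peak variation over the window
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    findings = []
    last = samples[-1]
    if last <= span_low:
        findings.append("reading zero")
    elif last >= span_high:
        findings.append("reading full scale")
    # Rate of change between successive samples
    steps = [abs(b - a) for a, b in zip(samples, samples[1:])]
    if max(steps) > max_step:
        findings.append("rate of change excessive")
    # A signal with no noise at all is itself suspicious
    if max(samples) - min(samples) < min_noise:
        findings.append("reading constant")
    return findings


normal = a_priori_checks([50.2, 49.9, 50.1, 50.3], 0.0, 100.0, 5.0, 0.1)
frozen = a_priori_checks([50.0, 50.0, 50.0, 50.0], 0.0, 100.0, 5.0, 0.1)
hardover = a_priori_checks([50.2, 49.9, 50.1, 0.0], 0.0, 100.0, 5.0, 0.1)
```

Such a check reports only that an expectation has been violated. Whether this indicates malfunction still depends on the operating conditions, since a zero reading during start-up may be correct.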
It may not be possible to decide a priori what constitutes a reasonable expectation. The level
of noise, for example, tends to vary with the individual measurement. In this case, the
operator must use his knowledge of the range of variation of the noise on a particular
instrument in the past. Thus Figures 14.6(c) and 14.6(d) might or might not indicate
malfunction.
If there is a duplicate instrument, then detection of the fact that one of the instruments is
wrong is straightforward, although it may not be possible to say which. However, duplication
is not usual in the normal instrument systems which are of primary concern here. On the
other hand, near duplication is quite common. For example, the flow of a reactant leaving a
vaporizer and entering two parallel reactors may be measured at the exit of both the
vaporizer and the reactors, and the flow measurement systems provide a check on each
other.
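The vaporizer and reactor example may be expressed as a simple comparison. The sketch below assumes that the flows are in the same units and that a relative tolerance of 5% is acceptable; both the function name and the tolerance are hypothetical.

```python
def near_duplicate_inconsistent(vaporizer_flow, reactor_flows, rel_tolerance=0.05):
    """Compare the vaporizer exit flow with the sum of the reactor flows.

    Returns True if the two measurements disagree by more than the
    relative tolerance. It does not say which measurement is wrong.
    """
    total = sum(reactor_flows)
    reference = max(abs(vaporizer_flow), abs(total))
    if reference == 0.0:
        return False  # both measurements read zero, so they agree
    return abs(vaporizer_flow - total) / reference > rel_tolerance


consistent = near_duplicate_inconsistent(100.0, [49.0, 50.0])
inconsistent = near_duplicate_inconsistent(100.0, [49.0, 30.0])
```

As with a true duplicate, the check shows that one of the measurements is in error but not which one.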
What constitutes a reasonable signal may depend on the signals given by other instruments.
Thus, although a signal which exhibits drift, such as that in Figure 14.6(g), may appear
incorrect, a check on other instruments may show that it is not.
Some types of variation in a signal, such as a change in the noise level, appear easy to
detect automatically. Others, such as that shown in Figure 14.6(i), are probably more
difficult, especially if their form is not known in advance. Here the human operator with his
well-developed ability to recognize visual patterns has the advantage.
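A change in noise level is one of the variations which lends itself to automatic detection. A minimal sketch follows, assuming that noise is estimated as the standard deviation of a window of samples and compared with the same estimate from the past record of that instrument. The function name and the ratio limits are hypothetical.

```python
from statistics import pstdev


def noise_level_changed(history, recent, low_ratio=0.5, high_ratio=2.0):
    """Compare recent noise with the instrument's own past noise.

    history, recent : lists of readings taken at comparable steady conditions
    Returns True if the recent noise is much lower or much higher than before.
    """
    past = pstdev(history)
    now = pstdev(recent)
    if past == 0.0:
        return now > 0.0
    ratio = now / past
    return ratio < low_ratio or ratio > high_ratio


history = [50.0, 50.4, 49.6, 50.2, 49.8, 50.1]
similar = [50.1, 49.7, 50.3, 49.9, 50.2, 49.8]
dead = [50.0] * 6
erratic = [50.0, 53.0, 47.5, 52.0, 46.0, 51.5]
```

A drifting signal would also inflate this estimate, so a practical scheme would need to separate trend from noise. That refinement is omitted here.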
There are a number of ways in which the readings of other instruments can serve as a check.
Some of these are (1) near-duplication, (2) mass and heat balances, (3) flow-pressure drop
relations, and (4) consistent states. This last check is based on the fact that certain variables
are related to each other and at a given state of operation must lie within certain ranges of
values. The position of control valves also provides a means of checking measurements.
This is most obvious for flow measurement but it is by no means limited to this.
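For the case of flow measurement, the check against control valve position can be sketched as below. The bands used to decide that a flow is zero or a valve is closed, the function name and the labels returned are assumptions for illustration.

```python
def flow_valve_inconsistency(flow, valve_opening, flow_zero_band=1.0, closed_band=2.0):
    """Check a flow reading against the position of its control valve.

    flow          : measured flow, in any consistent units
    valve_opening : valve position in percent open
    Returns a label describing the inconsistency, or None if consistent.
    """
    valve_closed = valve_opening <= closed_band
    flow_zero = abs(flow) <= flow_zero_band
    if flow_zero and not valve_closed:
        return "flow zero, valve open"
    if not flow_zero and valve_closed:
        return "flow not zero, valve closed"
    return None


blocked_or_failed_meter = flow_valve_inconsistency(0.0, 60.0)
passing_or_failed_meter = flow_valve_inconsistency(25.0, 0.0)
normal_operation = flow_valve_inconsistency(25.0, 60.0)
```

The result identifies an inconsistency only. The fault may lie in the flow measurement, in the valve or in the indication of valve position.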
These remarks apply essentially to measuring instruments, but checks can also be
developed for controllers and control valves. A general classification of instrument
malfunction diagnoses by the operator is shown in Table 14.6.
Many of the checks described do not show unambiguously that a particular instrument is not
working properly; often they indicate merely that there is an inconsistency which needs to
be explored further. However, this information is very important. Another kind of information
which the operator also uses in checking instruments is his knowledge of the probabilities
of failure of different instruments. He usually knows which have proved troublesome in the
past.
The detection of instrument malfunction by the process operator is important for a number
of reasons. Instrument malfunctions tend to degrade the alarm system and introduce
difficulties into loop control and fault diagnosis. Their detection is usually left to the operator
and it is essential for him to have the facilities to do this. This includes appropriate displays
and may extend to computer aids.
Figure 14.6 Typical chart recorder displays of measurement signals (Anyakora and Lees,
1972a): (a) normal reading; (b) reading zero; (c) reading constant; (d) reading erratic; (e)
reading suddenly displaced; (f) reading limited below full scale; (g) reading drifting; (h)
reading cycling; (i) reading with unusual features (Courtesy of the Institution of Chemical
Engineers)
Table 14.6 General classification of methods of instrument malfunction detection (Anyakora
and Lees, 1972a) (Courtesy of the Institution of Chemical Engineers)

Measuring instruments
  Measurement reading zero or full scale
    Reading zero
    Reading full scale
  Measurement reading noise or dynamic response faulty
    Reading constant
    Reading erratic
    Reading sluggish
  Measurement reading displaced suddenly
    Reading fell suddenly
    Reading rose suddenly
  Measurement reading limited within full scale
    Reading limited above zero
    Reading limited below full scale
  Measurement reading drifting
    Reading falling
    Reading rising
  Measurement reading inconsistent with duplicate measurement
  Measurement reading inconsistent with one other measurement
    Reading inconsistent with near-duplicate measurement
    Reading inconsistent with level-flow integration
    Reading otherwise inconsistent
  Measurement reading inconsistent with simple model
    Reading inconsistent with mass balance
    Reading inconsistent with heat balance
    Reading inconsistent with flow-pressure drop relations
    Reading otherwise inconsistent
  [...]
    Reading zero but variable not zero
    Reading not zero but variable zero
    Reading low
    Reading high
    Reading otherwise inconsistent
  Measurement reading inconsistent with control valve position
    Reading (flow) zero with valve open
    Reading (flow) not zero with valve closed
    Reading otherwise inconsistent
  Measurement reading periodic or cycling
  Measurement reading showing intermittent [...]
  Measurement reading faulty
  Measurement instrument tested by active tests
  Measuring instrument condition faulty

Control action and controllers
  Control action faulty
    Control erratic
    Control sluggish
    Control cycling
    Control unstable
    Control error excessive
    Control otherwise faulty
  Controller performance faulty
  Controller tested by active tests
  Controller condition faulty

Control valves and valve positioners
  Valve position inconsistent with signal to valve (this requires independent
  measurement of position)
  Valve position inconsistent with flow (but not necessarily flow measurement)
    Valve passing fluid when closed
    Valve not passing fluid when open
  Valve position inconsistent with one other measurement
  Valve position inconsistent with simple model
  Valve not moving
    Valve stays closed
    Valve stays open
    Valve stays part open
  Valve movement less than full travel
    Valve not closing fully
    Valve not opening fully
    Valve travel otherwise limited
  Valve movement faulty
    Valve movement erratic
    Valve movement sluggish
    Valve movement [...]
    Valve movement cycling
  Valve or positioner performance faulty
  Valve or positioner tested by active tests
  Valve or positioner condition faulty
14.11 Computer-based Aids
There are some functions which the computer can perform automatically, whereas there are
others which at present are performed by the operator, but for which computer aids have
been, or may be, developed to assist him. Some computer-based aids to assist the operator
which have been described include:
(1) system state display;
(2) alarm diagnosis;
(3) valve sequencing;
(4) malfunction detection.