The problem of system objectives and allocation of function in computer control of a basic oxygen furnace has been discussed by Crawley (1968). He gives as an example of a function which is more appropriately allocated to man the interpretation of the noise signal given by a bomb thermocouple. The point is also made, however, that the optimal allocation of function changes continuously as technology progresses. The difficulty of automating certain functions is further illustrated by the work of Ketteringham, O’Brien and Cole (Ketteringham, O’Brien and Cole, 1970; Ketteringham and O’Brien, 1974) on the scheduling of soaking pits using a man-computer interactive system. Problems of providing all the necessary information to the computer and of complexity in the decision-making make it difficult to automate this function, but it is possible to provide computer facilities which, by storage of large amounts of information, the use of predictive models and the provision of appropriate displays, greatly assist the operator to make decisions. The work involved simulation using a realistic interface and actual steel works schedulers. The problem is very similar to that of air traffic control, which has been extensively investigated in human factors. The importance of organizational and social factors is shown by the work of Engelstad (1970), who investigated these in a paper mill. The work revealed that individuals tended to operate too much in isolation and to have goals which were not necessarily optimal as far as the system was concerned. It sought to encourage communication by treating the control room as an information and control centre, which all concerned should use, and by designing jobs to give greater variety, responsibility, learning opportunity and wholeness. Other workers too have commented on the losses which can occur in continuous flow processes if there are poor relationships between the men controlling the process at different points.
Other studies have been concerned with task analysis, fault administration, displays, selection and training, and human error. These topics are considered below.

14.6 Allocation of Function
As already emphasized, human factors has at least as important a role to play in matters of system design such as allocation of function as in those of detailed design. The classic approach to allocation of function is to list the functions which machines perform well and those which men perform well and to use this as a guide. One of the original lists was compiled by Fitts (1962) and such a list is often referred to as a ‘Fitts’ list’. However, this approach needs some qualification. As Fitts pointed out later, the functions which should be allocated to man are not so much those which he is good at as those which it is best from the system point of view that he perform, which is slightly different. The question of motivation is also important. De Jong and Koster (1971) have given a similar type of list showing the functions which man is motivated to perform. Further accounts of allocation of function are given by H.E. Price (1985) and Kantowitz and Sorkin (1987). Particularly important in relation to loss prevention are functions concerned with fault administration, fault diagnosis, plant shut-down and malfunction detection.

14.7 Information Display
Once the task has been defined, it is possible to consider the design of displays. Information display is an important problem, which is intensified by the increasing density of information in modern control rooms. The traditional display is the conventional control panel. Computer graphics now present the engineer with a more versatile display facility, offering scope for all kinds of display for the operator, but it is probably fair to say that he is somewhat uncertain what to do with it.
The first thing which should be emphasized is that a display is only a means to an end, the end being improved performance by the operator in executing some control function. The proper design of this function in its human factors aspects is more important than the details of the display itself. Some types of display which may be provided are listed in Table 14.5. The provision of displays which the operator deliberately samples with a specific object in view is only part of the problem. It is important also for the display system to cater both for his characteristic of acquiring information ‘at a glance’ and for his requirement for information redundancy. There is a need, therefore, for the development of displays which allow the operator to make a quick and effortless survey of the state of the system. As already described, the operator updates his knowledge of system state by predicting forward, using a mental model of the process and sampling key readings to check that he is on the right lines. He needs a survey display to enable him to do this. This need still exists even where other facilities are provided. Facilities such as alarm systems are based on a ‘management by exception’ approach, which is essential if the large amounts of information are to be handled. But when the exceptional condition has been detected, the operator must deal with it and for this he needs knowledge of the state of the process, which a survey display provides. A display of system state also allows the operator to use his ability to recognize patterns. This aspect is considered below.

Table 14.5 Some displays for the process operator
Displays of flow and mimic diagrams
Displays of current measurements, other variables, statuses (other variables include indirect measurements, valve positions, etc.)
Displays of trends of measurements, other variables, statuses
Displays of control loop parameters
Displays of alarms
Displays of reduced data (e.g. histograms, quality control charts, statistical parameters)
Displays of system state (e.g. mimic diagrams, ‘status array’, ‘surface’ and polar plots)
Displays for manual control (e.g. predictive displays)
Displays for alarm analysis
Displays for sequential control
Displays for scheduling and game-playing
Displays for valve sequencing
Displays for protective system checking
Displays of maloperation
Displays for malfunction detection
Command displays

Figure 14.4 Control panel in a chemical plant (courtesy of Kent Instruments Ltd)

14.7.1 Regular instrumentation
It will be apparent from the foregoing that the conventional control panel has certain virtues. The panel shown in Figure 14.4 is typical of a modern control panel. The conventional panel does constitute a survey display, in which the instruments have spatial coding, from which the operator can obtain information at a glance and on which he can recognize patterns. These are solid advantages not to be discarded lightly. This is only true, however, if the density of information in the panel is not allowed to become too great. The advantages are very largely lost if it becomes necessary to use dense blocks of instruments which are difficult to distinguish individually. An important individual display is the chart recorder. A trend record has many advantages over an instantaneous display. As the work of Crossman, Cooke and Beishon (1964) shows, it assists the operator to learn the signal characteristics and facilitates his information sampling. Both Attwood (1970) and B. West and Clark (1974) found that recorders are useful to the operator in making coarse adjustments of operating point, while the latter authors also noted the operator’s use of recorders in handling fault conditions. Anyakora and Lees (1972a) have pointed out the value of recorders in enabling the operator to learn the signal characteristics and so recognize instrument malfunctions.

14.7.2 Computer consoles
The computer console presents a marked contrast to conventional instrumentation. This is illustrated with particular starkness in Figure 14.5(a), which shows the console of the original ICI direct digital control (DDC) computer. Some of the features of ergonomic importance in this panel are: (1) specific action is required to obtain a display; (2) there is no spatial coding and the coding of the information required has to be remembered or looked up; (3) only one variable is displayed at a time; (4) only the instantaneous value of the variable is displayed; and (5) the presentation is digital rather than analogue. This is a revolutionary change in the operator’s interface. While there is, among engineers, a general awareness that the change is significant, its detailed human factors implications are not so well appreciated. The subsequent refinement of process computer consoles and the introduction of computer graphics have somewhat mitigated these features. A more modern computer console is shown in Figure 14.5(b). Moreover, the facilities which the computer offers in functions such as alarm monitoring or sequential control are powerful new aids to the operator. It remains true, however, that the transition from the conventional panel to the computer console involves some serious human factors losses, particularly in information display. This does not necessarily mean that the change should not be made. Conventional panels are expensive and lose much of their advantage if the information density becomes too high. Computers offer some very worthwhile additional facilities. But the change should at least be made in the full awareness of its implications and with every effort to restore in the new system the characteristic advantages of the old.

Figure 14.5(a) Control panel of the direct digital control (DDC) computer on ICI's ammonia soda plant at Fleetwood, 1961 (courtesy of Ferranti Ltd and Imperial Chemical Industries Ltd)
Figure 14.5(b) Control panel and display of Foxboro Intelligent Automation Computer System (Foxboro Company Ltd)

14.8 Alarm Systems
As stated in the previous chapter, the progress made in the automation of processes under normal conditions has focused attention on the monitoring and handling of abnormal or fault conditions. It is the function of the control system to prevent, if possible, the development of conditions which will lead to plant shut-down, but to carry out the shut-down if necessary. The responsibility for averting shut-down conditions falls largely to the operator. The principal automatic aid provided to assist him is the alarm system. Alarm systems are an extremely important but curiously neglected and often unsatisfactory aspect of process control. An alarm system is a normal feature of conventional control systems. If a process variable exceeds specified limits or if an equipment is not in a specified state, an alarm is signalled. Both audible and visual signals are used. Accounts of alarm systems include those by E. Edwards and Lees (1973), Andow and Lees (1974), Hanes (1978), Swain and Guttman (1983), Schellekens (1984), Shaw (1985) and the Center for Chemical Process Safety (CCPS) (1993/14). There have also been investigations of the operation of alarm systems, as described below.

14.8.1 Basic alarm systems
The traditional equipment used for process alarms is a lightbox annunciator. This consists of a fascia, or array, of separate small rectangular coloured glass panels, behind each of which there is a lamp which lights up if the alarm is active. Each panel is inscribed with an alarm message. The panels are colour coded, usually with red being assigned to the highest priority.
This visual display is complemented by a hooter. When a new alarm occurs, the hooter sounds and the fascia light flashes until the operator acknowledges receipt by pressing a button. The panel then remains lit until the alarm condition is cleared by operator action or otherwise, as described below. There are a number of variations on this basic system. An account of some of these is given by Kortlandt and Kragt (1980a). One is a two-level hierarchical arrangement in which there is a central fascia and a number of local fascias, the light on the former indicating the number of the local fascia on which the alarm has occurred. These authors also describe hierarchical systems with additional levels. Another arrangement is a mimic panel with alarms located at the relevant points on the flow diagram. In computer-controlled systems alarms are generally printed out on hard copy in the time sequence in which they occur and are also displayed on the VDU. On the latter there are numerous display options, some of which are detailed by Kortlandt and Kragt. One is the use of a dedicated VDU in which the alarms come up in time sequence, as on the printer. Another is a group display in which all the alarms on an item of equipment are shown with the active alarm(s) highlighted. Another is a mimic display akin to the hardwired mimic panel. The occurrence of an alarm is normally accompanied by an audible warning in the form of the sounding of the hooter and a visual warning in the form of a flashing light. On a computer system the sound of the printer serves as another form of audible signal. The operator ‘accepts’ the alarm by depressing the appropriate pushbutton. Annunciator sequences are given in ISA Std S18.1: 1975 Annunciator Sequences and Specifications, which recognizes three types: (1) automatic reset (A), (2) manual reset (M) and (3) ringback (R).
Automatic reset returns to the normal state automatically once the alarm has been acknowledged and the process variable has returned to its normal state. Manual reset is similar except that return to the normal state requires operation of the manual reset pushbutton. Ringback gives a warning, audible and/or visual, that the process condition has returned to normal. There are alarms on the trips in the Safety Interlock System (SIS). Operation of a trip is signalled by an alarm. The SIS alarm system, like the rest of the SIS, should be separate from the basic process control system (BPCS). In a conventional instrument system, the hardware used to generate an alarm consists of a sensor, a logic module and the visual display. The sensor and the logic elements may be separate or may be combined in an alarm switch. Such combined switches are referred to as flow switches, level switches, and so on. In a computer-based system there are two approaches which may be used. In one the computer receives from the sensor an analogue signal to which the program applies logic to generate the alarm. In the other the signal enters in digital form. The latter can provide a cheaper system, but reduces the scope for detection of instrument malfunction. Some basic requirements for an alarm are that it attracts the attention of the operator, that it be readily identifiable and that it indicate unambiguously the variable which has gone out of limit.

14.8.2 Alarm system features
As just described, an alarm system is a normal feature of process computer control. The scanning of large numbers of process variables for alarm conditions is a function very suitable for a computer. Usually the alarm system represents a fairly straightforward translation of a conventional system on to the computer.
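Because the computer-based alarm system is usually a translation of the conventional one, the annunciator sequences described in Section 14.8.1 can be illustrated by a short software sketch. The Python below follows only the descriptions given in the text, not the full sequence tables of the ISA standard. The class and state names are hypothetical, and it is an assumption of the sketch that an alarm remains latched until acknowledged even if the process variable clears first.

```python
from enum import Enum


class Mode(Enum):
    AUTOMATIC_RESET = "A"
    MANUAL_RESET = "M"
    RINGBACK = "R"


class State(Enum):
    NORMAL = "lamp off"
    UNACKNOWLEDGED = "lamp flashing, hooter sounding"
    ACKNOWLEDGED = "lamp steady"
    RINGBACK = "return-to-normal warning"


class AnnunciatorPoint:
    """One fascia panel (hypothetical model, simplified from the text)."""

    def __init__(self, mode):
        self.mode = mode
        self.state = State.NORMAL
        self.in_alarm = False

    def update(self, in_alarm):
        """Feed the latest alarm condition from the sensor or alarm switch."""
        self.in_alarm = in_alarm
        if in_alarm and self.state in (State.NORMAL, State.RINGBACK):
            self.state = State.UNACKNOWLEDGED
        elif not in_alarm and self.state is State.ACKNOWLEDGED:
            if self.mode is Mode.AUTOMATIC_RESET:
                self.state = State.NORMAL
            elif self.mode is Mode.RINGBACK:
                self.state = State.RINGBACK
            # MANUAL_RESET: panel stays lit until reset() is pressed
        return self.state

    def acknowledge(self):
        """Operator presses the accept pushbutton."""
        if self.state is State.UNACKNOWLEDGED:
            self.state = State.ACKNOWLEDGED
            # The condition may already have cleared (assumed lock-in).
            self.update(self.in_alarm)
        return self.state

    def reset(self):
        """Operator presses the reset pushbutton."""
        if not self.in_alarm and self.state in (State.ACKNOWLEDGED, State.RINGBACK):
            self.state = State.NORMAL
        return self.state
```

With manual reset, for example, the sequence `update(True)`, `acknowledge()`, `update(False)` leaves the panel lit until `reset()` is called, whereas with automatic reset the same sequence returns the panel to normal without further operator action.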
Specified limits of process variables and states of equipment are scanned and resulting alarms are displayed on a typewriter, necessarily in time order, and also on a VDU, where time order is one of a number of alarm display options. The process computer has enormous potential for the development of improved alarm systems, but also brings with it the danger of excess. There is first the choice of variables which are to be monitored. It is no longer necessary for these to be confined to the process variables measured by the plant sensors. In addition, ‘indirect’ or ‘inferred’ measurements calculated from one or more process measurements may be utilized. This considerably increases the power of the alarm system. Then there are a number of different types of alarm which may be used. These include absolute alarms, relative, deviation or set-point alarms, instrument alarms and rate-of-change alarms, in which the alarm limits are, respectively, absolute values of the process variable, absolute or proportional deviations of the process variable from the loop set-point, zero or full-scale readings of the instrument and rate of change of the process variable. The level at which the alarm limits are set is another important feature. Several sets of alarm limits may be put on a single variable to give different degrees of alarms such as early warning, action or danger alarms. The alarms so generated may be ordered and displayed in various ways, particularly in respect of the importance of the variable and the degree of alarm. The conventional alarm system, therefore, is severely limited by hardware considerations and is relatively inflexible. The type of alarm is usually restricted to an absolute alarm. The computer-based alarm system is potentially much more versatile. The alarm system, however, is frequently one of the least satisfactory features of the control system. The most common defect is that there are too many alarms and that they stay active for too long.
As a result, the system tends to become discredited with the operator, who comes to disregard many of the alarm signals and may even disable the devices which signal the alarms. Computer-based alarm systems also have some faults peculiarly their own. It is fatally easy with a computer to have a proliferation of types and degrees of alarm. Moreover, the most easily implemented displays, such as time-ordered alarms on a typewriter or a VDU, are inferior to conventional fascias in respect of aspects such as pattern recognition. The main problem in alarm systems is the lack of a clear design philosophy. Ideally, the alarm system should be designed on the basis of the information flow in the plant and the alarm instrumentation selected and located to maximize the information available for control, bearing in mind instrumentation reliability considerations. In fact, an alarm system is often a collection of subsystems specified by designers of particular equipment with the addition of some further alarms. An alarm system is an aid for the operator. An important but often neglected question is therefore what action he is required to take when an alarm is signalled. There are also specific problems which cause alarms to be numerous and persistent. One is the confusion of alarms and statuses. A status merely indicates that an equipment is in a particular state, e.g. that an agitator is not running. An alarm, by contrast, indicates that an equipment is in a particular state and should be in a different one, e.g. that an agitator is not running but should be. On most plants there are a number of statuses which need to be displayed, but there is frequently no separate status display and so the alarm display has to be used. As a result, if a whole section of plant is not in use, a complete block of alarms may be permanently up on a display, even though these are not strictly alarm conditions. The problem can be overcome by the use of separate types of display, e.g. 
yellow for alarms, white for statuses, in conventional systems and by similar separate displays in computer systems, but often this is not done. A somewhat similar problem is the relation of the alarms to the state of the process. A process often has a number of different states and a signal which is an alarm in one state, e.g. normal operation, is not a genuine alarm in another, as with, say, start-up or maintenance. It may be desirable to suppress certain alarms during particular states. This can be done relatively easily with a computer but not with a conventional system. It should be added, however, that suppression of alarms needs to be done with care, each case being considered on its merits. On most plants there is an element of sequential control, e.g. start-up. As long as no fault occurs, control of the sequence is usually straightforward, but the need to allow for faults at each stage of the sequence can make sequential control quite complex. With sequential operations, therefore, the sequential control and alarm systems are scarcely separable. A computer is particularly suitable for performing sequential control. The state of the art in alarm system design is not satisfactory, therefore. In conventional systems this may be ascribed largely to the inflexibility of the hardware, but the continuance of the problem in computer systems suggests that there are also deficiencies in design philosophy. The process computer provides the basis for better alarm systems. It makes it possible to monitor indirect measurements and to generate different types and degrees of alarm, to distinguish between alarms and statuses, and to adapt the alarms to the process state. It is also possible to provide more sophisticated facilities such as analysis of alarms, as described below. However, there is scope for great improvement in alarm systems even without the use of such advanced facilities. 
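Some of the facilities discussed in this section can be illustrated by a minimal Python sketch: the four alarm types, the suppression of alarms in particular process states, and the distinction between a status and an alarm. All names, limit values and process-state labels are hypothetical, and only a high absolute limit is modelled for brevity.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class AlarmLimits:
    high: Optional[float] = None            # absolute alarm limit
    max_deviation: Optional[float] = None   # allowed deviation from set-point
    max_rate: Optional[float] = None        # allowed rate of change, units/s
    span: Tuple[float, float] = (0.0, 100.0)        # instrument zero, full scale
    suppressed_states: FrozenSet[str] = frozenset()  # e.g. {"start-up"}


def evaluate_alarms(value: float, previous: float, dt: float,
                    set_point: float, limits: AlarmLimits,
                    process_state: str) -> List[str]:
    """Return the alarm types raised by one scan of one process variable."""
    if process_state in limits.suppressed_states:
        return []  # not a genuine alarm in this process state
    alarms = []
    zero, full_scale = limits.span
    if value <= zero or value >= full_scale:
        alarms.append("instrument")
    if limits.high is not None and value > limits.high:
        alarms.append("absolute")
    if (limits.max_deviation is not None
            and abs(value - set_point) > limits.max_deviation):
        alarms.append("deviation")
    if (limits.max_rate is not None
            and abs(value - previous) / dt > limits.max_rate):
        alarms.append("rate-of-change")
    return alarms


def agitator_signal(running: bool, required: bool) -> Tuple[str, bool]:
    """A status reports the state; an alarm reports a state that should differ."""
    status = "running" if running else "stopped"
    alarm = required and not running
    return status, alarm
```

In this sketch an agitator that is stopped on a section of plant not in use yields the status `"stopped"` with no alarm, so it need not occupy the alarm display. The suppression set is deliberately explicit per variable, in keeping with the caution that each case of suppression should be considered on its merits.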

14.8.3 Alarm management
It will be apparent from the foregoing that in many systems some form of alarm management is desirable. Alarm management is discussed by the CCPS (1993/14). Approaches to the problem include (1) alarm prioritization and segregation, (2) alarm suppression and (3) alarm handling in sequential operations. Alarms may be ranked in priority. The CCPS suggests a four-level system of prioritization in which critical alarms are assigned to Level 1 and important but noncritical alarms to Level 2, and so on. The alarms are then segregated by level. With regard to alarm suppression, the CCPS describes two methods. One is conditional suppression, which may be used where an alarm does not indicate a dangerous situation and where it is a symptom which can readily be deduced from the active alarms. The other is flash suppression, which involves omitting the first stage in the alarm annunciation sequence, namely the sounding of the hooter and flashing of the fascia lamp. Instead, the alarm is shown illuminated, as if already acknowledged. As already discussed, sequential operations such as plant start-up tend to activate some alarms. With a computer-based system, arrangements may be made for appropriate suppression of alarms during such sequences. Alarm management techniques need to be approached with caution, taking account of the overall information needs of the operator, of the findings of research on the operation of alarm systems and of other factors such as the fact that an alarm which is less important in one situation may be more so in another.

14.8.4 Alarm system operation
Studies of the operation of process alarm systems have been conducted by Seminara, Gonzalez and Parsons (1976) and Kragt and co-workers (Kortlandt and Kragt, 1980a,b; Kragt, 1983, 1984a,b; Kragt and Bonten, 1983; Kragt and Daniels, 1984). The work of Seminara, Gonzalez and Parsons (1976) was a wide-ranging study of various aspects of control room design.
On alarms, two features found by these authors are of particular interest. One is that in some cases the number of alarms was some 50-100 per shift, and in one case 100 in an hour. The other is the proportion of false alarms, for which operator estimates ranged from ‘occasional’, through 15% and 30%, up to 50%. Kortlandt and Kragt (1980a,b) studied five different control room situations, by methods including questionnaire and observation, two of the principal investigations being in the control rooms of a fertilizer plant and a high pressure polyethylene plant. On both plants the authors identified two confusing features of the alarms. One was the occurrence of oscillations in which the measured values moved back and forth across the alarm limit. The other was the occurrence of clusters of alarms. The number of alarms registered on the two plants was as follows:

                                                               Fertilizer plant   Polyethylene plant
Observation period (h)                                         63                 70
Single alarms, not occurring during clusters or oscillations   816                1714
Single alarms during clusters or oscillations                  280a               325a
Signals during oscillations                                    410a               NA
Signals from clusters                                          1288a              1300a
Total                                                          2794               3339
a Estimated from analysis of the data

The authors suggest that oscillations can be overcome by building in hysteresis, which is indeed the normal approach, and that clusters may be treated by grouping and suppression of alarms. The intervals between successive signals, disregarding oscillations and clusters, were analysed and found to fit a log-normal distribution. The response of the operators to the alarms was as follows:

                                  Fertilizer plant   Polyethylene plant
Signal followed by action (%)     47                 43
Action followed by signal (%)     46                 50
No action (%)                     7                  7

Thus about half the alarms were actually feedback cues on the effects of action taken by the operators, who in many cases would have been disturbed not to receive such a signal.
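The hysteresis remedy suggested above for oscillating alarms can be sketched in a few lines of Python. The class name, the limit and the deadband values are hypothetical. The alarm is raised when the reading exceeds the limit, but clears only once the reading has fallen below the limit by more than the deadband.

```python
class HysteresisAlarm:
    """High alarm with a deadband below the limit (hypothetical sketch)."""

    def __init__(self, limit, deadband):
        self.limit = limit
        self.deadband = deadband
        self.active = False

    def update(self, value):
        if not self.active and value > self.limit:
            self.active = True          # reading has crossed the limit
        elif self.active and value < self.limit - self.deadband:
            self.active = False         # reading is clearly back in range
        return self.active


def count_activations(alarm, readings):
    """Count how many new alarm signals a series of readings generates."""
    count = 0
    for value in readings:
        was_active = alarm.active
        if alarm.update(value) and not was_active:
            count += 1
    return count


# A reading hovering about a limit of 100 (illustrative data only).
readings = [99, 101, 99.5, 100.5, 99.8, 101, 96, 101]
print(count_activations(HysteresisAlarm(100, 0), readings))  # 4
print(count_activations(HysteresisAlarm(100, 2), readings))  # 2
```

With no deadband every small excursion across the limit produces a fresh signal, while with a deadband of 2 units only the two genuine excursions are signalled.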
Further evidence that many alarms served as feedback cues is given by the fact that on the fertilizer plant 55% of the alarm signals were anticipated. On this plant, the operators judged the importance of the signals as follows: important 13%, less important 36%, not important 43%, and unknown 8%. Kragt (1984a) has described an investigation of the operator’s use of alarms in a computer-controlled plant. The main finding was that sequential information presentation is markedly inferior to simultaneous presentation.

14.9 Fault Administration
As already stated, it is the function of the control system to avert the development of conditions which may lead to shut-down, but if necessary to execute shut-down. Generally, there are automatic trip systems to shut the plant down, but the responsibility of avoiding this situation if at all possible falls to the operator. Fault administration can be divided into three stages: fault detection, fault diagnosis and fault correction or shut-down. For the first of these functions the operator has a job aid in the form of the alarm system, while fault correction in the form of shut-down is also frequently automatic, but for the other two functions, fault diagnosis and fault correction less drastic than shut-down, he is largely on his own.

14.9.1 Fault detection
The alarm system represents a partial automation of fault detection. The operator still has much to do, however, in detecting faults. This is partly a matter of the additional sensory inputs such as vision, sound and vibration which the operator possesses. But it is also partly due to his ability to interpret information, to recognize patterns and to detect instrument errors.

14.9.2 Fault diagnosis
Once the existence of some kind of fault has been detected, the action taken depends on the state of the plant. If it is in a safe condition, the next step is diagnosis of the cause. This is usually left to the operator. The extent of the diagnosis problem may vary considerably with the type of unit.
It has been suggested, for example, that whereas on a crude distillation unit the problem is quite complex, on a hydrotreater it is relatively simple (Duncan and Gray, 1975a). There are various ways in which the operator may approach fault diagnosis. Several workers have observed that an operator frequently seems to respond only to the first alarm which comes up. He associates this with a particular fault and responds using a rule-of-thumb. This is an incomplete strategy, although it may be successful in quite a high proportion of cases, especially where a particular fault occurs repeatedly. An alternative approach is pattern recognition from the displays on the control panel. The pattern may be static or dynamic. The static pattern is obtained by instantaneous observation of the displays, like a still photograph. The operator then tries to match this pattern with model patterns or templates for different faults. Duncan and Shepherd (1975b) have developed a technique for training in fault diagnosis in which some operators use this method. The alternative, and more complex, dynamic pattern recognition involves matching the development of the fault over a period of time. Another approach is the use of some kind of mental decision tree in which the operator works down the paths of the tree, taking particular branches, depending on the instrument readings. Duncan and Gray (1975a) have used this as the basis of an alternative training technique. Yet another method is the active manipulation of the controls and observation of the displays to determine the reaction of the plant to certain signals. Closely related to this is the situation where no fault has been detected, but the operator is already controlling the process when he observes some unusual feature and continues his manipulation to explore this condition. Whatever approach is adopted by the operator, fault diagnosis in the control room is very dependent on the instrument readings. 
It is therefore necessary for the operator to check whether the instruments are correct. The problem of checking to detect malfunctions is considered later, but attention is drawn here to its importance in relation to fault diagnosis. These different methods of fault diagnosis have important implications for aspects such as displays and training. The conventional panel assists the recognition of static patterns, whereas computer consoles generally do not. Chart recorders aid the recognition of dynamic patterns and instrument faults. The question of training for fault diagnosis is considered later. Fault diagnosis is not an easy task for the operator. There is scope, therefore, for computer aids, if these can be devised. Some developments on these lines are described below.

14.9.3 Fault correction and shutdown
When, or possibly before, a fault has been diagnosed, it is usually possible to take some corrective action which does not involve shutting the plant down. In some cases, the fault correction is trivial, but in others, such as operating a complex sequence of valves, it is not. Operating instructions are written for many of these activities, but otherwise this is a relatively unexplored area. Fault correction is one of the activities for which interlocks may be provided. Conventional interlocks were described in the previous chapter and developments in computer software interlocks are outlined below. Some fault conditions, however, require plant shut-down. Although fault administration has been described in terms of successive stages of detection, diagnosis and correction, in emergency shut-down usually little diagnosis is involved. The shut-down action is triggered directly when it is detected that a critical process limit has been passed. There are differing philosophies on the problem of allocation of responsibility for shutting down the plant under fault conditions.
In some plants the operator deals with fault conditions with few automatic aids and is thus required to assure both safety and economic operation. In others, automatic protective systems are provided to shut the plant down if it is moving close to an unsafe condition and the operator thus has the economic role of preventing the development of conditions which will cause shut-down. In a plant without protective systems the operator is effectively given the duty of keeping the plant running if he can, but shutting it down if he must. This tends to create in his mind a conflict of priorities. Usually he will try to keep the plant running if he possibly can and, if shut-down becomes necessary, he may tend to take action too late. Mention has already been made of the human tendency to gamble on beating the odds. There are numerous case histories which show the dangers inherent in this situation (Lees, 1976b). The alternative approach is the use of automatic protective systems to guard against serious hazards in the plant. The choice is made on the basis of quantitative assessment of the hazards. This philosophy assigns to the operator the essentially economic role of keeping the plant running. Although the use of protective systems is rapidly increasing, the process operator usually retains some responsibility for safe plant shut-down. There are a number of reasons for this. In the first place, although high integrity protective systems with 2/3 voting are used on particularly hazardous processes (R.M. Stewart, 1971), the majority of trip systems do not have this degree of integrity. The failure rate of this simple 1/1 trip system has been quoted as 0.67 faults/year (Kletz, 1974d). Protective systems have other limitations which apply even to high integrity systems. One is that it is very difficult to foresee and design for all possible faults, particularly those arising from combinations of events. 
It is true, of course, that even if a process condition arises from an unexpected source a protective system will usually handle it safely. But there remains a residual of events, usually of low probability, against which there is no protection, either because they were unforeseen or because their probability was estimated as below the designer’s cut-off level.

Another problem is that a protective system is only partially effective against certain types of fault, particularly failure of containment. In such an event the instrumentation can initiate blowdown, shut-off and shut-down sequences, but while this may reduce the hazardous escape of materials, it does not eliminate it.

Yet another difficulty is that many hazards occur not during steady running but during normal start-up and shut-down or during the period after a trip and start-up from that condition. A well designed protective system caters, of course, for these transitional regimes as well as continuous operations. Nevertheless, this remains something of a problem area.

Even with automatic protective systems, therefore, the process operator tends to retain a residual safety function. His effectiveness in performing this is discussed later.

14.10 Malfunction Detection

Another aspect of the administration of fault conditions by the control system is the detection of malfunctions, particularly incipient malfunctions in plant equipment and instruments. These malfunctions are distinguished from alarms in that, although they constitute a fault condition of some kind, they have not as yet given rise to a formal alarm.

Malfunction detection activities are not confined to the control system, of course. Monitoring of plant equipment by engineers, as described in Chapter 19, is a major area of work, usually independent of the control system. Insofar as the control system does monitor malfunctions, however, this function is primarily performed by the operator.
The contribution of the computer to malfunction detection is considered later.

Detection of instrument malfunction by the operator has been investigated by Anyakora and Lees (1972a). In general, malfunction may be detected either from the condition of an instrument or from its performance. Detection from condition is illustrated by observation on the plant of a leak on the impulse line to a differential pressure transducer or of stickiness on a control valve. Detection from performance is exemplified by observation in the control room of an excessively noise-free signal from the transducer or of an inconsistency between the position of a valve stem and the measured flow for the valve.

Most checks on instrument condition require the operator to visit the equipment and use one of his senses to detect the fault. Most performance checks can be made from the control room by using instrument displays and are based on information redundancy.

Some of the ways in which the operator detects malfunction in instruments are illustrated by considering how he uses for this purpose one of his principal detection aids, the chart recorder. Some typical chart records are shown in Figure 14.6. The operator detects error in such signals by utilizing some form of redundant information and making a comparison. Some types of redundant information are (1) a priori expectations, (2) past signals of the instrument, (3) duplicate instruments, (4) other instruments, and (5) control valve position.

Thus it may be expected a priori that an instrument reading will not go ‘hardover’ to zero or full scale, that it will give a ‘live’ rather than a ‘dead’ zero, that it will exhibit a certain noise level, that its rate of change will not exceed a certain value and that it is free to move within the full scale of the instrument. On the basis of such expectations, the operator might diagnose malfunction in the signals shown in Figures 14.6(b-f).
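The a priori expectations just listed can be written down as simple tests on a window of recorded values. The following is a minimal sketch only, not a method given by Anyakora and Lees: the function name `a_priori_checks`, the instrument span of 0 to 100 and all threshold values are hypothetical and would have to be set for each individual measurement.

```python
from statistics import pstdev


def a_priori_checks(readings, span=(0.0, 100.0), edge_tol=0.5,
                    min_noise=0.05, max_step=10.0):
    """Return labels for a priori expectations that a signal violates.

    All thresholds are illustrative assumptions, not values from the text:
    edge_tol  - how close to zero/full scale counts as 'hardover'
    min_noise - smallest standard deviation expected of a live signal
    max_step  - largest credible change between successive samples
    """
    zero, full_scale = span
    flags = []
    # Hardover to zero or full scale (compare Figure 14.6(b)).
    if all(abs(r - zero) <= edge_tol for r in readings):
        flags.append("reading zero")
    if all(abs(r - full_scale) <= edge_tol for r in readings):
        flags.append("reading full scale")
    # Excessively noise-free signal (compare Figure 14.6(c)).
    if pstdev(readings) < min_noise:
        flags.append("reading constant")
    # Rate of change beyond what is expected (compare Figure 14.6(e)).
    steps = [abs(b - a) for a, b in zip(readings, readings[1:])]
    if steps and max(steps) > max_step:
        flags.append("rate of change excessive")
    return flags


normal = [50.0, 50.4, 49.7, 50.2, 49.9]
dead_zero = [0.0, 0.0, 0.0, 0.0, 0.0]
displaced = [50.0, 50.3, 49.8, 75.1, 74.9]
print(a_priori_checks(normal))
print(a_priori_checks(dead_zero))
print(a_priori_checks(displaced))
```

A returned flag indicates only that an expectation has been violated. Whether this amounts to a malfunction still depends on the operating conditions and on the past behaviour of the particular instrument.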
However, the firmness of such expectations may vary with the plant operating conditions. For example, during startup, zero readings on some instruments may be correct.

It may not be possible to decide a priori what constitutes a reasonable expectation. The level of noise, for example, tends to vary with the individual measurement. In this case, the operator must use his knowledge of the range of variation of the noise on a particular instrument in the past. Thus Figures 14.6(c) and 14.6(d) might or might not indicate malfunction.

If there is a duplicate instrument, then detection of the fact that one of the instruments is wrong is straightforward, although it may not be possible to say which. However, duplication is not usual in the normal instrument systems which are of primary concern here. On the other hand, near duplication is quite common. For example, the flow of a reactant leaving a vaporizer and entering two parallel reactors may be measured at the exit of both the vaporizer and the reactors, and the flow measurement systems provide a check on each other.

What constitutes a reasonable signal may depend on the signals given by other instruments. Thus, although a signal which exhibits drift, such as that in Figure 14.6(g), may appear incorrect, a check on other instruments may show that it is not.

Some types of variation in a signal, such as a change in the noise level, appear easy to detect automatically. Others, such as that shown in Figure 14.6(i), are probably more difficult, especially if their form is not known in advance. Here the human operator with his well-developed ability to recognize visual patterns has the advantage.

There are a number of ways in which the readings of other instruments can serve as a check. Some of these are (1) near-duplication, (2) mass and heat balances, (3) flow-pressure drop relations, and (4) consistent states.
This last check is based on the fact that certain variables are related to each other and at a given state of operation must lie within certain ranges of values.

The position of control valves also provides a means of checking measurements. This is most obvious for flow measurement but it is by no means limited to this.

These remarks apply essentially to measuring instruments, but checks can also be developed for controllers and control valves. A general classification of instrument malfunction diagnoses by the operator is shown in Table 14.6.

Many of the checks described do not show unambiguously that a particular instrument is not working properly; often they indicate merely that there is an inconsistency which needs to be explored further. However, this information is very important.

Another kind of information which the operator also uses in checking instruments is his knowledge of the probabilities of failure of different instruments. He usually knows which have proved troublesome in the past.

The detection of instrument malfunction by the process operator is important for a number of reasons. Instrument malfunctions tend to degrade the alarm system and introduce difficulties into loop control and fault diagnosis. Their detection is usually left to the operator and it is essential for him to have the facilities to do this. This includes appropriate displays and may extend to computer aids.
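Two of the redundancy checks described above, near-duplication of flow measurements and consistency between a flow reading and control valve position, can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names, the 5% tolerance, and the thresholds for a closed valve and a zero flow are all hypothetical, and the near-duplicate check assumes that the flows measured on the parallel streams should sum to the upstream flow.

```python
def near_duplicate_check(upstream_flow, branch_flows, rel_tol=0.05):
    """Return True if an upstream flow agrees with the sum of the flows
    measured on the parallel branches, within a relative tolerance.

    rel_tol is an assumed allowance for normal measurement error.
    """
    total = sum(branch_flows)
    scale = max(abs(upstream_flow), abs(total), 1e-9)  # avoid divide by zero
    return abs(upstream_flow - total) / scale <= rel_tol


def valve_flow_check(valve_open_fraction, flow,
                     closed_below=0.02, zero_flow=0.5):
    """Compare a flow reading with the control valve position.

    Returns a description of the inconsistency, or None if consistent.
    closed_below and zero_flow are assumed thresholds for treating the
    valve as closed and the flow reading as zero.
    """
    valve_closed = valve_open_fraction <= closed_below
    flow_is_zero = flow <= zero_flow
    if valve_closed and not flow_is_zero:
        return "flow not zero with valve closed"
    if not valve_closed and flow_is_zero:
        return "flow zero with valve open"
    return None


print(near_duplicate_check(100.0, [49.0, 50.0]))
print(near_duplicate_check(100.0, [49.0, 30.0]))
print(valve_flow_check(0.0, 12.0))
print(valve_flow_check(0.6, 0.0))
```

As with the operator's own checks, a failed comparison shows only that an inconsistency exists. It does not say whether the fault lies in the measurement, in the valve, or in one of the other instruments used for the comparison.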
Figure 14.6 Typical chart recorder displays of measurement signals (Anyakora and Lees, 1972a): (a) normal reading; (b) reading zero; (c) reading constant; (d) reading erratic; (e) reading suddenly displaced; (f) reading limited below full scale; (g) reading drifting; (h) reading cycling; (i) reading with unusual features (Courtesy of the Institution of Chemical Engineers)

Table 14.6 General classification of methods of instrument malfunction detection (Anyakora and Lees, 1972a) (Courtesy of the Institution of Chemical Engineers)

Measuring instruments
M1    Measurement reading zero or full scale
      M1.1   Reading zero
      M1.2   Reading full scale
M2    Measurement reading noise or dynamic response faulty
      M2.1   Reading constant
      M2.2   Reading erratic
      M2.3   Reading sluggish
M3    Measurement reading displaced suddenly
      M3.1   Reading fell suddenly
      M3.2   Reading rose suddenly
M4    Measurement reading limited within full scale
      M4.1   Reading limited above zero
      M4.2   Reading limited below full scale
M5    Measurement reading drifting
      M5.1   Reading falling
      M5.2   Reading rising
M6    Measurement reading inconsistent with duplicate measurement
M7    Measurement reading inconsistent with one other measurement
      M7.1   Reading inconsistent with near-duplicate measurement
      M7.2   Reading inconsistent with level-flow integration
      M7.3   Reading otherwise inconsistent
M8    Measurement reading inconsistent with simple model
      M8.1   Reading inconsistent with mass balance
      M8.2   Reading inconsistent with heat balance
      M8.3   Reading inconsistent with flow-pressure drop relations
      M8.4   Reading otherwise inconsistent
M9
      M9.1   Reading zero but variable not zero
      M9.2   Reading not zero but variable zero
      M9.3   Reading low
      M9.4   Reading high
      M9.5   Reading otherwise inconsistent
M10   Measurement reading inconsistent with control valve position
      M10.1  Reading (flow) zero with valve open
      M10.2  Reading (flow) not zero with valve closed
      M10.3  Reading otherwise inconsistent
M11   Measurement reading periodic or cycling
M12   Measurement reading showing intermittent fault
M13   Measurement reading faulty
M14   Measurement instrument tested by active test
M15   Measuring instrument condition faulty

Control action and controllers
C1    Control action faulty
      C1.1   Control erratic
      C1.2   Control sluggish
      C1.3   Control cycling
      C1.4   Control unstable
      C1.5   Control error excessive
      C1.6   Control otherwise faulty
C2    Controller performance faulty
C3    Controller tested by active tests
C4    Controller condition faulty

Control valves and valve positioners
V1    Valve position inconsistent with signal to valve (this requires independent measurement of position)
V2    Valve position inconsistent with flow (but not necessarily flow measurement)
      V2.1   Valve passing fluid when closed
      V2.2   Valve not passing fluid when open
      V2.3   Valve position otherwise inconsistent with flow measurement
      V2.4   Valve position inconsistent with one other measurement
      V2.5   Valve position inconsistent with simple model
V3    Valve not moving
      V3.1   Valve stays closed
      V3.2   Valve stays open
      V3.3   Valve stays part open
V4    Valve movement less than full travel
      V4.1   Valve not closing fully
      V4.2   Valve not opening fully
      V4.3   Valve travel otherwise limited
V5    Valve movement faulty
      V5.1   Valve movement erratic
      V5.2   Valve movement sluggish
      V5.3   Valve movement cycling
V6    Valve movement otherwise faulty
V7    Valve or positioner performance faulty
V8    Valve or positioner tested by active tests
V9    Valve or positioner condition faulty

14.11 Computer-based Aids

There are some functions which the computer can perform automatically, whereas there are others which at present are performed by the operator, but for which computer aids have been, or may be, developed to assist him. Some computer-based aids to assist the operator which have been described include: (1) system state display; (2) alarm diagnosis; (3) valve sequencing; (4) malfunction detection.