|At time of publication:|| HCI Group and Dept. of Computer Science,
University of York
Frequently only steady state functionality is considered when specifying and documenting interactive systems, the lag between user's commands and the system's response is ignored. Various compromises are used when implementing these systems in order to approximate the ideal of an infinitely fast machine in the real situation. Unfortunately, in this process, properties such as "what you see is what you have got" may be lost. Such problems are especially bad when applications are embedded in surrounding systems. This paper addresses these problems with the aid of a simple formal model which clarifies typical system behaviour and describes what information is required by the user. The appearance of such information is discused, critically analysing existing techniques (e.g. wait cursors) and proposing novel ones (e.g. munchman buffers), and the demands that such techniques put on the surrounding systems are examined.
keywords: interactive systems, timing properties, formal methods, synchronization.
Most specification and documentation of interactive systems covers only steady state functionality. That is, the effect of each user command on the state and display of the system is described, but the effect of lags between the entry of these and when the system actually responds is ignored. Effectively, the system is seen as executing on an infinitely fast machine! The obvious exceptions to this are those systems which explicitly deal with real time phenomena; response timing, games, simulations etc. However it is those systems whose real time behaviour is incidental which this paper addresses.
There are good reasons for ignoring real time behaviour. Specifications are deliberately aimed at some aspect of the functionality of a system, and from this abstraction they obtain their power of concise expression. It is therefore important that any considerations of non-steady state behaviour do not clutter up unduly these abstractions, and can be considered independently. For this reason our work at York on abstract models [1,2,3] has deliberately ignored timing issues, as, for example, have the various contributors on formal methods to HCI'86 .
Again documentation is often complex enough anyway. It is comparatively easy to describe steady state behaviour in terms of changes in the objects manipulated and snap-shots of screens, but it is both difficult and probably confusing to describe dynamic behaviour without on-line or video presentations. Further, the exact dynamic behaviour may depend on the run-time environment and machine whereas manuals are often written in a machine independent fashion.
Both these considerations prompt one to look for ways of dealing with real time behaviour which are both simple, and can be factored out from the main time-independent description of functionality. The next section examines various system compromises that are used to approximate the ideal machine (buffering and intermittent and partial update), and the problems these give rise to. Section 3 introduces a simple formal model which includes temporal behaviour and uses it to give precise descriptions This will give us a framework within which to make precise the problems which arise and the possible solutions. In particular, a precise definition is given of the steady state functionality of a system and of typical system behaviour, such as buffering, as they appear to the user, without regard to a particular model of implementation. This model is then used as a basis for proposing properties that a dynamic system must obey if its steady state functionality is to be usable. Various concrete expressions of these properties are discussed in section 5, both existing techniques and novel, including analysis of their user impact. These techniques put demands on the surrounding system software and hardware, these are outlined in section 6 showing where changes are needed in order to put these proposals into practice. Finally, these features are brought together in a design approach where the dynamic aspects of system behaviour are considered separately from the steady state functionality.
In reviewing the research into computer response times, Shneiderman  finds that, with a few exceptions, faster is better. This is certainly the general approach taken when implementing interactive systems. At first, care is taken to obtain general efficiency, concentrating on those areas where serious lags are expected. Then, where this proves insufficient, either because the computer is too slow, or because the output device cannot handle the rate of refresh, various additional strategies may be used.
The most common such strategy is for either the application or its surrounding environment (operating system, window manager etc.) to provide buffering to avoid loss of user commands. So normal has buffering become that its absence can be quite unnerving. Slightly less frequently seen are intermittent update strategies where the application only occasionally updates the screen , for instance after receiving 10 scroll-up-line commands the system may decide to only display one screen scrolled up 10 lines, ignoring the intermediate stages. Some versions of the word-processor Wordstar augment this by adopting a partial update strategy, where it attempts to keep the current cursor line permanently up to date but only updates the rest of the screen intermittently as it has the time. Both intermittent and partial update are particularly useful when the output device is the limiting factor. They also save some processor time. I have suggested non-deterministic intermittent update  to reduce further processor time but the analysis presented below does not cover this more complex case.
Not only can dynamic behaviour be confusing for the user, but it can also seriously undermine various desirable properties. A system that could be described as "what you see is what you have got"  on an ideal machine may well not be so on a real buffered machine, as the lag means that the current context as measured by the commands entered so far but still unprocessed, is not the context displayed. A case of "what you see is what you had"!
Problems particularly arise when applications are embedded within a surrounding environment; for instance, multi-processing often leads to long delays which would invalidate a partial update strategy. Further, the environment's strategies may conflict with those of the application, for example the operating systems buffering and the applications buffering may interact badly.
The worst problems occur when the user feedback loop is important, in other words the most interactive systems. Key-ahead is usually no problem so long as delays are not too long, but cursor movement and mouse positioning are far more critical. When using cursor keys, delays in cursor movement may lead to entry at the wrong point, and in the worst case cursor tracking. This is the phenomenon where, because of display lag after using cursor keys, the cursor is moved too far and misses its target, then in moving back it again overshoots..... Users develop strategies of their own for dealing with this, moving one character in the opposite or an orthogonal direction after a long sequence of moves, or even inserting a character just to see if the cursor has stopped. These strategies become so automatic that some users claim that cursor tracking is therefore not a problem. On the other hand, for small moves, feedback is not so critical, correcting typing mistakes on the fly is relatively easy since the number of moves to make is known and can be an almost automatic reaction.
Mouse-based systems pose more serious problems. Some systems obtain the mouse position independently of button clicks, and thus, for instance, in a drawing application, depressing a button and moving will result in a start point somewhere along the mouse's path. Users of such systems get used to a "click and hold" strategy when drawing or manipulating icons and menus. Because the meaning of a mouse action is so heavily dependent on the display context, some systems disallow mouse-ahead. Again this leads to a "click and wait for confirmation" strategy for the user. The task of accurately positioning the mouse-cursor already places high cognitive demands on the user, which are increased if the user must be constantly vigilant to check that the various mouse actions had the intended effect. This is stressful, makes it difficult for the user to think ahead, and prevents the user developing effective motor responses. In particular, the illusion that the mouse is an extension of the user  is destroyed.
In this section we are going to develop an abstract formal model of interactive systems, taking into account the timing of input and output. We will use this to define explicitly what is meant by steady state functionality, to describe the various departures from this ideal functionality and the various observability constraints that are required to make such a system usable. This approach is justified more fully elsewhere , but in short, models are formal so that they are amenable to analysis and so that actual systems can be verified against them, and are abstract so that they are general and can thus be applied to a large class of real systems. Further, in line with previous work, the model will as far as possible be a surface model, in the sense that it will try to describe the system from the outside. For example, the definition of a perfectly buffered system is not in terms of a physical representation, such as a block of memory in which key-strokes are stored, but in terms of the behaviour of the system. This approach both avoids odd hidden problems (eg. the system may save key-strokes, then later ignore them), but also by staying outside the system is a description of its interface only and has therefore a greater validity for hci.
There are many possible formulations, and the one presented here is chosen because it is most similar to our previous work. We consider a sequence of discrete time intervals. At each time interval the user may enter a single command (from a set C) or do nothing (represented by - a tick). Also, with each time interval, we associate a current display (from a set D), however we assume that this occurs just after the input for that interval, so that there is time for a fast computer response, but so that the gap is imperceptible to the user. Thus the display in the same interval can be seen as the instant response, but lags are represented by responses several time steps later.
The display is obtained from the current machine state (from a set E), by a function display: E D. This state comprises all the computer's memory of the past. The relation of this state to the input history can be represented in two ways, either by considering the whole history at once, or by an incremental approach. The first method calls for an interpretation function (I) which gives the state due to any input history. The input history (from a set H) will consist of sequences of commands (from C) and ticks ().
I: H E
The alternative is to use a state transition function, which, given the input for any time step and the last state, gives the new current state.
doit: ( C ) E EThese two representations are of course not independent and are linked by the equations:
I() = einit - the initial state I(h::c) = doit ( c, I ( h ) )
( Note. :: is the concatenation operator so that "abc"::"d" = "abcd", and  represents the empty sequence, in this case the history where nothing has yet happened.)
That is the state obtained for a history is obtained by starting at the initial state and repeatedly applying doit.
To talk about steady state functionality, we need an idea of when the system has stopped changing. This is clearly when further time intervals with no user input (i.e. s) don't alter the state.
E stable doit( , e ) = e
If the state does not change, then the display, which is obtained from the state clearly doesn't change either. However, just because the display is stable, it does not follow that the state is. For instance, in a multi-processing environment, a background task may well be altering the file system but not produce any visual effect, or, when a large global edit is being performed, there may not be any indication of completion, or, the most extreme case, consider a user response timer continually ticking away with no visible change until the user responds and the delay is recorded.
We will define steady state functionality by associating it with the response obtained by a patient user who always waits for the system to stabilise before entering more commands. Unfortunately, from the last example, we see that there are in fact systems that never stabilise, another similar example would be an alarm clock. However it is usually the case that either the system does always stabilise, or that the parts that do not, can be abstracted away. Making this assumption, we can associate with any sequence of inputs a longer sequence with additional ticks on the end, which gives rise to a stable state. These ticks can be thought of as the time a patient user would have to wait.
steady : H H
steady ( h ) = h :: w
where w is the shortest sequence of s such that I ( h :: w ) is stable.
If the machine were infinitely fast, it would reach a stable state instantly, and w would be empty. As there would be no time to type-ahead, all users would effectively be patient. For any sequence from H, we can obtain the sequence that a patient user would have entered.
patient: H H
patient(  ) = steady(  ) patient( h :: ) = patient( h ) patient( h :: c ) = steady( patient( h ) :: c ) when c
We can now define the steady state functionality to be exactly that which the patient user would obtain. That is I composed with patient. This is precisely the functionality defined by most specifications and described in most documentation, and is therefore the behaviour that should be validated against them.
How does this differ from the actual functionality observed by a less patient user? To see this, we compare the steady state obtained from a sequence of commands (and ticks) H with that obtained if the user had been patient. That is:-
I ( steady( h ) ) compared with I ( patient( h ) )If these are equal, then the system obtains the same steady state no matter how fast the user enters commands. That is equality in the above is the definition of a perfectly buffered system. Note that this definition is purely in terms of the behaviour of the system, it makes no reference to a buffer as an implementation device. Later on we shall consider departures from perfect buffering.
We can also examine total versus intermittent display update using this scheme. We say a system has a total update strategy if all steady state displays occur in the actual display sequence.
h H h0 h1 h2 ... hn h
s0 s1 s2 ... sn steady( p )
st i display( steady( hi ) ) = display( si )
(Here is taken to mean initial subsequence of, so that "ab" "abc" but not "bc" "abc")
A large proportion of systems are perfectly buffered and adopt a total update strategy, this is because they are effectively programmed for an ideal machine and buffering is supplied by the environment. The environment can supply degrees of automatic intermittent update in addition to buffering, for instance TIP , a screen control package, attempts to suppress display update when there is input pending.
The decision on whether to update or not is not just one of efficiency. Some intermediate displays convey information by their dynamic behaviour, for instance seeing a display scroll up line by line gives the user more of an impression of direction than a sudden jump. On the other hand if the user has typed-ahead it is likely that any such feedback will be too late!
As we have said in section 2, the interpretation that the user gives to commands depends on the display context. This problem can be addressed in two ways.
Typically a system will use a mixture of the two, in fact, we will see that responses of the first kind necessitate the provision of specific information for the user.
In the first category, the most obvious example is disallowing any mouse-ahead. This amounts to:-
h steady( h ) and m mouse commands I ( h :: m ) = I ( h :: )
This says that when we are not in steady state, any mouse command is treated just like a tick, an interval with no input. That is mouse commands are ignored out of steady state. This may be too rigid however and mouse ahead may well be acceptable if the displayed context when a command is issued is the same as the context when it is eventually applied. For instance, if there are two independent editing windows, mouse-ahead in one would be allowable even if there were commands outstanding for the other. Even if the contexts are different, the denotation in the context of invocation may still have meaning at the time of application. For instance, if there are still keyboard entries outstanding for a find/replace buffer and the user wants to initiate the action, then, if a simple function key was being used, we would expect most applications to queue it; it seems reasonable therefore that the lexical change to a find/replace button icon should behave similarly, allowing mouse-ahead.
A system that disallows some commands, but not others can be represented by a function noted:
noted: H H
h H let n = noted( h ) n is a subsequence of H and I ( steady( h ) ) = I ( patient( n ) )
This says that for any history, we can associate a shorter history of commands that have been taken note of. These noted commands are a subsequence of the original commands (dropping out the ignored ones), and the behaviour of the system is the same as if only the noted commands had been patiently entered. Note the similarity between this and the definition of perfect buffering in section 3.3. This similarity is because systems that posses such a noted function will obey the exception rule "on exception do nothing" , which suggests that when exceptions occur a system should not perform a special action, but instead merely ignore the command that raised the exception.
The user should be able to discern what the correct context is, in particular, is this steady state? This decision must be possible from the contents of the current display, requiring the existence of a decision procedure for the user, is_steady?.
is_steady? : D Bool
e E is_steady? ( display( e ) ) = stable( e )
Although cast as a user decision function, its importance is of course the information it requires of the system. This information could be positive (e.g. a status light when the system is stable), or negative (a red cursor when its not), formally both give equivalent information. Note however that the system response producing the information must be immediate, it is no use waiting for steady state for information to say the system wasn't in steady state when you typed the command! This is why the definition refers to the current state and its display. Some systems are always in steady state except when the screen is actually changing (e.g. cursor moving or text scrolling), and thus lack of change indicates steady state. We could modify the above condition to look at several displays, however it is reasonable to maintain the definition and regard change as being an attribute of a single time frame.
The user needs to determine whether the current context is steady, both to make valid decisions based on display, and to predict dynamic response for cases like mouse-ahead. In the latter situation we see that there is interference between the computer's strategies for dealing with dynamic behaviour and the user's. For instance, if a user changes windows, the window manager might throw away key-ahead until the window has been displayed, to avoid accidental typing to the wrong window. This could be very annoying if the user does not know when type-ahead can begin again.
If the simple strategy of no mouse-ahead is used then the existence of is_steady? is sufficient. If, on the other hand, one of the context sensitive mouse-ahead strategies is used a more complex observability criterion must be introduced. This will be related to the function noted introduced above. So that the user can know whether a command has been noted or not, we need a user predicate similar to is_steady?. We will call this is_noted?.
is_noted? : D Bool
h is_noted? ( display ( I ( h::c ) ) ) iff noted( h::c ) = noted( h ) :: c
That is is_noted? tells us immediately whether or not our command will have an effect in steady state. An example of this would be a buzzer sounding whenever a mouse-click has been ignored. Perhaps even better would be to predict before hand.
will_be_noted?: D C Bool
h will_be_noted? ( display ( I ( h ) ), c ) iff noted( h::c ) = noted( h ) :: c
That is will_be_noted? tells us before we enter a command whether or not that command will have an effect. An example of this would be hour-glass mouse icons .
If the user is able to tell from the display whether or not a command has taken effect, (but not of course whether the effect has been as expected, which requires steady state), then certainly the system needs (in some sense) to know. This disallows retroactive decisions, such as circular overwriting keyboard buffers (as in some UNIX systems). In practice this will mean being conservative about allowing mouse-ahead, only accepting it if we can be sure that the current context and the steady state context are in the right relation.
Note also, that the information for these decision functions must be available all the time. In particular, it cannot be subject to intermittent or partial display strategies.
By considering the problem of real-time behaviour formally we have:
Supplying sufficient information to ensure the existence of such decision procedures is not sufficient, the information must also be in a form comprehensible and noticeable to the user. We should also take account of whether we expect the user to be acting consciously or automatically, and where we expect the centre of attention to be. For example, a user having just asked for a program to be compiled, would expect it to take a while, and thus would consciously look out for indications of completion (e.g. a new prompt); on the other hand, if the user were typing fast into a word processor, we could not expect the same degree of awareness. Again, if mouse-ahead is prohibited, flashing the mouse cursor would be an acceptable signal since attention is likely to be centred on the mouse, yet signifying a locked keyboard by a flashing LED on a key is unacceptable for touch typists. Monk  discusses this and similar issues when considering signaling modes to the user. In fact the various conditions considered here (the existence of type-ahead and ignoring commands) could be thought of as a form of modiness.
We look first at information to represent steady state. Here there is a well-established use of hour-glass, busy bee or watch icons, substituted for the normal mouse icon, when the computer is busy. However these tend only to be used when some significant pause such as file transfer or global editing, is occurring. When used they tend to signify that either mouse input or possibly all input is being ignored. Thus they fall closer to the will_be_noted? than the is_steady? information. However on such systems, it is often the case that mouse clicks can be ignored in other contexts also. The mouse icon could be used consistently in all non-steady state situations though, especially if one is using the rule of ignoring all such mouse input. This would only be acceptable if we could be sure that in all critical situations attention is focused on the mouse cursor, which does fortunately seem quite likely. The alternative centre of attention is the text entry point or current selection (the only one in non-mouse systems). Similar mechanisms can be used here, however on more conventional terminals it is unlikely that there is much choice of cursor attributes, the one choice that is often available is between flashing and steady cursor, and this could be quite appropriate.
The disadvantage of making steady state information so prominent is that it could be a significant distraction: imagine the cursor changing icon for a fraction of a second on each keystroke! If we expect most uses of steady state information (as opposed to explicit will_be_noted? information) to be in situations where users expect problems, then it could be consigned to a status line, either as a simple flag or more elaborately. Perhaps the most extreme form of such information would be the munchman buffer. As you type, the characters appear on the bottom line of the screen. As the application deals with each character, it is consumed in the bottom left corner (by the munchman!). Although slightly strange, such an interface might be useful in certain circumstances, such as command interfaces. Perhaps more likely, particularly where quite powerful personal computers are used as terminals, would be to have an icon associated with a window representing the state of the interaction. In its collapsed form it could have a small bar indicating the fullness of the buffer, then, when expanded, more information could be available, in particular this could be combined with the status information for a terminal emulator. This would again be of particular value with command based interfaces, it allows the user to determine steady state, to make predictions about the context where commands will be interpreted, and gives feedback as to progress which can be very important if response is slow .
If we consider mouse-ahead acceptability, or keyboard locking, then changing mouse or cursor attributes again becomes by far the best solution. For the mouse these attributes will be context sensitive, so that the mouse may display a locked form when over a text window with outstanding updates, but be in its normal state when moved over a control panel where mouse input is acceptable. This effectively supplies us with the will_be_noted? information. However, even though it may be safe to assume the user's attention is focused on the mouse/cursor, merely visual cues may not be noticed if the user is acting semi-automatically. Thus it is insufficient only to supply will_be_noted? information, feedback (preferably aural) is required when the user makes a mistake. That is, we need to also have the is_noted? information, and in a different medium. The usual objections to such feedback, (people don't like machines that beep at them), don't hold here as there is sufficient visual warning before the fact: the self-aware user can use the visual will_be_noted? information and thus need never hear a thing.
If any of the strategies proposed or any alternatives were adopted, and were only partially correct, then they may be worse than useless. Users may be lulled into a false sense of security and then make mistakes which would go unnoticed because of the reduced attention, or they may eventually come to ignore the feedback considering it an unreliable and therefore irrelevant distraction.
Successful implementation depends on the mutual cooperation of the application, the environment (operating system, window manager etc.) and the interface peripherals. We will see that one recalcitrant partner can spoil the system. Any of these parties, or a combination of them, can take the initiative in the process. Thus we will consider each of these (environment, application and peripheral) in turn.
Except where communication is over large slow networks, we can assume that the operating system has almost immediate control over its interface hardware. It may not be able to refresh an entire screen in an imperceptible interval, but at least it will be able to guarantee immediate control over some portion of the screen, enough to display some status information. Several problems may arise however if the application does not cooperate.
If the application buffers input for itself (e.g. in multiuser systems where system calls are expensive), the environment cannot know how much has been used and hence can only give coarse estimates to the user (ie no per character munchman buffers). In single user systems (for instance CP/M) system calls are usually on a per character basis and the system can thus keep a more exact track of input pending. In both cases if the operating system allows polling of input then there is the possibility of busy waits, not allowing the environment to detect steady state. This will happen, in particular, when the application is attempting to perform screen update optimisations, as it will poll to decide whether or not to update: this is a case of antagonistic temporal strategies. Clearly the application must pass some message back, for instance guaranteeing to wait on input eventually. More problems arise when considering non-noted events. Unless there is a system-wide policy, such as ignoring all mouse-ahead always, only the application can know the allowable contexts. This is alleviated if the operating system includes a rudimentary user interface management system . The actual application would be perfectly buffered, and a separate lexical/syntactic module would determine acceptable events, this second module guaranteeing to run fast enough to be effectively instantaneous.
There are several reasons for investing temporal control in the application. It has most knowledge about the way it will be used and what sort of displays and feedback are appropriate and consistent with the rest of its operations. Further, the turnover of applications is far greater than that of operating systems and thus new ideas can be included more quickly. Finally, and more tentatively, similar to the desire for consistency across a system, is the desire for consistency of the same application on different systems.
The application is frequently prevented from exercising control by the primitives supplied by the environment. On the input side, only blocking input may be supplied, not allowing the application to decide whether there is any pending input. This is true for example, of older Unix systems. In these cases intermittent update strategies are often possible by requesting large chunks of input and updating on each chunk, however steady state indicators are very difficult, even when scheduling can be guaranteed to be frequent.
Scheduling problems combine with the paucity of primitives to prevent proper synchronisation of input and output. Imagine we are about to output a screen that comprises a major context shift, and we wish to ignore all outstanding input. The operating system provides a
flush_os_buffers call that does immediately flush all of its outstanding events. The following sequence is no good.
flush_os_buffers() ; update( screen ) ;
The application may be descheduled between the two lines and further input may be entered before the screen is displayed. Even if there is no delay, the update itself may take some time and still unwanted input may be accepted. However if we reverse the order the situation is worse.
update( screen ) ; flush_os_buffers() ;
If there is descheduling, the user's input will be ignored, despite the fact that a new screen is displayed which, among other things, will imply that input is acceptable. On the other hand if there is no delay, the call to
update is likely to mean that the request has been placed in the appropriate queue, but not that it has completed, thus getting the same problems as the previous order. These problems are particularly evident with bitmap displays where the refresh time can be significant.
Where events are time-stamped, we might try to compare these against the time of last update, however obtaining this time falls into the same problems as before. It is only the front-end software that can meaningfully synchronise input and output and thus this must supply the relevant information. The simplest solution therefore, which could be provided at little cost, is to have the window manager or tty-driver, upon request, insert an event into the input stream when an update completes. The application would then know that any events received before this were entered while the screen was not up to date.
Poorly designed interface hardware could destroy attempts at good temporal behaviour, perhaps by having uncontrollable buffering, or insufficiently rich operations; however in practice this is usually an annoyance, but not totally destructive. On the other hand, the peripherals rarely contribute anything positively. This need not be the case, terminals are getting increasingly intelligent and this intelligence can be used. To a large extent, a terminal taking control has the same problems with the underlying system as the operating system has with the application, however there are cases where it could be worthwhile despite this. For instance, where a PC is acting as a terminal over slow lines or a network, it may well be worth reporting the state of its buffers even when there may be additional buffering in the host. Perhaps, more importantly, the terminal can cooperate either with the operating system to take workload from it, or with the application despite a possibly antagonistic operating system. This latter case is most interesting, as it can overcome the technical inertia of the operating system. For instance, synchronisation, such as suggested above, where output events generate input tokens, can be accomplished trivially (in fact some vdus have a who-are-you request which can be used for this purpose). Instant feedback is more difficult as this has to be handled by the terminal if fast scheduling cannot be ensured. However the solutions proposed for environment control apply here. Either catch-all policies such as no mouse-ahead, or simple descriptive mechanisms such as region sensative mouse-ahead could be provided by appropriate control codes. The use of PCs as terminals, specially designed programmable terminals such as TIN  or intelligent workstations such as the BLIT  allow the possibility of downloading programs of significant complexity which could in particular handle the parts of an interface requiring close synchronisation and rapid response, leaving a bare-bones application in the host.
The above discussion shows that it is possible to talk about and build fully temporal systems, by embedding a steady state design in a real temporal system. From the above discussion, we have the beginnings of a design approach.
Stages 2 to 4 are considered separately to stage 1. In particular they would be specified and probably documented separately, there should be support for them from packages, the environment, the peripherals or some combination and further there may be system wide policies for them, especially for presentation (stage 4).
Following such a design approach should lead to systems with real time properties that are easier to specify, easier to document and easier to use.
This work was supported by a UK Alvey Directorate grant GR/D/0231.7, and I am indebted to various members of the HCI group at York for their comments and help.
Alan Dix 20/8/97