5 Estimating Oculomotor Events from Raw Data Samples
In this chapter, we will present the algorithms responsible for calculating fixations, saccades, smooth pursuits, and other events directly from the raw data samples. This chapter is organized as follows.
• In Section 5.1 we introduce the illusively simple calculation of fixations and saccades, illustrated by manufacturers' analysis software.
• Section 5.2 introduces what the algorithms have to work on, namely position, velocity, and acceleration data. We classify the events to be detected and the algorithms that do it.
• In Section 5.3 we provide a condensed list of hands-on advice for the beginning user of an event detection algorithm.
• Current algorithms are far from perfect. In Section 5.4, the major challenges are listed. The selection of settings is given particular emphasis. Read this if you want to know in detail what the algorithms may do with your data.
• If you are interested in the algorithms themselves and the design issues and computational reasons behind them, read Section 5.5.
• Section 5.6 focuses on data recorded onto gaze-overlaid video, for which manual segmentation of fixation duration and other events is often the only option.
• Blink events (Section 5.7) are easily detected, but smooth pursuit is not (Section 5.8).
• Noise and artefacts (Section 5.9) are not even considered events, but there are good reasons for algorithms to detect such periods, and for us as researchers to decide how to treat them.
• There are detection algorithms also for some of the lesser known events, for instance microsaccades and square-wave jerks (Section 5.10).
• The chapter is summarized in Section 5.11 by listing the events that can be detected and the values that we carry with us for further analysis.
Very often, the first step in data analysis is the calculation of events such as fixations and saccades, with all their parameters. Indeed, the fixation and saccade values exported by the algorithms of this chapter are of great importance. They are heavily used in research in themselves, as well as in a multitude of combinations with other ways to measure and visualize eye-tracking data. Sometimes, it is even thought to be impossible to analyse eye-tracking data without this calculation. This is wrong, however. In many cases, fixation and saccade analysis is not a prerequisite to data analysis. For instance, heat map visualizations, the dwell time measure (p. 386), and scanpath length (p. 319) can all be calculated on raw unprocessed data just as well. Only raw data, but not fixations, can be taken as input when the analysis is tightly connected to running sample time, as with proportion over time curves (p. 197).
[Figure 5.1: four settings dialogues. (a) SMI BeGaze 2007: minimum duration, peak velocity threshold (75°/s), low pass filter size, differential filter size, and peak velocity window (start 20%, end 80% of saccade length). (b) Tobii ClearView 2007: validity filter, fixation radius (pixels), and minimum fixation duration (ms), with the suggestions 50 pixels and 200 ms for stimuli with mostly pictures, 20 pixels and 40 ms for stimuli with mostly reading, and 30 pixels and 100 ms for stimuli with mixed content. (c) ASL Eyenal 2004. (d) SR EyeLink 2007: saccade sensitivity and sample filter settings.]
Fig. 5.1 Settings dialogues for fixation analysis in three analysis packages and one recording software from commercial eye-tracking manufacturers.
5.1 The setting dialogues and the output
In theory, the event detection algorithm takes raw, possibly filtered data samples and tries to detect events within them. The most reported of such events are fixations and saccades. It sounds simple, like something that could be done automatically and that we should not really have to bother thinking about, and indeed software engineers have made it illusively simple to use fixation algorithms. All you have to do is to accept the pre-set values in a dialogue like those in Figure 5.1, and click OK.
In reality, however, these setting dialogues provoke many questions: What does minimal time or minimum fixation duration actually mean? What is a fixation radius, and how does it relate to monitor resolution, measured in pixels? And what is peak velocity threshold? When is a normal saccade-detection sensitivity better than a high sensitivity; should not high sensitivity always be better? And what do all the ASL Eyenal parameters mean? How sensible are the suggestions given in the Tobii dialogue? Why should there be different settings for different kinds of stimuli: are fixations more stable in reading than in picture viewing when such recordings are taken on the same eye-tracker? Is there a danger that my data is too noisy to do a fixation analysis, or does filtering automatically fix this? Does it matter what settings I choose?
The purpose of this chapter is to give a better understanding of how event detection works, and provide insights into how to approach a settings dialogue such as the ones in Figure 5.1.
[Figure 5.2: (a) Raw samples at 50 Hz. (b) Raw samples at 50 Hz. (c) Raw samples at 1250 Hz.]
Fig. 5.2 Enlarged views of raw samples plotted against the stimulus background. Note how close samples are to one another in the 1250 Hz recording compared to the 50 Hz data, in particular during long saccades. Also note the low precision in recording (b) compared to (a) and (c).
Figure 5.2 shows the input, the raw data samples, from three different eye-trackers. The raw data constitute the data that you get from your eye-tracker after recording. When plotting raw data against the background of the stimulus, each data sample is a little dot. During a saccade, when the eye is moving quickly, the distance between dots is large. During fixations, the dots aggregate to form one large blob from many dots. How closely the raw sample dots are positioned is directly related to the sampling frequency of your eye-tracker. How smooth the raw data appear is a direct consequence of the precision of the eye-tracker. Both these system properties are crucial to how the fixation and saccade algorithms are designed, and largely decide what a given algorithm can deliver.
In Figure 5.2(c), a full stimulus display is shown with an overlaid raw data plot. Fixation blobs and thinner saccadic lines are clearly seen. Vertical lines are blinks in progress. At the bottom and to the left, there is some high-velocity noise, probably caused by a dual corneal reflection; either a split corneal reflection, or one real and one falsely detected. Overall, this is the type of data you should expect from your eye-tracker. Figure 5.3(b) shows fixations and saccades calculated from the same raw data, using manufacturer software and default settings.
Fixations are now seen as circles with a diameter indicating the duration, and saccades as abstracted straight lines. During the fixation analysis, blinks and artefacts were filtered out. In total, the fixation scanpath looks much cleaner than the raw data plot.
Nevertheless, there is something deeply wrong with the scanpath of fixations and saccades. A lot of fixations that we can clearly see in the raw data plot are gone. There are two fixations on the word "Magician" in the first line, for instance, that have simply disappeared after event detection. Each line has lost one or more fixations. If you were to export this fixation and saccade data, and calculate fixation durations or saccadic amplitudes from it, you would get erroneous data: too few fixations, too large saccadic amplitudes, and the wrong average for fixation duration. You would use corrupted data, and the results you present in your report or journal paper may not be valid.
[Figure 5.3: (a) Scanpath with raw data samples. (b) Scanpath with fixations and saccades.]
Fig. 5.3 Data recorded at 1250 Hz with a tower-mounted eye-tracker. Fixations were calculated with the commercial software BeGaze 2.1, applying a velocity-based analysis and default settings with a peak velocity threshold of 75°/s. Note that parsing the data using this algorithm omits some of the grouped samples or 'blobs' which from manual inspection seem to be valid fixations.
5.2 Principles and algorithms for event detection
Before we start addressing event detection in more detail, let us stop for a moment and reflect on what an event is. For example, should we always look for blobs in the raw gaze plots when we want to find a fixation, or can the fixation event be defined by other criteria? The only raw material all algorithms have to work with is the stream of data samples recorded by the eye-tracker.
In this stream of data samples, there are sometimes portions that exhibit a prototypical behaviour signifying that an oculomotor event has been recorded. For example, the saccade event is loosely defined as a period when the eye 'moves fast', and the fixation event as a period when the eye 'is rather still'. The goal of event detection is to robustly extract such events from the stream of data samples according to a set of rules. Most often, this is done automatically by applying a detection algorithm to the gaze data, but it can also be done manually using subjective judgements.
Event terms such as 'fixation' are used both for the events algorithmically or manually detected in the data stream, and for the oculomotor events of the eye that were recorded. In reality, perfect matches between the fixations detected by an algorithm and moments of stillness of the eye are very rare. To make matters worse, the term fixation is sometimes also used for the period during which the fixated entity is cognitively processed by the participant. The oculomotor, the algorithmically detected, and the cognitive 'fixations' largely overlap, but are not the same. When reading, for instance, it is considered proven that a word can be processed parafoveally prior to being fixated (Rayner, 1998). It is in fact easy to decouple the fixation position from the position where attention is located and processing takes place, if the task and stimuli are simple (Posner, 1980). Furthermore, there are 'eye-in-head' fixations when the eye is still in its socket, irrespective of whether the head moves or not, and 'eye-on-stimulus' fixations when the eye is fixated on a target but possibly moving inside the head to compensate for head and body motion. Only when the head is immobile relative to the stimulus are they identical.
[Figure 5.4: gaze position (x-coordinate, in degrees of visual angle), velocity (°/s), and acceleration (°/s²) plotted against time.]
Fig. 5.4 Idealized gaze position, velocity, and acceleration profile over time showing one saccade in-between the end of a fixation (to the left of the saccade) and the beginning of a new fixation (to the right of the saccade).
The term 'fixation' in this chapter refers to an event in the data file that has been detected by an algorithm or subjectively by a person. 'Fixation' in this chapter does not refer to the cognitive event during which the fixated entity is processed by the participant (for a discussion about the relationship between fixation and cognitive processing, see page 377).
Event detection algorithms make use of three data streams from the recording and subsequent calculations: gaze position (x, y), gaze velocity (in °/s), and gaze acceleration (in °/s²). Besides pupil size, which is sometimes used to detect blinks, that is all there is. Figure 5.4 illustrates such data from an idealized saccade [15] represented by the vector to the right in the figure. Velocity is calculated using the distance between two data samples (first derivative of gaze position), while acceleration is estimated from three consecutive samples (second derivative of gaze position) [16]. Wyatt (1998) proposes the use of jerk (the third derivative of gaze position) to identify saccades, but it is noisy and fairly impractical in software implementations. As we saw in Chapter 2, filtering can significantly influence the data, in particular velocity and acceleration profiles.
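The two derivative estimates described above can be sketched in a few lines. This is only a minimal illustration, not the calculation used by any particular manufacturer: it assumes gaze positions already converted to degrees of visual angle, a constant sampling rate, and no filtering, and the function name `velocity_and_acceleration` is our own.

```python
import math


def velocity_and_acceleration(x_deg, y_deg, sampling_rate_hz):
    """Estimate gaze velocity (deg/s) and acceleration (deg/s^2).

    Assumptions (ours, for illustration): positions are in degrees of
    visual angle, samples are equally spaced in time, no filtering.
    """
    dt = 1.0 / sampling_rate_hz

    # Velocity: distance between two consecutive samples divided by the
    # sample interval (first derivative of gaze position).
    velocity = [
        math.hypot(x_deg[i] - x_deg[i - 1], y_deg[i] - y_deg[i - 1]) / dt
        for i in range(1, len(x_deg))
    ]

    # Acceleration: difference between two consecutive velocity values,
    # which together span three position samples (second derivative).
    acceleration = [
        (velocity[i] - velocity[i - 1]) / dt
        for i in range(1, len(velocity))
    ]
    return velocity, acceleration


if __name__ == "__main__":
    # A short, made-up horizontal movement sampled at 1000 Hz.
    x = [0.0, 0.0, 0.1, 0.3, 0.4, 0.4]
    y = [0.0] * len(x)
    vel, acc = velocity_and_acceleration(x, y, 1000)
    print("velocity (deg/s):", [round(v, 1) for v in vel])
    print("acceleration (deg/s^2):", [round(a) for a in acc])
```

Because each differentiation divides by a very small sample interval, any noise in the position signal is amplified at every step, which is one way to see why unfiltered acceleration (and jerk even more so) is hard to use in practice.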
In the remainder of this chapter, we assume that filtering has already been done, but acknowledge that the results of event detection are tightly coupled to both the precision of the eye-tracker and the filters applied to the recorded gaze positions, as well as those used in velocity and acceleration calculations.
[15] Saccade taking the shortest path between two fixation positions.
[16] See page 48 for details about these calculations.
[Figure 5.5: (a) Dispersion threshold: data samples must reside within the circle for a minimum amount of time to be a fixation. (b) Velocity threshold: velocity samples below the threshold belong to fixations and the samples above the threshold belong to saccades.]
Fig. 5.5 Fixation identification by maximum allowed (a) dispersion and (b) velocity.
There are some general principles many algorithms use to detect specific events:
1. Fixations are predominantly detected by a maximum allowed dispersion or velocity criterion. In the former case, temporally adjacent samples must be located within a spatially limited region (typically 0.5-2.0°) for a minimum duration (anywhere from 50-250 ms in the literature), whereas in the latter case, fixations are identified as contiguous portions of the gaze data where gaze velocity does not exceed a predefined threshold (about 10-50°/s). These scenarios are depicted in Figure 5.5(a) and 5.5(b).
2. Saccades are commonly identified as periods where the eyes 'move fast', and are in practice defined by velocity or acceleration thresholds; everything above the thresholds is treated as a saccade (as in Figure 5.5(b)). Saccade detection thresholds vary significantly, but are usually in the range 30-100°/s (velocity) and 4000-8000°/s² (acceleration).
3. Smooth pursuit identification does not exist in any current commercial implementation, and it is currently an open research problem to develop a robust and generic algorithm for such a purpose.
The few algorithms that do exist mostly use information about the velocity (typically less than 30-40°/s) and direction of smooth pursuit eye movement.
4. Blinks are often identified as (x = 0, y = 0) coordinates or when the pupil diameter is zero, indicative of a closed eyelid. Note, however, that a careful investigation of blink parameters requires us to measure eyelid movement, and this information is only crudely (if at all) related to the coordinates from your eye-tracker.
5. An artefact is a rather ill-defined 'event', but can for example occur when data samples report high velocity movement that physically cannot derive from real movement of the eye. Typically, such parts of the eye-movement data are identified and removed during initial analysis (e.g. the filtering stage), and sometimes even online during recording. More generally, artefacts can be considered as consecutive data samples that do not conform to any known eye-movement event. If the percentage of such 'unknown' data samples is high, this may be an indication that the data is of poor quality and should not be used in further analysis. It can also indicate that the algorithm is not appropriate to use on your recorded data.
6. While the above events, in particular fixation and saccade events, will be the main focus of this chapter, other events indeed exist and algorithms have been developed to detect them. For nystagmus, for instance, Juhola (1988) presents an adaptive digital recursive filter capable of detecting all maxima and minima in a sequence of alterations. The square-wave jerk (p. 183) is another event that occurs frequently in healthy participants' eye movements. Then there are fixational eye-movement drifts, microsaccades, and tremor, for which a few algorithms exist (the one by Engbert & Kliegl, 2003 for microsaccade detection, for instance).
Nystrom and Holmqvist (2010) proposed an algorithm to quantify movements known as glissades, a type of wobbling eye movement at the end of saccades.
The existing algorithms do not detect all event types; in fact they rarely detect more than one. The common identification by dispersion threshold algorithm [17] (I-DT) detects fixations only and does not distinguish between the remaining events. Other algorithms detect only saccadic portions of the data. Overall, the existing methods can be divided into three broad groups:
Dispersion- (and duration-) based fixation detection algorithms using positional information, and the related clustering algorithms, using Principle 1. By making alterations to the dispersion criterion, several varieties have developed: Salvucci and Goldberg (2000) test five different dispersion-based algorithms, and Urruty, Lew, Ihadaddene, and Simovici (2007) have developed a completely new algorithm based on projection clustering. Santella and DeCarlo (2004) developed a mean shift clustering algorithm that could be used for fixation detection. This group of algorithms is common in commercial implementations, such as Gazetracker, ASL Eyenal, faceLab, and SMI BeGaze, and is not uncommon in research papers. Typically, dispersion-based algorithms are used for data collected with a low-speed eye-tracker.
Velocity and acceleration algorithms using Principles 1 or 2 (mostly) use velocity and/or acceleration data to calculate events. Software packages by Tobii, SMI, and SR Research (EyeLink) include detection algorithms based on such principles, although their details are quite different. This class of algorithms typically requires data collected at higher sampling rates (say > 200 Hz).
Manual detection of events, where a number of experienced eye-tracking researchers subjectively parse data samples into events. This is a method to find fixations, but it is not an algorithm. Many researchers trust only manual detection, in particular when data are collected with head-mounted eye-trackers without head tracking.
[17] Refers in this book to the implementation described in Salvucci and Goldberg (2000).
5.3 Hands-on advice for event detection
If you need to analyse your data with a fixation or saccade detection algorithm, and care about the validity of the output, what should you do? Some general recommendations are:
• Perhaps most importantly, plot your fixations next to your raw data, as in Figure 5.3, and examine what the algorithm does at different settings.
• Examine the distributions of events in your measure at different settings (look at the histograms), before you decide which setting to use.
• Make parallel analyses with several settings, and see how this affects your results; see Green (2006) who does this.
• Recommendations for algorithmic settings depend on factors such as the eye-tracker used for data collection, individual traits such as fixation stability, and the particular circumstances during calibration and recording (see pp. 154-161 for a detailed discussion).
• If you want to compare your results to previous literature, use similar algorithms and settings. Unfortunately, however, not all researchers report the settings they use.
• Fixations with unreasonably low durations often result from the current algorithms. Remove them if they significantly influence your results, or use a better algorithm.
• Beware of smooth pursuit or movement that looks like smooth pursuit in your data file. This is likely to be part of your data if you use animated stimuli or a head-mounted eye-tracker. Current algorithms have been designed to analyse only data recorded from static stimuli.
• Dispersion-based algorithms are not suited to analyse data collected with higher sampling frequencies (> 200 Hz). If you have access to velocity-based algorithms, they are more likely to produce a good output.
• When using velocity and acceleration data, make sure that you understand how the filters used to generate the data affect detection. For example, lowpass filtering smooths out velocity peaks and thus affects where the peak threshold intersects with the velocity curves.
• Some researchers use measures that require data samples as input, but only from fixations. The fixation detection algorithms can then be used to 'clean' the data from all other events prior to using the measure.
• Beware that some implementations divide events (such as a fixation) that cross a trial border into two parts. This may lead to artificially low first fixation durations for trials; one option is to exclude such partial events from the analysis.
• In your article, always report the algorithm and the detection parameters you have used.
• Clearly define the events you use in your article. For example, do 'fixations' refer to implicitly detected inter-saccadic intervals or explicitly detected oculomotor periods of stillness?
5.4 Challenging issues in event detection
There are a range of issues that influence the results of event detection. As we saw in Figure 5.1, some of them are possible to control in the settings dialogue boxes in commercial software, while others may require post-processing or new algorithmic solutions.
The validity of your results depends on how you deal with these issues.
5.4.1 Choosing parameter settings
Given a set of raw data samples, parameter settings are used to identify a specific event or to separate different types of events from each other. Therefore, they largely define the properties of a detected event. While the settings mostly serve to distinguish between event types, they are also commonly used by researchers to exclude data that are unreasonable with respect to what is known about the physiological limitations of eye movements, or with regard to the experimental design. With the important choice of parameter settings in mind, what are the proper values to choose for the settings in event detection algorithms, and how are they motivated?
Recommendations, arguments, and practices
Some manufacturers have provided their customers with recommendations for fixation analysis settings, but how well founded is such advice? In the Tobii ClearView settings dialogue (p. 148), the recommended lower fixation duration threshold of 40 ms for reading studies compared to 200 ms in picture viewing probably reflects the observation that fixations are typically shorter during reading than during picture viewing (p. 377). But what if the researcher has participants making 165 ms fixations during picture viewing? Should she then just lose those fixations from later statistical calculations, as Figure 5.3 exemplifies?
And what about the 50 pixel radius suggestion for picture viewing, compared to 20 pixels for reading? Are fixations more stable during reading than while viewing images? If the participant makes two fixations close to one another during reading, the fixation analysis would give two fixations. But if the same person makes two fixations at the same close distance during picture viewing, should the fixation analysis produce just one long fixation?
The ASL Eyenal Manual (2001) offers the following motivation for their thresholds (defaults are 1° and 100 ms): "Specifically, there is research documenting the minimum latency of saccades in response to visual stimuli (thus suggesting a minimum fixation duration) and data defining the maximum amplitude of involuntary eye movements during the fixation (thus establishing maximum fixation boundaries)". Involuntary eye movements such as drift and microsaccades indeed make up part of the movements inside a fixation, but the imprecision in the specific eye-tracker and the specific measurement may be much larger than 1°.
Also, dispersion can be calculated in a number of different ways. Blignaut and Beelders (2009) present the following varieties of dispersion:
1. The maximum horizontal and vertical distance covered by the gaze positions in a fixation, ((max(x) - min(x)) + (max(y) - min(y)))/2 < threshold (Salvucci & Goldberg, 2000).
2. The distance between points in the fixation that are the furthest apart (Salvucci & Goldberg, 2000).
3. The distance between any two successive points, which is an estimate of the eye velocity (Shic, Scassellati, & Chawarska, 2008).
4. The distance between points and the centre of the fixation, i.e. the radius (Camilli, Terenzi, & Nocera, 2008).
5. The average or the standard deviation of the distances of all points from the centre of a fixation (Anliker, 1976; Applied Science Laboratories, 2001).
ASL Eyenal uses the standard deviation as dispersion measure and sets it to 1° as default, but for many of the other implementations, it is unclear what dispersion is. Surprisingly many software packages ask for dispersion thresholds in pixels instead of visual degrees. SMI's BeGaze 2.1 software uses 100 pixels as default. To make sense of this value, the experimenter first needs to convert it to degrees of visual angle, taking into account the viewing distance as well as the size and resolution of the screen (p. 24), and also understand which calculation of dispersion is used in the particular implementation they have at hand.
The dispersion setting is closely connected to the imprecision of the recorded data, and some implementations attempt to compensate for such noise. For instance, the ASL Eyenal dispersion algorithm for 50 Hz data requires three data samples to be outside of the dispersion radius for the fixation to end, not just one. Allowing single data samples to deviate is an insurance against low precision in the data, such as what we saw in Figure 5.2(b). In a way, this can be seen as a temporary increase of the dispersion threshold to allow one or two deviating fixation samples to pass unnoticed. In contrast, the I-DT ends the fixation as soon as the dispersion criteria are violated, which makes it more sensitive to noise, possibly requiring a higher dispersion setting.
Researchers have not paid too much attention to the dispersion setting, but Rotting (2001) reviews studies with dispersion settings ranging from 0.5° to 2° of visual angle.
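To illustrate why both the unit and the definition of dispersion matter, the sketch below converts a pixel threshold to degrees of visual angle and computes three of the dispersion varieties listed above (1, 2, and 4) for the same set of samples. It is a hedged example only: the function names are our own, the screen geometry in the usage example is a hypothetical setup, and the conversion assumes an extent centred in front of the viewer on a flat screen.

```python
import math


def pixels_to_degrees(pixels, screen_width_cm, screen_width_px, viewing_distance_cm):
    """Convert an on-screen extent in pixels to degrees of visual angle.

    Assumes (for illustration) a flat screen viewed head-on, with the
    extent centred on the line of sight.
    """
    size_cm = pixels * screen_width_cm / screen_width_px
    return math.degrees(2 * math.atan(size_cm / (2 * viewing_distance_cm)))


def dispersion_box(xs, ys):
    """Variety 1: mean of the horizontal and vertical extent, as quoted in the text."""
    return ((max(xs) - min(xs)) + (max(ys) - min(ys))) / 2


def dispersion_max_pairwise(xs, ys):
    """Variety 2: distance between the two points that are furthest apart."""
    points = list(zip(xs, ys))
    return max(math.dist(p, q) for p in points for q in points)


def dispersion_radius(xs, ys):
    """Variety 4: largest distance from any point to the centre of the samples."""
    cx = sum(xs) / len(xs)
    cy = sum(ys) / len(ys)
    return max(math.dist((x, y), (cx, cy)) for x, y in zip(xs, ys))


if __name__ == "__main__":
    # Hypothetical setup: 34 cm wide screen, 1280 px across, viewed from 67 cm.
    print("100 px in degrees:", round(pixels_to_degrees(100, 34.0, 1280, 67.0), 2))

    # The same four samples (in degrees) give three different dispersion values.
    xs = [0.0, 1.0, 0.0, 1.0]
    ys = [0.0, 0.0, 1.0, 1.0]
    print("variety 1:", dispersion_box(xs, ys))
    print("variety 2:", round(dispersion_max_pairwise(xs, ys), 3))
    print("variety 4:", round(dispersion_radius(xs, ys), 3))
```

Since the same samples yield a different number under each definition, a threshold such as 1° only has a clear meaning once the dispersion measure it applies to has been stated.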
Blignaut and Beelders (2009) and Blignaut (2009) argue that the optimal dispersion setting is 1° (for the radius dispersion measure), but that it relies heavily on the dispersion measure used.
The minimal fixation duration setting, however, has been a long-standing discussion among researchers. Inhoff and Radach (1998) write that they themselves mostly use a cutoff point for fixation duration of 50 ms, but that many of their reading research colleagues use cutoff points ranging from 70-100 ms (but do not state which algorithms they use). Rotting (2001) summarizes a number of studies, mainly in human factors, that use dispersion algorithms and duration settings ranging from 60-120 ms. At the other end of the scale, Granka, Hembrooke, Gay, and Feusner (2008) use a 200 ms duration setting. Manor and Gordon (2003) notice that 200 ms has become the de facto standard in clinical studies, originally derived from a 1962 study of eye movements in reading. Engmann et al. (2009) used no cutoff at all, but found that only 3.9% of the fixations in their study had a duration of less than 100 ms.
[Figure 5.6: x-coordinate, y-coordinate, and velocity (°/s) plotted against time (ms), from about 20,400 to 20,800 ms.]
Fig. 5.6 A <45 ms fixation following a regressive saccade during reading (data from our example reading set). Excluding the glissade (the little bump in the velocity curve), the fixation duration becomes even shorter; about 30 ms. Data recorded at 1250 Hz with a tower-mounted system.
Is this divergence in settings a problem for researchers who want to compare their results against someone else's? Perhaps we could decide the duration setting on the basis of what is known about how short fixations can be. However, there seems to be no consensus on how common even shorter fixations really are.
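How the dispersion and minimum duration settings interact can be made concrete with a small dispersion-based detector. The sketch below follows the general idea of the I-DT (a window that grows while the dispersion criterion holds and ends the fixation as soon as it is violated), but it is our own simplified illustration rather than the published implementation or any manufacturer's code. The function names, the use of dispersion variety 1 as quoted earlier, and the convention that n samples last n sample intervals are all assumptions.

```python
import math


def dispersion(points):
    """Mean of horizontal and vertical extent (variety 1 as quoted in the text)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return ((max(xs) - min(xs)) + (max(ys) - min(ys))) / 2


def detect_fixations(points, sampling_rate_hz, max_dispersion_deg=1.0, min_duration_ms=100.0):
    """Return fixations as (start_index, end_index_exclusive, duration_ms).

    points: list of (x, y) gaze positions in degrees of visual angle.
    Assumption: a run of n samples is counted as n sample intervals long.
    """
    min_samples = max(2, math.ceil(min_duration_ms * sampling_rate_hz / 1000.0))
    fixations = []
    start = 0
    while start + min_samples <= len(points):
        end = start + min_samples
        if dispersion(points[start:end]) <= max_dispersion_deg:
            # Grow the window until adding the next sample would violate
            # the dispersion criterion; a single deviating sample ends it.
            while end < len(points) and dispersion(points[start:end + 1]) <= max_dispersion_deg:
                end += 1
            fixations.append((start, end, (end - start) * 1000.0 / sampling_rate_hz))
            start = end
        else:
            # No fixation starts here; slide the window one sample forward.
            start += 1
    return fixations


if __name__ == "__main__":
    # Made-up 50 Hz data: a 200 ms fixation, a 60 ms fixation, a 200 ms fixation,
    # separated by single saccade samples.
    samples = ([(0.0, 0.0)] * 10 + [(3.0, 0.0)] + [(6.0, 0.0)] * 3
               + [(9.0, 0.0)] + [(12.0, 0.0)] * 10)
    for cutoff in (100.0, 40.0):
        found = detect_fixations(samples, 50, 1.0, cutoff)
        print(f"minimum duration {cutoff} ms: {len(found)} fixations", found)
```

Running the same samples with two duration settings shows the short middle fixation appearing or disappearing purely as a result of the cutoff, which is the kind of parallel analysis that makes the consequences of a setting visible before it is chosen.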
They do exist, that is clear, as shown by the fixation in Figure 5.6, which measures exactly 45 ms when including the smaller velocity peak after the main saccade. If we exclude the glissade duration, the true fixation is around 30 ms in duration. As we will see on page 377, information intake may be closed for the entire duration of such short fixations, but is this a good reason to exclude them? Rotting (2001) seems to argue that we should, while others use cutoffs without specifying the motivation. But the short fixations are still real oculomotor events, even if intake is closed. The fixation blob in the raw data files represents an oculomotor event, and it is our task to measure it as best we can, and distinguish it from other short periods of stillness that can be found in the data. The question of intake is very important, but should not be built into the event-detecting algorithms. The EyeLink velocity algorithm allows for 'low', 'medium', or 'high' saccade sensitivity, although the settings dialogue in Figure 5.1(d) shows only 'normal and high'. Medium sensitivity corresponds to a velocity threshold of 30°/s and an acceleration threshold of 8000°/s2, while the high sensitivity uses 22°/s and 4000°/s2. The algorithm assumes that a given sample of raw data is part of a saccade in progress, if at least one of the velocity and the acceleration values is above the respective threshold. This is sensible when detecting saccades online. It is safer to use two criteria than only one, as there is only one chance to get it right. For the same reason, the settings for the EyeLink algorithm are chosen before recording the :- ... CHALLENGING ISSUES IN EVENT DETECTION | 157 boot example reacting ration becomes even \mx dispersion algo-M the scale, Granka, ; Manor and Gordon padies. originally de-B009) used no cutoff ■nation of less than ■nt to compare their ra about how short amnion even shorter *■ Figure 5.6, which *e main saccade. If »auon. 
data. The EyeLink manual of 2007 recommends the more sensitive setting for oculomotor research, and the medium setting for cognitive and reading research (compare the recommendations in Figure 5.1(b)), arguing that "The larger threshold also reduces the number of microsaccades detected, decreasing the number of short fixations (less than 100 ms in duration) in the data" and noting that "Some short fixations (2% to 3% of total fixations) can be expected, and most researchers simply discard these". Not everyone discards the short fixations, however. Velichkovsky, Dornhofer, Pannasch, and Unema (2000) not only take them seriously, but name them 'express fixations', after finding that they make up 7% of the total number of fixations given by their EyeLink system in a car simulator task. However, could poor data, noise, microsaccades, smooth pursuit, and a velocity threshold set too low (below the precision level of the system) lie behind these frequent 'express fixations', rather than actual oculomotor behaviour?

Velocity threshold settings vary widely among researchers. Duchowski (2007, pp. 149-152) makes a theoretical argument about the settings for velocity algorithms, suggesting a lower threshold of 130°/s, which "should effectively detect saccades of amplitudes roughly larger than 3°". Most other researchers use lower velocity threshold settings. For instance, Smeets and Hooge (2003) used a velocity threshold of 75°/s when studying rather large saccades, and Inchingolo and Spanio (1985) compare the settings 10°/s and 50°/s.
Beintema, Van Loon, and Van Den Berg (2005) chose the very low setting of 20°/s, but added a minimal saccade amplitude criterion of 1°, and a minimal duration between saccades of 30 ms, to distinguish saccades from noise.

While the dispersion setting may be difficult to motivate, the choice of thresholds for velocity algorithms could be made in relation to the purpose of your study: what size of saccades do you want to detect, how much noise is there in the recorded fixations, and where is the line between the velocities of the fastest saccades you want and the slowest movement due to artefacts you have? The precise settings inside these spans could be selected from visual inspection of some typical samples in your data, using a plot of velocity and position, such as Figure 5.7. The following paragraph summarizes issues related to the saccade velocity threshold:

• Saccade velocity threshold The major setting. How small are the saccades you need to detect? Detection of small saccades requires a lower threshold. How much noise is there in the fixations? A lot of noise requires a higher threshold. Settings in the literature typically range from 20-130°/s, as discussed above.

In fact, the problem with undetected fixations seen in Figure 5.3 was that the default saccade velocity threshold (75°/s) was set too high. This means that short saccades and their two surrounding fixations are grouped as one single fixation. We can see four such short saccades in Figure 5.7, three of which move with a velocity below 50°/s. An appropriate setting for this data is rather 30-40°/s.

Although the saccade velocity threshold is the most commonly used, there are three additional thresholds for velocity- and acceleration-based algorithms:

• Saccade on- and offset velocity Decides when a saccade starts and stops, and is always equal to or lower than the saccade velocity threshold.
For high-quality recordings, a setting of 10-15°/s is often used, but there is no consensus on how such thresholds should be set.
• Maximum velocity threshold A little-used artefact-removal threshold. The fast-moving artefactual movements from split and false corneal reflections, mascara, and droopy eyelids are above the interval 750-1000°/s, which can be considered as a physical limitation on how fast the eye can move. Not many algorithms use an upper velocity threshold, but Duchowski (2007) makes a theoretical argument suggesting 750°/s as a suitable threshold.
• Saccade acceleration threshold Used in the EyeLink algorithm to allow for quick detection of saccades online. An acceleration threshold can also be useful for distinguishing saccades from periods of smooth pursuit; quick pursuit velocity can be larger than slow (small) saccade velocity, but saccades always have larger accelerations (Behrens & Weiss, 1992).

Fig. 5.7 Gaze velocity (black) and gaze x-coordinate (grey) for reading data. The vertical scale is in °/s and pixels, respectively. The horizontal scale is samples (time). 75°/s is marked by a line, which is clearly too high for many of the saccades. 1250 Hz data from a tower-mounted system.

The EyeLink software allows for post-recording filtering of the fixations and saccades resulting from the online saccade algorithm. The filter takes minimal fixation duration and minimal saccadic amplitude as settings. Default thresholds are 50 ms and 1°. Subsequent fixations that are shorter and closer than the threshold settings stipulate are merged into one fixation. Remedying noisy recordings post-hoc seems to be the major function of this tool. It introduces two new settings, however, making it a total of four settings for the algorithms deciding what fixations and saccades should remain for data analysis.
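The roles of the velocity thresholds listed above (peak, on-/offset, and maximum velocity) can be illustrated with a minimal sketch. This is not the algorithm of any particular manufacturer; the function name `detect_saccades`, its parameter names, and the default values (taken loosely from the ranges mentioned in the text) are assumptions made for illustration only. The input is assumed to be a list of sample-to-sample gaze velocities in °/s.

```python
def detect_saccades(velocity, peak_thr=40.0, onoff_thr=15.0, max_thr=1000.0):
    """Return a list of (onset, offset, peak_velocity, is_artefact) tuples.

    velocity  : gaze velocity per sample, in deg/s (hypothetical input)
    peak_thr  : saccade velocity threshold that a candidate must exceed
    onoff_thr : lower threshold used to place saccade onset and offset
    max_thr   : upper limit; faster movements are flagged as artefacts
    """
    if onoff_thr > peak_thr:
        raise ValueError("on-/offset threshold must not exceed peak threshold")
    events = []
    n = len(velocity)
    i = 0
    while i < n:
        if velocity[i] > peak_thr:
            # Walk backwards to the first sample above the on-/offset threshold.
            start = i
            while start > 0 and velocity[start - 1] > onoff_thr:
                start -= 1
            # Walk forwards to the last sample above the on-/offset threshold.
            end = i
            while end < n - 1 and velocity[end + 1] > onoff_thr:
                end += 1
            peak = max(velocity[start:end + 1])
            # Movements faster than max_thr are kept but marked as artefacts.
            events.append((start, end, peak, peak > max_thr))
            i = end + 1
        else:
            i += 1
    return events
```

In this sketch a lower `peak_thr` finds smaller saccades but also turns noise into 'saccades', while `max_thr` only labels a candidate as artefactual; what to do with the flagged periods is left to the researcher.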
Overall, the many heuristic elements of the EyeLink online saccade algorithm appear to be difficult for the average user to survey, which is perhaps why the settings dialogue primarily provides the summary settings of 'medium' or 'high' saccade sensitivity (Figure 5.1(d)).

Effects of settings
It has long been known that fixation and saccade output is very sensitive to the choice of algorithm settings (e.g. Karsh & Breitenbach, 1983). Using 60 Hz data and a dispersion-based algorithm, Shic, Scassellati, and Chawarska (2008) show that the effect of parameter changes on mean fixation duration is a linear function of parameters, with a considerable slope. As our reading data (p. 5) in Figure 5.8 show, the effect is that all basic fixation measures are heavily altered when using the common dispersion-based I-DT algorithm. Both the dispersion and the duration settings may give rise to artificially significant differences that may change the result of a study completely.

Fig. 5.8 How fixation measures differ with different dispersion diameters and duration settings in a commercial implementation of the I-DT algorithm (1250 Hz reading data from page 5). Panel title: Effect of I-DT settings on dependent measures. Plotted are average fixation duration and total number of fixations at duration settings of 20, 60, and 100 ms, against dispersion in pixels/degrees of visual angle (20/0.67, 60/2.0, 100/3.3).

For instance, the average fixation duration at setting
60 ms and 60 pixels differs significantly from the average fixation duration at 100 ms and 60 pixels (two-sided t-test with 36490 fixations each, t(36489) = 3.07, p < 0.01). The same thing happens if you change the dispersion from 60 pixels to 100 pixels while keeping the duration at 100 ms (t(26950) = 3.22, p < 0.01).

Fig. 5.8 (continued) The slope is similar to that from 50 Hz data in Shic, Scassellati, and Chawarska (2008).

Fixation durations will not only differ in their averages; a change in dispersion and duration thresholds also alters the distribution, as shown in Figure 5.9. A change from a 100 ms 60 pixel setting to a 100 ms 100 pixel setting dramatically decreases the number of 'fixations' around the 200 ms duration, and increases the number of 'fixations' with durations around 400-600 ms. Such a change in distribution affects averages but also the variance of the data, which in turn affects all your variance-based significance tests (t-tests and ANOVAs, for instance). Even this small examination of the I-DT algorithm clearly shows that dispersion and duration settings should be chosen with the utmost care.

These effects are not unique to dispersion-based algorithms, but are also present in algorithms using velocity data. Figure 5.10 shows how basic saccade and fixation measures are affected by parameter changes in the SMI velocity algorithm; at a 90°/s setting, for example, the average fixation is 2.5 times as long as it is at the 30°/s setting (t(14340) = 2.85, p < 0.01). Shic, Scassellati, and Chawarska (2008) found similar variation when changing the saccade velocity setting from 18°/s to 81°/s. It is clear that the choice of setting can be the determining factor for the success or failure of an eye-tracking study. Although otherwise similar, studies using different settings of the peak saccadic velocity are not directly comparable.
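To make the sensitivity to settings concrete, here is a minimal sketch of a dispersion-based detector in the spirit of I-DT as described by Salvucci and Goldberg (2000). It is not the commercial implementation used for Figure 5.8. We assume the dispersion measure (max x - min x) + (max y - min y); other implementations may use a radius or another measure. The names `dispersion` and `idt` and the expression of the duration threshold as a number of samples are our own illustrative choices.

```python
def dispersion(points):
    """Dispersion of a window of (x, y) samples: x-range plus y-range."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def idt(samples, max_dispersion, min_samples):
    """Return fixations as (first_index, last_index) pairs, inclusive.

    samples        : list of (x, y) gaze positions, e.g. in pixels
    max_dispersion : dispersion threshold in the same unit as the samples
    min_samples    : minimal fixation duration, expressed in samples
    """
    fixations = []
    start = 0
    n = len(samples)
    while start + min_samples <= n:
        end = start + min_samples  # exclusive end of the initial window
        if dispersion(samples[start:end]) <= max_dispersion:
            # Grow the window until adding a sample would exceed the threshold.
            while end < n and dispersion(samples[start:end + 1]) <= max_dispersion:
                end += 1
            fixations.append((start, end - 1))
            start = end
        else:
            # No fixation starts here; slide the window one sample forward.
            start += 1
    return fixations
```

With two clusters of ten samples each lying about 30 pixels apart, a dispersion threshold of 10 pixels yields two fixations, a threshold of 40 pixels merges them into a single 'fixation' that swallows the movement between them, and a duration threshold longer than either cluster yields no fixations at all. The same data thus give three different event lists depending only on the settings.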
It is important to notice that the basic measures in Figure 5.10 are the foundation that many other dependent measures in eye-tracking research are built upon. Virtually all dependent measures will alter their values when this setting is changed.

Fig. 5.9 Distribution of fixation durations for two dispersion settings of the I-DT algorithm (100 ms, 60 pixels and 100 ms, 100 pixels; data source described on page 5). Horizontal axis: fixation duration (ms).

Fig. 5.10 How important dependent variables change with the setting of saccade velocity threshold in the SMI velocity algorithm (a commercial implementation of Smeets & Hooge, 2003). Plotted against peak velocity threshold (30-90 degrees/s) are average fixation duration, total number of fixations, total number of saccades, average saccade duration, average saccade amplitude, and average saccade peak velocity. Reading data recorded at 1250 Hz and described on page 5.

Fig. 5.11 Control ellipse for saccade detection where samples outside the ellipse (where the eye velocity is large) are saccade candidates. One and a half seconds of data collected at 1250 Hz with a tower-mounted system during reading.
Data driven threshold
It is well known that the noise levels in eye-tracking data can change across individuals, tasks, trials, and even within trials, so why should we choose a setting subjectively and stick with it throughout the entire analysis? This particular question has led researchers to develop algorithms that let the data itself assist in how to set the thresholds. Tole and Young (1981) suggested locally adapting the acceleration threshold used to detect saccades to compensate for varying noise levels they observed in the data. Similarly, Behrens, MacKeben, and Schroder-Preikschat (2010) proposed a saccade detection algorithm where an adaptive, momentary acceleration threshold was calculated based on the preceding 200 samples (for data acquired at 1000 Hz). A related algorithm is described by Marple-Horvat, Gilbey, and Hollands (1996), who use a "double-window" technique where two temporal windows on each side of the current velocity sample are subtracted, and a saccade is detected only if the difference between the average value within each window exceeds a certain threshold. Niemenlehto (2009) based the resilience against varying noise for saccade detection on a constant false alarm technique.

Assuming the noise is constant over a trial, one can estimate the noise level over the whole trial, and then use this estimate to set the thresholds. For fixational eye movements, the dominant principle for microsaccade detection is based on the algorithm proposed by Engbert and Kliegl (2003), who first estimate the velocity noise in the x- and y-dimensions separately, and then set the thresholds as multiples of the estimated variance in the noise; all samples outside the control ellipse formed by such thresholds are saccade candidates, as Figure 5.11 illustrates. Since the dynamics of microsaccades are similar to normal saccades, the same principle can be used to find appropriate saccade detection thresholds.
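The control-ellipse principle can be sketched as follows. This is a simplified illustration in the spirit of Engbert and Kliegl (2003), not their published code. As an assumption on our part, the noise level in each dimension is estimated with a median-based spread, sqrt(median(v²) - median(v)²), and the threshold is a multiple `lam` of that estimate; the default value of 6 is only an illustrative choice. The names `velocity_noise` and `saccade_candidates` are hypothetical.

```python
from math import sqrt
from statistics import median


def velocity_noise(v):
    """Median-based spread of a velocity series (robust against saccades)."""
    m = median(v)
    return sqrt(max(median([s * s for s in v]) - m * m, 0.0))


def saccade_candidates(vx, vy, lam=6.0):
    """Flag samples whose velocity falls outside the control ellipse.

    vx, vy : horizontal and vertical velocity per sample (deg/s)
    lam    : multiplier applied to the noise estimate in each dimension
    """
    eta_x = lam * velocity_noise(vx)
    eta_y = lam * velocity_noise(vy)
    if eta_x == 0.0 or eta_y == 0.0:
        raise ValueError("noise estimate is zero; cannot form an ellipse")
    # A sample is a candidate if (vx/eta_x)^2 + (vy/eta_y)^2 exceeds 1.
    return [(a / eta_x) ** 2 + (b / eta_y) ** 2 > 1.0 for a, b in zip(vx, vy)]
```

Because the thresholds are derived from the trial itself, a noisy recording automatically gets a larger ellipse than a clean one, which is the point of the data-driven approach. A complete detector would additionally require a minimum number of consecutive candidate samples before accepting a saccade.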
Similar strategies for choosing saccade detection thresholds have been employed in other recent work (Nyström & Holmqvist, 2010; Van der Lans, Wedel, & Pieters, 2010).

5.4.2 Noise, artefacts, and data quality
Noise can derive from the oculomotor system, the eye-tracker, or the environment, and adds unwanted variation to the acquired data. Artefacts can be seen as a special type of noise, but are typically larger and easier to distinguish from known eye-movement characteristics. Data quality is a more imprecise term, but is related to accuracy, precision, and percentage of data loss, perhaps in addition to a subjective rating from the person responsible for the recording. Having access to all these quality indicators gives you an idea of whether the recorded data are useful for further analysis, or should be discarded.

Fig. 5.12 False fixations with black numbers 1, 4, 7, 9, and 12 result from imprecision (p. 33) in data. This means that raw data differ so much from sample to sample even within a single fixation that some of the samples end up outside of the dispersion radius, and will be segmented into minute fixations of their own. Recorded at 50 Hz on the remote system on a blue-eyed participant with contact lenses, and analysed using a dispersion-based algorithm with manufacturer standard settings. The task was to look at the centre of each white number in increasing order. Dark filled circles represent detected 'fixations'.

It is generally easier to detect events in recordings with high data quality. Figure 5.8 showed results from data with high quality in the sense that the calibration was judged as good and no problems were reported by the operator during the recording. Unfortunately, not all recorded data have the same high quality, and the algorithms need to deal with the imperfections too.
In fact, data quality is an important factor to consider when using the algorithms, which can erroneously interpret various recording imperfections as actual eye-movement events.

In dispersion-based algorithms, high noise levels can make a sample that rightfully belongs to a fixation move outside of the dispersion radius, end the fixation, and trigger a new one. Figure 5.12 shows how a number of such false 'fixations' are created from stray samples in the vicinity of the real fixations. Some varieties of dispersion algorithms (such as ASL Eyenal) attempt to address this problem by temporarily allowing a few samples to exceed the maximum dispersion threshold without ending the fixation. High velocity artefactual eye movements will also be assumed to be saccades with intermediate fixations, but the 'fixations' will now be deleted because they are too short.

The velocity algorithms are also best suited for high quality data. High velocity artefacts and imprecision are major obstacles; if the imprecision inside a fixation has a velocity above the velocity threshold used, it gives rise to false 'saccades', effectively ending the fixation. The velocity threshold can be exceeded many times, giving a whole array of unrealistically short 'fixations'. Figure 5.13 shows how this happens during the first and fifth fixations.
Fig. 5.13 Variable precision. Data acquired for oculomotor fixations 1-5 are noisy (imprecise) when the participant looks at the top of the stimulus (first and fifth fixations) and precise at the bottom (second to fourth fixation). Horizontal axis: time (ms). Recorded with a remote system at 250 Hz and analysed with a velocity-based algorithm with a threshold of 75°/s.

Fig. 5.14 Saccadic velocity plot showing the effect of having multiple competing corneal reflections (eye image in Figure 4.13). High-speed artefacts to the left, and slower reading saccades on the right. Recorded at 1250 Hz with a tower-mounted system, participant with contact lenses.

With both dispersion- and velocity-based algorithms, the effect of imprecision can to some extent be alleviated by raising the threshold setting. With a larger dispersion radius, or a higher velocity threshold, sample-to-sample motion can be quicker without endangering the consistency of fixations. This remedy comes at a price, however. For instance, raising the velocity threshold of Figure 5.13 above the peaks of this imprecision will make it so high that many real saccades will not be identified. When a saccade is not identified, the two surrounding fixations are reported as one single 'fixation', with a duration that equals the sum of the two real fixations and the intermediate saccade.
Figure 5.14 shows high-speed optic artefacts: false eye movements with velocities well over 1000°/s and virtually infinite acceleration. Such velocities appear for instance when the corneal reflection moves instantly from one position to another, as in Figures 4.12(d) and 4.13 on page 124. The two 'bumps' further right are real saccades, included for comparison. In the reading data used for examples here, 0.4-0.9% (depending on settings) of the saccades had a velocity higher than 800°/s. In corresponding data with low quality, 4-9% of the saccades had such high velocities (see footnote 18). Obviously, the poor data quality caused this tenfold increase, by having the algorithm identify as saccades various false 'saccadic' movements like the ones in Figure 5.14. As a comparison, the return sweeps, when readers switch from one line to the next across the entire monitor, a distance of about 25° of visual angle, had an average peak velocity of 440°/s.

Fig. 5.15 13 readers with very poor recording quality (mascara, contact lenses, drooping eyelids, etc). The SMI velocity algorithm in BeGaze 2.1 with different peak (saccade) velocity threshold settings. 10 readers with very high data quality at setting 40°/s for comparison. Horizontal axis: fixation duration (ms).

In between the false high speed 'saccades' of the artefactual data, the SMI velocity algorithm finds false 1 ms 'fixations', as it attempts to fill the almost non-existent period between two false 'saccades'. In fact, when running data with poor quality through the SMI velocity algorithm of BeGaze 2.1, it identifies a huge number of false 'fixations' with durations shorter than 40 ms, as clearly shown in the histogram of Figure 5.15. Similar 'blips' of very short 'fixations' in the related I-VT algorithm were reported by Salvucci and Goldberg (2000).
In both cases, the lack of a temporal criterion for fixation duration in many velocity-based algorithms could be one part of the problem. The other part is the lack of an upper velocity threshold that could eliminate high-speed artefacts with intermediate 1-sample false 'fixations'. For very high quality data, the number of unreasonably short 'fixations' is much smaller (dotted comparison line), but they still exist. For durations above ~80 ms, the distribution is very similar. This suggests that with an improved algorithm, a good portion of the poor quality data could in fact be used, at least for some types of analyses.

18 Analysis was made using the SMI velocity algorithm of BeGaze 2.1 with a threshold of 40°/s.

5.4.3 Glissades
Fig. 5.16 Saccades with multiple velocity peaks and false 1 ms fixations between them. Recorded at 1250 Hz with a tower-mounted system. Horizontal axis: time (ms). Fixations (white lines), saccades (grey lines), and undetected events (gaps in between) according to SMI BeGaze 2.1 are indicated at the bottom of the graph.

Interestingly enough, the very high quality data in Figure 5.15 also exhibit a small proportion of 1 ms 'fixations'. In the high quality data, the unreasonably short fixations are not found
amongst noise, but when the algorithm faces a main saccade ending with smaller velocity peaks, known as glissades (see definition on page 183), as in Figure 5.16. The saccade of this participant does not stop at the intended fixation goal, but continues beyond it, and then back again, but too far, and thus wobbles back and forth for a while, before it comes to a standstill and the fixation can start. The velocity peaks in these very strong glissadic movements are well beyond the normal velocity threshold (here 40°/s), and therefore the SMI velocity algorithm finds, not a fixation right after the saccade ending, but essentially a new 'saccade'. Therefore, two of the saccades in Figure 5.16 are not recognized as saccades at all. The third saccade is recognized, but only its first velocity peak. We see false 1-10 ms 'fixations' between the peaks inside the second and the third saccades. Glissades of this extreme size are not uncommon in data that we have recorded from reading, mathematical problem solving, and many other tasks. Between 20-40% of all saccades end with a glissade, but almost no saccades start with this type of movement (Nyström & Holmqvist, 2010).

Some algorithms treat glissades like just another type of noise. Stampe (1993) describes glissades as noise that "includes ringing or overshoot artefacts following saccades, which can confuse the saccade detector into extending the saccades into the next fixation".
Similarly, Duchowski (2007) describes filters that are optimized for idealized saccades (without glissades), thus smoothing out the glissades before the fixation algorithm gets the velocity data. The EyeLink parser seems to assign glissades to fixations, as shown in Figure 5.18. However, from this figure it remains unclear whether the EyeLink algorithm at times generates very short fixations as a result of poor glissade treatment, or whether it is robust enough to avoid that. The manual tends to point in the former direction: "Post-processing or data cleanup may be needed to prepare data during analysis. For example, short fixations may need to be discarded or merged with adjacent fixations, or artefacts around blinks may have to be eliminated" (SR Research, 2007). However, whether the short fixations are due to glissades remains to be investigated.

Fig. 5.17 First fixation durations on critical and non-critical areas of interest (AOIs) in a mathematical problem solving task (described on page 5). 'Low', 'Middle', and 'High' are three groups of participants with varying levels of mathematical competence. Fixations were detected using the SMI velocity algorithm and a 40°/s threshold. The same data is presented both left and right, but to the right the data include the small 'fixations'. Why such a huge difference on first fixations?

First fixation duration values are extremely sensitive to these short 'fixations' in data, because when a participant makes a saccade into an area of interest, the saccade very often ends with a glissade, and the SMI velocity algorithm often outputs a false 'fixation' before the glissade, and also before the real fixation. Therefore, during analysis, forgetting to remove short
'fixations' before calculating first fixation duration values means using lots of unreasonably short 'fixations', and getting averages that are lower, as in Figure 5.17.

Fig. 5.17 (continued) After a saccade into an area, the very short 'fixation' that the SMI velocity algorithm finds between the saccade and the following glissade is taken as the first fixation, since it precedes the real fixation. Note that the very short 'fixations' not only make the averages much lower, they also introduce noise that conceals significant effects.

Fig. 5.18 Event detection with the EyeLink parser for reading data ('Normal' sensitivity). Data were collected with the head-mounted system at 500 Hz. Horizontal axis: time (ms). Thick lines at the bottom of the graph indicate where 'fixations' have been detected. Note how glissadic movements are systematically assigned to the following fixation and how parts of the saccades are also attributed to fixations.

Glissades have until recently been treated unsystematically and differently across algorithms and even within the same algorithm, sometimes being attributed to saccades, other times to fixations. Some researchers express the need to exclude them completely from further analysis. Gilchrist and Harvey (2006) require the velocity to remain below 30°/s for at least five samples (20 ms) to count as a fixation, which "excludes the interval of ocular instability just after the
The i black lines, start 4 is clearly inconsdj points and dispcaa Figure 5.19 that J point of the aswt away from its laaj data samples arefl the saccade. Tbi] think the disj deep inside tfaej with its new accommodate 1 Dispersk vidual saccade all sampling freaj since the sasaj VtlociTf-ktd of sampling fiaa possible to seta The reason it m particaiari CHALLENGING ISSUES IN EVENT DETECTION | 167 ■duration on ■ t High ing task (described on leveis of mathematical s threshold. The same I fixations'. Why such fen' that the SMI rst fixation, since averages much lower, r 4.9426 x 106 ty). Data were col-ph indicate where gned to the follow- iots of unreasonably •ntly across algo-to saccades, other completely from furbelow 30°/s for at . rval of ocular in- stability just after the saccade", and argue that this leads to a more accurate calculation also of fixation location, although fixation durations may be shorter than typically reported, as the glissade is assigned to the saccade. Investigating post-saccadic drift, Collewijn et al. (1988) considered the eye velocity during a 100 ms period after a saccade, but arguing that "This period started 20 ms after the end of each saccade in order to avoid contamination by the dynamic overshoot frequently associated with a saccade". The prevalence of glissades appears to vary across eye-trackers, being more common, larger, shorter, and heavily curved in DPI-systems compared to video-based, while they are eradicated or transformed to post-saccadic drift in coil-based eye-trackers (Deubel & Bridge-man, 1995; Frens & Van Der Geest, 2002). Glissades can be observed in data collected with any video-based high-speed eye-tracker with good precision. In low-speed, remote systems, they may be more difficult to see, since the average glissade duration is only slightly larger than 20 ms (one sample in a 50 Hz eye-tracker) (Nystrom & Holmqvist, 2010). 
Since glissades are positioned between a saccade and the following fixation, the question is which of the two to assign them to. The fact that perceptual visual intake appears to be closed during glissades (McConkie & Loschky, 2002), as well as the fact that glissades follow the same main sequence relationships as saccades (p. 318), tells us that they are predominantly saccadic in nature. As glissade-detecting algorithms become more available, we can surely expect to see more studies using this event.

5.4.4 Sampling frequency
There is a tendency for dispersion-based algorithms to be used for data collected at a low sampling frequency, such as 50 Hz, and velocity algorithms for data collected at higher sampling frequencies (say > 200 Hz), but there are also exceptions. The Tobii Fixation filter, for instance, is a velocity-based algorithm for fixation detection in data as slow as 30 and 50 Hz.

Dispersion-based algorithms end a 'fixation' as soon as the raw samples cross the border defined by the dispersion radius. In high-speed data, this border can be crossed at just about any time. The two 'saccades' identified by the I-DT algorithm in Figure 5.19, indicated by black lines, start a bit into a real saccade, and two thirds into a real fixation, respectively. This is clearly incorrect, and reflects the fact that the dispersion algorithm is only aware of centre points and dispersion, and only indirectly of velocity and acceleration. Take the 'fixation' in Figure 5.19 that starts in the middle of the first real saccade. I-DT starts calculating the centre point of the new 'fixation' here, even though the eye is in full motion and some distance away from its landing point in the real fixation. Once the eye has reached the real fixation, data samples are close to the dispersion radius border, as most of the distance was spanned by the saccade. This means that even very small movements inside the real fixation make I-DT think the dispersion border has been crossed.
In Figure 5.19, this happens at time 8360 ms, deep inside the real fixation. After a minimal 'saccade', the I-DT starts a new 'fixation', with its new centre point and dispersion, which happen to be chosen generously enough to accommodate the next real saccade inside the 'fixation'.

Fig. 5.19 Note how the dispersion algorithm reduces the duration of saccades, and even inserts false 'saccades' in the midst of a fixation in this reading data recorded at 1250 Hz with a tower-mounted system. Grey lines depict the x- and y-coordinates in the coordinate system of the scene video. The dark line is eye velocity. The bottom bar indicates 'fixations' (light) and 'saccades' (darker) according to the I-DT algorithm with 100 ms and 80 pixels settings. (Horizontal axis: time in ms.)

Dispersion-based algorithms have a large problem with their imprecise estimate of individual saccade and fixation durations. The same miscalculation of fixation onsets occurs at all sampling frequencies, but for the lowest sampling frequencies, this is not as big a problem since the sampling frequency by itself is the major limiting factor.

Velocity-based saccade detection algorithms are better suited for use with a wide spectrum of sampling frequencies. With suitable filtering when velocity data is calculated, it is quite possible to get clear velocity peaks even for 50 Hz data, as shown for instance in Figure 5.20. The reason it is uncommon to use velocity algorithms for low-speed data is that velocity, and in particular acceleration, can be calculated only crudely when the sampling frequency is low. If our task is to detect saccades correctly rather than to measure them with high precision, all we need is a fair estimate of peak velocity, and this we can get even at 50 Hz.
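As a rough illustration of the point about filtering and velocity at low sampling frequencies, the following sketch computes a velocity signal from gaze positions. It is a minimal example under stated assumptions, not the method of any particular manufacturer: gaze coordinates are assumed to be already expressed in degrees of visual angle, the smoothing is a plain moving average (the filter type and its width are illustrative choices), and the function name `gaze_velocity` is hypothetical.

```python
import numpy as np

def gaze_velocity(x_deg, y_deg, fs_hz, smooth_samples=3):
    """Return eye speed in deg/s for each sample.

    Assumptions (not from the text): positions are in degrees of visual
    angle, and smoothing is a moving average over an odd number of samples.
    """
    if smooth_samples % 2 == 0:
        raise ValueError("smooth_samples must be odd")
    x = np.asarray(x_deg, dtype=float)
    y = np.asarray(y_deg, dtype=float)
    if smooth_samples > 1:
        pad = smooth_samples // 2
        kernel = np.ones(smooth_samples) / smooth_samples
        # Repeat the edge values so the filter does not create
        # artificial movement at the start and end of the recording.
        x = np.convolve(np.pad(x, pad, mode="edge"), kernel, mode="valid")
        y = np.convolve(np.pad(y, pad, mode="edge"), kernel, mode="valid")
    # Central differences (one-sided at the edges), scaled from
    # degrees per sample to degrees per second.
    vx = np.gradient(x) * fs_hz
    vy = np.gradient(y) * fs_hz
    return np.hypot(vx, vy)

# Synthetic 50 Hz data: a fixation at 0 deg, one in-flight sample, a fixation at 5 deg.
x = [0.0] * 10 + [2.5] + [5.0] * 10
y = [0.0] * 21
v = gaze_velocity(x, y, fs_hz=50)
print(int(np.argmax(v)), round(float(v.max()), 1))
```

Even with a single sample in flight, the velocity peak falls on the saccade and stands well clear of the fixation level, which is all a threshold-based detector needs. The peak value itself is only a crude estimate of the true peak velocity, in line with the caveat above.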
In conclusion, while dispersion-based algorithms do not produce valid event data for higher sampling frequencies, the velocity algorithms have good potential for the entire spectrum of sampling frequencies. The use of acceleration is only suitable for data acquired with high-speed systems, however.

5.4.5 Smooth pursuit

An increasing number of studies use stimuli and experimental set-ups that induce the participant to make smooth pursuit movements, for instance by using animated or video stimuli, or taking the participant out with a head-mounted eye-tracker to make simultaneous head and eye movements. Many of these studies use data recorded at low sampling frequencies, and dispersion algorithms are used to calculate fixations. Figure 5.20 shows data from a participant walking past a shelf in the supermarket (experiment 3 on page 5). He walks, turns his head, and moves his eyes simultaneously. When applying the I-DT algorithm to the data in this figure, three fairly correct saccades are indeed found, but also four or five false ones. The impact on variables such as fixation duration or saccade rate is therefore disastrous. Such event data cannot be used. Interestingly, the velocity peaks of this 50 Hz data seem to better estimate where the saccades are located, raising the question of whether a velocity-based algorithm would have been a better choice (which finds some support in Munn, Stefano, & Pelz, 2008).

Velocity algorithms typically assign smooth pursuit data to the same category as fixations. Itti (2006), using video data, applies a velocity algorithm to remove all saccades, arguing that the remaining mixture of fixation and smooth pursuit data can be seen as a 'visual intake' category. Depending on the purpose of the study, such a mixed category could be sound or not. Itti wanted to compare all visual intake to that predicted by his algorithm, but made no duration statistics on the data.
In most commercial software packages, this is indeed also the best case scenario of how smooth pursuit is handled; in the Tobii Fixation filter implementation, the 'visual intake' category is still labelled 'fixations', and users with smooth pursuit data may be misled into making various statistics on the duration and prevalence of these 'fixations'. This is also the current status for parsers of high-speed data from EyeLink and SMI. Figure 5.21 illustrates how the EyeLink parser treats smooth pursuit data from a person following a pendulum movement on a computer screen.

Fig. 5.20 Eye movement data from a head-mounted eye-tracker at 50 Hz on a participant walking past a shelf in a supermarket. Dark lines are the x- and y-coordinates in the coordinate system of the scene video. The grey line is eye velocity. Bottom bar indicates fixations (light grey) and saccades (dark grey) according to the I-DT algorithm with 80 ms and 80 pixels settings. (Horizontal axis: time in ms.)

Fig. 5.21 Event detection with the EyeLink parser for smooth pursuit data ('Normal' sensitivity). Data were collected at 500 Hz with the head-mounted system from a person viewing a pendulum movement. The lines at the bottom of the graph indicate where 'fixations' have been detected. Notice how fixation and smooth pursuit are merged into the same category. (Horizontal axis: time in ms, × 10^6.)
The worst case scenario, of which Figure 5.20 is an example, is that smooth pursuit eye movement causes an algorithm to output events that are clearly not present. Using velocity thresholds only, it may be hard to separate fast smooth pursuit, which can reach velocities of 100°/s (Meyer, Lasker, & Robinson, 1985), from slow saccades. If we want correct fixation and saccade data with animated stimuli, smooth pursuit needs to be identified as an event in its own right, and we will see later how this could be done.

5.4.6 Binocularity

When processing binocular data, the SMI velocity algorithm of BeGaze 2.1 finds differences between the eyes in both the number and duration of fixations and saccades, as indicated by Figure 5.22. This should come as no surprise, since it is known that the eyes do not move in complete synchrony, either in position or in speed and acceleration (p. 24 and 449). Nevertheless, the hard thresholds used by the algorithms could make even subtle differences in eye movement between the left and the right eye count.

Part of the reason why events can be detected very differently is that the two eyes do not make exactly the same glissadic movements after the saccade, and sometimes only one of the eye velocities reaches down below the saccadic offset threshold, as exemplified by the right eye (grey curve) in Figure 5.22. The amplitude and duration of the saccade then differ between the eyes.

Fig. 5.22 Reading near the borders of a flat monitor: Velocities of saccade far to the right, return sweep, and saccade far to the left. Black line is left eye; grey line right eye. Recorded with a tower-mounted system at 500 Hz in binocular mode. (Horizontal axis: time in ms.)
Another reason is that the two eyes have different distances to the right and left parts of the monitor. Figure 5.22 shows first a saccade in the far right of the monitor (at around 19,350 ms), then a return sweep (at about 19,650 ms), and finally a reading saccade at the far left of the monitor (at around 19,900 ms). At the right-hand side of the monitor, the right-eye saccades have a larger amplitude and the fixation durations are shorter than for the left eye. Conversely, on the left side of the monitor, the fixations of the left eye will be shorter, and its saccadic amplitudes longer than on the right side of the monitor (compare Figure 2.4 on page 24). This is not really a problem of the algorithm, but rather raises the question of whether we should continue to record monocularly, and accept only the saccades and fixations of the one eye that we happen to select.

Since the majority of eye-tracking research is monocular, velocity algorithms have mostly been applied to monocular data. The I-DT algorithm appears not to be used in any real binocular research, probably as it is too imprecise in itself. Binocular event detection algorithms using the covariance between the eyes have been developed (Van Der Lans et al., 2010).

Fig. 5.23 The I-DT detection criteria: gaze must reside within a limited spatial region for a specified minimum duration. Two fixations clearly fulfil the spatial dispersion criterion, but what about the more dispersed blob on Sorcerer? Is that one fixation or two? In the end, your settings will decide.
5.5 Algorithmic definitions

5.5.1 Dispersion-based algorithms

Dispersion-based algorithms are the most common type of event-detection algorithms, and are implemented in many commercial analysis software packages. They have mostly been used for low-speed data, and have long been considered the prime choice when analysing 50 Hz data. In short, dispersion algorithms detect only fixations and collect all other events into a common category. They identify fixations by finding data samples that are close enough to one another for a specified minimal period of time. They do not make any use of velocity or acceleration information to calculate the precise on- or offsets of fixation. Related cluster algorithms are presented by Urruty et al. (2007), Santella and DeCarlo (2004), and Goldberg and Schryver (1995b).

The most used and also best of the dispersion algorithms is, according to Salvucci and Goldberg (2000), the identification by dispersion threshold (I-DT) algorithm; they tested six fixation algorithms with respect not only to accuracy and robustness, but also ease of implementation and speed. There are a number of commercial implementations of dispersion-based algorithms, for example by ASL and SMI. The pseudo-code for the I-DT algorithm is:

I-DT. Input: (raw data samples, dispersion threshold, duration threshold)
1. While there are still data samples
   (a) Initialize window over first samples to cover duration threshold
   (b) While dispersion <= threshold, add samples to window
   (c) Note a fixation at the centroid of window samples
   (d) Remove window points from samples

Dispersion is defined as $d = [\max(x) - \min(x)] + [\max(y) - \min(y)]$, where $(x, y)$ represent the samples inside the window.

The dispersion algorithms combine a temporal window (duration threshold) with a spatial requirement (the dispersion threshold). For instance, the temporal threshold may be 100 ms, and the dispersion threshold 1° of visual angle.
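The I-DT pseudo-code and the dispersion definition above can be sketched in a few lines of Python. This is an illustrative reading of the pseudo-code, not any vendor's implementation. Two points are assumptions on our part: the function and parameter names are hypothetical, and when the initial window already exceeds the dispersion threshold, the first sample is dropped and the window is re-initialized, as in Salvucci and Goldberg's (2000) formulation (the pseudo-code above leaves this branch implicit). Positions and the dispersion threshold must be in the same unit (pixels or degrees).

```python
import numpy as np

def dispersion(x, y):
    """d = [max(x) - min(x)] + [max(y) - min(y)] for the samples in a window."""
    return (np.max(x) - np.min(x)) + (np.max(y) - np.min(y))

def idt(t_ms, x, y, dispersion_threshold, duration_threshold_ms):
    """Return fixations as (onset_ms, offset_ms, centroid_x, centroid_y) tuples."""
    t = np.asarray(t_ms, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(t)
    fixations = []
    start = 0
    while start < n:
        # (a) Smallest window starting at `start` that covers the duration threshold.
        end = start
        while end < n and t[end] - t[start] < duration_threshold_ms:
            end += 1
        if end >= n:
            break  # too few samples left to cover the duration threshold
        if dispersion(x[start:end + 1], y[start:end + 1]) <= dispersion_threshold:
            # (b) Add samples while the dispersion stays within the threshold.
            while (end + 1 < n and
                   dispersion(x[start:end + 2], y[start:end + 2]) <= dispersion_threshold):
                end += 1
            # (c) Note a fixation at the centroid of the window samples.
            fixations.append((t[start], t[end],
                              x[start:end + 1].mean(), y[start:end + 1].mean()))
            # (d) Remove the window points from the samples.
            start = end + 1
        else:
            # Assumed branch: drop the first sample and try again.
            start += 1
    return fixations

# Synthetic 100 Hz data: fixation at 0, a four-sample movement, fixation at 10.
t = np.arange(0, 400, 10)
x = np.array([0.0] * 15 + [2.0, 4.0, 6.0, 8.0] + [10.0] * 21)
y = np.zeros(40)
for fixation in idt(t, x, y, dispersion_threshold=1.0, duration_threshold_ms=100):
    print(fixation)
```

Note that the sketch returns only fixations; everything in between falls into the common leftover category, and nothing in the procedure looks at velocity or acceleration. Recomputing the dispersion for every added sample keeps the code close to the pseudo-code at the cost of speed.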
With these example settings, a sequence of data samples is considered a fixation only when the samples stay within a 1° diameter for at least 100 ms. This principle is illustrated in Figure 5.23.

The I-DT algorithm has a number of cousins who all use a temporal threshold, but calculate the spatial dispersion criterion somewhat differently (Blignaut, 2009; Shic, Scassellati, & Chawarska, 2008; Salvucci & Goldberg, 2000). Moreover, the algorithmic variations have different ways to deal with noise, as we have pointed out above.

5.5.2 Velocity and acceleration algorithms

Fixation detection algorithms

Like dispersion algorithms, fixation velocity algorithms use a duration criterion, but combine it with a stillness criterion based on eye velocity rather than with a spatial one. The eye velocity is seldom at the absolute zero level, because of micro-movements in the eye and eye-tracker-related noise. Therefore users of this algorithm must decide an upper velocity threshold for fixations. Figure 5.24 shows a velocity over time chart for three fixations and two intermediate saccades. The velocity during fixations in this reading data has its peaks at 6-10°/s. The shortest saccades typically have velocity peaks of about 30-40°/s.

Fig. 5.24 Velocity chart for three fixations and two saccades, recorded at 1250 Hz with a tower-mounted system. Velocity calculated with BeGaze 2.1. (Horizontal axis: time in ms.)

Rötting (2001) summarizes settings for the velocity threshold used for this type of fixation analysis in five quoted studies: < 16°/s, < 20°/s, < 6.58°/s, < 50°/s, < 37.7°/s. The two last settings most likely reflect considerable noise in the eye-tracking systems used in the quoted studies, and definitely run the risk of categorizing some short saccades as parts of fixations.
Thresholds could also vary since the velocity samples have undergone different types of lowpass filtering prior to detection; little filtering requires higher thresholds. This type of algorithm also requires an additional minimal duration threshold for fixations, which can be set to anything between 60 and 120 ms, according to Rötting's review; see also pages 155-156 for an extended discussion. The algorithm thus finds fixations as periods longer than a minimal duration, during which the eye velocity is below a maximum velocity threshold.

Even though the 'fixation radius' and 'Min fixation duration' settings in Figure 5.1(b) on page 148 invite the user to believe that it is a dispersion-based algorithm, the Tobii ClearView fixation algorithm is in principle similar to the I-VT algorithm by Salvucci and Goldberg (2000); the 'fixation radius' setting refers to the maximum distance between two consecutive samples in pixels. Fixations comprise consecutive samples whose distances are shorter than the 'fixation radius' over a period longer than the minimum fixation duration. Note, however, that according to Blignaut and Beelders' (2009) classification of dispersion metrics on page 155, the Tobii ClearView algorithm could well fit under the umbrella of dispersion-based algorithms.

Saccade detection algorithms

A velocity-based saccade detection algorithm focuses on identifying the saccadic velocity peaks. Motion above a velocity threshold, for instance 75°/s, is assumed to be a saccade. In order to differentiate real saccades from artefacts, which can also be fast movements, there are usually additional constraints on saccades, such as a clear speed peak near the middle of the saccade (Smeets & Hooge, 2003), or that the peak saccade velocity cannot be higher than a certain threshold (Nyström & Holmqvist, 2010). What is not identified as a saccade is typically assumed to be a fixation.
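The velocity-threshold principle for saccade detection can be sketched as follows. The sketch is a simplified illustration, not the algorithm of any of the cited papers or products: it takes a precomputed velocity signal in °/s, treats every unbroken run of samples above the threshold as a saccade candidate, and discards candidates whose peak exceeds an upper limit. The 75°/s default follows the example in the text, while the 1000°/s upper limit and all names are assumptions made for the example.

```python
import numpy as np

def detect_saccades(velocity_deg_s, velocity_threshold=75.0,
                    max_peak_velocity=1000.0):
    """Return saccade candidates as (first_index, end_index_exclusive, peak_velocity).

    `max_peak_velocity` is an assumed artefact limit: runs whose peak
    velocity exceeds it are rejected as physiologically implausible.
    """
    v = np.asarray(velocity_deg_s, dtype=float)
    above = v > velocity_threshold
    # Pad with False so that runs touching either end of the data
    # still produce a rising and a falling edge.
    padded = np.concatenate(([False], above, [False])).astype(int)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)   # first sample above the threshold
    ends = np.flatnonzero(edges == -1)    # first sample back below it
    saccades = []
    for s, e in zip(starts, ends):
        peak = float(v[s:e].max())
        if peak <= max_peak_velocity:
            saccades.append((int(s), int(e), peak))
    return saccades

# One plausible saccade (peak 300 deg/s) and one artefact spike (2000 deg/s).
velocity = [5.0] * 10 + [100.0, 300.0, 100.0] + [5.0] * 10 + [2000.0] + [5.0] * 5
print(detect_saccades(velocity))
```

Samples outside the returned intervals would, following the convention described above, be treated as fixation, which is also why smooth pursuit ends up labelled as fixation in this kind of scheme. A fuller implementation would add the minimal fixation duration and the lowpass filtering discussed earlier.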
Surprisingly, very few, if any, algorithms have used the fact that saccades follow [...] (i.e. eye movemen[...]

Velocity-based [...] are also available [...] found in the co[...] [...]lustrate strengths [...] is a more elab[...] (1981) and Salv[...] by Inchingolo and [...] from 1965 and [...] The particular a[...] [...]ples above is based [...] paper spells out [...] be specific about [...] SMI employee [...] used in research [...] their BeGaze [...]

The SMI vel[...] data stream twice. [...] time to find saccad[...]

SMI velocity saccade peak [...]
1. For all [...]
   (a) Cal[...]
   (b) Def[...]
   (c) Cal[...]
2. For all [...]
   (a) Collect[...]
   (b) Collect[...] veloc[...]
   (c) Detect[...] veloc[...]
   (d) Detect[...]
   (e) [...]

In other words, [...] accepts these as