Author’s Accepted Manuscript

A hitchhiker's guide to lesion-behaviour mapping
Bianca de Haan, Hans-Otto Karnath

PII: S0028-3932(17)30396-2
DOI: https://doi.org/10.1016/j.neuropsychologia.2017.10.021
Reference: NSY6540
To appear in: Neuropsychologia
Received date: 6 June 2017
Revised date: 16 October 2017
Accepted date: 17 October 2017

Cite this article as: Bianca de Haan and Hans-Otto Karnath, A hitchhiker's guide to lesion-behaviour mapping, Neuropsychologia, https://doi.org/10.1016/j.neuropsychologia.2017.10.021

Bianca de Haan1,I; Hans-Otto Karnath1,2

1 Center of Neurology, Division of Neuropsychology, Hertie-Institute for Clinical Brain Research, University of Tübingen, Tübingen, Germany
2 Department of Psychology, University of South Carolina, Columbia, USA
I Present address: Division of Psychology, Department of Life Sciences, Centre for Cognitive Neuroscience, Brunel University London, Uxbridge, UK

Corresponding author: Bianca de Haan, Division of Psychology, Department of Life Sciences, Brunel University London, Kingston Lane, Uxbridge, UB8 3PH, UK. E-mail: bianca.dehaan@brunel.ac.uk Tel: 0044 (0)1895 265797

Abstract

Lesion-behaviour mapping is an influential and popular approach to anatomically localise cognitive brain functions in the human brain.
Multiple considerations, ranging from patient selection, assessment of lesion location and patient behaviour, spatial normalisation, statistical testing, to the anatomical interpretation of obtained results, are necessary to optimize a lesion-behaviour mapping study and arrive at meaningful conclusions. Here, we provide a hitchhiker’s guide, giving practical guidelines and references for each step of the typical lesion-behaviour mapping study pipeline.

Key words: Lesion analysis; voxel based lesion symptom mapping; VLSM; VLBM; brain behaviour inference; human

1 Introduction

In a classical lesion analysis, the aim is to infer the cognitive function of an area of the human brain by observing the behavioural consequences of damage to that brain area. In the early days, the lesion method was the only approach available to study the functional architecture of the brain and these early studies have contributed tremendously to our understanding of a wide variety of cognitive functions (e.g. Damasio & Damasio, 1989). Nowadays, functional brain imaging and transient neuroinhibition/neurostimulation techniques have complemented this traditional approach. Nevertheless, the lesion method continues to be an essential and influential approach for neuroscientists aiming to study the functional architecture of the brain (see Rorden and Karnath, 2004 for a review). This continued importance of the lesion method as an approach to aid understanding of cognitive function is also illustrated by the steadily increasing number of scientific publications featuring this method over the years (see Figure 1).

--- Figure 1 about here ---

Surprisingly, despite this continued and increasing popularity of the lesion method, there are currently virtually no papers or books to assist scientists interested in using this method (see Wilson, 2016 for a notable exception).
This is in stark contrast to the many excellent papers and books available to guide scientists interested in using other neuroscientific methods. Thus, inspired by similar recent guides for diffusion tensor imaging (DTI; Soares et al., 2013) and functional magnetic resonance imaging (fMRI; Soares et al., 2016), we here compile a hitchhiker’s guide detailing the necessary considerations at each step of the typical lesion study pipeline (see Figure 2), with a primary focus on ‘classical’ univariate lesion analysis approaches (Bates et al., 2003; Rorden et al., 2007). New multivariate lesion analysis approaches have also been proposed (Smith et al., 2013; Mah et al., 2014a; Zhang et al., 2014; Pustina et al., in press, this issue) and, in light of known drawbacks associated with the univariate approach, such as limited statistical power and potential spatial bias (Mah et al., 2014a; Inoue et al., 2014; for a review, see also Sperber & Karnath, in press, this issue), it has been suggested that these multivariate approaches should be preferred over univariate lesion analysis approaches (Mah et al., 2014a). However, multivariate approaches have their own issues requiring future improvements, such as the feature selection method (Yourganov et al., 2015; Rondina et al., 2016). Moreover, the spatial bias in univariate approaches can be reduced considerably by correcting for lesion volume and ensuring sufficient minimum lesion overlap (Sperber & Karnath, 2017; see also section 5.2.1 below), and the possibility that multivariate approaches are likewise affected by a spatial bias has not been ruled out yet. As such, a more nuanced view would be to consider univariate and multivariate lesion analysis approaches as complementary (Karnath et al., in press).

--- Figure 2 about here ---

2 Patient selection

The first decision a researcher planning a lesion-behaviour mapping study has to make is which patients to select for the study.
Patient assessment is time-consuming and, as such, it is generally most efficient to restrict patient assessment to those patients that will allow both meaningful conclusions on the functional architecture of the brain and a meaningful investigation of the research hypothesis.

2.1 Lesion aetiology

One frequently used patient selection criterion is lesion aetiology. Of the 181 studies depicted in Figure 1 (i.e. the publications between 1995 and 2015 that used the lesion method found searching Pubmed), the vast majority (66.3%) were conducted with stroke patients.

2.1.1 Use acute stroke patients to investigate the functional architecture of the brain

In the acute phase, strokes are associated with clear behavioural consequences that can, for the large part, be directly linked to the original function of the functionally impaired part of the brain (as strokes are sudden and as such the brain has not yet had time to functionally reorganise, see also Shahid et al. [2017]). As long as patients with mass shifts due to extensive haemorrhage or extensive oedema are excluded, all brain structures are typically still at their original locations and the parts of the brain affected by stroke can be reliably visualised in computed tomography (CT) (Mohr et al., 1995; Mayer et al., 2000) and/or magnetic resonance (MR) (Neumann-Haefelin et al., 1999; Ricci et al., 1999; Schlaug et al., 1999) images (see section 3.1 below). As such, acute stroke patients are highly suitable for studies where the aim is to investigate the functional architecture of the brain.

2.1.2 Use chronic stroke patients to investigate the neural correlate of chronic dysfunction

Unfortunately, acute stroke patient data is not always easy to acquire. Access to stroke units and acute stroke patients may be restricted and the behavioural assessment of acute stroke patients is frequently difficult due to their often poor general state of health.
As a consequence, many lesion-behaviour mapping studies have relied on more readily available chronic stroke patient data. However, in the chronic stroke phase, functional reorganization of the brain in the course of normal recovery can complicate the investigation of the functional architecture of the brain (Karnath and Rennig, 2016). This is mostly due to the fact that during lesion-behaviour mapping analyses, chronic stroke patients who have recovered from their initial cognitive deficit are grouped together with chronic stroke patients who never had a deficit. As a consequence, the lesion-behaviour mapping analysis (incorrectly) assumes that the parts of the brain damaged in chronic stroke patients who have recovered are not, or less critically, associated with the cognitive function of interest. Karnath and Rennig (2016) recently tested the consequences of this effect, comparing the three most common combinations of structural imaging data and behavioural scores used in previous lesion-behaviour mapping studies. Only the combination of acute behavioural scores and acute structural imaging precisely identified the targeted brain areas. In contrast, lesion-behaviour mapping analyses based on chronic behaviour, in combination with either chronic or acute imaging, hardly detected any of the targeted substrates. Moreover, in the chronic stroke phase, the precise determination of the parts of the brain that are functionally impaired in CT and/or MR images is complicated by secondary morphological changes to the brain due to tissue resorption following brain damage, such as structural distortions, sulcal widening, and ventricle enlargement (Karnath and Rorden, 2012). As such, chronic stroke patients are less suitable than acute stroke patients for studies where the aim is to investigate the functional architecture of the brain.
However, if combined with CT and/or MRI data obtained in the acute stroke phase, the behavioural data from chronic stroke patients has been shown to be highly suitable for studies where the aim is to investigate the neural correlates of chronic cognitive deficits, i.e. studies where the aim is to determine where in the brain acute damage results in a cognitive deficit that is still present in the chronic stroke stage 6 months or more following stroke onset (Karnath et al., 2011; Abela et al., 2012; Wu et al., 2015). This is a question of high clinical relevance, as it ultimately has the potential to enable long-term clinical predictions based on the location of the acute brain damage.

2.1.3 Avoid combining acute and chronic patients in the same lesion-behaviour mapping analysis

Importantly, combining both acute and chronic stroke patients in the same lesion-behaviour mapping study entails the risk of ending up with the worst of two worlds: Even if CT and/or MRI data is obtained in the acute stroke phase for all patients, the different amounts of cortical reorganization in different patients are likely to confound the interpretation of the lesion-behaviour mapping results, regardless of whether the aim was to study the functional architecture of the brain, or to study the neural correlates of chronic cognitive deficits. This problem is most straightforwardly demonstrated with a thought experiment (see Figure 3): Imagine that both lesions depicted on the top of Figure 3 equally affect the brain area crucially related to a certain cognitive function of interest. Thus, directly following stroke onset, the behavioural deficit in these two patients is maximal and equal (i.e. the behavioural deficit score is ‘54’ for both cases). In both patients the severity of the behavioural deficit decreases equally over time due to spontaneous recovery. Now imagine that both patients are recruited for a lesion-behaviour mapping study.
However, whereas one of the patients is recruited and behaviourally assessed in the acute stroke phase (and thus shows a behavioural deficit score of ‘54’ in Figure 3), the other patient is first seen and assessed in the intermediate/chronic stroke phase following considerable spontaneous recovery of their behavioural deficit (let us assume by 50%, leading to a measured behavioural deficit score of ‘27’ in Figure 3). Although both brain lesions equally affect the brain area crucially related to the cognitive function of interest, the lesion analysis now erroneously weights the lesion location of the patient with the behavioural deficit score of ‘27’ as being less relevant for the cognitive function of interest than the lesion location of the patient with the behavioural deficit score of ‘54’. As such, the overall contribution of this brain area to the cognitive function of interest is ultimately underestimated. The main underlying issue is that lesion-behaviour mapping analyses assume that each patient is assessed at the same point in time following stroke onset and that thus the contribution of a certain brain area to a certain cognitive function is directly reflected in the behavioural scores in each patient. Combining both acute and chronic stroke patients in the same lesion-behaviour mapping study violates this assumption.

---- Figure 3 about here ----

2.1.4 Lesion aetiologies other than stroke

Beyond strokes, lesion-behaviour mapping analyses have also been conducted with other lesion aetiologies. In the 181 lesion-behaviour mapping studies depicted in Figure 1, the second and third most popular lesion aetiologies were traumatic brain injury and brain tumour (10.5% and 5.5% of the studies respectively). Moreover, a further 9.9% of these studies combined patients with different lesion aetiologies in the same study (e.g. included patients with stroke, traumatic brain injury and brain tumour).
There is, however, considerable debate concerning the suitability of these patients with lesion aetiologies other than stroke for lesion-behaviour mapping analyses. Specifically, a considerable body of work suggests that traumatic brain injury and tumour patients might be less suitable than stroke patients for lesion-behaviour mapping analyses that aim to study the functional architecture of the healthy brain. In traumatic brain injury patients, a major neuropathological component, beyond focal brain damage and regardless of traumatic brain injury severity (mild, moderate or severe) or mechanism (closed or penetrating), is diffuse axonal injury (Gennarelli et al., 1982; Povlishock and Katz, 2005; Büki and Povlishock, 2006; Su and Bell, 2016). Moreover, this diffuse axonal injury has been suggested to contribute significantly to the cognitive impairments (and their recovery) observed following traumatic brain injury (Povlishock and Katz, 2005; Levine et al., 2013). However, while areas of diffuse axonal injury of sufficient size can be detected in MRI (Su and Bell, 2016), the full extent of diffuse axonal injury can only be detected histopathologically (Adams et al., 1991; Povlishock, 1993; Johnson et al., 2013). This presents a significant problem for lesion-behaviour mapping analyses, as these analyses require an accurate in vivo determination of which areas of the brain are functionally impaired and which areas of the brain are functionally intact. Additionally, traumatic brain injury patients are typically investigated in a chronic disease phase, in which case the same caveats as noted above for chronic stroke patients hold. In brain tumour patients, there is likewise evidence that an accurate determination of functionally impaired and functionally intact areas of the brain can be problematic. 
The most common type of malignant primary brain tumour is glioma, accounting for 74.6% of all malignant brain and other central nervous system tumours (Ostrom et al., 2016). In gliomas, however, the precise spatial extent of the tumour is impossible to determine. The vast majority of gliomas are characterised by diffuse infiltration of surrounding tissue (Scherer, 1940) that can extend considerably beyond the tumour border visible in conventional T1 or T2 MRI images (Burger et al., 1988; McKnight et al., 2002; Swanson et al., 2004). While more recent imaging modalities such as diffusion tensor imaging and proton MR spectroscopy may improve the visualisation of the tumour (Claes et al., 2007), many areas of tumour infiltration occur at a spatial scale that cannot be detected even with these newer imaging modalities. Critically, and presenting a significant problem for lesion-behaviour mapping analyses, it is unclear whether or to what extent brain function is impaired in these (in vivo not detectable) areas of diffuse tumour infiltration (Karnath and Steinbach, 2011). In the case of preoperative tumour patients, this problem in accurately determining the functionally impaired and functionally intact areas of the brain is further exacerbated by observations suggesting that brain function can sometimes be preserved within the tumour, particularly (but not exclusively) in patients with a slow-growing low-grade glioma (Ojemann et al., 1996; Skirboll et al., 1996; Schiffbauer et al., 2001). An additional problem with using glioma patients in lesion-behaviour analyses is that these tumours tend to develop on a relatively long time-scale. That is, unlike strokes or traumatic brain injury, gliomas do not have a sudden onset. Instead, gliomas slowly grow and are only diagnosed when they are both large enough to be detected in CT or MR images and clinically symptomatic (Swanson et al., 2003; Pallud et al., 2013).
This relatively long time-scale of development means that, by the time the tumour is diagnosed, most glioma patients’ brains have undergone a considerable amount of (compensatory) functional reorganization (Wunderlich et al., 1998; Fandino et al., 1999; Thiel et al., 2001; Holodny et al., 2002; Meyer et al., 2003; Taniguchi et al., 2004; Shaw et al., 2016). This means that in tumour patients, the behavioural consequence of brain damage may no longer reflect the original function of the damaged part of the brain, which presents a serious problem for lesion-behaviour mapping analyses that aim to study the functional architecture of the brain.

2.2 Lesion location

Beyond lesion aetiology, another patient selection criterion is lesion location. For example, based on a convincing body of previous findings, we might expect a priori that certain language functions (e.g. language production) are associated with a certain part of the brain (e.g. the left hemisphere). As such, when interested in, e.g., language production, our research question might be “which areas of the left hemisphere causally contribute to language production?”. In this case, it would make little sense to also assess patients with right hemispheric brain damage. Likewise, on a smaller spatial scale, we might know from various previous investigations that a certain function of interest is located in a particular region of the brain, e.g., that a specific executive function is governed by prefrontal cortex. As such, when interested in describing where exactly within the prefrontal cortex this specific function is located, it would make little sense to also assess patients with posterior brain damage.
However, while restricting patient selection to patients with damage to only a certain part of the brain is valid in these cases, it is also important to realise that this then by definition means that no inferences can be made about the potential (and perhaps even larger) contributions of not-investigated areas of the brain. Moreover, caution should be used when extending this logic to multiple parts of the brain (see also section 5.3 below). In patients with larger strokes, brain lesions are likely to encompass more than just one of these areas of interest. This creates a significant problem during classification of these cases. Exclusion is no solution, as this would create a bias towards smaller lesions and potentially milder cognitive symptoms, ultimately leading to different anatomical conclusions.

2.3 General criteria

Finally, beyond these main patient selection criteria, there are a few typical general exclusion criteria. Firstly, patients with evidence of clinically relevant cognitive impairments such as dementia or mental retardation and/or evidence of psychiatric disorders are usually excluded. In these patients, a valid assessment of the behaviour of interest would be difficult. Secondly, patients with evidence of additional (pre-existing) neurological disorders beyond the neurological disorder of interest, such as Parkinson’s disease, infections of the central nervous system, or older and/or additional diffuse brain lesions due to, e.g., previous strokes or chronic hypertension, are also usually excluded (although up to 2-5 pre-existing silent lacunes are typically allowed). In these patients it would be difficult to determine which aspect(s) of the behavioural deficit (if observed) can be attributed to the neurological disorder of interest and which aspect(s) of the behavioural deficit might instead be due to these additional neurological disorders. Finally, patients with mass shifts due to extensive haemorrhage or oedema should be excluded.
In these patients, brain areas are no longer at their original positions, which potentially confounds the interpretation of lesion-behaviour mapping analysis results. Importantly, however, absence of the behavioural deficit of interest should not be used as an exclusion criterion. The inclusion of patients that do not have the behavioural deficit of interest (control patients) is essential, as this allows us to differentiate between areas of the brain where damage is associated with the deficit of interest and areas of the brain where damage merely reflects increased vulnerability to injury (Rorden and Karnath, 2004). Moreover, restricting patient selection solely to patients that show the behavioural deficit of interest reduces variance in the behavioural data and so ultimately reduces statistical power to detect an effect in lesion-behaviour mapping analyses (see also section 3.2 below). Instead, researchers should ideally decide a priori on a reasonable patient recruitment time period and unselectively include all suitable (i.e. matching all inclusion criteria and none of the exclusion criteria) patients that present during that time period in the study.

3 Patient assessment

The next decision a researcher planning a lesion-behaviour mapping study has to make is how to assess lesion location and behavioural status in each patient. Given the problems associated with lesion aetiologies other than stroke (see section 2.1.4 above), the following section and the rest of this manuscript will focus on stroke patients.

3.1 Assessing lesion location

As mentioned in section 2.1 above, imaging data should be obtained in the acute stroke phase, regardless of whether the aim is to study the functional architecture of the brain, or to study the neural correlates of chronic cognitive deficits.
In acute stroke patients, the lesion can be visualised using either CT (Mohr et al., 1995; Mayer et al., 2000) or MRI (Neumann-Haefelin et al., 1999; Ricci et al., 1999; Schlaug et al., 1999). The development of CT templates for spatial normalisation of individual patient images (Rorden et al., 2012; see also section 4.1 below) has removed the main reason to disregard CT for lesion-behaviour mapping studies. Moreover, modern spiral CT scanners provide high resolution images and in many clinical institutions CT remains the dominant imaging modality of choice at admission. Importantly, the choice between administering CT or MRI to patients at admission is not random, but follows specific clinical criteria. As a consequence, systematically excluding patients for whom only CT images are available introduces a selection bias, typically influencing important factors such as lesion size, general clinical status, severity of cognitive deficits etc. (for a detailed discussion, see Sperber and Karnath, in press, this issue). As such, both CT and MRI data can and should be used for lesion analysis studies.

We suggest the following practical guidelines for the assessment of lesion location: In patients with CT imaging only, use noncontrast CT images to visualise the brain lesion. In noncontrast CT images, acute haemorrhagic strokes appear as hyperdense (bright) areas within minutes to hours following stroke onset (Bergström et al., 1977). Ischemic strokes, on the other hand, appear as hypodense (dark) areas between 24 and 36 hours following stroke onset (Mohr et al., 1995; von Kummer et al., 2001). In patients with MR imaging only, use diffusion-weighted images (DWI) to visualise the lesion if imaging is performed less than 48 hours following stroke onset, and use T2FLAIR images (ideally supplemented with DWI, particularly during the first 5 days following stroke onset [Ricci et al., 1999]) if imaging is performed more than 48 hours following stroke onset.
In DWI, ischemic strokes appear as hyperintense areas within 2-6 hours following stroke onset (Warach et al., 1992; González et al., 1999), while the initial T2FLAIR infarct hyperintensity might be too subtle for accurate lesion visualisation within the first 48 hours (Lansberg et al., 2001). Finally, while not suitable for the visualisation of the lesion in the acute phase of a stroke, a T1 image might aid spatial normalisation (see section 4.2 below). In patients with both CT and MR images, the researcher is in the privileged situation to choose the best from both modalities, i.e. to use those images where the lesion is most conspicuous. The information provided by structural imaging data could be meaningfully complemented with imaging data that allows visualisation of areas of the brain that are structurally intact, but may function abnormally (e.g. the ischemic penumbra and/or areas of diaschisis). That is, multimodal imaging of brain damage, where structural and functional information are combined, might provide a more accurate picture of the full extent of brain damage than structural imaging alone. In clinical settings, visualisation of areas of the brain that are structurally intact, but may function abnormally can be done using perfusion CT (Mayer et al., 2000; Koenig et al., 2001) or MR perfusion-weighted imaging (PWI; Schlaug et al., 1999; Schaefer et al., 2002; Zopf et al., 2012). Shahid et al. (2017) were recently able to show that, when patient assessment was performed within the first 48 hours following stroke onset, lesion analysis inferences were more accurate when based on both structural and perfusion imaging than when based on structural imaging alone. Unfortunately, however, effective usage of perfusion data is often difficult due to the fact that the precise relationship between the severity of hypoperfusion and the severity of the functional impairment is largely unknown. 
While guidelines have been posited concerning the degree of hypoperfusion likely to lead to a behaviourally relevant functional impairment for some areas (Neumann-Haefelin et al., 1999; Hillis et al., 2001; Motta et al., 2014), it is currently unclear whether these guidelines are equally applicable to all areas of the brain. This is in contrast to areas that are lesioned, where we know that function is completely lost. Moreover, while there is evidence for the behavioural relevance of the ischemic penumbra (Shahid et al., 2017), evidence for the behavioural relevance of remote diaschisis is still mixed. While several studies suggest that subcortical damage may result in behaviourally relevant remote cortical hypoperfusion (Hillis et al., 2001, 2002, 2005; Karnath et al., 2005; Ticini et al., 2010), evidence that cortical damage results in behaviourally relevant remote cortical hypoperfusion is so far lacking (Zopf et al., 2009).

3.1.1 Lesion delineation

Following CT or MR data acquisition, the lesion needs to be delineated on each slice of the patient’s brain image. The standard is manual lesion delineation, which can be done using programs like MRIcroN (https://www.nitrc.org/projects/mricron; Rorden and Brett, 2000), or ITK-SNAP (http://www.itksnap.org; Yushkevich et al., 2006). However, manual lesion delineation is time-consuming and potentially observer-dependent (Ashton et al., 2003). To address these disadvantages, both fully automated and semi-automated lesion delineation methods have been developed. Fully automated lesion delineation methods can roughly be divided into unsupervised (e.g. Gillebert et al., 2014; Mah et al., 2014b) and supervised (e.g. Griffis et al., 2016; Pustina et al., 2016) classification algorithms. As the name implies, fully automated methods do not require any user interaction.
As such, these methods are substantially less time-consuming and less observer-dependent than manual lesion delineation, which improves replicability and reproducibility across labs. A considerable downside of these fully automated methods, however, is that they may be more susceptible to imaging artefacts and thus less precise than the current gold standard, manual lesion delineation. Importantly, this reduced precision associated with fully automated lesion delineation methods may influence subsequent lesion-behaviour mapping results (Pustina et al., 2016). Given this downside, semi-automated lesion delineation methods that combine fully automated steps with mandatory user interaction might be able to provide an optimal compromise. While several semi-automated lesion delineation approaches exist (e.g. Wilke et al., 2011), the semi-automated lesion delineation approach Clusterize (https://www.medizin.uni-tuebingen.de/kinder/en/research/neuroimaging/software/; Clas et al., 2012) has recently been shown to be capable of significantly speeding up lesion delineation, without loss of either lesion delineation precision or lesion delineation reproducibility in acute stroke patients scanned in both CT and a range of common MRI modalities (de Haan et al., 2015). The principle of Clusterize is simple: On the basis of local intensity maxima and iterative region growing, the whole CT or MR brain image, including the lesioned area, is fully automatically divided into clusters. The user subsequently manually selects those clusters that correspond to the lesion.
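The two-stage principle just described (automatic clustering, then manual selection) can be illustrated with a toy example. The following is only a minimal sketch under stated assumptions and is not the Clusterize implementation: it works on a small 2D image stored as nested lists, seeds one cluster at each strict local intensity maximum, and grows the clusters by stepwise lowering an intensity threshold. All function names (`local_maxima`, `grow_clusters`, `select_clusters`) and the `step` parameter are hypothetical and chosen for illustration.

```python
def neighbours(r, c, height, width):
    """Yield the 4-connected neighbours of pixel (r, c) inside the image."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < height and 0 <= nc < width:
            yield nr, nc


def local_maxima(img):
    """Pixels strictly brighter than all of their 4-connected neighbours."""
    h, w = len(img), len(img[0])
    return [(r, c) for r in range(h) for c in range(w)
            if all(img[r][c] > img[nr][nc] for nr, nc in neighbours(r, c, h, w))]


def grow_clusters(img, step=1):
    """Grow one cluster per local maximum; unreachable pixels keep label 0."""
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    for label, (r, c) in enumerate(local_maxima(img), start=1):
        labels[r][c] = label
    threshold = max(max(row) for row in img)
    lowest = min(min(row) for row in img)
    while True:
        changed = True
        while changed:  # sweep until no further pixel joins at this threshold
            changed = False
            for r in range(h):
                for c in range(w):
                    if labels[r][c] or img[r][c] < threshold:
                        continue
                    for nr, nc in neighbours(r, c, h, w):
                        if labels[nr][nc]:
                            labels[r][c] = labels[nr][nc]
                            changed = True
                            break
        if threshold <= lowest:
            break
        threshold = max(threshold - step, lowest)
    return labels


def select_clusters(labels, chosen):
    """Binary lesion map built from the cluster labels picked by the user."""
    return [[1 if label in chosen else 0 for label in row] for row in labels]


# Toy image with two bright blobs; the "user" picks the second cluster.
img = [
    [1, 2, 1, 1, 1],
    [2, 9, 2, 1, 1],
    [1, 2, 1, 2, 1],
    [1, 1, 2, 7, 2],
    [1, 1, 1, 2, 1],
]
labels = grow_clusters(img)
lesion_map = select_clusters(labels, {2})
```

In this sketch the manual step is reduced to passing a set of labels; in practice the selection is made interactively by inspecting the clusters on the patient image.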
Clusterize may thus combine the best of two worlds: the presence of fully automated steps makes it less time-consuming than manual lesion delineation, while mandatory user interaction results in lesion delineation that is more precise and/or less error-prone (in the sense that they are closer to the results from the current gold standard, manual delineation) than that obtained by fully automated lesion demarcation methods (de Haan et al., 2015). 3.2 Assessing behavioural status In a lesion-behaviour mapping analysis, the behavioural status needs to be assessed for each patient. Typically, this means that the cognitive function of interest needs to be operationalised. Several important considerations concerning this operationalisation of cognitive functions, such as taking care to distinguish between impaired performance on a test used to operationalise a cognitive function and the clinical syndrome of interest, and ensuring that the test used to operationalise a cognitive function is as specific to the cognitive function of interest as possible, are discussed in Sperber & Karnath (in press, this issue). Additionally, it is important to ensure that the test used to characterize the behavioural deficit has sufficient sensitivity. To help ensure this, the range of scores used in a test should be chosen to measure the full range of the underlying behaviour with both sufficient and meaningful resolution. That is, the optimal range of scores would allow both a measurement of the full possible range of behaviour and a measurement of the smallest meaningful difference in the behaviour of interest. Insufficient sensitivity both reduces the ability to detect a true effect in lesion-behaviour mapping analyses, and reduces the likelihood that an observed significant effect reflects a true effect (Button et al., 2013; Ingre, 2013). 
Moreover, given that cognitive performance is typically expressed on a continuous scale, dichotomization of this inherently continuous performance should be avoided. Dichotomization of continuous behaviour is known to result in a significant loss of information and thus of statistical power (Cohen, 1983). In the rare cases where dichotomization of continuous behaviour is unavoidable (for example when performing lesion subtraction analyses to study exceedingly rare deficits, see section 5.1 below), the dichotomization and subsequent classification of individual patients as ‘impaired’ or ‘non-impaired’ should be performed with proper statistical procedures (Crawford and Howell, 1998; Crawford and Garthwaite, 2005). Finally, when assessing the behavioural status of a patient, potential neuropsychological co-morbidity should be taken into account (see also Bonato et al., 2012; Sperber & Karnath, in press, this issue). Care should be taken to ensure that the test used to operationalise the cognitive function of interest does not introduce a systematic bias against patients with certain co-occurring deficits. Moreover, frequently co-occurring deficits (e.g. reduction of general cognitive status, language impairments) whose severity might correlate with the severity of the cognitive deficit of interest should ideally be assessed in addition to the cognitive function of interest, so that they can be controlled for during the lesion-behaviour mapping analyses (e.g. by including them as nuisance covariates, see section 5.2.1 below).

4 Spatial normalisation of patient brain and lesion map

Following patient imaging and lesion delineation, we have, for each patient, a 3D binary lesion map reflecting the voxels where brain function is impaired that we can use for lesion-behaviour mapping analysis. However, all brains differ in orientation, size, and shape.
As such, before we can perform voxelwise (statistical) comparisons, we need to spatially normalise the patient brains and lesion maps to ultimately ensure that a given voxel (roughly) represents the same anatomical structure in each patient. Thus, the third decision a researcher has to make is how to spatially normalise the patient brain and lesion map. Spatial normalisation can be performed with programs such as BrainVoyager (http://www.brainvoyager.com/; Goebel, 2012), SPM (http://www.fil.ion.ucl.ac.uk/spm/), FSL (https://fsl.fmrib.ox.ac.uk/fsl/fslwiki; Jenkinson et al., 2012), AFNI (https://afni.nimh.nih.gov/; Cox, 1996, 2012), or ANTs (http://stnava.github.io/ANTs/; Avants et al., 2011). The analysis package SPM is widely used due to its platform independence, free obtainability, and availability of many add-ons. As such, we here will focus on the spatial normalisation routines of SPM, as implemented in the Clinical Toolbox (https://www.nitrc.org/projects/clinicaltbx/; Rorden et al., 2012). This Clinical Toolbox provides specialised templates that allow spatial normalisation of both CT and MR brain images of elderly, stroke-aged populations (see section 4.4 below). As such, the Clinical Toolbox is ideally suited to be used in lesion-behaviour mapping studies where the patients included are typically older, and where different modalities, i.e. CT as well as MR images, are present in different patients. Beyond routines for traditional spatial normalisation and unified segmentation and normalisation approaches (see sections 4.1 and 4.2 below), the Clinical Toolbox also provides processing steps that aid the spatial normalisation of scans from stroke patients, including corrections for the presence of a lesion (see section 4.3 below).
4.1 Spatial normalisation of imaging data with low radiometric resolution

Imaging data with a low radiometric resolution is imaging data where the image intensity values offer a poor differentiation between different tissue types, particularly between grey and white matter brain tissue. In acute stroke patients, this typically applies to CT and T2FLAIR data. Here, the typical approach is to perform spatial normalisation by matching the orientation, size, and shape of each patient brain to the orientation, size, and shape of a template brain in standard stereotaxic space. This matching is done with an automated algorithm that aims to find the image transformations that minimize the mean squared difference between the voxel intensities of the patient and template brain (Ashburner and Friston, 2003). In a first step, the patient brain is matched to the template brain using linear (affine) image transformations that can include translations, rotations, zooms, and/or shears. These affine transformations change the entire patient brain in the same way and so result in a global match between the patient brain and the template brain. Subsequently, in a second step, the patient brain is further matched to the template brain using nonlinear (nonaffine) image transformations consisting of cosine basis functions. These nonaffine transformations allow local changes to the patient brain and so improve the match between patient and template brain. To avoid overfitting during this second step, the difference in the amount of nonlinear transformations of adjacent areas is simultaneously minimized (known as ‘regularisation’). As such, spatial normalisation matches the overall orientation, size and shape of the patient brain to that of the template brain, but not individual sulci. Once the necessary image transformations have been estimated, they can be applied to both the patient’s brain image and the lesion map, bringing both into standard stereotaxic space.
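To make the affine step concrete, the following is a minimal sketch of how translations, rotations, zooms, and shears can be expressed as 4x4 matrices and applied to a coordinate. It is an illustration only and not the SPM or Clinical Toolbox implementation; all function names and the example values are assumptions chosen for this sketch. Because the same matrix is applied to every coordinate, it also shows why an estimated transformation can be reused for the lesion map.

```python
import math

def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

def zoom(sx, sy, sz):
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]

def rotation_z(angle_rad):
    """Rotation about the z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def shear_xy(k):
    """Shear that shifts x in proportion to y."""
    return [[1, k, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def apply_affine(m, point):
    """Apply a 4x4 affine matrix to an (x, y, z) coordinate."""
    v = (point[0], point[1], point[2], 1.0)
    return tuple(sum(m[i][k] * v[k] for k in range(4)) for i in range(3))

# Hypothetical transformation: zoom first, then rotate, then translate
# (the rightmost matrix in the product is applied first).
affine = matmul(translation(10, 0, 0),
                matmul(rotation_z(math.pi / 2), zoom(2, 2, 2)))

# The same matrix is applied to coordinates of the brain image and the lesion map.
lesion_voxel = (1.0, 0.0, 3.0)
mapped = apply_affine(affine, lesion_voxel)
sheared = apply_affine(shear_xy(0.5), (1.0, 2.0, 0.0))
```

In real software the affine parameters are estimated by the optimisation described above rather than set by hand, and the nonlinear step is applied on top of this global match.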
4.2 Spatial normalisation of imaging data with high radiometric resolution

Imaging data with a high radiometric resolution is imaging data where the image intensity values offer a good differentiation between different tissue types, particularly between grey and white matter brain tissue. This typically does not apply to the imaging data used to visualise the lesion in acute stroke (see section 3.1 above), but does usually apply to an additionally collected T1 image. Likewise, when DWI data is collected, this is usually collected with different b-values (typically b0, b500, and b1000). While the b1000 image best suited to visualise the lesion in acute stroke patients has a low radiometric resolution, the additionally collected b0 image often has a relatively high radiometric resolution (Mah et al., 2014b). In these cases, the typical approach is to first coregister the image with a low radiometric resolution (e.g. the T2FLAIR or the b1000 DWI), used to visualise the lesion, to the image with a high radiometric resolution (e.g. the T1 or the b0 DWI). Subsequently, the image with a high radiometric resolution is normalised using the unified segmentation and normalisation approach (Ashburner and Friston, 2005). This approach has been shown to be superior to the traditional normalisation approach discussed in section 4.1 above in both healthy (Crinion et al., 2007; Klein et al., 2009) and stroke (Crinion et al., 2007) populations, but requires an image with a high radiometric resolution. The unified segmentation and normalisation approach combines tissue classification (i.e. segmentation), bias correction, and image registration (i.e. spatial normalisation) in a single model.
Estimation of the model parameters is done by repeatedly alternating between tissue classification (to assign each voxel to a tissue class on the basis of its intensity), bias correction (to correct for image intensity nonuniformity due to magnetic field inhomogeneity), and image registration (to bring patient image and template-based tissue probability maps into common space using regularised affine and non-affine transformations and so derive the image transformations necessary to bring the patient image into standard stereotaxic space) steps. As this model accounts for the conditional dependencies between the steps (i.e. tissue classification aids the bias correction and image registration steps, and bias correction and image registration aid the tissue classification step), the results are superior to those obtained following serial application of the same steps. Once the necessary image transformations have been estimated, they are applied to the patient’s brain image(s) and the lesion map, bringing all images into standard stereotaxic space.

4.3 Correcting for the lesion during spatial normalisation

While both spatial normalisation approaches described above work well with imaging data from neurologically healthy subjects, the presence of a lesion in imaging data from stroke patients presents a challenge, as lesions are characterised by abnormal image intensity. This area of abnormal image intensity in the patient brain locally creates a large mismatch between the patient brain and the template brain / template-based tissue probability maps, ultimately leading to local overfitting in the lesion area during the minimization of this mismatch (Brett et al., 2001; Andersen et al., 2010). The two dominant solutions to this overfitting problem are cost function masking (Brett et al., 2001) and enantiomorphic normalisation (Nachev et al., 2008).
During cost function masking, lesioned voxels are excluded during spatial normalisation using (a typically slightly smoothed version of) the binary lesion map. As such, the image transformations necessary to bring the patient’s brain image(s) and the lesion map into standard stereotaxic space are derived from intact areas of the brain only. During enantiomorphic normalisation, on the other hand, the lesion is ‘corrected’ by replacing it with brain tissue from the lesion homologue in the intact hemisphere of the brain. As such, the image transformations necessary to bring the patient’s brain image(s) and the lesion map into standard stereotaxic space are effectively derived from a brain image without a lesion. Logically, one would expect enantiomorphic normalisation to perform better than cost function masking when lesions are large and unilateral, as spatial normalisation with cost function masking becomes less accurate as lesion size increases (as the area from which the image transformations can be derived decreases with increasing lesion size), while enantiomorphic normalisation does not. Cost function masking would, however, be expected to perform better than enantiomorphic normalisation when lesions are bilateral and affect similar areas in both hemispheres, as enantiomorphic normalisation would in this case replace the lesion with the likewise lesioned homologue. Moreover, as enantiomorphic normalisation assumes that the brain is essentially symmetric, enantiomorphic normalisation might be suboptimal in areas known to be considerably asymmetric (e.g. the planum temporale).

4.4 Choosing the right template

Finally, an important consideration during spatial normalisation concerns the choice of template / template-based tissue probability maps.
Firstly, when using the traditional spatial normalisation approach described in section 4.1 above, the template image should ideally have the same image modality as the patient image that is spatially normalised, as the accuracy of this approach depends on how similar the voxel intensities of a given brain area are between the patient and the template image. The unified segmentation and normalisation approach described in section 4.2 above, on the other hand, is modality-independent. Secondly, regardless of the spatial normalisation approach chosen, the population from which the template image or template-based tissue probability maps are derived should roughly match the population of the lesion mapping study. That is, if the lesion-behaviour mapping study is performed in elderly stroke patients, the template or template-based tissue probability maps used should ideally have been derived from an elderly population. For elderly adults (mean age 61.3 years), a template is available for CT imaging data, as well as template-based tissue probability maps for MR imaging data (Rorden et al., 2012). For young adults (mean age 25 years), templates are available for T1 and T2 imaging data, as well as template-based tissue probability maps (Mazziotta et al., 1995, 2001a, 2001b). For paediatric populations, various templates are available for T1 imaging data, as well as template-based tissue probability maps (e.g. Wilke et al., 2008; http://jerlab.psych.sc.edu/neurodevelopmentalmridatabase/). Finally, a template derived from a wide range of adults (mean age 35.4, range 18-69 years) is available for T2FLAIR imaging data (http://glahngroup.org or http://brainder.org; Winkler et al.).
Given that the average age of acute stroke patients included in a lesion-behaviour mapping study is typically over 60, the Clinical Toolbox (https://www.nitrc.org/projects/clinicaltbx/) includes the above-mentioned CT template and template-based tissue probability maps derived from elderly adults, as well as the T2FLAIR template. This toolbox can, however, easily be modified for use with other templates or template-based tissue probability maps.

5 Performing voxelwise (statistical) comparisons

Following lesion delineation and spatial normalisation, we have a spatially normalised binary lesion map for each patient. Moreover, we have a behavioural measurement for each patient. With these two sources of information, we are ready to perform a voxelwise lesion-behaviour mapping analysis to relate lesion location and patient behaviour. Over the years, the methods to perform voxelwise analyses have continuously improved, from early subtraction analyses, to voxelwise statistical analyses (with correction for multiple comparisons), and ultimately to voxelwise statistical analyses that account for nuisance covariates such as lesion volume. Each of these methods will be discussed in the sections below.

5.1 Lesion subtraction analyses

The simplest type of voxelwise analysis is a lesion subtraction analysis. Here, the lesion overlap map of patients without the cognitive deficit of interest is subtracted from the lesion overlap map of patients showing the cognitive deficit of interest. This can be performed with programs such as MRIcroN (https://www.nitrc.org/projects/mricron). To account for potential sample size differences between the two patient groups, these subtraction analyses need to use proportional values. That is, for each voxel the percentage of patients without the cognitive deficit of interest that have a lesion at the voxel is subtracted from the percentage of patients with the cognitive deficit of interest that have a lesion at the voxel.
The result of the subtraction analysis is then a map with the percentage relative frequency difference between these two groups for each voxel. For example, say we have 10 patients with the cognitive deficit of interest and 20 patients without the cognitive deficit of interest, where at a given voxel 9 of the 10 patients with the deficit have a lesion (i.e. 90%), while 10 of the 20 patients without the deficit have a lesion (i.e. 50%). In this case, the percentage relative frequency difference at that voxel would be 90% - 50% = 40%, indicating that this voxel is damaged 40% more frequently in patients with the cognitive deficit of interest than in patients without this deficit. These subtraction analyses are superior to simple overlap analyses that focus on only those patients that show the disorder of interest, because overlap analyses might simply highlight regions that reflect increased vulnerability of certain regions to injury (see Rorden & Karnath, 2004). In contrast, a lesion subtraction analysis highlights those areas of the brain where lesions are more frequent in patients with than in patients without the cognitive deficit of interest, and so distinguishes between regions that are merely often damaged in strokes and regions that are specifically associated with the deficit of interest (Rorden and Karnath, 2004). To control for neuropsychological co-morbidity, the two patient groups contrasted in a lesion subtraction analysis need to be comparable with respect to additional neurological impairments (of no interest) such as, e.g., paresis, visual field defects, etc. While subtraction analyses have merit in the study of exceedingly rare cognitive deficits (where it would be near to impossible to obtain a larger sample size), it is important to realise that these analyses are purely descriptive and allow no statistical inference. For statistical inference, voxelwise statistical analyses are necessary. 
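The proportional subtraction described in this section can be sketched in a few lines. This is an illustrative sketch only, assuming each patient's normalised binary lesion map has been flattened to a list of 0/1 values; the function names are hypothetical and do not correspond to MRIcroN's interface. The toy data reproduce the worked example above at the first voxel.

```python
def lesion_percentage(lesion_maps):
    """Percentage of patients with a lesion at each voxel."""
    n_patients = len(lesion_maps)
    n_voxels = len(lesion_maps[0])
    return [100.0 * sum(m[v] for m in lesion_maps) / n_patients
            for v in range(n_voxels)]

def subtraction_map(maps_with_deficit, maps_without_deficit):
    """Percentage relative frequency difference per voxel (with minus without)."""
    with_pct = lesion_percentage(maps_with_deficit)
    without_pct = lesion_percentage(maps_without_deficit)
    return [w - wo for w, wo in zip(with_pct, without_pct)]

# Toy data with two voxels per map.
# Voxel 0: lesioned in 9 of 10 patients with and 10 of 20 patients without the deficit.
with_deficit = [[1, 0]] * 9 + [[0, 1]]
without_deficit = [[1, 1]] * 10 + [[0, 1]] * 10

diff = subtraction_map(with_deficit, without_deficit)
print(diff)  # [40.0, -90.0]
```

Using percentages rather than raw counts is what keeps the unequal group sizes (10 versus 20) from distorting the difference.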
5.2 Voxelwise statistical lesion-behaviour mapping analyses

In a voxelwise statistical lesion-behaviour mapping analysis, we perform a statistical test at each voxel to relate voxel status (lesioned/nonlesioned) and patient behaviour. These voxelwise statistical analyses can be performed with programs such as VLSM (https://langneurosci.mc.vanderbilt.edu/resources.html; Bates et al., 2003), VoxBo (https://www.nitrc.org/projects/voxbo/, Kimberg et al., 2007), NPM (https://www.nitrc.org/projects/mricron; Rorden et al., 2007), NiiStat (https://www.nitrc.org/projects/niistat/) and/or LESYMAP (Pustina et al., in press, this issue). When the behavioural data is continuous, the behavioural data of the group of patients in whom a given voxel is damaged is statistically compared to the behavioural data of the group of patients in whom that same voxel is intact. This is traditionally done with a two-sample t-test, which assumes that the behavioural data is normally distributed and measured on an interval scale. Unfortunately, however, behavioural data from patient populations is often not normally distributed. During behavioural assessment, patients without the deficit of interest will typically all demonstrate close to maximum performance, whereas performance in patients with the deficit of interest will typically be poorer and more variable over patients. As a consequence, the distribution of the behavioural data from patient populations is often negatively skewed. Moreover, behavioural data from patient populations is often not measured on an interval scale. Instead, many tests designed to assess cognitive function in patient populations measure patient behaviour on an ordinal scale, where the behavioural data is ordered (e.g. higher scores denote better performance), but the distances between the individual measurements are not known (e.g. the difference between a score of ‘1’ and ‘2’ is not necessarily the same as the difference between a score of ‘2’ and ‘3’).
Unfortunately, the t-test tends to be overly conservative when its test assumptions are violated, resulting in a reduction of statistical power to detect an effect in lesion-behaviour mapping analyses. Instead, the assumption-free rank order test proposed by Brunner and Munzel (2000) might be more appropriate in these situations. In a lesion-behaviour mapping analysis with simulated data, this so-called Brunner-Munzel test has been shown to have higher statistical power than the t-test, while offering similar protection against false positives, in situations where the distribution of the behavioural data is skewed (Rorden et al., 2007). When the behavioural data is binomial (i.e. when the deficit is either present or absent, as in e.g. hemianopia), we statistically assess, for each voxel, whether the variables ‘voxel status’ (voxel lesioned vs. voxel intact) and ‘behavioural status’ (deficit present vs. deficit absent) are associated or independent. The statistical test typically used in these situations is the Pearson’s chi-squared test. In many lesion-behaviour mapping analyses, however, expected cell frequencies are lower than 5-10 in at least some voxels, resulting in inflated false positive rates when using Pearson’s chi-squared test. Traditional solutions to this problem are to use Yates’s correction for continuity or a Fisher’s exact test. These solutions, however, both assume fixed marginals, meaning that they assume that the column totals (in how many patients is this voxel lesioned and in how many patients is this voxel intact) and row totals (how many patients have a deficit and how many patients do not) are fixed in advance before data collection starts. Obviously, this is not the case in a typical lesion-behaviour mapping analysis, and as a consequence, both Yates’s correction for continuity and Fisher’s exact test tend to be overly conservative.
A statistical test that might be more appropriate in these situations is the quasi-exact test proposed by Liebermeister (1877). In a lesion-behaviour mapping analysis with simulated data, observed false positive rates for this so-called Liebermeister test closely approximated the set false positive threshold, whereas observed false positive rates for Fisher’s exact test tended to be too low (Rorden et al., 2007).

5.2.1 Inclusion of nuisance covariates and ensuring sufficient minimum lesion overlap

One variable known to correlate strongly with the severity of behavioural deficit in stroke populations is lesion volume (the larger the lesion, the more likely it is that a patient will show a behavioural deficit). Thus, to avoid identifying brain areas where damage is related simply to lesion volume instead of patient behaviour, lesion volume should be included as a nuisance covariate in a voxelwise statistical lesion analysis. This can be done using regression approaches (e.g. logistic regression or the general linear model). Moreover, statistical power varies over voxels as a function of the number of lesions that overlap at each voxel, with statistical power theoretically being maximal at voxels that are lesioned in half of the patient sample. Importantly, statistical power is absent at voxels that are damaged in none of the patients (as this would result in one empty group for the t-test and Brunner-Munzel test, or two empty cells for the Liebermeister test). Thus, to ensure sufficient statistical power, voxels damaged in a very low percentage of the patient sample should be excluded. Correcting for lesion volume as well as ensuring sufficient minimum lesion overlap has been shown to reduce the spatial bias and so improve the anatomical validity in univariate voxelwise statistical analyses (Sperber and Karnath, 2017).
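As a minimal sketch of these two preparatory steps, the code below derives each patient's lesion volume (for later use as a nuisance covariate) and selects the voxels that meet a minimum lesion overlap. It assumes flattened binary lesion maps with equally sized voxels; the function names and the 10% cut-off are assumptions made for illustration, since the text above does not prescribe a specific value.

```python
def lesion_volumes(lesion_maps):
    """Lesion volume per patient, expressed as the number of lesioned voxels."""
    return [sum(m) for m in lesion_maps]

def voxels_with_sufficient_overlap(lesion_maps, min_percent=10):
    """Indices of voxels lesioned in at least min_percent of the patients.

    min_percent is an illustrative choice, not a recommended value.
    Integer arithmetic is used to avoid floating-point rounding at the cut-off.
    """
    n_patients = len(lesion_maps)
    n_voxels = len(lesion_maps[0])
    kept = []
    for v in range(n_voxels):
        count = sum(m[v] for m in lesion_maps)
        if count > 0 and count * 100 >= min_percent * n_patients:
            kept.append(v)
    return kept

# Toy sample: 20 patients, 4 voxels.
# Voxel 0 is lesioned in 10 patients, voxel 1 in 1, voxel 2 in none, voxel 3 in 2.
maps = [[1 if i < 10 else 0, 1 if i == 0 else 0, 0, 1 if i < 2 else 0]
        for i in range(20)]

volumes = lesion_volumes(maps)               # nuisance covariate, one value per patient
kept = voxels_with_sufficient_overlap(maps)  # voxels to be tested
print(kept)  # [0, 3]
```

The volumes would then enter the regression model as a covariate, while only the retained voxels are submitted to the voxelwise test.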
Finally, as for lesion volume, other nuisance covariates can additionally be included, such as the severity of frequently co-occurring deficits that may correlate with the cognitive function of interest (see also section 3.2 above), fiber tract disconnection likelihood (Rudrauf et al., 2008), etc.

5.2.2 Correcting for multiple comparisons

During a voxelwise statistical lesion-behaviour mapping analysis, the same statistical test is performed at many individual voxels. However, if each statistical test has the typical false positive probability of 5%, performing a statistical test at e.g. 100 voxels will be expected to result in 5 false positives. That is, as more and more statistical tests are performed, the probability of observing at least one false positive increases. In fact, performing 100 independent statistical tests, each with the typical false positive probability of 5%, will increase the overall probability of at least one false positive to 99.4%. In these situations, we do not want to control the probability of observing a false positive in each individual voxel. Instead, we want to control the overall probability of observing a false positive (over all voxels tested), also known as the family-wise error rate. To do this, we need to correct for multiple comparisons. The traditional method to correct for multiple comparisons is the Bonferroni correction. Here, we simply divide our desired false positive probability by the number of tests that we perform. Thus, if we assess 100 voxels and want to ensure that the family-wise error rate does not exceed 5%, we would set the false positive probability threshold for each individual voxel at 5%/100 = 0.05%. While this method offers excellent control of the family-wise error rate, it is also very conservative (particularly in voxelwise lesion-behaviour mapping analyses where the individual voxels are not truly independent), and thus severely reduces the statistical power to detect an effect.
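The figures quoted above follow from two short formulas, sketched here for independent tests. The function names are hypothetical and serve only to make the arithmetic explicit: with m independent tests at level alpha, the probability of at least one false positive is 1 - (1 - alpha)^m, and the Bonferroni threshold per test is alpha / m.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive over n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_threshold(alpha, n_tests):
    """Per-test false positive threshold under the Bonferroni correction."""
    return alpha / n_tests

uncorrected = familywise_error_rate(0.05, 100)    # approximately 0.994
per_voxel = bonferroni_threshold(0.05, 100)       # 0.0005, i.e. 0.05%
corrected = familywise_error_rate(per_voxel, 100)  # stays just below 0.05

print(round(uncorrected, 3), per_voxel, round(corrected, 4))
```

The last value shows that, for independent tests, the Bonferroni threshold keeps the family-wise error rate slightly under the desired 5%.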
As such, considerable efforts have been made to develop alternative, less conservative, ways to correct for multiple comparisons. A more exact way to correct for multiple comparisons and control the family-wise error rate, without sacrificing statistical power, is permutation thresholding. Permutation thresholding aims to determine whether an observed test statistic at a voxel (e.g. a t-test, Brunner-Munzel or Liebermeister statistic) is truly due to the difference in voxel status (lesioned or nonlesioned) or not. The underlying logic is that if the observed test statistic is truly due to the difference in voxel status, similar or more extreme test statistics would be unlikely to arise in situations where the pairing of behavioural data and voxel status is scrambled (i.e. situations where there is no association between behavioural data and voxel status). To determine how likely certain test statistics are under this null hypothesis of no association between behavioural data and voxel status, the behavioural scores of the patients with and the patients without damage at a certain voxel are randomly scrambled (i.e. permuted) thousands of times, each time calculating a new test statistic. With this, a distribution of permuted test statistics is created, reflecting the probability of observing certain test statistics under the null hypothesis. Using this null distribution of permuted test statistics, the 5% threshold value can be determined, with test statistics exceeding this threshold value having a probability of less than 5% under the null hypothesis. Finally, by comparing the originally observed test statistic to this null distribution of permuted test statistics, we can determine whether the original test statistic was extreme enough to allow rejection of the null hypothesis.
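The permutation logic can be sketched as follows. This is a simplified illustration under several assumptions: a plain difference in group means stands in for the t-test or Brunner-Munzel statistic, the lesion maps are flattened binary lists, and all names, the toy data, and the number of permutations are hypothetical. In each permutation the maximum statistic over all voxels is stored, which is the form used when many voxels are tested at once; with a single voxel it reduces to the per-voxel procedure described above.

```python
import random

def mean(values):
    return sum(values) / len(values)

def voxel_statistic(scores, lesioned):
    """Mean score of patients with the voxel intact minus mean score of
    patients with the voxel lesioned (higher scores = better performance)."""
    les = [s for s, l in zip(scores, lesioned) if l]
    intact = [s for s, l in zip(scores, lesioned) if not l]
    if not les or not intact:
        return 0.0  # no test possible when one group is empty
    return mean(intact) - mean(les)

def permutation_threshold(scores, lesion_maps, n_perm=2000, alpha=0.05, seed=0):
    """Threshold from the null distribution of the maximum statistic."""
    rng = random.Random(seed)
    n_voxels = len(lesion_maps[0])
    columns = [[m[v] for m in lesion_maps] for v in range(n_voxels)]
    shuffled = list(scores)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # scramble the pairing of behaviour and lesions
        null_max.append(max(voxel_statistic(shuffled, c) for c in columns))
    null_max.sort()
    return null_max[min(n_perm - 1, int((1 - alpha) * n_perm))]

# Toy data: 12 patients, 3 voxels; voxel 0 is lesioned in the six poor performers.
scores = [1, 2, 2, 3, 1, 2, 9, 8, 9, 10, 9, 8]
lesion_maps = [[1 if i < 6 else 0, 1 if i % 3 == 0 else 0, 1 if i in (1, 7) else 0]
               for i in range(12)]

observed = [voxel_statistic(scores, [m[v] for m in lesion_maps]) for v in range(3)]
threshold = permutation_threshold(scores, lesion_maps)
significant = [stat > threshold for stat in observed]
```

Only the behavioural scores are shuffled; the lesion maps are left untouched, so the spatial dependence between voxels is preserved in every permutation.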
In the context of voxelwise statistical lesion-behaviour mapping, this approach is extended by using the maximum test statistic (over all voxels) obtained in each permutation to create the null distribution, instead of individual voxel test statistics. As such, the 5% threshold value is not exceeded anywhere in the brain in more than 5% of the permutations, that is, permutation thresholding offers the same control of the family-wise error rate as the Bonferroni correction. Importantly, however, in situations where the individual voxels are not truly independent, permutation thresholding offers better statistical power than the Bonferroni correction. Moreover, while permutation thresholding typically focusses on the maximum test statistic obtained in each permutation (controlling the probability of observing a single false positive), this approach can also be generalised by focussing on the n-th extreme test statistic, where n > 1 (Mirman et al., in press, this issue). This so-called continuous permutation-based family-wise error rate correction method (controlling the probability of observing n false positives) might allow for a better balance between false positives and false negatives in typical lesion-behaviour mapping studies where the anatomical interpretation of the results rarely depends on a single voxel. Finally, an alternative, less conservative approach to correcting for multiple comparisons is offered by false discovery rate thresholding (Benjamini and Hochberg, 1995; Genovese et al., 2002). Here, the goal is not to control the family-wise error rate, but to control the proportion of false positives amongst observed positives. As a consequence, a false discovery rate threshold of 5% means that up to 5% of the observed positives might be false positives. In situations where no positives are observed, false discovery rate thresholding will provide the same control of the family-wise error rate as the Bonferroni correction. 
However, in situations where positives are observed, false discovery rate thresholding will result in more positives surviving the correction for multiple comparisons than either the Bonferroni correction or permutation thresholding. In fact, as the number of observed positives increases, the false discovery rate threshold becomes more lenient (i.e. the critical test statistic decreases). This adaptiveness of false discovery rate thresholding, however, comes at the price of reduced control of the family-wise error rate (as up to 5% of the positives surviving the correction for multiple comparisons could be false positives). Moreover, in smaller samples (n = 30-60), false discovery rate thresholding might considerably underestimate the proportion of false positives amongst observed positives (Mirman et al., in press, this issue). In situations where control of the family-wise error rate is paramount and/or where the test assumptions of false discovery rate correction may be violated, permutation thresholding (see above) should thus be the preferred approach to correct for multiple comparisons in lesion-behaviour mapping analyses.

5.3 Avoid dividing samples into subsamples on the basis of an a priori hypothesis

Often, we have an a priori hypothesis concerning the parts of the brain that might contribute to a certain cognitive process. Accordingly, it might seem intuitive to divide the patients and their brain lesions into different subsamples and perform separate lesion-behaviour mapping analyses for each of these subsamples. For example, based on the a priori hypothesis that action-related aspects of cognition are represented in anterior parts of the brain while perception-related aspects of cognition are located in posterior brain regions, one might divide an unselectively recruited patient sample into a subsample of patients with more anteriorly located brain damage and a subsample of patients with more posteriorly located brain damage. There are, however, several problems with this approach (see Figure 4).
Firstly, patients with large lesions (for example covering both anterior and posterior parts of the brain) are difficult to categorise. This might lead to an extra category (e.g., ‘anterior & posterior’) for which no clear a priori hypothesis exists. As mentioned before (see section 2.2 above), exclusion of these patients is no solution, as this would not only result in a significant loss of valuable information, but would also create a bias towards smaller lesions and potentially milder cognitive symptoms, ultimately leading to different anatomical conclusions. Secondly, dividing a single patient sample into subsamples will anatomically bias the results of a lesion-behaviour mapping analysis in the direction of this a priori hypothesis. Dividing a patient sample into, for example, a subsample with more anterior brain damage and a subsample with more posterior brain damage, will reveal two neural correlates: one neural correlate somewhere in the more anterior regions of the brain, and one neural correlate somewhere in the more posterior regions (see Figure 4B, left side). This result is expected a priori and is simply a consequence of dividing the sample into these two anatomical subsamples, regardless of the cognitive deficit displayed by the patients. The a priori hypothesis of an anterior vs. posterior dissociation of action- vs. perception-related cognitive processes will thus lead to an observation that corresponds to this hypothesis. Had we taken the same data sample of stroke patients as before, but instead divided this sample into one subsample with more ventrally located brain damage and one subsample with more dorsally located brain damage (based on the equally defensible a priori hypothesis that perception-related aspects of cognition are represented in more ventral brain areas whereas action-related aspects of cognition are represented in more dorsal areas of the brain), we would have revealed a different result (see Figure 4B, right side).
In this case, the lesion-behaviour mapping analysis of the subsample with more ventral brain damage would have revealed a neural correlate somewhere in more ventrally located regions of the brain, while the lesion-behaviour mapping analysis of the subsample with more dorsal brain damage would have found a neural correlate somewhere in more dorsally located regions. The problem illustrated here is that the results of a lesion-behaviour mapping analysis can be biased by the categorization of patients on the basis of their lesion location. That is, the categorization of an unselected patient sample on the basis of an a priori anatomical hypothesis will lead to anatomical results that correspond to this hypothesis. As such, this should be avoided.

--- Figure 4 about here ---

6 Anatomical interpretation of lesion analysis results

Following a voxelwise statistical lesion-behaviour mapping analysis, we obtain a statistical map highlighting the voxels where voxel status (lesioned vs. non-lesioned) and patient behaviour are significantly related. In the case of a lesion subtraction analysis, on the other hand, we obtain a map highlighting areas of the brain where lesions are descriptively more frequent in patients with than in patients without the cognitive deficit of interest (often thresholded to isolate those percentage relative frequency difference values thought to be meaningful [with typical threshold values of 20-50%]). Anatomical interpretation then consists of describing the location of these significant or meaningful voxels, typically with the help of a brain atlas. For convenience, coordinates of peak voxels or the coordinate range of a cluster can be provided to describe the location of the results of the voxelwise lesion-behaviour mapping analysis or subtraction analysis.
It is, however, important to realise that all voxels identified as statistically significant in a voxelwise statistical lesion-behaviour mapping analysis, or as meaningful in a subtraction analysis, have the same importance, and thus should be given equal weight when interpreting the results.

Nowadays, there are many different cortical atlases to choose from. The first division that can be made is between atlases derived from single-subject data and atlases derived from multi-subject data. Whereas atlases derived from single-subject data remain popular (e.g. the Brodmann atlas, or the AAL atlas of Tzourio-Mazoyer et al., 2002), probabilistic atlases derived from multi-subject data should be preferred, as these are able to quantify the intersubject variability in location and extent of each anatomical area. Within these multi-subject atlases, a second division can be made based on the brain characteristics used to parcellate distinct areas. Whereas some probabilistic multi-subject atlases are based on macroscopic landmarks such as gyri and sulci (e.g. Hammers et al., 2003; Shattuck et al., 2008), others are based on histology (Zilles et al., 1997), or on functional connectivity patterns (e.g. Joliot et al., 2015). Finally, in addition to these cortical atlases, multi-subject atlases exist for white matter fiber tracts, based on either DTI fiber tracking (e.g. Zhang et al., 2010; Thiebaut de Schotten et al., 2011), or on histology (Bürgel et al., 2006). Which atlas to choose for the anatomical interpretation of the results of a lesion-behaviour mapping study is not a trivial issue. It is important to realise that different atlases might result in different anatomical interpretations of the same lesion-behaviour mapping results (de Haan and Karnath, 2017).

Acknowledgements

This work was supported by the Deutsche Forschungsgemeinschaft (HA 5839/4-1 to BdH; KA 1258/20-1, KA 1258/23-1 to HOK).
We would like to thank Christoph Sperber for helpful and inspiring discussions.

References

Abela, E., Missimer, J., Wiest, R., Federspiel, A., Hess, C., Sturzenegger, M., Weder, B., 2012. Lesions to primary sensory and posterior parietal cortices impair recovery from hand paresis after stroke. PloS One 7, e31275. doi:10.1371/journal.pone.0031275
Adams, J.H., Graham, D.I., Gennarelli, T.A., Maxwell, W.L., 1991. Diffuse axonal injury in non-missile head injury. J. Neurol. Neurosurg. Psychiatry 54, 481–483.
Andersen, S.M., Rapcsak, S.Z., Beeson, P.M., 2010. Cost function masking during normalization of brains with focal lesions: still a necessity? NeuroImage 53, 78–84. doi:10.1016/j.neuroimage.2010.06.003
Ashburner, J., Friston, K.J., 2005. Unified segmentation. NeuroImage 26, 839–851.
Ashburner, J., Friston, K.J., 2003. Spatial normalization using basis functions, in: Frackowiak, R.S.J., Friston, K.J., Frith, C., Dolan, R., Price, C.J., Zeki, S., Ashburner, J., Penny, W.D. (Eds.), Human Brain Function. Academic Press, San Diego.
Ashton, E.A., Takahashi, C., Berg, M.J., Goodman, A., Totterman, S., Ekholm, S., 2003. Accuracy and reproducibility of manual and semiautomated quantification of MS lesions by MRI. J. Magn. Reson. Imaging 17, 300–308. doi:10.1002/jmri.10258
Avants, B.B., Tustison, N.J., Song, G., Cook, P.A., Klein, A., Gee, J.C., 2011. A reproducible evaluation of ANTs similarity metric performance in brain image registration. NeuroImage 54, 2033–2044. doi:10.1016/j.neuroimage.2010.09.025
Bates, E., Wilson, S.M., Saygin, A.P., Dick, F., Sereno, M.I., Knight, R.T., Dronkers, N.F., 2003. Voxel-based lesion-symptom mapping. Nat. Neurosci. 6, 448–450. doi:10.1038/nn1050
Benjamini, Y., Hochberg, Y., 1995. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Methodol. 57, 289–300.
Bergström, M., Ericson, K., Levander, B., Svendsen, P., Larsson, S., 1977.
Variation with time of the attenuation values of intracranial hematomas. J. Comput. Assist. Tomogr. 1, 57–63.
Bonato, M., Sella, F., Berteletti, I., Umiltà, C., 2012. Neuropsychology is nothing without control: A potential fallacy hidden in clinical studies. Cortex 48, 353–355. doi:10.1016/j.cortex.2011.06.017
Brett, M., Leff, A.P., Rorden, C., Ashburner, J., 2001. Spatial normalization of brain images with focal lesions using cost function masking. NeuroImage 14, 486–500.
Brunner, E., Munzel, U., 2000. The nonparametric Behrens-Fisher problem: Asymptotic theory and a small-sample approximation. Biom. J. 42, 17–25. doi:10.1002/(SICI)1521-4036(200001)42:1<17::AID-BIMJ17>3.0.CO;2-U
Büki, A., Povlishock, J.T., 2006. All roads lead to disconnection? -Traumatic axonal injury revisited. Acta Neurochir. (Wien) 148, 181–193; discussion 193–194. doi:10.1007/s00701-005-0674-4
Bürgel, U., Amunts, K., Battelli, L., Mohlberg, H., Gilsbach, J.M., Zilles, K., 2006. White matter fiber tracts of the human brain: three-dimensional mapping at microscopic resolution, topography and intersubject variability. NeuroImage 29, 1092–1105. doi:10.1016/j.neuroimage.2005.08.040
Burger, P.C., Heinz, E.R., Shibata, T., Kleihues, P., 1988. Topographic anatomy and CT correlations in the untreated glioblastoma multiforme. J. Neurosurg. 68, 698–704. doi:10.3171/jns.1988.68.5.0698
Button, K.S., Ioannidis, J.P.A., Mokrysz, C., Nosek, B.A., Flint, J., Robinson, E.S.J., Munafo, M.R., 2013. Power failure: why small sample size undermines the reliability of neuroscience. Nat. Rev. Neurosci. 14, 365–376. doi:10.1038/nrn3475
Claes, A., Idema, A.J., Wesseling, P., 2007. Diffuse glioma growth: a guerilla war. Acta Neuropathol. (Berl.) 114, 443–458. doi:10.1007/s00401-007-0293-7
Clas, P., Groeschel, S., Wilke, M., 2012. A semi-automatic algorithm for determining the demyelination load in metachromatic leukodystrophy. Acad. Radiol. 19, 26–34. doi:10.1016/j.acra.2011.09.008
Cohen, J., 1983.
The cost of dichotomization. Appl. Psychol. Meas. 7, 249–253.
Cox, R.W., 1996. AFNI: Software for analysis and visualization of functional magnetic resonance neuroimages. Comput. Biomed. Res. 29, 162–173.
Cox, R.W., 2012. AFNI: What a long strange trip it's been. NeuroImage 62, 743–747. doi:10.1016/j.neuroimage.2011.08.056
Crawford, J.R., Garthwaite, P.H., 2005. Testing for suspected impairments and dissociations in single-case studies in neuropsychology: evaluation of alternatives using Monte Carlo simulations and revised tests for dissociations. Neuropsychology 19, 318–331. doi:10.1037/0894-4105.19.3.318
Crawford, J.R., Howell, D.C., 1998. Comparing an individual’s test score against norms derived from small samples. Clin. Neuropsychol. Neuropsychol. Dev. Cogn. Sect. D 12, 482–486. doi:10.1076/clin.12.4.482.7241
Crinion, J., Ashburner, J., Leff, A., Brett, M., Price, C., Friston, K., 2007. Spatial normalization of lesioned brains: performance evaluation and impact on fMRI analyses. NeuroImage 37, 866–875.
Damasio, H., Damasio, A.R., 1989. Lesion analysis in neuropsychology. Oxford University Press, New York.
de Haan, B., Clas, P., Juenger, H., Wilke, M., Karnath, H.-O., 2015. Fast semi-automated lesion demarcation in stroke. NeuroImage Clin. 9, 69–74. doi:10.1016/j.nicl.2015.06.013
de Haan, B., Karnath, H.-O., 2017. “Whose atlas I use, his song I sing?” - The impact of anatomical atlases on fiber tract contributions to cognitive deficits after stroke. NeuroImage 163, 301–309. doi:10.1016/j.neuroimage.2017.09.051
Fandino, J., Kollias, S.S., Wieser, H.G., Valavanis, A., Yonekawa, Y., 1999. Intraoperative validation of functional magnetic resonance imaging and cortical reorganization patterns in patients with brain tumors involving the primary motor cortex. J. Neurosurg. 91, 238–250. doi:10.3171/jns.1999.91.2.0238
Gennarelli, T.A., Thibault, L.E., Adams, J.H., Graham, D.I., Thompson, C.J., Marcincin, R.P., 1982.
Diffuse axonal injury and traumatic coma in the primate. Ann. Neurol. 12, 564–574. doi:10.1002/ana.410120611
Genovese, C.R., Lazar, N.A., Nichols, T., 2002. Thresholding of statistical maps in functional neuroimaging using the False Discovery Rate. NeuroImage 15, 870–878.
Gillebert, C.R., Humphreys, G.W., Mantini, D., 2014. Automated delineation of stroke lesions using brain CT images. NeuroImage Clin. 4, 540–548. doi:10.1016/j.nicl.2014.03.009
Goebel, R., 2012. BrainVoyager – past, present, future. NeuroImage 62, 748–756. doi:10.1016/j.neuroimage.2012.01.083
González, R.G., Schaefer, P.W., Buonanno, F.S., Schwamm, L.H., Budzik, R.F., Rordorf, G., Wang, B., Sorensen, A.G., Koroshetz, W.J., 1999. Diffusion-weighted MR imaging: diagnostic accuracy in patients imaged within 6 hours of stroke symptom onset. Radiology 210, 155–162. doi:10.1148/radiology.210.1.r99ja02155
Griffis, J.C., Allendorfer, J.B., Szaflarski, J.P., 2016. Voxel-based gaussian naïve Bayes classification of ischemic stroke lesions in individual T1-weighted MRI scans. J. Neurosci. Methods 257, 97–108. doi:10.1016/j.jneumeth.2015.09.019
Hammers, A., Allom, R., Koepp, M.J., Free, S.L., Myers, R., Lemieux, L., Mitchell, T.N., Brooks, D.J., Duncan, J.S., 2003. Three-dimensional maximum probability atlas of the human brain, with particular reference to the temporal lobe. Hum. Brain Mapp. 19, 224–247.
Hillis, A.E., Newhart, M., Heidler, J., Barker, P.B., Herskovits, E.H., Degaonkar, M., 2005. Anatomy of spatial attention: Insights from perfusion imaging and hemispatial neglect in acute stroke. J. Neurosci. 25, 3161–3167.
Hillis, A.E., Wityk, R.J., Barker, P.B., Beauchamp, N.J., Gailloud, P., Murphy, K., Cooper, O., Metter, E.J., 2002. Subcortical aphasia and neglect in acute stroke: the role of cortical hypoperfusion. Brain 125, 1094–1104.
Hillis, A.E., Wityk, R.J., Tuffiash, E., Beauchamp, N.J., Jacobs, M.A., Barker, P.B., Selnes, O.A., 2001.
Hypoperfusion of Wernicke’s area predicts severity of semantic deficit in acute stroke. Ann. Neurol. 50, 561–566.
Holodny, A.I., Schulder, M., Ybasco, A., Liu, W.-C., 2002. Translocation of Broca’s area to the contralateral hemisphere as the result of the growth of a left inferior frontal glioma. J. Comput. Assist. Tomogr. 26, 941–943.
Ingre, M., 2013. Why small low-powered studies are worse than large high-powered studies and how to protect against “trivial” findings in research: comment on Friston (2012). NeuroImage 81, 496–498. doi:10.1016/j.neuroimage.2013.03.030
Inoue, K., Madhyastha, T., Rudrauf, D., Mehta, S., Grabowski, T., 2014. What affects detectability of lesion-deficit relationships in lesion studies? NeuroImage Clin. 6, 388–397. doi:10.1016/j.nicl.2014.10.002
Jenkinson, M., Beckmann, C.F., Behrens, T.E., Woolrich, M.W., Smith, S.M., 2012. FSL. NeuroImage 62, 782–790. doi:10.1016/j.neuroimage.2011.09.015
Johnson, V.E., Stewart, W., Smith, D.H., 2013. Axonal pathology in traumatic brain injury. Exp. Neurol. 246, 35–43. doi:10.1016/j.expneurol.2012.01.013
Joliot, M., Jobard, G., Naveau, M., Delcroix, N., Petit, L., Zago, L., Crivello, F., Mellet, E., Mazoyer, B., Tzourio-Mazoyer, N., 2015. AICHA: An atlas of intrinsic connectivity of homotopic areas. J. Neurosci. Methods 254, 46–59. doi:10.1016/j.jneumeth.2015.07.013
Karnath, H.-O., Rennig, J., 2016. Investigating structure and function in the healthy human brain: validity of acute versus chronic lesion-symptom mapping. Brain Struct. Funct. doi:10.1007/s00429-016-1325-7
Karnath, H.-O., Rennig, J., Johannsen, L., Rorden, C., 2011. The anatomy underlying acute versus chronic spatial neglect: A longitudinal study. Brain 134, 903–912. doi:10.1093/brain/awq355
Karnath, H.-O., Rorden, C., 2012. The anatomy of spatial neglect. Neuropsychologia 50, 1010–1017.
Karnath, H.-O., Sperber, C., Rorden, C., in press. Mapping human brain lesions and their functional consequences. NeuroImage.
Karnath, H.-O., Steinbach, J.P., 2011. Do brain tumours allow valid conclusions on the localisation of human brain functions?--Objections. Cortex 47, 1004–1006. doi:10.1016/j.cortex.2010.08.006
Karnath, H.-O., Zopf, R., Johannsen, L., Berger, M.F., Nagele, T., Klose, U., 2005. Normalized perfusion MRI to identify common areas of dysfunction: patients with basal ganglia neglect. Brain 128, 2462–2469. doi:10.1093/brain/awh629
Kimberg, D.Y., Coslett, H.B., Schwartz, M.F., 2007. Power in voxel-based lesion-symptom mapping. J. Cogn. Neurosci. 19, 1067–1080.
Klein, A., Andersson, J., Ardekani, B.A., Ashburner, J., Avants, B., Chiang, M.C., Christensen, G.E., Collins, D.L., Gee, J., Hellier, P., Song, J.H., Jenkinson, M., Lepage, C., Rueckert, D., Thompson, P., Vercauteren, T., Woods, R.P., Mann, J.J., Parsey, R.V., 2009. Evaluation of 14 nonlinear deformation algorithms applied to human brain MRI registration. NeuroImage 46, 786–802.
Koenig, M., Kraus, M., Theek, C., Klotz, E., Gehlen, W., Heuser, L., 2001. Quantitative assessment of the ischemic brain by means of perfusion-related parameters derived from perfusion CT. Stroke 32, 431–437.
Lansberg, M.G., Thijs, V.N., O’Brien, M.W., Ali, J.O., de Crespigny, A.J., Tong, D.C., Moseley, M.E., Albers, G.W., 2001. Evolution of apparent diffusion coefficient, diffusion-weighted, and T2-weighted signal intensity of acute stroke. AJNR Am. J. Neuroradiol. 22, 637–644.
Levine, B., Kovacevic, N., Nica, E.I., Schwartz, M.L., Gao, F., Black, S.E., 2013. Quantified MRI and cognition in TBI with diffuse and focal damage. NeuroImage Clin. 2, 534–541. doi:10.1016/j.nicl.2013.03.015
Liebermeister, C., 1877. Über Wahrscheinlichkeitsrechnung in Anwendung auf therapeutische Statistik. Samml. Klin. Vorträge Inn. Med. No 31-64 110, 935–962.
Mah, Y.-H., Husain, M., Rees, G., Nachev, P., 2014a. Human brain lesion-deficit inference remapped. Brain 137, 2522–2531.
doi:10.1093/brain/awu164
Mah, Y.-H., Jager, R., Kennard, C., Husain, M., Nachev, P., 2014b. A new method for automated high-dimensional lesion segmentation evaluated in vascular injury and applied to the human occipital lobe. Cortex 56, 51–63. doi:10.1016/j.cortex.2012.12.008
Mayer, T.E., Hamann, G.F., Baranczyk, J., Rosengarten, B., Klotz, E., Wiesmann, M., Missler, U., Schulte-Altedorneburg, G., Brueckmann, H.J., 2000. Dynamic CT perfusion imaging of acute stroke. AJNR Am. J. Neuroradiol. 21, 1441–1449.
Mazziotta, J.C., Toga, A.W., Evans, A., Fox, P., Lancaster, J., 1995. A probabilistic atlas of the human brain: Theory and rationale for its development. The International Consortium for Brain Mapping (ICBM). NeuroImage 2, 89–101.
Mazziotta, J.C., Toga, A., Evans, A., Fox, P., Lancaster, J., Zilles, K., Woods, R., Paus, T., Simpson, G., Pike, B., Holmes, C., Collins, L., Thompson, P., MacDonald, D., Iacoboni, M., Schormann, T., Amunts, K., Palomero-Gallagher, N., Geyer, S., Parsons, L., Narr, K., Kabani, N., Le Goualher, G., Boomsma, D., Cannon, T., Kawashima, R., Mazoyer, B., 2001a. A probabilistic atlas and reference system for the human brain: International Consortium for Brain Mapping (ICBM). Philos. Trans. R. Soc. Lond. B Biol. Sci. 356, 1293–1322.
Mazziotta, J.C., Toga, A., Evans, A., Fox, P., Lancaster, J., Zilles, K., Woods, R., Paus, T., Simpson, G., Pike, B., Holmes, C., Collins, L., Thompson, P., MacDonald, D., Iacoboni, M., Schormann, T., Amunts, K., Palomero-Gallagher, N., Geyer, S., Parsons, L., Narr, K., Kabani, N., Le Goualher, G., Feidler, J., Smith, K., Boomsma, D., Hulshoff Pol, H., Cannon, T., Kawashima, R., Mazoyer, B., 2001b. A four-dimensional probabilistic atlas of the human brain. J. Am. Med. Inform. Assoc. 8, 401–430.
McKnight, T.R., von dem Bussche, M.H., Vigneron, D.B., Lu, Y., Berger, M.S., McDermott, M.W., Dillon, W.P., Graves, E.E., Pirzkall, A., Nelson, S.J., 2002.
Histopathological validation of a three-dimensional magnetic resonance spectroscopy index as a predictor of tumor presence. J. Neurosurg. 97, 794–802. doi:10.3171/jns.2002.97.4.0794
Meyer, P.T., Sturz, L., Sabri, O., Schreckenberger, M., Spetzger, U., Setani, K.S., Kaiser, H.J., Buell, U., 2003. Preoperative motor system brain mapping using positron emission tomography and statistical parametric mapping: hints on cortical reorganisation. J. Neurol. Neurosurg. Psychiatry 74, 471–478.
Mirman, D., Landrigan, J.-F., Kokolis, S., Verillo, S., Ferrara, C., Pustina, D., in press, this issue. Corrections for multiple comparisons in voxel-based lesion-symptom mapping. Neuropsychologia. doi:10.1016/j.neuropsychologia.2017.08.025
Mohr, J.P., Biller, J., Hilal, S.K., Yuh, W.T., Tatemichi, T.K., Hedges, S., Tali, E., Nguyen, H., Mun, I., Adams, H.P., 1995. Magnetic resonance versus computed tomographic imaging in acute stroke. Stroke 26, 807–812.
Motta, M., Ramadan, A., Hillis, A.E., Gottesman, R.F., Leigh, R., 2014. Diffusion-perfusion mismatch: an opportunity for improvement in cortical function. Front. Neurol. 5, 280. doi:10.3389/fneur.2014.00280
Nachev, P., Coulthard, E., Jäger, H.R., Kennard, C., Husain, M., 2008. Enantiomorphic normalization of focally lesioned brains. NeuroImage 39, 1215–1226. doi:10.1016/j.neuroimage.2007.10.002
Neumann-Haefelin, T., Wittsack, H.J., Wenserski, F., Siebler, M., Seitz, R.J., Mödder, U., Freund, H.J., 1999. Diffusion- and perfusion-weighted MRI. The DWI/PWI mismatch region in acute stroke. Stroke 30, 1591–1597.
Ojemann, J.G., Miller, J.W., Silbergeld, D.L., 1996. Preserved function in brain invaded by tumor. Neurosurgery 39, 253–258; discussion 258–259.
Ostrom, Q.T., Gittleman, H., Xu, J., Kromer, C., Wolinsky, Y., Kruchko, C., Barnholtz-Sloan, J.S., 2016. CBTRUS Statistical Report: Primary Brain and Other Central Nervous System Tumors Diagnosed in the United States in 2009–2013. Neuro-Oncol. 18, v1–v75.
doi:10.1093/neuonc/now207
Pallud, J., Capelle, L., Taillandier, L., Badoual, M., Duffau, H., Mandonnet, E., 2013. The silent phase of diffuse low-grade gliomas. Is it when we missed the action? Acta Neurochir. (Wien) 155, 2237–2242. doi:10.1007/s00701-013-1886-7
Povlishock, J.T., 1993. Pathobiology of traumatically induced axonal injury in animals and man. Ann. Emerg. Med. 22, 980–986.
Povlishock, J.T., Katz, D.I., 2005. Update of neuropathology and neurological recovery after traumatic brain injury. J. Head Trauma Rehabil. 20, 76–94.
Pustina, D., Avants, B., Faseyitan, O.K., Medaglia, J.D., Coslett, H.B., in press, this issue. Improved accuracy of lesion to symptom mapping with multivariate sparse canonical correlations. Neuropsychologia. doi:10.1016/j.neuropsychologia.2017.08.027
Pustina, D., Coslett, H.B., Turkeltaub, P.E., Tustison, N., Schwartz, M.F., Avants, B., 2016. Automated segmentation of chronic stroke lesions using LINDA: Lesion identification with neighborhood data analysis. Hum. Brain Mapp. 37, 1405–1421. doi:10.1002/hbm.23110
Ricci, P.E., Burdette, J.H., Elster, A.D., Reboussin, D.M., 1999. A comparison of fast spin-echo, fluid-attenuated inversion-recovery, and diffusion-weighted MR imaging in the first 10 days after cerebral infarction. AJNR Am. J. Neuroradiol. 20, 1535–1542.
Rondina, J.M., Filippone, M., Girolami, M., Ward, N.S., 2016. Decoding post-stroke motor function from structural brain imaging. NeuroImage Clin. 12, 372–380. doi:10.1016/j.nicl.2016.07.014
Rorden, C., Bonilha, L., Fridriksson, J., Bender, B., Karnath, H.-O., 2012. Age-specific CT and MRI templates for spatial normalization. NeuroImage 61, 957–965. doi:10.1016/j.neuroimage.2012.03.020
Rorden, C., Brett, M., 2000. Stereotaxic display of brain lesions. Behav. Neurol. 12, 191–200.
Rorden, C., Karnath, H.-O., 2004. Using human brain lesions to infer function: a relic from a past era in the fMRI age? Nat. Rev. Neurosci. 5, 813–819.
doi:10.1038/nrn1521
Rorden, C., Karnath, H.-O., Bonilha, L., 2007. Improving lesion-symptom mapping. J. Cogn. Neurosci. 19, 1081–1088. doi:10.1162/jocn.2007.19.7.1081
Rudrauf, D., Mehta, S., Grabowski, T.J., 2008. Disconnection's renaissance takes shape: Formal incorporation in group-level lesion studies. Cortex 44, 1084–1096. doi:10.1016/j.cortex.2008.05.005
Schaefer, P.W., Hunter, G.J., He, J., Hamberg, L.M., Sorensen, A.G., Schwamm, L.H., Koroshetz, W.J., Gonzalez, R.G., 2002. Predicting cerebral ischemic infarct volume with diffusion and perfusion MR imaging. AJNR Am. J. Neuroradiol. 23, 1785–1794.
Scherer, H.J., 1940. The forms of growth in gliomas and their practical significance. Brain 63, 1–35. doi:10.1093/brain/63.1.1
Schiffbauer, H., Ferrari, P., Rowley, H.A., Berger, M.S., Roberts, T.P., 2001. Functional activity within brain tumors: a magnetic source imaging study. Neurosurgery 49, 1313–1320; discussion 1320–1321.
Schlaug, G., Benfield, A., Baird, A.E., Siewert, B., Lövblad, K.O., Parker, R.A., Edelman, R.R., Warach, S., 1999. The ischemic penumbra: operationally defined by diffusion and perfusion MRI. Neurology 53, 1528–1537.
Shahid, H., Sebastian, R., Schnur, T.T., Hanayik, T., Wright, A., Tippett, D.C., Fridriksson, J., Rorden, C., Hillis, A.E., 2017. Important considerations in lesion-symptom mapping: Illustrations from studies of word comprehension. Hum. Brain Mapp. 38, 2990–3000. doi:10.1002/hbm.23567
Shattuck, D.W., Mirza, M., Adisetiyo, V., Hojatkashani, C., Salamon, G., Narr, K.L., Poldrack, R.A., Bilder, R.M., Toga, A.W., 2008. Construction of a 3D probabilistic atlas of human cortical structures. NeuroImage 39, 1064–1080.
Shaw, K., Brennan, N., Woo, K., Zhang, Z., Young, R., Peck, K.K., Holodny, A., 2016. Infiltration of the basal ganglia by brain tumors is associated with the development of co-dominant language function on fMRI. Brain Lang. 155–156, 44–48.
doi:10.1016/j.bandl.2016.04.002
Skirboll, S.S., Ojemann, G.A., Berger, M.S., Lettich, E., Winn, H.R., 1996. Functional cortex and subcortical white matter located within gliomas. Neurosurgery 38, 678–684; discussion 684–685.
Smith, D.V., Clithero, J.A., Rorden, C., Karnath, H.-O., 2013. Decoding the anatomical network of spatial attention. Proc. Natl. Acad. Sci. U. S. A. 110, 1518–1523. doi:10.1073/pnas.1210126110
Soares, J.M., Magalhães, R., Moreira, P.S., Sousa, A., Ganz, E., Sampaio, A., Alves, V., Marques, P., Sousa, N., 2016. A hitchhiker’s guide to functional magnetic resonance imaging. Front. Neurosci. 10, 515. doi:10.3389/fnins.2016.00515
Soares, J.M., Marques, P., Alves, V., Sousa, N., 2013. A hitchhiker’s guide to diffusion tensor imaging. Front. Neurosci. 7, 31. doi:10.3389/fnins.2013.00031
Sperber, C., Karnath, H.-O., 2016. Topography of acute stroke in a sample of 439 right brain damaged patients. NeuroImage Clin. 10, 124–128. doi:10.1016/j.nicl.2015.11.012
Sperber, C., Karnath, H.-O., 2017. Impact of correction factors in human brain lesion-behavior inference. Hum. Brain Mapp. 38, 1692–1701. doi:10.1002/hbm.23490
Sperber, C., Karnath, H.-O., in press, this issue. On the validity of lesion-behaviour mapping methods. Neuropsychologia. doi:10.1016/j.neuropsychologia.2017.07.035
Su, E., Bell, M., 2016. Diffuse Axonal Injury, in: Laskowitz, D., Grant, G. (Eds.), Translational Research in Traumatic Brain Injury, Frontiers in Neuroscience. CRC Press/Taylor and Francis Group, Boca Raton (FL).
Swanson, K., Alvord, E.C., Murray, J.D., 2004. Dynamics of a model for brain tumors reveals a small window for therapeutic intervention. Discrete Contin. Dyn. Syst. - Ser. B 4, 289–295. doi:10.3934/dcdsb.2004.4.289
Swanson, K.R., Bridge, C., Murray, J.D., Alvord, E.C., 2003. Virtual and real brain tumors: using mathematical modeling to quantify glioma growth and invasion. J. Neurol. Sci. 216, 1–10.
Taniguchi, M., Kato, A., Ninomiya, H., Hirata, M., Cheyne, D., Robinson, S.E., Maruno, M., Saitoh, Y., Kishima, H., Yoshimine, T., 2004. Cerebral motor control in patients with gliomas around the central sulcus studied with spatially filtered magnetoencephalography. J. Neurol. Neurosurg. Psychiatry 75, 466–471.
Thiebaut de Schotten, M., Ffytche, D.H., Bizzi, A., Dell’Acqua, F., Allin, M., Walshe, M., Murray, R., Williams, S.C., Murphy, D.G.M., Catani, M., 2011. Atlasing location, asymmetry and inter-subject variability of white matter tracts in the human brain with MR diffusion tractography. NeuroImage 54, 49–59. doi:10.1016/j.neuroimage.2010.07.055
Thiel, A., Herholz, K., Koyuncu, A., Ghaemi, M., Kracht, L.W., Habedank, B., Heiss, W.D., 2001. Plasticity of language networks in patients with brain tumors: a positron emission tomography activation study. Ann. Neurol. 50, 620–629.
Ticini, L.F., de Haan, B., Klose, U., Nagele, T., Karnath, H.-O., 2010. The role of temporoparietal cortex in subcortical visual extinction. J. Cogn. Neurosci. 22, 2141–2150.
Tzourio-Mazoyer, N., Landeau, B., Papathanassiou, D., Crivello, F., Etard, O., Delcroix, N., Mazoyer, B., Joliot, M., 2002. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. NeuroImage 15, 273–289.
von Kummer, R., Bourquain, H., Bastianello, S., Bozzao, L., Manelfe, C., Meier, D., Hacke, W., 2001. Early prediction of irreversible brain damage after ischemic stroke at CT. Radiology 219, 95–100. doi:10.1148/radiology.219.1.r01ap0695
Warach, S., Chien, D., Li, W., Ronthal, M., Edelman, R.R., 1992. Fast magnetic resonance diffusion-weighted imaging of acute human stroke. Neurology 42, 1717–1723.
Wilke, M., de Haan, B., Juenger, H., Karnath, H.-O., 2011. Manual, semi-automated, and automated delineation of chronic brain lesions: A comparison of methods. NeuroImage 56, 2038–2046. doi:10.1016/j.neuroimage.2011.04.014
Wilke, M., Holland, S.
K., Altaye, M., Gaser, C., 2008. Template-O-Matic: A toolbox for creating customized pediatric templates. NeuroImage 41, 903–913. doi:10.1016/j.neuroimage.2008.02.056
Wilson, S.M., 2016. Lesion-symptom mapping in the study of spoken language understanding. Lang. Cogn. Neurosci. 32, 891–899. doi:10.1080/23273798.2016.1248984
Winkler, A.M., Kochunov, P., Glahn, D.C. FLAIR Templates. Available at http://glahngroup.org or http://brainder.org.
Wu, O., Cloonan, L., Mocking, S.J.T., Bouts, M.J.R.J., Copen, W.A., Cougo-Pinto, P.T., Fitzpatrick, K., Kanakis, A., Schaefer, P.W., Rosand, J., Furie, K.L., Rost, N.S., 2015. Role of Acute Lesion Topography in Initial Ischemic Stroke Severity and Long-Term Functional Outcomes. Stroke 46, 2438–2444. doi:10.1161/STROKEAHA.115.009643
Wunderlich, G., Knorr, U., Herzog, H., Kiwit, J.C., Freund, H.J., Seitz, R.J., 1998. Precentral glioma location determines the displacement of cortical hand representation. Neurosurgery 42, 18–26; discussion 26–27.
Zhang, Y., Kimberg, D.Y., Coslett, H.B., Schwartz, M.F., Wang, Z., 2014. Multivariate lesion-symptom mapping using support vector regression. Hum. Brain Mapp. 35, 5861–5876. doi:10.1002/hbm.22590
Yourganov, G., Smith, K.G., Fridriksson, J., Rorden, C., 2015. Predicting aphasia type from brain damage measured with structural MRI. Cortex 73, 203–215. doi:10.1016/j.cortex.2015.09.005
Yushkevich, P.A., Piven, J., Hazlett, H.C., Smith, R.G., Ho, S., Gee, J.C., Gerig, G., 2006. User-guided 3D active contour segmentation of anatomical structures: Significantly improved efficiency and reliability. NeuroImage 31, 1116–1128. doi:10.1016/j.neuroimage.2006.01.015
Zhang, Y., Zhang, J., Oishi, K., Faria, A.V., Jiang, H., Li, X., Akhter, K., Rosa-Neto, P., Pike, G.B., Evans, A., Toga, A.W., Woods, R., Mazziotta, J.C., Miller, M.I., van Zijl, P.C.M., Mori, S., 2010. Atlas-guided tract reconstruction for automated and comprehensive examination of the white matter anatomy. NeuroImage 52, 1289–1301.
doi:10.1016/j.neuroimage.2010.05.049
Zilles, K., Schleicher, A., Langemann, C., Amunts, K., Morosan, P., Palomero-Gallagher, N., Schormann, T., Mohlberg, H., Bürgel, U., Steinmetz, H., Schlaug, G., Roland, P.E., 1997. Quantitative analysis of sulci in the human cerebral cortex: Development, regional heterogeneity, gender difference, asymmetry, intersubject variability and cortical architecture. Hum. Brain Mapp. 5, 218–221.
Zopf, R., Fruhmann Berger, M., Klose, U., Karnath, H.-O., 2009. Perfusion imaging of the right perisylvian neural network in acute spatial neglect. Front. Hum. Neurosci. 3, 15. doi:10.3389/neuro.09.015.2009
Zopf, R., Klose, U., Karnath, H.-O., 2012. Evaluation of methods for detecting perfusion abnormalities after stroke in dysfunctional brain regions. Brain Struct. Funct. 217, 667–675. doi:10.1007/s00429-011-0363-4

Figure legends

Figure 1: Number of publications per year that used the lesion method between the years 1995 and 2015. This result was obtained by running a PubMed (https://www.ncbi.nlm.nih.gov/pubmed) literature search with the search term “(‘lesion analysis’ OR ‘lesion mapping’ OR ‘VLSM’ OR ‘VLBM’) AND brain”, followed by a manual exclusion of non-empirical articles (e.g. methodological articles and review articles). Note the clear and steady increase in the number of publications per year that used the lesion method over the last 10 years.

Figure 2: Illustration of the typical lesion study pipeline. When performing a lesion study, the researcher has to decide which patients to select, how to assess the patient’s lesion location and behavioural status, how to spatially normalise the patient’s brain image and lesion map, how to perform the voxelwise (statistical) comparisons over all patients, and finally, how to anatomically interpret the obtained results.

Figure 3: Illustration of the thought experiment described in the manuscript text.
Both lesions depicted at the top equally affect the brain area crucially related to a certain cognitive function of interest. The diagram at the bottom reflects the rate of spontaneous recovery of this function over time. The difference in the measured behavioural deficit scores of these patients (‘54’ vs. ‘27’) does not result from differences in the relevance of the individual lesions for the cognitive function of interest. Instead, this difference is the consequence of including the two patients at different time points following stroke onset: one patient is recruited and behaviourally tested in the acute phase after stroke (and thus shows a maximum behavioural deficit score of ‘54’), while the other patient is first seen and tested in the intermediate/chronic stroke phase, following partial recovery of the behavioural deficit (from an initial behavioural deficit score of ‘54’ down to a behavioural deficit score of ‘27’).

Figure 4: The consequences of dividing an unselectively recruited patient sample into subsamples on the basis of an a priori anatomical hypothesis. A: Sketch where one and the same stroke patient sample (middle image, simple lesion overlap in yellow) is divided into either an anterior and posterior subsample (left image), or a ventral and dorsal subsample (right image), in order to perform separate lesion-behaviour mapping analyses in each of these subsamples. This division of a single patient sample into subsamples will generate results that match the a priori hypothesis. B: Illustration of this effect in an unselectively recruited sample of 20 stroke patients (patients taken from Sperber and Karnath [2016]). One and the same patient sample (top middle image, simple lesion overlap) is divided into either an anterior and posterior subsample (left images), or a ventral and dorsal subsample (right images).
Whereas dividing the entire patient sample into an anterior and posterior subsample results in an area of maximum lesion overlap at a z-coordinate of 6 and 24 respectively, dividing the same patient sample into a ventral and dorsal subsample results in an area of maximum lesion overlap at a z-coordinate of -12 and 33 respectively. Moreover, when dividing the entire patient sample into an anterior and posterior subsample, 7 patients (i.e. 35% of the sample) could not be categorised. When dividing the entire patient sample into a ventral and dorsal subsample, 11 patients (i.e. 55% of the sample) could not be categorised. The number of overlapping lesions is illustrated by colour, from violet (n=1) to red (n=maximum lesion overlap). The numbers at the bottom of the Figure indicate MNI z-coordinates. Images are in neurological orientation.

Highlights

• The lesion method is an influential and popular method to study human brain function
• But virtually no papers or books exist to assist scientists interested in this method
• We here provide a hitchhiker’s guide with practical guidelines and references