musings 30 Jun 2006 08:45 pm

How much should scientists worry?

No, this isn’t a post about the incidence of anxiety disorders among scientists. The question I want to ask is more circumscribed: how much do scientists need to worry that the assumptions that make their research possible might fail?

Here’s how this question came up. In response to my last post, in which I argued that the BOLD signal has been shown to correlate very highly with neural activity, Jonah Lehrer wrote:

But this is what I want fMRI researchers to grapple with. Instead of searching for the neural correlates of romantic love, why not grapple with the fascinating anomalies the technology actually illuminates. In my earlier post, I said that I wasn’t entirely convinced that fMRI has earned its reductionist conclusions. This is why. I’m fascinated and bewildered by the Logothetis data. In the five years since his paper was published, there have been thousands upon thousands of fMRI papers documenting all sorts of really interesting things. But we still don’t understand a significant part of the “nuanced” relationship between neural activity and the flow of oxygenated blood. The devil is always in the details.

I don’t want to rehash the technical debate about BOLD in this post. Instead, what I want to highlight is that last sentence about the devil (who could resist the devil?). What I want to know is: is it true? Is the devil always in the details?

Let’s suppose so. And let’s work through the implications of a failure of the “BOLD reflects neural activity” assumption. There are basically two ways an assumption like this could hurt us. They correspond roughly to the issues of measurement reliability and construct validity. Measurement reliability has to do with whether a given instrument consistently measures the same thing under similar conditions. This is pretty similar to the common-sense notion of reliability: if you have a reliable friend, you can trust that they’ll behave in a consistent manner when you need something from them; conversely, an unreliable car might crap out on you the next time you drive it, even if you don’t do anything radically different. An often overlooked but important point about reliability is that it isn’t a single value–technically, there’s no such thing as a reliable measure, only a measure that’s reliable when tested on a particular sample in a particular context. When you go to Walgreens to buy a thermometer, you’re banking on the fact that it’ll measure body temperature ‘reliably’. And it will; but that’s only because the temperatures you subject it to fall within the range it’s attuned to. Stick a $10 thermometer in the oven and it’s no longer reliable.
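To make the thermometer point concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the sensor model (`read_thermometer`, which saturates at a hypothetical maximum of 45 degrees and adds a little Gaussian noise), the temperature ranges, and the sample size. Reliability is estimated as the test-retest correlation between two readings of the same true temperatures.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def read_thermometer(true_temp, rng, max_reading=45.0, noise_sd=0.1):
    """Hypothetical cheap sensor: it cannot report above max_reading,
    and every reading carries a small amount of random noise."""
    return min(true_temp, max_reading) + rng.gauss(0.0, noise_sd)


def retest_reliability(true_temps, rng):
    """Measure every true temperature twice and correlate the two passes."""
    first = [read_thermometer(t, rng) for t in true_temps]
    second = [read_thermometer(t, rng) for t in true_temps]
    return pearson(first, second)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Same instrument, two different samples of "true" temperatures (Celsius).
body_temps = [rng.uniform(35.0, 41.0) for _ in range(500)]
oven_temps = [rng.uniform(150.0, 250.0) for _ in range(500)]

r_body = retest_reliability(body_temps, rng)
r_oven = retest_reliability(oven_temps, rng)

print(f"test-retest r, body temperatures: {r_body:.3f}")
print(f"test-retest r, oven temperatures: {r_oven:.3f}")
```

Under these assumptions the same instrument looks highly reliable on the body-temperature sample and close to useless on the oven sample, because the saturated sensor returns nothing but noise there. The reliability number belongs to the instrument-plus-sample combination, not to the instrument alone.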

In theory, you could have the same problem with fMRI. Like any other signal, the BOLD signal only indexes neural activity reliably under a particular range of circumstances. Most fMRI researchers assume that the kind of research they do is always going to fall within that range. But that assumption could fail; it could be that there are nuances we don’t appreciate, and if we don’t study those nuances, we’ll end up applying fMRI in situations where it’s worthless. So that’s one potential reason for worrying about the devil in the details.

What’s the other reason? Well, ensuring that a measure is reliable doesn’t necessarily mean it’s valid. Validity, and specifically construct validity, is the degree to which your measure gets at what you think it does. Construct validity is a huge bugbear in research, because there’s no way to quantify it. You can calculate a reliability coefficient pretty easily in most cases, but determining whether a measure accurately reflects a construct of interest is a qualitative matter. Suppose you have a 30-item self-report questionnaire you think assesses people’s generosity. You publish an article showing that people’s scores on the measure predict how much money they report giving to charity last year. What’s the problem? Well, your measure could have very high reliability and still be a bad measure of generosity. Why? For one thing, it’s a self-report measure. Who says people are the best judges of their own generosity? Maybe what you’re really measuring is some aspect of social desirability. Most people want to see themselves in a positive light; it’s not inconceivable that people who say they’re more generous also overreport how much money they donate (this kind of thing is a huge problem in personality psychology research, which is why researchers often include ‘lie’ scales in their experiments, or go to great lengths to get more objective corroborating data).
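The questionnaire example can be sketched as a toy simulation. This is an illustration under loudly stated assumptions, not a model of any real instrument: every one of the 30 hypothetical items is driven entirely by a respondent’s social desirability plus noise, and true generosity is generated independently of it. The reliability coefficient used is Cronbach’s alpha, one common choice.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    item_vars = [statistics.variance([row[i] for row in responses]) for i in range(k)]
    total_var = statistics.variance([sum(row) for row in responses])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
n_people, n_items = 300, 30

# Assumption: the two traits are unrelated in this toy population.
generosity = [rng.gauss(0.0, 1.0) for _ in range(n_people)]
desirability = [rng.gauss(0.0, 1.0) for _ in range(n_people)]

# Assumption: each item reflects social desirability plus noise, never generosity.
responses = [
    [desirability[p] + rng.gauss(0.0, 1.0) for _ in range(n_items)]
    for p in range(n_people)
]
totals = [sum(row) for row in responses]

alpha = cronbach_alpha(responses)
r_generosity = pearson(totals, generosity)
r_desirability = pearson(totals, desirability)

print(f"Cronbach's alpha:            {alpha:.3f}")
print(f"r(total score, generosity):  {r_generosity:.3f}")
print(f"r(total score, desirability): {r_desirability:.3f}")
```

In this setup the scale is internally consistent yet tells you essentially nothing about generosity. The catch, of course, is that the simulation knows the true generosity scores and a real researcher doesn’t, which is exactly why validity is so much harder to pin down than reliability.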

Along the same lines, an fMRI study reporting on the neural correlates of romantic love isn’t really reporting on the neural correlates of love unless you think showing someone photographs of loved ones is the same thing as inducing a strong feeling of romantic love. The behavioral responses induced by such photographs could be perfectly reliable, but we don’t know if that’s because they’re accurately tapping people’s deep romantic feelings or just accessing aspects of familiarity, social norms, etc. The same thing could be true with respect to the BOLD signal: maybe our failure to understand all of the subtleties of the relationship between blood flow and neural activity prevents us from understanding what fMRI is really measuring.

So what does this have to do with the original question I posed? Well, admittedly, not much (I thought a tangential discussion of reliability and validity was worthwhile), except that time is finite and not all risks are equal. In order to get any research done at all, researchers have to make assumptions about all kinds of things. We assume that our subjects will show up at 11 am like they promised (and we mumble vague threats into the air when they don’t materialize); we assume that our measures are reliable and do what we think they do; we assume that we’re applying certain statistical procedures correctly even though we don’t really understand what all those equations mean; and so on. Most of the time, we’re not even aware we’re making these assumptions. They only become apparent when we make a mistake and have to go back and fix or re-learn something.

Should we make people question all of their assumptions before getting down to actual research? Maybe. But who’s going to make the list of Assumptions that Need to be Questioned? And what should we put on it? Should we emphasize methodological issues, statistical issues, or conceptual issues? Do we want scientists who ably conduct sophisticated laboratory experiments costing tens of thousands of dollars and then inadvertently throw half their study’s power out the window because they never learned that Median Splits Are Bad and cheerfully dichotomize continuous variables? (This happens all the time.) Or do we want scientists who deftly manipulate matrices, can write three volumes on the merits of canonical correlation, but have no interest in collecting empirical data?

The correct answer is ‘all of the above’. We want scientists with every possible combination of strengths and weaknesses. There are people who are interested in developing new research methods but don’t have the faintest interest in using them to study anything substantive, and there are people who are brilliant at asking probing questions and designing clever experiments but don’t know how to test their hypotheses quantitatively. These aren’t interchangeable populations: the kind of person who really really wants to understand how music perception works isn’t necessarily the kind of person who really really wants to develop a reliable and valid measure of neural activity.
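For anyone who hasn’t run into the median split problem mentioned above, here is a rough Monte Carlo sketch of what it costs. The numbers are assumptions picked for illustration: a true correlation of 0.3, 50 subjects per study, 2000 simulated studies, and a hard-coded critical t value of about 2.011 (two-sided 5% test with 48 degrees of freedom, so it only applies when n is 50). Each simulated study is analyzed twice, once with the continuous predictor and once after dichotomizing it at the median.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def is_significant(r, n, t_crit):
    """t-test of a correlation coefficient against zero (n - 2 degrees of freedom)."""
    t = r * ((n - 2) / (1.0 - r * r)) ** 0.5
    return abs(t) > t_crit


def simulate_power(true_r=0.3, n=50, n_sims=2000, t_crit=2.011, seed=1):
    """Fraction of simulated studies reaching p < .05, with the predictor
    kept continuous versus median-split. t_crit must match n - 2 df."""
    rng = random.Random(seed)
    noise_sd = (1.0 - true_r ** 2) ** 0.5  # keeps the variance of y at 1
    hits_continuous = 0
    hits_split = 0
    for _ in range(n_sims):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [true_r * xi + rng.gauss(0.0, noise_sd) for xi in x]
        median_x = statistics.median(x)
        x_split = [1.0 if xi > median_x else 0.0 for xi in x]  # "high" vs "low"
        if is_significant(pearson(x, y), n, t_crit):
            hits_continuous += 1
        if is_significant(pearson(x_split, y), n, t_crit):
            hits_split += 1
    return hits_continuous / n_sims, hits_split / n_sims


power_continuous, power_split = simulate_power()
print(f"power with continuous predictor:   {power_continuous:.2f}")
print(f"power with median-split predictor: {power_split:.2f}")
```

The split version detects the effect less often because dichotomizing discards the information about how far each subject sits from the median. How much power is lost depends on the effect size and sample size you assume, so treat the output as a demonstration of the direction of the problem rather than a general figure.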

Let’s come back to the pitfalls of fMRI. Suppose you’re the former kind of person, and one day someone tells you about a nifty new technology called fMRI that you can use to study music perception. You get really excited, take a couple of classes, and manage to con a lab head into funding your experiment by whispering sweet nothings in their ear about multiple publications in Science and Nature. On your way to the scanner on the first day, a famous biophysicist pulls you aside and says: “you probably shouldn’t do that experiment yet; we know there’s a strong correlation between neural activity and the signal that machine measures, but we’re not absolutely positively 100% sure it’s always going to measure how much involvement a given brain region has in a cognitive process. Maybe you should hold off.”

What are you going to do? My guess is you don’t care. You go in and do the experiment anyway. My guess is that if you asked most cognitive neuroscientists working in the field today, they’d do the experiment anyway, regardless of how much they know about the BOLD signal. That’s not blind faith in the machine or an irrational devotion to pretty pictures; it simply reflects the fact that (a) you have to take risks to do science (and the risks are pretty minimal in this case), and (b) studying the neural basis of the BOLD signal is (for most cognitive neuroscientists) incredibly boring. And it’s not like this relative disregard is unique to imaging. Consider any of the following potential criticisms:

  1. You’re a cognitive psychologist? Don’t you think you should do some work on the brain before you develop any complex abstract models? After all, we don’t yet fully understand the brain, and you wouldn’t want to develop theories that aren’t compatible with the way the brain works…
  2. You’re a philosopher? Don’t you think you should do a bunch of psychology research before you start formulating weird metaphysical constructs? How do you think you’re going to relate your referential semantics program to information processing terminology? You are a materialist, right?
  3. You’re a systems neuroscientist? Listen, your theories about rate and place codes are very nice, but they’re really simple. You should probably do a Ph.D. in cellular neuroscience so you can understand how synchrony in neural assemblies develops before you go chasing after complex interactions between them.

All of these are pretty good criticisms, I think. But they’re beside the point. We don’t do experiments thinking we’ve got all the assumptions covered; we do them in spite of the fact that we know we’ll be wrong a good deal of the time. Because that’s the only way science can work. If we had to worry about every little assumption before making a start, we’d be paralyzed by doubt and never get off the ground. Now, I’m not saying scientists should be reckless. There’s no question it makes sense to worry about issues that could present major obstacles to research. But for most cognitive neuroscientists, the fidelity of the BOLD response to the underlying neural activity just isn’t one of them. So, is the devil in the details? Well, sure. But that’s right where we want him: he only hides in the details after we’ve evicted him from all of the more important places.

5 Responses to “How much should scientists worry?”

  1. on 30 Jun 2006 at 9:09 pm 1.Chris said …

    As a cognitive psychologist, my attitude toward imaging is that it occasionally offers a good tool for hypothesis testing, when two competing models would make different predictions about brain activity. The best example of this I know is the work of Todd Maddox and his colleagues on different types of categorization. They used imaging studies to show that rule-based categorization largely resulted in activation in the frontal cortex and the thalamocortical loop, while categorization that didn’t (and couldn’t) involve explicit rules utilized parts of the brain’s reward system. I know this is an oversimplified description of their data, but the point I’m trying to make is that different activation helped them argue for the existence of different categorization processes. However, they’d hypothesized the different processes based on behavioral data, and without that data, they would never have been able to know what to look at in the imaging data. Furthermore, the impetus for the imaging studies wasn’t to make sure their model was plausible, neuroscientifically. It rarely is, and I don’t think it should be, because imaging studies generally can’t present a picture detailed enough to determine the plausibility of theories.

  2. on 30 Jun 2006 at 9:10 pm 2.Chris said …

    By the way, I meant to include in that comment the fact that I’m loving this new blog.

  3. on 30 Jun 2006 at 9:31 pm 3.small and gray said …

    Chris, thanks! Glad you like it, I’m a big fan of your writing.

    As far as what imaging can and can’t do–I think it really depends on the quality of the study, much like in any other discipline. I think it’s probably easier to look impressive without saying anything substantive in an imaging paper than a behavioral study. But at its best, I think imaging offers a really powerful window into cognition that’s difficult to get any other way. The logic of dissociation and association that’s been adapted from neuropsychology is a very useful one when used appropriately. But hopefully I’ll write more on that in detail as this blog comes along…

  4. on 02 Jul 2006 at 7:36 pm 4.Andrew said …

    I’m really impressed with this blog. I’m a recent college grad who is just getting into neuroimaging, and I appreciate your defense of fMRI as a unique tool. Also, I don’t think that fMRI’s role in understanding psychopathology, which (arguably) has more “real world” importance than some cognitive neuroscientific pursuits, should be overlooked. Keep up the good work!! :)

  5. on 17 Jun 2008 at 12:05 am 5.Small Gray Matters » Blog Archive » Two cautionary notes on the use of fMRI said …

    [...] Nature each have very nice commentaries on the limitations of fMRI, a topic I’ve written about a few times before. The Nature piece is a review by Nikos Logothetis entitled “What we can [...]
