<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="wordpress/2.0.6" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>

<channel>
	<title>Small Gray Matters</title>
	<link>http://www.smallgraymatters.com</link>
	<description>of brains and their minds</description>
	<pubDate>Thu, 10 May 2007 07:10:24 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.0.6</generator>
	<language>en</language>
			<item>
		<title>brains in the elevator: notes from CNS 2007, pt. I</title>
		<link>http://www.smallgraymatters.com/2007/05/10/brains-in-the-elevator-notes-from-cns-2007-pt-i/</link>
		<comments>http://www.smallgraymatters.com/2007/05/10/brains-in-the-elevator-notes-from-cns-2007-pt-i/#comments</comments>
		<pubDate>Thu, 10 May 2007 07:04:17 +0000</pubDate>
		<dc:creator>small and gray</dc:creator>
		
		<category>musings</category>

		<category>academics</category>

		<guid isPermaLink="false">http://www.smallgraymatters.com/2007/05/10/brains-in-the-elevator-notes-from-cns-2007-pt-i/</guid>
		<description><![CDATA[I&#8217;m in New York for the 2007 annual meeting of the Cognitive Neuroscience Society. CNS alternates between San Francisco and New York; this year it&#8217;s in the latter city. I suppose if you have to pick two cities to have a conference in, those are pretty good ones. Still, one of the things I like [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m in New York for the 2007 annual meeting of the Cognitive Neuroscience Society. CNS alternates between San Francisco and New York; this year it&#8217;s in the latter city. I suppose if you have to pick two cities to have a conference in, those are pretty good ones. Still, one of the things I like best about going to conferences is getting to explore cities I haven&#8217;t spent time in. Not so much of that this year. On the other hand, having less inclination to sightsee leaves more time for posters, talks, and socializing, and that&#8217;s not a bad thing either.</p>
<p align="center">*    *    *</p>
<p>As always, there are too many posters to see. The CNS schedule of events doesn&#8217;t begin to approach SFN standards&#8211;the latter consisting of a CD&#8217;s worth of fully indexed and searchable abstracts, and five different books (one per day)&#8211;but if you do any sort of neuroimaging work, a much higher proportion of the abstracts are likely to interest you. I start every conference I go to by spending half an hour meticulously checking off all the posters I want to see at the next session. Then when that session rolls around I promptly discard my notes and drift aimlessly from aisle to aisle.</p>
<p align="center">*    *    *</p>
<p>There are a lot of complaints this year about the quality of the poster halls here at the Sheraton New York. The halls are (a) maze-like; (b) dark; and (c) warm. It&#8217;s a safe bet that some small proportion of attendees enjoys this environment, but for those of us who (a) don&#8217;t have an exquisitely-tuned spatial navigation system, (b) aren&#8217;t vampires, or (c) don&#8217;t suffer from hyperthyroidism, it&#8217;s a little bit uncomfortable.</p>
<p align="center">*    *    *</p>
<p>The drinks last night at the welcome reception started at $6.50 for a soft drink. $11.50 for a beer. When I asked the bartender why I couldn&#8217;t just have a cup of tap water for free, he shrugged in antipathy. I suppose it was more polite than saying &#8220;because it would undercut our bottom line, schmuck.&#8221; So I went down the street, bought a bucket, filled it with ice and water, and gave away free refreshments to all the thirsty neuroscientists. No, just kidding. I bent over and took it just like everyone else.</p>
<p align="center">*    *    *</p>
<p>Memo to presenters: that signup sheet next to your poster isn&#8217;t <em>real</em>. Etiquette requires that after someone&#8217;s finished being bored by the intimate details of your presentation for fifteen minutes, they be provided with some way of expressing their joy and gratitude to you for furnishing them with a life-changing experience. They do this by signing up to receive a second iteration of your treatment in written form. Putting their name on your form completes all contractual obligations. There&#8217;s no requirement that you actually follow up and email them your poster. In fact, doing so only inconveniences your audience. Last time I came back from CNS I spent an entire morning hitting the &#8216;delete&#8217; button. I could have been doing much more productive things, like brushing my teeth.</p>
<p align="center">*    *    *</p>
<p>I&#8217;m slowly realizing that New York is an expensive place with terrible service. Take for instance this morning. I was standing on the corner outside the hotel when a nice man approached me and said he was an artist and that he could put a beautiful glossy sheen on my poster for just $80. So I gave him my poster and $80, and he said he&#8217;d be back in twenty minutes. Well it&#8217;s been 3 hours and I haven&#8217;t heard anything. When he comes back, I&#8217;m going to be very angry with him. Just wait till he sees what kind of customer evaluation I give him at the information desk.</p>
<p align="center">*    *    *</p>
<p>Auditory perception, memory systems, emotion, and numerical processing. These are all important areas of research, and certainly worthy of inclusion in the poster sessions. But there&#8217;s no reason to be elitist. Cognitive neuroscience is a diverse field. I&#8217;ve been emailing the poster committee my suggestions for topics for several years, and I&#8217;ve yet to see any follow-through or receive a reply. What&#8217;s wrong, people? Too creative? Too novel? Don&#8217;t envy me just because I thought of having a symposium on brick-selective cortex and you didn&#8217;t. It&#8217;s not <em>my </em>fault you lack vision.</p>
<p align="center">*    *    *</p>
<p>There&#8217;s a lot of talk at this conference about how the brain is this wonderfully clever device that lets us project ourselves effortlessly into the past and future, move forwards and backwards in time, etc. etc. It&#8217;s not entirely unlike that other device that smoothly whisks you from the seventeenth floor down to the atrium while you&#8217;re busy placing mental bets on the length of the coffee line. Between your brain and the elevator, there&#8217;s no dimension you can&#8217;t conquer! You&#8217;re a master of time and space! Then the doors open up and someone jams your shoulder into the wall as they rush by you. Looks like you&#8217;re a lowly grad student again, grasshopper.</p>
<p align="center">*    *    *</p>
<p>After a long day spent milling around hundreds of posters made by hundreds of scientists, all as smart and creative as you, all working on equally interesting problems, it&#8217;s easy to get a little down on yourself. What&#8217;s the point, you might ask yourself. Why bother participating in science if the best any of us can ever hope for is to make a tiny, insignificant contribution to that great puzzle that is the human mind. And what&#8217;s so great about the human mind anyway, if it&#8217;s just the temporal analog of an elevator. You might as well be studying dirt. Dirt is less dynamic than the mind, but more tractorable.</p>
<p align="center">*    *    *</p>
<p>If a neuroscientist gives a great keynote address at a conference and no one hears it because they&#8217;ve all skipped the morning session to go roam around Lower Manhattan, does she still get to put it on her vita?
</p>
]]></content:encoded>
			<wfw:commentRss>http://www.smallgraymatters.com/2007/05/10/brains-in-the-elevator-notes-from-cns-2007-pt-i/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Getting rich in graduate school</title>
		<link>http://www.smallgraymatters.com/2007/04/01/getting-rich-in-graduate-school/</link>
		<comments>http://www.smallgraymatters.com/2007/04/01/getting-rich-in-graduate-school/#comments</comments>
		<pubDate>Sun, 01 Apr 2007 17:56:03 +0000</pubDate>
		<dc:creator>small and gray</dc:creator>
		
		<category>science</category>

		<category>politics</category>

		<category>news articles</category>

		<category>academics</category>

		<guid isPermaLink="false">http://www.smallgraymatters.com/2007/04/01/getting-rich-in-graduate-school/</guid>
		<description><![CDATA[The New York Times has an interesting article in today’s paper by Mary Jenkins covering a new federal program set to provide substantial raises in funding for a minority of graduate students in the sciences. The Pell-Mell Grants, a joint venture of the Federal Government, Pell Grant program, and Andrew W. Mellon Foundation, is projected [...]]]></description>
			<content:encoded><![CDATA[<p><span style="font-family: Verdana">The New York Times has <a href="http://www.nytimes.com/2007/04/01/weekinreview/01jenkins.html?ref=science">an interesting article</a> in today’s paper by Mary Jenkins covering a new federal program set to provide substantial raises in funding for a minority of graduate students in the sciences. The Pell-Mell Grants, a joint venture of the Federal Government, Pell Grant program, and Andrew W. Mellon Foundation, is projected to cost 8 billion dollars over the next twenty years. Needless to say, with that amount of money on the table, you’re going to see some strong opinions put forward about the program’s merit. The basic gist of the NYT’s article is that (not surprisingly) graduate students love it; established faculty members, not so much.</span></p>
<p class="MsoPlainText"><span style="font-family: Verdana">The really striking thing about the program is the sheer amount of money it throws at a select number of students—a projected 8,000 in 2014, with the number of awards gradually increasing over the next six years. It funds graduate students in natural science and engineering disciplines at a level up to $60,000 annually for three years, with applications renewable up to four times. Why the dramatic increase in funding over such a protracted period? From the article:</span></p>
<blockquote>
<p class="MsoPlainText"><span style="font-family: Verdana">Critics complain that allowing graduate students to secure major funding for up to 12 years of predoctoral work will encourage complacency and clog up universities with ‘lifers’. Others see it as a necessary step if the US wishes to remain competitive with emerging Asian countries. Dr. Ron Sekubus, chair of the department of public policy at George Washington University, notes that without the funding, American universities would be forced to admit an ever-increasing number of international students in order to buffer against the loss of American students to more lucrative fields such as medicine and law. When international students complete their degrees, they are almost invariably unable to find legal work in the US, leading them to return to their home countries. The long-run outcome of this perpetual ‘brain drain’, Sekubus suggests, is that the US will fall in scientific productivity relative to rapidly-developing countries such as India and China. </span></p>
<p class="MsoPlainText"><span style="font-family: Verdana">When I ask him whether the government couldn’t just solve this problem overnight by allowing more trained foreign scientists to stay in the US once they’ve completed their doctorate, Sekubus responds aggressively. “Immigration is not the solution,” he says. “Increasing funding for American citizens via merit-based and non merit-based programs is the solution.”</span></p>
</blockquote>
<p class="MsoPlainText"><span style="font-family: Verdana">The reasoning seems pretty straightforward. But as always, the story isn’t as clear cut as the above quote suggests. Here&#8217;s another relevant bit from the article:</span></p>
<blockquote>
<p class="MsoPlainText"><span style="font-family: Verdana">Tovaz Shikarti, a program officer at the NIH, points out that high levels of predoctoral funding in the sciences make sense within the current cultural context:</span></p>
<p class="MsoPlainText"><span style="font-family: Verdana">&#8220;When we sat down to look at it, we couldn&#8217;t really understand why the top 5% of American graduate students are getting paid less than the bottom 1% of American faculty. American culture is built on the promise of potential; it&#8217;s a foregone conclusion that this generation of top students is going to do some pretty remarkable things in a few years. The Pell-Mell grant program is our way of equalizing the situation by honoring the American tradition. You could think of this as an NBA draft for scientists.&#8221;</span></p>
<p class="MsoPlainText"><span style="font-family: Verdana">Others don&#8217;t see it that way. John Jacobson, a professor of history at The College of Wooster, complains. &#8220;I certainly don&#8217;t object to students making a livable wage. I was a graduate student once too. But when a first year physics graduate student at Wisconsin makes more than I do as an Associate Professor of Historical Arts at Wooster, I don&#8217;t think that&#8217;s right. My wife and I are giving serious thought to respecializing in materials science just so we can get a piece of the pie. It&#8217;s almost like the government&#8217;s goal is to get rid of the humanities and social sciences altogether.&#8221;</span></p>
</blockquote>
<p class="MsoPlainText"><span style="font-family: Verdana">One of the caveats to the program is that the awards are highly selective&#8211;more so than existing NSF and NIH fellowships. There&#8217;s a three-stage selection process. The first two are fairly standard: First, would-be Pell-Mell grantees send in an application similar to the one required for the current NSF predoctoral fellowships. In fact, applicants to the NSF program are automatically entered into the Pell-Mell competition if they fill in several new fields on the NSF forms. Second, successful first-stage applicants are subjected to a still more rigorous screening, including close scrutiny of applicants’ transcripts, letters of recommendation, and current institution.</span></p>
<p class="MsoPlainText"><span style="font-family: Verdana">But it’s the final stage that sets the Pell-Mell grants apart from other programs. Successful applicants must not only demonstrate their academic prowess, but must also pass muster in the eyes of a newly-created Human Excellence Review Board (HERB). One of the novel requirements implemented by HERB is that applicants must designate a non-academic hobby as their “special skill”. The goal of this requirement is to encourage applications from well-rounded students with broad interests, instead of automatons who spend all day in the lab living and breathing one narrow discipline.</span></p>
<blockquote>
<p class="MsoPlainText"><span style="font-family: Verdana">&#8220;What you list as your special skill is flexible,” says Tovaz Shikarti. “There&#8217;s no strict criterion our applicants have to live up to.” He notes that when the NIH conducted a limited test run with graduate students at Dartmouth College and the University of Pennsylvania, it received applications from people with skills like lizard hunting, karaoke, and cheese making. &#8220;There was even one trapeze swinger,&#8221; he says.</span></p>
</blockquote>
<p class="MsoPlainText"><span style="font-family: Verdana">If this all sounds a little bit cock-eyed, don’t worry, the government is on top of that too.</span></p>
<blockquote>
<p class="MsoPlainText"><span style="font-family: Verdana">When prompted as to whether the Pell-Mell program might not produce bad press for the NIH and NSF at a time when American scientists are complaining about flatlining funding, administrators demur. “We had a serious discussion about that at HERB,” says Aashish Patel, a board member. “Some people wanted to kill the program, to stub it out. But it’s not like we’re sitting around smoking up when we come up with these ideas. They’re serious policy proposals.”</span></p>
</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.smallgraymatters.com/2007/04/01/getting-rich-in-graduate-school/feed/</wfw:commentRss>
		</item>
		<item>
		<title>item due in 7 days</title>
		<link>http://www.smallgraymatters.com/2007/01/13/item-due-in-7-days/</link>
		<comments>http://www.smallgraymatters.com/2007/01/13/item-due-in-7-days/#comments</comments>
		<pubDate>Sat, 13 Jan 2007 14:26:09 +0000</pubDate>
		<dc:creator>small and gray</dc:creator>
		
		<category>humor</category>

		<guid isPermaLink="false">http://www.smallgraymatters.com/2007/01/13/item-due-in-7-days/</guid>
		<description><![CDATA[Following in the footsteps of several other science blogs, here&#8217;s a library card for smallgraymatters.com:


]]></description>
			<content:encoded><![CDATA[<p>Following in the footsteps of <a href="http://scienceblogs.com/terrasig/2007/01/library_catalog_card.php">several</a> <a href="http://scienceblogs.com/retrospectacle/2007/01/library_card_meme.php">other</a> <a href="http://scienceblogs.com/pharyngula/2007/01/where_youll_find_me_in_the_car.php">science</a> <a href="http://scienceblogs.com/goodmath/2007/01/gmbm_in_the_card_catalog.php">blogs</a>, here&#8217;s a <a href="http://www.blyberg.net/card-generator/">library card</a> for smallgraymatters.com:</p>
<p><img alt="a brief history of the brain, by small &#038; gray" title="a brief history of the brain, by small &#038; gray" src="http://www.smallgraymatters.com/images/librarycard.jpg" />
</p>
]]></content:encoded>
			<wfw:commentRss>http://www.smallgraymatters.com/2007/01/13/item-due-in-7-days/feed/</wfw:commentRss>
		</item>
		<item>
		<title>trendspotting the fMRI literature</title>
		<link>http://www.smallgraymatters.com/2007/01/08/trendspotting-the-fmri-literature/</link>
		<comments>http://www.smallgraymatters.com/2007/01/08/trendspotting-the-fmri-literature/#comments</comments>
		<pubDate>Tue, 09 Jan 2007 06:43:58 +0000</pubDate>
		<dc:creator>small and gray</dc:creator>
		
		<category>fmri</category>

		<category>academics</category>

		<category>methodology</category>

		<guid isPermaLink="false">http://www.smallgraymatters.com/2007/01/08/trendspotting-the-fmri-literature/</guid>
		<description><![CDATA[Select a few neuroimaging papers at random and you’re likely to come across a handful of statements in the introduction to the effect that the topic under study is of “increasing interest”. At conferences and research talks, you’ll sometimes see speakers invoke a familiar kind of figure that looks something like this:

That’s the number of [...]]]></description>
			<content:encoded><![CDATA[<p>Select a few neuroimaging papers at random and you’re likely to come across a handful of statements in the introduction to the effect that the topic under study is of “increasing interest”. At conferences and research talks, you’ll sometimes see speakers invoke a familiar kind of figure that looks something like this:</p>
<p><img title="Number of 'language and fmri' citations in PubMed, 1996-2006" alt="Number of 'language and fmri' citations in PubMed, 1996-2006" src="http://www.smallgraymatters.com/images/language_1.jpg" /></p>
<p>That’s the number of citations in PubMed containing the terms ‘fMRI’ and ‘language’ in the abstract or title, plotted by year of publication. Figures like this purport to show that interest in a topic is increasing dramatically. Just look at that increase! In 1996, there were only 13 hits; by 2005, there were 99! It’s as clear as daylight that interest in the neural bases of language is increasing!</p>
<p>Of course, the poorly-kept secret is that fMRI didn’t exist twenty years ago, and wasn’t really widely adopted until the last few years. So it’s natural to see an increase in publications that study language using neuroimaging methods. You’d expect a similar increase for almost<em> every</em> other area of research. The more pertinent question is whether interest in a particular topic has increased <em>disproportionately</em> relative to the general increase in the use of fMRI over the last few years. Instead of plotting absolute numbers, what we want is something like this:</p>
<p><img src="http://www.smallgraymatters.com/images/language_2.jpg" /></p>
<p>In the above figure, the pink line represents the number of papers with the terms ‘fMRI’ and ‘language’ in the title or abstract (the blue line in the first figure has now turned pink&#8211;sorry about the color confusion!). But now the additional (blue) line shows the number of papers that have just the term ‘fMRI’ in the title or abstract. The increase in language papers starts to look suspect, since it&#8217;s clear the increase in fMRI papers on language is essentially paralleled by the increase in fMRI papers in general. Here’s an even better representation:</p>
<p><img src="http://www.smallgraymatters.com/images/language_3.jpg" /></p>
<p>That’s the proportion of PubMed studies with the terms ‘fMRI’ and ‘language’ in the title or abstract over the last few years relative to the total number of studies with just the term “fMRI”. As you can see, it’s a very different picture. It’s a small sample size, but there’s not much reason to think people are any more interested in studying language in 2006 than in 1998—at least, <em>relative to interest in other topics that can be studied with fMRI.</em></p>
<p>So what to make of claims that research interest is increasing in topics X, Y, and Z? Well, in a sense those claims are true, since the total number of neuroimaging publications continues to rise fairly dramatically. But in the sense that researchers probably care about more—namely, the “if I have a magnet and I want to do a study, what’s a hot topic right now?” sense—most research topics <em>can’t</em> be on the rise, by definition (just like most people can’t be of above average intelligence). Moreover, the number of academic publications <em>in general</em> has increased pretty dramatically over the last few years, so it’s not even clear from the above just how much of the increase in the number of fMRI papers on language is due to greater adoption of fMRI as opposed to a more global increase in scientific research output.</p>
<p>Now, the point of this post isn’t just to malign a ubiquitous research tactic. One can’t really fault people for wanting to think their own research is more interesting than other people’s. I’ll be the first to confess I’ve inserted some rather disingenuous comments about how oh-so-fascinating my results are and how much they (should) mean to other researchers in my papers. It’s hard to motivate a paper without doing that to some degree, or even to get motivated to do the research in the first place. What the second graph above does point up though, is that the question as to what topics are ‘hot’ is an empirical one—and fortunately, one that can be relatively easily (though imprecisely) tested.</p>
<p>To generate the above graphs, I used data from PubMed. One of the many nice things about PubMed is that it has <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html">an API</a> that allows you to access the database programmatically (in contrast to Google Scholar, which is inaccessible via API due to agreements between Google and the major publishers to keep it that way). So, in the interest of doing some trendspotting, I wrote a small Visual Basic program to quantify the emergence (or lack thereof) of real ‘trends’ in research. I used the search string “fMRI [tiab]” as the control—i.e., all articles containing the string “fMRI” in the title or abstract. This is a conservative approach since the standard PubMed search also searches article contents, resulting in a difference of more than an order of magnitude in hits (7000 vs. 160000). But the more conservative approach is likely more accurate, since any study that includes the term in its title or abstract is much more likely to report original fMRI data than studies that just mention the term in passing.</p>
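<p>For readers who want to try this themselves, here is a minimal sketch of a per-year count query. It is written in Python rather than the Visual Basic of the original program, and it is not that program. The endpoint URL and the parameter choices (<code>datetype</code>, <code>mindate</code>, <code>maxdate</code>, <code>rettype</code>, <code>retmode</code>) are assumptions based on the public E-utilities ESearch interface, and the function names <code>build_count_url</code> and <code>fetch_count</code> are made up for illustration.</p>

```python
import json
import urllib.parse
import urllib.request

# Assumed address of the public ESearch endpoint; check the current
# E-utilities documentation before relying on it.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_count_url(term, year):
    """Build an ESearch URL that asks only for the hit count of `term`
    among articles published in the given year."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",  # filter on publication date
        "mindate": str(year),
        "maxdate": str(year),
        "rettype": "count",
        "retmode": "json",
    }
    return ESEARCH_URL + "?" + urllib.parse.urlencode(params)


def fetch_count(term, year):
    """Fetch the hit count from PubMed (needs network access; the JSON
    layout read here is an assumption about the service's response)."""
    with urllib.request.urlopen(build_count_url(term, year)) as response:
        payload = json.load(response)
    return int(payload["esearchresult"]["count"])
```

<p>Calling <code>fetch_count("fMRI[tiab]", 2006)</code> would give the control count for one year, and a topic-specific search only changes the term, e.g. <code>"fMRI[tiab] AND language[tiab]"</code>. Anyone looping over many years and terms should pause between requests to stay within NCBI's usage guidelines.</p>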
<p>This reference number (broken down by year) was then compared with the results of a series of more specific searches. Basically, for a variety of topics, I added a single search term like “language” or “emotion” to the basic search. Again, the stipulation was that only titles and abstracts be searched. The ratio between the specific and the general term was then plotted for each year in order to highlight potential trends.</p>
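<p>The ratio step is simple enough to sketch in a few lines of Python. The function name <code>yearly_ratios</code> is hypothetical, and the counts in the example are placeholders invented to show the call; they are not PubMed numbers.</p>

```python
def yearly_ratios(specific_counts, control_counts):
    """Return {year: specific / control} for every year in specific_counts.

    Years whose control count is missing or zero are skipped so that
    the function never divides by zero.
    """
    ratios = {}
    for year in sorted(specific_counts):
        control = control_counts.get(year, 0)
        if control > 0:
            ratios[year] = specific_counts[year] / control
    return ratios


# Placeholder counts, invented purely to show the call; not real PubMed data.
example_specific = {2004: 10, 2005: 15, 2006: 30}
example_control = {2004: 100, 2005: 150, 2006: 200}
example_ratios = yearly_ratios(example_specific, example_control)
```

<p>In this made-up example the topic's absolute count triples between 2004 and 2006, yet its share of the control literature only moves from 10% to 15%, which is exactly the distinction the ratio is meant to expose.</p>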
<p>What do the results look like? Here are the ‘trends’ in neuroimaging for four major areas of research, broken down for the years 1996-2006:</p>
<p><img src="http://www.smallgraymatters.com/images/domains_1.jpg" /></p>
<p>What can we infer from the above figure? Well, just by eyeballing it, it looks like there’s a general trend toward relative increases in the number of papers on emotion, working memory, and attention, and no change for language. Statistical tests reveal that the three positive trends are significant (p < .05 for all three). So there’s at least some evidence that there are in fact trends in neuroimaging research (assuming there isn’t some alternative explanation, e.g., abstracts just getting longer and consequently mentioning more terms). The key point is that this kind of information can’t be gleaned just by looking at the first figure presented in this post. Absolute increases in publication count aren’t particularly informative. In contrast, when you use a control condition—though in this case, an admittedly crude one—you can feel a little more confident about the conclusions you’re able to draw. Naturally, this is a small sample size, and as I mentioned, the search is highly conservative (obviously, more than 46 fMRI articles on emotion were published in 2006!). But it’s likely that the results are a good representation of what’s out there, and that we can safely generalize to the many papers that use fMRI to study these topics but didn’t use the exact term in the abstract.</p>
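<p>The post doesn't spell out which statistical test was run. One plausible reading, given the regression slopes mentioned later on, is a simple linear regression of each yearly ratio on year. The Python sketch below works under that assumption: it fits the line by least squares and returns the slope together with its t-statistic. The function name <code>slope_t_statistic</code> is made up for illustration.</p>

```python
import math


def slope_t_statistic(years, ratios):
    """Fit ratio = intercept + slope * year by ordinary least squares.

    Returns (slope, t), where t is the slope divided by its standard error.
    Needs at least three points and some scatter around the fitted line.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    residual_ss = sum(
        (y - (intercept + slope * x)) ** 2 for x, y in zip(years, ratios)
    )
    standard_error = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, slope / standard_error
```

<p>With eleven yearly points (1996-2006) the test has 9 degrees of freedom, so a slope would count as significant at the two-tailed .05 level when the absolute value of t exceeds roughly 2.262.</p>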
<p>What about other ways of carving up the literature? Here’s the breakdown by sensory modality:<br />
<img src="http://www.smallgraymatters.com/images/domains_2.jpg" /></p>
<p>Doesn’t look like much is going on, and indeed none of the regression slopes are statistically significant. But at least this analysis is somewhat reassuring given the increases seen above for working memory, attention, and emotion: it’s clearly not as though <em>all</em> search terms are being mentioned more frequently in more recent fMRI abstracts.</p>
<p>Here’s one last figure (this could obviously go on for a very long time) plotting the trajectory of publication count in a few less-studied domains:</p>
<p><img src="http://www.smallgraymatters.com/images/domains_3.jpg" /></p>
<p>The trends for ‘social’, ‘reward’, and ‘decision making’ are significant here, but the trendline for pain isn’t. Social neuroscience research in particular appears to be emerging as a prominent domain of fMRI research, more than doubling its relative share of the literature between 2005 and 2006, though it’s still a relatively small field.</p>
<p>In evaluating the figures above, there are several caveats to keep in mind. One major limitation of this trendspotting approach is that it’s not well-suited to quantifying trends in more fine-grained areas of research, because there may only be a handful of studies per year, resulting in a pretty unreliable measure. Then again, claims that one small niche of research within the broader field of cognitive neuroscience is on the rise probably aren’t that interesting to begin with. If a particular topic was studied by 2 people in 2000 and 6 in 2005 (instead of a projection of, say, 4), you might want to wait a while before hopping on the bandwagon.</p>
<p>Another obvious limitation is that the procedure I used to generate these graphs was extremely simplistic. One can easily imagine more sophisticated approaches that control much more tightly for potential confounds (e.g., tier of journal, mean abstract length, etc.) and use better quantitative measures than the simple ratio I used above. That’s ok though; the point I want to make isn’t that this particular set of graphs provides a particularly accurate insight into the state of the field of neuroimaging. Rather, the point is that scientific trends can be studied empirically just like anything else, and there’s a massive amount of data freely available for mining. Entire journals are devoted to tracking and discussing current research fads (see the <a href="http://www.trends.com">‘Trends in…’ series</a>), but it’s unclear whether the editors at such outlets make their decisions on the basis of quantitative information. Conversely, from an author’s perspective, knowing what’s hot isn’t just a matter of curiosity—careful attention to trends could conceivably increase the rate of acceptance of one’s publications.</p>
<p>As a side note, if anyone wants to suggest possible searches for trends they’d like to see quantified, feel free to leave a comment below or to email me. I may release the VB program at some point, but it’s in no shape to see the light of day at the moment. Of course, you can always head over to PubMed and enter search terms manually.
</p>
]]></content:encoded>
			<wfw:commentRss>http://www.smallgraymatters.com/2007/01/08/trendspotting-the-fmri-literature/feed/</wfw:commentRss>
		</item>
		<item>
		<title>the full monty on frontal love syndrome</title>
		<link>http://www.smallgraymatters.com/2007/01/07/the-full-monty-on-frontal-love-syndrome/</link>
		<comments>http://www.smallgraymatters.com/2007/01/07/the-full-monty-on-frontal-love-syndrome/#comments</comments>
		<pubDate>Mon, 08 Jan 2007 06:09:31 +0000</pubDate>
		<dc:creator>small and gray</dc:creator>
		
		<category>humor</category>

		<guid isPermaLink="false">http://www.smallgraymatters.com/2007/01/07/the-full-monty-on-frontal-love-syndrome/</guid>
		<description><![CDATA[Mind Hacks offers up this humorous vignette for your entertainment:
There&#8217;s a lovely typo in a 1976 paper from the Journal of Neurology, Neurosurgery, and Psychiatry that reports on a study about epilepsy after surgery. Check out the last sentence of the abstract &#8230;
I&#8217;ll spare you the suspense (but read the abstract anyway!): an undisclosed subset [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.mindhacks.com/">Mind Hacks</a> offers up <a href="http://www.mindhacks.com/blog/2007/01/temporal_typo_trauma.html">this humorous vignette</a> for your entertainment:</p>
<blockquote><p>There&#8217;s a lovely typo in a 1976 paper from the <em>Journal of Neurology, Neurosurgery, and Psychiatry</em> that reports on a study about epilepsy after surgery. Check out the last sentence of the abstract &#8230;</p></blockquote>
<p>I&#8217;ll spare you the suspense (but <a href="http://www.mindhacks.com/blog/2007/01/temporal_typo_trauma.html">read the abstract anyway</a>!): an undisclosed subset of patients in the sample were fortunate enough to suffer from &#8220;temporal love trauma&#8221;. You might have thought temporal love trauma to be an unusual disorder (it&#8217;s certainly much rarer than temporal <strong>lobe</strong> trauma), but that&#8217;s an empirical question, and the empirical answer is you&#8217;d be wrong. At least according to Google Scholar, which assures us quite confidently that temporal love epilepsy is a well-documented condition:</p>
<p><a href="http://scholar.google.com/scholar?q=%22temporal+love%22&#038;hl=en&#038;lr=&#038;btnG=Search">http://scholar.google.com/scholar?q=%22temporal+love%22&#038;hl=en&#038;lr=&#038;btnG=Search </a></p>
<p>Marvel that it is, Google Scholar also gives us a rare glimpse into the symptoms of a related but even more mysterious disorder: frontal love epilepsy. Consider the title of the following paper, cited in the reference section of a book by one M. Cherkes Julkowski:</p>
<p style="margin-left: 40px">Burgess, PW &#038; Shallice, T (1996). Response suppression, initiation, and strategy use following frontal love lesions. <em>[For reasons that are presently unclear to me, I wasn&#8217;t able to locate this article in the primary literature.]</em></p>
<p>Or the following helpful tip from a set of lecture notes on functional neuroanatomy, <a href="http://ibs.derby.ac.uk/~keith/5ps021/Functional_Neuroanatomy-Derby_05.pdf">now mysteriously disappeared from their original Cambridge home</a> (a conspiracy?):</p>
<p style="margin-left: 40px">Theories of frontal love function have superseded ARAS theory in explaining personality differences.</p>
<p>The lecture notes then go on to say that empirical studies have shown that the personality traits of extraversion and agreeableness depend in large part on one&#8217;s frontal love capacity. Illustrations are provided in the text. If you don&#8217;t believe me, you can go and see for yourself. Wait, I forgot: they&#8217;re no longer online. How unfortunate.</p>
<p>I imagine there are also cases of <a href="http://www.google.com/search?num=100&#038;hl=en&#038;lr=&#038;q=%22parietal+love%22&#038;btnG=Search">parietal love</a> lesions or <a href="http://www.google.com/search?hl=en&#038;q=%22occipital+love%22&#038;btnG=Google+Search">occipital love</a> epilepsy (blinded by one&#8217;s feelings?) out there, waiting to be discovered. It really is a great time to be alive and doing science&#8230;
</p>
]]></content:encoded>
			<wfw:commentRss>http://www.smallgraymatters.com/2007/01/07/the-full-monty-on-frontal-love-syndrome/feed/</wfw:commentRss>
		</item>
		<item>
		<title>A primer on power</title>
		<link>http://www.smallgraymatters.com/2006/12/04/a-primer-on-power/</link>
		<comments>http://www.smallgraymatters.com/2006/12/04/a-primer-on-power/#comments</comments>
		<pubDate>Tue, 05 Dec 2006 05:06:23 +0000</pubDate>
		<dc:creator>small and gray</dc:creator>
		
		<category>tutorials</category>

		<category>methodology</category>

		<category>statistics</category>

		<guid isPermaLink="false">http://www.smallgraymatters.com/2006/12/04/a-primer-on-power/</guid>
		<description><![CDATA[I&#8217;d like to title this post “a power primer,” but that’s the title of a 1992 Psychological Bulletin article by Jacob Cohen (the god of power analysis, now deceased). So instead I’ve titled it “a primer on power.” By changing a few words around I’ve very cleverly gone from academic plagiarism to paying homage. (And [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;d like to title this post “a power primer,” but that’s the title of <a href="http://www.education.wisc.edu/elpa/academics/syllabi/2006/06Spring/825Borman/Cohen1992.pdf">a 1992 Psychological Bulletin article by Jacob Cohen</a> (the god of power analysis, now deceased). So instead I’ve titled it “a primer on power.” By changing a few words around I’ve very cleverly gone from academic plagiarism to paying homage. (And it really is one: I think Cohen’s article, and his lengthier works on power, should be required reading for behavioral scientists of all stripes).</p>
<p>Power is one of the most misunderstood and/or underappreciated concepts in scientific research. Simply put, it refers to the probability of detecting an effect in your sample when it is in fact present in the population (i.e., when it’s ‘real’). If your study has, say, 90% power to detect a difference in the length of socks worn by basketball players as compared to soccer players, that means that <em>if there really is a difference</em> between basketball and soccer players’ sock length, there’s a 9 in 10 chance on average that you’ll be able to detect it in your sample.</p>
<p>In general, power is a good thing, and you want to have as much of it as you can. In an ideal world, scientific experiments would have 100% power to detect effects. Unfortunately, that doesn’t happen in the real world, because to have 100% power (i.e., complete certainty), you’d need to sample the entire population of interest, which isn’t very practical (that’s a lot of players, and twice as many socks). In practice, researchers’ sample sizes are constrained by resource considerations. And so, as a result, is power. Any time you conduct an experiment with a finite sample, you’re taking the risk that you might miss an effect even if it really does exist, simply because of blind (mis)fortune. And in general, the smaller your sample, the greater the probability of you missing an effect. This idea is intuitive enough to most people: it seems pretty obvious that if you want to know whether men are taller than women, you don’t want to base your judgment on the difference in height between just one man and one woman. If you did, you&#8217;d run the risk that you just happened to pick a particularly short man and/or a particularly tall woman. The more men and women you measure, the more the random variations from the mean average out, and the smaller the odds of mistakenly concluding that there’s no gender difference in height.</p>
<p>Where confusion starts to set in (and the impetus for this post) is that the intimate link between sample size and power often leads people (including many scientists) to suppose that there’s a single ‘right’ sample size for all research studies of a particular kind. It’s not uncommon to hear people say things like, “we can&#8217;t trust that study because it&#8217;s based on only 50 people! They need at least 300 to be able to say anything meaningful about the general population!” (Actually this sort of statement also betrays another kind of confusion that relates to the difference between Type I and Type II errors, but that’s a separate issue). The problem is that statistical power depends not only on sample size, but also on two other numbers: the size of the effect, and the stipulated false positive rate (also referred to as alpha, or the Type I error rate).</p>
<p>The importance of the first of these, effect size, is easy to see intuitively. Suppose that the average height difference between men and women was 2 feet rather than several inches. How hard would it be to detect that difference and conclude it exists? Not very. A group of curious alien taxonomists wouldn’t need to abduct very many humans before they figured the gender difference out, simply because the vast majority of men would be taller than the vast majority of women, and the difference would hit the aliens right between the antennae. On the other hand, if the mean height difference was only 1/10th of an inch, our aliens would need to abduct a lot of humans and measure them very carefully before they’d be in a good position to claim that a height difference exists. Simply put, if the effect you’re looking for is large, it takes fewer subjects to detect it. Or, more formally, holding sample size and false positive rate constant, one’s power to detect an effect increases with the magnitude of the effect.</p>
<p>The second parameter, false positive rate, is somewhat less intuitive. The basic idea is that, since sampling is random and error necessarily creeps in, on rare occasions, researchers are going to end up concluding that an effect exists in the population even though it doesn’t really. Just how often such errors occur is typically a matter of stipulation: scientists will decide that they can accept a false positive occurring, say, 1 out of every 20 times, and adjust their statistical tests accordingly. Conventionally, the false positive rate is set to 5% (and significance tests are therefore conducted at p < .05). Because the convention is so strong, it’s often easy to overlook the false positive rate in power calculations and just default to the standard 5% level. Nonetheless, there is a relationship: holding sample size and effect size constant, the more conservative your statistical test (i.e., the smaller the false positive rate you're willing to accept), the lower your power gets. In less technical terms, it's kind of like saying that if you only want to be <em>fairly</em> sure that an effect holds true, you don&#8217;t need to look very hard. But if you want to be <em>really</em> sure, you need to double- and triple-check. And double- and triple-checking requires more observations (i.e., more subjects).</p>
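<p>To make that trade-off concrete, here is a minimal sketch in Python. It is not the calculation behind any figure in this post: it assumes a two-sided, one-sample z-test with a known standard deviation of 1, and the function name <code>approx_power</code> and the example values (an effect of d = 0.5 and 30 subjects) are made up purely for illustration.</p>

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test.

    Assumes a known standard deviation of 1, so d is the true effect
    in standard deviation units and n is the number of subjects.
    """
    z = NormalDist()  # standard normal distribution
    crit = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    shift = d * sqrt(n)  # where the test statistic is centred when the effect is real
    # Chance of landing beyond either critical value when the effect is real.
    return (1 - z.cdf(crit - shift)) + z.cdf(-crit - shift)


# Same effect and same sample; only the false positive rate changes.
for alpha in (0.05, 0.01, 0.001):
    print(alpha, round(approx_power(0.5, 30, alpha), 2))
```

<p>Under these assumptions, power drops steadily as alpha is tightened from .05 to .001, even though neither the effect nor the sample has changed. Raising <code>d</code> or <code>n</code> pushes it back up.</p>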
<p>Given that power depends on these three parameters (sample size, effect size, and false positive rate), how much power is enough? It’s widely accepted that a reasonable level of power is 80-85%. I say ‘widely accepted’ because when people stop to think about what level of power they find acceptable, their answer tends to be in that ballpark (i.e., 4 times out of 5, your experiment will detect the effect you want if it really exists). But that’s not to say that most studies actually <em>have</em> that level of power in practice. One of the most remarkable findings (and one that’s been demonstrated over and over again) made by statisticians interested in power is that an absurdly large proportion of studies in many disciplines simply don’t have the necessary power to detect the effects they hypothesize. In the article I linked to at the beginning, Jacob Cohen points out that an analysis he conducted in 1960 indicated that the average social psychology study had only 48% power to detect moderate-sized effects. In Cohen’s words, “the chance of obtaining a significant result was about that of tossing a head with a fair coin” (p. 155). And that’s on <em>average</em>; presumably there are a good number of studies that have set out to identify effects they have <em>no real chance of detecting even if they’re actually present in the population</em>.</p>
<p>Cohen then went on to note that other statisticians conducting similar reviews have shown no improvement in the average level of power in the decades since. For anyone actively involved in research—or even to casual consumers of science—this should raise red flags all over the place. There really is no excuse for failing to do a simple power calculation before beginning to collect data. It’s not as though power calculation is a tedious process: all you have to do is plug two or three numbers into an online worksheet, and poof, you get your answer instantly. And yet many, maybe even most, scientists fail to do so.</p>
<p>In fairness, doing a power calculation isn’t quite <em>that</em> easy, because you rarely know the exact size of the effect you’re seeking. If you did, you probably wouldn’t need to do the study in the first place! While it’s easy to decide you’d like your study to have, say, 80% power, it’s not so easy to come up with a reasonable estimate of effect size.</p>
<p>Suppose for example that we want to know if there’s a correlation between people’s mood and the amount of television they watch daily. Let’s stipulate our power has to be around 80% (we don’t want to do our study if we don’t think there’s at least a 4 in 5 chance of detecting an effect), and we’ll test our hypothesis at the conventional level of p < .05. How many subjects do we need to collect data from? Well, depends. If the correlation between mood and television watching in the general population is <em>large</em> (canonically, around r = .5), we’re only going to need 29 people to have an 80% chance of detecting it. If it’s <em>medium</em> (say, r = .3), we’re going to have to round up 85 people. But if it’s only a <em>small</em> effect (say, r = .1, or an overlap of only 1% of the total variance in each measure), we’re faced with the daunting prospect of chasing down 785 subjects! Note that in all 3 of these cases, we’re assuming that there <em>really is a correlation between mood and television-watching</em>. The only difference is how strong that effect is.</p>
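<p>For readers who want to see where numbers like these come from, here is a rough sketch in Python. It uses one common textbook approximation (the Fisher z transformation of r), which is not necessarily the method behind the figures above, and the function name <code>n_for_correlation</code> is invented for this example.</p>

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_correlation(r, power=0.80, alpha=0.05):
    """Approximate sample size needed to detect a population correlation r.

    Uses the Fisher z approximation for a two-sided test:
    n = ((z_alpha + z_power) / atanh(r)) ** 2 + 3
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the chosen alpha
    z_power = z.inv_cdf(power)  # quantile matching the desired power
    return ceil(((z_alpha + z_power) / atanh(r)) ** 2 + 3)


# Large, medium, and small correlations by Cohen's conventions.
for r in (0.5, 0.3, 0.1):
    print(r, n_for_correlation(r))
```

<p>The approximation lands within a subject or two of the figures quoted above; exact calculators differ slightly depending on the method they use. The point to notice is how quickly the required sample grows as r shrinks.</p>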
<p>Of course, power calculations don’t always have to mean bad news. For example, in my area of research (functional neuroimaging), power calculations are often quite comforting. It’s an interesting quirk that people often criticize imaging studies for having small samples, when in fact imaging studies probably don’t have lower power on average than other kinds of studies (at least for standard experimental, within-subject analyses). The knee-jerk reaction is understandable though, because many psychologists (particularly in social or personality psychology) are used to working with samples in the hundreds. If that’s your background, it’s no surprise that when you come across neuroimaging studies that used samples of only 15 subjects (a pretty standard size), you’re going to think something’s horribly wrong.</p>
<p>In fact, there’s nothing wrong, because it turns out (fortuitously!) that effect sizes in functional neuroimaging studies tend to be huge. It’s not uncommon to see effect sizes around d = 2 (d is a standardized measure of effect size popularized by Cohen; it’s measured in standard deviations, so a d of 2 means the difference in neural activation between two experimental conditions is around 2 standard deviations). Effects that large are unheard of in most other disciplines. Consider that Cohen himself considered anything above d = 0.8 a ‘large’ effect (this is just a heuristic of course—the meaning of ‘large’ differs considerably across research areas!).</p>
<p>A quick power calculation reveals that a study with 12 subjects has essentially 100% power to detect an effect size of 2 at p < .05. Basically, if the population effect really is that big, you’re not going to miss it. In fact, with only 2 subjects, you’d still have an 88% shot of detecting it. This explains why early neuroimaging studies that often had only 3 or 4 subjects were able to obtain replicable results. In the early days, when little was known about the relationship between specific cognitive tasks and neural activity in humans, researchers used very broad experimental task contrasts specifically intended to elicit very large, very obvious changes in activation (e.g., comparing activation during a working memory task to a passive resting state). The effects were (not surprisingly, in hindsight) enormous. As time goes on and our knowledge of the functional neuroanatomy of cognition builds up, hypotheses become more subtle, and effect sizes diminish, requiring larger samples.</p>
<p>Of course, imaging studies usually don’t test for effects at p < .05, for reasons I won’t go into here (mainly the need to correct for multiple comparisons). Still, even at p < .001, a study with 15 subjects has 70% power. That’s not great, but it’s a comparable level to what you’ll find in many behavioral studies. Bump the sample up to 20 subjects, and power is now 92%, which is more than acceptable.</p>
<p>Hopefully, these examples make clear the importance of (a) conducting power calculations <em>before</em> starting to collect data, and (b) having some reasonable notion as to what the population effect size might be (e.g., based on related effects that have already been identified). Even if you&#8217;re never going to collect any data yourself, and just want to be an informed consumer of scientific literature, it pays to know something about power. Remember: effect size matters. The fact that a study only has 10 people doesn&#8217;t necessarily mean it&#8217;s too small to provide meaningful data. Conversely, a study can have thousands of subjects and still be underpowered.
</p>
]]></content:encoded>
			<wfw:commentRss>http://www.smallgraymatters.com/2006/12/04/a-primer-on-power/feed/</wfw:commentRss>
		</item>
		<item>
		<title>what&#8217;s your number?</title>
		<link>http://www.smallgraymatters.com/2006/11/11/whats-your-number/</link>
		<comments>http://www.smallgraymatters.com/2006/11/11/whats-your-number/#comments</comments>
		<pubDate>Sun, 12 Nov 2006 03:17:00 +0000</pubDate>
		<dc:creator>small and gray</dc:creator>
		
		<category>general</category>

		<category>academics</category>

		<category>publishing</category>

		<guid isPermaLink="false">http://www.smallgraymatters.com/2006/11/11/whats-your-number/</guid>
		<description><![CDATA[The PLoS blog has an interesting entry by Richard Cave, PLoS&#8217;s IT director, on the topic of unique author identification. If you&#8217;ve done more than a couple dozen literature searches, odds are you&#8217;ve run into cases where you&#8217;ve asked yourself &#8220;is I. Niedebeternaym the I. Niedebeternaym I&#8217;m looking for?&#8221; Sometimes authors share names; sometimes individual [...]]]></description>
			<content:encoded><![CDATA[<p>The PLoS blog has <a href="http://www.plos.org/cms/node/133">an interesting entry by Richard Cave</a>, PLoS&#8217;s IT director, on the topic of unique author identification. If you&#8217;ve done more than a couple dozen literature searches, odds are you&#8217;ve run into cases where you&#8217;ve asked yourself &#8220;is I. Niedebeternaym <em>the</em> I. Niedebeternaym I&#8217;m looking for?&#8221; Sometimes authors share names; sometimes individual authors list their names differently on different papers; and sometimes authors <em>change</em> names (e.g., after getting married). While most of us can probably agree that it&#8217;d be nice if unique author IDs existed, there are plenty of technical issues that need to be resolved before such a system can be implemented. See the <a href="http://www.plos.org/cms/node/133">full post</a> for an insightful discussion.
</p>
]]></content:encoded>
			<wfw:commentRss>http://www.smallgraymatters.com/2006/11/11/whats-your-number/feed/</wfw:commentRss>
		</item>
		<item>
		<title>The genetics of episodic memory</title>
		<link>http://www.smallgraymatters.com/2006/10/21/the-genetics-of-episodic-memory/</link>
		<comments>http://www.smallgraymatters.com/2006/10/21/the-genetics-of-episodic-memory/#comments</comments>
		<pubDate>Sat, 21 Oct 2006 07:17:58 +0000</pubDate>
		<dc:creator>small and gray</dc:creator>
		
		<category>fmri</category>

		<category>research articles</category>

		<category>molecular genetics</category>

		<guid isPermaLink="false">http://www.smallgraymatters.com/2006/10/21/the-genetics-of-episodic-memory/</guid>
		<description><![CDATA[The latest issue of Science has a really impressive article by Papassotiropoulos et al. probing the genetic basis of episodic memory. In it, the authors identify for the first time a link between a polymorphism in a gene called Kibra and individual variability in performance on delayed episodic memory tasks.
In brief, Papassotiropoulos and colleagues conducted [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal">The <a href="http://www.sciencemag.org/content/vol314/issue5798/index.dtl">latest issue of Science</a> has <a href="http://www.sciencemag.org/cgi/content/full/314/5798/475">a <em>really </em>impressive article by Papassotiropoulos et al.</a> probing the genetic basis of episodic memory. In it, the authors identify for the first time a link between a polymorphism in a gene called <em>Kibra </em>and individual variability in performance on delayed episodic memory tasks.</p>
<p class="MsoNormal">In brief, Papassotiropoulos and colleagues conducted a whole-genome scan of 500,000 distinct single nucleotide polymorphisms (SNPs) in a large Swiss sample, and identified SNPs in two genes that correlated with episodic memory performance. They subsequently replicated the association between one of the genes (Kibra) and memory performance in two independent samples. What’s striking isn’t just the presence of two successful replications (almost unheard of in a paper that’s first to identify a gene-behavior relationship; many effects of this sort fail to replicate at all in subsequent studies), but also the size of the effect. In the initial sample, T allele carriers performed 24% better on a free recall task after a 5 minute delay than non-carriers (i.e., subjects who had 2 C alleles of the Kibra SNP). You often hear people write off the molecular genetic approach to studying cognitive differences on the grounds that individual genes account for only a fraction of the variance and it’d take dozens or hundreds of genes to form a meaningful account. What the Kibra effect and other similar studies suggest is that, at least for some traits, a handful of genes may actually account for a considerable portion of the variance.</p>
<p class="MsoNormal">While the discovery and replication of the Kibra-episodic memory association alone would be a high-impact finding, Papassotiropoulos et al. didn’t stop there. They then went on to conduct brain imaging analyses in both humans and mice, demonstrating that a truncated version of Kibra is densely expressed in the medial temporal lobe (a region heavily implicated in episodic memory formation), but not in other areas such as the frontal lobes. They suggest that Kibra may exert its effect on memory via modulation of hippocampal function, though the precise locus and mechanism of the effect are currently unknown.</p>
<p class="MsoNormal">But wait! There’s more! The authors then went on to conduct an fMRI study, in which they imaged 15 T allele carriers and 15 non-carriers during performance of an encoding task (a face-profession association task). They observed selective increases in activation in the medial temporal lobes in non-carriers (the group with poorer performance in the genetic samples) relative to carriers, and no regions showing the converse effect. Because the two groups were matched for delayed memory performance, the relative increase in the group with poorer performance likely reflects less efficient processing, requiring greater activation to achieve the same level of memory performance.</p>
<p class="MsoNormal">As if all this wasn’t enough, Papassotiropoulos et al. also conducted structural imaging analyses using both automated whole-brain and manual tracing approaches. These analyses didn’t turn up any significant findings, but given the amount of effort and breadth of expertise required for all of these analyses, one can only applaud them for trying.</p>
<p class="MsoNormal">On the whole, I can’t say enough good things about this paper. Regardless of the implications of the substantive finding itself (which, if replicated by other groups, should have important implications both theoretically and practically), it’s remarkable to see such a diversity of approaches and sources of expertise brought to bear on a single problem. Lots of people pay lip service to the notion that inter-disciplinary science is a good thing, but to date there are relatively few demonstrations of the idea on a large scale. The combination of molecular genetics and cognitive neuroimaging seems like a particularly profitable approach, yet few people have applied it thus far (with several notable exceptions, e.g., centers at the NIH and Pittsburgh). If this is the shape of things to come, it’ll be fun to watch over the next few years as this sort of research takes off…</p>
]]></content:encoded>
			<wfw:commentRss>http://www.smallgraymatters.com/2006/10/21/the-genetics-of-episodic-memory/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Multiple choice tests: why you shouldn&#8217;t panic</title>
		<link>http://www.smallgraymatters.com/2006/08/26/multiple-choice-tests-why-you-shouldnt-panic/</link>
		<comments>http://www.smallgraymatters.com/2006/08/26/multiple-choice-tests-why-you-shouldnt-panic/#comments</comments>
		<pubDate>Sat, 26 Aug 2006 16:32:32 +0000</pubDate>
		<dc:creator>small and gray</dc:creator>
		
		<category>academics</category>

		<category>tutorials</category>

		<guid isPermaLink="false">http://www.smallgraymatters.com/2006/08/26/multiple-choice-tests-why-you-shouldnt-panic/</guid>
		<description><![CDATA[Many undergraduate students in the social and life sciences go through 4 or more years of university education utterly convinced that multiple choice exams are Satan’s favorite testing format. Drawn up by diabolical, sadistic demons (sometimes termed “professors”), questions on multiple choice exams are invariably ambiguous, unfair, and out for (the student’s) blood.  Personally, [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal">Many undergraduate students in the social and life sciences go through 4 or more years of university education utterly convinced that multiple choice exams are Satan’s favorite testing format. Drawn up by diabolical, sadistic demons (sometimes termed “professors”), questions on multiple choice exams are invariably ambiguous, unfair, and out for (the student’s) blood.  Personally, I have my own vivid and unpleasant memories of the teeth-gnashing, expletive-laden tirades I went through not so very long ago whenever I received an exam back with questions marked wrong that I felt I should have received credit for. But now that I’m an older and marginally wiser graduate student with several statistics and research methods classes under my belt, I appreciate what I couldn’t back then: <em>there’s nothing wrong with multiple choice exams </em>(most of the time!). Multiple choice exams are fine. They’re better than fine&#8211;they’re great. The problem isn’t the exams; it’s that no one ever bothers to explain the logic of the format to students at a point in time when it actually matters (e.g., at the beginning of the semester, before the first exam).</p>
<p class="MsoNormal">Now that I’m in the position of having to grade students’ multiple choice exams and explain their mistakes to them during office hours, I often find myself wishing I had a concise explanation as to why they really shouldn’t feel bad about getting Question Number 26 wrong, and why it’s still a perfectly good question even if they felt the wording was ambiguous. There are <a href="http://www.google.com/search?hl=en&#038;lr=&#038;q=multiple+choice+strategies&#038;btnG=Search">plenty of guides</a> about <a href="http://www.studygs.net/tsttak3.htm">how to <em>take</em></a> multiple choice exams floating around on the web, but what I’m after is a Damage Control Guide explaining how to defuse tension associated with students’ perceptions that they got screwed over on the last test. So rather than wait around indefinitely, I thought I’d write one, in the hopes others might find it useful.</p>
<p class="MsoNormal">The overarching point students need to understand and accept about multiple choice exams is that they are almost always made up of <em>mostly bad </em>questions, and that this is in fact <em>mostly a good thing</em>. By ‘mostly’ bad I mean that almost any question on a multiple choice exam is going to be ambiguous to some degree. Wording that seems crystal clear to one student is going to seem horribly vague to another; a question to which one student thinks B is unambiguously the right answer may confuse and anger another student, who thinks B, C, and D are all perfectly acceptable answers based on what the textbook says. Ideally, of course, such ambiguity shouldn’t be <em>so </em>pervasive as to completely paralyze and perplex the majority of students taking a test. However, some measure of ambiguity and even outright error is unavoidable.</p>
<p class="MsoNormal">It also turns out not to be a very big deal. It can be demonstrated mathematically that even a multiple choice test made up of mostly bad questions can still provide a very good measure of students’ knowledge of the tested material, provided that (a) there’s at least a weak correlation between students’ scores on individual questions and their overall knowledge, and (b) there are enough questions on the exam.</p>
<p class="MsoNormal">In practice, both of these numbers can usually be surprisingly modest. The reliability of a measure (or multiple choice test) is most commonly estimated using Cronbach’s alpha, which, in one form, allows us to compute a reliability coefficient as a function of two quantities: the number of items (or questions) on the test, and the average correlation between items. The formula is as follows:</p>
<p class="MsoNormal"><a href="http://en.wikipedia.org/wiki/Cronbach's_alpha"><img title="Cronbach's alpha formula" alt="Cronbach's alpha formula" src="http://upload.wikimedia.org/math/4/6/6/4668d9e129a9651a64fa0031b8ce7b2c.png" /></a></p>
<p class="MsoNormal">Where N is the number of items and r is the average inter-item correlation. Given this formula, it’s easy to estimate the reliability of a hypothetical test. For example, a test with 30 questions and an average inter-item correlation of only .2 (equivalent to an average of only 4% shared variance between items!) will have a reliability coefficient of .88. In general, anything over .85 or so is considered good, so even by creating a test with only 30 questions and weakly inter-correlated items, you can see that an instructor can end up with a very reliable test. Given that grades are typically derived from more than one test, the reliability of students’ overall grades will generally increase further. Moreover, if you were to increase the number of items on a given test to 90, reliability jumps to .96, or near perfect.</p>
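<p class="MsoNormal">The arithmetic is simple enough to check in a few lines of Python. This sketch assumes the standardized form of Cronbach’s alpha, i.e., alpha = N * r / (1 + (N - 1) * r), which reproduces the two figures above; the function name <code>standardized_alpha</code> is just a label chosen for the example.</p>

```python
def standardized_alpha(n_items, mean_r):
    """Standardized Cronbach's alpha.

    n_items is the number of questions on the test and mean_r is the
    average correlation between pairs of questions.
    """
    return n_items * mean_r / (1 + (n_items - 1) * mean_r)


print(round(standardized_alpha(30, 0.2), 2))  # 0.88
print(round(standardized_alpha(90, 0.2), 2))  # 0.96
```

<p class="MsoNormal">Plugging in other values of <code>n_items</code> and <code>mean_r</code> gives a quick feel for how test length and item quality trade off against each other.</p>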
<p class="MsoNormal">Note that because an average inter-item correlation of .2 is pretty low, the above calculation essentially gives instructors a free pass to have several bad questions on each exam. The net effect of poorly wording a question is to reduce its ability to correlate with other questions, because whether or not a student gets a bad question right depends on chance rather than knowledge. So smarter students are no more likely to get a bad question right than are poor students. Just how many bad questions one can afford to have on a test depends on how inter-correlated the <em>good </em>questions are; but it’s easy to see that even on a test of 30 questions with an average inter-item correlation of .2, having 4 or 5 questions that are completely uncorrelated with the rest of the test would have relatively little impact on the overall reliability of the test. And since reliability increases as a function of number of items, any concern about the drop can easily be offset by adding another 10 or 20 items.</p>
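<p class="MsoNormal">Here is one way to put rough numbers on that claim, again in Python. The sketch rests on simplifying assumptions that go beyond anything stated above: the good questions all correlate with one another at the same value, the bad questions correlate at exactly zero with everything (including each other), and reliability is the standardized form of Cronbach’s alpha. The function name <code>alpha_with_bad_items</code> is hypothetical.</p>

```python
from math import comb


def alpha_with_bad_items(n_items, n_bad, r_good):
    """Standardized alpha for a test that mixes good and bad questions.

    Good questions correlate with each other at r_good; bad questions
    are assumed to correlate at zero with every other question.
    """
    n_good = n_items - n_bad
    # Only good-good pairs contribute to the average correlation.
    mean_r = r_good * comb(n_good, 2) / comb(n_items, 2)
    return n_items * mean_r / (1 + (n_items - 1) * mean_r)


print(round(alpha_with_bad_items(30, 0, 0.2), 2))  # 30 good questions
print(round(alpha_with_bad_items(30, 5, 0.2), 2))  # 25 good, 5 bad
print(round(alpha_with_bad_items(40, 5, 0.2), 2))  # 35 good, 5 bad
```

<p class="MsoNormal">Under these assumptions, swapping 5 of the 30 questions for pure noise pulls reliability down from about .88 to about .83, and adding 10 more good questions brings it back to about .88.</p>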
<p class="MsoNormal">Of course, all of this may initially seem like mumbo-jumbo to an irate student who feels they were mortally wronged by ambiguous wording on one or two questions. But it’s useful to explain nonetheless, because students who understand the logic will not only complain less, making your life easier, but will also have a more pleasant college experience, since they won’t spend four or more years feeling persecuted by malevolent instructors.</p>
<p class="MsoNormal">Having said all of this, there are a couple of important caveats, and one shouldn’t just conclude that <em>any</em> reasonably well-thought out multiple choice test is acceptable for class use. First, bad exam questions (even when there are only a few) do present a genuine problem for a small minority of students, namely those whose performance is at ceiling. If you’re a student who would have performed perfectly on a test made up of clear, relevant, and unambiguously-worded questions, the inclusion of bad items can only hurt you, since you have nowhere to go but down. In contrast, students who score lower in the distribution, say, around 75%, have little to complain about, since it’s entirely possible for their score to <em>increase</em> due to the inclusion of bad questions. Students who score near the bottom would actually experience a beneficial effect, with noise generally increasing their scores. But since the distribution of scores is almost always top-heavy in academic settings (more people pass than fail!), the overall net effect of unreliability is to shift the distribution of scores slightly downwards. In most cases this isn’t a problem since most instructors implicitly account for this (e.g., by making some exams ‘easy’ in order to shift scores upwards), but it’s worth keeping in mind anyway. Even if the reliability of your test is very high, it may still make sense to throw out the worst questions in order to prevent a systematic slip in the distribution.</p>
<p class="MsoNormal">
<p class="MsoNormal">A second and more important concern is that establishing that a test is reliable doesn’t necessarily mean it’s a <em>valid</em> measure of students’ learning. A reliable test is simply one that measures the same thing consistently. Nothing about the reliability coefficient tells you <em>what </em>that thing is. There are lots of things you could measure consistently in student populations that have little or nothing to do with the course material you’re teaching. For example, if you like to write extremely tricky multiple choice questions that require students to perform rigorous exercises in logic (e.g., “the answer can’t be A, because only one of these answers is right, and A entails that B is true as well”), you may well end up with highly reliable tests. However, these tests may not be valid measures of students’ knowledge of, say, organic chemistry or developmental psychology, because in effect, by turning your exams into an exercise in logic, you’ve loaded the ability to reason abstractly into your questions. In other words, what determines whether students do well on your exams may turn out to be their general level of fluid intelligence, and not the degree to which they’ve studied and assimilated the material. So while an <em>un</em>reliable test is <em>always </em>a lousy test, a reliable test <em>may </em>still be a lousy test. The ability to easily calculate Cronbach’s alpha isn’t an excuse to stop worrying about <em>what</em> your exams are testing for. But it does let you establish that wording problems or ambiguity on some questions don&#8217;t have much of an impact on your overall ability to measure students&#8217; performance.</p>
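<p class="MsoNormal">For anyone who wants to see how little work the calculation takes, here is a minimal Python sketch of Cronbach’s alpha. The function name cronbach_alpha and the tiny table of 0/1 scores are made up for illustration and don’t come from any real exam. Keep in mind that the number it returns only reflects how consistently the items hang together; it says nothing about <em>what</em> they measure.</p>

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a students-by-items table of item scores.

    `scores` is a list of rows: one row per student, one column per item.
    """
    k = len(scores[0])  # number of items on the test
    # Variance of each item (column) across students.
    item_variances = [variance(column) for column in zip(*scores)]
    # Variance of each student's total score.
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 1 = correct, 0 = incorrect.
consistent = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
print(cronbach_alpha(consistent))
```

<p class="MsoNormal">Rows are students and columns are questions. In this toy table every student gets either everything right or everything wrong, which is the most consistent pattern possible, so alpha comes out at essentially its maximum of 1.</p>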
]]></content:encoded>
			<wfw:commentRss>http://www.smallgraymatters.com/2006/08/26/multiple-choice-tests-why-you-shouldnt-panic/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Is expertise under genetic control?</title>
		<link>http://www.smallgraymatters.com/2006/08/15/13/</link>
		<comments>http://www.smallgraymatters.com/2006/08/15/13/#comments</comments>
		<pubDate>Wed, 16 Aug 2006 06:32:53 +0000</pubDate>
		<dc:creator>small and gray</dc:creator>
		
		<category>musings</category>

		<category>behavioral genetics</category>

		<guid isPermaLink="false">http://www.smallgraymatters.com/2006/08/15/13/</guid>
		<description><![CDATA[Jonah Lehrer has a post over at Frontal Cortex today that follows up on his article in Seed a couple of weeks ago arguing that exceptional abilities are the result of extensive practice rather than genetic predisposition. My own view is that they&#8217;re probably not; or at least, I’m not sure the question is a [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal">Jonah Lehrer has a post over at <a href="http://scienceblogs.com/cortex/">Frontal Cortex</a> today that <a href="http://scienceblogs.com/cortex/2006/08/talent_and_practice.php">follows up</a> on his <a href="http://www.seedmagazine.com/news/2006/07/how_to_get_to_carnegie_hall.php">article in Seed</a> a couple of weeks ago arguing that exceptional abilities are the result of extensive practice rather than genetic predisposition. My own view is that they&#8217;re probably not; or at least, I’m not sure the question is a coherent one to begin with. At any rate, here’s what Lehrer says:</p>
<p class="MsoNormal">
<blockquote>
<p class="MsoNormal">For one thing, there&#8217;s a lot of empirical evidence that suggests I&#8217;m right. Virtually every psychological study that investigates expert &#8220;performers&#8221; - from chess grandmasters to concert pianists to brain surgeons - concludes that what separates these individuals from their peers is the amount of &#8220;deliberate practice&#8221; they are willing to endure. If there is an innate difference between Yo Yo Ma and a mediocre cellist, or between Tiger Woods and your golfing uncle, it is a willingness to practice, and not an innate aptitude for the cello or the 9 iron. As K. Anders Ericsson, a cognitive psychologist at Florida State University, wrote in his influential article &#8220;The Role of Deliberate Practice in the Acquisition of Expert Performance,&#8221; &#8220;The differences between expert performers and normal adults are not immutable, that is, due to genetically prescribed talent. Instead, these differences reflect a life-long period of deliberate effort to improve performance.&#8221;</p>
</blockquote>
<p class="MsoNormal">
<p class="MsoNormal">I’m inclined to disagree with this for several reasons. First, I think it mischaracterizes the notion of heritability. Saying that variance in a behavior is under genetic influence is patently <em>not</em> the same thing as saying it’s immutable. Consider that, in Western societies (where malnutrition isn’t an issue), height is almost entirely under genetic control. Does this mean height is immutable? Of course not. Take a child born to 6’6” parents and deprive it of its basic nutritional needs, and it’ll be lucky to reach average height. Saying that someone has ‘tall genes’ isn’t saying they have a fixed endowment that’ll express itself in the same way regardless of environment. It’s saying that, across a range of environments, a person born to tall parents is more likely to be tall than other people around them exposed to the same basic environment. It’s not clear why this should be at all controversial. Does anyone really believe that if 10 people sampled at random were each forced to practice the piano for exactly 10,000 hours, they’d all attain exactly the same level of skill?</p>
<p class="MsoNormal">
<p class="MsoNormal">A second problem relates to the false dichotomy between the effects of practice and genetically prescribed talent. These aren’t opposing factors; in fact, they’re completely orthogonal to one another. Saying that someone practices a lot isn’t actually saying anything about whether the contribution to the behavior is under genetic or environmental influence. The reason for this is simple: any number of genetic factors could drive a person to practice an ability to a greater or lesser degree. These include everything from very general factors such as fluid intelligence to specific cognitive abilities to personality factors to creativity to aesthetic sensibilities.</p>
<p class="MsoNormal">
<p class="MsoNormal">Take the case of prodigious musical ability. It may be comforting to think that we could all be Yo Yo Ma if we <em>really wanted to</em>. But it’s almost certainly not true: the vast majority of the population would never be able to play the cello like Yo Yo Ma no matter how much they practiced or how early they started. And the reason isn’t that there’s something magical about Yo Yo Ma’s brain—some single amorphous genetic talent he possesses that other people just don’t. It’s much more likely that he simply possesses lots of little genetic quirks that virtually no one else happens to have the right combination of (for playing the cello really well, at least).</p>
<p class="MsoNormal">
<p class="MsoNormal">What does it take to be Yo Yo Ma? Well, here’s a very short list of just a few factors that are likely to be under considerable genetic control and undoubtedly make one more likely to be a good cello player: a certain amount of general intelligence; a certain amount of visuospatial ability; good sequencing skills; a liking of music; absolute pitch; long, slender fingers; agility; a tremendous degree of personal motivation; and high levels of conscientiousness. Of course, plenty of people (indeed, the vast majority) will be above average on one or more of these dimensions. But that’s not the point. The point is that being a great cellist isn’t about having a magically different brain. It’s about having a particular combination of abilities, many of which are genetically influenced, that just happens to make one well-suited to playing the cello. That’s not in any way denying that practicing thousands of hours is <em>necessary </em>in order to achieve Yo Yo Ma’s stature. It’s just saying that it’s not sufficient.</p>
<p class="MsoNormal">
<p class="MsoNormal">Lehrer actually seems to concede this point to some degree when it comes to personality factors:</p>
<p class="MsoNormal">
<blockquote>
<p class="MsoNormal">If there is a genetic element linking Mozart and Jordan it is the talent for practice itself, a willingness to endure the endless hours of sweat and toil required of all great performers.</p>
</blockquote>
<p class="MsoNormal">
<p class="MsoNormal">This is quite clearly true, since about half of the variance in personality traits like conscientiousness is sucked up by additive genetic influences in most twin studies. But there’s no reason to suppose the buck stops with personality. Why should Mozart and Jordan’s relevant genetic endowment differ from other people’s only when it comes to motivation? Isn’t Jordan’s height under genetic influence? Should we really believe that a 5’3” male with a 12” vertical leap could become the world’s greatest basketball player given enough effort? Or that someone who’s tone deaf and has a congenital hand tremor is as likely to produce virtuoso violin performances after 10,000 hours of practice as someone who has absolute pitch and excellent motor control? These are extreme examples, but they’re only quantitatively and not qualitatively different from the vast majority of individuals, who are likely to be somewhat taller than 5’3” and to be neither tone deaf nor have absolute pitch.</p>
<p class="MsoNormal">
<p class="MsoNormal">A third problem with the claim that expertise doesn’t depend on innate factors stems from the fact that none of the evidence Lehrer cites, including Ericsson’s line of research, really addresses the fundamental issue. Ericsson’s work, particularly the Psych Review article Lehrer cites, is a textbook case of attacking a straw man (just to be clear, I think it&#8217;s excellent research when framed <em>as a study of expertise or of the structure of memory</em>; it&#8217;s specifically the claim that expertise isn&#8217;t innate that&#8217;s problematic). The argument has the following form: (a) in virtually all expert populations studied to date, long hours of practice are a defining feature; (b) the number of hours of practice is positively correlated with ability; (c) most people don’t practice a lot and aren’t very good; therefore we can conclude (d) that practice is responsible for expertise and innate factors have little or nothing to do with it.</p>
<p class="MsoNormal">
<p class="MsoNormal">What’s the problem with this reasoning? Well, consider the following analogue often found in the developmental psychology literature: (a) violent parents tend to abuse their children; (b) those children tend to perpetuate the ‘cycle of violence’ by abusing their own children when they grow up; (c) the amount of violent behavior displayed by children correlates with that displayed by their parents; so (d) we can conclude that parental behavior causes aggression in children.</p>
<p class="MsoNormal">
<p class="MsoNormal">The problem with the latter chain of reasoning should be obvious: one can’t conclude that parental environmental factors are the key contributors to children’s behaviors without explicitly modeling the shared genetic variance, because correlation doesn&#8217;t entail causation. And when you <em>do </em>model the genetic variance, parental influence almost invariably represents a negligible amount of the total variance, whereas additive genetic factors typically account for about half. This may be counterintuitive, but the explanation is simple enough: violent parents pass on violent genes to their kids, and the genetic contribution seems to dwarf the influence of parents’ overt behavior.</p>
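<p class="MsoNormal">One classic, admittedly rough way to do this kind of modeling is Falconer’s formula, which compares how similar identical and fraternal twins are on a trait. The Python sketch below is illustrative only: the function name and the two correlations are hypothetical, and real behavioral genetic studies generally fit more elaborate models than this.</p>

```python
def falconer_estimates(r_mz, r_dz):
    """Split trait variance into three shares from twin correlations.

    r_mz: trait correlation between identical (monozygotic) twins
    r_dz: trait correlation between fraternal (dizygotic) twins
    """
    genetic = 2 * (r_mz - r_dz)   # additive genetic share (heritability)
    shared_env = 2 * r_dz - r_mz  # shared family environment, parents included
    nonshared_env = 1 - r_mz      # unique experiences plus measurement noise
    return {
        "genetic": genetic,
        "shared_env": shared_env,
        "nonshared_env": nonshared_env,
    }


# Hypothetical correlations, chosen purely for illustration.
print(falconer_estimates(r_mz=0.5, r_dz=0.25))
```

<p class="MsoNormal">With these made-up inputs, half of the variance lands in the genetic share and none in the shared family environment, which is the general pattern described above.</p>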
<p class="MsoNormal">
<p class="MsoNormal">The exact same problem applies to correlational studies of expertise. It’s surely not remarkable to note that thousands of hours of practice are necessary to become a world-class expert in almost any field. The very notion of ‘expertise’ practically requires as much: if it could be acquired in a matter of hours, everyone would be an expert, and the term would lose all meaning. The issue isn’t whether or not practice is the <em>proximal </em>cause of expertise. It’s whether or not genetic factors underlie the drive and desire to practice itself. And that’s a question that <em>simply cannot be addressed without explicitly modeling the genetic contribution to a behavior</em>. You simply can’t tell just by studying an individual grandmaster who’s played 60,000 hours of chess whether they’re that good at chess because they’re naturally gifted, or because they were pushed to play chess <em>in spite</em> of a lack of natural ability.</p>
<p class="MsoNormal">
<p class="MsoNormal">Consider: if you had an IQ of 80 and had trouble learning the rules of chess, would you keep playing the game and making mistakes? Probably not. If you were merely mediocre, and lost to everyone else in your chess club, would you keep plugging away at it for thousands of hours? Possibly, but it’s unlikely. The people who end up practicing a single skill for thousands of hours are almost invariably those who (a) love what they do; (b) have an unusual level of drive; and (c) find it comes naturally to them. People tend not to practice at things they don’t like, can’t keep at, or don’t seem to be any good at. All of these dispositions are, of course, at least partially (and probably substantially) under genetic control.</p>
<p class="MsoNormal">
<blockquote>
<p class="MsoNormal">But there is virtually no evidence that expert performers are born with extraordinary brains. In fact, the average IQ of people at the top of their field - whether they are surgeons or politicians, pianists or painters - equals that of the average college student. In other words, their expertise is very specific, confined to a particular &#8220;cognitive domain&#8221;.</p>
</blockquote>
<p class="MsoNormal">
<p class="MsoNormal">Two problems here. First, as noted above, expert performers don’t have to differ in some general intellectual domain in order for their skills to be driven by genetic influences. Why shouldn’t we think that oratorical, artistic, or motoric skills are under some amount of genetic control independently of general factors such as fluid intelligence? To argue otherwise would be to disregard any amount of evidence from behavioral genetics. And secondly, the domain specificity argument is also a red herring. The fact that skills acquired in one domain don’t easily generalize to other domains doesn’t say anything about the role of genetics if one grants that genetic predispositions can be highly specialized, or that different domains rely on different combinations of abilities.</p>
<p class="MsoNormal">
<blockquote>
<p class="MsoNormal">But practice doesn&#8217;t just change which brain areas are activated by a certain task. It also leads to anatomical changes within those same brain regions. For example, the brains of expert violin players have swollen representations of the fingers of their left hand in the somatosensory cortex. This increase in neural space makes Bach easier to play.</p>
</blockquote>
<p class="MsoNormal">
<p class="MsoNormal">Again, this is true, but says nothing about the role of nature and nurture. The issue has never been whether plasticity occurs as a result of practice (it does!); it’s whether variations in the degree of plasticity across individuals are due to genetic or to environmental factors. Expert violin players are presumably individuals whose somatosensory cortex is <em>capable</em> of adapting to the demands of an instrument over time. It’s entirely possible that the vast majority of humanity <em>doesn’t</em> display the same range of plasticity. Without modeling the genetic variance, there’s no way to tell.</p>
<p class="MsoNormal">
<p class="MsoNormal">This leads to the gauntlet Lehrer throws down in a comment following the post:</p>
<p class="MsoNormal">
<blockquote>
<p class="MsoNormal">In fact, the only innate talent that talented people seem to contain is a talent for practice. But if there are scientific studies that suggest otherwise, I&#8217;d love to hear about them.</p>
</blockquote>
<p class="MsoNormal">Just to reiterate, let’s be clear that none of the data Lehrer discusses in his post offer any support for the notion that innate differences <em>don’t </em>contribute to expert performance and skill acquisition. In the absence of direct evidence, the reasonable position would be to default to estimates of heritability obtained in non-expert domains as an appropriate estimate of the genetic variance. For example, fluid intelligence and most major personality dimensions are around 50-60% heritable in most studies. Given such results, the default hypothesis should probably be that expertise is under substantial genetic control.</p>
<p class="MsoNormal">
<p class="MsoNormal">That said, there’s a big problem associated with the very notion of estimating heritability for expert populations, which is that the estimate obtained will be highly dependent on the type of sample used. If, for example, you were to recruit a genuinely random sample from the population of, say, American adults, the likely result would be massive overinflation of the environmental contribution to expertise. The reason is that, for any given domain of expertise, there are only a small number of experts, and the vast majority of the population has never so much as attempted to acquire the ability in question. By way of analogy, if you studied the genetic contribution to ice hockey-playing ability in 12-year-olds living in central Arkansas, virtually all of the variance would be environmental, simply because most kids in central Arkansas have probably never laced up skates in their lives. Conduct the same study in Quebec, and a large chunk of the variance will be genetic, because hockey is heavily emphasized in the culture and almost everyone gets the opportunity to try their skills out.</p>
<p class="MsoNormal">
<p class="MsoNormal">The opposite extreme isn’t very useful either: if you conducted a twin study that only sampled experts with more than, say, 10,000 hours of practice in a given domain, you’d inflate the genetic variance, since there’d be a severe reduction in environmental variance in the sample. Put differently, when everyone gets the same environmental treatment, any differences in behavior <em>must </em>be the result of genes (or just random noise).</p>
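<p class="MsoNormal">To make the sampling problem concrete, here’s a toy Python sketch. It assumes a purely additive model in which heritability is just the genetic share of the total variance, and it holds the genetic variance fixed while the environmental variance changes from sample to sample. Both are simplifications, and all of the numbers are invented for illustration.</p>

```python
def heritability(genetic_variance, environmental_variance):
    """Genetic share of total trait variance in a purely additive model."""
    return genetic_variance / (genetic_variance + environmental_variance)


# Hypothetical variances in arbitrary units; only the ratios matter.
GENETIC_VARIANCE = 1.0
samples = {
    "random adults (most never tried the skill)": 9.0,
    "everyone gets a fair chance to try": 1.0,
    "experts only (near-identical practice)": 0.25,
}

for label, environmental_variance in samples.items():
    print(label, heritability(GENETIC_VARIANCE, environmental_variance))
```

<p class="MsoNormal">The same genetic variance yields a small heritability estimate when environments differ wildly and a large one when everybody has had roughly the same training, which is why the estimate depends so heavily on who ends up in the sample.</p>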
<p class="MsoNormal">
<p class="MsoNormal">What’s the happy medium? Well, there really isn’t one. But probably the best indirect evidence stems from studies that have looked at genetic and environmental contributions to skill learning on a compressed timeframe (i.e., across hours instead of thousands of hours). A study of this kind was reported by Fox et al. in Nature in 1996 (“<a href="http://www.nature.com/nature/journal/v384/n6607/abs/384356a0.html">Genetic and environmental contributions to the acquisition of a motor skill</a>”). In it the authors demonstrated convincingly that a substantial portion of the variance in both levels of performance and rates of skill acquisition was under genetic control. The latter point is particularly compelling given the current context, because it’s easy to forget that people can differ not only in how good they are at something, but in how fast they pick it up. Saying that thousands of hours of practice are required to become an expert is misleading, in that some people might need 12,000 and others only 7,000. These aren’t trivial differences.</p>
<p class="MsoNormal">
<p class="MsoNormal">In sum, what can we conclude about the heritability of expertise, given available data? The answer is, unfortunately, not much. Virtually all the data promoted as evidence for a practice model is irrelevant, because it attacks a straw man. The issue isn’t whether practice leads to expertise; it’s what gives rise to the tendency to practice a particular skill in the first place. Conversely, there isn’t much hard data that <em>does </em>approach the issue from the appropriate perspective (i.e., that of behavioral genetics). So in the interim, the appropriate position is probably to default to existing estimates obtained in non-expert domains, and maintain that it’s a bit of both: genetic and environmental contributions both influence expert performance. It’s not clear that we’re going to get a more meaningful answer.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.smallgraymatters.com/2006/08/15/13/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
