Names (or Parents?) Make a Difference
I just read an article about a study showing that a girl's name
can be used to predict whether she will study math or physics after
the age of 16. The study, done by David Figlio, professor of economics
at the University of Florida, indicated that girls with "very feminine"
names, such as Isabella, Anna, and Elizabeth, are less likely to study
hard sciences than girls with names like Grace or Alex.
Personally, I find it hard to understand how someone can estimate the
"femininity" of a name, but it might be just me. Even if there is such
a scale, though, I do not see the causality that the article implies
in the finding. (I see predictive power, but no causality.) In my own
interpretation, parents who choose "very feminine" names also try to
steer their daughters towards more "feminine" careers. I cannot
believe that names by themselves set a prior probability on the career
path of a child. (The Freakonomics book had a similar discussion about
names and success.)
Oh well, how you can lie with statistics...
Posted by Panos Ipeirotis at 11:40 AM
Tuesday, May 8, 2007
Replacing Survey Articles with Wikis?
Earlier this year, Ahmed Elmagarmid, Vassilios Verykios, and I
published a survey article in IEEE TKDE on duplicate
record detection (also known as record linkage, deduplication, and
by many other names).
Although I see this paper as a good effort in organizing the
literature in the field, I will be the first to recognize that the
paper is incomplete. We tried our best to include every research
effort that we identified, and the reviewers helped a lot in this
respect. However, I am confident that there are still many nice papers
that we missed.
Furthermore, since the paper was accepted for publication, many more
papers have been published, and many more will be published in the
future. This means that the useful half-life of (any?) such survey is
necessarily short.
How can we make such papers more relevant and more resistant to
obsolescence? One solution that I am experimenting with is to make the
survey article a wiki, and then post it to Wikipedia, allowing other
researchers to add their own papers to the survey.
I am not sure that Wikipedia is the best option, though, because of
licensing issues. A personal wiki may be a better option, but I do not
have a good grasp of the pros and cons of each approach. One of the
benefits of Wikipedia is the existence of nice templates for handling
citations. One of the disadvantages is the copyright license of
Wikipedia, which may discourage (or prevent) people from posting
material there.
Furthermore, it is not clear that a wikified document is the best way
to organize a survey. A few days back, I got a (forwarded) email from
Foster Provost, who was seeking my opinion on the best way to
organize an annotated bibliography. (Dragomir Radev had a similar
question.) Is a wiki the best option? Or is it by construction too
flat? Should we use some other type of software that allows people to
generate explicit, annotated connections between the different papers?
(Any public tool?)
 