Why we started a task force on gender and diversity policies in science (and not only)
An article by our member Antonella Succurro opens a debate on gender bias in the evaluation of scientific careers.
An article published on November 17 in Nature Communications states it loud and clear: having a woman as a mentor harms your scientific career. After a massive uproar from the scientific community on social media, the article has been (since November 19) under investigation for retraction. Although we could consider this prompt reaction “good”, the question of how such an article could be published in the first place remains. And I want to make it clear: I am not writing this out of resentment because I feel offended by the conclusions; I write this because the scientific process behind the research presented there is flawed. So let’s be clear and say it one more time: an article should not be retracted simply because many people on social media dislike its conclusions. Science is not a democratic process, and you don’t get to vote on whether a neutrino should go faster than light.
So let’s have a look at what went wrong here, which in my opinion falls into three main categories: (i) not accounting for, nor even acknowledging, systemic biases; (ii) a flawed editorial process that largely ignored the reviewers’ concerns about the scientific methods; and (iii) the authors’ stubborn insistence on certain conclusions that are wrong and unnecessary (or even deleterious) for determining the scientific value of the work. The following summary is not exhaustive, but it is a starting point.
i) Systemic bias
First, let me point out that the authors do nothing to acknowledge that their “gender analysis” (or, to be precise, their “perceived gender from first name analysis”, as pointed out in a personal Twitter conversation with Dr. Sophia Frentz) is not inclusive of non-binary and transgender scientists. With the recent news that journals will allow scientists who transitioned to retroactively change their name on publications, this shortcoming highlights a total disconnect from the societal context of the topics the authors are investigating. You would expect a scientist making an extraordinary claim about diversity policies to at least be familiar with the ongoing debates. This article’s specific extraordinary claim reads:
<<Our gender-related findings suggest that current diversity policies promoting female–female mentorships, as well-intended as they may be, could hinder the careers of women who remain in academia in unexpected ways. Female scientists, in fact, may benefit from opposite-gender mentorships in terms of their publication potential and impact throughout their post-mentorship careers. Policy makers should thus revisit first and second order consequences of diversity policies while focusing not only on retaining women in science, but also on maximizing their long-term scientific impact.>>
So the authors are essentially suggesting that diversity policies be revisited based on their results. However, the authors are looking at historical data, evaluating mentorship quality and mentorship outcome (aka success) of the protégés based on standard impact metrics such as the number of citations and publications. The mentorship outcome is computed, for each publication the protégé wrote without the mentor, as “the number of citations that it accumulated 5 years post publication”, and only once “the academic age of the protégé was greater than 7 years”. This means that for any protégé academically “younger” than 12 years, the data are not usable. The authors are therefore making extraordinary claims (“revisit diversity policies”) from the analysis of mentor–protégé pairs who were active during years when the diversity policies they want to revisit were, most probably, not even in place. This is both a systemic and a systematic bias. Moreover, I would expect someone writing about gender in academia to know about the “leaky pipeline” problem, with women dropping out of academic careers at the more senior levels, and to acknowledge it as a major limitation of any results they claim to have. Quoting Carl Sagan, “Extraordinary claims require extraordinary evidence”, but I do not see any extraordinary evidence here. Instead, I see a lot of methodological problems, detailed in the next section.
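To make the temporal limitation concrete, here is a minimal Python sketch of the eligibility window implied by the quoted definitions. This is my own illustration under my reading of the paper, not the authors’ code: a protégé’s publication counts toward the “mentorship outcome” only once their academic age exceeds 7 years, and each such publication then needs a further 5 years of citation history, so no protégé with an academic age below 12 contributes usable data.

```python
# Illustration only: eligibility window implied by the paper's outcome metric.
# Parameter values are taken from the definitions quoted above.
MENTORSHIP_PHASE_YEARS = 7  # outcome counted only after academic age 7
CITATION_WINDOW_YEARS = 5   # citations accumulated 5 years post publication

def outcome_publications(first_pub_year, pub_years, current_year):
    """Return the protégé's publication years that can contribute to the
    'mentorship outcome': written after academic age 7, with a full
    5-year citation window already elapsed."""
    eligible = []
    for year in pub_years:
        academic_age_at_pub = year - first_pub_year
        if (academic_age_at_pub > MENTORSHIP_PHASE_YEARS
                and current_year - year >= CITATION_WINDOW_YEARS):
            eligible.append(year)
    return eligible

# A protégé who started publishing in 2012: by 2020 their academic age is 8,
# but no post-mentorship publication has had 5 years to accumulate citations.
print(outcome_publications(2012, [2013, 2016, 2020], 2020))  # -> []
# A protégé who started in 2000 does contribute usable data points.
print(outcome_publications(2000, [2002, 2009, 2012], 2020))  # -> [2009, 2012]
```

In other words, the measured cohorts skew toward careers that began well over a decade ago, long before the diversity policies under discussion.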
ii) Red flags from peer review ignored
The reviewers’ report, together with the authors’ rebuttal, is available and enlightening. Three out of four reviewers raised relevant red flags that should have prevented the manuscript from being published with its current claims, and two out of four remained adamant after the first round of review that the article should not be published with certain content. In particular, the authors claim they are analyzing “informal mentorship in academic collaborations”. However, they use academic co-authorship data and make various assumptions about how “junior” scientists are mentored by “senior” co-authors. In this way, and under these assumptions, they automatically identify about 3 million mentor–protégé pairs.
Equating co-authorship with mentorship is a long stretch and completely unjustified, as the reviewers pointed out very strongly. In response to the reviewers’ criticism, the authors prepared a survey that was supposed to prove that the protégé had received some kind of mentorship from the mentor. The details of the survey are available in the Supplementary Material, and I find its design very poor. Basically, they define as mentorship receiving advice on “writing”, “research / study design (methodology, experimental design, etc.)”, “data analysis/modeling”, “addressing reviewer comments”, or “selecting a venue for publication” in the context of the co-authored publication that identified them as a mentor–protégé pair. These kinds of advice are literally the basic elements of co-authoring a paper when you work in a team (did I mention that one of the conditions for being selected as a mentor–protégé pair is belonging to the same US-based institution?). Then comes the second part of the survey, according to which answering “True” to the statement “I received a letter of recommendation from him/her for a fellowship/award or job application” counts as mentorship.
In addition to the poor design, the reported statistics are not impressive and leave serious doubts about the sampling reliability. The survey was sent out to 2000 randomly selected protégés (out of 3 million), but only 167 filled it in. This is unfortunate, and although I acknowledge that it is often difficult to get responses to such requests, I wonder how hard the authors tried (they don’t mention reminder emails, for example). More troublesome, though, is that ~40% of their randomly selected protégés replied that they were in a formal mentorship relationship (“thesis advisor” or “committee member of [their] thesis”) with the identified mentor. Yet the authors claim in the title and in the main manuscript that they investigate informal mentorship. What is even more troublesome is that in their rebuttal to the reviewers they state that, because of this survey result, they “have modified both the main manuscript as well as the Supplementary Materials to talk about mentorship in general, without specifying whether it is formal or informal”. Yet “informal mentorship” is still in the title, as well as in a relevant part of the manuscript where the authors try to explain what makes their work different from another published work. Which brings me to the next point of discussion.
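The doubt about sampling reliability can be made quantitative. Even taking the responses at face value, a back-of-the-envelope 95% confidence interval around the ~40% formal-mentorship figure (a standard Wilson score interval; my own calculation, not one from the paper) spans roughly a third to a half of respondents, and the single-digit response rate leaves ample room for self-selection bias:

```python
import math

# Survey numbers as reported in the article's Supplementary Material.
sent, responded = 2000, 167
formal_fraction = 0.40  # ~40% reported a formal mentorship relationship

response_rate = responded / sent  # roughly 8%: self-selection is plausible

def wilson_interval(p, n, z=1.96):
    """95% Wilson score interval for a sample proportion."""
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    margin = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - margin, center + margin

low, high = wilson_interval(formal_fraction, responded)
print(f"response rate: {response_rate:.1%}")
print(f"formal mentorship, 95% CI: {low:.1%} - {high:.1%}")  # roughly 33% - 48%
```

So with 167 respondents, somewhere between about a third and nearly half of the sampled pairs plausibly involve formal mentorship, which is hard to reconcile with a title about informal mentorship.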
iii) Big claims, big rewards
In conclusion, I want to raise my own red flag about an all-too-common behavior that this article illustrates well. The authors had, in principle, a decent piece of work: analyzing millions of co-authorships to understand the impact of scientific collaborations, something always useful for gaining more insight into current scientific performance measures (which, we all know, need to be improved). However, the authors chose to stretch their work and their conclusions, claiming they were investigating mentorship (which they were not, see above) and drawing conclusions about diversity policies despite clear temporal inconsistencies with their data.
I can only guess, but I suspect they badly needed to present their work as novel enough to make it into a high-impact-factor journal and controversial enough to gather lots of citations later on. “There is only one thing in the world worse than being talked about, and that is not being talked about” [Oscar Wilde’s Dorian Gray]. It looks like the academic reward system encourages this kind of behavior, and that is deeply sad. Because now, much as with the infamous “vaccines cause autism” paper, the word is out there that women mentors are bad for your scientific career, and not many people will care to read a rebuttal, or a retraction report, or whatever comes next. It’s sad and dangerous and, I’m afraid, it will happen again.