The FT program received 17 proposals in January, 12 in February, and 7 in March. We now have the answer for April: 8 proposals. Does this mean that initial enthusiasm for the program has died down, and 7-8 proposals is the new steady state? Or are we simply experiencing temporary “proposal fatigue” after the recent round of Gemini, ALMA, HST, and numerous other deadlines?
If the FT program isn’t meeting your needs (or if you think it’s great and don’t want us to change anything!), we’d very much like to hear from you. If you’re going to be at the Future and Science of Gemini meeting in Toronto in June, we will be giving a talk and presenting a poster about the FT program. This will be a good opportunity for you to give us your thoughts, so please keep an eye out for Rachel Mason and Tom Geballe, who will be representing the FT team at the meeting. Your Users’ Committee representatives will also be happy to pass on your feedback.
Just for fun, here are some of the figures we recently made for a report on the FT pilot for the Science and Technology Advisory Committee and the Gemini Board of Directors. Without going into great detail right now, our preliminary conclusions are that:
- The review process gives fairly good agreement on the strongest and weakest proposals. Using a clipped (rather than raw) mean score to rank the proposals would (unsurprisingly) give a stronger signal, but it would not have changed the outcome of the selection process so far.
- There is no evidence of people unfairly down-grading other proposals in an attempt to promote their own (although subtle patterns would be difficult to detect in the limited data currently available). Where people give low scores, they almost always use the full range.
- Expert reviewers are no harsher than non-expert ones. At least, most of the very low scores have been given by people who describe themselves as knowing little about the field. This probably means that proposers should make an effort to make their science accessible to a broad audience.
- The minimum score necessary for any proposal to be awarded time, no matter how much time is available, is 2.0. This is a little lower than the mean, so we may consider increasing that threshold in future cycles.
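The "clipped" mean mentioned above can be illustrated with a short sketch: each proposal's single highest and lowest grades are dropped before averaging, which damps the effect of any one outlying reviewer. The scores and proposal IDs below are invented for illustration, and the grading scale is an assumption (the actual FT scale is not spelled out here).

```python
def raw_mean(scores):
    """Plain average of all reviewer grades for one proposal."""
    return sum(scores) / len(scores)

def clipped_mean(scores):
    """Average with the single top and bottom grades removed."""
    trimmed = sorted(scores)[1:-1]  # drop lowest and highest
    return sum(trimmed) / len(trimmed)

def rank(proposals, mean_fn):
    """Return proposal IDs ordered from strongest to weakest."""
    return sorted(proposals, key=lambda p: mean_fn(proposals[p]), reverse=True)

# Invented example grades (proposal ID -> list of reviewer scores).
proposals = {
    "A": [4, 3, 3, 3, 1],  # one outlying low grade
    "B": [3, 3, 2, 2, 2],
    "C": [4, 2, 2, 1, 1],  # one outlying high grade
}

print(rank(proposals, raw_mean))      # ['A', 'B', 'C']
print(rank(proposals, clipped_mean))  # ['A', 'B', 'C']
```

In this toy example the clipping widens the gap between the strongest and weakest proposals (A moves from 2.8 to 3.0, C from 2.0 to about 1.7) without changing the ranking, which is the behavior the bullet point describes.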
Left: Grade vs. rank for the cycle 1 (top), 2 (middle), and 3 (bottom) proposals. The dashed line shows the minimum score that must be achieved for any proposal to be awarded time, regardless of whether it will fit in the available hours. Accepted proposals are shown in green, rejected ones in red. The green/red points indicate the mean scores (on which the final rankings are based), while the bars show the range of scores received. For comparison we also show “clipped” means and ranges, with the top and bottom scores removed (grey points, filled bars). Filled bars are not visible for proposals 5 and 7 in cycle 1, as all but two reviewers gave them a score of 3 (clipped range = 0). Proposal 12 in cycle 2 was withdrawn partway through the cycle, its proposers having acquired the requested data elsewhere, and the remaining reviewers were instructed to enter a score of 0.
Middle: Grades given by each reviewer in cycles 1 (top), 2 (middle), and 3 (bottom). Red points show each reviewer’s mean grade, and bars show the range of grades assigned. The overall mean is indicated by the blue dotted line in each plot, while the minimum score necessary for a proposal to be awarded time is shown by the dashed grey line.
Right: Distribution of proposal scores in cycles 1 (top), 2 (middle), and 3 (bottom), separated by each reviewer’s self-assessment of their knowledge of the field of the proposal.