University of California, Berkeley
To develop open source R software and training to support various parts of the research process, including data publication, data integration, and reproducibility
In 2013, the Foundation approved a one-year grant to rOpenSci, a collective of data scientists, to build and promote a suite of “packages” for R, a powerful programming language and software environment for statistical computing and graphics. The packages aimed to greatly simplify the process of gathering data from various archives and services commonly used by researchers. Such software modules dramatically lower the barriers to R use, freeing researchers from having to write their own idiosyncratic code when parsing data from commonly used repositories like Dryad, the Global Biodiversity Information Facility, or the Biodiversity Heritage Library. This grant provides continued support for this project. The project team will continue software development, shifting their focus to several generic needs like spatial data analysis and the submission of data to repositories for publication, as well as supporting R interoperability with popular emerging tools for data management like Dat. To further lower barriers to R use in data-driven research, rOpenSci will also develop openly licensed curricular “modules” that could be incorporated into graduate seminars or informal workshops. To speed adoption, rOpenSci will cultivate an initial cohort of a dozen “ambassadors” from across the natural and social sciences who will develop domain-specific R packages and lead various outreach and community-building efforts in their home disciplines.