University of California, Office of the President
To develop and deploy infrastructure necessary to elevate data to a first-class research output
One obstacle to developing effective data citation practices is that data does not behave like a published article. A dataset can be far more complex, can exist in many successive versions (none of which is canonical), and may be used only in part by a given study. An effective data citation regime must reflect the multitude of ways data can be used in research. The California Digital Library (CDL) took up these issues in a 2014 National Science Foundation planning study that explored the idea of “data level metrics” and sought to determine which metrics would be of most value to researchers. This grant funds an expansion of that work, as CDL assembles a coalition to implement the findings. Over the next two years, CDL will bring together the organization that mints DOIs for datasets (DataCite) and the organization that manages the standard for article download and access data (COUNTER) with a collection of data repositories (DataONE) in order to implement best practices for data citation using extensions to the popular Lagotto article usage tracking software. Beyond its own implementation, the collaboration will work with the Research Data Alliance to build consensus for these best practices and technical solutions and to recruit additional repositories to adopt them.