This is a discussion paper published by GBIF that examines the roles of citation, peer-review and intellectual property rights in the context of biodiversity data publishing.
An unprecedented amount of biodiversity data is becoming available on the internet.
However, significant amounts of data, particularly historic data, are not available online.
The Global Biodiversity Information Facility (GBIF) publishes millions more primary biodiversity
data records every year, but finds that these represent a decreasing proportion of the data
that could potentially be published. Because data sharing agreements and policies alone are
insufficient, new approaches are required to accelerate data publication. Only in the past
few years have scientists begun calling for data ‘citation’ and referring to data
‘publication’ rather than data ‘sharing’ and ‘availability’. Issues of intellectual property
rights (IPR) only complicate data access in the latter contexts. In contrast, the
‘publication’ process has well-established conventions that simplify and clarify IPR issues.
Concerns over data quality impede the use of large biodiversity databases by researchers
and subsequent benefits to society. Peer-review is the standard mechanism used to
distinguish the quality of scientific publications. Here, we argue that the next step in data
publication is to include the option of peer-review. Data publication can be similar to the
conventional publication of articles in journals that includes online submission, quality
checks, peer-review, and editorial decisions. This quality-assurance process will at least
assess, and could potentially improve, the accuracy of the data. That in turn reduces the
need for users to ‘clean’ the data and thus increases data use, while the authors and/or
editors get due credit for a peer-reviewed (data) publication. Adoption of international
and community-wide standards related to data citation, accessibility, metadata, and
quality control would enable easier integration of data across datasets. Metadata, for
example, would include relevant information about the datasets that would enable a user
to better understand the data and determine its suitability for use for particular purposes.
It is recognized that a significant amount of data is already published without peer-review,
both through GBIF and other databases, and through various internet and print media. This
will continue. However, providing a scale of quality assurance, of which the highest
standard is peer-review, will both improve quality assurance and attract the attention of
scientists and organizations that place little value on non-peer-reviewed publications. Most
steps in the process proposed here are already undertaken by GBIF and/or some of its
participants. The peer-review process is well-established in the science community,
including peer-review of biodiversity data by several journals. Thus the process proposed
here is practical and does not pose new technical difficulties. It may be implemented by
GBIF in collaboration with its participants and science journals.
Data publications should strive to be of similar merit to other peer-reviewed publications,
and thus be recognized by employers, funding agencies and scientists as a meritorious
activity. This will require metrics of data use, such as views, downloads and citations.
Here, we propose a staged publication process involving editorial and technical quality
controls, of which the final (and optional) stage includes peer-review.
Mark J Costello, William K Michener, Mark Gahegan, Zhi‐Qiang Zhang, Phil Bourne, Vishwas Chavan (2012). Quality assurance and intellectual property rights in advancing biodiversity data publications ver. 1.0, Copenhagen: Global Biodiversity Information Facility, 33 pp., ISBN: 87‐92020‐49‐6. Accessible at http://links.gbif.org/qa_ipr_advancing_biodiversity_data_publishing_en_v1.