
Publications: archive file
Archive
The archive containing all publications should be a list of simple Dublin Core records. There are two ways of encoding such an archive: a simple CSV text file, or XML.
CSV archive
A CSV archive is a text file in which each row represents a single publication. This format is very simple to produce and is compatible with the Darwin Core text guidelines, in particular the ECAT references extension.
Its main limitation is that it does not allow line breaks within a metadata value, and line breaks are common in abstracts. If you do not have abstracts, or can replace their line breaks (for example with spaces), please consider this format. A simple example file with only one record looks like this:
<pre>
dc:identifier link dc:bibliographicCitation dc:title dc:creator dc:date dc:source dc:subject dc:description
doi:10.1038/ng0609-637 Hartge, P., Genetics of reproductive lifespan. Nature Genetics 41, 637 - 638 (2009) Genetics of reproductive lifespan Patricia Hartge 2009-06-01 Nature Genetics 41, 635 (2009) genomics, epidemiology Five genome-wide association studies of the timing of menarche and menopause have now taken us beyond the range of candidate gene and linkage studies. The list of new genetic associations identified for these two traits should shed light on the mechanisms of ovarian aging, as well as breast cancer and other diseases associated with reproductive lifespan.
...
</pre>
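Producing such a file can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tab delimiter is a guess (the page does not name the separator), the helper names `flatten` and `write_archive` are hypothetical, and the abstract text is a placeholder. Line breaks inside a value are collapsed to single spaces so that each record stays on one row.

```python
import csv
import io

# Column order copied from the header row of the example above.
COLUMNS = [
    "dc:identifier", "link", "dc:bibliographicCitation", "dc:title",
    "dc:creator", "dc:date", "dc:source", "dc:subject", "dc:description",
]


def flatten(value):
    """Collapse line breaks, tabs and runs of whitespace into single spaces."""
    return " ".join(value.split())


def write_archive(records, out, delimiter="\t"):
    """Write one row per publication; records are dicts keyed by column name.

    The tab delimiter is an assumption, not something the format page states.
    Columns missing from a record are written as empty fields.
    """
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow(COLUMNS)
    for record in records:
        writer.writerow([flatten(record.get(col, "")) for col in COLUMNS])


# Hypothetical record: the abstract is placeholder text with a line break.
record = {
    "dc:identifier": "doi:10.1038/ng0609-637",
    "dc:title": "Genetics of reproductive lifespan",
    "dc:creator": "Patricia Hartge",
    "dc:date": "2009-06-01",
    "dc:description": "First line of the abstract.\nSecond line of the abstract.",
}

buffer = io.StringIO()
write_archive([record], buffer)
archive_text = buffer.getvalue()
```

Writing to an in-memory buffer keeps the sketch self-contained; in practice `out` would be a file opened with `newline=""` and UTF-8 encoding.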
XML archive
Exactly the same information can also be encoded as XML, which allows for line breaks and markup within the abstracts. A simple XML schema is provided to validate resources encoded in Dublin Core alone. The above example would look like this:
<?xml version="1.0" encoding="UTF-8"?>
<resources xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xsi:noNamespaceSchemaLocation="http://gbif-ecat.googlecode.com/files/publication_archive.xsd">
  <resource>
    <dc:identifier>doi:10.1038/ng0609-637</dc:identifier>
    <dc:identifier>http://www.nature.com/ng/journal/v41/n6/pdf/ng0609-637.pdf</dc:identifier>
    <dc:title>Genetics of reproductive lifespan</dc:title>
    <dc:creator>Patricia Hartge</dc:creator>
    <dc:date>2009-06-01</dc:date>
    <dc:source>Nature Genetics 41, 635 (2009)</dc:source>
    <dc:subject>genomics; epidemiology</dc:subject>
    <dc:language>en</dc:language>
    <dc:rights>Copyright © 2009 Wiley-Liss, Inc., A Wiley Company</dc:rights>
    <dc:description>
Five genome-wide association studies of the timing of menarche and menopause have now taken us beyond the range of candidate gene and linkage studies.
The list of new genetic associations identified for these two traits should shed light on the mechanisms of ovarian aging, as well as breast cancer and other diseases associated with reproductive lifespan.
    </dc:description>
  </resource>
  ...
</resources>
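An XML archive of this shape can be generated with Python's standard `xml.etree.ElementTree` module. The following is a minimal sketch, not a reference implementation: the helper name `build_archive` and the placeholder abstract are hypothetical, the `xsi:noNamespaceSchemaLocation` attribute is omitted, and the result is not validated against the schema. Records are given as lists of (term, value) pairs so that a term such as `identifier` can be repeated, as in the example above.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace, as declared in the example above.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_archive(records):
    """Return a <resources> element with one <resource> child per record.

    Each record is a list of (term, value) pairs, which allows a term
    such as "identifier" to occur more than once.
    """
    root = ET.Element("resources")
    for record in records:
        resource = ET.SubElement(root, "resource")
        for term, value in record:
            element = ET.SubElement(resource, f"{{{DC_NS}}}{term}")
            element.text = value
    return root


# Hypothetical record: the abstract is placeholder text with a line break.
record = [
    ("identifier", "doi:10.1038/ng0609-637"),
    ("identifier", "http://www.nature.com/ng/journal/v41/n6/pdf/ng0609-637.pdf"),
    ("title", "Genetics of reproductive lifespan"),
    ("creator", "Patricia Hartge"),
    ("date", "2009-06-01"),
    ("description", "First line of the abstract.\nSecond line of the abstract."),
]

root = build_archive([record])
xml_text = ET.tostring(root, encoding="unicode")
```

Unlike the CSV variant, the line break inside the description is kept as-is, and special characters such as `&` or `<` are escaped by the serializer.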


