A literature-based inventory of ecological interactions in Doñana National Park
Citation
Moracho E, Calvo G, Gómez J M, Homet P, Rodríguez-Sánchez F, Villalva P, Jordano P (2024). A literature-based inventory of ecological interactions in Doñana National Park. Version 1.2. Estación Biológica de Doñana (CSIC). Sampling event dataset https://doi.org/10.15470/jlhz16 accessed via GBIF.org on 2024-12-15.
Description
Our database comprises records of pairwise ecological interactions between species in Doñana National Park, compiled from a systematic literature review spanning 1900 to 2023. The survey includes a diverse range of interaction types, including predation, competition, pollination and mycorrhizal associations, among others, and covers taxa from all kingdoms (Plantae, Animalia, Fungi, Bacteria, Chromista, Archaea, Protozoa, Viruses). Each record represents an encounter between two species at a specific location and time and includes information on the interaction features, biotic partners and environmental context. The database consolidates data from multiple sources, including 'sampling event' and 'occurrence' data, enabling comprehensive coverage of all interaction data extracted from the literature. These data are valuable for scientific research focused on ecological interactions within natural ecosystems, including big-data approaches for generating species distribution models and macro-ecological analyses. The diverse nature of our database allows for an extensive exploration of the complexity and diversity of ecological interactions, with implications for a wide range of ecological studies.
Purpose
This is a valuable dataset for scientific research aimed at understanding the complex dynamics of natural ecosystems, particularly with respect to the interactions between species. These data offer significant potential for advancing big-data approaches, such as the generation of species distribution models and macro-ecological analyses, which can provide insights into the underlying ecological processes that govern species coexistence and ecosystem functioning.
Sampling Description
Study Extent
The compilation covers papers reporting data from Doñana National Park and extensions of protected natural areas within the Natura 2000 Network in Andalusia; data collection ran until May 31st, 2023.
Sampling
We conducted a bibliographic compilation through an exhaustive review of over xxx published and unpublished scientific reports that provide data on interaction diversity across taxa. These reports cover a range of kingdoms, including Animalia, Plantae, Fungi, Chromista, Bacteria, Archaea, Protozoa, and Viruses. We also reviewed non-digital literature from collections at the CSIC archive. To identify relevant peer-reviewed scientific papers on biotic interactions in Doñana Natural Park, we systematically searched for the keyword “Doñana” and filtered the results by ecology-related categories. The interaction data extracted from the literature review were collected in a spreadsheet using a controlled vocabulary and an early validation procedure. The database structure includes 80 columns, covering the following information: (1) survey characteristics, (2) time, date and geographic location, (3) taxonomic information, (4) anatomy of the interaction (based on an ad hoc ontology of species interactions), and (5) interaction intensity and outcome. We collected two types of interaction data from field studies of natural populations: (i) interaction intensity data, from studies with a sampling design that report both sampling effort and intensity, and (ii) binary interaction data (presence-absence), from studies that did not report sampling effort but instead relied on occurrences of interaction events.
Quality Control
The first measure of quality control involves searching for duplicated papers by assigning a unique citation key to each scientific report. We also carefully check thesis chapters that have been published in scientific journals to avoid double recording of information. The second measure of quality control is performed during the recording of interaction data into the spreadsheet. We use controlled vocabularies in many fields and a hierarchical data entry validation for interaction description fields, based on previous filtering by the kingdoms of the partners. The third measure of quality control is applied during the integration of spreadsheets into a single table in R. We have developed the SUMHAL-WP5 package for R to optimize data integration into the final database, with adequate validation and data quality checks. During this process, we check the data recorded in each column against controlled vocabularies and predefined value ranges. We also ensure that all indispensable data for each record are filled in, and that taxonomic classification follows the GBIF Backbone Taxonomy. All these quality checks are performed automatically using GitHub Actions.
Method steps
- The bibliographic compilation involved multiple sources of information. First, we searched the ISI Web of Science (WOS) for papers containing data on ecological interactions until May 31st, 2023. The search term "Doñana" was used to retrieve papers, and we restricted the search to the categories of Environmental Sciences-Ecology, Plant Sciences, Zoology, Entomology, Marine and Freshwater Biology, Biodiversity Conservation, Agriculture, and Forestry in WOS ("Theme" as the search field). After removing papers that were clearly out of scope, we retained xx papers from WOS. An additional search on Google Scholar yielded 98 papers using the same search term. We also reviewed the list of theses published by the Doñana Biological Station since 19xx, which comprised a total of 54 works on ecological sciences. To identify duplicates, we created a citation key for each study, and special care was taken when searching for duplicated papers included in thesis works. After removing duplicates from all the sources, we obtained a joint list of xxx papers. We then screened the abstracts or, when necessary, the main text of the articles to select only those papers providing data on ecological interactions. Finally, we read the selected papers and theses in full, resulting in a final set of xx papers and xx theses that provided data for the database on ecological interactions. We also selected a restricted set of national journals that met the following criteria: (1) the surveys were conducted in Doñana National Park or extensions to protected areas in Andalusia, and (2) the aim of the studies or reports focused on biodiversity and species ecology. A total of xxx journals were selected (names), comprising xxx reports/surveys. We carefully read all the texts to extract interaction data. Interaction data extracted from literature was collected in a spreadsheet using a controlled vocabulary and an early validation procedure. 
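The duplicate check by citation key described in this step can be sketched as follows. This is only an illustration in Python, not the authors' procedure: the key format (first author surname, year, first title word) and the names `citation_key` and `deduplicate` are assumptions, since the source does not specify how keys are built, and the sample records are made up.

```python
import re
import unicodedata


def citation_key(first_author, year, title):
    """Build a normalized key such as 'garcia2001seed' (hypothetical format)."""
    def slug(text):
        # Strip accents, lowercase, and keep only ASCII letters and digits.
        ascii_text = (
            unicodedata.normalize("NFKD", text)
            .encode("ascii", "ignore")
            .decode("ascii")
        )
        return re.sub(r"[^a-z0-9]", "", ascii_text.lower())

    words = title.split()
    first_word = words[0] if words else ""
    return f"{slug(first_author)}{year}{slug(first_word)}"


def deduplicate(records):
    """Keep the first record seen for each citation key, preserving order."""
    seen = {}
    for rec in records:
        key = citation_key(rec["first_author"], rec["year"], rec["title"])
        seen.setdefault(key, rec)
    return list(seen.values())


# Made-up records: the same study retrieved from two sources, plus another one.
records = [
    {"first_author": "García", "year": 2001,
     "title": "Seed dispersal in a made-up study", "source": "WOS"},
    {"first_author": "Garcia", "year": 2001,
     "title": "Seed dispersal in a made-up study", "source": "thesis"},
    {"first_author": "López", "year": 2010,
     "title": "Pollination example", "source": "Google Scholar"},
]
unique = deduplicate(records)
print([citation_key(r["first_author"], r["year"], r["title"]) for r in unique])
```

Normalizing accents before comparing keys means a paper and the thesis chapter it came from collapse to one entry even when the author name is spelled inconsistently. A key this coarse can also merge distinct papers, so matches would still need a manual check.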
The taxonomic classifications of the interacting species were verified at lower taxonomic ranks with the help of the GBIF taxonomy (https://www.gbif.org/es/species/1?root=true). The data from various databases or spreadsheets were integrated continuously using GitHub Actions to create a single CSV file. This integration process utilized the SUMHAL-WP5 package that was specifically developed in R for this purpose. In the event of an error, the corresponding author was notified and a correction was required for the proper integration of the data. Furthermore, during the integration process, the taxonomic classification of the interacting partners was automatically filled using the GBIF Backbone Taxonomy.
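The automated checks applied during integration (controlled vocabularies, predefined value ranges, and required fields) could look roughly like the sketch below. It is written in Python for illustration and does not reproduce the SUMHAL-WP5 R package, whose functions are not described in this record. The vocabularies are taken from the variable descriptions in this document; the function name `validate_event` and the set of required fields are assumptions.

```python
# Controlled vocabularies as listed in the variable descriptions of this dataset.
SAMPLING_PROTOCOLS = {
    "camera trap", "barcoding", "direct observation", "fecal sample",
    "mist net", "stomach content", "transect", "pellet analysis",
}
BASIS_OF_RECORD = {"MaterialSample", "HumanObservation", "MachineObservation"}

# Assumption: which fields count as indispensable is not stated in the source.
REQUIRED_FIELDS = ("eventID", "bibliographicCitation", "basisOfRecord")

# Valid ranges for WGS84 decimal degrees.
COORDINATE_RANGES = (("decimalLatitude", -90.0, 90.0),
                     ("decimalLongitude", -180.0, 180.0))


def validate_event(row):
    """Return a list of human-readable problems found in one event record."""
    errors = []

    for field in REQUIRED_FIELDS:
        value = row.get(field)
        if value is None or not str(value).strip():
            errors.append(f"missing required field: {field}")

    protocol = row.get("samplingProtocol")
    if protocol and protocol not in SAMPLING_PROTOCOLS:
        errors.append(f"samplingProtocol not in vocabulary: {protocol!r}")

    basis = row.get("basisOfRecord")
    if basis and basis not in BASIS_OF_RECORD:
        errors.append(f"basisOfRecord not in vocabulary: {basis!r}")

    for field, low, high in COORDINATE_RANGES:
        value = row.get(field)
        if value is None or value == "":
            continue  # coordinates are treated as optional in this sketch
        try:
            number = float(value)
        except (TypeError, ValueError):
            errors.append(f"{field} is not numeric: {value!r}")
            continue
        if not low <= number <= high:
            errors.append(f"{field} out of range [{low}, {high}]: {number}")

    return errors


# Made-up record for illustration only.
example = {
    "eventID": "citation_key:Partners:0000001",
    "bibliographicCitation": "made-up reference",
    "basisOfRecord": "HumanObservation",
    "samplingProtocol": "transect",
    "decimalLatitude": "36.99",
    "decimalLongitude": "-6.44",
}
print(validate_event(example))
```

Returning a list of messages rather than raising on the first problem makes it possible to report every issue in a spreadsheet at once, which fits a workflow where the data author is notified and asked to correct the entries.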
Additional info
This database defines an ecological interaction as a meeting between two partners at a specific location and time. The core of the database is the set of interactions, which are described by different attributes in the Event file, including the source of the information, the location of the interaction, and the sampling design description, among others. The Occurrences file provides details about the partners involved, including their taxonomy and behavior during the interaction. The MeasurementsOfFacts file includes measurements of the interaction intensity, which quantifies the strength of pairwise interactions. This measure is provided only when the sampling design provides continuous data (e.g., frequency of the encounter); presence-absence data are used otherwise.
List of variables and their descriptions:
1. “Events” data:
> eventID: A unique identifier for each interaction event in the dataset. It can be created by combining the citation_key and Partners code (e.g., citation_key:Partners:0000001).
> bibliographicCitation: A bibliographic reference for the scientific report from which the data was extracted.
> eventDate: The date or date range when the interaction occurred. The accuracy of the time can vary from years to a specific day, depending on the data provided by the study.
> country: The country where the interaction was observed.
> stateProvince: The administrative region where the interaction occurred, represented as a combination of region and province (e.g., Andalucía:Sevilla).
> municipality: The municipality where the interaction occurred.
> locationRemarks: A descriptive name of the location in the Natura 2000 Network.
> locality: The name of the specific site (e.g., village or town) where the interaction occurred.
> decimalLatitude: The latitude of the site where the interaction occurred, expressed in decimal degrees using the unprojected WGS84 coordinate system and georeferenced in Google Earth.
> decimalLongitude: The longitude of the site where the interaction occurred, expressed in decimal degrees using the unprojected WGS84 coordinate system and georeferenced in Google Earth.
> minimumElevationInMeters: The altitude of the site in meters.
> geodeticDatum: The geodetic datum used to define the geographical coordinates of the site.
> samplingEffort: The effort expended for sampling biotic interactions at a locality and time. This information is available only for studies with a sampling design, and it includes the following data:
- Sampling_space: The area covered by the sampling effort (expressed as the value and its units).
- Sampling_time: The duration of the sampling effort (expressed as the value and its units).
> sampleSizeUnit: The smallest unit of measurement used to quantify the interaction sampling.
> sampleSizeValue: The number of sample size units on which the interaction intensity is based.
> samplingProtocol: The method used to sample biotic interactions in the study. The terms used are restricted to the following: camera trap, barcoding, direct observation, fecal sample, mist net, stomach content, transect and pellet analysis.
> basisOfRecord: The basis of interaction sampling, which indicates how the data were collected. It includes "MaterialSample" for interaction events inferred from physical samples (e.g., a fecal sample, a stomach, etc.), "HumanObservation" for interaction data directly observed in the field by people, and "MachineObservation" for data collected automatically by machines.
> fieldNotes: Clarifications that may be necessary for interpreting the data extracted from the study.
> dynamicProperties: A list of general descriptors that provide additional information about the study, including:
- Study focus: This indicates which taxa the study is focused on and can be categorized as phytocentric, zoocentric, combined, or other.
- Data type: This describes the nature of the interaction data provided in the study, which can be either binary or continuous.
- Individual ID: This is relevant when the study is performed at the individual level.
> institutionCode: The acronym of the institution having custody of the dataset.
2. “Occurrences” data:
> eventID: A unique identifier for each interaction event in the dataset.
> kingdom, phylum, class, order, family, genus, scientificName, verbatimtaxonRank: The taxonomic classification of the observed taxa, following the GBIF taxonomic backbone.
> lifeStage: The age class of each partner involved in the interaction, if known.
> sex: The sex of each partner involved in the interaction, if known.
> behaviour: The action performed by each partner during the interaction.
3. “MeasurementsOfFacts” data:
> measurementType: Only the measurement “interaction intensity” is provided in this file.
> measurementUnit: The units of measurement used to quantify the interaction intensity. These are transferred almost literally from the study.
> measurementValue: The numeric value of the interaction intensity when the data type is continuous. For binary data, the value is "NA" (not applicable).
Taxonomic Coverages
- Chordata, rank: phylum
- Aves, rank: class
- Mycobacteriales, rank: order
- Actinomycetia, rank: class
- Mycobacteriaceae, rank: family
- Carnivora, rank: order
- Actinobacteriota, rank: phylum
- Accipitridae, rank: family
- Herpestidae, rank: family
- Bacteria, rank: kingdom
- Accipitriformes, rank: order
- Mammalia, rank: class
- Animalia, rank: kingdom
Geographic Coverages
Bibliographic Citations
- Thompson, J. N. (2014). Interaction and coevolution. University of Chicago Press.
- Hobern, D., Baptiste, B., Copas, K., Guralnick, R., Hahn, A., van Huis, E., ... & Wieczorek, J. (2019). Connecting data and expertise: a new alliance for biodiversity knowledge. Biodiversity Data Journal, 7.
- Moilanen, A., Wilson, K., & Possingham, H. (2009). Spatial conservation prioritization: quantitative methods and computational tools. Oxford University Press.
Contacts
Eva Moracho
originator
position: Postdoctoral researcher
Estación Biológica de Doñana, CSIC
C/ Américo Vespucio, 26
Sevilla
41092
Sevilla
ES
email: emoracho@ebd.csic.es
Gemma Calvo
originator
position: Technician
Estación Biológica de Doñana, EBD-CSIC
ES
email: gemma.calvo@ebd.csic.es
Jose María Gómez
originator
position: Principal investigator
Estación Experimental de Zonas Áridas, EEZA-CSIC
Carr. Sacramento, s/n
Almería
04120
ES
email: jmgreyes@eeza.csic.es
Pablo Homet
originator
position: Technician
Estación Biológica de Doñana, EBD-CSIC
ES
email: pablo.homet@ebd.csic.es
Francisco Rodríguez-Sánchez
originator
position: Associated professor
Universidad de Sevilla
C. San Fernando, 4
Sevilla
41004
ES
email: frodriguez.work@gmail.com
Pablo Villalva
originator
position: Research assistant
Estación Biológica de Doñana, EBD-CSIC
ES
Pedro Jordano
originator
position: Principal investigator
Estación Biológica de Doñana, EBD-CSIC
ES
email: jordano@ebd.csic.es
Eva Moracho
metadata author
position: Postdoctoral researcher
Estación Biológica de Doñana, CSIC
C/ Américo Vespucio, 26
Sevilla
41092
Sevilla
ES
email: emoracho@ebd.csic.es
Pedro Jordano
metadata author
position: Principal investigator
Estación Biológica de Doñana, CSIC
C/ Américo Vespucio, 26
Sevilla
41092
Sevilla
ES
email: jordano@ebd.csic.es
Pedro Jordano Barbudo
user
email: jordano@ebd.csic.es
Eva Moracho
administrative point of contact
position: Postdoctoral researcher
Estación Biológica de Doñana, CSIC
C/ Américo Vespucio, 26
Sevilla
41092
Sevilla
ES
email: emoracho@ebd.csic.es
Pedro Jordano
administrative point of contact
position: Principal investigator
Estación Biológica de Doñana, CSIC
C/ Américo Vespucio, 26
Sevilla
41092
Sevilla
ES
email: jordano@ebd.csic.es