Understanding the Coral Reef Data Lifecycle

By: Rebecca Wenker, NOAA Coral Reef Conservation Program Data Manager

Takeaway: Coral reef data go through a whole lifecycle that includes planning, collecting and organizing, preservation, access and sharing, and re-use. The goal of this lifecycle is to make the data FAIR, or findable, accessible, interoperable, and re-usable, which ultimately benefits both scientists and communities.

When coral reef data come to mind, many people think first of the collection process: scientists going out on multi-leg research cruises, SCUBA divers surveying reefs of various depths and types, and underwater instruments monitoring changes in ocean chemistry. After data collection, analysis can begin, bringing all of those data together to discover trends. In reality, there is a whole lifecycle that coral reef data go through before and after these steps!

Two circular flow charts showing the stages of a data lifecycle.
Two examples of the stages in a data lifecycle. Credit: DataONE.

The data lifecycle can be divided into the following generalized sections:

  1. Planning
  2. Collecting and organizing
  3. Preservation
  4. Access and sharing
  5. Re-use

A common thread through each section is an emphasis on proper data management and stewardship, which essentially means ensuring that data are organized, accessible, and usable. This is especially important as the size and diversity of data increase. For example, Structure-from-Motion (SfM) photogrammetry is an imaging technique that is increasingly used in coral reef monitoring methodologies, and it generates 3D models of reefs with file sizes as large as 15 terabytes! In comparison, typical coral reef monitoring data in the form of spreadsheet files range from about 2 to 500 megabytes in size.

Graphic saying: Findable, Accessible, Interoperable, Reusable.
The FAIR Principles.

The ultimate goal of the lifecycle is to ensure that data are FAIR, or Findable, Accessible, Interoperable, and Reusable. Every step of the lifecycle has actions you can take to make sure this is the case!


Planning

Ideally, all parts of the coral reef data lifecycle should be addressed in a data management plan (DMP) before data collection even begins. A DMP is a formal document that outlines what you will do with your data during and after your research project, and helps ensure that your data are safe now and in the future. These plans are extremely important: if your data collection, organization, storage, and preservation methods are outlined in advance, you are less likely to encounter unanticipated roadblocks or make errors that reduce the quality of your data and research results. DMPs also help you plan how you will share your data, allowing them to be discovered and used by others. However, DMPs are not set in stone, and can be edited as needed during the lifecycle.

The five main components of a DMP generally include:

  • Documentation (data description, data collection and analyses, metadata, timeline),
  • Roles and responsibilities,
  • Budget (equipment, collection/execution, data storage, data publishing),
  • Storage and preservation (short-term and long-term), and
  • Access, sharing, use plans and policies.
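One way to keep track of these components while drafting is a simple checklist. The following is only a minimal sketch in Python: the key names paraphrase the list above, and the draft plan is made up for illustration rather than taken from any funder's template.

```python
# Hypothetical checklist; the key names paraphrase the five components above.
REQUIRED_COMPONENTS = (
    "documentation",
    "roles_and_responsibilities",
    "budget",
    "storage_and_preservation",
    "access_and_sharing",
)


def missing_components(dmp: dict) -> list:
    """Return the required DMP components that are absent or left empty."""
    return [name for name in REQUIRED_COMPONENTS if not dmp.get(name)]


# An illustrative draft plan, not a real one.
draft_plan = {
    "documentation": "Benthic cover surveys; metadata written per dataset.",
    "roles_and_responsibilities": "Field lead collects; data manager archives.",
    "budget": "",  # still to be written
    "storage_and_preservation": "Working copy plus two off-site backups.",
}

print(missing_components(draft_plan))  # ['budget', 'access_and_sharing']
```

Because a DMP can be edited during the lifecycle, a check like this can be re-run whenever the plan changes.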

Funding organizations are increasingly requiring the inclusion of a DMP as part of the application process, and often have particular requirements regarding data format, storage, products, and access that need to be met. Therefore, incorporating those expectations into your DMP is critical, and can often help shape the rest of the data lifecycle.

Collection and Organization

Moving into data collection should be less intimidating now that you’ve planned the process out in a DMP, but there are still important things to remember! The major things you should incorporate during data collection include:

  1. Data Organization - Use a standardized file format or database for data entry, follow spreadsheet best practices, and use logical file naming conventions.
  2. Quality Control - Define and enforce the standards used during collection (data formats, terminology and codes, measurement units, metadata), document changes to data, and perform quality assurance checks and statistical summaries on data after analysis. Assign someone responsibility for data quality.
  3. Data Storage - Ensure that your data are stored in a way that prevents accidental data loss. Routinely back up data, and keep multiple copies of these backups in different locations.
  4. Data Documentation - Metadata, metadata, metadata! It is crucial that you capture the data that describe your data: for example, who created them, what they contain, when they were created, where they were collected geographically, how they were collected and processed, and why they were developed. You NEED metadata to remember how you collected and processed your own data, so that information won’t be lost when a student or employee leaves, and so that the data can be used again in the future. I recommend looking up best practices for writing quality metadata.

A line graph showing that the details about data are lost as time goes on due to various events.
Why metadata are important - details are lost over time. Credit: DataONE.
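The who, what, when, where, how, and why questions from the documentation item above can be captured in a small structured record. This is a sketch only: the field names and example values are assumptions for illustration, not a formal metadata standard, and most archives will have their own required format.

```python
import json

# The six questions from the text; the field names are illustrative assumptions.
METADATA_QUESTIONS = ("who", "what", "when", "where", "how", "why")


def unanswered(record: dict) -> list:
    """List the questions that a metadata record omits or leaves blank."""
    gaps = []
    for question in METADATA_QUESTIONS:
        value = record.get(question)
        if value is None or not str(value).strip():
            gaps.append(question)
    return gaps


# Made-up values for a hypothetical survey dataset.
record = {
    "who": "Example Reef Monitoring Team",
    "what": "Percent benthic cover by category at fixed transects",
    "when": "2021-06-01/2021-06-14",
    "where": "Example Bay, 5-15 m depth",
    "how": "Photoquadrats along 25 m transects, annotated by point count",
    "why": "",  # left blank on purpose to show the check
}

gaps = unanswered(record)
if gaps:
    print("Metadata incomplete, missing:", gaps)
else:
    # Could be saved next to the data file, for example as a JSON sidecar.
    print(json.dumps(record, indent=2))
```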

If you take all of these into account, your data analysis should now be more streamlined and the quality of the data improved.
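As one illustration of the quality control item in the list above, the sketch below checks each row of survey data against a set of agreed standards. The site codes, column names, and allowed ranges are all hypothetical; substitute the codes and units defined in your own protocol.

```python
# Hypothetical standards for one survey; adjust to your own protocol.
ALLOWED_SITE_CODES = {"STX-01", "STX-02", "STT-01"}
DEPTH_RANGE_M = (0.0, 40.0)      # assumed plausible survey depths, in meters
COVER_RANGE_PCT = (0.0, 100.0)   # percent cover must fall between 0 and 100


def check_row(row: dict) -> list:
    """Return a list of human-readable problems found in one data row."""
    problems = []
    if row.get("site") not in ALLOWED_SITE_CODES:
        problems.append(f"unknown site code: {row.get('site')!r}")
    for field, (low, high) in (("depth_m", DEPTH_RANGE_M),
                               ("coral_cover_pct", COVER_RANGE_PCT)):
        try:
            value = float(row[field])
        except (KeyError, TypeError, ValueError):
            problems.append(f"{field} is missing or not a number")
            continue
        if not low <= value <= high:
            problems.append(f"{field}={value} outside {low}-{high}")
    return problems


# Made-up rows, as they might come out of a spreadsheet export.
rows = [
    {"site": "STX-01", "depth_m": "12.5", "coral_cover_pct": "18.2"},
    {"site": "stx-1", "depth_m": "12.5", "coral_cover_pct": "182"},
    {"site": "STT-01", "depth_m": "", "coral_cover_pct": "7"},
]

for number, row in enumerate(rows, start=1):
    for problem in check_row(row):
        print(f"row {number}: {problem}")
```

Running checks like these at data entry, rather than only at the end of a project, makes it easier to trace a problem back to the person and field day that produced it.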


Preservation

Now that you have your data, and they have been organized, analyzed, and are sitting on your hard drive, what do you do with them? Archive them!

Data repositories and archives are physical or digital information infrastructures that provide long-term storage, preservation, and access to metadata and data. They can also offer other benefits, such as metadata generation, citation and DOI issuance (a DOI, or digital object identifier, is used to uniquely identify and access your dataset or document), dataset embargoing, and data curation! Consequently, submitting your data to an archive is one of the best things you can do in terms of data access and sharing.

As an example, most of the NOAA Coral Reef Conservation Program’s data are submitted to NOAA’s National Centers for Environmental Information, which maintains one of the world’s most significant environmental data archives. It guarantees that your data will be preserved for decades; accepts a wide range of environmental data types and formats; and provides data curation and stewardship to ensure that the data being archived are of the best quality possible.

A photo collage of different data sources and storage formats.
NOAA’s National Centers for Environmental Information data archive contains a wide variety of environmental data, including data about coral reefs.

Archives vary in their data focus, metadata standards and documentation requirements, accepted file formats and data sizes, costs, and levels of curation. Also, when you submit your data to an archive, they should be in their final form and understandable for re-use without anyone having to reach out to you as the scientist. That is why it is important to plan ahead in your DMP to make this process easier, generate quality metadata, and know what the archive you intend to submit your data to expects of you.

Access and Sharing

Infographic titled ‘Protect Yourself & the Reef!’
A National Park Service poster shows reef-safe sunscreen ingredients to use while swimming.

After you take the step to preserve your data, it is now time to share it.

The positive impacts of data accessibility and sharing cannot be overstated, and they are critical for progress in both the scientific and public realms. In the scientific community, data access increases the impact and visibility of research, promotes innovation and collaboration, and encourages reproducibility. This in turn helps to increase data transparency and verification. Data access also helps foster public trust in scientific research, and enables people to make informed decisions with regard to policy development, voting, education, and personal lifestyle choices. For example, scientific research has shown that several common chemicals in sunscreen are harmful to corals. In response, many people who recreate on or near coral reefs use reef-safe (mineral) sunscreens instead. In fact, the state of Hawai’i has banned sunscreens containing the chemical active ingredients oxybenzone and octinoxate, which are the two most harmful sunscreen chemicals to corals.

As described above, data preservation via repositories and archives is one method of data access and sharing. However, there are many others, including:

  • Open-access publications,
  • Conference or public presentations,
  • Outreach events,
  • Project/Program websites, and
  • Media articles.


Re-use

With all of these data out there, it is time to re-use them! Data re-use can have tremendous impacts, especially in the scientific community. It enables researchers to build upon the work of others, improve methodologies, and perform meta-analyses. Data can also be re-used to educate new researchers about the most current and noteworthy findings in the field. This all helps to maintain research continuity, as well as save time and money. Comparing, reproducing, and verifying results also helps establish data as accurate and trustworthy. Finally, data re-use can promote resource/data exchange, interdisciplinary research, and general feedback, which leads to more comprehensive and wider-reaching results.

All of this improves the quality of the data, research, and future results - which ultimately benefits both scientists and the public!

A purple website with four boxes. The upper left box shows a map and the other three show graphs with purple data points and gray error boxes.
NOAA Coral Reef Conservation Program data can be accessed and re-used via the National Coral Reef Monitoring Program Data Visualization Tool. This image shows benthic data for the U.S. Virgin Islands.

To sum up, the data lifecycle can be long and complex. However, it’s worth pursuing so that quality data are generated and can be accessed, understood, and used by others for many years to come.
