Research Data Management

From Boston University

What is a Data Repository?

According to the Registry of Research Data Repositories, a data repository is a subtype of a sustainable information infrastructure which provides long-term storage and access to research data that is the basis for a scholarly publication. Research data means information objects generated by scholarly projects, for example through experiments, measurements, surveys, or interviews.

In other words, a data repository provides long-term storage of the data that supports scholarly publications. Data repositories are institutional efforts to provide sustainable preservation of the data created by researchers. They ensure that research data remains accessible beyond the life of a grant, a research project, or an individual career.

Learn more from Boston University's Data Services here!

From the NIH

Selecting a Data Repository

Learn how to evaluate and select appropriate data repositories.


  • Selecting a Data Repository (Find the Domain-Specific and Generalist repositories in the box below)
  • Desirable Characteristics for All Data Repositories
  • Additional Considerations for Human Data
  • Repositories for Scientific Data


As outlined in NIH's Supplemental Policy Information: Selecting a Repository for Data Resulting from NIH-Supported Research, using a quality data repository generally improves the FAIRness (Findable, Accessible, Interoperable, and Reusable) of the data. For that reason, NIH strongly encourages the use of established repositories to the extent possible for preserving and sharing scientific data.

While NIH supports many data repositories, there are also many biomedical data repositories and generalist repositories supported by other organizations, both public and private. Researchers may wish to consult experts in their own institutions (e.g., librarians, data managers) for assistance in selecting an appropriate data repository.

NIH encourages researchers to select data repositories that exemplify the desired characteristics below, including when a data repository is supported or provided by a cloud-computing or high-performance computing platform. These desired characteristics aim to ensure that data are managed and shared in ways that are consistent with FAIR data principles.

Selecting a Data Repository

  • For some programs and types of data, NIH and/or Institute, Center, or Office (ICO) policies and Funding Opportunity Announcements (FOAs) identify particular data repositories (or sets of repositories) to be used to preserve and share data.
    • For data generated from research subject to such policies or funded under such FOAs, researchers should use the designated data repository or repositories.
  • For data generated from research for which no data repository is specified by NIH, researchers are encouraged to select a data repository that is appropriate for the data generated from the research project. Be sure to consult the list of desirable characteristics and the following guidance:
    • Primary consideration should be given to data repositories that are discipline or data-type specific to support effective data discovery and reuse. For a list of NIH-supported repositories, visit Repositories for Sharing Scientific Data.
    • If no appropriate discipline or data-type specific repository is available, researchers should consider a variety of other potentially suitable data sharing options:
      • Small datasets (up to 2 GB in size) may be included as supplementary material to accompany articles submitted to PubMed Central.
      • Data repositories, including generalist repositories or institutional repositories, that make data available to the larger research community, institutions, or the broader public.
      • Large datasets may benefit from cloud-based data repositories for data access, preservation, and sharing.
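
The selection guidance above can be read as a short decision sequence. The Python sketch below is one illustrative way to walk through it and is not an NIH tool. The `Dataset` fields, the `suggest_repository` function, and the repository names in the usage lines are hypothetical. The 2 GB limit comes from the guidance itself, but the guidance does not define "large", so `LARGE_DATASET_GB` is an assumed placeholder value. The guidance also lists the three fallback options without ranking them, so the order in which they are checked here is an assumption as well.

```python
from dataclasses import dataclass
from typing import Optional

# From the guidance: supplementary material for PubMed Central, up to 2 GB.
PMC_SUPPLEMENT_LIMIT_GB = 2.0
# ASSUMPTION: the guidance does not say what counts as "large"; placeholder only.
LARGE_DATASET_GB = 100.0


@dataclass
class Dataset:
    """Hypothetical description of a dataset to be shared."""
    size_gb: float
    # Repository named by NIH/ICO policy or the FOA, if any.
    designated_repository: Optional[str] = None
    # Discipline- or data-type-specific repository, if one fits the data.
    domain_repository: Optional[str] = None
    # True if the data accompanies an article submitted to PubMed Central.
    accompanies_pmc_article: bool = False


def suggest_repository(ds: Dataset) -> str:
    """Return a suggested sharing option, following the order of the guidance."""
    # 1. A repository designated by policy or FOA takes precedence.
    if ds.designated_repository:
        return f"designated: {ds.designated_repository}"
    # 2. Otherwise prefer a discipline- or data-type-specific repository.
    if ds.domain_repository:
        return f"domain-specific: {ds.domain_repository}"
    # 3. Fallback options (checked in an assumed order).
    if ds.accompanies_pmc_article and ds.size_gb <= PMC_SUPPLEMENT_LIMIT_GB:
        return "PubMed Central supplementary material"
    if ds.size_gb >= LARGE_DATASET_GB:
        return "cloud-based repository"
    return "generalist or institutional repository"


if __name__ == "__main__":
    # Repository names below are made up for illustration.
    print(suggest_repository(Dataset(size_gb=0.5, designated_repository="ExampleRepo")))
    print(suggest_repository(Dataset(size_gb=1.5, accompanies_pmc_article=True)))
    print(suggest_repository(Dataset(size_gb=40.0)))
```

Whatever option the sketch suggests, the repository should still be checked against the list of desirable characteristics, and institutional experts such as librarians or data managers can help with the final choice.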