New NIST Resources Aid Searches of Coronavirus Dataset

[Illustration: a squiggle of colored lines straightening into parallel lines from left to right.]

NIST has made available four new resources for searching the COVID-19 Open Research Dataset (CORD-19). These tools were developed in response to the March 16, 2020, White House Call to Action to the Tech Community on New Machine Readable COVID-19 Dataset, which called on the artificial intelligence community to develop ways to make the collection’s text and data easily searchable by biomedical researchers.

  • The NIST Scientific Indexing Resource uses the NIST-developed “root and rule” method to determine keywords and link together words into phrases and concepts to help a user find relevant and related articles without needing to search for a precisely matched keyword or phrase.
  • The COVID-19 Data Repository relies on the Configurable Data Curation System, developed at NIST for structuring datasets that lack organization, and offers multiple ways to query the dataset.
  • The COVID-19 Registry, also based on the Configurable Data Curation System, is a web application that collects descriptions of resources including other repositories, databases, services, portals, websites, and organizations. It relies on contributions from across the research community.
  • cord19-cdcs-nist, hosted on GitHub, provides quick access to CORD-19 data that has already been screened for incomplete, irrelevant, or corrupt records and is therefore ready for analysis with any programming language.

Please visit the NIST website for more information on these tools.