The Common Fund Data Ecosystem’s Crosscut Metadata Model (CFDE C2M2)

What is C2M2?

The Crosscut Metadata Model (C2M2) is a standard for describing biomedical experimental data. The C2M2 framework will enable the biomedical research community to perform powerful cross-dataset searches, custom aggregation of experimental data, and rigorous statistical analysis.

Why do we need it?

A wealth of rich biological data exists today, yet much of it is almost impossible for researchers to find, share, and study.

Making data Findable, Accessible, Interoperable, and Reusable (FAIR) is not an easy task. Data created through different studies and for different projects often do not share terms even when the underlying data types are identical.

For example, if study A labels its tissue-of-origin column "tissue", study B labels the same column "anatomy", and study C calls it "UBERON IDs", then combining data from all three studies requires significant terminology expertise and cross-study collaboration.
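
The mismatch described above can be sketched in a few lines of Python. This is only an illustration of the problem, not part of the C2M2 specification: the synonym table, the shared name "anatomy", the `harmonize` helper, and the sample rows are all hypothetical.

```python
# Hypothetical mapping from each study's local column label to one shared term.
# The shared name "anatomy" is chosen for illustration only; it is not taken
# from the C2M2 specification.
COLUMN_SYNONYMS = {
    "tissue": "anatomy",
    "anatomy": "anatomy",
    "UBERON IDs": "anatomy",
}


def harmonize(record):
    """Return a copy of `record` with known column labels renamed to the shared term."""
    return {COLUMN_SYNONYMS.get(key, key): value for key, value in record.items()}


# Illustrative rows from three studies that describe the same kind of data
# under three different column labels.
study_a = {"sample": "A1", "tissue": "UBERON:0002107"}
study_b = {"sample": "B1", "anatomy": "UBERON:0002107"}
study_c = {"sample": "C1", "UBERON IDs": "UBERON:0002107"}

# After renaming, all three rows expose the same columns and can be combined.
combined = [harmonize(row) for row in (study_a, study_b, study_c)]
shared_columns = set.intersection(*(set(row) for row in combined))
```

Without a shared vocabulary, someone has to write and maintain a table like `COLUMN_SYNONYMS` by hand for every pair of studies, which is where the terminology expertise and cross-study collaboration come in.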

Many datasets are also hidden away in isolated, niche data repositories, making them extremely difficult to find and access unless you already know where to look.

The CFDE is working to make data FAIR by building a standardized way of linking related terms from different datasets to one another. C2M2 is that standard. Under this framework, terminology will be consistent and clearly defined, and researchers will know how to find, access, combine, and reuse all available datasets from across the Common Fund Programs.