The Data Coordinating Centers (DCCs) from each Common Fund Program will collect metadata and format it by standardizing the terms used to describe data. The format is designed to be machine-readable so that the metadata can be loaded into a database, which researchers will then be able to search through a human-readable web portal.
For the "tissue of origin" example, the formatting and standardization process might involve each study agreeing to adopt the term "anatomy" to describe tissue of origin and then reformatting its data submission accordingly.
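As a rough illustration of that renaming step, the following minimal Python sketch maps study-specific field names onto the shared term "anatomy". The synonym names (`tissue_of_origin`, `tissue`, `sample_site`), the `sample_id` field, and the helper `standardize_record` are hypothetical and chosen only for this example; they are not taken from any DCC's actual submission format.

```python
# Hypothetical per-study names for the "tissue of origin" field.
# These synonyms are assumptions made for illustration only.
FIELD_SYNONYMS = {
    "tissue_of_origin": "anatomy",
    "tissue": "anatomy",
    "sample_site": "anatomy",
}


def standardize_record(record):
    """Return a copy of `record` with known synonym keys renamed to the shared term."""
    standardized = {}
    for key, value in record.items():
        new_key = FIELD_SYNONYMS.get(key, key)
        # Refuse to silently merge two different values under one shared term.
        if new_key in standardized and standardized[new_key] != value:
            raise ValueError(f"conflicting values for {new_key!r}")
        standardized[new_key] = value
    return standardized


# Two studies describing the same concept with different field names.
study_a = {"sample_id": "A-001", "tissue_of_origin": "liver"}
study_b = {"sample_id": "B-17", "sample_site": "liver"}

standardized = [standardize_record(r) for r in (study_a, study_b)]
```

After this step both records carry an "anatomy" field, so a single query on that term can find samples from either study.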
Appropriately formatted datasets submitted to the CFDE will be integrated into C2M2. The CFDE software infrastructure will automatically validate each submission's format compliance and metadata integrity, and will request changes from the data submitter if necessary.
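The kind of automated check described above might look something like the following Python sketch. It is not the CFDE validator: the required fields, the uniqueness rule on `sample_id`, and the function `validate_submission` are assumptions made for illustration. The sketch treats "format compliance" as the presence of required fields and "metadata integrity" as identifier uniqueness, and returns a list of problems that could be sent back to the submitter.

```python
# Hypothetical required fields; the real submission format may differ.
REQUIRED_FIELDS = ("sample_id", "anatomy")


def validate_submission(records):
    """Return a list of readable problems; an empty list means the checks passed."""
    problems = []
    seen_ids = set()
    for index, record in enumerate(records):
        # Format compliance (assumed rule): every required field is present and non-empty.
        for field in REQUIRED_FIELDS:
            if not record.get(field):
                problems.append(f"record {index}: missing required field '{field}'")
        # Metadata integrity (assumed rule): identifiers are unique within a submission.
        sample_id = record.get("sample_id")
        if sample_id:
            if sample_id in seen_ids:
                problems.append(f"record {index}: duplicate sample_id '{sample_id}'")
            seen_ids.add(sample_id)
    return problems


submission = [
    {"sample_id": "A-001", "anatomy": "liver"},
    {"sample_id": "A-001", "anatomy": "kidney"},  # reuses an identifier
    {"sample_id": "A-002"},                       # lacks the "anatomy" field
]

problems = validate_submission(submission)
```

Returning every problem at once, rather than stopping at the first, lets a submitter correct a whole submission in a single round trip.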
The general expectation is that the metadata submitted and managed by a DCC will transition, over time, through increasingly rich modeling levels as the life cycle of DCC/CFDE technical interaction progresses. Each richer level will, in turn, enable increasingly powerful downstream applications.