Several key concerns regarding the formation and success of CFDE have been noted throughout this document. These concerns are restated here and listed in relative order of priority:
dbGaP access. While CFDE could easily result in significant advances in querying and accessing datasets hosted at each DCC, this information will be far more useful if it can be used in combination with data hosted at dbGaP. There is a critical need to reduce the barriers associated with accessing dbGaP data, and to enable access to data across studies hosted there.
Single sign-on and authorization. Two of the centers most intimately familiar with protection of human data (GTEx and Kids First) are skeptical that the solutions provided by the NCBI Virtual Directory Service will meet their needs. This solution is also expected to take another 18 months to implement. In the absence of a single sign-on solution, it is unlikely that CFDE will be able to adopt a system that will readily apply to all Common Fund DCCs; this will impede our ability to share restricted data.
Building expertise within and across the DCCs. Significant opportunities could be lost if the DCCs continue to operate in isolation from each other. We strongly recommend increasing their interactions with each other to avoid duplication of effort, capture institutional knowledge from the mature DCCs, promote the re-use of technologies, and encourage shared standards development.
Support burden. All DCCs reported that user support is vital to their mission and requires a substantial investment of their time. CFDE activities will likely increase the number of users seeking help as we enable the user community to make use of datasets across all of the DCCs. This increased support burden can be addressed by creating a centralized help desk and offering additional training in the form of documentation, webinars, and conferences.
STRIDES and CFDE role clarification. During our visits, DCC staff were evidently confused about the difference between STRIDES and CFDE. This lack of clarity is likely shared by other sites, and it could reduce the interest of the DCCs, as well as their NIH Program Officers, in participating in CFDE. Another concern is storage cost: regardless of the funding source or the reductions made available through STRIDES, the cost of hosting data on cloud-based systems will be considerable.
Ensuring CFDE compliance. While we have proposed several measures to promote the use of CFDE Best Practices, the concern remains that DCCs will not be motivated to adopt standards intended for use across the DCCs. Ultimately, DCCs are independent entities, each answering to its own NIH Program Officer; encouragement from Common Fund leadership for DCCs to participate in CFDE will be vital to the success of this project.