Quintana, Yuri. “Challenges to Implementation of Global Translational Collaboration Platforms” MOJ Proteomics & Bioinformatics vol. 2,6 (2015): 65.
Translational collaboration platforms connect clinical, genomic, and patient-reported data for the advancement of biomedical research, providing an opportunity to speed up the translation of basic science findings into clinical applications and new medicines. These platforms bring together data from both clinical and research databases and provide opportunities for multi-disciplinary research. Recent years have seen significant growth in these platforms, and some global collaborative research networks have been established using them. In this brief summary, we examine the challenges in implementing these platforms for international research collaborations and the challenges for the sustainability of research networks.
Connect scientists and practitioners through a shareable and interoperable infrastructure,
Develop standard rules and a common language to share information more easily, and
Build or adapt tools for collecting, analyzing, integrating, and disseminating information associated with cancer research and care.
In 2011, an NIH study reported some of the problems with the caBIG program. In May 2012, the program ended and the National Cancer Informatics Program (NCIP) created caGrid as its successor.
Launched in 2007, the Informatics for Integrating Biology and the Bedside (i2b2) [10,11] infrastructure is based at Partners HealthCare System in Boston, Massachusetts, and is funded by the United States National Institutes of Health (NIH). The project is open source and has been adopted by numerous academic hospitals around the world for biomedical research. The system can store patient medications and laboratory values, and these can be combined with clinical research data, such as information from a case report form or genomic data, into a single cohesive unit that can be queried in an integrated manner. The i2b2 system differs from caBIG in that the core data in i2b2 is instantiated according to a single relational model, not a compendium of object models. The i2b2 system has been used to set up the Shared Research Informatics Network (SHRINE), which can distribute i2b2 queries across data from several Harvard hospitals, including the Beth Israel Deaconess Medical Center, the Dana-Farber Cancer Institute, and Children’s Hospital Boston.
Based on the i2b2 architecture, the tranSMART platform [14-16] is a set of data models, shared data sets, data transformation utilities, and analytical web applications intended to accelerate discoveries within complex biological systems by creating a standardized and semantically integrated database of research results linked to reusable and scalable self-service analytics. TranSMART was initially funded by Johnson & Johnson Corporation and is now funded by the tranSMART Foundation as a public-private partnership. Similarly, several European stakeholders have sponsored eTRIKS for European life sciences research collaborations.
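The idea of holding medications, laboratory values, and genomic findings in a single relational model that can be queried in one pass can be illustrated with a minimal sketch. The table names, column names, concept codes, and sample rows below are invented for illustration and are not the actual i2b2 schema; the sketch only assumes one table of patients and one table of observations, using Python's built-in `sqlite3` module.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with made-up patients and observations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patient (
            patient_id INTEGER PRIMARY KEY,
            birth_year INTEGER
        );
        -- One row per observed fact, whatever its origin (clinic, lab, omics).
        CREATE TABLE observation (
            patient_id INTEGER REFERENCES patient(patient_id),
            concept    TEXT,
            value      TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO patient VALUES (?, ?)",
        [(1, 1960), (2, 1975), (3, 1982)],
    )
    conn.executemany(
        "INSERT INTO observation VALUES (?, ?, ?)",
        [
            (1, "MED:metformin", "500mg"),
            (1, "GENE:TCF7L2", "variant"),
            (2, "MED:metformin", "850mg"),
            (3, "GENE:TCF7L2", "variant"),
            (3, "LAB:hba1c", "7.9"),
        ],
    )
    return conn


def patients_with_all(conn, concepts):
    """Return ids of patients who have at least one row for every concept."""
    placeholders = ",".join("?" for _ in concepts)
    sql = f"""
        SELECT patient_id
        FROM observation
        WHERE concept IN ({placeholders})
        GROUP BY patient_id
        HAVING COUNT(DISTINCT concept) = ?
        ORDER BY patient_id
    """
    rows = conn.execute(sql, (*concepts, len(concepts))).fetchall()
    return [row[0] for row in rows]
```

Because clinical and genomic facts share one table, a cohort such as "patients on a given medication who also carry a given variant" is a single query, for example `patients_with_all(build_demo_db(), ["MED:metformin", "GENE:TCF7L2"])`, rather than a join across separately modeled systems.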
The cBioPortal for Cancer Genomics is an open-source platform based at Memorial Sloan-Kettering Cancer Center, New York, funded by NIH grants and industry support. The goal is to provide translational researchers access to data sets generated by large-scale cancer genomics projects, such as the Cancer Genome Atlas (http://cancergenome.nih.gov) and the International Cancer Genome Consortium (http://icgc.org). The system has visualization and analysis tools and export functionalities. The public version contains large cancer genomics data sets. The system can also be installed privately, allowing researchers to upload their own data sets. The Biology-Related Information Storage Kit (BRISK) is based at the University of British Columbia, Vancouver, Canada, and is funded by a partnership between public and private sources. It is a web-based platform initially developed for researchers in the AllerGen (The Allergy, Genes and Environment Network) consortium (http://www.allergen-nce.ca). The Integrating Data for Analysis, Anonymization, and Sharing (iDASH) platform is based in San Diego, California, and is funded by NIH grants. It is a high-performance computing platform for data integration, aimed at biomedical and behavioral researchers, with a focus on sharing data using privacy-preserving methods.
The integrated clinical omics database (iCOD) is based at the Tokyo Medical and Dental University, Japan, and is publicly funded. The system combines comprehensive clinical, pathological, and molecular information about patients and can show the interrelation of clinical and omics data to support the discovery of plausible disease pathways. The Georgetown Database of Cancer (G-DOC) is based at Georgetown University, Washington, DC, and is funded by the US Department of Health and Human Services. The system integrates patient demographics, structured clinical research data, and clinical outcomes data with high-throughput omics data (DNA, mRNA, microRNA, and metabolites).
Launched in 2003, the Pediatric Oncology Network Database (www.pond4kids.org) is a secure, web-based, multilingual pediatric hematology/oncology database created for use in countries with limited resources to meet various clinical data management needs, including cancer registration, delivery of protocol-based care, outcome evaluation, and assessment of psychosocial support programs. Established as part of the International Outreach Program at St. Jude Children’s Research Hospital in Memphis, Tennessee, USA, POND4Kids serves as a tool for oncology units to store patient data for easy retrieval and analysis and to achieve uniform data collection that facilitates meaningful comparison of information among international centers.
Technical Data Integration - The growing volume and complexity of data in biological data sets require more complex architectures to integrate data from diverse data sets. Data from different generations of lab and sequencing hardware make integration difficult because of different data formats and granularity. The process of uploading data is complicated and requires sustainable resources.
Data Quality - Data quality assurance remains a large problem for data that are collected from diverse institutions. Each institution may have a different capacity to review the quality of its data, and tracking the level of review that data have received remains a problem. Some systems have no detailed mechanism for tagging data, down to the individual data item, with a level of certainty.
Data Sharing - Data sharing agreements must continue to evolve to manage the impact of ongoing changes in government regulations and evolving corporate compliance needs. This requires substantial dedicated efforts from various institutional departments (technical, legal, clinical, research, management) to review changes to agreements.
Liability - Data breaches continue to be a growing problem for any online platform. This issue requires dedicated expert technical staff to manage access, as well as legal agreements that delineate liability among the collaborating partners. The problem becomes more complicated when partners are added from countries that have different laws and penalties for breaches.
Privacy - The increasing complexity of privacy laws requires changes to software to accommodate the tracking of consents for data and compliance with local, national, and international privacy laws pertaining to the data sources.
Discovery - Novel discoveries from shared data are among the key objectives of these networks. Intellectual property agreements need to be established in advance to handle these opportunities, and the agreements are subject to change as institutions are merged, sold, or reorganized.
Funding Sustainability - The emerging collaboration networks have not yet shown what a sustainable funding model looks like. Most networks are initially funded by government research grants, industry, or both. Funding from governments continues to be strained, and government funding for a project will usually end once the proof of concept has been published. For industry-sponsored projects, industry will want to see a return on the investment, but the return from a shared data network is difficult to measure because of the length of time it takes to see outcomes that can be monetized in a commercial application.
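Two of the challenges listed above, tagging individual data items with the level of review they have received and tracking consent for each use of the data, can be made concrete with a small sketch. Everything in it is hypothetical: the review levels, the field names, and the rule that an item is released only if it meets a minimum review level and the patient has an active consent for the stated purpose are assumptions for illustration, not features of any platform described here.

```python
from dataclasses import dataclass
from datetime import date
from enum import IntEnum
from typing import Optional


class ReviewLevel(IntEnum):
    """Hypothetical review levels; a real network would define its own."""
    UNREVIEWED = 0
    SITE_CHECKED = 1
    CENTRALLY_VERIFIED = 2


@dataclass(frozen=True)
class DataItem:
    """A single value, tagged with its origin and the review it has received."""
    patient_id: str
    site: str
    field: str
    value: object
    review: ReviewLevel = ReviewLevel.UNREVIEWED


@dataclass(frozen=True)
class Consent:
    """A patient's consent for one purpose, optionally withdrawn later."""
    patient_id: str
    purpose: str
    granted_on: date
    withdrawn_on: Optional[date] = None


def has_consent(consents, patient_id, purpose, on_day):
    """True if some consent for this patient and purpose is active on on_day."""
    return any(
        c.patient_id == patient_id
        and c.purpose == purpose
        and c.granted_on <= on_day
        and (c.withdrawn_on is None or on_day < c.withdrawn_on)
        for c in consents
    )


def releasable(items, consents, purpose, on_day, minimum_review):
    """Keep items that are sufficiently reviewed and covered by consent."""
    return [
        item
        for item in items
        if item.review >= minimum_review
        and has_consent(consents, item.patient_id, purpose, on_day)
    ]
```

Keeping the review tag on each item, rather than on a whole data set, lets a network mix data from sites with different review capacity while still letting an analyst choose how much uncertainty to accept. Real consent rules vary by jurisdiction and would need far more detail than a single purpose string.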
Translational collaboration platforms have been successfully developed to support life science research with diverse types of data from multiple centers. The challenges include data integration, data quality, sharing models, and the policies and procedures needed to manage privacy, liability, and intellectual property. Despite the many challenges to the implementation of these platforms, some networks for multi-national collaboration are emerging. Models for sustainability will need to be developed if these platforms and research networks are to continue past the initial implementation phase. Careful planning with multiple stakeholders will be needed to create platforms that meet the needs of both clinical and life sciences researchers and to create sustainable research networks and funding models.