Utilize Big Data

The Big Data produced in the DoMore! project will provide the basis for identifying and establishing robust, generic biomarkers for cancer prognosis and prediction.

Sample size plays a significant role in accurate cancer prognostics, especially when cancer heterogeneity is taken into account. To solve the problems of tumor heterogeneity, undersampling and observer variance, we will need to do more: more sampling, more analysis, and more and better prognostic markers.

The false discovery rate for prognostic markers depends on the recurrence rate within the population of patients studied, the number of features (e.g. genes) studied (which relates directly to the number of multiple comparisons), the variability of gene expression, and the size of the hazard ratios deemed clinically relevant. For example, if 270 patients with a recurrence rate of 30% were enrolled in a prognostic study in which 761 genes were analyzed using a Cox proportional hazards model, the false discovery rate would be approximately 25%! However, if we pool the analysis of two large independent studies, the false discovery rate is substantially reduced. For example, if 1,000 patients were analyzed from two separate studies (assuming broad comparability between patient groups), and only features positive in both studies were considered, then less than 4% of features would be expected to be false discoveries. The features identified in the pooled analysis could therefore be used to build a prognostic model with greater confidence. This emphasizes the need for well-designed experiments with ‘Big Data’ if we are to deliver compelling biomarkers with real clinical utility.
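The effect of requiring a feature to be positive in two independent studies can be illustrated with a simple back-of-the-envelope calculation. The sketch below is not the calculation behind the 25% and 4% figures quoted above. It uses the textbook expected false discovery rate, (false positives) / (false positives + true positives), and every input except the 761 features is an assumption chosen purely for illustration: the fraction of truly prognostic features, the per-test significance level, the per-study power, and the independence of the studies.

```python
def expected_fdr(n_features, frac_true, alpha, power, n_studies=1):
    """Approximate expected false discovery rate.

    A feature counts as a discovery only if it is significant in all
    n_studies studies. Studies are assumed independent, each with the
    same significance level (alpha) and the same power.
    """
    n_true = n_features * frac_true      # truly prognostic features
    n_null = n_features - n_true         # features with no real effect
    false_pos = n_null * alpha ** n_studies
    true_pos = n_true * power ** n_studies
    return false_pos / (false_pos + true_pos)


# Illustrative assumptions (not taken from the text above):
# 10% of features truly prognostic, alpha = 0.05, 60% power per study.
single = expected_fdr(761, frac_true=0.10, alpha=0.05, power=0.60)
both = expected_fdr(761, frac_true=0.10, alpha=0.05, power=0.60, n_studies=2)

print(f"one study:   expected FDR = {single:.1%}")
print(f"two studies: expected FDR = {both:.1%}")
```

Under these assumed inputs, the false positives shrink much faster than the true positives when a second study must agree, which is why the expected false discovery rate drops. The actual figures for a given study depend on the recurrence rate, the hazard ratios and the variability of gene expression, which this sketch folds into the single assumed power value.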

Through Deep Learning and the use of Big Data, we will teach computers to establish more robust grading systems in cancer types where pathology has failed. We will do so in an objective and reproducible way, reducing human error and subjective analysis, and with them suboptimal diagnosis and, ultimately, suboptimal treatment of cancer.


© DoMore! | Editor-in-chief: Håvard E. Greger Danielsen | Webmaster: Marian Seiergren