Knowledge synthesis of 100 million biomedical documents augments the deep expression profiling of coronavirus receptors
Abstract: The COVID-19 pandemic demands assimilation of all biomedical knowledge to decode mechanisms of pathogenesis. Despite the recent renaissance in neural networks, a platform for the real-time synthesis of the exponentially growing biomedical literature and deep omics insights is unavailable. Here, we present the nferX platform for dynamic inference from 45 quadrillion+ possible conceptual associations from unstructured text and triangulation with insights from Single Cell RNA-sequencing, bulk RNAseq and proteomics from diverse tissue types. A hypothesis-free profiling of ACE2 suggests tongue keratinocytes, olfactory epithelial cells, airway club cells and respiratory ciliated cells as potential reservoirs of the SARS-CoV-2 receptor. We find the gut as the putative hotspot of COVID-19, where a maturation correlated transcriptional signature is shared in small intestine enterocytes among coronavirus receptors(ACE2, DPP4, ANPEP). A holistic data science platform triangulating insights from structured and unstructured data holds potential for accelerating the generation of impactful biological insights and hypotheses.