Artificial intelligence (AI) and machine learning (ML), which broadly denote the simulation of human intelligence for decision making, are current buzzwords across many industries and are often used interchangeably. According to computer scientist and machine learning pioneer Tom M. Mitchell, ML is the study of computer algorithms that allow computer programs to improve automatically through experience. According to the U.S. Food and Drug Administration (FDA), AI is 'the science and engineering of making intelligent machines', whereas ML is 'an AI tool that can be used to design and train software algorithms to learn from and act on data' (https://www.fda.gov/medical-devices/software-medicaldevice-samd/artificial-intelligence-and-machine-learning-software-medicaldevice#whatis). ML is thus one of the enablers of AI. ML algorithms can uncover complex patterns in input training data, and they can be grouped into three categories: (a) supervised learning, (b) unsupervised learning, and (c) reinforcement learning. In supervised learning, the algorithm is given input data along with a corresponding target label or group, and it learns a mapping from input features to targets; classification and regression are supervised tasks because they learn from labelled training data. In unsupervised learning, the input data is provided without target labels, and the algorithm's task is to identify the underlying structure in the data and assign group labels itself; clustering is the most common form of unsupervised learning. In reinforcement learning, the algorithm's aim is to find the most suitable action that will maximize a reward; like clustering, reinforcement learning does not need target labels for training. Another family of algorithms that is becoming popular is the deep learning (DL) framework.
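The distinction between supervised and unsupervised learning can be illustrated with a minimal sketch. The toy data, labels, and thresholds below are invented for illustration: a nearest-neighbour classifier stands in for supervised learning (it needs labels), and a tiny 1-D two-means clustering stands in for unsupervised learning (it discovers the groups itself).

```python
def nearest_neighbour(train, labels, x):
    """Supervised: predict the label of the closest labelled training point."""
    best = min(range(len(train)), key=lambda i: abs(train[i] - x))
    return labels[best]

def two_means(points, iters=10):
    """Unsupervised: split points into two clusters (1-D k-means with k=2)."""
    c1, c2 = min(points), max(points)  # initial centroids at the extremes
    for _ in range(iters):
        g1 = [p for p in points if abs(p - c1) <= abs(p - c2)]
        g2 = [p for p in points if abs(p - c1) > abs(p - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return sorted(g1), sorted(g2)

heights = [150, 152, 155, 180, 182, 185]            # toy 1-D feature
labels  = ["short", "short", "short", "tall", "tall", "tall"]

print(nearest_neighbour(heights, labels, 181))      # supervised: uses the labels
print(two_means(heights))                           # unsupervised: finds the groups itself
```

The supervised routine cannot run without the `labels` list, whereas the clustering routine recovers the same two groups from the raw values alone, which is exactly the difference the definitions above describe.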
Deep learning methods generally require much larger amounts of data to train effectively than classical machine learning methods but, once trained properly, they usually outperform classical ML frameworks. Classical ML models are also usually easier to interpret than deep learning models. Deep learning architectures excel in areas where it is impractical to hand-design features or concepts from the input data. So, whereas ML requires domain knowledge and expertise for feature engineering, DL can be applied across a much wider arena, given a large amount of data to train the system effectively.
The most popular AI applications we currently use are the human-AI interaction assistants Apple Siri, Google Home, and Amazon Alexa. The video recommendation systems used by Netflix, Amazon, and YouTube are powered by ML algorithms that help identify the right content. The shopping-cart suggestions shown by Flipkart and Amazon are driven by ML programs that look up a customer's past searches, both on their website and elsewhere on the internet, and suggest the most suitable items. AI/ML methods are increasingly becoming essential in our daily lives, e.g. Google Maps, which processes real-time information about traffic and commuter routes on the user's mobile phone and recommends the best route to take. Healthcare is another field thought to be highly suitable for the application of AI tools and techniques. These techniques will enhance the quality of automation and make decision making in primary and tertiary patient care far more robust. AI in healthcare has the potential to transform the quality of life for billions of people worldwide. It is predicted that in the near future every clinician, whether a specialty doctor or a general physician, will use AI/ML to make clinical decisions: to interpret medical scans, pathology slides, skin lesions, retinal images, electrocardiograms, endoscopies, genetic diseases, faces, vital signs, and many more.
Accurate interpretation of radiology images plays an important role in clinical diagnosis and treatment planning. In the last few years, many studies have shown AI outperforming humans in the interpretation of medical images across various diseases. Chest X-rays, with over 2 billion scans worldwide every year, are among the most commonly used medical scans for diagnosing thoracic diseases (chest X-ray database ChestX-ray8; Wang et al., 2017). CheXNet, a deep learning program developed by Rajpurkar et al. in 2017, uses a 121-layer convolutional neural network (CNN) trained on a large publicly available chest X-ray dataset of over 100,000 images spanning 14 diseases, and can detect pneumonia from chest X-rays better than trained radiologists. Later, a team from Google (Li et al., 2017) analysed the same dataset using a residual neural network (ResNet) architecture and performed better than the reference baseline for diagnosing pneumonia, heart enlargement, and collapsed lung. The sensitivity of manual identification of cancerous pulmonary nodules by the clinical community has not been satisfactory, ranging from 36% to 84% depending on tumor size and cohort. Recently, a deep neural network (DNN) was applied to detect cancerous pulmonary nodules from chest X-rays (Nam et al., 2018); it yielded much better results than manual identification and outperformed 16 out of 18 clinicians. The clinicians who performed better than the AI method had over 13 years of experience. Another interesting application of AI has been the identification of bone fractures from images: applying a DNN to wrist-fracture detection increased accuracy from 81% to 92% and reduced misinterpretations by 47% (Lindsey et al., 2018).
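The basic operation underlying CNNs such as CheXNet is the 2-D convolution, which slides a small learned filter over the image to detect local patterns. A minimal sketch with NumPy, using a toy 5x5 "image" and a hand-picked edge-detecting kernel (real networks learn their kernels and stack hundreds of such layers):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation), the core CNN building block."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            # Each output value is the filter's response at one image location.
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Toy image with a vertical edge between columns 1 and 2.
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Sobel-like vertical-edge kernel; a CNN would learn such filters from data.
edge_kernel = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]], dtype=float)

response = conv2d(image, edge_kernel)
print(response)  # strong responses where the edge sits, zero elsewhere
```

Stacking many such filtered layers, with nonlinearities in between, lets a deep network progressively build up from edges to textures to disease-specific patterns in a radiograph.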
AI and ML methods are now applied to various problems in genomics, including variant calling, variant classification, and the detection of functional and regulatory elements in the human genome.
In genomics data in particular, it is impractical to use hand-crafted statistical rules for interpretation tasks. Take, for example, generic variant-calling tools, most of which are prone to systematic errors biased by sample preparation, sequencing platform and technology, sequence context, and the inherent biology of the sample in question. The recently published DeepVariant (Poplin et al., 2018), a convolutional neural network (CNN)-based variant caller trained directly on read alignments without any specialized knowledge of genomics or sequencing platforms, has shown significantly higher performance than popular variant callers including GATK. Similarly, Illumina's SpliceAI (Jaganathan et al., 2019) uses deep learning to predict intronic variants that can lead to cryptic splicing. Cryptic splicing is enriched in autism and intellectual-disability patients compared to healthy individuals, so understanding the impact of intronic variants is an important step in clinical diagnostics. Many AI tools using CNNs and RNNs (recurrent neural networks) are also being applied to predict cis and trans regulatory binding sites. DeepSEA (Lyu et al., 2018), another multitask CNN trained on large-scale functional genomics data, learns sequence dependencies at multiple scales and predicts DNase hypersensitive sites, transcription factor binding sites, histone marks, and the influence of genetic variation on those regulatory elements, with accuracy superior to other tools for prioritizing non-coding functional variants. AlphaFold from Google DeepMind is another application of AI, addressing the protein folding problem (Levinthal's paradox) by predicting protein structures from large genomic datasets (Senior et al., 2019). Protein structure prediction is an important step in understanding protein function and is difficult to accomplish experimentally.
AlphaFold is a neural-network-based program that infers protein structure by analyzing covariation in homologous sequences. In the CASP13 assessment, AlphaFold generated highly accurate structures for 24 out of 43 free-modelling domains.
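Sequence models such as SpliceAI and DeepSEA do not consume DNA as raw text; the sequence is first converted into a numeric one-hot matrix that a CNN can process. A minimal encoding sketch (the toy sequence is invented; real tools encode kilobase-scale windows):

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) one-hot matrix.

    Each row has a single 1 in the column of its base; ambiguous
    bases such as N are left as all zeros.
    """
    mat = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        if base in BASES:
            mat[i, BASES.index(base)] = 1.0
    return mat

encoded = one_hot("ACGTN")
print(encoded)
```

A variant is then represented by encoding both the reference and the alternate sequence, and the model's difference in predicted output between the two matrices scores the variant's regulatory or splicing impact.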
Electronic health record (EHR) analysis is another field attracting considerable attention from the deep learning community.
Patient clinical data is a very rich source of healthcare information that can substantially influence clinical decision making, but it exists in a multitude of formats ranging from radiology data to clinical notes. A flexible data structure, the Fast Healthcare Interoperability Resources (FHIR) format, is being encouraged to represent clinical data in a consistent, hierarchical, and extensible container format regardless of the health system, which simplifies data interchange between sites. EHR data can be converted into features or concepts for a recurrent neural network (RNN), which can identify patterns of patient characteristics, diagnoses, demography, medications, and other events capable of predicting patient mortality or hospital readmission. Natural language processing (NLP) techniques can capture unstructured data, analyze its grammatical structure, determine the meaning of the information, and summarize it, thereby reducing cost and extracting information for deep analytics. Combined with genomic data, NLP-based methods have boosted phenotype-informed genetic analysis, resulting in automated genetic diagnoses, and are now being used to diagnose rare diseases.
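The "recurrent" in RNN refers to a hidden state that is carried forward as the network reads a patient's timeline event by event, so the prediction at the end reflects the whole history. A hedged sketch of that forward pass: the weights and event vectors below are random stand-ins, not a trained clinical model, and the dimensions are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_events, event_dim, hidden_dim = 4, 8, 16   # e.g. 4 visits, 8 features per visit

# Randomly initialised weights; a real model would learn these from EHR data.
W_xh = rng.normal(scale=0.1, size=(event_dim, hidden_dim))   # event -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # hidden -> hidden (recurrence)
W_hy = rng.normal(scale=0.1, size=(hidden_dim, 1))           # hidden -> output

# Stand-in for FHIR-derived event features (diagnoses, medications, vitals, ...).
timeline = rng.normal(size=(n_events, event_dim))

h = np.zeros(hidden_dim)
for event in timeline:
    # The hidden state mixes the new event with the accumulated history.
    h = np.tanh(event @ W_xh + h @ W_hh)

# Sigmoid output, e.g. a readmission or mortality probability.
risk = float(1 / (1 + np.exp(-(h @ W_hy))))
print(risk)
```

Because `h` is fed back into itself at every step, reordering or omitting earlier visits changes the final state, which is how such models capture temporal patterns in a patient's record.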
Compared to standard statistical approaches that use direct feature learning, deep learning has been shown to simultaneously harmonize inputs, including free-text notes, to produce predictions for a wide range of clinical problems and outcomes that outperform state-of-the-art traditional predictive models. Automatic Mendelian Literature Evaluation (AMELIE) (Birgmeier et al., 2020) is one such example: a text-mining tool that parses 29 million PubMed abstracts, as well as several full-text articles, to associate causal variants with their phenotypes. In the diagnosis of rare Mendelian disorders, singleton patient analysis (without relatives' exomes) is the most time-consuming scenario. By connecting patient phenotypes with the literature, AMELIE ranked the causal gene at the top for 66% of 215 diagnosed singleton cases from the Deciphering Developmental Disorders (DDD) project. At MedGenome we have implemented a random-forest-based ML tool, VaRTK (Variant Ranking ToolKit), to prioritize variants in our diagnostic samples. VaRTK assigns a pathogenicity score to each coding variant and is trained on 33 features generated from variants and genes; these features are derived from our clinical reports by genome analysts. In recent testing (June 2020) on 184 cases in which a variant had been reported by a genome analyst, VaRTK ranked ~90% of pathogenic variants in the top 20. By incorporating our latest advancement, a phenotype score, VaRTK now ranks 93% of pathogenic variants in the top 20.
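The prioritization workflow can be sketched in miniature: score every candidate variant on a handful of features, sort, and check whether the reported variant falls in the top-k. Everything below is invented for illustration: the feature names, weights, and variants are hypothetical, and a simple weighted sum stands in for VaRTK's trained random forest over 33 variant and gene features.

```python
def score(variant, weights):
    """Toy pathogenicity score: weighted sum of normalized feature values."""
    return sum(weights[f] * variant[f] for f in weights)

# Hypothetical feature weights (a trained random forest would replace this).
weights = {"conservation": 0.4, "allele_rarity": 0.3, "impact_severity": 0.3}

# Hypothetical candidate variants with feature values scaled to [0, 1].
variants = {
    "chr1:g.100A>T": {"conservation": 0.9, "allele_rarity": 0.95, "impact_severity": 0.8},
    "chr2:g.200C>G": {"conservation": 0.2, "allele_rarity": 0.10, "impact_severity": 0.3},
    "chr7:g.300G>A": {"conservation": 0.5, "allele_rarity": 0.40, "impact_severity": 0.2},
}

# Rank all candidates by descending score; diagnostics then review the top-k.
ranked = sorted(variants, key=lambda v: score(variants[v], weights), reverse=True)
print(ranked[:2])  # the analyst's review list
```

The "top 20" metrics quoted above are computed in exactly this spirit: over a set of solved cases, count how often the analyst-reported pathogenic variant appears within the first k entries of the ranked list.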
AI features or concepts are generally not explainable in human terms, and AI models are often referred to as black boxes. Since a large fraction of the audience for products claiming AI does not understand the black box, many companies misuse the AI label for their products and services: in a report from The Verge, 40% of European startups that claimed to use AI did not actually use the technology. In clinical decision making, a high-risk setting, AI predictions must be accepted by both the clinician and the patient community. Genomic and healthcare data suffer from multilayer substructure, with data prone to bias arising from confounding factors such as socioeconomic status, cultural practices, unequal representation, and other non-causal factors that relate to the delivery and accessibility of medicine and clinical tests rather than to their efficacy (Topol, 2019). AI systems must be carefully designed to account for these types of non-causal bias. However, the road is not as dark as it seems. The FDA is now recognizing many AI-driven health analytics in medical diagnostics; one example is the recent FDA approval of an X-ray imaging technique for the detection of lung diseases by GE Healthcare. More research is underway to bring confidence to AI systems in high-risk areas like clinical diagnostics. This goes hand in hand with the development of more cluster computing, clinical data synchronizing efforts like FHIR, and the standardization of medical terminology through systems like the Unified Medical Language System (UMLS). With more effort in these areas, acceptance of AI in clinical setups will reach new heights.