Optimal intensive care outcome prediction over time using machine learning.

https://www.ncbi.nlm.nih.gov/pubmed/30427913




PLoS One. 2018 Nov 14;13(11):e0206862. doi: 10.1371/journal.pone.0206862. eCollection 2018.

Author information

1. Division of Anaesthesia, University of Cambridge, Cambridge, United Kingdom.
2. Bloomsbury Institute of Intensive Care Medicine, University College London, London, United Kingdom.
3. Nuffield Department of Clinical Neurosciences, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom.
4. Department of Intensive Care, Guy's and St. Thomas' NHS Foundation Trust, St. Thomas' Hospital, Westminster Bridge Road, Lambeth, London.
5. Centre for Perioperative Medicine and Critical Care Research, Imperial College Healthcare NHS Trust, Praed St., London, United Kingdom.

Abstract

BACKGROUND:

Prognostication is an essential tool for risk adjustment and decision making in the intensive care unit (ICU). Research into prognostication in ICU has so far been limited to data from admission or the first 24 hours. Most ICU admissions last longer than this, decisions are made throughout an admission, and some admissions are explicitly intended as time-limited prognostic trials. Despite this, temporal changes in prognostic ability during ICU admission has received little attention to date. Current predictive models, in the form of prognostic clinical tools, are typically derived from linear models and do not explicitly handle incremental information from trends. Machine learning (ML) allows predictive models to be developed which use non-linear predictors and complex interactions between variables, thus allowing incorporation of trends in measured variables over time; this has made it possible to investigate prognosis throughout an admission.

METHODS AND FINDINGS:

This study uses ML to assess the predictability of ICU mortality as a function of time. Logistic regression against physiological data alone outperformed APACHE-II and demonstrated several important interactions including between lactate & noradrenaline dose, between lactate & MAP, and between age & MAP consistent with the current sepsis definitions. ML models consistently outperformed logistic regression with Deep Learning giving the best results. Predictive power was maximal on the second day and was further improved by incorporating trend data. Using a limited range of physiological and demographic variables, the best machine learning model on the first day showed an area under the receiver-operator characteristic curve (AUC) of 0.883 (σ = 0.008), compared to 0.846 (σ = 0.010) for a logistic regression from the same predictors and 0.836 (σ = 0.007) for a logistic regression based on the APACHE-II score. Adding information gathered on the second day of admission improved the maximum AUC to 0.895 (σ = 0.008). Beyond the second day, predictive ability declined.

CONCLUSION:

This has implications for decision making in intensive care and provides a justification for time-limited trials of ICU therapy; the assessment of prognosis over more than one day may be a valuable strategy as new information on the second day helps to differentiate outcomes. New ML models based on trend data beyond the first day could greatly improve upon current risk stratification tools.

PMID: 30427913

DOI: 10.1371/journal.pone.0206862



United Kingdom.

University of Cambridge, University College London, University of Oxford.



Outcome prediction for critically ill patients admitted to the ICU.


Earlier work relied on logistic regression models,

and those studies mostly addressed short-term prognosis, within the first day or two of admission,

mainly in sepsis and AKI (acute kidney injury) cases.


ICU-related databases: MIMIC-II, MIMIC-III


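Methods in this study: ‘AdaBoost’, an adaptive boosted ensemble method, Transfer learning.

The focus is on whether machine learning methods can produce a somewhat better-optimized prediction model.

Another aim is to look for factors that had not been considered before.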




Database variables: heart rate (HR), mean arterial pressure (MAP), PaO2/FiO2 ratio, sodium, potassium, lactate, creatinine, CRP, and pH on each day
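The abstract says prediction improved further once trend data were included. One simple way to picture "trend data" is the day-to-day change in each variable. Below is a minimal Python sketch of that idea. It is not the authors' feature engineering (the paper used R); the function name, the choice of three variables, and all values are made up for illustration.

```python
from typing import Dict, Optional

# A subset of the variables listed above; the numbers used below are invented.
VARIABLES = ["HR", "MAP", "lactate"]


def add_trend_features(
    day1: Dict[str, Optional[float]],
    day2: Dict[str, Optional[float]],
) -> Dict[str, Optional[float]]:
    """Return day-2 values plus day-2 minus day-1 deltas (a hypothetical trend encoding)."""
    features: Dict[str, Optional[float]] = {}
    for name in VARIABLES:
        v1, v2 = day1.get(name), day2.get(name)
        features[f"{name}_day2"] = v2
        # A delta is only defined when both days were measured.
        features[f"{name}_delta"] = None if v1 is None or v2 is None else v2 - v1
    return features


example = add_trend_features(
    {"HR": 110.0, "MAP": 62.0, "lactate": 4.0},
    {"HR": 95.0, "MAP": 70.0, "lactate": None},
)
```

Here `example["HR_delta"]` is the fall in heart rate from day 1 to day 2, and the lactate delta stays missing because day 2 has no measurement.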






A GLM (logistic regression) serves as the baseline,

and four algorithm models were brought in on top of it.

Four machine learning models were implemented using the ‘caret’ package v6.0-77 [37]. 

‘adaboost’ (‘AdaBoost.M1’), a boosted decision tree algorithm, was selected due to past performance with this dataset. ‘parRF’, a parallel implementation of a random forest algorithm, ‘svmRadialWeights’, a support vector machines algorithm with radial basis function kernel and class weights, and ‘avNNet’, a single layer model averaged neural network, were selected as good class representatives based on performance across multiple datasets [38].


Deep Learning was implemented using the ‘keras’ package v2.1.4.9000 for R to implement a six hidden-layer neural network with ‘TensorFlow’
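For reference, the GLM baseline that all of these models are compared against is a logistic regression, and the abstract reports interactions such as lactate and MAP. Below is a minimal Python sketch of what an interaction term does in such a model. The paper itself used R, and the coefficients here are hypothetical, chosen only so that the example runs; they are not fitted values from the study.

```python
import math
from typing import Dict


def predicted_mortality(lactate: float, map_mmhg: float, coef: Dict[str, float]) -> float:
    """Logistic model with main effects and a lactate x MAP interaction term."""
    z = (
        coef["intercept"]
        + coef["lactate"] * lactate
        + coef["map"] * map_mmhg
        + coef["lactate_x_map"] * lactate * map_mmhg  # effect of lactate depends on MAP
    )
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients (assumption, not from the paper).
HYPOTHETICAL_COEF = {
    "intercept": -2.0,
    "lactate": 0.5,
    "map": -0.01,
    "lactate_x_map": -0.002,
}

p_low_map = predicted_mortality(4.0, 55.0, HYPOTHETICAL_COEF)
p_high_map = predicted_mortality(4.0, 85.0, HYPOTHETICAL_COEF)
```

With these made-up coefficients, the same lactate value yields a higher predicted risk at a MAP of 55 than at 85, which is the kind of dependence an interaction term lets a linear model express.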




This study even provides its R code. PLoS One really is a fine journal that champions open source...

Calculations were carried out using R version 3.4.4 on Linux. The code was optimized to run in parallel across 16x3.3GHz Intel Xeon cores with a total of 32GB RAM. The code is available at https://github.com/ariercole/TemporalPrediction.








Result: the deep learning method performed somewhat better (even though it has only six hidden layers...).

Deep Learning, which has the advantage of being able to uncover even more complex interactions than other machine learning algorithms, has the best performance on this dataset, despite the relatively small size of multilayer perceptron model used here (Fig 3). 
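The models in the paper are compared by AUC. Below is a minimal Python sketch of how AUC can be computed from outcome labels and predicted risks: it is the fraction of (died, survived) pairs in which the patient who died received the higher score, with ties counted as half. This is plain Python on toy data, not the authors' R code.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy data: 1 = died, 0 = survived; scores are made-up predicted risks.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the number of patients, which is fine for a sketch; a rank-based formula would be the usual choice for large datasets.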





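Because of its inherent black-box effect, deep learning is usually not applied just anywhere,

but it helps reveal previously unknown relationships among the explanatory variables, and when used alongside the other algorithms

it seems to have improved performance.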

Deep Learning does not naturally lend itself to the development of practical clinical tools, due to the ‘black box’ effect; the complexity of the model means that analysis and explanation of how the covariates are utilised is limited. Techniques do exist, but a formal evaluation of the workings of the neural networks here is beyond the scope of this study. However, some information can be readily gleaned from an investigation into which patients are correctly and incorrectly classified by the model.


Machine learning approaches and Deep Learning in particular are complex algorithms that are inherently less interpretable than techniques such as logistic regression. However, in this work we seek to use such approaches to see what additional information content is unexplained by traditional classifiers and to examine the time variation of this rather than understand the causative factors so this is less of a consideration for our work.


Despite the difficulties in translation to a useful clinical tool, Deep Learning has shown promise in improving the quality of predictions. In this study we aim to evaluate whether there is predictive information that is not captured by conventional time-insensitive models and machine learning techniques are a natural tool for this.



LIMITATIONS: (1) missing data were imputed; (2) data become sparser for later ICU days; (3) the amount of death/survival outcome data is small; (4) it is an observational study; etc.
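On limitation (1): to show what "imputed" means in practice, here is a minimal Python sketch of median imputation for a single variable. This is an assumption made for illustration only; these notes do not record which imputation method the paper actually used, and the values are made up.

```python
from statistics import median
from typing import List, Optional


def impute_median(column: List[Optional[float]]) -> List[float]:
    """Replace missing values (None) with the median of the observed values."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in column]


# Toy lactate column with one missing measurement.
imputed = impute_median([1.0, None, 3.0])
```

Any imputation of this kind fills gaps with plausible values rather than measured ones, which is why it is listed as a limitation.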



