Projects

Bayesian Federated Inference (BFI)

Most medical statistics methods require the number of patients in a data set to be much larger than the number of measurements that are recorded for each patient. This requirement is increasingly problematic. We can now measure vastly more patient characteristics than in the past, but unless we have similarly large numbers of patients in our data sets (which is expensive, and for rare diseases impossible), our data cannot be used fully.

Pooling data from different medical centers is often impossible due to privacy regulations, consent limitations, and logistical hurdles. Bayesian Federated Inference (BFI) is a novel statistical approach that reliably recovers, from specific local analyses carried out in the separate centers, what would have been found had the local data been pooled. One can thereby harness the statistical power of large combined data sets without any need for data sharing.
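The idea can be illustrated in the simplest possible setting. The sketch below is not the BFI implementation itself. It assumes a linear regression model with known noise variance and a zero-mean Gaussian prior, a case in which combining local summaries reproduces the pooled result exactly. The function names local_analysis and combine, the simulated centers, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def local_analysis(X, y, prior_precision, noise_var=1.0):
    """Run inside one center: return the MAP estimate and its curvature matrix.

    Assumes a linear model with known noise variance and a zero-mean
    Gaussian prior (an illustrative assumption, not the general BFI setting).
    """
    A = X.T @ X / noise_var + prior_precision
    beta = np.linalg.solve(A, X.T @ y / noise_var)
    return beta, A

def combine(local_results, prior_precision):
    """Combine the local summaries; no patient-level data is used here."""
    n_centers = len(local_results)
    # Each center counted the prior once, so remove the surplus copies.
    A_comb = sum(A for _, A in local_results) - (n_centers - 1) * prior_precision
    rhs = sum(A @ beta for beta, A in local_results)
    return np.linalg.solve(A_comb, rhs)

rng = np.random.default_rng(0)
p = 5
true_beta = rng.normal(size=p)
prior_precision = 0.1 * np.eye(p)

# Three simulated centers of different sizes (synthetic data only).
centers = []
for n in (40, 60, 80):
    X = rng.normal(size=(n, p))
    y = X @ true_beta + rng.normal(size=n)
    centers.append((X, y))

# Each center shares only its estimate and curvature matrix.
local_results = [local_analysis(X, y, prior_precision) for X, y in centers]
beta_federated = combine(local_results, prior_precision)

# Reference analysis on the pooled data, which in practice is unavailable.
X_all = np.vstack([X for X, _ in centers])
y_all = np.concatenate([y for _, y in centers])
beta_pooled, _ = local_analysis(X_all, y_all, prior_precision)

print(np.allclose(beta_federated, beta_pooled))
```

In this toy case the federated and pooled estimates agree up to floating-point error. For nonlinear models such as logistic or survival regression, a combination of this kind would hold only approximately.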

Decision support for medicine

PHRASE/AISN

Cognitive and neuromotor impairment caused by stroke is an increasing burden on society worldwide, in terms of both quality of life and healthcare costs. Methodologies for intelligent and personalized stroke rehabilitation aim to reduce the long-term disability of patients and the economic burden on healthcare.

Overfitting correction in medical inference

When data sets contain too few patients relative to the number of measurements per patient, statistical methods are known to start 'overfitting', i.e. to misinterpret noise as signal. This leads to unreliable predictions, and implies that much expensive data remains under-used (see the description of the BFI method above).

We have shown, using mathematical tools from statistical physics (the replica method), that for a large family of statistical models it is possible to predict the relation between the incorrect inferences of an overfitting model and what would have been reported in the absence of overfitting. Our results, illustrated by application to linear, logistic, and Cox regression, enable us to correct statistical inferences in the hitherto forbidden overfitting regime.
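The simplest instance of such a relation is classical and needs no replica calculation. In ordinary least-squares linear regression with n patients and p covariates, the maximum likelihood estimate of the noise variance is on average too small by a factor 1 - p/n, so dividing by that factor removes the bias. The sketch below checks this on synthetic data. It is only an illustration of the kind of correction meant here, not the replica-based analysis, and all variable names and parameter values are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, true_var = 400, 200, 1.0   # p/n = 0.5: far from the regime p << n
ratio = p / n

ml_estimates = []
for _ in range(20):              # average over synthetic data sets
    X = rng.normal(size=(n, p))
    beta = rng.normal(size=p) / np.sqrt(p)
    y = X @ beta + rng.normal(scale=np.sqrt(true_var), size=n)
    # Ordinary least-squares fit.
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta_hat
    # Maximum likelihood noise variance: residual sum of squares over n.
    ml_estimates.append(residuals @ residuals / n)

ml_var = float(np.mean(ml_estimates))
corrected_var = ml_var / (1.0 - ratio)   # undo the known bias factor 1 - p/n

print(f"true variance:      {true_var:.3f}")
print(f"ML estimate:        {ml_var:.3f}")
print(f"corrected estimate: {corrected_var:.3f}")
```

With p/n = 0.5 the uncorrected estimate comes out at roughly half the true noise variance, i.e. the overfitting model mistakes part of the noise for signal, while the corrected value lies close to the truth.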

Responder Identification

Conventional medical statistics methods only quantify the average cohort-wide benefit of a drug. Hence, many drugs are licensed even if they benefit only a subset of the patients, and clinical trials fail if a detrimental effect in one patient subgroup cancels the beneficial effect in another. This avoidable situation causes unnecessary patient harm, reduced availability of drugs, and waste of resources.

We recently developed and implemented a new Bayesian statistical method that focuses on identifying responder subgroups in clinical trial data. In doing so, it can rescue and prevent failed clinical trials, and increase the response rates of drugs through better targeting. We focus on cancer trials, where the problems are particularly acute. The benefits of responder subgroup identification, however, apply similarly to other diseases. Our method has already been tested successfully on two failed phase III cancer trials.