Big data and machine learning: hope or hype?

Big data and machine learning offer prospects of earlier disease detection, better treatment outcomes and more personalized medicine. But there are challenges in understanding the assumptions made by artificial intelligence, in ensuring the privacy of sensor-derived data, and in choosing the strategies used to analyze such data.

Big data in psychiatry sits at the intersection of three “megatrends”: rising spending on neuroscience research; machine learning, which has advanced massively over the past decade; and personalized medicine, with prediction and treatment increasingly expected to take place at the level of the individual during this century.

This was how Danilo Bzdok (McGill University, Montreal, Quebec, Canada) introduced the discussion at ECNP 2020 Virtual. The potential is real, but he also identified challenges. The first is a trade-off between how easily an artificial intelligence model can be interpreted and how accurate it is.

Earlier identification of the effects of therapy is an achievable goal

Secondly, in psychiatry over the past few years, there has been a paradoxical phenomenon: as datasets have increased in size, the tools derived from them have not increased in predictive power. It is not clear why this is the case.


Potential breakthroughs, potential pitfalls

Maria Faurholt-Jepsen (Copenhagen Affective Disorder Research Center, Denmark) believes that real-time collection of objective data – as opposed to the answers given by patients, often retrospectively, to paper-based questionnaires – provides an exciting opportunity to better understand psychiatric problems. This is especially so when limited insight is part of the disorder.

One of her hopes is that objective data collected from a variety of sensors will allow the effects of an intervention to be investigated at an earlier stage than is otherwise possible.

But she too acknowledged challenges, one being the decision whether strategies for analyzing data should be hypothesis-driven or hypothesis-generating. There are potential breakthroughs but also potential pitfalls, she noted.


Hidden patterns in complex data

Janaina Mourao-Miranda (Centre for Medical Image Computing, University College London, UK) argued that machine learning can discover hidden patterns in complex, multimodal and longitudinal data. This offers the promise of earlier disease detection, better treatment outcomes through the identification of patient subgroups, and more personalized medicine.

But she also noted that most models are based on associations rather than causality, and that we should not treat them as a “black box” but seek to understand why they predict what they do and the underlying assumptions.

There is hype, but the hope is bigger, she argued.


Hunger for data

There are also ethical challenges. Can people generate masses of data – perhaps for a lifetime – without that feeling intrusive? Can anonymization be assured when requested?

The hype is not the problem: it is more the hunger for data, which is being bought up by companies worldwide, argued Bart de Witte (Digital Health Academy, Berlin, Germany). But, he said, data is not a commodity. Data is people. Data is human life.

He made a plea that big data in medicine should not be privatized, and neither should the discoveries derived from it. And he argued for public access to data and the knowledge it brings, much as there is access to open-source software.

Our correspondent’s highlights from the symposium are meant as a fair representation of the scientific content presented. The views and opinions expressed on this page do not necessarily reflect those of Lundbeck.