    Altuncu MT, Mayer E, Yaliraki SN, Barahona M et al., 2018,

    From Text to Topics in Healthcare Records: An Unsupervised Graph Partitioning Methodology

    Electronic Healthcare Records contain large volumes of unstructured data, including extensive free text. Yet this source of detailed information often remains under-used because of a lack of methodologies to extract interpretable content in a timely manner. Here we apply network-theoretical tools to analyse free text in Hospital Patient Incident reports from the National Health Service, to find clusters of documents with similar content in an unsupervised manner at different levels of resolution. We combine deep neural network paragraph vector text-embedding with multiscale Markov Stability community detection applied to a sparsified similarity graph of document vectors, and showcase the approach on incident reports from Imperial College Healthcare NHS Trust, London. The multiscale community structure reveals different levels of meaning in the topics of the dataset, as shown by descriptive terms extracted from the clusters of records. We also compare a posteriori against hand-coded categories assigned by healthcare personnel, and show that our approach outperforms LDA-based models. Our content clusters exhibit good correspondence with two levels of hand-coded categories, yet they also provide further medical detail in certain areas and reveal complementary descriptors of incidents beyond the external classification taxonomy.

    Altuncu MT, Yaliraki SN, Barahona M, 2018,

    Content-driven, unsupervised clustering of news articles through multiscale graph partitioning

    The explosion in the amount of news and journalistic content being generated across the globe, coupled with extended and instantaneous access to information through online media, makes it difficult and time-consuming to monitor news developments and opinion formation in real time. There is an increasing need for tools that can pre-process, analyse and classify raw text to extract interpretable content; specifically, identifying topics and content-driven groupings of articles. We present here such a methodology that brings together powerful vector embeddings from Natural Language Processing with tools from Graph Theory that exploit diffusive dynamics on graphs to reveal natural partitions across scales. Our framework uses a recent deep neural network text analysis methodology (Doc2vec) to represent text in vector form and then applies a multi-scale community detection method (Markov Stability) to partition a similarity graph of document vectors. The method allows us to obtain clusters of documents with similar content, at different levels of resolution, in an unsupervised manner. We showcase our approach with the analysis of a corpus of 9,000 news articles published by Vox Media over one year. Our results show consistent groupings of documents according to content without a priori assumptions about the number or type of clusters to be found. The multilevel clustering reveals a quasi-hierarchy of topics and subtopics with increased intelligibility and improved topic coherence as compared to external taxonomy services and standard topic detection methods.

    Arnaudon A, 2018,

    Structure preserving noise and dissipation in the Toda lattice

    , Journal of Physics A: Mathematical and Theoretical, Vol: 51, ISSN: 1751-8113

    In this paper, we use Flaschka's change of variables of the open Toda lattice and its interpretation in terms of the group structure of the LU factorisation as a coadjoint motion on a certain dual of Lie algebra to implement a structure preserving noise and dissipation. Both preserve the structure of coadjoint orbit, that is the space of symmetric tri-diagonal matrices and arise as a new type of multiplicative noise and nonlinear dissipation of the Toda lattice. We investigate some of the properties of these deformations and in particular the continuum limit as a stochastic Burgers equation with a nonlinear viscosity. This work is meant to be exploratory, and opens more questions than we can answer with simple mathematical tools and without numerical simulations.

    Arnaudon A, Ganaba N, Holm DD, 2018,

    The stochastic energy-Casimir method

    , Comptes Rendus Mécanique, Vol: 346, Pages: 279-290, ISSN: 1631-0721

    Arnaudon A, Holm D, Sommer S, 2018,

    String methods for stochastic image and shape matching

    , Journal of Mathematical Imaging and Vision, Vol: 60, Pages: 953-967, ISSN: 0924-9907

    Matching of images and analysis of shape differences is traditionally pursued by energy minimization of paths of deformations acting to match the shape objects. In the large deformation diffeomorphic metric mapping (LDDMM) framework, iterative gradient descents on the matching functional lead to matching algorithms informally known as Beg algorithms. When stochasticity is introduced to model stochastic variability of shapes and to provide more realistic models of observed shape data, the corresponding matching problem can be solved with a stochastic Beg algorithm, similar to the finite-temperature string method used in rare event sampling. In this paper, we apply a stochastic model compatible with the geometry of the LDDMM framework to obtain a stochastic model of images and we derive the stochastic version of the Beg algorithm which we compare with the string method and an expectation-maximization optimization of posterior likelihoods. The algorithm and its use for statistical inference is tested on stochastic LDDMM landmarks and images.

    Arnaudon A, Holm DD, Sommer S, 2018,

    A Geometric Framework for Stochastic Shape Analysis

    , Foundations of Computational Mathematics, ISSN: 1615-3375

    We introduce a stochastic model of diffeomorphisms, whose action on a variety of data types descends to stochastic evolution of shapes, images and landmarks. The stochasticity is introduced in the vector field which transports the data in the large deformation diffeomorphic metric mapping framework for shape analysis and image registration. The stochasticity thereby models errors or uncertainties of the flow in following the prescribed deformation velocity. The approach is illustrated in the example of finite-dimensional landmark manifolds, whose stochastic evolution is studied both via the Fokker–Planck equation and by numerical simulations. We derive two approaches for inferring parameters of the stochastic model from landmark configurations observed at discrete time points. The first of the two approaches matches moments of the Fokker–Planck equation to sample moments of the data, while the second approach employs an expectation-maximization based algorithm using a Monte Carlo bridge sampling scheme to optimise the data likelihood. We derive and numerically test the ability of the two approaches to infer the spatial correlation length of the underlying noise.

    Attard M, Dawes T, Simoes Monteiro de Marvao A, Biffi C, Shi W, Wharton J, Rhodes C, Ghataorhe P, Gibbs J, Howard L, Rueckert D, Wilkins M, O'Regan D et al., 2018,

    Metabolic pathways associated with right ventricular adaptation to pulmonary hypertension: three dimensional analysis of cardiac magnetic resonance

    , EHJ Cardiovascular Imaging / European Heart Journal - Cardiovascular Imaging, ISSN: 2047-2412

    Biffi C, de Marvao A, Attard MI, Dawes TJW, Whiffin N, Bai W, Shi W, Francis C, Meyer H, Buchan R, Cook SA, Rueckert D, O'Regan DP et al., 2018,

    Three-dimensional cardiovascular imaging-genetics: a mass univariate framework

    , Bioinformatics, Vol: 34, Pages: 97-103, ISSN: 1367-4803

    Clarke J, Warren L, Darzi A, Barahona M et al., 2018,

    Guiding Interoperable Electronic Health Records Through Patient Sharing Networks

    , NPJ Digital Medicine

    Dawes TJW, Cai J, Quinlan M, de Marvao A, Ostrowski PJ, Tokarczuk PF, Watson GMJ, Wharton J, Howard LSGE, Gibbs JSR, Cook SA, Wilkins MR, O'Regan DP et al., 2018,

    Fractal Analysis of Right Ventricular Trabeculae in Pulmonary Hypertension

    , Radiology, Vol: 288, Pages: 386-395, ISSN: 0033-8419

