- Risk prediction for Cardiovascular Disease using ECG Data in the China Kadoorie Biobank, Yangting Shen et al
- Data infrastructure, QMUL https://twitter.com/AccessAiNews/status/908263527135285248
- Machine learning workflow, GSK https://twitter.com/AccessAiNews/status/908248310820913152
The afternoon presentations came from groups interested in finding investors or additional investment.
Oxford Uni, John Fox
Biomedical data is dirty, non-standard and not easily coded, which raises the question of how to formalise knowledge.
His group changed how multidisciplinary clinical meetings work, with a system built for the Royal Free to make recommendations. It tries to capture data in a structured way. This is not NLP: the tools allow manual input of logic. The result is executable models of care, e.g. a chemotherapy regimen for breast cancer patients or a thyroid pathway. A machine can follow the model, with the potential to give advice. They want to give clinicians tools to publish their own pathways on an open clinical repository. A further question is how to capture data so that new results can be analysed, i.e. rapid learning systems.
This is work in progress: big data means bigger noise. Machine learning techniques can pick out interesting signals in the data, so they are applying different techniques and mapping the results to clinical terms and concepts. Symbolic learning is one example, e.g. at Imperial College (not in healthcare).
Some knowledge representation is computable.
The CREDO programme is described in the Journal of Biomedical Science.
DeskGen, Edward Perello
There are 4-5 million variants per genome.
DeskGen provides design pipelines to investors interested in targets.
The problem is how to design a CRISPR guide that cuts on target, which they treat as prediction ranking based on a large number of variables. They reviewed rules from different academic labs.
We don't know everything about which genes are essential for survival in all circumstances. They try to predict the rank of guides, using the Spearman coefficient, and then analyse predicted vs actual performance.
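To make the Spearman step concrete, here is a minimal sketch of comparing predicted guide scores with measured activity by rank. The guide scores are made-up illustrative numbers, and the function names are my own, not DeskGen's; this is only the standard rank correlation, not their pipeline.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores for five guides (illustrative values only).
predicted = [0.91, 0.40, 0.75, 0.10, 0.62]
measured = [0.85, 0.35, 0.50, 0.20, 0.70]
rho = spearman(predicted, measured)
print(round(rho, 3))
```

A value near 1 means the predicted ordering of guides closely matches the measured ordering, even if the raw scores are on different scales.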
Basepaws, Anna Skaya
Basepaws sells a cat genetic testing kit and wants to create personalised pet products. They have 2000 cat samples from owners around the world.
The cat genome wasn't available until after 2014. They build multiagent models based on phenotypes. Cats were described as the closest mammal to humans outside of primates, so they think modelling may be easier.
There is an issue of owners vs vets collecting phenotypic data. Vets want the accuracy of the test and its predictions to improve. Basepaws is in its first year of operation.
Cambridge Uni, Jose Miguel Lobato
His interests are in Bayesian optimisation and neural networks. Current research includes models recognising / generating images and sounds from musical instruments.
Molecules can be encoded using SMILES. He described a discrete generative model, a Variational Autoencoder (CVAE), citing Gómez-Bombarelli et al. 2016, and using a context-free grammar to capture constraints. This represents the data as grammar production rules.
It was additionally tested with symbolic regression, using Bayesian optimisation to search.
What will the syntax show about molecules? The approach can suggest potential molecules with useful properties that are easy to synthesise, taking account of many constraints.
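As a rough illustration of what "encoding molecules using SMILES" means in practice, the sketch below turns SMILES strings into one-hot matrices, the kind of discrete input a character-level autoencoder could consume. It is a simplification under my own assumptions: it treats each character as a token (so two-letter atoms such as Cl would be split), and it does not implement the grammar production rules from the talk. The function names are hypothetical.

```python
def build_vocab(smiles_list):
    """Map each distinct character in the SMILES strings to an index."""
    chars = sorted({ch for smiles in smiles_list for ch in smiles})
    return {ch: i for i, ch in enumerate(chars)}


def one_hot(smiles, vocab):
    """Encode a SMILES string as one row per character, one column per symbol."""
    rows = []
    for ch in smiles:
        row = [0] * len(vocab)
        row[vocab[ch]] = 1
        rows.append(row)
    return rows


def decode(rows, vocab):
    """Invert one_hot by picking the symbol whose column is set in each row."""
    index_to_char = {i: ch for ch, i in vocab.items()}
    return "".join(index_to_char[row.index(1)] for row in rows)


# Ethanol, formaldehyde and benzene written as SMILES strings.
molecules = ["CCO", "C=O", "c1ccccc1"]
vocab = build_vocab(molecules)
encoded = one_hot("CCO", vocab)
print(vocab)
print(encoded)
print(decode(encoded, vocab))
```

A plain character encoding like this lets a generative model emit strings that are not valid molecules, which is the motivation for adding grammar constraints.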
The next and final post will be personal reflections.