Benevolentai Technology Limited

United Kingdom


1-64 of 64 for Benevolentai Technology Limited
Aggregations
Jurisdiction
        World 34
        United States 30
Date
2025 August 1
2025 (YTD) 2
2024 5
2023 12
2022 14
IPC Class
G06N 20/00 - Machine learning 11
G06N 5/02 - Knowledge representation; Symbolic representation 11
G16H 50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients 11
G16C 20/70 - Machine learning, data mining or chemometrics 10
G16B 15/30 - Drug targeting using structural data; Docking or binding prediction 8
Status
Pending 24
Registered / In Force 40

1.

METHOD FOR IDENTIFYING OFF-TARGET PROTEINS

      
Application Number 19048296
Status Pending
Filing Date 2025-02-07
First Publication Date 2025-08-14
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Potterton, Andrew
  • Meyers, Joshua
  • Do Canto Angonese, Bibiana

Abstract

A computer-implemented method for identifying off-target proteins comprises: receiving an indication of a first protein comprising residues of interest for targeting; receiving data indicative of a first whole protein sequence corresponding to the first protein; comparing the first whole protein sequence against a protein sequence database to identify whole protein sequences of other proteins having a threshold level of sequence resemblance to the first whole protein sequence; performing multiple sequence alignment on the other whole protein sequences with respect to the first whole protein sequence; identifying residues within each of the aligned whole protein sequences which positionally correspond with the residues of interest in the first whole protein sequence; determining a measure of similarity between the first protein and each other protein; and identifying one or more of the other proteins as off-target proteins with respect to the drug target based on the measures of similarity.
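The comparison step in the abstract above can be illustrated with a minimal Python sketch. It assumes the multiple sequence alignment has already been produced, and it uses a simple identity fraction at the residues of interest as the similarity measure. The function names, the toy sequences and the 0.6 threshold are all hypothetical and are not taken from the application.

```python
from typing import Dict, List


def site_similarity(ref_aligned: str, other_aligned: str, ref_positions: List[int]) -> float:
    """Fraction of residues of interest that are identical in the other protein.

    ref_positions are 0-based indices into the ungapped reference sequence.
    """
    # Map each ungapped reference index to its column in the alignment.
    column_of = {}
    index = 0
    for column, residue in enumerate(ref_aligned):
        if residue != "-":
            column_of[index] = column
            index += 1
    matches = sum(
        1
        for p in ref_positions
        if other_aligned[column_of[p]] == ref_aligned[column_of[p]]
    )
    return matches / len(ref_positions)


def flag_off_targets(
    ref_aligned: str,
    others: Dict[str, str],
    ref_positions: List[int],
    threshold: float = 0.75,  # assumed cut-off, purely illustrative
) -> Dict[str, float]:
    """Return the proteins whose site similarity meets the threshold."""
    scores = {
        name: site_similarity(ref_aligned, seq, ref_positions)
        for name, seq in others.items()
    }
    return {name: score for name, score in scores.items() if score >= threshold}


# Toy aligned sequences (gaps shown as "-"); residues of interest are T, Y, I.
ref = "MK-TAYIA"
others = {"protB": "MKQTAYLA", "protC": "MR-TAYIA", "protD": "MK-SAFVA"}
result = flag_off_targets(ref, others, [2, 4, 5], threshold=0.6)
```

A real pipeline would likely replace the identity fraction with a substitution-matrix or structure-aware score; the sketch only shows how positional correspondence through the alignment feeds the similarity measure.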

IPC Classes

  • G16B 15/30 - Drug targeting using structural data; Docking or binding prediction
  • G16B 30/10 - Sequence alignment; Homology search
  • G16B 40/30 - Unsupervised data analysis

2.

METHOD AND SYSTEM FOR IDENTIFYING BIOLOGICAL ENTITIES FOR DRUG DISCOVERY

      
Application Number 18710948
Status Pending
Filing Date 2022-11-14
First Publication Date 2025-01-16
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor
  • Corneil, Dane Sterling
  • Wiatrak, Maciej Ludwick
  • Brayne, Angus Richard Greville
  • Subbiah, Vinay Prashanth

Abstract

A computer-implemented method of training a machine learning model to identify biological entities for drug discovery is disclosed. The method comprises providing a training data set comprising a plurality of entity-linked text sequences, each text sequence including a mention of a biological entity, where the biological entity is linked to a corresponding biological entity identifier from a set of possible biological entity identifiers; masking the mention of the biological entity within each text sequence; encoding each masked text sequence into an input representation for a machine learning model; and training a machine learning model to predict the unique entity identifier of the masked biological entity based on the input representation. The described method is able to utilise the full breadth of the rich contextual information available in the biomedical text corpus to predict new biological targets for drug discovery and avoids the restrictions intrinsic to relationship prediction using knowledge graphs. The ability to identify more promising, biologically relevant targets in an automated manner, significantly reduces the requirement of human input and reduces the failure rate in targets that are progressed in the drug delivery pipeline.
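The data-preparation steps described above (masking each entity mention and pairing the masked text with an entity identifier) might look like the following sketch. The `[MASK]` token, the character-span format and the identifiers `E1` and `E2` are assumptions made for illustration; the encoding and model-training steps are not shown.

```python
from typing import Dict, List, Tuple

MASK = "[MASK]"  # assumed mask token

# (text, (start, end) character span of the mention, entity identifier)
LinkedText = Tuple[str, Tuple[int, int], str]


def mask_mention(text: str, start: int, end: int) -> str:
    """Replace the mention at text[start:end] with the mask token."""
    return text[:start] + MASK + text[end:]


def build_examples(linked: List[LinkedText]) -> Tuple[List[Tuple[str, int]], Dict[str, int]]:
    """Turn entity-linked text into (masked text, class label) training pairs."""
    # Each distinct entity identifier becomes one class label.
    label_index = {eid: i for i, eid in enumerate(sorted({eid for _, _, eid in linked}))}
    examples = [
        (mask_mention(text, start, end), label_index[eid])
        for text, (start, end), eid in linked
    ]
    return examples, label_index


# Toy entity-linked sentences with hypothetical identifiers.
linked = [
    ("TP53 regulates apoptosis.", (0, 4), "E1"),
    ("Mutations in BRCA1 raise cancer risk.", (13, 18), "E2"),
]
examples, label_index = build_examples(linked)
```

Each masked sequence would then be encoded and passed to a classifier over the set of entity identifiers, which is the prediction task the abstract describes.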

IPC Classes

  • G16H 50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
  • G16B 15/30 - Drug targeting using structural data; Docking or binding prediction
  • G16B 40/20 - Supervised data analysis
  • G16H 70/40 - ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage

3.

METHOD FOR DETERMINING A MEASURE OF RELATIVE GENE EXPRESSION

      
Application Number 18326181
Status Pending
Filing Date 2023-05-31
First Publication Date 2024-12-05
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor
  • Mcgarvey, Alison
  • Leite, Ana
  • Rolando, Delphine
  • Rosser, Gabriel

Abstract

A computer-implemented method for determining a measure of relative gene expression is disclosed. The method comprises: receiving a plurality of gene expression datasets, wherein each gene expression dataset comprises gene expression levels for a respective sample, and wherein the plurality of gene expression datasets are all measured using a first transcriptomic platform; computing a distribution of gene expression levels across the plurality of gene expression datasets; fitting a number of Gaussian components to the distribution of gene expression levels using a Gaussian mixture model; defining, based on the fitted Gaussian components, a set of relative gene expression thresholds for the plurality of gene expression datasets; and determining a measure of relative gene expression for each of a plurality of genes across the plurality of gene expression datasets based on the set of relative gene expression thresholds.
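As a rough illustration of the thresholding idea above, the sketch below fits a two-component, one-dimensional Gaussian mixture with plain expectation-maximisation and takes the midpoint between the two component means as a single threshold. The number of components, the midpoint rule, the per-gene averaging and all names are assumptions; the abstract does not specify how thresholds are derived from the fitted components.

```python
import math
from typing import Dict, List, Tuple


def fit_two_gaussians(xs: List[float], iters: int = 50) -> Tuple[List[float], List[float], List[float]]:
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    mu = [min(xs), max(xs)]  # initialise the means at the data extremes
    var = [1.0, 1.0]
    weight = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in xs:
            dens = [
                weight[k] * math.exp(-((x - mu[k]) ** 2) / (2 * var[k]))
                / math.sqrt(2 * math.pi * var[k])
                for k in range(2)
            ]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate means, variances and weights.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = max(sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk, 1e-6)
            weight[k] = nk / len(xs)
    return mu, var, weight


def expression_threshold(levels: List[float]) -> float:
    """Assumed rule: midpoint between the two fitted component means."""
    mu, _, _ = fit_two_gaussians(levels)
    return (mu[0] + mu[1]) / 2


def relative_expression(gene_levels: Dict[str, List[float]], threshold: float) -> Dict[str, str]:
    """Label each gene by comparing its mean level across samples to the threshold."""
    return {
        gene: "high" if sum(vals) / len(vals) >= threshold else "low"
        for gene, vals in gene_levels.items()
    }


# Toy pooled expression levels from several samples on one platform.
levels = [0.1, 0.3, 0.2, 0.4, 0.0, 0.25, 8.9, 9.1, 9.3, 8.7, 9.0, 9.2]
threshold = expression_threshold(levels)
calls = relative_expression({"geneA": [0.1, 0.3], "geneB": [9.1, 8.9]}, threshold)
```

In practice a library implementation of Gaussian mixtures and more than two components would probably be used; the hand-written EM loop keeps the sketch free of dependencies.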

IPC Classes

  • G16B 25/10 - Gene or protein expression profiling; Expression-ratio estimation or normalisation
  • G16B 15/30 - Drug targeting using structural data; Docking or binding prediction

4.

METHOD FOR DETERMINING A MEASURE OF RELATIVE GENE EXPRESSION

      
Application Number GB2024051420
Publication Number 2024/246548
Status In Force
Filing Date 2024-05-31
Publication Date 2024-12-05
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Mcgarvey, Alison
  • Leite, Ana
  • Rolando, Delphine
  • Rosser, Gabriel

Abstract

A computer-implemented method for determining a measure of relative gene expression is disclosed. The method comprises: receiving a plurality of gene expression datasets, wherein each gene expression dataset comprises gene expression levels for a respective sample, and wherein the plurality of gene expression datasets are all measured using a first transcriptomic platform; computing a distribution of gene expression levels across the plurality of gene expression datasets; fitting a number of Gaussian components to the distribution of gene expression levels using a Gaussian mixture model; defining, based on the fitted Gaussian components, a set of relative gene expression thresholds for the plurality of gene expression datasets; and determining a measure of relative gene expression for each of a plurality of genes across the plurality of gene expression datasets based on the set of relative gene expression thresholds.

IPC Classes

  • G16B 25/10 - Gene or protein expression profiling; Expression-ratio estimation or normalisation
  • G16B 40/30 - Unsupervised data analysis

5.

METHOD AND SYSTEM FOR PREDICTING BIOLOGICAL ENTITIES

      
Application Number GB2024051290
Publication Number 2024/236317
Status In Force
Filing Date 2024-05-17
Publication Date 2024-11-21
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Corneil, Dane Sterling
  • Patel, Ravi
  • Brayne, Angus Richard Greville
  • Neculae, Georgiana
  • Jaroslawicz, Daniel
  • Kropiwnicki, Eryk
  • Subbiah, Vinay Prashanth

Abstract

A computer-implemented method of predicting a biological entity meeting a user-defined biological requirement using a knowledge base, the method comprising: providing an inference knowledge base comprising a corpus of textual data; receiving a user query defining a biological requirement for which a biological entity is to be predicted; obtaining, based on the query, a query sentence text describing the biological requirement and including mention of a biological entity, in which the biological entity itself is masked for prediction; selecting a candidate biological entity for the masked biological entity and retrieving a plurality of evidence sentences from the knowledge base, each evidence sentence including mention of the candidate biological entity, wherein the evidence sentences are retrieved based on computing a similarity of the query sentence to sentences within the knowledge base; inputting each training query sentence and a plurality of retrieved evidence sentences into a reasoner model, where mention of the candidate biological entity is masked in the query sentence and evidence sentences, the reasoner model trained to predict a probability that the candidate biological entity is the masked biological entity based on the retrieved evidence sentences.

IPC Classes

  • G06N 3/045 - Combinations of networks
  • G06N 3/08 - Learning methods
  • G06N 5/022 - Knowledge engineering; Knowledge acquisition
  • G16H 50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

6.

LEARNING FROM TRIAGE ANNOTATIONS

      
Application Number 18491988
Status Pending
Filing Date 2023-10-23
First Publication Date 2024-04-18
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor
  • Neil, Daniel Lawrence
  • Corneil, Dane Sterling
  • Subbiah, Vinay Prashanth
  • Hodos, Rachel

Abstract

Herein disclosed are methods and systems of SAMMI, a machine learning-based workflow that uses human annotations as labels for training models, used to predict human-based annotations for drug discovery. SAMMI receives an input to a model trained using human-annotated data, wherein the human-annotated data comprises at least one annotation associated with a triage-progressability annotation of whether to progress the input for the drug discovery. SAMMI also receives a set of features. The set of features is associated with the input, the model, and the triage-progressability of the input. The set of features is applied to the model to predict whether the input is triage-progressible. A model output is provided based on the prediction.

IPC Classes

  • G16H 70/40 - ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage
  • G06N 20/00 - Machine learning

7.

METHOD AND SYSTEM OF PREDICTING A CLINICAL OUTCOME OR CHARACTERISTIC

      
Application Number EP2023073236
Publication Number 2024/042164
Status In Force
Filing Date 2023-08-24
Publication Date 2024-02-29
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Rose, Harry
  • Farré, Anna Muñoz
  • Kothalawala, Dilini
  • Daktylidis, Antonios Poulakakis
  • Martinez, Andrea Rodriguez

Abstract

A computer-implemented method of training a machine learning model to predict a clinical outcome or characteristic based on a patient's clinical history is disclosed. The method comprises: providing training data comprising structured electronic health record data for a plurality of patients, the structured electronic health record data comprising a plurality of clinical observations, each clinical observation having a text description and an associated time stamp, wherein the training data for each patient is labelled with one or more labels, each representing a clinical outcome or characteristic; converting each patient's electronic health record data into a text sequence comprising the text descriptions concatenated in sequence of the time stamps; inputting the text sequence into a machine learning model; and training the machine learning model to predict a clinical outcome or characteristic based on the input text sequence.
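The conversion step above, in which structured observations are ordered by time stamp and their text descriptions concatenated, can be sketched as follows. The tuple layout, the separator and the example observations are hypothetical, and the model training itself is not shown.

```python
from datetime import date
from typing import List, Tuple

Observation = Tuple[date, str]  # (time stamp, text description)


def record_to_text(observations: List[Observation], sep: str = " ; ") -> str:
    """Concatenate observation descriptions in time-stamp order."""
    ordered = sorted(observations, key=lambda obs: obs[0])
    return sep.join(description for _, description in ordered)


# Toy patient record, deliberately out of chronological order.
record = [
    (date(2021, 3, 2), "chest X-ray ordered"),
    (date(2021, 3, 1), "cough"),
    (date(2021, 3, 5), "pneumonia diagnosed"),
]
sequence = record_to_text(record)
```

The resulting string is what a text-based model would receive as input, paired with the patient's outcome labels during training.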

IPC Classes

  • G16H 50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

8.

GRAPH EMBEDDING SYSTEMS AND APPARATUS

      
Application Number 18365325
Status Pending
Filing Date 2023-08-04
First Publication Date 2023-12-14
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor
  • Sim, Aaron
  • Ludwick Wiatrak, Maciej
  • Brayne, Angus Richard Greville
  • Creed, Paidi
  • Paliwal, Saee

Abstract

Methods and apparatus are provided for generating an embedding of a graph. The graph includes a plurality of nodes and each node includes a connection to another one or more of the nodes. The method includes, and/or the apparatus is configured for: receiving data representative of at least a portion of the graph; transforming the nodes of the graph into a non-Euclidean geometry; and iteratively updating an embedding model based on the transformed nodes in the non-Euclidean geometry and on a causal loss function and a link prediction function associated with the non-Euclidean geometry.

IPC Classes

9.

ENTITY SELECTION METRICS

      
Application Number 18359093
Status Pending
Filing Date 2023-07-26
First Publication Date 2023-11-16
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor
  • Griffin, Gabi
  • Litombe, Nicholas
  • Smith, Daniel Paul
  • Degiorgio, Alexander

Abstract

Embodiments of the present disclosure provide a system, apparatus and method(s) for generating a set of metrics for evaluating entities used with a predictive machine learning model, the method comprising: selecting one or more sets of entities from a data source for generating a plurality of predictions aggregated from said one or more sets of entities using one or more pre-trained predictive models; selecting a subset of predictions from the plurality of predictions based on said one or more sets of entities in relation to the data source; extracting metadata from the data source associated with the subset of predictions, where the metadata comprises entity metadata and predicted metadata; generating the set of metrics based on the metadata extracted and the subset of predictions; and outputting the set of metrics for evaluation.

IPC Classes

  • G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
  • G16B 50/30 - Data warehousing; Computing architectures
  • G16B 15/30 - Drug targeting using structural data; Docking or binding prediction

10.

SVO ENTITY INFORMATION RETRIEVAL SYSTEM

      
Application Number 17786922
Status Pending
Filing Date 2020-12-09
First Publication Date 2023-11-02
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor Fauqueur, Julien

Abstract

Methods, apparatus and systems are provided for automatically extracting entities associated with one or more domain(s) of interest from a corpus of text. A plurality of portions of text are received from the corpus of text, each portion of text comprising data representative of at least two entities and/or relationships thereto. For each received portion of text, one or more subject-verb-object (SVO) entity data item(s) are identified, each comprising data representative of at least two entities, a relationship associated with the at least two entities, a subject entity corresponding to an entity of said at least two entities, an object entity corresponding to an entity of the at least two entities, a verb portion associated with the relationship, and a direction of the relationship associated with the at least two entities. A graph structure based on the set of identified SVO entity data items is output, the graph structure comprising a graph of entity nodes and relationship edges linking the entity nodes with each relationship edge including an indication of directionality of said relationship.
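The final step above, turning identified SVO items into a directed graph, can be sketched with a plain adjacency mapping. The triple format and the example entities are hypothetical; the extraction of SVO items from text (parsing and entity recognition) is assumed to have happened already.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

SVO = Tuple[str, str, str]  # (subject entity, verb, object entity)


def build_graph(svo_items: List[SVO]) -> Dict[str, List[Tuple[str, str]]]:
    """Build a directed graph: each subject maps to its (verb, object) edges."""
    graph: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for subject, verb, obj in svo_items:
        graph[subject].append((verb, obj))  # edge direction: subject -> object
        graph.setdefault(obj, [])           # make sure the object is also a node
    return dict(graph)


# Toy SVO items with placeholder entity names.
items = [
    ("drug A", "inhibits", "protein B"),
    ("protein B", "activates", "pathway C"),
    ("drug A", "binds", "protein D"),
]
graph = build_graph(items)
```

Storing edges only on the subject side is what carries the directionality of each relationship in this sketch.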

IPC Classes

  • G06F 40/295 - Named entity recognition
  • G06F 40/242 - Dictionaries
  • G06F 16/31 - Indexing; Data structures therefor; Storage structures
  • G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
  • G16B 50/10 - Ontologies; Annotations

11.

SYSTEM OF SEARCHING AND FILTERING ENTITIES

      
Application Number 17786909
Status Pending
Filing Date 2020-12-11
First Publication Date 2023-11-02
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor
  • Lewis, Neal Ryan
  • Oechsle, Oliver

Abstract

Methods, apparatus, system and computer-implemented method(s) are provided for creating a graph of entities of interest and relationships thereto. A search query corresponding to entities of interest is received, the search query including data representative of a first set of entities. An expanded search query is generated based on inputting the received search query to one or more entity expansion process(es) or engine(s), the expanded search query including data representative of a second set of entities and the first set of entities. A graph of entities of interest and relationships thereto is created based on processing the expanded search query with data representative of a corpus of text. The graph is created by processing the expanded search query to filter an existing graph of entities of interest and relationships thereto, the existing graph having been previously generated based on the corpus of text.

IPC Classes  ?

12.

GRAPH PATTERN INFERENCE

      
Application Number 18007391
Status Pending
Filing Date 2021-07-21
First Publication Date 2023-10-05
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor
  • Hodos, Rachel
  • Briody, Joss
  • Aponte, David
  • Corneil, Dane Sterling
  • Smith, Daniel Paul

Abstract

A computer-implemented method of querying a graph to assess relationships amongst graph nodes comprises determining a query node on the graph, identifying one or more target nodes on the graph in relation to the query node based on a set of connectivity patterns; generating graph-based statistics for each target node of the one or more target nodes, wherein the graph-based statistics are extracted for subgraphs associated with each target node and the query node; and assessing the graph-based statistics of each target node to determine predicted relationships between the one or more target nodes and the query node.
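One way to picture the querying steps above is the sketch below. It assumes a single connectivity pattern (targets two hops from the query node that are not already direct neighbours) and a single graph-based statistic (the number of shared intermediate nodes). Both choices, and all names, are illustrative assumptions rather than the method claimed.

```python
from typing import Dict, List, Set, Tuple


def two_hop_targets(adj: Dict[str, Set[str]], query: str) -> List[Tuple[str, int]]:
    """Rank two-hop targets of the query node by number of shared intermediates."""
    direct = adj.get(query, set())
    counts: Dict[str, int] = {}
    for mid in direct:
        for target in adj.get(mid, set()):
            # Skip the query itself and nodes it is already linked to.
            if target != query and target not in direct:
                counts[target] = counts.get(target, 0) + 1
    # Highest count first; ties broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy undirected graph as an adjacency mapping.
adj = {
    "Q": {"a", "b"},
    "a": {"Q", "T1", "T2"},
    "b": {"Q", "T1"},
    "T1": {"a", "b"},
    "T2": {"a"},
}
ranked = two_hop_targets(adj, "Q")
```

A higher count is read here as stronger support for a predicted relationship between the target and the query node; richer subgraph statistics could be substituted in the same loop.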

IPC Classes

13.

ADAPTIVE DATA MODELS AND SELECTION THEREOF

      
Application Number 18040538
Status Pending
Filing Date 2021-08-04
First Publication Date 2023-09-14
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor
  • Hodos, Rachel
  • Gao, Yingkai
  • Neil, Daniel Lawrence
  • Cedoz, Pierre-Louis Maurice Valentin

Abstract

Method(s), apparatus, and system(s) are provided for selecting a data model configuration for use in training predictive models, comprising: receiving two or more data model configurations, extracting a data model for each of the two or more data model configurations from a knowledge graph, generating a separate predictive model for each of the extracted data models, scoring the output of each separate predictive model based on a benchmark data set, and selecting at least one data model configuration of the two or more data model configurations based on the output scores.

IPC Classes

  • G06N 5/02 - Knowledge representation; Symbolic representation

14.

COHORT STRATIFICATION INTO ENDOTYPES

      
Application Number 18300623
Status Pending
Filing Date 2023-04-14
First Publication Date 2023-08-17
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor
  • Martinez, Andrea
  • Poulakakis-Daktylidis, Antonios
  • Tomlinson, Hamish
  • Watcharapichat, Pijika
  • Cakiroglu, Sera Aylin

Abstract

A system for identifying a target for the treatment of a primary disease is provided. The system comprises: an input module configured to receive data for studying the primary disease, the data relating to individuals of a cohort; an encoder configured to use machine learning to encode the data as latent variables; an interpretation module configured to interpret the latent variables to stratify the individuals of the cohort into endotypes of the primary disease; and an identification module configured to identify a target that is associated with one of the endotypes.

IPC Classes

  • G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
  • G16H 70/60 - ICT specially adapted for the handling or processing of medical references relating to pathologies
  • G16H 10/60 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records

15.

DISTRIBUTIONS OVER LATENT POLICIES FOR HYPOTHESIZING IN NETWORKS

      
Application Number 18193722
Status Pending
Filing Date 2023-03-31
First Publication Date 2023-08-03
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor
  • Neil, Daniel Lawrence
  • Corneil, Dane Sterling

Abstract

Embodiments of the present disclosure provide a system, apparatus and method(s) for determining one or more target nodes and associated paths from a query of a graph structure. The method receives the query to the graph structure, where the query comprises a data representation of at least one query node. The method identifies one or more target nodes in response to the query based on a policy network, where the policy network is configured to determine the one or more target nodes in accordance with a latent policy distribution associated with the policy network. The method traverses the graph structure by a search in relation to the policy network, where the search is configured to navigate from the query node to the one or more identified target nodes to determine the associated paths. The method outputs a list of the one or more target nodes and the associated paths for the query, where the list is ranked in relation to the latent policy distribution.

IPC Classes  ?

16.

PATIENT STRATIFICATION USING LATENT VARIABLES

      
Application Number 17997448
Status Pending
Filing Date 2021-04-23
First Publication Date 2023-06-01
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor
  • Sim, Aaron
  • Creed, Paidi
  • Zhang, Jiajie
  • Glastonbury, Craig
  • Norvaisas, Povilas
  • Mulas, Francesca
  • Lueg, Gregor Alexander
  • Watcharapichat, Pijika

Abstract

A computer-implemented method of stratifying a population of patients into disease endotypes is provided. The method comprises: encoding data relating to the patients as latent variables; determining one or more importance measures of the latent variables; prioritising the latent variables using the importance measures; interpreting one or more of the ranked latent variables; and identifying a disease endotype that is represented by one or more of the interpreted latent variables.

IPC Classes

  • G16B 40/30 - Unsupervised data analysis
  • G16H 50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

17.

METHOD AND SYSTEM FOR IDENTIFYING BIOLOGICAL ENTITIES FOR DRUG DISCOVERY

      
Application Number GB2022052881
Publication Number 2023/089304
Status In Force
Filing Date 2022-11-14
Publication Date 2023-05-25
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Corneil, Dane Sterling
  • Wiatrak, Maciej Ludwick
  • Brayne, Angus Richard Greville
  • Subbiah, Vinay Prashanth

Abstract

A computer-implemented method of training a machine learning model to identify biological entities for drug discovery is disclosed. The method comprises providing a training data set comprising a plurality of entity-linked text sequences, each text sequence including a mention of a biological entity, where the biological entity is linked to a corresponding biological entity identifier from a set of possible biological entity identifiers; masking the mention of the biological entity within each text sequence; encoding each masked text sequence into an input representation for a machine learning model; and training a machine learning model to predict the unique entity identifier of the masked biological entity based on the input representation. The described method is able to utilise the full breadth of the rich contextual information available in the biomedical text corpus to predict new biological targets for drug discovery and avoids the restrictions intrinsic to relationship prediction using knowledge graphs. The ability to identify more promising, biologically relevant targets in an automated manner, significantly reduces the requirement of human input and reduces the failure rate in targets that are progressed in the drug delivery pipeline.

IPC Classes

  • G06N 3/042 - Knowledge-based neural networks; Logical representations of neural networks
  • G10L 15/183 - Speech classification or search using natural language modelling using context dependencies, e.g. language models
  • G06N 3/045 - Combinations of networks

18.

SELECTING A CELL LINE FOR AN ASSAY

      
Application Number 17904911
Status Pending
Filing Date 2021-02-12
First Publication Date 2023-04-13
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor
  • Sim, Aaron
  • Mulas, Francesca
  • Ojamies, Poojitha
  • Glastonbury, Craig
  • Norvaisas, Povilas
  • Creed, Paidi

Abstract

A computer-implemented method and a system of selecting a cell line for an assay are provided. The computer-implemented method and system encode data, which is comprised of one or more features, as one or more latent variables. The one or more features encoded in the one or more latent variables are identified and mapped to cell lines based on the one or more features. A relevance of one or more targets to each of one or more of the one or more latent variables is determined and the one or more targets are matched to the cell lines via the one or more latent variables.

IPC Classes

  • G16B 20/00 - ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
  • G16B 40/30 - Unsupervised data analysis

19.

PRIORITISING BIOLOGICAL TARGETS

      
Application Number 17782058
Status Pending
Filing Date 2020-11-27
First Publication Date 2023-01-19
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor Bollerman, Thomas Joseph

Abstract

A computer-implemented method of prioritising biological targets is disclosed. The method comprises: receiving a selection of classes of one or more categories; and, for each of a plurality of biological targets, determining an extent of alignment of the biological target to each selected class. The method also comprises prioritising the biological targets based on the extents of alignment; and outputting a representation of one or more prioritised biological targets.
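The prioritisation described above can be sketched as a simple scoring and ranking step. The sketch assumes each target already carries an extent of alignment between 0 and 1 for each class and that the extents are averaged over the selected classes; the averaging rule, the class names and the target names are hypothetical.

```python
from typing import Dict, List, Tuple


def prioritise(
    targets: Dict[str, Dict[str, float]],
    selected_classes: List[str],
) -> List[Tuple[str, float]]:
    """Rank targets by their mean extent of alignment to the selected classes."""
    scores = {
        name: sum(extents.get(cls, 0.0) for cls in selected_classes) / len(selected_classes)
        for name, extents in targets.items()
    }
    # Best-aligned first; ties broken by name for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy alignment extents per target and class.
targets = {
    "T1": {"kinase": 1.0, "expressed_in_liver": 0.5},
    "T2": {"kinase": 0.0, "expressed_in_liver": 1.0},
    "T3": {"kinase": 1.0},
}
ranking = prioritise(targets, ["kinase", "expressed_in_liver"])
```

Missing extents are treated as zero here, which penalises targets with no information for a selected class; a different aggregation could be swapped in without changing the overall flow.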

IPC Classes

  • G16B 5/00 - ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
  • G16B 20/00 - ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
  • G16B 40/20 - Supervised data analysis

20.

DESIGNING A MOLECULE AND DETERMINING A ROUTE TO ITS SYNTHESIS

      
Application Number 17772180
Status Pending
Filing Date 2020-10-23
First Publication Date 2022-12-22
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Segler, Marwin
  • Brown, Nathan

Abstract

A computer-implemented method of designing a molecule and determining a route to synthesise the molecule is provided. The method comprises: receiving one or more desired properties of the molecule; generating one or more candidate molecules using a first machine learning technique that uses the one or more desired properties of the molecule as an input; and for at least one candidate molecule, computing one or more routes to synthesise the candidate molecule using a second machine learning technique.

IPC Classes

  • G16C 20/10 - Analysis or design of chemical reactions, syntheses or processes
  • G16C 20/50 - Molecular design, e.g. of drugs
  • G16C 20/70 - Machine learning, data mining or chemometrics

21.

IDENTIFYING ONE OR MORE COMPOUNDS FOR TARGETING A GENE

      
Application Number 17623929
Status Pending
Filing Date 2020-06-26
First Publication Date 2022-11-17
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor Sellwood, Matthew

Abstract

A computer-implemented method of identifying a tool compound is provided. The method comprises: searching a database for first candidate compounds that each target one or more first target genes; generating a first fingerprint for each first candidate compound by: searching the database for genes associated with the first candidate compound, and predicting genes associated with the first candidate compound; and filtering the first candidate compounds using the first fingerprints to identify a first optimum compound for targeting the one or more first target genes.

IPC Classes

  • G16B 15/30 - Drug targeting using structural data; Docking or binding prediction
  • G16B 40/30 - Unsupervised data analysis
  • G16C 20/50 - Molecular design, e.g. of drugs
  • G16H 50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

22.

LEARNING FROM TRIAGE ANNOTATIONS

      
Application Number EP2022060781
Publication Number 2022/223828
Status In Force
Filing Date 2022-04-22
Publication Date 2022-10-27
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Neil, Daniel Lawrence
  • Corneil, Dane Sterling
  • Subbiah, Vinay Prashanth
  • Hodos, Rachel Anne

Abstract

Herein disclosed are methods and systems of SAMMI, a machine learning-based workflow that uses human annotations as labels for training models, used to predict human-based annotations for drug discovery. SAMMI receives an input to a model trained using human-annotated data, wherein the human-annotated data comprises at least one annotation associated with a triage-progressability annotation of whether to progress the input for the drug discovery. SAMMI also receives a set of features. The set of features is associated with the input, the model, and the triage-progressability of the input. The set of features is applied to the model to predict whether the input is triage-progressible. A model output is provided based on the prediction.

IPC Classes

  • G16H 10/20 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for electronic clinical trials or questionnaires
  • G16H 70/40 - ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage

23.

EVALUATION FRAMEWORK FOR TARGET IDENTIFICATION IN PRECISION MEDICINE

      
Application Number GB2022050440
Publication Number 2022/185028
Status In Force
Filing Date 2022-02-18
Publication Date 2022-09-09
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Degiorgio, Alex
  • Rose, Harry
  • Gurel, Meltem
  • Creed, Paidi
  • Lueg, Gregor

Abstract

A computer-implemented method for evaluating a target identification workflow in precision medicine is provided. The target identification workflow comprises: an endotype detection module configured to detect endotypes from cohort data, and a target prediction module configured to predict targets for each of the endotypes. The method comprises: mapping endotypes detected by the endotype detection module to assays; assessing targets predicted by the target prediction module for endotype specificity; and evaluating the workflow for its ability to predict endotype specific targets. It is intended that the abstract, when published, will be accompanied by Figure 6.

IPC Classes  ?

  • G16B 5/00 - ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
  • G16B 25/10 - Gene or protein expression profiling; Expression-ratio estimation or normalisation

24.

RANKING BIOLOGICAL ENTITY PAIRS BY EVIDENCE LEVEL

      
Application Number 17625113
Status Pending
Filing Date 2020-07-10
First Publication Date 2022-08-25
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Neil, Daniel Lawrence
  • Lacoste, Alix Mary Benedicte
  • Degiorgio, Alexander
  • Churcher, Ian
  • Sutherland, Russell David
  • Gao, Yingkai

Abstract

A computer-implemented method of electronically mining medical and scientific datasets to determine a ranking indicating a level of evidence for an association between two entities is disclosed. The method comprises receiving a representation of an entity pair, performing first data mining on one or more unstructured datasets to generate one or more first scores each representing an extent of association between the entities of the entity pair, and performing second data mining on one or more structured datasets to generate one or more second scores each representing an extent of association between the entities of the entity pair. The method also comprises using a classifier to determine a predicted ranking for the entity pair using the one or more first scores and the one or more second scores, and providing the predicted ranking to a user as an indication of the strength of evidence for an association between the entities of the entity pair.
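
The flow in this abstract (scores from unstructured data, scores from structured data, then a classifier that outputs a ranking) can be sketched in Python. This is only an illustration: the abstract does not say what the classifier is, so the averaging rule, the thresholds, the `predict_rank` name and the example scores below are all assumptions.

```python
from statistics import mean

def predict_rank(first_scores, second_scores, thresholds=(0.33, 0.66)):
    """Toy stand-in for the classifier: average all scores, then bucket.

    first_scores: scores mined from unstructured data (assumed in [0, 1]).
    second_scores: scores mined from structured data (assumed in [0, 1]).
    Returns 0 (weak), 1 (moderate) or 2 (strong) evidence.
    """
    combined = mean(list(first_scores) + list(second_scores))
    # Count how many thresholds the combined score meets or exceeds.
    return sum(combined >= t for t in thresholds)

# Hypothetical entity pairs with made-up scores.
pairs = {
    ("GENE_A", "DISEASE_X"): ([0.9, 0.8], [0.7]),
    ("GENE_B", "DISEASE_X"): ([0.1], [0.2, 0.3]),
}
ranking = {pair: predict_rank(*scores) for pair, scores in pairs.items()}
```

In practice the bucketing rule would be replaced by a trained classifier; the sketch only shows how the two score families feed a single predicted ranking.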

IPC Classes  ?

  • G16H 10/20 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for electronic clinical trials or questionnaires
  • G16H 50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
  • G06N 5/02 - Knowledge representation; Symbolic representation
  • G06N 3/02 - Neural networks

25.

GRAPH EMBEDDING SYSTEMS AND APPARATUS

      
Application Number GB2021051322
Publication Number 2022/167774
Status In Force
Filing Date 2021-05-28
Publication Date 2022-08-11
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Sim, Aaron
  • Ludwik Wiatrak, Maciej
  • Brayne, Angus
  • Creed, Paidi
  • Paliwal, Saee

Abstract

Methods and apparatus are provided for generating an embedding of a graph. The graph includes a plurality of nodes and each node includes a connection to one or more other nodes. The method includes, and/or the apparatus is configured for: receiving data representative of at least a portion of the graph; transforming the nodes of the graph into a non-Euclidean geometry; and iteratively updating an embedding model based on the transformed nodes in the non-Euclidean geometry, using a causal loss function and a link prediction function associated with the non-Euclidean geometry.

IPC Classes  ?

  • G06N 5/02 - Knowledge representation; Symbolic representation
  • G06N 7/00 - Computing arrangements based on specific mathematical models
  • G06N 20/00 - Machine learning

26.

ENTITY SELECTION METRICS

      
Application Number GB2022050130
Publication Number 2022/162343
Status In Force
Filing Date 2022-01-18
Publication Date 2022-08-04
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Griffin, Gabi
  • Litombe, Nicholas
  • Smith, Daniel
  • Degiorgio, Alexander

Abstract

Embodiments of the present disclosure provide a system, apparatus and method(s) for generating a set of metrics for evaluating entities used with a predictive machine learning model, the method comprising: selecting one or more sets of entities from a data source for generating a plurality of predictions aggregated from said one or more sets of entities using one or more pre-trained predictive models; selecting a subset of predictions from the plurality of predictions based on said one or more sets of entities in relation to the data source; extracting metadata from the data source associated with the subset of predictions, where the metadata comprises entity metadata and predicted metadata; generating the set of metrics based on the metadata extracted and the subset of predictions; and outputting the set of metrics for evaluation.

IPC Classes  ?

  • G16B 5/00 - ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
  • G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
  • G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
  • G06N 5/02 - Knowledge representation; Symbolic representation
  • G06N 20/00 - Machine learning

27.

Entity type identification for named entity recognition systems

      
Application Number 17426764
Grant Number 12197867
Status In Force
Filing Date 2020-03-23
First Publication Date 2022-06-16
Grant Date 2025-01-14
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor
  • Briody, Joss
  • Iso-Sipila, Juha
  • Oechsle, Oliver
  • Togia, Theodosia

Abstract

Method(s), apparatus and system(s) are provided for entity type identification and/or disambiguation of entities within a corpus of text, the method including: receiving one or more entity results, each entity result comprising data representative of an identified entity and a location of the identified entity within the corpus of text; identifying an entity type for each entity of the received entity results by inputting text associated with the location of said each entity in the corpus of text to a trained entity type (ET) model configured for predicting or extracting an entity type of said each entity from the corpus of text; and outputting data representative of the identified entity type of each entity in the received entity results.

28.

NAME ENTITY RECOGNITION WITH DEEP LEARNING

      
Application Number 17437982
Status Pending
Filing Date 2020-03-23
First Publication Date 2022-06-16
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Iso-Sipila, Juha
  • Kruger, Felix Alexander
  • Safari, Amir
  • Togia, Theodosia

Abstract

Systems, methods and apparatus are provided for identifying entities in a corpus of text. The system comprises: a first named entity recognition (NER) system comprising one or more entity dictionaries, the first NER system configured to identify entities and/or entity types within a corpus of text based on the one or more entity dictionaries; a second NER system comprising an NER model configured for predicting entities and/or entity types within the corpus of text; and a comparison module configured for identifying entities based on comparing the entity results output from the first and second NER systems, where the identified entities are different to the entities identified by the first NER system. The system may further include an updating module configured to update the one or more entity dictionaries based on the identified entities. The system may further include a dictionary building module configured to build a set of entity dictionaries based on at least the identified entities. The system may further comprise a training module configured to generate or update the NER model by training a machine learning (ML) technique for predicting entities and/or entity types from the corpus of text using a training dataset based on data representative of the identified entities and/or entity types.
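
The comparison and dictionary-update steps described above can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the dictionary-based system is reduced to whole-token lookup, the output of the model-based system is hard-coded, and the function names and example text are hypothetical.

```python
def dictionary_ner(text, dictionary):
    """First NER system (simplified): dictionary terms found as whole tokens."""
    tokens = set(text.lower().split())
    return {term for term in dictionary if term in tokens}

def compare(dict_entities, model_entities):
    """Comparison module: entities the model found that the dictionary missed."""
    return set(model_entities) - set(dict_entities)

dictionary = {"aspirin", "ibuprofen"}
text = "Patients received aspirin or baricitinib daily"

found_by_dict = dictionary_ner(text, dictionary)
# Hypothetical output of the second (model-based) NER system.
found_by_model = {"aspirin", "baricitinib"}

new_entities = compare(found_by_dict, found_by_model)
dictionary |= new_entities  # updating module: extend the dictionary
```

The set difference captures the key idea: only entities that the dictionary-based system did not already identify are fed back into the dictionaries.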

IPC Classes  ?

  • G06F 40/295 - Named entity recognition
  • G06F 40/49 - Data-driven translation using very large corpora, e.g. the web

29.

COHORT STRATIFICATION INTO ENDOTYPES

      
Application Number GB2021052570
Publication Number 2022/079413
Status In Force
Filing Date 2021-10-05
Publication Date 2022-04-21
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Martinez, Andrea
  • Poulakakis-Daktylidis, Antonios
  • Tomlinson, Hamish
  • Watcharapichat, Pijika
  • Cakiroglu, Sera Aylin

Abstract

A system for identifying a target for the treatment of a primary disease is provided. The system comprises: an input module configured to receive data for studying the primary disease, the data relating to individuals of a cohort; an encoder configured to use machine learning to encode the data as latent variables; an interpretation module configured to interpret the latent variables to stratify the individuals of the cohort into endotypes of the primary disease; and an identification module configured to identify a target that is associated with one of the endotypes.

IPC Classes  ?

  • G16H 20/10 - ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients
  • G16H 50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
  • G06N 3/04 - Architecture, e.g. interconnection topology
  • G06N 3/08 - Learning methods

30.

DISTRIBUTIONS OVER LATENT POLICIES FOR HYPOTHESIZING IN NETWORKS

      
Application Number GB2021052431
Publication Number 2022/069868
Status In Force
Filing Date 2021-09-20
Publication Date 2022-04-07
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Neil, Daniel Lawrence
  • Corneil, Dane Sterling

Abstract

Embodiments of the present disclosure provide a system, apparatus and method(s) for determining one or more target nodes and associated paths from a query of a graph structure. The method receives the query to the graph structure, where the query comprises a data representation of at least one query node. The method identifies one or more target nodes in response to the query based on a policy network, where the policy network is configured to determine the one or more target nodes in accordance with a latent policy distribution associated with the policy network. The method traverses the graph structure by a search in relation to the policy network, where the search is configured to navigate from the query node to the one or more identified target nodes to determine the associated paths. The method outputs a list of the one or more target nodes and the associated paths for the query, where the list is ranked in relation to the latent policy distribution.

IPC Classes  ?

  • G06N 3/00 - Computing arrangements based on biological models
  • G06N 3/04 - Architecture, e.g. interconnection topology
  • G06N 3/08 - Learning methods
  • G06N 5/00 - Computing arrangements using knowledge-based models
  • G06N 5/02 - Knowledge representation; Symbolic representation

31.

ADAPTIVE DATA MODELS AND SELECTION THEREOF

      
Application Number GB2021052013
Publication Number 2022/029428
Status In Force
Filing Date 2021-08-04
Publication Date 2022-02-10
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Hodos, Rachel Anne
  • Gao, Yingkai
  • Neil, Daniel Lawrence
  • Cedoz, Pierre-Louis Maurice Valentin

Abstract

Method(s), apparatus, and system(s) are provided for selecting a data model configuration for use in training predictive models, comprising: receiving two or more data model configurations; extracting a data model for each of the two or more data model configurations from a knowledge graph; generating a separate predictive model for each of the extracted data models; scoring the output of each separate predictive model based on a benchmark data set; and selecting at least one data model configuration of the two or more data model configurations based on the output scores.
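
The selection loop in this abstract (extract, train, score, select) can be sketched as follows. The toy knowledge graph, the benchmark set and the `extract`, `train` and `score` callables are all hypothetical placeholders; the abstract does not specify how any of these steps is implemented.

```python
def select_configuration(configs, extract, train, score):
    """Score one predictive model per configuration and return the best one."""
    scores = {}
    for name, config in configs.items():
        data_model = extract(config)   # data model pulled from the knowledge graph
        model = train(data_model)      # separate predictive model per data model
        scores[name] = score(model)    # output scored against the benchmark
    best = max(scores, key=scores.get)
    return best, scores

# Toy knowledge graph of (subject, relation, object) triples (made up).
KG = [("g1", "associated_with", "d1"),
      ("g2", "expressed_in", "t1"),
      ("g3", "associated_with", "d1")]
BENCHMARK = {("g1", "d1"), ("g3", "d1")}

# Each configuration is modelled here as a set of allowed relation types.
configs = {"assoc_only": {"associated_with"}, "expr_only": {"expressed_in"}}

def extract(rels):
    return [(s, o) for s, r, o in KG if r in rels]

def train(data_model):
    return set(data_model)  # trivial "model": memorise the extracted pairs

def score(model):
    return len(model & BENCHMARK) / len(BENCHMARK)  # benchmark recall

best, scores = select_configuration(configs, extract, train, score)
```

Passing the three steps in as callables keeps the selection logic independent of any particular graph store or model type.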

IPC Classes  ?

  • G06N 5/02 - Knowledge representation; Symbolic representation
  • G06N 20/00 - Machine learning
  • G16C 20/70 - Machine learning, data mining or chemometrics

32.

GRAPH PATTERN INFERENCE

      
Application Number GB2021051868
Publication Number 2022/023707
Status In Force
Filing Date 2021-07-21
Publication Date 2022-02-03
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Hodos, Rachel
  • Briody, Joss
  • Aponte, David
  • Corneil, Dane
  • Smith, Daniel Paul

Abstract

A computer-implemented method of querying a graph to assess relationships amongst graph nodes comprises: determining a query node on the graph; identifying one or more target nodes on the graph in relation to the query node based on a set of connectivity patterns; generating graph-based statistics for each target node of the one or more target nodes, wherein the graph-based statistics are extracted for subgraphs associated with each target node and the query node; and assessing the graph-based statistics of each target node to determine predicted relationships between the one or more target nodes and the query node.

33.

MACHINE LEARNING FOR PROTEIN BINDING SITES

      
Application Number 17276675
Status Pending
Filing Date 2019-11-29
First Publication Date 2022-02-03
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Meyers, Joshua
  • Segler, Marwin
  • Simonovsky, Martin

Abstract

A computer-implemented method of training a machine learning model to learn ligand binding similarities between protein binding sites is disclosed. The method comprises inputting to the machine learning model: a representation of a first binding site; a representation of a second binding site, wherein the representations of the first and second binding sites comprise structural information; and a label comprising an indication of ligand binding similarity between the first binding site and the second binding site. The method also comprises outputting from the machine learning model a similarity indicator based on the representations of the first and second binding sites; performing a comparison between the similarity indicator and the label; and updating the machine learning model based on the comparison.

IPC Classes  ?

  • G16B 15/30 - Drug targeting using structural data; Docking or binding prediction
  • G16B 40/20 - Supervised data analysis

34.

PATIENT STRATIFICATION USING LATENT VARIABLES

      
Application Number GB2021050998
Publication Number 2021/219980
Status In Force
Filing Date 2021-04-23
Publication Date 2021-11-04
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Sim, Aaron
  • Creed, Paidi
  • Zhang, Jiajie
  • Glastonbury, Craig
  • Norvaisas, Povilas
  • Mulas, Francesca
  • Leug, Gregor Alexander
  • Watcharapichat, Pijika

Abstract

A computer-implemented method of stratifying a population of patients into disease endotypes is provided. The method comprises: encoding data relating to the patients as latent variables; determining one or more importance measures of the latent variables; prioritising the latent variables using the importance measures; interpreting one or more of the ranked latent variables; and identifying a disease endotype that is represented by one or more of the interpreted latent variables.

IPC Classes  ?

  • G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
  • G16H 50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

35.

AUTOMATIC QUERY CONSTRUCTION FOR KNOWLEDGE DISCOVERY

      
Application Number 17270359
Status Pending
Filing Date 2019-06-17
First Publication Date 2021-10-14
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Smith, Daniel Paul
  • Zhang, Jiajie

Abstract

A system for discovering biological knowledge patterns of interest is described. The system comprises: a receive module configured to receive information defining a base pattern and a generalised base pattern, the base pattern comprising one or more entity nodes each representing a biological entity and one or more biological relationships indicated between the nodes, the generalised base pattern being related to the base pattern by virtue of replacing at least one entity node representing a respective biological entity by an associated set node representing a set of biological entities that includes the respective biological entity; a query module configured to generate a first query portion that, in combination with the generalised base pattern, defines a first query that retrieves a first set of results including the base pattern; and a control module configured to cause the query module to generate a second query portion that, in combination with the first query, defines a second query that retrieves a second set of results including the base pattern.

36.

Hierarchical relationship extraction

      
Application Number 17268124
Grant Number 11886822
Status In Force
Filing Date 2019-09-26
First Publication Date 2021-10-07
Grant Date 2024-01-30
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor
  • Creed, Paidi
  • Jin Sim, Aaron Jefferson Khey

Abstract

Methods, apparatus and systems, including a computer-implemented method, are provided for embedding a portion of text describing one or more entities of interest and a relationship. The portion of text describes a relationship for the one or more entity(ies) of interest and includes multiple separable entities describing the relationship and the entity(ies). The multiple separable entities include the one or more entity(ies) of interest and one or more relationship entity(ies). A set of embeddings for each of the separable entities is generated, where the set of embeddings for a separable entity includes an embedding for the separable entity and an embedding for at least one entity associated with the separable entity. One or more composite embeddings may be formed based on at least one embedding from each of the sets of embeddings. The composite embedding(s) may be sent for input to a machine learning model or classifier.

37.

SELECTING A CELL LINE FOR AN ASSAY

      
Application Number GB2021050342
Publication Number 2021/170971
Status In Force
Filing Date 2021-02-12
Publication Date 2021-09-02
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Sim, Aaron
  • Mulas, Francesca
  • Ojamies, Poojitha
  • Glastonbury, Craig
  • Norvaisas, Povilas
  • Creed, Paidi

Abstract

A computer-implemented method and a system for selecting a cell line for an assay are provided. The method and system encode data, which comprises one or more features, as one or more latent variables. The one or more features encoded in the one or more latent variables are identified and mapped to cell lines based on the one or more features. A relevance of one or more targets to each of one or more of the latent variables is determined, and the one or more targets are matched to the cell lines via the one or more latent variables.

IPC Classes  ?

  • G16B 20/00 - ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
  • G16B 40/30 - Unsupervised data analysis

38.

PROTEIN FAMILIES MAP

      
Application Number GB2020053155
Publication Number 2021/123739
Status In Force
Filing Date 2020-12-09
Publication Date 2021-06-24
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor Oechsle, Oliver

Abstract

Methods, apparatus and systems are provided for identifying candidate entities of interest associated with disease selection information. A computer-implemented method includes: receiving a first set of entities that are predicted to be associated with the disease selection information; retrieving a second set of entities that are known to be associated with the disease selection information; generating a set of entity mappings between entities of the first set of entities, entities of the second set of entities, and entities of a graph structure in relation to the disease selection information, the graph structure based on an entity hierarchy, ontology or taxonomy of an entity family associated with the first and second sets of entities; linking entities from the first and second sets of entities to the graph structure based on the generated set of entity mappings; and identifying candidate entities of interest from those linked entities of the first and second sets of entities on the graph structure based on determining where each entity from the first set of entities is located on the graph structure relative to one or more entities of the second set of entities on the graph structure.

IPC Classes  ?

  • G16B 45/00 - ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks
  • G06F 16/904 - Browsing; Visualisation therefor
  • G06F 16/9038 - Presentation of query results

39.

SVO ENTITY INFORMATION RETRIEVAL SYSTEM

      
Application Number GB2020053156
Publication Number 2021/123740
Status In Force
Filing Date 2020-12-09
Publication Date 2021-06-24
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor Fauqueur, Julien

Abstract

Methods, apparatus and systems are provided for automatically extracting entities associated with one or more domain(s) of interest from a corpus of text. A plurality of portions of text are received from the corpus of text, each portion of text comprising data representative of at least two entities and/or relationships thereto. For each received portion of text, one or more subject-verb-object (SVO) entity data item(s) are identified, each comprising data representative of at least two entities, a relationship associated with the at least two entities, a subject entity corresponding to an entity of said at least two entities, an object entity corresponding to an entity of the at least two entities, a verb portion associated with the relationship, and a direction of the relationship associated with the at least two entities. A graph structure based on the set of identified SVO entity data items is output, the graph structure comprising a graph of entity nodes and relationship edges linking the entity nodes, with each relationship edge including an indication of directionality of said relationship.
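
The final step, turning SVO entity data items into a directed graph, can be shown with a brief Python sketch. It assumes the SVO items have already been extracted as (subject, verb, object) tuples; the `build_svo_graph` name and the example triples are illustrative, and the extraction itself is not shown.

```python
from collections import defaultdict

def build_svo_graph(svo_items):
    """Build a directed graph mapping each subject to its (verb, object) edges."""
    graph = defaultdict(list)
    for subject, verb, obj in svo_items:
        # Edge direction runs from subject to object, labelled by the verb.
        graph[subject].append((verb, obj))
    return dict(graph)

# Illustrative SVO items, assumed to come from an upstream extraction step.
svo_items = [
    ("TNF", "activates", "NF-kB"),
    ("aspirin", "inhibits", "COX-1"),
    ("TNF", "induces", "IL-6"),
]
graph = build_svo_graph(svo_items)
```

Storing edges on the subject side only is one simple way to keep the directionality that the abstract requires of each relationship edge.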

IPC Classes  ?

  • G06F 16/36 - Creation of semantic tools, e.g. ontology or thesauri
  • G06F 16/33 - Querying
  • G06F 16/31 - Indexing; Data structures therefor; Storage structures
  • G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

40.

SYSTEM OF SEARCHING AND FILTERING ENTITIES

      
Application Number GB2020053176
Publication Number 2021/123742
Status In Force
Filing Date 2020-12-11
Publication Date 2021-06-24
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Lewis, Neal Ryan
  • Oechsle, Oliver

Abstract

Methods, apparatus, systems and computer-implemented method(s) are provided for creating a graph of entities of interest and relationships thereto. A search query corresponding to entities of interest is received, the search query including data representative of a first set of entities. An expanded search query is generated based on inputting the received search query to one or more entity expansion process(es) or engine(s), the expanded search query including data representative of a second set of entities and the first set of entities. A graph of entities of interest and relationships thereto is created based on processing the expanded search query with data representative of a corpus of text. The graph may be created by processing the expanded search query to filter an existing graph of entities of interest and relationships thereto, the existing graph having been previously generated based on the corpus of text.

IPC Classes  ?

  • G06F 16/33 - Querying
  • G06F 16/338 - Presentation of query results
  • G06F 16/36 - Creation of semantic tools, e.g. ontology or thesauri
  • G16B 50/00 - ICT programming tools or database systems specially adapted for bioinformatics

41.

PRIORITISING BIOLOGICAL TARGETS

      
Application Number GB2020053061
Publication Number 2021/111113
Status In Force
Filing Date 2020-11-27
Publication Date 2021-06-10
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor Bollerman, Thomas Joseph

Abstract

A computer-implemented method of prioritising biological targets is disclosed. The method comprises: receiving a selection of classes of one or more categories; and, for each of a plurality of biological targets, determining an extent of alignment of the biological target to each selected class. The method also comprises prioritising the biological targets based on the extents of alignment; and outputting a representation of one or more prioritised biological targets.
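
A minimal Python sketch of this prioritisation follows. The abstract does not define how the "extent of alignment" is computed, so the fraction-of-selected-classes measure, the `prioritise` name and the example targets and classes are assumptions made for illustration.

```python
def prioritise(targets, selected_classes):
    """Rank targets by the fraction of selected classes they align with."""
    def alignment(classes):
        return len(classes & selected_classes) / len(selected_classes)

    scored = {name: alignment(classes) for name, classes in targets.items()}
    order = sorted(scored, key=scored.get, reverse=True)
    return order, scored

# Hypothetical targets annotated with the classes they belong to.
targets = {
    "T1": {"kinase", "brain_expressed"},
    "T2": {"kinase"},
    "T3": {"gpcr"},
}
selected = {"kinase", "brain_expressed"}
order, scored = prioritise(targets, selected)
```

Here `order` lists target names from most to least aligned, and `scored` keeps the per-target alignment values so the ranking can be inspected.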

IPC Classes  ?

  • G16B 5/00 - ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
  • G16B 20/00 - ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
  • G16B 50/20 - Heterogeneous data integration

42.

DESIGNING A MOLECULE AND DETERMINING A ROUTE TO ITS SYNTHESIS

      
Application Number GB2020052702
Publication Number 2021/084234
Status In Force
Filing Date 2020-10-23
Publication Date 2021-05-06
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Segler, Martin
  • Brown, Nathan

Abstract

A computer-implemented method of designing a molecule and determining a route to synthesise the molecule is provided. The method comprises: receiving one or more desired properties of the molecule; generating one or more candidate molecules using a first machine learning technique that uses the one or more desired properties of the molecule as an input; and for at least one candidate molecule, computing one or more routes to synthesise the candidate molecule using a second machine learning technique.

IPC Classes  ?

  • G16C 20/10 - Analysis or design of chemical reactions, syntheses or processes
  • G16C 20/50 - Molecular design, e.g. of drugs

43.

ENSEMBLE MODEL CREATION AND SELECTION

      
Application Number 17041528
Status Pending
Filing Date 2019-03-29
First Publication Date 2021-04-22
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Plumbley, Dean
  • Sellwood, Matthew
  • Fiscato, Marco
  • Vaucher, Alain Claude

Abstract

Method(s), apparatus and system(s) are provided for generating and using an ensemble model. The ensemble may be generated by training a plurality of models based on a plurality of datasets associated with compounds; calculating model performance statistics for each of the plurality of trained models; selecting and storing a set of optimal trained model(s) from the trained models based on the calculated model performance statistics; and forming one or more ensemble models, each ensemble model comprising multiple models from the set of optimal trained model(s). The ensemble model may be used by retrieving the ensemble model and inputting, to the ensemble model, data representative of one or more labelled dataset(s) used to generate and/or train the model(s) of the ensemble model; and receiving, from the ensemble model, output data associated with labels of the one or more labelled dataset(s).
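
The select-then-combine idea in this abstract can be sketched in Python. The sketch skips training and treats each model as an already trained callable; mean absolute error as the performance statistic, simple averaging as the ensemble rule, and the `build_ensemble` name are assumptions, not details taken from the abstract.

```python
from statistics import mean

def build_ensemble(models, validation, top_k=2):
    """Keep the top_k models by mean absolute error and average their outputs."""
    def mae(model):
        return mean(abs(model(x) - y) for x, y in validation)

    best = sorted(models, key=mae)[:top_k]     # select the best-performing models
    return lambda x: mean(m(x) for m in best)  # ensemble = average prediction

# Stand-ins for trained models (hypothetical).
models = [lambda x: 2 * x, lambda x: 2 * x + 1, lambda x: 10]
validation = [(1, 2), (2, 4), (3, 6)]

ensemble = build_ensemble(models, validation)
prediction = ensemble(5)
```

The constant model has the largest validation error and is dropped, so the ensemble averages the two remaining models.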

IPC Classes  ?

  • G06N 20/20 - Ensemble learning
  • G06K 9/62 - Methods or arrangements for recognition using electronic means

44.

Attention filtering for multiple instance learning

      
Application Number 17041533
Grant Number 12321863
Status In Force
Filing Date 2019-03-29
First Publication Date 2021-04-22
Grant Date 2025-06-03
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor
  • Creed, Paidi
  • Sim, Aaron Jefferson Khey Jin
  • Spencer, Stephen Thomas
  • Vilenius, Mikko Juhani

Abstract

Method(s), apparatus, and system(s) are provided for filtering a set of data, the set of data comprising multiple data instances by: receiving a set of scores for the set of data; determining attention filtering information based on prior knowledge of one or more relationships between the data instances in said set of data and calculating attention relevancy weights corresponding to the data instances and the set of scores; and providing the attention filtering information to a machine learning, ML, technique or ML model.

IPC Classes  ?

  • G06N 5/022 - Knowledge engineering; Knowledge acquisition
  • G06F 16/901 - Indexing; Data structures therefor; Storage structures
  • G06F 17/16 - Matrix or vector computation
  • G06N 3/04 - Architecture, e.g. interconnection topology
  • G06N 20/00 - Machine learning

45.

MOLECULAR DESIGN USING REINFORCEMENT LEARNING

      
Application Number 17041573
Status Pending
Filing Date 2019-03-29
First Publication Date 2021-03-25
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Plumbley, Dean
  • Segler, Marwin Hans Siegfried

Abstract

Method(s), apparatus and system(s) are provided for designing a compound exhibiting one or more desired property(ies) using a machine learning (ML) technique. This may be achieved by generating a second compound using the ML technique to modify a first compound based on the desired property(ies) and a set of rules for modifying compounds; scoring the second compound based on the desired property(ies); determining whether to repeat the generating step based on the scoring; and updating the ML technique based on the scoring prior to repeating the generating step.

IPC Classes  ?

  • G16C 20/50 - Molecular design, e.g. of drugs
  • G16C 20/70 - Machine learning, data mining or chemometrics

46.

Graph neural networks with attention

      
Application Number 17041625
Grant Number 12106217
Status In Force
Filing Date 2019-05-16
First Publication Date 2021-03-18
Grant Date 2024-10-01
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor
  • Creed, Paidi
  • Sim, Aaron
  • Alamdari, Amir
  • Briody, Joss
  • Neil, Daniel
  • Lacoste, Alix

Abstract

Methods and apparatus are provided for generating a graph neural network (GNN) model based on an entity-entity graph. The entity-entity graph comprises a plurality of entity nodes in which each entity node is connected to one or more entity nodes of the plurality of entity nodes by one or more corresponding relationship edges. The method comprises: generating an embedding based on data representative of the entity-entity graph for the GNN model, wherein the embedding comprises an attention weight assigned to each relationship edge of the entity-entity graph; and updating weights of the GNN model including the attention weights by minimising a loss function associated with at least the embedding; wherein the attention weights indicate the relevancy of each relationship edge between entity nodes of the entity-entity graph. The entity-entity graph may be filtered based on the attention weights of a trained GNN model. The filtered entity-entity graph may be used to update the GNN model or train another GNN model. The trained GNN model may be used to predict a link relationship between a first entity and a second entity associated with the entity-entity graph.
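
The graph-filtering step mentioned in this abstract can be illustrated with a few lines of Python. The sketch assumes the attention weights have already been learned by a trained GNN and are available as a mapping from edge to weight; the `filter_graph` name, the threshold and the example edges are hypothetical.

```python
def filter_graph(edges, attention, threshold=0.5):
    """Keep relationship edges whose attention weight meets the threshold."""
    # Edges without a recorded weight are treated as irrelevant (weight 0.0).
    return [edge for edge in edges if attention.get(edge, 0.0) >= threshold]

# Edges as (head entity, relation, tail entity); weights are made up.
edges = [
    ("geneA", "associated_with", "diseaseX"),
    ("geneB", "mentioned_with", "diseaseX"),
    ("geneC", "associated_with", "diseaseY"),
]
attention = {
    edges[0]: 0.92,
    edges[1]: 0.08,
}
filtered = filter_graph(edges, attention)
```

The filtered edge list could then be used to retrain or update a model, as the abstract suggests; that step is not shown here.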

IPC Classes

  • G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
  • G06F 17/16 - Matrix or vector computation
  • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
  • G06N 3/047 - Probabilistic or stochastic networks
  • G06N 3/08 - Learning methods
  • G06N 3/082 - Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
  • G06N 5/02 - Knowledge representation; Symbolic representation

47.

Search tool using a relationship tree

      
Application Number 17041550
Grant Number 11880375
Status In Force
Filing Date 2019-03-28
First Publication Date 2021-03-11
Grant Date 2024-01-23
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor Smith, Daniel Paul

Abstract

A system for determining biological entities of interest is described. The system comprises a user input module configured to receive a search term comprising a representation of a biological entity; a search module configured to determine which biological entities of a set have a known association with the biological entity of the search term, those having a known association being results and those not having a known association being non-results, wherein biological entities of the set are related to each other by parent-child relationships in a relationship tree; and an analysis module configured to determine biological entities of interest by identifying non-results that have one or more results within a boundary in the relationship tree.
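The analysis step in this abstract can be sketched in Python as follows. The sketch assumes that the "boundary" is a maximum number of parent-child edges between a non-result and a result; the abstract does not define it, so this is one possible reading. The function name and the toy tree are hypothetical.

```python
from collections import deque

def entities_of_interest(parent_of, results, boundary=1):
    """Return non-results that have at least one result within `boundary` tree edges."""
    # Build an undirected adjacency map from the child -> parent links.
    neighbours = {}
    for child, parent in parent_of.items():
        neighbours.setdefault(child, set()).add(parent)
        neighbours.setdefault(parent, set()).add(child)

    interesting = set()
    for entity in neighbours:
        if entity in results:
            continue  # only non-results can be entities of interest
        # Breadth-first search limited to `boundary` edges from the entity.
        seen = {entity}
        queue = deque([(entity, 0)])
        while queue:
            node, dist = queue.popleft()
            if node in results:
                interesting.add(entity)
                break
            if dist < boundary:
                for nxt in neighbours[node] - seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    return interesting

# Hypothetical relationship tree (child -> parent) and search results.
PARENT_OF = {
    "kinase": "enzyme",
    "protease": "enzyme",
    "tyrosine kinase": "kinase",
    "serine kinase": "kinase",
}
RESULTS = {"tyrosine kinase"}

near = entities_of_interest(PARENT_OF, RESULTS, boundary=1)
wider = entities_of_interest(PARENT_OF, RESULTS, boundary=2)
```

With a boundary of one edge only the parent of the result is flagged; widening the boundary to two edges also flags its sibling and its grandparent.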

IPC Classes

  • G06F 16/2455 - Query execution
  • G06F 16/28 - Databases characterised by their database models, e.g. relational or object models
  • G06F 16/26 - Visual data mining; Browsing structured data
  • G06F 16/2458 - Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries

48.

ACTIVE LEARNING MODEL VALIDATION

      
Application Number 17041620
Status Pending
Filing Date 2019-03-29
First Publication Date 2021-01-28
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Plumbley, Dean
  • Segler, Marwin Hans Siegfried

Abstract

Method(s), apparatus, and computer-implemented method(s) are provided for training a machine learning (ML) technique to generate a property model for predicting whether a compound has a particular property. An iterative procedure/feedback loop may be performed for generating the property model, the procedure including: generating a prediction result list for a plurality of compounds and their association with the particular property based on the property model; validating the property model based on compounds from the prediction result list having an association with the particular property; and updating the property model based on the property model validation. The procedure/loop may be repeated using the updated property model until it is determined the property model has been validly trained. The property model validation may include selecting a shortlist of compounds, performing simulation analysis and/or laboratory analysis on the shortlist of compounds in relation to the particular property and using the simulation and/or laboratory results in updating the property model.
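The feedback loop in this abstract can be sketched with a deliberately small Python example. Everything concrete here is an assumption made for illustration: compounds are reduced to a single numeric descriptor, the property model is a threshold, the laboratory or simulation analysis is replaced by an `oracle` function, and the model is treated as validly trained once a whole shortlist is confirmed.

```python
def train_threshold(labelled):
    """Toy property model: midpoint between the largest negative and smallest positive."""
    positives = [x for x, label in labelled if label]
    negatives = [x for x, label in labelled if not label]
    return (max(negatives) + min(positives)) / 2

def active_learning_loop(pool, labelled, oracle, shortlist_size=2, max_rounds=10):
    pool, labelled = list(pool), list(labelled)
    threshold = train_threshold(labelled)
    for _ in range(max_rounds):
        # Prediction result list: compounds predicted to have the property.
        predicted = [x for x in pool if x >= threshold]
        # Shortlist: predicted positives closest to the decision boundary.
        shortlist = sorted(predicted, key=lambda x: x - threshold)[:shortlist_size]
        if not shortlist:
            break
        # Validation: stand-in for simulation and/or laboratory analysis.
        validated = [(x, oracle(x)) for x in shortlist]
        labelled.extend(validated)
        pool = [x for x in pool if x not in shortlist]
        # Update the property model with the validation results.
        threshold = train_threshold(labelled)
        if all(label for _, label in validated):
            break  # every shortlisted prediction was confirmed
    return threshold

def has_property(x):
    """Hypothetical ground truth, unknown to the model."""
    return x >= 5.0

final_threshold = active_learning_loop(
    pool=[3.0, 4.5, 5.5, 6.0, 7.0],
    labelled=[(0.0, False), (8.0, True)],
    oracle=has_property,
)
```

The first round validates the two compounds nearest the boundary, finds one false positive, and moves the threshold; the second round confirms its whole shortlist and the loop stops.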

IPC Classes

  • G16C 20/30 - Prediction of properties of chemical compounds, compositions or mixtures
  • G16C 20/70 - Machine learning, data mining or chemometrics

49.

RANKING BIOLOGICAL ENTITY PAIRS BY EVIDENCE LEVEL

      
Application Number GB2020051667
Publication Number 2021/009493
Status In Force
Filing Date 2020-07-10
Publication Date 2021-01-21
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Neil, Daniel Lawrence
  • Lacoste, Alix Mary Benedicte
  • De Giorgio, Alexander
  • Churcher, Ian
  • Sutherland, Russel David
  • Gao, Yingkai

Abstract

A computer-implemented method of electronically mining medical and scientific datasets to determine a ranking indicating a level of evidence for an association between two entities is disclosed. The method comprises receiving a representation of an entity pair, performing first data mining on one or more unstructured datasets to generate one or more first scores each representing an extent of association between the entities of the entity pair, and performing second data mining on one or more structured datasets to generate one or more second scores each representing an extent of association between the entities of the entity pair. The method also comprises using a classifier to determine a predicted ranking for the entity pair using the one or more first scores and the one or more second scores, and providing the predicted ranking to a user as an indication of the strength of evidence for an association between the entities of the entity pair.

IPC Classes

  • G16H 50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

50.

Shortlist selection model for active learning

      
Application Number 17041622
Grant Number 12094578
Status In Force
Filing Date 2019-03-29
First Publication Date 2021-01-14
Grant Date 2024-09-17
Owner BenevolentAI Technology Limited (United Kingdom)
Inventor
  • Plumbley, Dean
  • Segler, Marwin Hans Siegfried

Abstract

Method(s) and apparatus are provided for generating a selection model based on a machine learning (ML) technique, the selection model for selecting a shortlist of compounds requiring validation with a particular property. An iterative procedure or feedback loop for generating the selection model may include: receiving a prediction result list output from a property model for predicting whether a plurality of compounds are associated with a particular property and a property model score; retraining the selection model based on the property model score and/or the prediction result list; selecting a shortlist of compounds using the retrained selection model from the plurality of compounds associated with the prediction result list; sending the selected shortlist of compounds for validation with the particular property, where another ML technique is used to update the property model based on the validation; and repeating the receiving and retraining of the selection model until determining the selection model has been validly trained.

IPC Classes

  • G06F 11/30 - Monitoring
  • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
  • G06N 20/00 - Machine learning
  • G16C 20/10 - Analysis or design of chemical reactions, syntheses or processes
  • G16C 20/70 - Machine learning, data mining or chemometrics

51.

IDENTIFYING ONE OR MORE COMPOUNDS FOR TARGETING A GENE

      
Application Number GB2020051549
Publication Number 2021/005332
Status In Force
Filing Date 2020-06-26
Publication Date 2021-01-14
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor Sellwood, Matthew

Abstract

A computer-implemented method of identifying a tool compound is provided. The method comprises: searching a database for first candidate compounds that each target one or more first target genes; generating a first fingerprint for each first candidate compound by: searching the database for genes associated with the first candidate compound, and predicting genes associated with the first candidate compound; and filtering the first candidate compounds using the first fingerprints to identify a first optimum compound for targeting the one or more first target genes.
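The fingerprint-and-filter idea in this abstract can be illustrated with a short Python sketch. It assumes that a fingerprint is the union of database-listed and predicted gene associations, and that the "optimum" compound is the candidate with the fewest genes outside the target set; the abstract does not state the selection criterion, so that choice is illustrative only. All compound names, gene sets and function names are hypothetical.

```python
def fingerprint(compound, known, predicted):
    """Union of database-listed and model-predicted gene associations."""
    return known.get(compound, set()) | predicted.get(compound, set())

def best_tool_compound(targets, known, predicted):
    """Pick the candidate hitting a target gene with the fewest off-target genes."""
    candidates = [c for c, genes in known.items() if genes & targets]

    def off_target_count(compound):
        return len(fingerprint(compound, known, predicted) - targets)

    # Ties are broken alphabetically so the result is deterministic.
    return min(candidates, key=lambda c: (off_target_count(c), c))

# Hypothetical database associations and model predictions.
KNOWN = {
    "cmpd1": {"JAK1", "JAK2"},
    "cmpd2": {"JAK1"},
    "cmpd3": {"EGFR"},
}
PREDICTED = {
    "cmpd1": set(),
    "cmpd2": {"TYK2", "JAK3"},
}

choice = best_tool_compound({"JAK1"}, KNOWN, PREDICTED)
```

Here `cmpd2` looks cleaner in the database alone, but its predicted associations give it a larger off-target fingerprint, so `cmpd1` is selected.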

IPC Classes

  • G16H 50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
  • G16B 15/30 - Drug targeting using structural data; Docking or binding prediction
  • G16B 40/30 - Unsupervised data analysis
  • G16C 20/20 - Identification of molecular entities, parts thereof or of chemical compositions
  • G01N 33/48 - Biological material, e.g. blood, urine; Haemocytometers
  • C12Q 1/02 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving viable microorganisms

52.

ENTITY TYPE IDENTIFICATION FOR NAMED ENTITY RECOGNITION SYSTEMS

      
Application Number GB2020050777
Publication Number 2020/193964
Status In Force
Filing Date 2020-03-23
Publication Date 2020-10-01
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Briody, Joss
  • Iso-Sipila, Juha
  • Oechsle, Oliver
  • Togia, Theodosia

Abstract

Method(s), apparatus and system(s) are provided for entity type identification and/or disambiguation of entities within a corpus of text, the method including: receiving one or more entity results, each entity result comprising data representative of an identified entity and a location of the identified entity within the corpus of text; identifying an entity type for each entity of the received entity results by inputting text associated with the location of said each entity in the corpus of text to a trained entity type (ET) model configured for predicting or extracting an entity type of said each entity from the corpus of text; and outputting data representative of the identified entity type of each entity in the received entity results.

IPC Classes

53.

NAME ENTITY RECOGNITION WITH DEEP LEARNING

      
Application Number GB2020050779
Publication Number 2020/193966
Status In Force
Filing Date 2020-03-23
Publication Date 2020-10-01
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Iso-Sipila, Juha
  • Kruger, Felix Alexander
  • Safari, Amir
  • Togia, Theodosia

Abstract

Systems, methods and apparatus are provided for identifying entities in a corpus of text. The system comprises: a first named entity recognition (NER) system comprising one or more entity dictionaries, the first NER system configured to identify entities and/or entity types within a corpus of text based on the one or more entity dictionaries; a second NER system comprising an NER model configured for predicting entities and/or entity types within the corpus of text; and a comparison module configured for identifying entities based on comparing the entity results output from the first and second NER systems, where the identified entities are different to the entities identified by the first NER system. The system may further include an updating module configured to update the one or more entity dictionaries based on the identified entities. The system may further include a dictionary building module configured to build a set of entity dictionaries based on at least the identified entities. The system may further comprise a training module configured to generate or update the NER model by training a machine learning (ML) technique for predicting entities and/or entity types from the corpus of text using a training dataset based on data representative of the identified entities and/or entity types.
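The comparison and dictionary-update modules in this abstract can be sketched in Python. The sketch assumes a token-level, case-insensitive dictionary lookup for the first NER system and replaces the second system's model with a hard-coded set of predictions; neither is taken from the source. All function names and example entities are hypothetical.

```python
def dictionary_ner(text, dictionary):
    """First NER system: case-insensitive dictionary lookup on whitespace tokens."""
    found = set()
    for token in text.split():
        word = token.strip(".,;:()").lower()
        if word in dictionary:
            found.add((word, dictionary[word]))
    return found

def compare(dict_results, model_results):
    """Entities predicted by the model that the dictionary system did not find."""
    known = {name for name, _ in dict_results}
    return {(name, etype) for name, etype in model_results if name not in known}

def update_dictionary(dictionary, new_entities):
    """Return a copy of the dictionary extended with the newly identified entities."""
    updated = dict(dictionary)
    for name, etype in new_entities:
        updated.setdefault(name, etype)
    return updated

TEXT = "Baricitinib inhibits JAK1 and JAK2."
DICTIONARY = {"jak1": "gene"}
# Assumption: stand-in for the output of a trained NER model on TEXT.
MODEL_RESULTS = {("jak1", "gene"), ("jak2", "gene"), ("baricitinib", "drug")}

dict_results = dictionary_ner(TEXT, DICTIONARY)
new_entities = compare(dict_results, MODEL_RESULTS)
updated = update_dictionary(DICTIONARY, new_entities)
```

The entities found only by the model are the ones that extend the dictionary, and the same set could serve as training data when the model is next updated.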

IPC Classes

54.

MACHINE LEARNING FOR PROTEIN BINDING SITES

      
Application Number EP2019083188
Publication Number 2020/109608
Status In Force
Filing Date 2019-11-29
Publication Date 2020-06-04
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Meyers, Joshua
  • Segler, Marwin
  • Simonovsky, Martin

Abstract

A computer-implemented method of training a machine learning model to learn ligand binding similarities between protein binding sites is disclosed. The method comprises inputting to the machine learning model: a representation of a first binding site; a representation of a second binding site, wherein the representations of the first and second binding sites comprise structural information; and a label comprising an indication of ligand binding similarity between the first binding site and the second binding site. The method also comprises outputting from the machine learning model a similarity indicator based on the representations of the first and second binding sites; performing a comparison between the similarity indicator and the label; and updating the machine learning model based on the comparison.

IPC Classes

  • G16B 15/30 - Drug targeting using structural data; Docking or binding prediction
  • G16B 40/20 - Supervised data analysis

55.

HIERARCHICAL RELATIONSHIP EXTRACTION

      
Application Number GB2019052721
Publication Number 2020/065326
Status In Force
Filing Date 2019-09-26
Publication Date 2020-04-02
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Creed, Paidi
  • Sim, Aaron Jefferson Khey Jin

Abstract

Methods, apparatus, systems and computer-implemented methods are provided for embedding a portion of text describing one or more entities of interest and a relationship. The portion of text describes a relationship for the one or more entity(ies) of interest, where the portion of text includes multiple separable entities describing the relationship and the entity(ies). The multiple separable entities include the one or more entity(ies) of interest and one or more relationship entity(ies). A set of embeddings for each of the separable entities is generated, where the set of embeddings for a separable entity includes an embedding for the separable entity and an embedding for at least one entity associated with the separable entity. One or more composite embeddings may be formed based on at least one embedding from each of the sets of embeddings. The composite embedding(s) may be sent for input to a machine learning model or classifier.

IPC Classes

56.

AUTOMATIC QUERY CONSTRUCTION FOR KNOWLEDGE DISCOVERY

      
Application Number GB2019051673
Publication Number 2020/039159
Status In Force
Filing Date 2019-06-17
Publication Date 2020-02-27
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Smith, Daniel Paul
  • Zhang, Jiajie

Abstract

A system for discovering biological knowledge patterns of interest is described. The system comprises: a receive module configured to receive information defining a base pattern and a generalised base pattern, the base pattern comprising one or more entity nodes each representing a biological entity and one or more biological relationships indicated between the nodes, the generalised base pattern being related to the base pattern by virtue of replacing at least one entity node representing a respective biological entity by an associated set node representing a set of biological entities that includes the respective biological entity; a query module configured to generate a first query portion that, in combination with the generalised base pattern, defines a first query that retrieves a first set of results including the base pattern; and a control module configured to cause the query module to generate a second query portion that, in combination with the first query, defines a second query that retrieves a second set of results including the base pattern.

IPC Classes

  • G06F 16/2458 - Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
  • G06F 16/901 - Indexing; Data structures therefor; Storage structures
  • G06N 5/00 - Computing arrangements using knowledge-based models
  • G16C 20/70 - Machine learning, data mining or chemometrics
  • G16H 50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

57.

GRAPH NEUTRAL NETWORKS WITH ATTENTION

      
Application Number GB2019051352
Publication Number 2019/220128
Status In Force
Filing Date 2019-05-16
Publication Date 2019-11-21
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Creed, Paidi
  • Sim, Aaron
  • Alamdari, Amir
  • Briody, Joss
  • Neil, Daniel
  • Lacoste, Alix

Abstract

Methods and apparatus are provided for generating a graph neural network (GNN) model based on an entity-entity graph. The entity-entity graph comprises a plurality of entity nodes in which each entity node is connected to one or more entity nodes of the plurality of entity nodes by one or more corresponding relationship edges. The method comprises: generating an embedding based on data representative of the entity-entity graph for the GNN model, wherein the embedding comprises an attention weight assigned to each relationship edge of the entity-entity graph; and updating weights of the GNN model including the attention weights by minimising a loss function associated with at least the embedding; wherein the attention weights indicate the relevancy of each relationship edge between entity nodes of the entity-entity graph. The entity-entity graph may be filtered based on the attention weights of a trained GNN model. The filtered entity-entity graph may be used to update the GNN model or train another GNN model. The trained GNN model may be used to predict a link relationship between a first entity and a second entity associated with the entity-entity graph.

IPC Classes

  • G06N 3/04 - Architecture, e.g. interconnection topology
  • G06N 3/08 - Learning methods
  • G06N 5/02 - Knowledge representation; Symbolic representation
  • G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

58.

SEARCH TOOL FOR KNOWLEDGE DISCOVERY

      
Application Number GB2019050889
Publication Number 2019/186168
Status In Force
Filing Date 2019-03-28
Publication Date 2019-10-03
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor Smith, Daniel Paul

Abstract

A system is disclosed for searching a set of biological entities. The system comprises: a user input module configured to receive a user input comprising a representation of a biological entity; a search module configured to determine which entities of a set of biological entities are associated with the user input; a visualisation module configured to render a visualisation of multiple biological entities of the set and of parent-child relationships between them; and an overlay module configured to render an association indicator visually indicating one or more biological entities of the visualisation that are associated with the user input.

IPC Classes

  • G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems

59.

REINFORCEMENT LEARNING

      
Application Number GB2019050925
Publication Number 2019/186196
Status In Force
Filing Date 2019-03-29
Publication Date 2019-10-03
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Plumbley, Dean
  • Segler, Marwin Hans Siegfried

Abstract

Method(s), apparatus and system(s) are provided for designing a compound exhibiting one or more desired property(ies) using a machine learning (ML) technique. This may be achieved by generating a second compound using the ML technique to modify a first compound based on the desired property(ies) and a set of rules for modifying compounds; scoring the second compound based on the desired property(ies); determining whether to repeat the generating step based on the scoring; and updating the ML technique based on the scoring prior to repeating the generating step.
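The generate, score, decide and update cycle in this abstract can be sketched in Python with a toy stand-in for each part. The assumptions are substantial and are not drawn from the source: compounds are plain strings, the modification rules append a single atom symbol, the desired property is "exactly three nitrogen atoms", and the ML technique is a table of per-rule preferences sampled with softmax-style weights rather than a trained network. All names are hypothetical.

```python
import math
import random

# Hypothetical modification rules acting on a toy string "compound".
RULES = {
    "add_C": lambda s: s + "C",
    "add_N": lambda s: s + "N",
    "add_O": lambda s: s + "O",
}

def score(compound, target_n=3):
    """Toy desired property: exactly `target_n` nitrogen atoms (0 is the best score)."""
    return -abs(compound.count("N") - target_n)

def design(start, steps=50, learning_rate=0.5, seed=0):
    rng = random.Random(seed)
    prefs = {name: 0.0 for name in RULES}  # stand-in for the ML technique's parameters
    current = start
    for _ in range(steps):
        if score(current) == 0:
            break  # desired property reached, so the generating step is not repeated
        # Generate a second compound by sampling a rule according to its preference.
        names = list(prefs)
        weights = [math.exp(prefs[name]) for name in names]
        rule = rng.choices(names, weights=weights)[0]
        candidate = RULES[rule](current)
        # Score the candidate and update the preferences before the next generation.
        reward = score(candidate) - score(current)
        prefs[rule] += learning_rate * reward
        if reward > 0:
            current = candidate
    return current, prefs

final, prefs = design("CC")
```

Because only improving modifications are kept, the preference for the nitrogen-adding rule grows each time it is sampled, which makes it more likely to be chosen in later iterations.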

IPC Classes

  • G16C 20/30 - Prediction of properties of chemical compounds, compositions or mixtures
  • G16C 20/70 - Machine learning, data mining or chemometrics

60.

SEARCH TOOL USING A RELATIONSHIP TREE

      
Application Number GB2019050890
Publication Number 2019/186169
Status In Force
Filing Date 2019-03-28
Publication Date 2019-10-03
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor Smith, Daniel Paul

Abstract

A system for determining biological entities of interest is described. The system comprises a user input module configured to receive a search term comprising a representation of a biological entity; a search module configured to determine which biological entities of a set have a known association with the biological entity of the search term, those having a known association being results and those not having a known association being non-results, wherein biological entities of the set are related to each other by parent-child relationships in a relationship tree; and an analysis module configured to determine biological entities of interest by identifying non-results that have one or more results within a boundary in the relationship tree.

IPC Classes

61.

ACTIVE LEARNING MODEL VALIDATION

      
Application Number GB2019050921
Publication Number 2019/186193
Status In Force
Filing Date 2019-03-29
Publication Date 2019-10-03
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Plumbley, Dean
  • Segler, Marwin Hans Siegfried

Abstract

Method(s), apparatus, and computer-implemented method(s) are provided for training a machine learning (ML) technique to generate a property model for predicting whether a compound has a particular property. An iterative procedure/feedback loop may be performed for generating the property model, the procedure including: generating a prediction result list for a plurality of compounds and their association with the particular property based on the property model; validating the property model based on compounds from the prediction result list having an association with the particular property; and updating the property model based on the property model validation. The procedure/loop may be repeated using the updated property model until it is determined the property model has been validly trained. The property model validation may include selecting a shortlist of compounds, performing simulation analysis and/or laboratory analysis on the shortlist of compounds in relation to the particular property and using the simulation and/or laboratory results in updating the property model.

IPC Classes

  • G16C 20/30 - Prediction of properties of chemical compounds, compositions or mixtures
  • G16C 20/70 - Machine learning, data mining or chemometrics

62.

ENSEMBLE MODEL CREATION AND SELECTION

      
Application Number GB2019050923
Publication Number 2019/186194
Status In Force
Filing Date 2019-03-29
Publication Date 2019-10-03
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Plumbley, Dean
  • Sellwood, Matthew
  • Fiscato, Marco
  • Vaucher, Alain Claude

Abstract

Method(s), apparatus and system(s) are provided for generating and using an ensemble model. The ensemble may be generated by training a plurality of models based on a plurality of datasets associated with compounds; calculating model performance statistics for each of the plurality of trained models; selecting and storing a set of optimal trained model(s) from the trained models based on the calculated model performance statistics; and forming one or more ensemble models, each ensemble model comprising multiple models from the set of optimal trained model(s). The ensemble model may be used by retrieving the ensemble model and inputting, to the ensemble model, data representative of one or more labelled dataset(s) used to generate and/or train the model(s) of the ensemble model; and receiving, from the ensemble model, output data associated with labels of the one or more labelled dataset(s).
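The selection and ensembling steps in this abstract can be sketched in Python. The sketch assumes that the performance statistic is mean absolute error on a held-out validation set and that the ensemble averages its members' predictions; the abstract leaves both choices open. The models are plain functions standing in for trained models, and all names are hypothetical.

```python
from statistics import mean

def build_ensemble(models, validation, top_k=2):
    """Select the `top_k` models with the lowest mean absolute error and average them."""
    def mae(model):
        return mean(abs(model(x) - y) for x, y in validation)

    # Rank model names by validation error and keep the best ones.
    best = sorted(models, key=lambda name: mae(models[name]))[:top_k]

    def ensemble(x):
        return mean(models[name](x) for name in best)

    return best, ensemble

# Hypothetical "trained" models for a property that truly follows y = 2x.
MODELS = {
    "under": lambda x: 2 * x - 1,
    "over": lambda x: 2 * x + 1,
    "bad": lambda x: 10,
}
VALIDATION = [(0, 0), (1, 2), (2, 4)]

best, ensemble = build_ensemble(MODELS, VALIDATION)
prediction = ensemble(3)
```

The two retained models err in opposite directions, so their average recovers the true value for this input even though neither model is exact on its own.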

IPC Classes

  • G16C 20/70 - Machine learning, data mining or chemometrics

63.

SHORTLIST SELECTION MODEL FOR ACTIVE LEARNING

      
Application Number GB2019050924
Publication Number 2019/186195
Status In Force
Filing Date 2019-03-29
Publication Date 2019-10-03
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Plumbley, Dean
  • Segler, Marwin Hans Siegfried

Abstract

Method(s) and apparatus are provided for generating a selection model based on a machine learning (ML) technique, the selection model for selecting a shortlist of compounds requiring validation with a particular property. An iterative procedure or feedback loop for generating the selection model may include: receiving a prediction result list output from a property model for predicting whether a plurality of compounds are associated with a particular property and a property model score; retraining the selection model based on the property model score and/or the prediction result list; selecting a shortlist of compounds using the retrained selection model from the plurality of compounds associated with the prediction result list; sending the selected shortlist of compounds for validation with the particular property, where another ML technique is used to update the property model based on the validation; and repeating the receiving and retraining of the selection model until determining the selection model has been validly trained.

IPC Classes

  • G16C 20/30 - Prediction of properties of chemical compounds, compositions or mixtures
  • G16C 20/70 - Machine learning, data mining or chemometrics

64.

ATTENTION FILTERING FOR MULTIPLE INSTANCE LEARNING

      
Application Number GB2019050927
Publication Number 2019/186198
Status In Force
Filing Date 2019-03-29
Publication Date 2019-10-03
Owner BENEVOLENTAI TECHNOLOGY LIMITED (United Kingdom)
Inventor
  • Creed, Paidi
  • Sim, Aaron Jefferson Khey Jin
  • Spencer, Stephen Thomas
  • Vilenius, Mikko Juhani

Abstract

Method(s), apparatus, and system(s) are provided for filtering a set of data, the set of data comprising multiple data instances, by: receiving a set of scores for the set of data; determining attention filtering information based on prior knowledge of one or more relationships between the data instances in said set of data and calculating attention relevancy weights corresponding to the data instances and the set of scores; and providing the attention filtering information to a machine learning (ML) technique or ML model.

IPC Classes