Disclosed is a method, performed by a computing device, for generating a molecule on the basis of a reinforcement learning model, according to an embodiment of the present disclosure. The method comprises the steps of: inputting target pocket information into a molecule generation model; performing a reinforcement learning process on the basis of the target pocket information by using the molecule generation model; and generating a final molecule corresponding to the target pocket information on the basis of the reinforcement learning process by using the molecule generation model, wherein the reinforcement learning process may use an action or state associated with a partially generated molecule.
A method by which a computing device generates molecular fragment-based molecules, according to one embodiment of the present disclosure, comprises the steps of: acquiring an initial molecular structure; acquiring one or more attachment sites to which a molecular fragment is to be attached in the initial molecular structure; and utilizing a reinforcement learning model so as to generate a new molecule on the basis of candidate molecular fragments to be attached to the one or more attachment sites of the initial molecular structure, wherein the reinforcement learning model utilizes an action space based on the one or more attachment sites and the candidate molecular fragments.
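The action space described above can be illustrated as the set of all (attachment site, candidate fragment) pairs. The following is a minimal sketch under stated assumptions: the dictionary-based molecule record, the function names `build_action_space` and `apply_action`, and the example fragment strings are hypothetical and do not come from the disclosure, and no chemistry is actually performed.

```python
from itertools import product


def build_action_space(attachment_sites, candidate_fragments):
    """Enumerate every (site, fragment) pair as one discrete action."""
    return list(product(attachment_sites, candidate_fragments))


def apply_action(molecule, action):
    """Return a new toy molecule record with the fragment attached at the site.

    The record is a plain dict used only for illustration (an assumption),
    not a real chemical structure. The input record is left unchanged.
    """
    site, fragment = action
    attached = dict(molecule["attached"])
    attached[site] = fragment
    open_sites = [s for s in molecule["open_sites"] if s != site]
    return {"core": molecule["core"], "open_sites": open_sites, "attached": attached}


# Hypothetical initial structure with two open attachment sites.
initial = {"core": "c1ccccc1", "open_sites": [0, 3], "attached": {}}
fragments = ["C(=O)O", "N", "OC"]

actions = build_action_space(initial["open_sites"], fragments)
new_molecule = apply_action(initial, actions[0])
```

In a reinforcement learning setting, a policy would choose one of these actions at each step; the policy and reward are outside the scope of this sketch.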
The present disclosure relates to a method of training a protein structure prediction model, performed by a computing device. According to an embodiment of the present disclosure, the method may comprise the steps of: repeating protein structure prediction multiple times by using the protein structure prediction model; and calculating a plurality of loss functions on the basis of the plurality of predictions, wherein the plurality of loss functions may include loss functions calculated in different manners according to the number of repetitions.
G16B 15/00 - ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
G16B 30/00 - ICT specially adapted for sequence analysis involving nucleotides or amino acids
Disclosed is a method for predicting a binding site of a protein, the method performed by one or more processors of a computing device.
The method may include: obtaining one or more candidate data; filtering the one or more candidate data, and obtaining the filtered candidate data, by using a first neural network model for detecting a binding site; and predicting a binding residue based on the filtered candidate data by using a second neural network model for identifying the binding residue, and the first neural network model may share some parameters with the second neural network model.
In the present disclosure, a method by which a computing device predicts a protein structure by using a neural network model may comprise the steps of: obtaining information about each residue of a protein; updating the information about each residue by using update information associated with a twist structure; and adjusting the protein structure on the basis of the updated information about each residue.
G16B 15/00 - ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
G16B 30/00 - ICT specially adapted for sequence analysis involving nucleotides or amino acids
G16B 45/00 - ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks
Disclosed is a method for predicting an affinity between a drug and a target substance, which is performed by a computing device including at least one processor according to some embodiments of the present disclosure. The method for predicting an affinity between a drug and a target substance may include: extracting a feature value of each of the drug and the target substance by using a first neural network; performing a cross attention between the feature values by using a second neural network; and predicting the affinity between the drug and the target substance based on a result of performing the cross attention by using a third neural network.
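The cross-attention step can be sketched with plain scaled dot-product attention in which each drug feature vector attends over the target feature vectors, and vice versa. This is only an illustration under assumptions: the feature vectors are taken as given (standing in for the first neural network), no learned query/key/value projections are used, and the linear read-out `head_weights` is a hypothetical stand-in for the third neural network.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, keys_values):
    """Each query vector attends over the other entity's vectors."""
    scale = math.sqrt(len(queries[0]))
    attended = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys_values])
        attended.append(
            [sum(w * v[i] for w, v in zip(weights, keys_values)) for i in range(len(q))]
        )
    return attended


def mean_pool(vectors):
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count for i in range(len(vectors[0]))]


def predict_affinity(drug_features, target_features, head_weights):
    """Cross-attend in both directions, pool, and apply a linear read-out."""
    drug_context = cross_attention(drug_features, target_features)
    target_context = cross_attention(target_features, drug_features)
    pooled = mean_pool(drug_context) + mean_pool(target_context)  # concatenation
    return dot(pooled, head_weights)
```

For example, `predict_affinity([[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5]], [1.0, 1.0, 1.0, 1.0])` returns a single scalar score; with trained networks the inputs and read-out weights would be learned rather than fixed.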
The present disclosure relates to a method for predicting a binding structure between a protein and a ligand by using a neural network model performed by a computing device, the method comprising the steps of: obtaining a first pair representation associated with the protein; obtaining a second pair representation associated with the ligand; obtaining an interaction representation between the protein and the ligand; updating the first pair representation, the second pair representation, and the interaction representation; and, after the updating, predicting a binding structure between the protein and the ligand, on the basis of the first pair representation, the second pair representation, and the interaction representation.
The present disclosure relates to a method, performed by at least one computing device, for predicting an interaction structure between a protein and a compound. The method may comprise the steps of: obtaining information regarding a protein graph representing a structure of a protein; obtaining information regarding a compound graph representing a structure of a compound; and predicting, on the basis of the information regarding the protein graph and the information regarding the compound graph, an interaction feature between a node of the protein graph and a node of the compound graph. In this regard, the node of the protein graph may be associated with a substructure of the protein, and the node of the compound graph may be associated with a fragment of the compound, which is greater than an atomic unit.
Disclosed is a computer program stored in a computer-readable storage medium. When executed, the computer program performs a method that may include: obtaining a target protein included in training data and indicator information related to the target protein; identifying a homologous protein of the target protein; and augmenting the training data by matching the homologous protein to the indicator information related to the target protein.
Disclosed according to an embodiment of the present disclosure is a computer program stored on a computer-readable storage medium. When executed, the computer program performs a method comprising the steps of: acquiring target proteins contained in training data and index information associated with the target proteins; identifying homologous proteins of the target proteins; and augmenting the training data by correlating the index information associated with the target proteins with the homologous proteins.
A task to be achieved in the present disclosure is to train a local neural network model based on federated learning in consideration of a heterogeneous environment providing different training data. A training method for training a local neural network model based on federated learning, the method being performed by at least one computing device, according to an embodiment of the present disclosure for achieving the task described above, may comprise the steps of: calculating the difference between a global neural network model and the local neural network model; determining additional regularization for training the local neural network model, on the basis of the calculated difference; and training the local neural network model on the basis of a loss function including the determined additional regularization.
The present disclosure relates to training a local neural network model based on federated learning in consideration of a heterogeneous environment in which training data are different from each other. An exemplary embodiment of the present disclosure provides a method of training a local neural network model based on federated learning, the method being performed by at least one computing device, the method including: calculating a difference between a global neural network model and a local neural network model; determining an additional regularization for training the local neural network model based on the calculated difference; and training the local neural network model based on a loss function including the determined additional regularization.
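One way to picture a loss with a difference-based regularization term is a proximal penalty on the distance between local and global weights, similar in spirit to proximal-term approaches used in federated learning. The sketch below is an assumption-laden illustration: the squared-distance measure, the rule in `regularization_strength`, and the `base` coefficient are hypothetical choices, since the abstract does not specify how the regularization is determined.

```python
def squared_distance(local_weights, global_weights):
    """Difference between the local and global models as a squared L2 distance."""
    return sum((l - g) ** 2 for l, g in zip(local_weights, global_weights))


def regularization_strength(distance, base=0.1):
    """Hypothetical rule: the further the local model drifts, the stronger the pull."""
    return base * (1.0 + distance)


def local_loss(task_loss, local_weights, global_weights):
    """Task loss plus a proximal term whose strength depends on the difference."""
    distance = squared_distance(local_weights, global_weights)
    mu = regularization_strength(distance)
    return task_loss + 0.5 * mu * distance
```

When the local weights equal the global weights, the penalty vanishes and `local_loss` returns the task loss unchanged.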
The present disclosure relates to a new drug predicting method and a device for performing the method. A method for predicting new drugs includes generating preprocessed compound information by preprocessing compound information of a compound, by a new drug predicting device; generating preprocessed protein information by preprocessing protein information of a protein, by the new drug predicting device; concatenating the preprocessed compound information and the preprocessed protein information by the new drug predicting device; and predicting a binding affinity based on the concatenated preprocessed compound information and preprocessed protein information by the new drug predicting device.
Disclosed according to one embodiment of the present disclosure are a method for identifying a binding area having selectivity for a target protein, and a computer program, stored in a computer-readable storage medium, that uses the method. Particularly, according to the present disclosure, a computer device: determines at least one binding area of a homologous protein, corresponding to at least one binding area of a target protein; compares the determined at least one binding area of the homologous protein with the at least one binding area of the target protein; identifies, on the basis of the comparison, a binding area, among the at least one binding area of the target protein, having selectivity in relation to the at least one binding area of the homologous protein; and identifies an amino acid residue, included in the identified binding area, that contributes to selectivity.
G16B 15/30 - Drug targeting using structural data; Docking or binding prediction
G16B 45/00 - ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks
G16B 5/00 - ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
According to an embodiment of the present disclosure, a data sampling method for active learning performed by a computing device comprising at least one processor may comprise the steps of: generating normalized feature vectors for an unlabeled data set on the basis of a neural network model; estimating the density of the normalized feature vectors by grouping the normalized feature vectors on a vector space; and extracting query data for active learning from the unlabeled data set on the basis of the estimated density.
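The three steps above (normalize, group to estimate density, extract queries) can be sketched as follows. This is a simplified illustration under assumptions: grouping is done with a fixed grid over the normalized vector space, density is taken to be the group size, and one representative is drawn from each of the densest groups. The abstract does not specify the grouping method or whether dense or sparse regions are preferred, so `group_key`, `cell`, and the selection rule are hypothetical.

```python
import math
from collections import defaultdict


def l2_normalize(vector):
    """Scale a feature vector to unit length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def group_key(vector, cell=0.5):
    """Assign a normalized vector to a grid cell of the vector space."""
    return tuple(math.floor(x / cell) for x in vector)


def select_queries(features, budget, cell=0.5):
    """Return indices of unlabeled samples to query, densest groups first."""
    groups = defaultdict(list)
    for index, vector in enumerate(features):
        groups[group_key(l2_normalize(vector), cell)].append(index)
    # Density estimate: the number of vectors that fall into the same cell.
    ranked = sorted(groups.values(), key=len, reverse=True)
    return [members[0] for members in ranked[:budget]]
```

In practice the feature vectors would come from a neural network model, and a clustering method could replace the fixed grid.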
According to some embodiments of the present disclosure, disclosed is a method for predicting the affinity between a drug and a target substance that is performed by a computing device including at least one processor. The method for predicting the affinity between a drug and a target substance may comprise the steps of: extracting a feature value of each of the drug and the target substance by using a first neural network; performing cross attention between the feature values by using a second neural network; and predicting the affinity between the drug and the target substance on the basis of the result of performing the cross attention by using a third neural network.
Disclosed is a method for training a multi-task model, performed by a computing device comprising at least one processor, according to some embodiments of the present disclosure. The method for training a multi-task model may comprise the steps of: acquiring a training data set; and on the basis of the training data set, training a neural network model for outputting a result of prediction of an input value and estimating uncertainty of the prediction, wherein a loss function for training the neural network model includes a first loss function for quantifying the prediction result and the uncertainty of the prediction, and a second loss function for improving the prediction accuracy of the neural network model.
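A loss with two parts, one covering the prediction together with its uncertainty and one targeting accuracy, could look like the sketch below. It assumes, purely for illustration, that the first loss is a Gaussian negative log-likelihood (constant term omitted) over a predicted mean and log-variance, and that the second loss is a mean squared error; the abstract does not name either loss, and `accuracy_weight` is a hypothetical parameter.

```python
import math


def gaussian_nll(target, mean, log_variance):
    """Negative log-likelihood of the target under N(mean, exp(log_variance)),
    with the constant 0.5 * log(2 * pi) omitted."""
    return 0.5 * (log_variance + (target - mean) ** 2 / math.exp(log_variance))


def squared_error(target, mean):
    return (target - mean) ** 2


def combined_loss(targets, means, log_variances, accuracy_weight=1.0):
    """First loss (prediction + uncertainty) plus a weighted accuracy loss."""
    count = len(targets)
    nll = sum(
        gaussian_nll(t, m, s) for t, m, s in zip(targets, means, log_variances)
    ) / count
    mse = sum(squared_error(t, m) for t, m in zip(targets, means)) / count
    return nll + accuracy_weight * mse
```

The extra accuracy term keeps the model from lowering the likelihood loss simply by inflating its predicted variance.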
Disclosed is a method, performed by a computing device, for predicting a medicine for controlling the entrance of a virus into a host according to an embodiment. The method may comprise the steps of: estimating first affinity between a medicine and a protein receptor and second affinity between the medicine and a protease by using a pre-trained neural network model; and filtering a database on the basis of the first affinity and the second affinity to predict a medicine for controlling the entrance of a virus into a host.
According to an embodiment of the present disclosure, disclosed is a training method for neural network model diversity performed by a computing device. The method may comprise the steps of: training a first neural network model on the basis of a training data set; and training a second neural network model on the basis of the training data set such that the trained first neural network model and the second neural network model generate different outputs.
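Training the second model to produce outputs that differ from the first can be pictured as adding a disagreement reward to its loss. The following sketch is an assumption: the squared-error task loss, the squared-difference disagreement measure, and `diversity_weight` are hypothetical, and the abstract does not state how the difference between outputs is enforced.

```python
def diversity_loss(targets, second_outputs, first_outputs, diversity_weight=0.1):
    """Loss for the second model: fit the targets while disagreeing with the
    already trained (frozen) first model."""
    count = len(targets)
    task = sum((t - s) ** 2 for t, s in zip(targets, second_outputs)) / count
    disagreement = sum(
        (s - f) ** 2 for s, f in zip(second_outputs, first_outputs)
    ) / count
    # Subtracting the disagreement rewards outputs that differ from the first model.
    return task - diversity_weight * disagreement
```

With `diversity_weight` set to zero this reduces to ordinary independent training of the second model.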
A curriculum-based active learning method carried out by a computing device is disclosed according to one embodiment of the present disclosure. The method may comprise the steps of: training a neural network model on the basis of a first training data set among training data sets acquired through active learning; and training the neural network model by using, among the training data sets acquired through active learning, a second training data set having a higher training difficulty level than that of the first training data set.
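The two-stage schedule can be sketched as sorting the acquired samples by a difficulty score and training on the easier half before the harder half. This is an illustrative assumption: the `difficulty` scoring function, the even split, and the `train_step` callback are hypothetical, and the abstract does not say how difficulty is measured or how the data sets are formed.

```python
def split_by_difficulty(samples, difficulty):
    """Order samples by difficulty and split them into an easier and a harder set."""
    ordered = sorted(samples, key=difficulty)
    middle = len(ordered) // 2
    return ordered[:middle], ordered[middle:]


def curriculum_train(samples, difficulty, train_step):
    """Train on the easier set first, then on the harder set.

    `train_step` is a caller-supplied function (an assumption) that updates
    the model on one sample and returns a value to log.
    """
    easy, hard = split_by_difficulty(samples, difficulty)
    history = []
    for stage in (easy, hard):
        for sample in stage:
            history.append(train_step(sample))
    return history
```

For example, `curriculum_train([3, 1, 4, 2], difficulty=lambda s: s, train_step=lambda s: s)` visits the samples in the order 1, 2, 3, 4.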
The present disclosure relates to a method for predicting whether or not a compound binds to a hinge of the active site of a kinase, the method comprising the steps of: generating a feature vector representing information about the surrounding environment of each of the atoms of the compound on the basis of the chemical structure of the compound; and classifying, on the basis of the feature vector, whether or not each atom of the compound binds to a hinge region of the kinase.
G16B 5/00 - ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
G16B 15/30 - Drug targeting using structural data; Docking or binding prediction
G16B 20/30 - Detection of binding sites or motifs
G16C 20/30 - Prediction of properties of chemical compounds, compositions or mixtures
G16C 20/70 - Machine learning, data mining or chemometrics
The present disclosure relates to a method for analyzing genetic information by using a sparsely connected neural network model generated on the basis of gene ontology (GO) information, so that genetic information can be analyzed with a high-accuracy model at a lower training cost than an existing fully connected neural network. Specifically, the method may comprise the steps of: training a neural network model generated on the basis of gene ontology information; and analyzing genetic information on the basis of the neural network model, wherein the neural network model includes a hierarchical structure in which nodes are sparsely connected on the basis of gene ontology information.
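Sparse connectivity derived from ontology annotations can be illustrated with a binary mask that allows a connection from a gene input to a term node only when the gene is annotated with that term. This is a minimal sketch under assumptions: the gene and term identifiers are placeholders rather than real GO entries, and the single masked linear layer stands in for one level of the hierarchical structure.

```python
def build_mask(genes, terms, annotations):
    """mask[t][g] is 1 when gene g is annotated with term t, otherwise 0."""
    return [[1 if gene in annotations[term] else 0 for gene in genes] for term in terms]


def sparse_layer(inputs, weights, mask):
    """Linear layer in which masked-out weights contribute nothing."""
    return [
        sum(w * m * x for w, m, x in zip(weight_row, mask_row, inputs))
        for weight_row, mask_row in zip(weights, mask)
    ]


# Placeholder identifiers (hypothetical, not real genes or GO terms).
genes = ["gene_a", "gene_b", "gene_c"]
terms = ["term_1", "term_2"]
annotations = {"term_1": {"gene_a", "gene_b"}, "term_2": {"gene_c"}}

mask = build_mask(genes, terms, annotations)
outputs = sparse_layer([1.0, 2.0, 3.0], [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]], mask)
```

Only three of the six possible weights are active here, which is the source of the reduced training cost relative to a fully connected layer.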
The present disclosure relates to a method adapted to derive an epitope candidate on the basis of the amino acid sequence of an antigen, the method comprising the steps of: generating a plurality of amino acid sub-sequences on the basis of the amino acid sequence of an antigen; generating characteristic values for the plurality of amino acid sub-sequences; and deriving at least one epitope candidate on the basis of the characteristic values for the plurality of amino acid sub-sequences.
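The three steps can be sketched with a sliding window over the antigen sequence, a per-window score, and a top-k selection. The scoring rule here is a toy assumption (the fraction of residues in a hand-picked set of charged or polar amino acids); the abstract does not state which characteristic values are used, and the names `characteristic_value` and `epitope_candidates` are hypothetical.

```python
# Toy residue set used only for illustration (an assumption, not from the source).
CHARGED_OR_POLAR = set("DEKRNQSTH")


def sub_sequences(sequence, length):
    """All contiguous windows of the given length, with their start positions."""
    return [(i, sequence[i:i + length]) for i in range(len(sequence) - length + 1)]


def characteristic_value(peptide):
    """Toy score: fraction of residues that are charged or polar."""
    return sum(1 for residue in peptide if residue in CHARGED_OR_POLAR) / len(peptide)


def epitope_candidates(sequence, length=5, top_k=1):
    """Return the top_k highest-scoring windows as (start, peptide) pairs."""
    scored = [(characteristic_value(p), i, p) for i, p in sub_sequences(sequence, length)]
    scored.sort(key=lambda item: (-item[0], item[1]))  # best score, then earliest
    return [(i, p) for _, i, p in scored[:top_k]]
```

A real pipeline would replace the toy score with learned or experimentally derived characteristic values.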
G16B 5/00 - ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
G16B 15/00 - ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
The present disclosure relates to a method for searching for a compound related to a biological target, and the method may comprise the steps of: calculating a plurality of associations between the biological target and a plurality of compounds; extracting a subset of the plurality of compounds on the basis of the associations; analyzing frequencies of a plurality of pharmacophore characteristics in the extracted compounds; extracting a subset of the plurality of pharmacophore characteristics on the basis of the frequency analysis; and searching for compounds that share the extracted pharmacophore characteristics.
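The extraction and frequency-analysis steps can be sketched with a ranking followed by a counter over pharmacophore features. The sketch assumes precomputed association scores and represents each compound as a set of feature labels; the top-n cut-off, the `min_fraction` threshold, and all names and labels are hypothetical.

```python
from collections import Counter


def frequent_features(compounds, associations, top_n, min_fraction):
    """Keep the top_n most associated compounds, then keep the pharmacophore
    features that occur in at least min_fraction of them."""
    ranked = sorted(compounds, key=lambda name: associations[name], reverse=True)[:top_n]
    counts = Counter(feature for name in ranked for feature in compounds[name])
    kept = {f for f, c in counts.items() if c / len(ranked) >= min_fraction}
    return ranked, kept


def search(library, required_features):
    """Return library compounds that contain every required feature."""
    return [name for name, features in library.items() if required_features <= features]


# Hypothetical compounds, feature labels, and association scores.
compounds = {
    "a": {"donor", "aromatic"},
    "b": {"donor", "acceptor"},
    "c": {"hydrophobic"},
}
associations = {"a": 0.9, "b": 0.8, "c": 0.1}

ranked, kept = frequent_features(compounds, associations, top_n=2, min_fraction=1.0)
hits = search({"x": {"donor", "hydrophobic"}, "y": {"acceptor"}}, kept)
```

Here the shared feature of the two best-associated compounds drives the search over a separate compound library.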
The present invention relates to a new drug prediction method and an apparatus for performing the method. The new drug prediction method can comprise steps in which: a new drug prediction apparatus preprocesses compound information about a compound to generate preprocessed compound information; the new drug prediction apparatus preprocesses protein information about a protein to generate preprocessed protein information; the new drug prediction apparatus concatenates the preprocessed compound information and the preprocessed protein information; and the new drug prediction apparatus predicts a binding force on the basis of the concatenated preprocessed compound information and preprocessed protein information.
G16C 20/30 - Prediction of properties of chemical compounds, compositions or mixtures
G16C 20/10 - Analysis or design of chemical reactions, syntheses or processes
G16C 20/50 - Molecular design, e.g. of drugs
G16C 60/00 - Computational materials science, i.e. ICT specially adapted for investigating the physical or chemical properties of materials or phenomena associated with their design, synthesis, processing, characterisation or utilisation
42 - Scientific, technological and industrial services, research and design
Goods and services
genetic testing for scientific research purposes; development of computer software systems for the storage of data; development of computer software systems for the transmission of data; development of computer software systems for the processing of data; database design and development; development of computer software for data processing; providing information relating to scientific analysis of the genetic information via online networks; biological information check being genetic testing for scientific research purposes; providing platform as a service (PaaS) services featuring computer software platforms for use in database management; platform as a service (PaaS) services featuring computer software platforms for use in drug development for use in prediction, identification or discovery of biological targets, molecular structures, drug binding affinity, drug absorption, drug distribution, drug metabolism, drug excretion and drug toxicity; software design and development; software engineering; development of pharmaceutical preparations and medicines; pharmaceutical research services; research relating to medicines being pharmaceutical research services; pharmaceutical drug development services; scientific research in the field of genetic engineering; scientific research in the field of gene analysis; scientific research for medical products; application service provider, namely, hosting computer application software for others in the field of knowledge management for creating searchable databases of information and data
27.
METHOD AND SYSTEM FOR DETERMINING FEATURE INFLUENCE
A method and system for determining a feature influence are disclosed. The method uses a neural network that classifies input data having J features (J is a natural number of 2 or greater) into K different classes (K is a natural number of 2 or greater) to determine the degree to which each of one or more of the J features influences the classification. The method comprises the steps of: extracting, by a feature influence determination system, from N input data, the k-class input data classified as a specific class k among the K classes, in order to calculate an influence (DI_j) of a specific feature j among the J features; calculating, by the feature influence determination system, a kk influence indicating the degree to which feature j (x_ij) of the extracted k-class input data influences its classification as class k; calculating, by the feature influence determination system, at least one kr influence indicating the degree to which feature j (x_ij) of the k-class input data influences its classification as a class r (r ≠ k) other than class k; and calculating, by the feature influence determination system, the influence (DI_j) on the basis of a difference between each of the at least one kr influence and the value of the kk influence.
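The difference-based influence can be pictured with a deliberately simple stand-in. The sketch below assumes a linear classifier in which the influence of feature j on class c for an input x is `weights[c][j] * x[j]`, and it averages the kk-minus-kr differences over the k-class inputs and the other classes. Both the influence formula and the averaging are assumptions made for illustration; the abstract does not define how the kk and kr influences are computed for a neural network.

```python
def feature_influence(inputs, predicted_classes, weights, j, k):
    """Average margin by which feature j favours class k over each other class r,
    computed only on the inputs classified as class k."""
    class_k_inputs = [x for x, c in zip(inputs, predicted_classes) if c == k]
    other_classes = [r for r in range(len(weights)) if r != k]
    if not class_k_inputs or not other_classes:
        return 0.0
    total = 0.0
    for x in class_k_inputs:
        kk_influence = weights[k][j] * x[j]  # assumed linear influence on class k
        for r in other_classes:
            kr_influence = weights[r][j] * x[j]  # assumed linear influence on class r
            total += kk_influence - kr_influence
    return total / (len(class_k_inputs) * len(other_classes))
```

A positive value indicates that, on the k-class inputs, the feature pushes toward class k more than toward the other classes; a negative value indicates the opposite.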