Techniques are provided for analyzing circular DNA in a biological sample (e.g., a sample including cell-free DNA, such as plasma). For example, to measure circular DNA, cleaving can be performed to linearize the circular DNA molecules so that they may be sequenced. Example cleaving techniques include restriction enzymes and transposases. Then, one or more criteria can be used to identify linearized DNA molecules, e.g., so as to differentiate them from linear DNA molecules. An example criterion is mapping a pair of reversed end sequences to a reference genome. Another example criterion is identification of a cutting tag, e.g., associated with a restriction enzyme or an adapter sequence added by a transposase. Once circular DNA molecules (e.g., eccDNA and circular mitochondrial DNA) are identified, they may be analyzed (e.g., to determine a count, size profile, and/or methylation) to measure a property of the biological sample, including genetic properties and level of a disease.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
The present disclosure describes techniques for measuring quantities (e.g., relative frequencies) of sequence end motifs of cell-free DNA fragments in a biological sample of an organism for measuring a property of the sample (e.g., fractional concentration of clinically-relevant DNA) and/or determining a condition of the organism based on such measurements. Different tissue types exhibit different patterns for the relative frequencies of the sequence end motifs. The present disclosure provides various uses for measures of the relative frequencies of sequence end motifs of cell-free DNA, e.g., in mixtures of cell-free DNA from various tissues. DNA from one such tissue may be referred to as clinically-relevant DNA.
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
G16H 50/30 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for individual health risk assessment
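The relative-frequency measure described in the end-motif abstract above can be illustrated with a minimal sketch. The motif length of 4 bases, the use of the first bases of each fragment string as its end, and the function name `end_motif_frequencies` are assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter


def end_motif_frequencies(fragments, k=4):
    """Return relative frequencies of the k-base motif at the start of each fragment.

    k=4 is an assumed motif length, not a value taken from the abstract.
    """
    counts = Counter()
    for seq in fragments:
        seq = seq.upper()
        motif = seq[:k]
        # Skip fragments that are too short or whose end contains ambiguous bases.
        if len(motif) < k or set(motif) - set("ACGT"):
            continue
        counts[motif] += 1
    total = sum(counts.values())
    return {motif: n / total for motif, n in counts.items()} if total else {}


# Hypothetical fragment sequences, for illustration only.
fragments = ["CCCAGTTA", "CCCATGGA", "AAATCGGC", "ccca", "NNNNACGT"]
freqs = end_motif_frequencies(fragments)
```

Frequency profiles of this kind could then be compared between samples or tissues; how that comparison is made is not shown here.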
42 - Scientific, technological and industrial services, research and design
44 - Medical, veterinary, hygienic and cosmetic services; agriculture, horticulture and forestry services
Goods & Services
Scientific and medical research services; providing information in the field of genetics and cancer for medical or scientific research via an online database; providing temporary use of on-line non-downloadable software and applications for use in studying, diagnosing or screening for cancer and studying genetics and DNA; providing temporary use of on-line non-downloadable cloud computing software for use in studying, diagnosing or screening cancer and studying genetics and DNA; software as a service (SaaS) services featuring software in the nature of a platform for genetic and bioinformatics analysis. Genetic testing and reporting for medical purposes; medical testing for diagnostic or treatment purposes; medical screening; medical diagnostic testing, monitoring and reporting services; providing medical information regarding genetics via a website; genetic analysis and reporting services for medical purposes.
Various embodiments are directed to detecting infection-causing microbial cell-free DNA from a biological sample based on their size profiles and/or end signatures, in which the detection of infection-causing microbial DNA can be performed without no-template control (NTC) samples. Embodiments can include identifying the infection-causing pathogen-derived microbial DNA based on sizes of microbial cell-free DNA molecules. Embodiments can also include identifying the infection-causing pathogen-derived microbial DNA based on end signatures of microbial cell-free DNA molecules. Embodiments can also include applying a machine-learning algorithm to a plurality of vectors that represent end signatures of the microbial cell-free DNA molecules, to identify the infection-causing pathogen-derived microbial DNA. By detecting the infection-causing pathogen-derived microbial DNA, a level of infection for the biological sample can be predicted.
09 - Scientific and electric apparatus and instruments
10 - Medical apparatus and instruments
Goods & Services
Downloadable scientific and medical data via the internet (term considered too vague by the International Bureau - rule 13 (2) (b) of the Regulations); downloadable electronic data files featuring genetic information; downloadable electronic data files featuring medical information; downloadable electronic data files featuring cancer screening information and results; scientific instruments and apparatus for use in genetic research and analysis; scientific instruments and apparatus for use in medical research and analysis; scientific instruments and apparatus for use in cancer research and analysis; scientific instruments and apparatus for use in body fluid collection and analysis; medical laboratory research instruments for use in detecting cancer; medical laboratory research instruments for use in detecting genetic sequences; medical laboratory research instruments for use in collecting and analyzing body fluids; test tubes. Medical apparatus and instruments for use in detecting cancer; medical apparatus and instruments for use in detecting genetic sequences; medical apparatus and instruments for use in collecting and analyzing body fluids; blood testing apparatus.
6.
TUMOR FRACTION ESTIMATION USING METHYLATION VARIANTS
A computer-implemented method for generating a tumor fraction estimate from a DNA sample of a subject is disclosed. The method may include receiving a dataset of methylation sequence reads from the sample of the subject. The method may also include dividing the dataset into a plurality of variants. The method may further include determining methylation states of the plurality of variants. The method may further include filtering the plurality of variants based on a bank of reference sequence reads to generate a filtered subset of variants. The bank may include reads generated from non-cancer samples and biopsy samples of a plurality of tissues of reference individuals. The counts of the methylation states of variants in the filtered subset are determined and input to a model that is trained based on recurrence rates of the variants in the reference sequence reads. The tumor fraction estimate may be generated by the model.
G16B 20/20 - Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
7.
COMPOSITIONS AND METHODS FOR IDENTIFYING CELL TYPES
YISSUM RESEARCH DEVELOPMENT COMPANY OF THE HEBREW UNIVERSITY OF JERUS... (Israel)
HADASIT MEDICAL RESEARCH SERVICES AND DEVELOPMENT LTD. (Israel)
GRAIL, INC. (USA)
Inventor
Kaplan, Tomer
Dor, Yuval
Shemer, Ruth
Glaser, Benjamin
Abstract
The present disclosure relates generally to compositions and methods for determining cell type based on a methylation profile of associated DNA. For cell free DNA, such determination can be used to identify disease or conditions relating to the cell type. For tumor cells, such determination is useful for identifying their primary origin.
C12Q 1/6881 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for tissue or cell typing, e.g. human leukocyte antigen [HLA] probes
8.
Systems and methods for classifying patients with respect to multiple cancer classes
Technical solutions for classifying patients with respect to multiple cancer classes are provided. The classification can be done using cell-free whole genome sequencing information from subjects. A reference set of subjects is used to train classifiers to recognize genomic markers that distinguish such cancer classes. The classifier training includes dividing the reference genome into a set of non-overlapping bins, applying a dimensionality reduction method to obtain a feature set, and using the feature set to train classifiers. For subjects with unknown cancer class, the trained classifiers provide probabilities or likelihoods that the subject has a respective cancer class for each cancer in a set of cancer classes. The present disclosure thus describes methods to improve the screening and detection of cancer class from among several cancer classes. This serves to facilitate early and appropriate treatment for subjects afflicted with cancer.
G16B 20/20 - Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
G16H 10/40 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for data related to laboratory analysis, e.g. patient specimen analysis
G16H 10/60 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
G16H 50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
G16H 70/60 - ICT specially adapted for the handling or processing of medical references relating to pathologies
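The first training step named in the classification abstract above, dividing the reference genome into non-overlapping bins, can be sketched as follows. This is a minimal illustration of the binning step only; the dimensionality reduction and classifier training are not shown. The fixed bin size, the use of read start positions, and the function name `bin_counts` are assumptions, not details from the abstract.

```python
def bin_counts(read_starts, chrom_length, bin_size):
    """Count reads per non-overlapping fixed-size bin and normalise to fractions.

    Fixed-size bins and normalisation by total count are illustrative assumptions.
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in read_starts:
        # Ignore positions that fall outside the chromosome.
        if 0 <= pos < chrom_length:
            counts[pos // bin_size] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


# Hypothetical 0-based read start positions on a toy 100-base chromosome.
features = bin_counts([5, 12, 18, 25, 99, 100], chrom_length=100, bin_size=25)
```

A per-subject vector like `features` would be the input to whatever dimensionality reduction method is chosen.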
09 - Scientific and electric apparatus and instruments
10 - Medical apparatus and instruments
Goods & Services
(1) Downloadable scientific and medical data via the internet (term considered too vague by the International Bureau - rule 13 (2) (b) of the Regulations); downloadable electronic data files featuring genetic information; downloadable electronic data files featuring medical information; downloadable electronic data files featuring cancer screening information and results; scientific instruments and apparatus for use in genetic research and analysis; scientific instruments and apparatus for use in medical research and analysis; scientific instruments and apparatus for use in cancer research and analysis; scientific instruments and apparatus for use in body fluid collection and analysis; medical laboratory research instruments for use in detecting cancer; medical laboratory research instruments for use in detecting genetic sequences; medical laboratory research instruments for use in collecting and analyzing body fluids; test tubes.
(2) Medical apparatus and instruments for use in detecting cancer; medical apparatus and instruments for use in detecting genetic sequences; medical apparatus and instruments for use in collecting and analyzing body fluids; blood testing apparatus.
10.
DIAGNOSTIC APPLICATIONS USING NUCLEIC ACID FRAGMENTS
Various embodiments are directed to applications (e.g., classification of biological samples) of the analysis of the count, the fragmentation patterns, and size of cell-free nucleic acids, e.g., plasma DNA and serum DNA, including nucleic acids from pathogens, such as viruses. Embodiments of one application can determine if a subject has a particular condition. For example, a method of the present disclosure can determine if a subject has cancer or a tumor, or other pathology. Embodiments of another application can be used to assess the stage of a condition, or the progression of a condition over time. For example, a method of the present disclosure may be used to determine a stage of cancer in a subject, or the progression of cancer in a subject over time (e.g., using samples obtained from a subject at different times).
C12Q 1/6888 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
C12Q 1/6879 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for sex determination
C12Q 1/6806 - Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
Methods are provided to improve the positive predictive value for cancer detection using cell-free nucleic acid samples. Various embodiments are directed to applications (e.g., diagnostic applications) of the analysis of the fragmentation patterns and size of cell-free DNA, e.g., plasma DNA and serum DNA, including nucleic acids from pathogens, including viruses. Embodiments of one application can determine if a subject has a particular condition. For example, a method of the present disclosure can determine if a subject has cancer or a tumor, or other pathology. Embodiments of another application can be used to assess the stage of a condition, or the progression of a condition over time. For example, a method of the present disclosure may be used to determine a stage of cancer in a subject, or the progression of cancer in a subject over time (e.g., using samples obtained from a subject at different times).
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
C12Q 1/6806 - Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
G16B 20/00 - ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
C12Q 1/70 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving virus or bacteriophage
G16H 10/40 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for data related to laboratory analysis, e.g. patient specimen analysis
Various embodiments are directed to applications (e.g., classification of biological samples) of the analysis of the count and size of cell-free nucleic acids, e.g., plasma DNA and serum DNA, including nucleic acids from pathogens, such as viruses. Embodiments of one application can predict if a subject previously treated for a pathology will relapse at a future time point. Targeted sequencing (e.g., specifically designed capture probes, amplification primers) can be used to identify DNA across the entire viral genome.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
Methods for measuring subpopulations of target molecules (e.g., polypeptides and/or cell-free ribonucleic acid) are provided. In some embodiments, methods of generating a sequencing library from a plurality of RNA molecules in a test sample obtained from a subject are provided, as well as methods for analyzing the sequencing library to detect, e.g., the presence or absence of a disease.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
G01N 33/574 - Immunoassay; Biospecific binding assay; Materials therefor for cancer
C12Q 1/6809 - Methods for determination or identification of nucleic acids involving differential detection
15.
SOMATIC VARIANT COOCCURRENCE WITH ABNORMALLY METHYLATED FRAGMENTS
Systems and methods for identifying variant alleles as somatic or germline are provided. Reference and variant alleles for a genomic position are identified. Methylation states and sequences of nucleic acid fragment sequences that map to the genomic position are obtained from a sample of a subject. Using the sequences of nucleic acid fragment sequences, each nucleic acid fragment sequence that has the reference allele is assigned to a reference subset, and each nucleic acid fragment sequence that has the variant allele is assigned to a variant subset. One or more indications of the methylation states across the nucleic acid fragment sequences in the variant subset and an indication of the number of nucleic acid fragment sequences in the reference subset versus the variant subset are applied to a trained binary classifier. An identification of the variant allele at the genomic position as somatic or germline is obtained from the classifier.
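The subset assignment in the somatic/germline abstract above can be sketched as follows. This is a minimal illustration: the tuple layout for a fragment, the two summary features, and the function name `variant_features` are hypothetical choices made here, and the trained binary classifier that would consume the features is not shown.

```python
def variant_features(fragments, ref_allele, var_allele):
    """Split fragments by allele and summarise the variant subset.

    fragments: list of (allele, methylated_cpgs, total_cpgs) tuples.
    This tuple layout is a hypothetical one chosen for illustration.
    """
    ref = [f for f in fragments if f[0] == ref_allele]
    var = [f for f in fragments if f[0] == var_allele]
    methylated = sum(f[1] for f in var)
    cpgs = sum(f[2] for f in var)
    informative = len(ref) + len(var)
    return {
        # Share of informative fragments carrying the variant allele.
        "variant_fraction": len(var) / informative if informative else 0.0,
        # Pooled methylation level across the variant subset.
        "variant_methylation": methylated / cpgs if cpgs else 0.0,
    }


# Hypothetical fragments at one genomic position; the "T" fragment matches neither allele.
frags = [("A", 1, 10), ("A", 0, 8), ("A", 2, 10), ("G", 9, 10), ("T", 5, 5)]
feats = variant_features(frags, ref_allele="A", var_allele="G")
```

Fragments carrying neither allele are ignored in this sketch, which is a design choice of the example rather than something the abstract states.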
Nuclease activity can affect the methylation level and fragmentation of cfDNA. Certain levels of nuclease activity may be correlated with certain levels of methylation in certain regions. Methylation level in certain genomic regions can be analyzed to classify nuclease activity. Methylation statuses of different genomic regions compared to methylation statuses of other genomic regions can determine a level of a condition (e.g., a disease such as cancer or disorder) in a subject. Nuclease activity can be monitored through analysis of methylation statuses of different sites. The efficacy of a treatment can also be determined using methylation levels at certain genomic regions. The number of fragments from genomic regions that are hypomethylated or hypermethylated in a reference genome can be used to provide information (e.g., fractional concentration) on the sample itself. The size distribution of extrachromosomal circular DNA can also be used to analyze a biological sample. Systems are also described.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
42 - Scientific, technological and industrial services, research and design
44 - Medical, veterinary, hygienic and cosmetic services; agriculture, horticulture and forestry services
Goods & Services
Scientific and medical research services; providing information in the field of genetics and cancer for medical or scientific research via an online database; providing temporary use of on-line non-downloadable software and applications for use in studying, diagnosing or screening for cancer and studying genetics and DNA; providing temporary use of on-line non-downloadable cloud computing software for use in studying, diagnosing or screening cancer and studying genetics and DNA; software as a service (SaaS) services featuring software in the nature of a platform for genetic and bioinformatics analysis. Genetic testing and reporting for medical purposes; medical testing for diagnostic or treatment purposes; medical screening; medical diagnostic testing, monitoring and reporting services; providing medical information regarding genetics via a website; genetic analysis and reporting services for medical purposes.
09 - Scientific and electric apparatus and instruments
10 - Medical apparatus and instruments
Goods & Services
Downloadable scientific and medical data via the internet; downloadable electronic data files featuring genetic information; downloadable electronic data files featuring medical information; downloadable electronic data files featuring cancer screening information and results; scientific instruments and apparatus for use in genetic research and analysis; scientific instruments and apparatus for use in medical research and analysis; scientific instruments and apparatus for use in cancer research and analysis; scientific instruments and apparatus for use in body fluid collection and analysis; medical laboratory research instruments for use in detecting cancer; medical laboratory research instruments for use in detecting genetic sequences; medical laboratory research instruments for use in collecting and analyzing body fluids; test tubes. Medical apparatus and instruments for use in detecting cancer; medical apparatus and instruments for use in detecting genetic sequences; medical apparatus and instruments for use in collecting and analyzing body fluids; blood testing apparatus.
19.
PREPARATION OF NUCLEIC ACID SAMPLES FOR SEQUENCING
Compositions and methods are provided for amplifying nucleic acids, including cell-free nucleic acid fragments, in preparation for sequencing. Methods are provided for making circularized nucleic acid templates having the structure [T]-[PS1]-[L]-[PS2] or [PS1]-[L]-[PS2]-[T'], where (a) T is a target nucleic acid and T' is a complement to a target nucleic acid; (b) each of PS1 and PS2 is a nucleic acid primer site; (c) L is a linker having a primer extension reaction terminating organic molecule; and the structure is circularized by binding a 5' end thereof to a 3' end thereof. Target sequences in the circularized templates are amplified by binding to PS1 a primer complementary to PS1 and binding to PS2 a primer complementary to PS2 and copying the target sequences by a primer extension reaction. Advantages include a reduction in ligation steps, which can result in fewer clean-up steps and improved library conversion efficiency.
05 - Pharmaceutical, veterinary and sanitary products
10 - Medical apparatus and instruments
Goods & Services
Medical diagnostic reagents and medical diagnostic kits comprised of medical diagnostic reagents; reagents for medical use and medical diagnostics and screening kits comprised of reagents for medical diagnostics or screening use; reagents for use in genetic testing for medical and medical diagnostic purposes; diagnostic preparations for medical purposes; assays, reagents, enzymes, and nucleotides for medical purposes, including for medical diagnostics or screening purposes; diagnostic assays, reagents, enzymes, and nucleotides for medical purposes, including for medical diagnostics or screening purposes. Blood testing apparatus.
21.
METHODS USING CHARACTERISTICS OF URINARY AND OTHER DNA
The ends of cell-free DNA fragments may be used for analysis of a biological sample. In some embodiments, DNA from a urine sample may be analyzed. Cell-free DNA fragments often include jagged ends, where one end of one strand of double-stranded DNA extends beyond the other end of the other strand. The length and amount of these jagged ends may be used to determine a level of a condition of an individual. The density of ends of fragments in certain regions may also be used in classifying the level of a condition. Additionally, DNA fragments may show a periodic pattern with the amount of DNA fragments corresponding to a length of the overhang. The periodicity may be analyzed to determine properties of a biological sample. Jagged ends may also be analyzed with a technique that avoids trimming overhanging 3' ends of a double-stranded DNA.
C12Q 1/68 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
C12Q 1/6809 - Methods for determination or identification of nucleic acids involving differential detection
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
G16B 20/00 - ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
G16B 30/00 - ICT specially adapted for sequence analysis involving nucleotides or amino acids
22.
METHODS USING CHARACTERISTICS OF URINARY AND OTHER DNA
The ends of cell-free DNA fragments may be used for analysis of a biological sample. In some embodiments, DNA from a urine sample may be analyzed. Cell-free DNA fragments often include jagged ends, where one end of one strand of double-stranded DNA extends beyond the other end of the other strand. The length and amount of these jagged ends may be used to determine a level of a condition of an individual. The density of ends of fragments in certain regions may also be used in classifying the level of a condition. Additionally, DNA fragments may show a periodic pattern with the amount of DNA fragments corresponding to a length of the overhang. The periodicity may be analyzed to determine properties of a biological sample. Jagged ends may also be analyzed with a technique that avoids trimming overhanging 3' ends of a double-stranded DNA.
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
C12Q 1/6816 - Hybridisation assays characterised by the detection means
05 - Pharmaceutical, veterinary and sanitary products
10 - Medical apparatus and instruments
Goods & Services
Medical diagnostic reagents and medical diagnostic kits comprised of medical diagnostic reagents; reagents for medical use and medical diagnostics or screening kits comprised of reagents for medical diagnostics or screening use; reagents for use in genetic testing for medical and medical diagnostic purposes; diagnostic preparations for medical purposes; assays, reagents, enzymes, and nucleotides for medical or clinical diagnostics or screening purposes; diagnostic assays, reagents, enzymes, and nucleotides for medical or clinical purposes. Blood testing apparatus.
25.
SYSTEMS AND METHODS FOR USING A CONVOLUTIONAL NEURAL NETWORK TO DETECT CONTAMINATION
A method for training a convolutional neural net for contamination analysis is provided. A training dataset is obtained comprising, for each respective training subject in a plurality of subjects, a variant allele frequency of each respective single nucleotide variant in a respective plurality of single nucleotide variants, and a respective contamination indication. First and second subsets of the plurality of training subjects have first and second contamination indication values, respectively. A corresponding first channel comprising a first plurality of parameters that include a respective parameter for a single nucleotide variant allele frequency of each respective single nucleotide variant in a set of single nucleotide variants in a reference genome is constructed for each respective training subject. An untrained or partially trained convolutional neural net is trained using, for each respective training subject, at least the corresponding first channel of the respective training subject as input against the respective contamination indication.
Detecting cross-contamination between test samples used for determining cancer in a subject is beneficial. To detect cross-contamination, test sequences including at least one single nucleotide polymorphism are prepared using genome sequencing techniques. Some of the test sequences can be filtered to improve accuracy and precision. A prior contamination probability for each test sequence is determined based on a minor allele frequency. A contamination model including a likelihood test is applied to a test sequence. The likelihood test obtains a current contamination probability representing the likelihood that the test sample is contaminated. The contamination model can also determine a likelihood that the sample includes loss of heterozygosity representing the likelihood that the test sequence is contaminated. Test samples that are contaminated are removed. A source for the contaminated test sample can be found by comparing contaminated test sequences to other test sequences.
Systems and methods for validating that a DNA sample is from a test subject are disclosed. The test subject reports one or more characteristics (biological sex, ethnicity, and/or age) that may be predicted from the DNA sample. The predictions are compared to the reported characteristics to validate the DNA sample. To validate according to biological sex, the system determines a Y-chromosome signal based on counts of sequence reads for a gene specific to the Y chromosome and, similarly, an X-chromosome signal using another gene specific to the X chromosome. The biological sex is predicted based on a comparison of the two signals. To validate according to ethnicity, the system predicts ethnicity based on detected allele frequencies for SNPs specific to each chromosome. To validate according to age, the system calculates the methylation densities for age-informative CpG sites. The system utilizes trained regression models to predict the age using the methylation densities.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
C12Q 1/6879 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for sex determination
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
G16B 30/00 - ICT specially adapted for sequence analysis involving nucleotides or amino acids
G16B 20/20 - Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
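The biological-sex check in the sample-validation abstract above compares a Y-chromosome signal with an X-chromosome signal. A minimal sketch follows, assuming the signals are raw read counts on one gene per chromosome and that a simple ratio threshold separates the two outcomes. The threshold of 0.1 and the function names are hypothetical; the abstract does not say how the two signals are compared.

```python
def predict_sex(y_gene_reads, x_gene_reads, ratio_threshold=0.1):
    """Predict biological sex from read counts on a Y-specific and an X-specific gene.

    ratio_threshold is a hypothetical cut-off chosen for illustration.
    """
    if x_gene_reads <= 0:
        raise ValueError("need at least one read on the X-chromosome gene")
    ratio = y_gene_reads / x_gene_reads
    return "male" if ratio >= ratio_threshold else "female"


def validate_sample(reported_sex, y_gene_reads, x_gene_reads):
    """Return True when the predicted sex matches the subject's reported sex."""
    return predict_sex(y_gene_reads, x_gene_reads) == reported_sex


# Hypothetical read counts.
matches = validate_sample("female", y_gene_reads=1, x_gene_reads=100)
mismatch = validate_sample("female", y_gene_reads=40, x_gene_reads=100)
```

A mismatch between the prediction and the reported value would flag the sample for review; the ethnicity and age checks described in the same abstract are not sketched here.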
Size-band analysis is used to determine whether a chromosomal region exhibits a copy number aberration or an epigenetic alteration. Multiple size ranges may be analyzed instead of focusing on specific sizes. By using multiple size ranges instead of specific sizes, methods may analyze more sequence reads and may be able to determine whether a chromosomal region exhibits a copy number aberration even when clinically-relevant DNA may be a low fraction of the biological sample. Using multiple ranges may allow for the use of all sequence reads from a genomic region, rather than a selected subset of reads in the genomic region. The accuracy of analysis may be increased with higher sensitivity at similar or higher specificity. Analysis may include fewer sequencing reads to achieve the same accuracy, resulting in a more efficient process.
C12Q 1/6809 - Methods for determination or identification of nucleic acids involving differential detection
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
Systems and methods for validating that a DNA sample is from a test subject are disclosed. The test subject reports one or more characteristics (biological sex, ethnicity, and/or age) that may be predicted from the DNA sample. The predictions are compared to the reported characteristics to validate the DNA sample. To validate according to biological sex, the system determines a Y-chromosome signal based on counts of sequence reads for a gene specific to the Y chromosome and, similarly, an X-chromosome signal using another gene specific to the X chromosome. The biological sex is predicted based on a comparison of the two signals. To validate according to ethnicity, the system predicts ethnicity based on detected allele frequencies for SNPs specific to each chromosome. To validate according to age, the system calculates the methylation densities for age-informative CpG sites. The system utilizes trained regression models to predict the age using the methylation densities.
Various embodiments are directed to using nuclease expression in tissues that influences cell-free DNA end signatures/motifs and size of overhang between DNA strands. Embodiments can identify a nuclease that is being differentially regulated in abnormal cells relative to normal cells. Embodiments can determine that the nuclease preferentially cuts DNA into DNA molecules having: (i) a particular sequence end signature; or (ii) a specified length of overhang between a first strand and a second strand. A parameter can be determined for a biological sample based on an amount of DNA molecules that include an end sequence corresponding to the particular sequence end signature and/or a measured property correlating to the specified length of overhang. The parameter can be used to determine a characteristic of a tissue type, a fractional concentration of clinically-relevant DNA molecules, or a level of abnormality of a tissue type in the biological sample.
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
34.
NUCLEASE-ASSOCIATED END SIGNATURE ANALYSIS FOR CELL-FREE NUCLEIC ACIDS
Various embodiments are directed to using nuclease expression in tissues that influences cell-free DNA end signatures/motifs and size of overhang between DNA strands. Embodiments can identify a nuclease that is being differentially regulated in abnormal cells relative to normal cells. Embodiments can determine that the nuclease preferentially cuts DNA into DNA molecules having: (i) a particular sequence end signature; or (ii) a specified length of overhang between a first strand and a second strand. A parameter can be determined for a biological sample based on an amount of DNA molecules that include an end sequence corresponding to the particular sequence end signature and/or a measured property correlating to the specified length of overhang. The parameter can be used to determine a characteristic of a tissue type, a fractional concentration of clinically-relevant DNA molecules, or a level of abnormality of a tissue type in the biological sample.
C12Q 1/34 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving hydrolase
Systems and methods described herein include detecting a presence or absence of HPV in a biological sample having cell-free nucleic acids from a subject and potentially cell-free nucleic acids from an HPV strain. Based on a detection of HPV viral nucleic acids in the biological sample, an HPV-based multiclass classifier that predicts a score for each HPV-associated cancer type is applied. The HPV-based multiclass classifier is trained on a training set of HPV-positive cancer samples. An HPV-associated cancer associated with the biological sample is determined based on the scores predicted by the HPV multiclass classifier.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
36.
DETECTION AND CLASSIFICATION OF HUMAN PAPILLOMAVIRUS ASSOCIATED CANCERS
Systems and methods described herein include detecting a presence or absence of HPV in a biological sample having cell-free nucleic acids from a subject and potentially cell-free nucleic acids from an HPV strain. Based on a detection of HPV viral nucleic acids in the biological sample, an HPV-based multiclass classifier that predicts a score for each HPV-associated cancer type is applied. The HPV-based multiclass classifier is trained on a training set of HPV-positive cancer samples. An HPV-associated cancer associated with the biological sample is determined based on the scores predicted by the HPV multiclass classifier.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
C12Q 1/70 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving virus or bacteriophage
37.
METHYLATED DNA FRAGMENT ENRICHMENT, METHODS, COMPOSITIONS AND KITS
A method of processing an input sample, as well as related kits and compositions, is provided herein. In various instances, the disclosure relates to providing an input sample comprising nucleic acid fragments, wherein in at least a portion of the nucleic acid fragments each fragment comprises one or more methylated cytosines; converting unmethylated cytosines of nucleic acid fragments of the input sample to uracils, yielding converted fragments; copying the converted fragments using a mixture of nucleotides, the mixture comprising a mixture of: binding moiety-modified cytosines and binding moiety-lacking cytosines; binding moiety-modified guanines and binding moiety-lacking guanines; or binding moiety-modified cytosines, binding moiety-lacking cytosines, binding moiety-modified guanines, and binding moiety-lacking guanines; wherein the copying yields a mixture of binding moiety-modified fragments and unmodified fragments which may be separated to provide a set of fragments enriched for hypermethylated fragments.
Methods for measuring subpopulations of cell-free ribonucleic acid (RNA) molecules are provided. In some embodiments, methods of generating a sequencing library from a plurality of RNA molecules in a test sample obtained from a subject are provided, as well as methods for analyzing the sequencing library to detect, e.g., the presence or absence of a disease.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
39.
GENERATING CANCER DETECTION PANELS ACCORDING TO A PERFORMANCE METRIC
A system generates a cancer detection panel. The system is configured to generate an assay having a minimized size and number of genomic regions while still detecting the presence of cancer at or above a specific performance threshold. To select the genomic regions for the panel, the system employs a classification model. The classification model receives a set of genomic regions that may be associated with disease presence. The model then determines a sensitivity score for each genomic region and ranks the regions according to their score. The sensitivity score is based on a likelihood that variations in the genomic region are indicative of cancer. The model then selects genomic regions for the panel based on their rank. The model only selects as many genomic indicators as are needed for desired detection performance. The genomic regions can be associated with solid or liquid cancers, viral regions, or cancer hotspots.
G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
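The rank-and-select procedure in the abstract above (score each genomic region, rank by score, and take only as many regions as the performance target requires) can be sketched as a greedy loop. This is an illustrative sketch only: `select_panel`, the `panel_performance` callback, and the idea of re-evaluating performance after every added region are assumptions, not details given in the source.

```python
from typing import Callable, Dict, List


def select_panel(scores: Dict[str, float],
                 panel_performance: Callable[[List[str]], float],
                 target: float) -> List[str]:
    """Greedily add the highest-scoring regions until the panel meets the target.

    scores: hypothetical sensitivity score per genomic region.
    panel_performance: caller-supplied estimate of detection performance
        for a candidate panel (an assumed interface).
    """
    ranked = sorted(scores, key=lambda region: scores[region], reverse=True)
    panel: List[str] = []
    for region in ranked:
        panel.append(region)
        if panel_performance(panel) >= target:
            break  # stop as soon as the desired performance is reached
    return panel
```

If the target is never reached, the sketch returns every region, so a caller would need to check the final performance separately.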
40.
CANCER CLASSIFICATION WITH GENOMIC REGION MODELING
Methods and systems for detecting cancer and/or determining a cancer tissue of origin are disclosed. Fragments are grouped into genomic regions, wherein a region model is trained for each genomic region using a neural network with hidden layers. Fragments are input into the region models, and the outputs are used to generate a feature vector for cancer classification. In one embodiment, the region models are shallow neural networks configured to generate a score indicating a likelihood that a fragment is derived from a cancer biological sample. The feature vector is determined based on counts of fragments having scores above threshold scores for the various genomic regions. In another embodiment, the region models are configured to generate a region embedding for an input methylation embedding of a fragment. The region embeddings are pooled by region and then pooled again to generate the feature vector.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
G16B 20/00 - ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
Methods and systems for detecting cancer and/or determining a cancer tissue of origin are disclosed. A multiclass cancer classifier is disclosed that is trained with a plurality of biological samples containing cfDNA fragments and at least one synthetic training sample generated from the biological samples. The analytics system generates the synthetic training sample by sampling fragments from a training sample labeled as cancer and sampling fragments from another training sample labeled as non-cancer. The sampling probability is determined based on a limit of detection of the cancer classifier, e.g., in order to generate synthetic training samples with cancer tumor fraction proximate to the limit of detection.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
G16B 20/20 - Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
G16B 30/00 - ICT specially adapted for sequence analysis involving nucleotides or amino acids
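The synthetic-sample step in the abstract above (drawing fragments from a cancer-labeled sample and from a non-cancer sample according to a sampling probability) can be sketched as follows. The function name, the per-fragment Bernoulli draw, and sampling with replacement are assumptions for illustration; in practice `cancer_prob` would be chosen so that the resulting tumor fraction sits near the classifier's limit of detection.

```python
import random
from typing import List, Optional, Sequence


def make_synthetic_sample(cancer_fragments: Sequence[str],
                          noncancer_fragments: Sequence[str],
                          cancer_prob: float,
                          n_fragments: int,
                          seed: Optional[int] = None) -> List[str]:
    """Mix fragments from a cancer and a non-cancer sample (both must be non-empty).

    Each synthetic fragment comes from the cancer sample with probability
    cancer_prob and from the non-cancer sample otherwise (with replacement).
    """
    rng = random.Random(seed)
    synthetic: List[str] = []
    for _ in range(n_fragments):
        source = cancer_fragments if rng.random() < cancer_prob else noncancer_fragments
        synthetic.append(rng.choice(source))
    return synthetic
```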
42.
CANCER CLASSIFICATION WITH GENOMIC REGION MODELING
Methods and systems for detecting cancer and/or determining a cancer tissue of origin are disclosed. Fragments are grouped into genomic regions, wherein a region model is trained for each genomic region using a neural network with hidden layers. Fragments are input into the region models, and the outputs are used to generate a feature vector for cancer classification. In one embodiment, the region models are shallow neural networks configured to generate a score indicating a likelihood that a fragment is derived from a cancer biological sample. The feature vector is determined based on counts of fragments having scores above threshold scores for the various genomic regions. In another embodiment, the region models are configured to generate a region embedding for an input methylation embedding of a fragment. The region embeddings are pooled by region and then pooled again to generate the feature vector.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
G16B 20/00 - ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
43.
SYSTEMS AND METHODS FOR CALLING VARIANTS USING METHYLATION SEQUENCING DATA
An allelic position variant calling method using a prior genotype probability at the allelic position is provided. A strand specific base count set in forward and reverse directions for the allelic position is obtained, using strand orientation and identity of a respective base at the allelic position in each respective nucleic acid fragment sequence that maps to the allelic position, where bases at the allelic position whose identity can be affected by conversion of cytosine to uracil do not contribute to the strand specific base count set. Respective forward and reverse strand conditional probabilities are computed for each candidate genotype for the allelic position using the strand specific base count set and sequencing error estimate. Likelihoods are computed using a combination of these conditional probabilities and the prior genotype probability. From this, a determination is made as to whether the likelihoods support a variant call at the allelic position.
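The calculation outlined in the abstract above (strand-specific base counts that skip conversion-affected bases, per-genotype conditional probabilities, and combination with a prior) can be sketched in simplified form. Several points are assumptions made for illustration: which bases count as ambiguous on each strand, the uniform error model, the lack of renormalization after exclusion, and all function names.

```python
import math
from typing import Dict, Iterable, Tuple

Genotype = Tuple[str, str]

# Assumption: after C->U conversion, C/T calls on the forward strand and
# G/A calls on the reverse strand are treated as ambiguous and not counted.
AMBIGUOUS = {"+": {"C", "T"}, "-": {"G", "A"}}


def strand_counts(observations: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, int]]:
    """Count bases per strand from (strand, base) pairs, skipping ambiguous bases."""
    counts: Dict[str, Dict[str, int]] = {"+": {}, "-": {}}
    for strand, base in observations:
        if base in AMBIGUOUS[strand]:
            continue
        counts[strand][base] = counts[strand].get(base, 0) + 1
    return counts


def log_likelihood(count: Dict[str, int], genotype: Genotype, error: float) -> float:
    """Log probability of one strand's counts given a diploid genotype."""
    total = 0.0
    for base, n in count.items():
        # Each allele is sampled with probability 1/2; errors are spread uniformly.
        p = sum((1 - error) if base == allele else error / 3 for allele in genotype) / 2
        total += n * math.log(p)
    return total


def genotype_posteriors(observations: Iterable[Tuple[str, str]],
                        priors: Dict[Genotype, float],
                        error: float = 0.01) -> Dict[Genotype, float]:
    """Combine forward and reverse strand likelihoods with the prior genotype probability."""
    counts = strand_counts(observations)
    log_post = {
        g: math.log(prior)
        + log_likelihood(counts["+"], g, error)
        + log_likelihood(counts["-"], g, error)
        for g, prior in priors.items()
    }
    peak = max(log_post.values())
    weights = {g: math.exp(v - peak) for g, v in log_post.items()}
    norm = sum(weights.values())
    return {g: w / norm for g, w in weights.items()}
```

A variant call would then be supported when a non-reference genotype has the highest posterior by a chosen margin; that decision rule is left out of the sketch.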
A method for discriminating a cancer state is provided. A first dataset is obtained for a plurality of subjects having a first cancer state. Each subject has a plurality of nucleic acid methylation fragments with methylation patterns comprising CpG site methylation states. An autoencoder including an encoder and decoder is trained by evaluating the error in the autoencoder reconstruction of the methylation pattern and nucleic acid sequence of each nucleic acid methylation fragment in the first dataset. A second dataset is obtained for a plurality of subjects having a second cancer state. A plurality of features is identified by inputting the methylation pattern and nucleic acid sequence of each nucleic acid methylation fragment in the second dataset into the trained autoencoder and computing a score determined by the autoencoder reconstruction of the methylation pattern. The plurality of features is used to train a supervised model that discriminates a cancer state.
An allelic position variant calling method using a prior genotype probability at the allelic position is provided. A strand specific base count set in forward and reverse directions for the allelic position is obtained, using strand orientation and identity of a respective base at the allelic position in each respective nucleic acid fragment sequence that maps to the allelic position, where bases at the allelic position whose identity can be affected by conversion of cytosine to uracil do not contribute to the strand specific base count set. Respective forward and reverse strand conditional probabilities are computed for each candidate genotype for the allelic position using the strand specific base count set and sequencing error estimate. Likelihoods are computed using a combination of these conditional probabilities and the prior genotype probability. From this, a determination is made as to whether the likelihoods support a variant call at the allelic position.
Systems and methods of identifying methylation patterns discriminating or indicating a cancer condition are provided. First and second datasets are obtained. Each dataset comprises a plurality of fragment methylation patterns determined by methylation sequencing of nucleic acids obtained from a first or second set of subjects and comprising a methylation state of each CpG site in a corresponding plurality of CpG sites. Each plurality of subjects has a respective first or second state of the cancer condition. First and second interval maps are generated for each respective dataset, each comprising a plurality of nodes characterized by a start methylation site, an end methylation site, a representation of each different fragment methylation pattern and a count of fragments. The first and second interval maps are scanned for qualifying methylation patterns within a predetermined range of CpG sites, satisfying one or more selection criteria, thereby identifying methylation patterns discriminating a cancer condition.
The present disclosure describes techniques for measuring quantities (e.g., relative frequencies) of end motif pairs of cell-free DNA fragments in a biological sample of an organism for measuring a property of the sample (e.g., fractional concentration of clinically-relevant DNA) and/or determining a pathology of the organism based on such measurements. Different tissue types exhibit different patterns for the relative frequencies of the end motif pairs. The present disclosure provides various uses for measurements of the relative frequencies of end motif pairs of cell-free DNA, e.g., in mixtures of cell-free DNA from various tissues. DNA from certain tissue(s) may be referred to as clinically-relevant DNA.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
48.
BITERMINAL DNA FRAGMENT TYPES IN CELL-FREE SAMPLES AND USES THEREOF
The present disclosure describes techniques for measuring quantities (e.g., relative frequencies) of end motif pairs of cell-free DNA fragments in a biological sample of an organism for measuring a property of the sample (e.g., fractional concentration of clinically-relevant DNA) and/or determining a pathology of the organism based on such measurements. Different tissue types exhibit different patterns for the relative frequencies of the end motif pairs. The present disclosure provides various uses for measurements of the relative frequencies of end motif pairs of cell-free DNA, e.g., in mixtures of cell-free DNA from various tissues. DNA from certain tissue(s) may be referred to as clinically-relevant DNA.
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
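The measurement described in the abstract above (relative frequencies of end motif pairs across cell-free DNA fragments) can be sketched as a simple counting routine. The definition used here, the k-mer at the 5' end of each strand of a fragment, and the default k = 2 are assumptions for illustration; the abstract does not fix the motif length or orientation.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def end_motif_pair(fragment: str, k: int = 2) -> Tuple[str, str]:
    """Return the 5' k-mer of the given strand and the 5' k-mer of the opposite strand."""
    fragment = fragment.upper()
    return fragment[:k], reverse_complement(fragment[-k:])


def end_motif_pair_frequencies(fragments: Iterable[str],
                               k: int = 2) -> Dict[Tuple[str, str], float]:
    """Relative frequency of each end motif pair; fragments shorter than 2k are skipped."""
    counts = Counter(end_motif_pair(f, k) for f in fragments if len(f) >= 2 * k)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}
```

The resulting frequency table could then be compared between samples or tissue references, which is where the tissue-specific patterns mentioned in the abstract would come in.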
49.
BITERMINAL DNA FRAGMENT TYPES IN CELL-FREE SAMPLES AND USES THEREOF
The present disclosure describes techniques for measuring quantities (e.g., relative frequencies) of end motif pairs of cell-free DNA fragments in a biological sample of an organism for measuring a property of the sample (e.g., fractional concentration of clinically-relevant DNA) and/or determining a pathology of the organism based on such measurements. Different tissue types exhibit different patterns for the relative frequencies of the end motif pairs. The present disclosure provides various uses for measurements of the relative frequencies of end motif pairs of cell-free DNA, e.g., in mixtures of cell-free DNA from various tissues. DNA from certain tissue(s) may be referred to as clinically-relevant DNA.
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
Various methods, apparatuses, and systems are provided for detecting a genetic disorder in a gene associated with a nuclease, for determining an efficacy of a dosage of an anticoagulant, and for monitoring an activity of a nuclease. Measured parameter values can be compared to a reference value to determine classifications of a genetic disorder, efficiency, or activity. An amount of a particular base (e.g., in an end motif) at fragment ends, an amount of a particular base at fragment ends of a particular size, or a total amount of cell-free DNA fragments (e.g., as a concentration) can be used. Certain samples may be treated with an anticoagulant, and different incubation times can be used for certain methods.
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
Various methods, apparatuses, and systems are provided for detecting a genetic disorder in a gene associated with a nuclease, for determining an efficacy of a dosage of an anticoagulant, and for monitoring an activity of a nuclease. Measured parameter values can be compared to a reference value to determine classifications of a genetic disorder, efficiency, or activity. An amount of a particular base (e.g., in an end motif) at fragment ends, an amount of a particular base at fragment ends of a particular size, or a total amount of cell-free DNA fragments (e.g., as a concentration) can be used. Certain samples may be treated with an anticoagulant, and different incubation times can be used for certain methods.
C12Q 1/6827 - Hybridisation assays for detection of mutation or polymorphism
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
52.
SYSTEMS AND METHODS FOR ESTIMATING CELL SOURCE FRACTIONS USING METHYLATION INFORMATION
A method of identifying a plurality of features for estimating subject cell source fraction is provided. For each respective training subject in a plurality of training subjects, a corresponding methylation pattern of each respective cell-free fragment in a corresponding training plurality of cell-free fragments and a corresponding subject cancer condition are obtained. Each cell-free fragment is mapped to a bin in a plurality of bins, each bin representing a portion of a human reference genome. A cell-free fragment cancer condition is assigned to each cell-free fragment, as a function of a classifier upon inputting a corresponding methylation pattern of the respective cell-free fragment into the classifier. A measure of association is determined for each bin between the subject cancer condition and the cell-free fragment cancer condition. The plurality of features for estimating subject cell source fraction are identified as a subset of the plurality of bins.
Methods for determining a disease condition of a subject of a species are provided that comprise obtaining a dataset of fragment methylation patterns determined by methylation sequencing of nucleic acid from a biological sample of the subject. A fragment methylation pattern comprises the methylation state of each CpG site in the fragment. A patch including a channel comprising parameters for the methylation status of respective CpG sites in a set of CpG sites in a reference genome represented by the patch is constructed by populating, for each respective fragment in the plurality of fragments that aligns to the set of CpG sites, an instance of all or a portion of the plurality of parameters based on the methylation pattern of the respective fragment. Application of the patch to a patch convolutional neural network determines the disease condition of the subject.
G16B 20/00 - ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
G16H 30/40 - ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
54.
SYSTEMS AND METHODS FOR EVALUATING LONGITUDINAL BIOLOGICAL FEATURE DATA
Systems and methods are provided for determining whether a test subject has a disease condition. In one aspect, the method includes determining at least first and second genotypic data constructs for a test subject, formed from data collected from first and second samples from the subject, respectively, at different times. The first and second genotypic data constructs are inputted into a model for the disease condition, thereby generating first and second model score sets for the disease condition, respectively. A test delta score set is determined based on a difference between the first and second model score sets. The test delta score set is evaluated against a plurality of reference delta score sets, to determine the disease condition of the test subject, where each reference delta score set is for a respective reference subject in a plurality of reference subjects.
Methods and systems for detecting cancer and/or determining a cancer tissue of origin are disclosed. In some embodiments, a multiclass cancer classifier is disclosed that is trained with a plurality of biological samples containing cfDNA fragments. The analytics system derives a feature vector for each sample, and the multiclass classifier predicts a probability likelihood for each of a plurality of tissue of origin (TOO) classes. In some embodiments, the plurality of TOO classes include hematological subtypes, including both hematological malignancies and precursor conditions. In one embodiment, non-cancer samples having high tissue signal are pruned from the training sample set. In another embodiment, the analytics system stratifies samples according to tissue signal and applies binary threshold cutoffs determined for each stratum.
G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
56.
SYSTEMS AND METHODS FOR DIAGNOSING A DISEASE CONDITION USING ON-TARGET AND OFF-TARGET SEQUENCING DATA
Systems and methods for determining whether a subject has a disease condition in a set of disease conditions are provided. The method includes obtaining a test dataset that comprises a first plurality of bin values obtained for a first plurality of bins collectively representing a first portion of a reference genome, and a second plurality of bin values obtained for a second plurality of bins collectively representing a second portion of the reference genome. The first and second plurality of bin values are derived from a targeted sequencing of a plurality of nucleic acids that are enriched using a plurality of probes. A plurality of copy number values are determined from the first and second plurality of bin values. The copy number values are inputted into a trained classifier, thereby determining whether the subject has a disease condition.
Noise models for processing nucleic acid datasets can stratify processed sequence reads into different read tiers. Each read tier can be defined based on whether a potential variant location is at an overlapping region and/or a complementary region of the sequence reads. A processing system can determine, for each read tier, a stratified sequencing depth at the variant location. The processing system can determine, for each read tier, one or more noise parameters conditioned on the stratified sequencing depth of the read tier. The noise parameters can be associated with a noise distribution. The processing system can generate an output for each noise model based on the noise parameters conditioned on the stratified sequencing depth. The processing system can combine the output for each stratified noise model to generate a combined result, which can represent a likelihood that an event would be as or more extreme than the observed data.
Systems and methods for determining consensus base calls in nucleic acid sequencing are provided. A sequencing dataset is obtained corresponding to a plurality of base reads for a first base position within a plurality of base positions of a target nucleic acid molecule. The sequencing dataset includes at least two features, for each base read of the plurality of base reads. The at least two features are selected from among the features: a nucleotide base, a read quality score, a strand identifier, a trinucleotide context of the base read, and a confidence score associated with the trinucleotide context. The sequencing dataset is transformed into a feature tensor representing a distribution of the plurality of features in the sequencing dataset. The feature tensor is assessed with a classifier to determine a consensus base call for the first base position. The consensus base call comprises a predicted nucleotide base.
Methods and systems for determining a subject's likelihood of responding to a treatment by assessing the subject's cell-free DNA (cfDNA) sample include receiving sequence data gathered from sequencing the cfDNA sample, generating a feature matrix of values that correspond to synonymous and nonsynonymous mutations detected in the sequence data, and predicting, based on analysis of the feature matrix at a TMB prediction model, a tumor mutational burden (TMB) for a tissue of interest at the subject. The predicted TMB is evaluated to determine whether a set of criteria indicating a likely response to treatment is met. The set of criteria can include criterion(s) that are met when the predicted TMB is high, when the predicted TMB corresponds to a predicted tumoral heterogeneity indicative of homogeneous tissue, when the predicted TMB corresponds to a tumor fraction indicative of a positive responder, or any combination thereof.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
60.
SYSTEMS AND METHODS FOR DETERMINING TUMOR FRACTION
Systems and methods for determining a tumor fraction for a subject are provided. A plurality of bin values is obtained. Each respective bin value in the plurality of bin values corresponds to a bin in a plurality of bins. Each bin represents a corresponding region of a reference genome. The plurality of bin values is derived from a first biological sample of the subject. A plurality of copy number values is determined at least in part from the plurality of bin values. A plurality of allele frequencies for a plurality of alleles is derived from a second biological sample of the subject. At least the plurality of copy number values and the plurality of allele frequencies, or a plurality of features derived therefrom, are applied to a reference model, thereby determining the tumor fraction of the subject.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
61.
Determining linear and circular forms of circulating nucleic acids
Techniques are provided for analyzing circular DNA in a biological sample (e.g., including cell-free DNA, such as plasma). For example, to measure circular DNA, cleaving can be performed to linearize the circular DNA so that it may be sequenced. Example cleaving techniques include restriction enzymes and transposases. Then, one or more criteria can be used to identify linearized DNA molecules, e.g., so as to differentiate them from linear DNA molecules. An example criterion is mapping a pair of reversed end sequences to a reference genome. Another example criterion is identification of a cutting tag, e.g., associated with a restriction enzyme or an adapter sequence added by a transposase. Once circular DNA molecules (e.g., eccDNA and circular mitochondrial DNA) are identified, they may be analyzed (e.g., to determine a count, size profile, and/or methylation) to measure a property of the biological sample, including genetic properties and level of a disease.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
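As an illustration of the reversed-end criterion named in the circular DNA abstract above (entry 61), the following Python sketch flags a sequenced molecule as circle-derived when the segment at the start of its sequence maps downstream of the segment at its end, on the same chromosome and strand. The `EndAlignment` record, the function name, and the coordinate convention are assumptions made for this example; the abstract does not state how the mapping is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndAlignment:
    """Hypothetical record for where one end segment of a molecule maps."""
    chrom: str
    start: int   # 0-based leftmost reference coordinate
    end: int     # exclusive rightmost reference coordinate
    strand: str  # '+' or '-'


def is_reversed_end_pair(head: EndAlignment, tail: EndAlignment) -> bool:
    """Return True if the two end segments map in reversed order.

    `head` is the alignment of the segment at the beginning of the
    molecule's sequence and `tail` the segment at its end. For an ordinary
    linear fragment on the '+' strand, head maps upstream of tail. A circle
    that was cut open carries its junction inside the molecule, so the
    order on the reference is swapped.
    """
    if head.chrom != tail.chrom or head.strand != tail.strand:
        return False
    if head.strand == "+":
        return head.start >= tail.end
    # On the '-' strand a linear fragment has head downstream of tail,
    # so the reversed arrangement is head upstream of tail.
    return tail.start >= head.end


# Usage: head maps 150 bases downstream of tail on the '+' strand.
head = EndAlignment("chr1", 5000, 5050, "+")
tail = EndAlignment("chr1", 4800, 4850, "+")
print(is_reversed_end_pair(head, tail))  # True
```

A production pipeline would also need to handle split alignments reported by the aligner, mapping quality, and the cutting-tag criterion mentioned in the abstract; none of that is modeled here.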
Systems and methods for classifier training are provided. A first dataset is obtained that comprises, for each first subject, a corresponding plurality of bin values, each for a bin in a plurality of bins, and subject cancer condition. A feature extraction technique is applied to the first dataset thereby obtaining feature extraction functions, each of which is an independent linear or nonlinear function of bin values of the bins. A second dataset is obtained comprising, for each second subject, a corresponding plurality of bin values, each for a bin in the plurality of bins and subject cancer condition. The plurality of bin values of each corresponding subject in the second plurality are projected onto the respective feature extraction functions, thereby forming a transformed second dataset comprising feature values for each subject. The transformed second dataset and subject cancer condition serve to train a classifier on the cancer condition set.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
C12Q 1/68 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
G16B 30/00 - ICT specially adapted for sequence analysis involving nucleotides or amino acids
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
In various embodiments, an analytics system uses models to determine features and classification of disease states. A disease state can indicate presence or absence of cancer, a cancer type, or a cancer tissue of origin. The models can include a binary classifier and a tissue of origin classifier. The analytics system can process sequence reads from test biological samples to generate data for training the classifiers. The analytics system can also use combinations of machine learning techniques to train the models, which can include a multilayer perceptron. In some embodiments, the analytics system uses methylation information to train the models to determine predictions regarding disease state.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
64.
STRATIFICATION OF RISK OF VIRUS ASSOCIATED CANCERS
Provided herein are methods and systems for stratifying risk for a subject to develop a pathogen-associated disorder based on analysis of cell-free nucleic acid molecules from a biological sample of the subject. In various examples, screening frequency is determined based on the risk analysis. Also provided herein are methods and systems for analyzing variant patterns of a pathogen genome in cell-free nucleic acid molecules.
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
65.
Diagnostic applications using nucleic acid fragments
Various embodiments are directed to applications (e.g., classification of biological samples) of the analysis of the count, the fragmentation patterns, and size of cell-free nucleic acids, e.g., plasma DNA and serum DNA, including nucleic acids from pathogens, such as viruses. Embodiments of one application can determine if a subject has a particular condition. For example, a method of the present disclosure can determine if a subject has cancer or a tumor, or other pathology. Embodiments of another application can be used to assess the stage of a condition, or the progression of a condition over time. For example, a method of the present disclosure may be used to determine a stage of cancer in a subject, or the progression of cancer in a subject over time (e.g., using samples obtained from a subject at different times).
C12Q 1/6888 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
C12Q 1/6879 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for sex determination
C12Q 1/6806 - Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
G16B 20/00 - ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
G16H 10/40 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for data related to laboratory analysis, e.g. patient specimen analysis
G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
G16H 50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
G16H 50/30 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for individual health risk assessment
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
66.
STRATIFICATION OF RISK OF VIRUS ASSOCIATED CANCERS
Provided herein are methods and systems for stratifying risk for a subject to develop a pathogen-associated disorder based on analysis of cell-free nucleic acid molecules from a biological sample of the subject. In various examples, screening frequency is determined based on the risk analysis. Also provided herein are methods and systems for analyzing variant patterns of a pathogen genome in cell-free nucleic acid molecules.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
G06F 19/00 - Digital computing or data processing equipment or methods, specially adapted for specific applications (specially adapted for specific functions G06F 17/00; data processing systems or methods specially adapted for administrative, commercial, financial, managerial, supervisory or forecasting purposes G06Q; healthcare informatics G16H)
G01N 33/48 - Biological material, e.g. blood, urine; Haemocytometers
67.
DETERMINING LINEAR AND CIRCULAR FORMS OF CIRCULATING NUCLEIC ACIDS
Techniques are provided for analyzing circular DNA in a biological sample, which include cleaving with restriction enzymes or transposases to linearize the circular DNA so that it may be sequenced; using one or more criteria to identify linearized DNA molecules; and analyzing the identified molecules to measure properties of the biological sample.
Systems and methods for determining a cancer class of a subject are provided in which a plurality of sequence reads, in electronic form, are obtained from a biological sample of the subject. The sample comprises a plurality of cell-free DNA molecules including respective DNA molecules longer than a threshold length of less than 160 nucleotides. The plurality of sequence reads excludes sequence reads of cell-free DNA molecules in the plurality of cell-free DNA molecules longer than the threshold length. The plurality of sequence reads is used to identify a relative copy number at each respective genomic location in a plurality of genomic locations in the genome of the subject. The genetic information about the subject obtained from the sample and the genetic information consisting of the identification of the relative copy number at each respective genomic location, is applied to a classifier that determines the cancer class of the subject.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
C04B 40/06 - Inhibiting the setting, e.g. mortars of the deferred action type containing water in breakable containers
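The size-selection step in the cancer class abstract above can be pictured with a short Python sketch: fragments longer than a threshold are dropped, the remaining fragments are counted per genomic bin, and each bin count is divided by the median bin count to give a relative copy number. The 150-nucleotide default (one value below the stated 160), the 1,000-base bin size, the median normalization, and the `(chrom, start, length)` input format are all assumptions chosen for illustration, not details taken from the source.

```python
from collections import Counter
from statistics import median


def relative_copy_number(fragments, max_length=150, bin_size=1000):
    """Estimate per-bin relative copy number from short fragments only.

    fragments: iterable of (chrom, start, length) tuples.
    Returns {(chrom, bin_index): count / median_count}. Bins with no
    retained fragment are simply absent from the result.
    """
    counts = Counter()
    for chrom, start, length in fragments:
        if length > max_length:
            continue  # exclude molecules longer than the threshold
        counts[(chrom, start // bin_size)] += 1
    if not counts:
        return {}
    baseline = median(counts.values())
    return {bin_key: n / baseline for bin_key, n in counts.items()}


# Usage with toy data: two fragments exceed the threshold and are ignored.
toy = [
    ("chr1", 100, 120), ("chr1", 200, 140), ("chr1", 300, 170),
    ("chr1", 1500, 130),
    ("chr1", 2500, 100), ("chr1", 2600, 110), ("chr1", 2700, 90),
    ("chr1", 2800, 200),
]
print(relative_copy_number(toy))
```

Real pipelines typically also correct for GC content and mappability before normalizing; the sketch omits those steps.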
70.
DETECTING CANCER, CANCER TISSUE OF ORIGIN, AND/OR A CANCER CELL TYPE
The present description provides a hematological disorder (HD) assay panel for targeted detection of methylation patterns or variants specific to various hematological disorders, such as clonal hematopoiesis of indeterminate potential (CHIP) and blood cancers, such as leukemia, lymphoid neoplasms (e.g. lymphoma), multiple myeloma, and myeloid neoplasm. Further provided herein are methods of designing, making, and using the HD assay panel for detection of various hematological disorders.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
The present description provides a cancer assay panel for targeted detection of cancer-specific methylation patterns. Further provided herein are methods of designing, making, and using the cancer assay panel for detection of cancer tissue of origin (e.g., types of cancer).
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
72.
DETECTING CANCER, CANCER TISSUE OF ORIGIN, AND/OR A CANCER CELL TYPE
The present description provides a cancer assay panel for targeted detection of cancer-specific methylation patterns. Further provided herein are methods of designing, making, and using the cancer assay panel to detect cancer and particular types of cancer.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
C12Q 1/6832 - Enhancement of hybridisation reaction
05 - Pharmaceutical, veterinary and sanitary products
10 - Medical apparatus and instruments
42 - Scientific, technological and industrial services, research and design
44 - Medical, veterinary, hygienic and cosmetic services; agriculture, horticulture and forestry services
Goods & Services
Medical diagnostic reagents and medical diagnostic kits comprised of medical diagnostic reagents; reagents for medical use and medical diagnostics and screening kits comprised of reagents for medical diagnostics or screening use; reagents for use in genetic testing for medical and medical diagnostic purposes; diagnostic preparations for medical purposes; assays, reagents, enzymes, and nucleotides for medical purposes, including for medical diagnostics or screening purposes; diagnostic assays, reagents, enzymes, and nucleotides for medical purposes, including for medical diagnostics or screening purposes.
Blood testing apparatus.
Scientific and medical research services; hosting an online database in the field of genetics and cancer for medical or scientific research; providing temporary use of on-line non-downloadable software and applications for use in studying, diagnosing or screening for cancer and studying genetics and DNA; providing temporary use of on-line non-downloadable cloud computing software for use in studying, diagnosing or screening cancer and studying genetics and DNA.
Genetic testing and reporting for medical purposes; medical testing for diagnostic or treatment purposes; medical screening; medical diagnostic testing, monitoring and reporting services; providing medical information regarding genetics, via a website; genetic analysis and reporting services for medical purposes.
The present disclosure describes techniques for measuring quantities (e.g., relative frequencies) of sequence end motifs of cell-free DNA fragments in a biological sample of an organism for measuring a property of the sample (e.g., fractional concentration of clinically-relevant DNA) and/or determining a condition of the organism based on such measurements. Different tissue types exhibit different patterns for the relative frequencies of the sequence end motifs. The present disclosure provides various uses for measures of the relative frequencies of sequence end motifs of cell-free DNA, e.g., in mixtures of cell-free DNA from various tissues. DNA from one of such tissue may be referred to as clinically-relevant DNA.
G16H 50/30 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for individual health risk assessment
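To make the idea of "relative frequencies of sequence end motifs" in the abstract above concrete, the following Python sketch tallies the k-mer at the 5' end of each strand of every fragment and normalizes the counts. The motif length of 4, the use of both strands, and the function names are assumptions for illustration only; the abstract does not fix any of them.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def end_motif_frequencies(fragments, k=4):
    """Relative frequency of each 5' end k-mer across a set of fragments.

    fragments: iterable of fragment sequences given 5'->3' on one strand.
    The 5' end motif of the opposite strand is the reverse complement of
    the last k bases. Motifs containing non-ACGT symbols are skipped.
    """
    counts = Counter()
    for seq in fragments:
        seq = seq.upper()
        if len(seq) < k:
            continue
        for motif in (seq[:k], reverse_complement(seq[-k:])):
            if set(motif) <= set("ACGT"):
                counts[motif] += 1
    total = sum(counts.values())
    return {motif: n / total for motif, n in counts.items()} if total else {}


# Usage: two toy fragments yield four end motifs in total.
print(end_motif_frequencies(["CCCAGGGG", "CCCATTTT"]))
```

Comparing such a frequency profile between samples, or against tissue-specific reference profiles, is one way the measured motif frequencies could feed into the downstream estimates the abstract describes.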
The present disclosure describes techniques for measuring quantities (e.g., relative frequencies) of sequence end motifs of cell-free DNA fragments in a biological sample of an organism for measuring a property of the sample (e.g., fractional concentration of clinically-relevant DNA) and/or determining a condition of the organism based on such measurements. Different tissue types exhibit different patterns for the relative frequencies of the sequence end motifs. The present disclosure provides various uses for measures of the relative frequencies of sequence end motifs of cell-free DNA, e.g., in mixtures of cell-free DNA from various tissues. DNA from one of such tissue may be referred to as clinically-relevant DNA.
Methods for measuring subpopulations of ribonucleic acid (RNA) molecules are provided. In some embodiments, methods of generating a sequencing library from a plurality of RNA molecules in a test sample obtained from a subject are provided, as well as methods for analyzing the sequencing library to detect, e.g., the presence or absence of a disease.
Systems and methods are disclosed for determining a cell source fraction in a biological sample of a test subject. Nucleic acid fragments are obtained from a biological sample, comprising cell-free nucleic acid, of the test subject. A methylation state is obtained for each nucleic acid fragment in a first plurality of nucleic acid fragments. Each respective nucleic acid fragment is individually assigned a first score, thereby obtaining a first plurality of scores. Each respective score represents a likelihood that the corresponding nucleic acid fragment was obtained from a cell-free nucleic acid molecule associated with the first cell source. The first plurality of scores is transformed into a first plurality of counts, each count in the first plurality of counts being for a methylation site in a first predetermined set of methylation sites. A first cell source fraction for the test subject is estimated using the first plurality of counts.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
79.
CANCER TISSUE SOURCE OF ORIGIN PREDICTION WITH MULTI-TIER ANALYSIS OF SMALL VARIANTS IN CELL-FREE DNA SAMPLES
A predictive cancer model generates a prediction of cancer tissue source of origin for a subject of interest by analyzing values of one or more types of features that are derived from cfDNA obtained from the individual. Specifically, cfDNA from the individual is sequenced to generate sequence reads using one or more physical assays, examples of which include a small variant sequencing assay. The sequence reads of the physical assays are processed through corresponding computational analyses to generate small variant features and other features. The values of features can be provided to a prediction model that generates a prediction of cancer tissue source of origin and/or cancer presence.
Systems and methods are provided for determining relevant medical information about a cancer based on the distribution of fragment lengths of cell-free DNA sequenced from a biological fluid sample. In certain embodiments, the systems and methods are useful for segmenting a cancer genome, phasing alleles in a cancer genome, detecting the loss of heterozygosity in a cancer genome, assigning an origin of a variant allele, validating a sequencing mapping, and validating use of an allele in a cancer classifier.
G06F 19/22 - for sequence comparison involving nucleotides or amino acids, e.g. homology search, motif or Single-Nucleotide Polymorphism [SNP] discovery or sequence alignment
A system and method for determining a presence of cancer in a test sample from a test subject comprising a set of fragments of deoxyribonucleic acid (DNA). The fragments may be identified through probabilistic analyses or identified when determined to be hypermethylated or hypomethylated. The system generates a test feature vector with a score for each CpG site for use in a trained model. The score is based on a number of the fragments in the test sample that overlap the CpG site. The system inputs the test feature vector into the trained model. The trained model has a function that generates a cancer prediction based on the test feature vector and a set of classification parameters. The cancer prediction for the test sample may include a cancer prediction value for each cancer type that describes a likelihood the test sample is of that particular cancer type.
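The per-site scoring described in the preceding abstract can be sketched in a few lines of Python. Here the score for a CpG site is simply the number of selected fragments that overlap it; the abstract only says the score is "based on" that number, so the raw count, the half-open interval convention, and the function name are assumptions.

```python
def cpg_feature_vector(fragments, cpg_sites):
    """Build a feature vector with one overlap count per CpG site.

    fragments: list of (start, end) half-open intervals on one chromosome,
               e.g. fragments already flagged as hyper- or hypomethylated.
    cpg_sites: list of CpG positions, in the order the model expects.
    """
    return [
        sum(1 for start, end in fragments if start <= pos < end)
        for pos in cpg_sites
    ]


# Usage: three fragments scored against five CpG positions.
fragments = [(100, 200), (150, 250), (300, 400)]
print(cpg_feature_vector(fragments, [120, 160, 260, 399, 400]))  # [1, 2, 0, 1, 0]
```

The nested scan is quadratic in the worst case; for genome-scale inputs an interval index or a sorted sweep would be the usual replacement.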
A method and system for determining one or more sources of a cell free deoxyribonucleic acid (cfDNA) test sample from a test subject. The cfDNA test sample contains a plurality of deoxyribonucleic acid (DNA) molecules with numerous CpG sites that may be methylated or unmethylated. A trained deconvolution model comprises a plurality of methylation parameters, including a methylation level at each CpG site for each source, and a function relating a sample vector as input and a source of origin prediction as output. The method generates a test sample vector comprising a site methylation metric relating to DNA molecules from the test sample that are methylated at that CpG site. The method inputs the test sample vector into the trained deconvolution model to generate a source of origin prediction indicating a predicted DNA molecule contribution of each source.
05 - Pharmaceutical, veterinary and sanitary products
10 - Medical apparatus and instruments
42 - Scientific, technological and industrial services, research and design
44 - Medical, veterinary, hygienic and cosmetic services; agriculture, horticulture and forestry services
Goods & Services
Medical diagnostic reagents and medical diagnostic kits comprised of medical diagnostic reagents; reagents for medical use and medical diagnostics or screening kits comprised of reagents for medical diagnostics or screening use; reagents for use in genetic testing for medical and medical diagnostic purposes; diagnostic preparations for medical purposes; assays, reagents, enzymes, and nucleotides for medical or clinical diagnostics or screening purposes; diagnostic assays, reagents, enzymes, and nucleotides for medical or clinical purposes.
Blood testing apparatus.
Scientific and medical research services; hosting an online database in the field of genetics and cancer for medical or scientific research; providing temporary use of on-line non-downloadable software and applications for use in studying, diagnosing or screening for cancer and studying genetics and DNA; providing temporary use of on-line non-downloadable cloud computing software for use in studying, diagnosing or screening cancer and studying genetics and DNA.
Genetic testing and reporting for medical purposes; medical testing for screening, diagnostic or treatment purposes; medical diagnostic testing, monitoring and reporting services; providing medical information regarding genetics, via a website; genetic analysis and reporting services for medical purposes.
84.
METHYLATION MARKERS AND TARGETED METHYLATION PROBE PANEL
The present description provides a cancer assay panel for targeted detection of cancer-specific methylation patterns. Further provided herein are methods of designing, making, and using the cancer assay panel for the diagnosis of cancer.
Cell-free DNA fragments often include jagged ends, where one end of one strand of double-stranded DNA extends beyond the other end of the other strand. The length and amount of these jagged ends may be used to determine a level of a condition of an individual, a fractional concentration of clinically-relevant DNA in a biological sample, an age of the individual, or a tissue type exhibiting cancer. The jagged end length and amount may be determined using various techniques described herein.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
Cell-free DNA fragments often include jagged ends, where one end of one strand of double-stranded DNA extends beyond the other end of the other strand. The length and amount of these jagged ends may be used to determine a level of a condition of an individual, a fractional concentration of clinically-relevant DNA in a biological sample, an age of the individual, or a tissue type exhibiting cancer. The jagged end length and amount may be determined using various techniques described herein.
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
87.
CELL-FREE DNA DAMAGE ANALYSIS AND ITS CLINICAL APPLICATIONS
Cell-free DNA fragments often include jagged ends, where one end of one strand of double-stranded DNA extends beyond the other end of the other strand. The length and amount of these jagged ends may be used to determine a level of a condition of an individual, a fractional concentration of clinically-relevant DNA in a biological sample, an age of the individual, or a tissue type exhibiting cancer. The jagged end length and amount may be determined using various techniques described herein.
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
42 - Scientific, technological and industrial services, research and design
44 - Medical, veterinary, hygienic and cosmetic services; agriculture, horticulture and forestry services
Goods & Services
Providing temporary use of on-line non-downloadable software and applications for use in studying, diagnosing or screening for cancer; providing temporary use of on-line non-downloadable cloud computing software for use in studying, diagnosing or screening cancer; all of the foregoing services provided to physicians, nurses, and other healthcare clinicians.
Genetic testing and reporting for medical purposes; medical testing for diagnostic or treatment purposes; medical screening; medical diagnostic testing, monitoring and reporting services; providing a website featuring medical information regarding genetics; genetic analysis and reporting services for medical purposes; all of the foregoing services provided to physicians, nurses, and other healthcare clinicians.
89.
NUCLEIC ACID REARRANGEMENT AND INTEGRATION ANALYSIS
Provided herein are methods and systems for identifying chimeric nucleic acid fragments, e.g., organism-pathogen chimeric nucleic acid fragments and chromosomal rearrangement chimeric nucleic acid fragments. Also provided herein are methods and systems relating to determining a pathogen integration profile or a chromosomal rearrangement in a biological sample and determining a classification of pathology based at least in part on a pathogen integration profile or a chromosomal rearrangement in a biological sample. In certain aspects of the present disclosure, cell-free nucleic acid molecules from a biological sample are analyzed.
G16B 30/00 - ICT specially adapted for sequence analysis involving nucleotides or amino acids
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
Provided herein are methods and systems for identifying chimeric nucleic acid fragments, e.g., organism-pathogen chimeric nucleic acid fragments and chromosomal rearrangement chimeric nucleic acid fragments. Also provided herein are methods and systems relating to determining a pathogen integration profile or a chromosomal rearrangement in a biological sample and determining a classification of pathology based at least in part on a pathogen integration profile or a chromosomal rearrangement in a biological sample. In certain aspects of the present disclosure, cell-free nucleic acid molecules from a biological sample are analyzed.
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
91.
CONVOLUTIONAL NEURAL NETWORK SYSTEMS AND METHODS FOR DATA CLASSIFICATION
Classification of cancer condition, in a plurality of different cancer conditions, for a species, is provided in which, for each training subject in a plurality of training subjects, there is obtained a cancer condition and a genotypic data construct including genotypic information for the respective training subject. Genotypic constructs are formatted into corresponding vector sets comprising one or more vectors. Vector sets are provided to a network architecture including a convolutional neural network path comprising at least a first convolutional layer associated with a first filter that comprises a first set of filter weights and a scorer. Scores, corresponding to the input of vector sets into the network architecture, are obtained from the scorer. Comparison of respective scores to the corresponding cancer condition of the corresponding training subjects is used to adjust the filter weights thereby training the network architecture to classify cancer condition.
C40B 30/04 - Methods of screening libraries by measuring the ability to specifically bind a target molecule, e.g. antibody-antigen binding, receptor-ligand binding
C40B 60/12 - Apparatus specially adapted for use in combinatorial chemistry or with libraries for screening libraries
G06N 3/12 - Computing arrangements based on biological models using genetic models
G06N 7/00 - Computing arrangements based on specific mathematical models
92.
INFERRING SELECTION IN WHITE BLOOD CELL MATCHED CELL-FREE DNA VARIANTS AND/OR IN RNA VARIANTS
Methods and systems for detecting positive, neutral, or negative selection at a locus include obtaining a test sample of cell-free nucleic acids from a subject, preparing a sequencing library of the cell-free nucleic acids, sequencing the library to obtain a plurality of sequence reads, analyzing the sequence reads to detect and quantify one or more somatic mutations at the locus, determining a selection coefficient for the locus, and comparing the selection coefficient with a threshold value to detect positive, neutral, or negative selection at the locus.
G16B 30/00 - ICT specially adapted for sequence analysis involving nucleotides or amino acids
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
42 - Scientific, technological and industrial services, research and design
44 - Medical, veterinary, hygienic and cosmetic services; agriculture, horticulture and forestry services
Goods & Services
scientific and medical research services; providing an online database in the field of genetics and cancer for medical or scientific research; Providing temporary use of on-line non-downloadable software and applications for use in studying, diagnosing or screening for cancer and studying genetics and DNA; Providing temporary use of on-line non-downloadable cloud computing software for use in studying, diagnosing or screening cancer and studying genetics and DNA; Software as a Service (SaaS) services featuring software in the nature of a platform for genetic and bioinformatics analysis genetic testing and reporting for medical purposes; medical testing for diagnostic or treatment purposes; medical screening; medical diagnostic testing, monitoring and reporting services; providing a website featuring medical information regarding genetics; genetic analysis and reporting services for medical purposes
42 - Scientific, technological and industrial services, research and design
44 - Medical, veterinary, hygienic and cosmetic services; agriculture, horticulture and forestry services
Goods & Services
scientific and medical research services; providing an online database in the field of genetics and cancer for medical or scientific research; Providing temporary use of on-line non-downloadable software and applications for use in studying, diagnosing or screening for cancer and studying genetics and DNA; Providing temporary use of on-line non-downloadable cloud computing software for use in studying, diagnosing or screening cancer and studying genetics and DNA; Software as a Service (SaaS) services featuring software in the nature of a platform for genetic and bioinformatics analysis genetic testing and reporting for medical purposes; medical testing for diagnostic or treatment purposes; medical screening; medical diagnostic testing, monitoring and reporting services; providing a website featuring medical information regarding genetics; genetic analysis and reporting services for medical purposes
95.
SIZE-TAGGED PREFERRED ENDS AND ORIENTATION-AWARE ANALYSIS FOR MEASURING PROPERTIES OF CELL-FREE MIXTURES
Various applications can use fragmentation patterns of cell-free DNA, e.g., plasma DNA and serum DNA. For example, the end positions of DNA fragments can be used for various applications. The fragmentation patterns of short and long DNA molecules can be associated with different preferred DNA end positions, referred to as size-tagged preferred ends. In another example, the fragmentation patterns relating to tissue-specific open chromatin regions can be analyzed. A classification of a proportional contribution of a particular tissue type can be determined in a mixture of cell-free DNA from different tissue types. Additionally, a property of a particular tissue type can be determined, e.g., whether a sequence imbalance exists in a particular region for a tissue type or whether a pathology exists for the tissue type.
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
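As a rough illustration of the size-tagged preferred ends idea in the abstract above, the following Python sketch tallies fragment end positions separately for short and long fragments and reports positions that are over-represented in one size class. The 150 bp cutoff, the `min_count` rule, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter


def size_tagged_end_counts(fragments, size_cutoff=150):
    """Count end positions separately for short and long fragments.

    fragments: iterable of (start, end) inclusive genomic coordinates.
    size_cutoff: hypothetical boundary (in bp) between "short" and "long".
    """
    short_ends, long_ends = Counter(), Counter()
    for start, end in fragments:
        size = end - start + 1
        target = short_ends if size < size_cutoff else long_ends
        # Both ends of the fragment contribute to the end-position tally.
        target[start] += 1
        target[end] += 1
    return short_ends, long_ends


def preferred_ends(own, other, min_count=2):
    """Positions seen at least min_count times in `own` and more often than in `other`.

    This selection rule is an assumption, not the method of the listed patent.
    """
    return sorted(
        pos for pos, n in own.items()
        if n >= min_count and n > other.get(pos, 0)
    )


if __name__ == "__main__":
    frags = [(100, 199), (100, 219), (100, 400), (250, 400), (300, 400)]
    short_c, long_c = size_tagged_end_counts(frags)
    print("short-tagged ends:", preferred_ends(short_c, long_c))
    print("long-tagged ends:", preferred_ends(long_c, short_c))
```

A real analysis would work per chromosome and normalize for sequencing depth; the sketch only shows the bookkeeping.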
96.
SIZE-TAGGED PREFERRED ENDS AND ORIENTATION-AWARE ANALYSIS FOR MEASURING PROPERTIES OF CELL-FREE MIXTURES
Various applications can use fragmentation patterns of cell-free DNA, e.g., plasma DNA and serum DNA. For example, the end positions of DNA fragments can be used for various applications. The fragmentation patterns of short and long DNA molecules can be associated with different preferred DNA end positions, referred to as size-tagged preferred ends. In another example, the fragmentation patterns relating to tissue-specific open chromatin regions can be analyzed. A classification of a proportional contribution of a particular tissue type can be determined in a mixture of cell-free DNA from different tissue types. Additionally, a property of a particular tissue type can be determined, e.g., whether a sequence imbalance exists in a particular region for a tissue type or whether a pathology exists for the tissue type.
C12Q 1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
97.
SIZE-TAGGED PREFERRED ENDS AND ORIENTATION-AWARE ANALYSIS FOR MEASURING PROPERTIES OF CELL-FREE MIXTURES
Various applications can use fragmentation patterns of cell-free DNA, e.g., plasma DNA and serum DNA. For example, the end positions of DNA fragments can be used for various applications. The fragmentation patterns of short and long DNA molecules can be associated with different preferred DNA end positions, referred to as size-tagged preferred ends. In another example, the fragmentation patterns relating to tissue-specific open chromatin regions can be analyzed. A classification of a proportional contribution of a particular tissue type can be determined in a mixture of cell-free DNA from different tissue types. Additionally, a property of a particular tissue type can be determined, e.g., whether a sequence imbalance exists in a particular region for a tissue type or whether a pathology exists for the tissue type.
G16B 30/00 - ICT specially adapted for sequence analysis involving nucleotides or amino acids
C12Q 1/6881 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for tissue or cell typing, e.g. human leukocyte antigen [HLA] probes
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
Methods are provided to improve the positive predictive value for cancer detection using cell-free nucleic acid samples. The methods can include the use of at least two assays. The assays can vary, for example, with respect to sensitivity, specificity, sequencing depth, analyte, and cost. An exemplary method can be used to provide an initial cancer assay with high sensitivity and a follow-up assay with high specificity in detecting cancer.
G06F 19/22 - for sequence comparison involving nucleotides or amino acids, e.g. homology search, motif or Single-Nucleotide Polymorphism [SNP] discovery or sequence alignment
C12Q 1/68 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
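The benefit of pairing a high-sensitivity initial assay with a high-specificity follow-up assay, as described in the abstract above, can be illustrated with Bayes' rule. The Python sketch below is a worked arithmetic example only: the prevalence, sensitivity, and specificity values are hypothetical, and the two assays are assumed to be conditionally independent given disease status, which the abstract does not state.

```python
def ppv(prevalence, sensitivity, specificity):
    """Positive predictive value from Bayes' rule."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


def two_stage_ppv(prevalence, first, second):
    """PPV after a positive first assay is confirmed by a positive second assay.

    first, second: (sensitivity, specificity) tuples.
    Assumes the assays err independently given true disease status.
    """
    # The PPV of the first assay becomes the prior for the second assay.
    prior_after_first = ppv(prevalence, *first)
    return ppv(prior_after_first, *second)


if __name__ == "__main__":
    prevalence = 0.01            # hypothetical cancer prevalence
    screening = (0.90, 0.90)     # hypothetical high-sensitivity initial assay
    confirmatory = (0.80, 0.99)  # hypothetical high-specificity follow-up assay
    print(f"PPV, initial assay alone: {ppv(prevalence, *screening):.3f}")
    print(f"PPV, both assays positive: {two_stage_ppv(prevalence, screening, confirmatory):.3f}")
```

With these illustrative numbers, the initial assay alone gives a PPV of about 0.083, and requiring both assays to be positive raises it to about 0.879.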
99.
SYSTEMS AND METHODS FOR USING PATHOGEN NUCLEIC ACID LOAD TO DETERMINE WHETHER A SUBJECT HAS A CANCER CONDITION
Methods for screening for a cancer condition in a subject are provided. A biological sample from the subject is obtained. The sample comprises cell-free nucleic acid from the subject and potentially cell-free nucleic acid from a pathogen in a set of pathogens. The cell-free nucleic acid in the biological sample is sequenced to generate a plurality of sequence reads from the subject. A determination is made, for each respective pathogen in the set of pathogens, of a corresponding amount of the plurality of sequence reads that map to a sequence in a pathogen target reference for the respective pathogen, thereby obtaining a set of amounts of sequence reads, each respective amount of sequence reads in the set of amounts of sequence reads for a corresponding pathogen in the set of pathogens. The set of amounts of sequence reads is used to determine whether the subject has the cancer condition.
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
C12Q 1/6888 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
G16B 30/00 - ICT specially adapted for sequence analysis involving nucleotides or amino acids
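The counting step in the abstract above (one amount of mapped sequence reads per pathogen in the set) can be sketched in Python as follows. The input format, the pathogen labels, and the per-pathogen threshold rule used for the final call are all assumptions for illustration; the abstract only says that the set of amounts is used to determine whether the subject has the cancer condition.

```python
from collections import Counter


def count_reads_per_pathogen(read_assignments, pathogens):
    """Return {pathogen: number of reads mapped to its target reference}.

    read_assignments: iterable with one label per sequence read, naming the
        reference the read mapped to (None for unmapped or host reads).
    pathogens: the set of pathogens being screened.
    """
    counts = Counter(a for a in read_assignments if a in pathogens)
    # Report a zero explicitly for pathogens with no mapped reads.
    return {p: counts.get(p, 0) for p in pathogens}


def exceeds_any_threshold(amounts, thresholds):
    """Hypothetical decision rule: flag if any pathogen's amount exceeds its cutoff."""
    return any(amounts[p] > thresholds[p] for p in amounts)


if __name__ == "__main__":
    pathogens = ["EBV", "HPV16", "HBV"]  # illustrative labels only
    reads = ["EBV", None, "EBV", "HBV", None, "EBV"]
    amounts = count_reads_per_pathogen(reads, pathogens)
    print(amounts)
    print(exceeds_any_threshold(amounts, {"EBV": 2, "HPV16": 2, "HBV": 2}))
```

In practice the amounts would likely be normalized (for example, by total read count) before comparison; the raw counts here keep the example short.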
100.
SYSTEMS AND METHODS FOR DETERMINING TUMOR FRACTION IN CELL-FREE NUCLEIC ACID
Systems and methods are disclosed for determining tumor fraction in cell-free nucleic acid of a liquid biological sample of a subject. Sequence reads are obtained using the biological sample. The sequence reads are used to identify support for each variant in a variant set thereby determining an observed frequency of each variant in the variant set. For each respective variant in the variant set, a corresponding reference frequency for the respective variant is obtained in a reference set, where each corresponding reference frequency in the reference set is for a respective variant in an aberrant solid tissue sample obtained from the subject. The observed frequency of each respective variant in the variant set is evaluated against the observed frequency of the respective variant in the reference set thereby determining the tumor fraction in cell-free nucleic acid of the liquid biological sample.
C12Q 1/6827 - Hybridisation assays for detection of mutation or polymorphism
C12Q 1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
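One simple way to picture the comparison in the abstract above is to divide each variant's observed frequency in the cell-free sample by its frequency in the tissue reference set and summarize the ratios. The Python sketch below does this with a median; the ratio-and-median estimator, the data layout, and the function names are assumptions for illustration and are not presented as the evaluation used in the listed patent.

```python
from statistics import median


def observed_frequency(alt_reads, total_reads):
    """Fraction of reads at a variant site that support the variant."""
    return alt_reads / total_reads if total_reads else 0.0


def estimate_tumor_fraction(support, reference_freq):
    """Median ratio of cell-free observed frequency to tissue reference frequency.

    support: {variant: (supporting_reads, total_reads)} from the liquid sample.
    reference_freq: {variant: frequency in the solid tissue sample}.
    """
    ratios = []
    for variant, (alt, total) in support.items():
        ref = reference_freq.get(variant, 0.0)
        # Skip variants with no usable reference frequency or no coverage.
        if ref > 0 and total > 0:
            ratios.append(observed_frequency(alt, total) / ref)
    if not ratios:
        raise ValueError("no variants with usable reference frequencies")
    # A fraction cannot exceed 1, so cap the estimate.
    return min(1.0, median(ratios))


if __name__ == "__main__":
    support = {"v1": (5, 1000), "v2": (4, 1000), "v3": (10, 1000)}
    reference = {"v1": 0.5, "v2": 0.4, "v3": 0.5}
    print(f"estimated tumor fraction: {estimate_tumor_fraction(support, reference):.3f}")
```

The median is used here only because it is robust to a few outlying variants; a likelihood-based fit would be another reasonable choice.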