Methods, systems, and apparatuses for adaptive power supply voltage transient protection are disclosed. One system includes a power supply, a transient sensor, and a power control processing entity. The power supply operates to provide power to one or more processors. The transient sensor is connected to the power supply and operates to sense transients on the power supply at greater than a predetermined speed or rate. The power control processing entity operates to receive a representation of the sensed transients and adjust a power load based on the sensed transients.
Disclosed herein is a graph streaming neural network processing system comprising a first processor array, a second processor, and a thread scheduler. The thread scheduler dispatches a thread of a first node to the first processor array or the second processor, wherein the thread is executed to generate output data comprising a data unit stored in a private data buffer of the second processor. The thread scheduler determines that the data unit is sufficient for executing a thread of a second node. The second node is dependent on the output data generated by execution of a plurality of threads of the first node. Upon determining that the data unit is sufficient, the thread scheduler dispatches the thread of the second node. The thread scheduler determines to dispatch a subsequent thread of the first node for execution when a predefined threshold buffer size is available on the private data buffer.
Methods, systems, and apparatuses for discovering novel artificial neural network (ANN) architectures are disclosed. One method includes calculating ANN architecture fingerprints including an ANN architecture fingerprint of each of a plurality of existing ANN architectures, creating a plurality of next-generation candidate ANN architectures, calculating a plurality of next-generation candidate ANN architecture fingerprints including an ANN architecture fingerprint of each of the plurality of next-generation candidate ANN architectures, calculating ANN architecture pairwise similarities between each of the plurality of existing ANN architectures and each of the plurality of next-generation candidate ANN architectures using the plurality of existing ANN architecture fingerprints and the plurality of next-generation candidate ANN architecture fingerprints, retraining each of the plurality of next-generation candidate ANN architectures on a training dataset, obtaining a performance score of each of the next-generation candidate ANN architectures, and calculating a fitness score for each of the next-generation candidate ANN architectures.
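The fingerprint-and-fitness idea in this abstract can be sketched in a few lines. This is an illustrative Python sketch only, not the patented implementation: the cosine-similarity fingerprint comparison, the novelty term, and the weighting are assumptions.

```python
from math import sqrt

def pairwise_similarity(fp_a, fp_b):
    # Cosine similarity between two architecture fingerprint vectors.
    dot = sum(a * b for a, b in zip(fp_a, fp_b))
    norm = sqrt(sum(a * a for a in fp_a)) * sqrt(sum(b * b for b in fp_b))
    return dot / norm

def fitness_score(performance, similarities_to_existing, novelty_weight=0.5):
    # Blend a candidate's performance score with its novelty, where novelty
    # is one minus its highest similarity to any existing architecture.
    novelty = 1.0 - max(similarities_to_existing)
    return (1.0 - novelty_weight) * performance + novelty_weight * novelty
```

A candidate that performs well but closely resembles an existing architecture is penalized relative to an equally strong but more novel one.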
09 - Scientific and electrical apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods and services
Computer hardware, namely, electronic circuits, computer chips and circuit boards for AI inferencing, machine learning, deep learning, and vision processing; graph streaming data processors; downloadable computer software, namely, software for the creation, processing and streaming of graphs; downloadable computer software development tools, compiler software and electronic coding units for programming graph streaming data processors; downloadable computer software for providing an integrated development environment (IDE) for artificial intelligence (AI) and/or machine learning (ML) software design, development, deployment, and management. Platform as a Service (PaaS) services for artificial intelligence (AI) and/or machine learning (ML) software design, development, deployment, and management.
5.
Group Thread Dispatch for Graph Streaming Processor
Methods, systems, and apparatuses for graph streaming processing are disclosed. One method includes receiving, by a thread scheduler, a group of threads, calculating a resource requirement for execution of the group of threads, calculating resource availability in a plurality of processors of each of a plurality of processor arrays, dispatching the group of threads to a selected one of the plurality of processors of the processor arrays, and scheduling a group load instruction for all threads of the group of threads, including loading into a group load register a subset of inputs of the input tensor for processing of each thread of the group of threads, wherein the group load register provides the subset of the inputs of the input tensor to the group of threads of the selected one of the plurality of processors, and wherein all threads of the group of threads are synchronized when executing the group load instruction.
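The dispatch step above — compute the group's resource requirement, then place the whole group on one processor with enough availability — can be sketched as follows. This is a hedged illustration; the register-count resource model, the dictionary shapes, and the first-fit policy are assumptions, not the patented scheduler.

```python
def dispatch_group(group, processor_arrays):
    # A group is dispatched only to a processor whose free resources can
    # hold every thread of the group, so the group load can run in lockstep.
    required = sum(thread["registers"] for thread in group)
    for array in processor_arrays:
        for proc in array:
            if proc["free_registers"] >= required:
                proc["free_registers"] -= required
                return proc["id"]
    return None  # no single processor can host the whole group yet
```

Returning `None` models the case where dispatch must wait until some processor frees enough resources for the entire group.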
09 - Scientific and electrical apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods and services
(1) Computer hardware, namely, electronic circuits, computer chips and circuit boards for AI inferencing, machine learning, deep learning, and vision processing; graph streaming data processors; downloadable computer software, namely, software for the creation, processing and streaming of graphs; downloadable computer software development tools, compiler software and electronic coding units for programming graph streaming data processors; downloadable computer software for providing an integrated development environment (IDE) for artificial intelligence (AI) and/or machine learning (ML) software design, development, deployment, and management. (1) Platform as a Service (PaaS) services for artificial intelligence (AI) and/or machine learning (ML) software design, development, deployment, and management.
09 - Scientific and electrical apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods and services
Computer hardware, namely, electronic circuits, computer chips and circuit boards for AI inferencing, machine learning, deep learning, and vision processing; graph streaming data processors; Downloadable computer software, namely, software for the creation, processing and streaming of graphs; Downloadable computer software development tools, downloadable compiler software and electronic coding units for programming graph streaming data processors; Downloadable computer software for providing an integrated development environment (IDE) for artificial intelligence (AI) and machine learning (ML) software design, development, deployment, and management. Platform as a service (PAAS) featuring computer software platforms for artificial intelligence (AI) and machine learning (ML) software design, development, deployment, and management.
8.
METHOD AND SYSTEMS FOR PREDICTING MEDICAL CONDITIONS AND FORECASTING RATE OF INFECTION OF MEDICAL CONDITIONS VIA ARTIFICIAL INTELLIGENCE MODELS USING GRAPH STREAM PROCESSORS
Systems and methods are disclosed for predicting one or more medical conditions utilizing digital images and employing artificial intelligence algorithms. The system offers accurate predictions utilizing a quantized pre-trained deep learning model. The pre-trained deep learning model is trained on data samples and later refined as the system processes more digital images or new medical conditions are incorporated. One pre-trained deep learning model is used to predict the probability of one or more medical conditions and identify locations in the digital image affected by the one or more medical conditions. Further, one pre-trained deep learning model, utilizing additional data and a plurality of digital images, forecasts the rate of infection and spread of the medical condition over time.
G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature spaces; Blind source separation
G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. "bagging" or "boosting"
G06V 10/774 - Generating sets of training patterns; Processing of image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration and data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation; Bootstrap methods, e.g. "bagging" or "boosting"
G06V 10/776 - Validation; Performance evaluation
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G16H 30/40 - ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
9.
Group thread dispatch for graph streaming processor
Methods, systems, and apparatuses for graph streaming processing are disclosed. One method includes receiving, by a thread scheduler, a group of threads, calculating a resource requirement for execution of the group of threads, calculating resource availability in a plurality of processors of each of a plurality of processor arrays, dispatching the group of threads to a selected one of the plurality of processors of the processor arrays, and scheduling a group load instruction for all threads of the group of threads, including loading into a group load register a subset of inputs of the input tensor for processing of each thread of the group of threads, wherein the group load register provides the subset of the inputs of the input tensor to the group of threads of the selected one of the plurality of processors, and wherein all threads of the group of threads are synchronized when executing the group load instruction.
Disclosed herein is a graph streaming processing system comprising a thread scheduler comprising a first component and a second component. The first component is configured to schedule a first set of threads of a first node to a first processor associated with the first node and initialize status of a completion pointer to an initial value. The completion pointer is associated with a command buffer of the first node. The first component is configured to detect the execution of the first set of threads and generation of a data unit and update the status of the completion pointer to an updated value indicating execution of the first set of threads in response to the generation of the data unit. The second component is configured to schedule a second set of threads of a plurality of second nodes to a second processor based on the status of the completion pointer. The second processor is associated with the plurality of second nodes and the second set of threads of the plurality of second nodes are dependent on execution of the first set of threads.
The present disclosure relates to a system and method of performing quantization of a neural network having multiple layers. The method comprises receiving a floating-point dataset as the input dataset and determining a first shift constant for the first layer of the neural network based on the input dataset. The method also comprises performing quantization for the first layer using the determined shift constant of the first layer. The method further comprises determining a next shift constant for the next layer of the neural network based on the output of the layer previous to the next layer, and performing quantization for the next layer using the determined next shift constant. The method further comprises iterating the steps of determining a shift constant and performing quantization for all layers of the neural network to generate a fixed-point dataset as output.
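The layer-by-layer shift-constant scheme can be sketched as below. This is an illustrative Python sketch under assumed conventions (power-of-two scaling into a fixed-point format with 7 fractional bits); the patent does not specify these particulars.

```python
from math import ceil, log2

def shift_constant(values, frac_bits=7):
    # Power-of-two shift that scales the observed dynamic range of a
    # layer's data into a signed fixed-point format (assumed scheme).
    max_abs = max(abs(v) for v in values) or 1.0
    return frac_bits - ceil(log2(max_abs))

def quantize(values, shift):
    # Multiply by 2**shift and round to the nearest integer.
    return [int(round(v * 2.0 ** shift)) for v in values]

def quantize_network(layer_activations):
    # Layer by layer: derive the shift from the data feeding that layer,
    # then quantize; each subsequent shift comes from the previous output.
    return [quantize(acts, shift_constant(acts)) for acts in layer_activations]
```

Because the shift is a power of two, dequantization is a cheap bit shift rather than a general multiply, which is the usual motivation for this style of fixed-point scheme.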
G06V 10/776 - Validation; Performance evaluation
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
12.
Adaptive power supply voltage transient protection
Methods, systems, and apparatuses for adaptive power supply voltage transient protection are disclosed. One system includes a power supply, a voltage transient sensor, and a power control processing entity. The power supply operates to provide power to one or more processors. The voltage transient sensor is connected to the power supply and operates to sense voltage transients on the power supply at greater than a predetermined speed or rate. The power control processing entity operates to receive a representation of the sensed voltage transients and adjust a power load based on the sensed voltage transients.
Methods, systems and apparatuses for a custom artificial neural network (ANN) architecture are disclosed. One method includes selecting existing ANN architectures, calculating ANN architecture fingerprints, calculating ANN architecture pairwise similarities among the existing ANN architectures, calculating centrality scores for the existing ANN architectures using the ANN architecture pairwise similarities, calculating dataset pairwise similarities between the target dataset and each of the existing datasets using dataset fingerprints, calculating target performance scores for the existing ANN architectures on the target dataset using performance scores of the existing ANN architectures on the existing datasets and the dataset pairwise similarities, calculating interpolation weights for the existing ANN architectures using the target performance scores of the existing ANN architectures on the target dataset and the centrality scores, and obtaining the custom ANN architecture by interpolating among the existing ANN architectures using the calculated interpolation weights.
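The interpolation-weight step at the end of this abstract can be sketched as follows. This is a hedged illustration, not the patented formula: weighting each existing architecture by the product of its estimated target performance and its centrality, then normalizing, is an assumption.

```python
def interpolation_weights(target_scores, centrality_scores):
    # Weight each existing architecture by its estimated performance on
    # the target dataset times its centrality, then normalize to sum to 1.
    raw = [p * c for p, c in zip(target_scores, centrality_scores)]
    total = sum(raw)
    return [r / total for r in raw]
```

The normalized weights can then drive a convex combination (interpolation) of the existing architectures' hyperparameters to obtain the custom architecture.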
G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature spaces; Blind source separation
G06F 18/22 - Matching criteria, e.g. proximity measures
G06N 3/084 - Backpropagation, e.g. using gradient descent
G06N 5/046 - Forward inferencing; Production systems
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Method and systems for predicting medical conditions and forecasting rate of infection of medical conditions via artificial intelligence models using graph stream processors
Systems and methods are disclosed for predicting one or more medical conditions utilizing digital images and employing artificial intelligence algorithms. The system offers accurate predictions utilizing a quantized pre-trained deep learning model. The pre-trained deep learning model is trained on data samples and later refined as the system processes more digital images or new medical conditions are incorporated. One pre-trained deep learning model is used to predict the probability of one or more medical conditions and identify locations in the digital image affected by the one or more medical conditions. Further, one pre-trained deep learning model, utilizing additional data and a plurality of digital images, forecasts the rate of infection and spread of the medical condition over time.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature spaces; Blind source separation
G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. "bagging" or "boosting"
G06V 10/774 - Generating sets of training patterns; Processing of image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration and data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation; Bootstrap methods, e.g. "bagging" or "boosting"
G06V 10/776 - Validation; Performance evaluation
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G16H 30/40 - ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
15.
Unsupervised data drift detection for classification neural networks
Methods, systems, and apparatuses for unsupervised data drift detection for classification neural networks are disclosed. One method includes providing the data stream of images to a neural network, generating, by the neural network, class-wise probabilities, storing each image of the data stream of images, storing the class-wise probabilities generated by the neural network, comparing artifacts of images of the data stream at a first time with artifacts of images of the data stream at a second time, comparing artifacts produced by the class-wise probabilities of the data stream retrieved from the stored class-wise probabilities at a third time with artifacts produced by the class-wise probabilities of the data stream retrieved from the stored class-wise probabilities at a fourth time, and generating an informative communication based on the comparisons.
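One concrete way to compare the stored class-wise probabilities across two time windows is a distance between their average class distributions. This is an illustrative Python sketch only; total-variation distance and the window averaging are assumptions, not the patented comparison.

```python
def mean_class_distribution(prob_rows):
    # Average the class-wise probability vectors over a window of predictions.
    n = len(prob_rows)
    return [sum(row[c] for row in prob_rows) / n for c in range(len(prob_rows[0]))]

def drift_score(window_a, window_b):
    # Total-variation distance between the two windows' average class
    # distributions; a large score suggests the input distribution drifted.
    ha = mean_class_distribution(window_a)
    hb = mean_class_distribution(window_b)
    return 0.5 * sum(abs(a - b) for a, b in zip(ha, hb))
```

A threshold on the score could then trigger the "informative communication" the abstract mentions, with no labels required.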
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; Image or video pattern matching; Proximity measures in feature spaces using context analysis; Selection of dictionaries
G06V 10/776 - Validation; Performance evaluation
Methods, systems, and apparatuses for graph stream processing are disclosed. One apparatus includes a cascade of graph streaming processors, wherein each of the graph streaming processors includes a processor array and a graph streaming processor scheduler. The cascade of graph streaming processors further includes a plurality of shared command buffers, wherein each shared command buffer includes a buffer address, a write pointer, and a read pointer, wherein for each of the plurality of shared command buffers a graph streaming processor writes commands to the shared command buffer as indicated by the write pointer and reads commands from the shared command buffer as indicated by the read pointer, and wherein at least one graph streaming processor scheduler operates to manage the write pointer and the read pointer to avoid overwriting unused commands of the shared command buffer.
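The write-pointer/read-pointer discipline described above is the classic ring-buffer invariant: a write is refused while it would clobber a command the reader has not yet consumed. A minimal Python sketch of that invariant (the class name and occupancy counter are assumptions, not the hardware design):

```python
class SharedCommandBuffer:
    # Fixed-size ring buffer; writes are refused while the buffer is full
    # so that unread commands are never overwritten.
    def __init__(self, size):
        self.buf = [None] * size
        self.write_ptr = 0
        self.read_ptr = 0
        self.count = 0  # commands written but not yet read

    def write(self, cmd):
        if self.count == len(self.buf):
            return False  # full: would overwrite an unread command
        self.buf[self.write_ptr] = cmd
        self.write_ptr = (self.write_ptr + 1) % len(self.buf)
        self.count += 1
        return True

    def read(self):
        if self.count == 0:
            return None
        cmd = self.buf[self.read_ptr]
        self.read_ptr = (self.read_ptr + 1) % len(self.buf)
        self.count -= 1
        return cmd
```

In hardware the same effect is usually achieved by comparing the two pointers rather than keeping an explicit count; the counter just makes the sketch easier to read.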
Disclosed herein is a graph streaming neural network processing system comprising a first processor array, a second processor, and a thread scheduler. The thread scheduler dispatches a thread of a first node to the first processor array or the second processor, wherein the thread is executed to generate output data comprising a data unit stored in a private data buffer of the second processor. The thread scheduler determines that the data unit is sufficient for executing a thread of a second node. The second node is dependent on the output data generated by execution of a plurality of threads of the first node. Upon determining that the data unit is sufficient, the thread scheduler dispatches the thread of the second node. The thread scheduler determines to dispatch a subsequent thread of the first node for execution when a predefined threshold buffer size is available on the private data buffer.
G06F 9/48 - Program initiating; Program switching, e.g. by interrupt
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
18.
Single instruction multiple data execution with variable size logical registers
Systems, apparatuses, and methods are disclosed for efficient management of registers in a graph stream processing (GSP) system. The GSP system includes a thread scheduler module operative to initiate a Single Instruction Multiple Data (SIMD) thread, the SIMD thread including a dispatch mask with an initial value; a thread arbiter module operative to select an instruction and provide it to each of one or more compute resources; and an instruction iterator module, associated with each of the one or more compute resources, operative to determine a data type of the instruction. The instruction iterator module iteratively executes the instruction based on the data type and the dispatch mask.
Disclosed herein is a method and a system for generating a mixed precision quantization model for performing image processing. The method comprises receiving a validation dataset of images to train a neural network model. For each image of the validation dataset, the method comprises generating a union sensitivity list, selecting a group of layers, generating a mixed precision quantization model by quantizing the selected group of layers into a high precision format, and computing the accuracy of the mixed precision quantization model for comparison with a target accuracy. In response to determining that the accuracy is less than the target accuracy, another mixed precision model is generated by selecting a next group of layers and recomputing the accuracy. In response to determining that the accuracy is greater than or equal to the target accuracy, the mixed precision quantization model is stored as the final mixed precision quantization model for image processing.
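The accuracy-driven search loop in this abstract can be sketched as below. This is an illustrative Python sketch; treating the sensitivity list as an ordered sequence of layer groups and promoting one group per iteration are assumptions about the method, and `evaluate` is a hypothetical callback.

```python
def mixed_precision_search(sensitivity_groups, evaluate, target_accuracy):
    # Promote layer groups to high precision in sensitivity order until the
    # quantized model's accuracy reaches the target (or groups run out).
    promoted = []
    accuracy = evaluate(promoted)
    for group in sensitivity_groups:
        if accuracy >= target_accuracy:
            break
        promoted.append(group)
        accuracy = evaluate(promoted)
    return promoted, accuracy
```

The returned pair is the final set of high-precision groups and the accuracy achieved, mirroring the "store as final model" step.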
G06N 3/04 - Architecture, e.g. interconnection topology
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/28 - Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
20.
Method of optimizing register memory allocation for vector instructions and a system thereof
The present disclosure relates to a system and a method of optimizing register allocation by a processor. The method comprises receiving an intermediate representation (IR) code of a source code and initializing a single instruction multiple data (SIMD) width for the IR code. The method comprises analyzing each basic block of the IR code to classify one or more instructions of the IR code as vector instructions, wherein each vector instruction is one of LOAD, STORE, and arithmetic, logical, and multiply (ALM) instructions. The method comprises dynamically setting the SIMD width for each of the vector instructions.
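The classify-then-set-width flow can be sketched over a toy IR. This is an illustrative Python sketch under loudly assumed conventions: the instruction dictionaries, the `operand_bytes` field, and the policy of keeping full width for memory ops while narrowing ALM ops by operand size are all inventions for illustration, not the patented heuristic.

```python
def set_simd_widths(instructions, max_width=32):
    # One plausible policy: memory ops (LOAD/STORE) keep the full SIMD
    # width, while ALM ops are narrowed in proportion to operand size.
    widths = {}
    for ins in instructions:
        if ins["op"] in ("LOAD", "STORE"):
            widths[ins["id"]] = max_width
        else:  # arithmetic / logical / multiply (ALM)
            widths[ins["id"]] = max_width // ins.get("operand_bytes", 1)
    return widths
```

Setting the width per instruction, rather than once per kernel, is what lets wider operands avoid over-allocating vector registers.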
Methods, systems, and apparatuses for adaptive power supply voltage transient protection are disclosed. One system includes a system on a chip (SOC), wherein the SOC includes a power supply, a voltage transient sensor, and a power control processing entity. The power supply operates to provide power to one or more processors operating on the SOC. The voltage transient sensor is connected to the power supply and operates to sense voltage transients on the power supply at greater than a predetermined speed or rate. The power control processing entity operates to receive a digital representation of the sensed voltage transients and adjust a power load of the SOC based on the sensed voltage transients.
09 - Scientific and electrical apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods and services
Downloadable computer software development platform for artificial intelligence (AI) and machine learning (ML) software design, development, deployment, and management; downloadable computer software development platform for graph streaming data processors; downloadable computer software, namely, software for creating and optimizing artificial intelligence applications; downloadable computer software, namely, software for artificial intelligence inferencing, machine learning, deep learning, and vision processing. Platform as a service (PaaS) services for artificial intelligence (AI) and machine learning (ML) software design, development, deployment, and management; platform as a service (PaaS) services for graph streaming data processors.
09 - Scientific and electrical apparatus and instruments
Goods and services
Computer hardware, namely, electronic circuits, computer chips and circuit boards for AI inferencing, machine learning, deep learning, and vision processing; graph streaming data processors; downloadable computer software, namely, software for AI inferencing, machine learning, deep learning, and vision processing; downloadable computer software development tools.
09 - Scientific and electrical apparatus and instruments
Goods and services
Computer hardware, namely, electronic circuits, computer chips and circuit boards for AI inferencing, machine learning, deep learning, and vision processing; graph streaming data processors; downloadable computer software, namely, software for AI inferencing, machine learning, deep learning, and vision processing; downloadable computer software development tools.
09 - Scientific and electrical apparatus and instruments
Goods and services
(1) Computer hardware, namely, electronic circuits, computer chips and circuit boards for AI inferencing, machine learning, deep learning, and vision processing; graph streaming data processors; downloadable computer software, namely, software for providing an integrated development environment for AI inferencing, machine learning, deep learning, and vision processing; downloadable computer software development tools, namely, compiler software and electronic coding units for programming graph streaming data processors.
09 - Scientific and electrical apparatus and instruments
Goods and services
(1) Computer hardware, namely, electronic circuits, computer chips and circuit boards for AI inferencing, machine learning, deep learning, and vision processing; graph streaming data processors; downloadable computer software, namely, software for providing an integrated development environment for AI inferencing, machine learning, deep learning, and vision processing; downloadable computer software development tools, namely, compiler software and electronic coding units for programming graph streaming data processors.
09 - Scientific and electrical apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods and services
(1) Downloadable computer software development platform for providing an integrated development environment for artificial intelligence (AI) and machine learning (ML) software design, development, deployment, and management; downloadable computer software development platform for the creation, processing and streaming of graphs; downloadable computer software, namely, software for providing an integrated development environment for creating and optimizing artificial intelligence applications; downloadable computer software, namely, software for providing an integrated development environment for artificial intelligence inferencing, machine learning, deep learning, and vision processing. (1) Platform as a service (PaaS) services for artificial intelligence (AI) and machine learning (ML) software design, development, deployment, and management; platform as a service (PaaS) services for the creation, processing and streaming of graphs.
09 - Scientific and electrical apparatus and instruments
Goods and services
Computer hardware, namely, electronic circuits, computer chips and circuit boards for AI inferencing, machine learning, deep learning, and vision processing; graph streaming data processors; Downloadable computer software, namely, software for AI inferencing, machine learning, deep learning, and vision processing; Downloadable computer software development tools
09 - Scientific and electrical apparatus and instruments
Goods and services
Computer hardware, namely, electronic circuits, computer chips and circuit boards for AI inferencing, machine learning, and deep learning; graph streaming data processors.
09 - Scientific and electrical apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods and services
Downloadable computer software development platform for the design, development, deployment, and management of artificial intelligence (AI) and machine learning (ML) software; Downloadable computer software, namely, software development platforms for graph streaming data processors; Downloadable computer software, namely, software for creating and optimizing artificial intelligence applications Platform as a Service (PaaS) services featuring computer software platforms for the design, development, deployment, and management of artificial intelligence (AI) and machine learning (ML) software; Platform as a Service (PaaS) services featuring computer software platforms for graph streaming data processors.
31.
Iterating group sum of multiple accumulate operations
Methods, systems and apparatuses for performing walk operations of single instruction, multiple data (SIMD) instructions are disclosed. One method includes initiating, by a scheduler, a SIMD thread, where the scheduler is operative to schedule the SIMD thread. The method further includes fetching a plurality of instructions for the SIMD thread. The method further includes determining, by a thread arbiter, at least one instruction that is a walk instruction, where the walk instruction iterates a block of instructions for a subset of channels of the SIMD thread, where the walk instruction includes a walk size, and where the walk size is a number of channels in the subset of channels of the SIMD thread that are processed in a walk iteration in association with the walk instruction. The method further includes executing the walk instruction based on the walk size.
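The walk semantics — iterating a block of instructions over successive `walk_size`-channel subsets of the SIMD thread — can be sketched as below. This is an illustrative Python sketch; modeling the instruction block as a callable over a channel slice is an assumption for illustration.

```python
def walk(channels, walk_size, block):
    # Execute `block` once per walk iteration, each time over the next
    # `walk_size` channels of the SIMD thread.
    results = []
    for start in range(0, len(channels), walk_size):
        results.append(block(channels[start:start + walk_size]))
    return results
```

For example, an 8-channel thread with a walk size of 4 runs the block twice, once per half of the channels, which is how a group-sum-of-multiply-accumulate can be iterated over a wide thread.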
The present disclosure relates to a system and method of performing quantization of a neural network having multiple layers. The method comprises receiving a floating-point dataset as the input dataset and determining a first shift constant for the first layer of the neural network based on the input dataset. The method also comprises performing quantization for the first layer using the determined shift constant of the first layer. The method further comprises determining a next shift constant for the next layer of the neural network based on the output of the layer previous to the next layer, and performing quantization for the next layer using the determined next shift constant. The method further comprises iterating the steps of determining a shift constant and performing quantization for all layers of the neural network to generate a fixed-point dataset as output.
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, bars or intersections; Connectivity analysis, e.g. of connected components
G06V 10/776 - Validation; Performance evaluation
33.
Configurable scheduler with pre-fetch and invalidate threads in a graph stream processing system
Systems, apparatuses, and methods are disclosed for scheduling threads comprising code blocks in a graph streaming processor (GSP) system. One system includes a scheduler for scheduling a plurality of prefetch threads, main threads, and invalidate threads. The prefetch threads prefetch data from main memory required for execution of the main threads of the next stage. The main threads include sets of instructions operating on the graph streaming processors of the GSP system. The invalidate threads invalidate the data location(s) consumed by the main threads of the previous stage. A portion of the scheduler is implemented in hardware.
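The three thread classes can be laid out as a simple schedule trace, a sketch assuming one prefetch/main/invalidate phase per stage (the real scheduler overlaps these in hardware):

```python
def schedule(stages):
    """Toy three-phase schedule: for stage i, prefetch the data needed
    by stage i+1, run the main threads of stage i, then invalidate the
    locations consumed by stage i-1."""
    trace = []
    for i, stage in enumerate(stages):
        if i + 1 < len(stages):
            trace.append(("prefetch", stages[i + 1]))
        trace.append(("main", stage))
        if i > 0:
            trace.append(("invalidate", stages[i - 1]))
    return trace

trace = schedule(["A", "B"])
```

Prefetching one stage ahead hides memory latency, while invalidating one stage behind frees cache lines as soon as their consumers have finished.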
Methods, systems, and apparatuses for a graph streaming processing system are disclosed. One system includes a plurality of graph streaming processors operative to process a plurality of threads, wherein the plurality of threads is organized as nodes. The system further includes a scheduler that includes a plurality of stages. Each stage includes a command parser operative to interpret commands within a corresponding input command buffer, an alternate command buffer, and a thread generator coupled to the command parser. The thread generator is operative to generate the plurality of threads and dispatch the plurality of threads, where the processing of the plurality of threads for each stage includes storing write commands in the corresponding output command buffer or in the alternate command buffer.
Methods, systems, and apparatuses for graph stream processing are disclosed. One apparatus includes a cascade of graph streaming processors, wherein each of the graph streaming processors includes a processor array and a graph streaming processor scheduler. The cascade of graph streaming processors further includes a plurality of shared command buffers, wherein each shared command buffer includes a buffer address, a write pointer, and a read pointer. For each of the plurality of shared command buffers, a first graph streaming processor writes commands to the shared command buffer as indicated by the write pointer, and a second graph streaming processor reads commands from the shared command buffer as indicated by the read pointer, wherein at least one graph streaming processor scheduler operates to manage the write pointer and the read pointer to avoid overwriting unused commands of the shared command buffer.
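The pointer management described above is essentially a bounded ring buffer in which the producer stalls rather than overwrite unread commands. A minimal sketch (class and method names are illustrative):

```python
class SharedCommandBuffer:
    """Shared command buffer between two processors: the producer
    writes at the write pointer, the consumer reads at the read
    pointer, and a write fails (producer must stall) while the buffer
    is full, so unread commands are never overwritten."""

    def __init__(self, size):
        self.buf = [None] * size
        self.wr = 0          # write pointer
        self.rd = 0          # read pointer
        self.count = 0       # occupied slots

    def write(self, cmd):
        if self.count == len(self.buf):
            return False     # full: producer stalls
        self.buf[self.wr] = cmd
        self.wr = (self.wr + 1) % len(self.buf)
        self.count += 1
        return True

    def read(self):
        if self.count == 0:
            return None      # empty: consumer waits
        cmd = self.buf[self.rd]
        self.rd = (self.rd + 1) % len(self.buf)
        self.count -= 1
        return cmd
```

In hardware the occupancy check is typically done by comparing the two pointers directly; the explicit `count` here just keeps the sketch short.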
Disclosed embodiments relate to a method and device for optimizing compilation of source code. The proposed method receives a first intermediate representation code of a source code and analyses each basic block instruction of the plurality of basic block instructions contained in the first intermediate representation code for blockification. In order to blockify the identical instructions, the one or more groups of basic block instructions are assessed for eligibility of blockification. Upon being determined eligible, the group of basic block instructions is blockified using one of one-dimensional and two-dimensional SIMD vectorization. The method further generates a second intermediate representation of the source code, which is translated to executable target code with more efficient processing capacity.
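A toy blockification pass over a flat instruction list gives the flavor of the technique. Here eligibility is reduced to opcode equality of consecutive instructions, and merging produces a 1-D SIMD instruction over stacked operands; the real pass applies much richer eligibility checks:

```python
def blockify(instrs):
    """Merge runs of instructions with the same opcode into one
    vector instruction ('v' + opcode) over stacked destinations and
    operands. Instructions are (opcode, dst, srcs) tuples."""
    out = []
    i = 0
    while i < len(instrs):
        j = i
        while j < len(instrs) and instrs[j][0] == instrs[i][0]:
            j += 1
        if j - i > 1:  # eligible group: blockify it
            op = instrs[i][0]
            out.append(("v" + op,
                        [ins[1] for ins in instrs[i:j]],
                        [ins[2] for ins in instrs[i:j]]))
        else:          # lone instruction: emit unchanged
            out.append(instrs[i])
        i = j
    return out

second_ir = blockify([("add", "a", ("x", "y")),
                      ("add", "b", ("p", "q")),
                      ("mul", "c", ("a", "b"))])
```

Two-dimensional vectorization would additionally stack such runs across basic blocks, turning the operand lists into matrices.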
Systems, apparatuses and methods are disclosed for efficient management of registers in a graph stream processing (GSP) system. The GSP system includes a thread scheduler module operative to initiate a Single Instruction Multiple Data (SIMD) thread, the SIMD thread including a dispatch mask with an initial value; a thread arbiter module operative to select an instruction from the fetched instructions and provide the instruction to each of one or more compute resources; and an instruction iterator module, associated with each of the one or more compute resources, operative to determine a data type of the instruction. The instruction iterator module iteratively executes the instruction based on the data type and the dispatch mask.
09 - Scientific and electrical apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods and services
Downloadable computer software for providing an integrated development environment (IDE) for artificial intelligence (AI) and/or machine learning (ML) software design, development, deployment, and management. Platform as a Service (PaaS) services for artificial intelligence (AI) and/or machine learning (ML) software design, development, deployment, and management.
39.
METHOD OF OPTIMIZING SCALAR REGISTER ALLOCATION AND A SYSTEM THEREOF
The present disclosure relates to a system and a method of optimizing scalar register allocation by a processor. The method comprises receiving an intermediate code and information about one or more available physical registers in a memory of the processor as input. The method further comprises allocating one or more virtual registers based on the received information, wherein each virtual register has the size of an available physical register. The method also comprises mapping one or more groups of 8-bit locations of the one or more virtual registers to one or more register classes. The method further comprises identifying a plurality of scalar variables from the input intermediate code, and dynamically assigning the one or more available physical registers to the identified scalar variables using the one or more register classes.
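The packing idea can be sketched by treating each physical register as a row of one-byte slots and assigning scalars first-fit; the register-class machinery is collapsed into a simple size-in-bytes check, and all names are illustrative:

```python
def assign_scalars(scalars, num_phys, phys_bytes=4):
    """First-fit packing of scalar variables into byte slots of
    physical registers. `scalars` is a list of (name, size_in_bytes);
    returns name -> (register, byte_offset)."""
    # slots[r] = remaining free byte offsets in physical register r
    slots = {r: list(range(phys_bytes)) for r in range(num_phys)}
    assignment = {}
    for name, size in scalars:
        for r in range(num_phys):
            free = slots[r]
            if len(free) >= size:
                assignment[name] = (r, free[0])
                slots[r] = free[size:]   # consume `size` slots
                break
    return assignment

alloc = assign_scalars([("a", 1), ("b", 2), ("c", 4)], num_phys=2)
```

Packing several narrow scalars into one physical register is what distinguishes this from a conventional one-variable-per-register allocator.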
Systems, apparatuses, and methods are disclosed for scheduling threads comprising code blocks in a graph streaming processor (GSP) system. One system includes a scheduler for scheduling a plurality of prefetch threads, main threads, and invalidate threads. The prefetch threads prefetch data from main memory required for execution of the main threads of the next stage. The main threads include sets of instructions operating on the graph streaming processors of the GSP system. The invalidate threads invalidate the data location(s) consumed by the main threads of the previous stage. A portion of the scheduler is implemented in hardware.
Methods, systems and apparatuses for performing walk operations of single instruction, multiple data (SIMD) instructions are disclosed. One method includes initiating, by a scheduler, a SIMD thread, where the scheduler is operative to schedule the SIMD thread. The method further includes fetching a plurality of instructions for the SIMD thread. The method further includes determining, by a thread arbiter, at least one instruction that is a walk instruction, where the walk instruction iterates a block of instructions for a subset of channels of the SIMD thread, where the walk instruction includes a walk size, and where the walk size is a number of channels in the subset of channels of the SIMD thread that are processed in a walk iteration in association with the walk instruction. The method further includes executing the walk instruction based on the walk size.
09 - Scientific and electrical apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods and services
(1) Downloadable computer software for providing an integrated development environment (IDE) for artificial intelligence (AI) and machine learning (ML) software design, development, deployment, and management. (1) Platform as a Service (PaaS) services for artificial intelligence (AI) and machine learning (ML) software design, development, deployment, and management
09 - Scientific and electrical apparatus and instruments
Goods and services
(1) Computer hardware, namely, electronic circuits, computer chips and circuit boards for machine learning, deep learning, and vision processing; graph streaming data processors; downloadable computer software, namely, software for the creation, processing and streaming of graphs; downloadable computer software development tools, compiler software and electronic coding units for programming graph streaming data processors.
09 - Scientific and electrical apparatus and instruments
Goods and services
(1) Computer hardware, namely, electronic circuits, computer chips and circuit boards for machine learning, deep learning, and vision processing; graph streaming data processors; downloadable computer software, namely, software for the creation, processing and streaming of graphs; downloadable computer software development tools, compiler software and electronic coding units for programming graph streaming data processors.
09 - Scientific and electrical apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods and services
Downloadable computer software for providing an integrated development environment (IDE) for artificial intelligence (AI) and machine learning (ML) software design, development, deployment, and management. Platform as a Service (PaaS) services for artificial intelligence (AI) and machine learning (ML) software design, development, deployment, and management
46.
Configurable scheduler for graph processing on multi-processor computing systems
Systems and methods are disclosed for scheduling code in a multiprocessor system. Code is partitioned into code blocks by a compiler. The compiler schedules execution of the code blocks in nodes. The nodes are connected in a directed acyclic graph with a top node, a terminal node, and a plurality of intermediate nodes. Execution of the top node is initiated by the compiler. After executing at least one instance of the top node, an instruction in the code block indicates to the scheduler to initiate at least one intermediary node. The scheduler schedules a thread for execution of the intermediary node. The data for the nodes resides in a plurality of data buffers; the index to the data buffer is stored in a command buffer.
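The dispatch flow can be modeled with a queue standing in for the command buffer: executing a node pushes a command naming each child, and the scheduler pops commands to dispatch downstream nodes. This is a sketch only; dependency counting (waiting for all parents of a node) is omitted for brevity:

```python
from collections import deque

def run_graph(edges, top):
    """Dispatch nodes of a DAG starting from `top`. `edges` maps a
    node to the children its code block initiates; the deque plays
    the role of the command buffer."""
    command_buffer = deque([top])
    order = []
    dispatched = set()
    while command_buffer:
        node = command_buffer.popleft()
        if node in dispatched:
            continue           # already run by another parent's command
        dispatched.add(node)
        order.append(node)     # "execute" the node's code block
        command_buffer.extend(edges.get(node, []))
    return order

order = run_graph({"top": ["a", "b"], "a": ["end"], "b": ["end"]}, "top")
```

Storing only an index into the data buffers in each command keeps commands small, so the command buffer stays compact even when node payloads are large.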
The described embodiments include systems, methods, and apparatuses for an increased-efficiency processing flow. One system includes a plurality of stages configured to process an execution graph that includes a plurality of logical nodes with defined properties and resources associated with each logical node; a recirculating ring buffer, wherein the recirculating ring buffer is configured to hold only the control information, input data, and/or output data necessary to stream temporary data between the logical nodes of the execution graph; and a data producer, wherein the data producer is configured to stall from writing control information into a command buffer upon the command buffer being full, preventing command-buffer over-writing.
Methods, systems, and apparatuses for graph processing are disclosed. One graph streaming processor includes a thread manager, wherein the thread manager is operative to dispatch operation of a plurality of threads on a plurality of thread processors before dependencies of the dependent threads have been resolved, maintain a scorecard of the operation of the plurality of threads on the plurality of thread processors, and provide an indication to at least one of the plurality of thread processors of whether a dependency request has or has not been satisfied. Further, a producer thread provides a response to the dependency when the dependency has been satisfied, and each of the plurality of thread processors is operative to provide processing updates to the thread manager and provide queries to the thread manager upon reaching a dependency.
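The scorecard protocol reduces to three operations, sketched below as an illustrative model (not the hardware design): register dependencies at dispatch, let consumers query them, and let producers respond when they are satisfied:

```python
class ThreadManager:
    """Scorecard sketch: threads are dispatched before their
    dependencies resolve; a consumer queries at a dependency and the
    producer later responds to satisfy it."""

    def __init__(self):
        self.scorecard = {}              # dependency id -> satisfied?

    def dispatch(self, deps):
        # Register every dependency as unresolved at dispatch time.
        for d in deps:
            self.scorecard.setdefault(d, False)

    def query(self, dep):
        # Consumer thread asks whether it may proceed past `dep`.
        return self.scorecard.get(dep, False)

    def respond(self, dep):
        # Producer thread reports that `dep` is now satisfied.
        self.scorecard[dep] = True
```

Dispatching before resolution keeps the thread processors busy; the scorecard only gates the specific instruction that actually consumes the dependent data.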
Disclosed embodiments relate to a method and device for optimizing compilation of source code. The proposed method receives a first intermediate representation code of a source code and analyses each basic block instruction of the plurality of basic block instructions contained in the first intermediate representation code for blockification. In order to blockify the identical instructions, the one or more groups of basic block instructions are assessed for eligibility of blockification. Upon being determined eligible, the group of basic block instructions is blockified using one of one-dimensional and two-dimensional SIMD vectorization. The method further generates a second intermediate representation of the source code, which is translated to executable target code with more efficient processing capacity.
Methods, systems and apparatuses for discovering novel artificial neural network architectures (ANN) architecture are disclosed. One method includes calculating ANN architecture fingerprints including an ANN architecture fingerprint of each of a plurality of existing ANN architectures, creating a plurality of next-generation candidate ANN architectures, calculating a plurality of next-generation candidate ANN architecture fingerprints including an ANN architecture fingerprint of each of the plurality of next-generation candidate ANN architectures, calculating ANN architecture pairwise similarities between each of the plurality of existing ANN architectures and each of the plurality of next-generation candidate ANN architectures using the plurality of existing ANN architecture fingerprints and the plurality of next-generation candidate ANN architecture fingerprints, retraining each of the plurality of next-generation candidate ANN architectures on the training dataset, obtaining a performance score of each of the next-generation candidate ANN architectures, and calculating a fitness score for each of the next-generation candidate ANN architectures.
Method and systems for predicting medical conditions and forecasting rate of infection of medical conditions via artificial intelligence models using graph stream processors
Systems and methods are disclosed for predicting one or more medical conditions utilizing digital images and employing artificial intelligence algorithms. The system offers accurate predictions utilizing a quantized pre-trained deep learning model. The pre-trained deep learning model is trained on data samples and later refined as the system processes more digital images or new medical conditions are incorporated. One pre-trained deep learning model is used to predict the probability of one or more medical conditions and identify locations in the digital image affected by the one or more medical conditions. Further, one pre-trained deep learning model, utilizing additional data and a plurality of digital images, forecasts the rate of infection and spread of the medical condition over time.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G16H 30/40 - ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
G06V 10/774 - Generating sets of training patterns; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation; Bootstrap methods, e.g. bagging or boosting
G06V 10/776 - Validation; Performance evaluation
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Methods, systems and apparatuses for a custom artificial neural network (ANN) architecture are disclosed. One method includes selecting existing ANN architectures, calculating ANN architecture fingerprints, calculating ANN architecture pairwise similarities among the existing ANN architectures, calculating centrality scores for the existing ANN architectures using the ANN architecture pairwise similarities, calculating dataset pairwise similarities between the target dataset and each of the existing datasets using dataset fingerprints, calculating target performance scores for the existing ANN architectures on the target dataset using performance scores of the existing ANN architectures on the existing datasets and the dataset pairwise similarities, calculating interpolation weights for the existing ANN architectures using the target performance scores of the existing ANN architectures on the target dataset and the centrality scores, and obtaining the custom ANN architecture by interpolating among the existing ANN architectures using the calculated interpolation weights.
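One plausible form of the final weighting step is sketched below. The rule (combine each architecture's predicted target performance with its centrality score, then normalize) is an assumption for illustration, not the patented formula:

```python
def interpolation_weights(target_perf, centrality):
    """Illustrative weighting: each existing architecture's weight is
    proportional to the product of its predicted performance on the
    target dataset and its centrality score, normalized to sum to 1."""
    raw = [p * c for p, c in zip(target_perf, centrality)]
    total = sum(raw)
    return [r / total for r in raw]

# Two existing architectures with equal centrality: the better
# target-performance predictor dominates the interpolation.
weights = interpolation_weights([0.8, 0.2], [0.5, 0.5])
```

The resulting weights would then drive an element-wise interpolation over comparable architecture parameters (e.g. depth, width per stage) to produce the custom architecture.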
G06F 18/22 - Matching criteria, e.g. proximity measures
G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
53.
Iterating single instruction, multiple-data (SIMD) instructions
Methods, systems and apparatuses for performing walk operations of single instruction, multiple data (SIMD) instructions are disclosed. One method includes initiating, by a scheduler, a SIMD thread, where the scheduler is operative to schedule the SIMD thread. The method further includes fetching a plurality of instructions for the SIMD thread. The method further includes determining, by a thread arbiter, at least one instruction that is a walk instruction, where the walk instruction iterates a block of instructions for a subset of channels of the SIMD thread, where the walk instruction includes a walk size, and where the walk size is a number of channels in the subset of channels of the SIMD thread that are processed in a walk iteration in association with the walk instruction. The method further includes executing the walk instruction based on the walk size.
Methods, systems and apparatuses for reducing operations of Sum-Of-Multiply-Accumulate (SOMAC) instructions are disclosed. One method includes scheduling, by a scheduler, a thread for execution, executing, by a processor of a plurality of processors, the thread, fetching, by the processor, a plurality of instructions for the thread from a memory, selecting, by a thread arbiter of the processor, an instruction of the plurality of instructions for execution in an arithmetic logic unit (ALU) pipeline of the processor, and reading the instruction, and determining, by a macro-instruction iterator of the processor, whether the instruction is a Sum-Of-Multiply-Accumulate (SOMAC) instruction with an instruction size, wherein the instruction size indicates a number of iterations that the SOMAC instruction is to be executed.
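The role of the instruction size can be shown with a toy macro-instruction iterator; the multiply-accumulate body and operand layout here are illustrative simplifications:

```python
def execute_somac(acc, a, b, instruction_size):
    """Toy Sum-Of-Multiply-Accumulate: the macro-instruction iterator
    repeats the multiply-accumulate `instruction_size` times, one
    accumulated output per iteration."""
    outs = []
    for i in range(instruction_size):
        acc[i] += a[i] * b[i]
        outs.append(acc[i])
    return outs

results = execute_somac([0, 0, 0], [1, 2, 3], [4, 5, 6], 3)
```

Because the iteration count is carried in the instruction itself, the iterator can also short-circuit work (the "reducing operations" of the title) when, for example, a multiplicand is known to be zero.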
G06F 9/30 - Arrangements for executing machine instructions, e.g. instruction decode
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
55.
Group load register of a graph streaming processor
Methods, systems and apparatuses for graph streaming processing are disclosed. One method includes loading, by a group load register, a subset of an input tensor from a data cache, wherein the group load register provides the subset of the input tensor to all of a plurality of processors; loading, by a plurality of weight data registers, a plurality of weights of a weight tensor, wherein each of the weight data registers provides a weight to a single one of the plurality of processors; and performing, by the plurality of processors, a SOMAC (Sum-Of-Multiply-Accumulate) instruction, including simultaneously determining, by each of the plurality of processors, an instruction size of the SOMAC instruction, wherein the instruction size indicates the number of iterations that the SOMAC instruction is to be executed and is equal to the number of outputs within a subset of a plurality of output tensors.
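The broadcast-plus-private-weight dataflow can be reduced to a one-line model, a sketch in which each processor scales the shared input subset by its own weight and accumulates:

```python
def somac_array(input_subset, weights):
    """Model of the group-load dataflow: one register broadcasts the
    same input subset to every processor, each processor holds its
    own weight, and each produces one accumulated SOMAC output."""
    return [sum(x * w for x in input_subset) for w in weights]

# Input subset [1, 2, 3] broadcast to two processors holding
# weights 1 and 10 respectively.
outputs = somac_array([1, 2, 3], [1, 10])
```

Broadcasting the input from a single group load register means the data cache is read once per subset rather than once per processor, which is the bandwidth saving the arrangement targets.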
The described embodiments include systems, methods, and apparatuses for an increased-efficiency processing flow. One system includes a plurality of stages configured to process an execution graph that includes a plurality of logical nodes with defined properties and resources associated with each logical node; a recirculating ring buffer, wherein the recirculating ring buffer is configured to hold only the control information, input data, and/or output data necessary to stream temporary data between the logical nodes of the execution graph; and a data producer, wherein the data producer is configured to stall from writing control information into a command buffer upon the command buffer being full, preventing command-buffer over-writing.
09 - Scientific and electrical apparatus and instruments
Goods and services
Computer hardware, namely, electronic circuits, computer chips and circuit boards for machine learning, deep learning, and vision processing; graph streaming data processors; downloadable computer software, namely, software for the creation, processing and streaming of graphs; downloadable computer software development tools, compiler software and electronic coding units for programming graph streaming data processors.
09 - Scientific and electrical apparatus and instruments
Goods and services
Processor cores in the nature of computing hardware as components of Systems-on-a-Chip; Instruction set architecture and processor architectures in the nature of computing hardware for integrated circuits; Underlying processor architecture in the nature of computing hardware for integrated circuits
09 - Scientific and electrical apparatus and instruments
Goods and services
Computer hardware, namely, electronic circuits, computer chips and circuit boards for machine learning, deep learning, and vision processing; graph streaming data processors; downloadable computer software, namely, software for the creation, processing and streaming of graphs; downloadable computer software development tools, compiler software and electronic coding units for programming graph streaming data processors.
60.
Node topology employing recirculating ring command and data buffers for executing thread scheduling
The described embodiments include systems, methods, and apparatuses for an increased-efficiency processing flow. One system includes a plurality of stages configured to process an execution graph that includes a plurality of logical nodes with defined properties and resources associated with each logical node; a recirculating ring buffer, wherein the recirculating ring buffer is configured to hold only the control information, input data, and/or output data necessary to stream temporary data between the logical nodes of the execution graph; and a data producer, wherein the data producer is configured to stall from writing control information into a command buffer upon the command buffer being full, preventing command-buffer over-writing.
09 - Scientific and electrical apparatus and instruments
Goods and services
Processor cores in the nature of computing hardware as components of Systems-on-a-Chip; Instruction set architecture and processor architectures for integrated circuits in the nature of computing hardware; Underlying processor architecture for integrated circuits in the nature of computing hardware
62.
Configurable scheduler in a graph streaming processing system
Systems and methods are disclosed for scheduling code in a multiprocessor system. Code is partitioned into code blocks by a compiler. The compiler schedules execution of the code blocks in nodes. The nodes are connected in a directed acyclic graph with a top node, a terminal node, and a plurality of intermediate nodes. Execution of the top node is initiated by the compiler. After executing at least one instance of the top node, an instruction in the code block indicates to the scheduler to initiate at least one intermediary node. The scheduler schedules a thread for execution of the intermediary node. The data for the nodes resides in a plurality of data buffers; the index to the data buffer is stored in a command buffer.
The claimed invention discloses a system comprising a plurality of logical nodes comprised in a single stage or a plurality of stages, with defined properties and resources associated with each node, for reducing compute resources. Said system further comprises: at least a recirculating ring buffer holding only the control information, input data, and/or output data necessary to stream temporary data between the node or nodes in an execution graph, thereby reducing the size of said recirculating ring buffer; said recirculating ring buffer being sufficiently reduced in size to reside in an on-chip cache, such that the control information, input data, and/or output data between the node or nodes need not be stored in memory; wherein the control information further comprises a command related to invalidating the input and/or output data held in a recirculating ring data buffer, clearing the buffer of tasked data; and wherein a producer is stalled from writing any more control information into a recirculating ring command buffer upon the buffer being full, preventing command-buffer over-writing and thereby reducing the compute resources associated with DRAM memory transactions.
Systems and methods are disclosed for scheduling code in a multiprocessor system. Code is partitioned into code blocks by a compiler. The compiler schedules execution of the code blocks in nodes. The nodes are connected in a directed acyclic graph with a top node, a terminal node, and a plurality of intermediate nodes. Execution of the top node is initiated by the compiler. After executing at least one instance of the top node, an instruction in the code block indicates to the scheduler to initiate at least one intermediary node. The scheduler schedules a thread for execution of the intermediary node. The data for the nodes resides in a plurality of data buffers; the index to the data buffer is stored in a command buffer.
09 - Scientific and electrical apparatus and instruments
Goods and services
Computer hardware, namely, electronic circuits, computer chips and circuit boards for machine learning, deep learning, and vision processing; graph streaming data processors; Downloadable computer software, namely, software for the creation, processing and streaming of graphs; Downloadable computer software development tools, compiler software and electronic coding units for programming graph streaming data processors
09 - Scientific and electrical apparatus and instruments
Goods and services
Computer hardware, namely, electronic circuits, computer chips and circuit boards for machine learning, deep learning, and vision processing; graph streaming data processors; Downloadable computer software, namely, software for the creation, processing and streaming of graphs; Downloadable computer software development tools, compiler software and electronic coding units for programming graph streaming data processors
67.
Reduction of a number of stages of a graph streaming processor
Methods, systems and apparatuses for a graph streaming processing system are disclosed. One system includes a plurality of graph streaming processors operative to process a plurality of threads, wherein the plurality of threads is organized as nodes. The system further includes a scheduler that includes a plurality of stages. Each stage includes a command parser operative to interpret commands within a corresponding input command buffer, an alternate command buffer, and a thread generator coupled to the command parser. The thread generator is operative to generate the plurality of threads and dispatch the plurality of threads, where the processing of the plurality of threads for each stage includes storing write commands in the corresponding output command buffer or in the alternate command buffer.
Systems, apparatuses and methods are disclosed for scheduling threads comprising code blocks in a graph streaming processor (GSP) system. One system includes a scheduler for scheduling a plurality of threads, where the plurality of threads includes sets of instructions operating on the graph streaming processors of the GSP system. The scheduler comprises a plurality of stages, where each stage is coupled to an input command buffer and an output command buffer. A portion of the scheduler is implemented in hardware and comprises a command parser operative to interpret commands within a corresponding input command buffer, a thread generator coupled to the command parser operative to generate the plurality of threads, and a thread scheduler coupled to the thread generator for dispatching the plurality of threads for operation on the plurality of graph streaming processors.
Methods, systems and apparatuses for graph stream processing are disclosed. One apparatus includes a cascade of graph streaming processors, wherein each of the graph streaming processors includes a processor array and a graph streaming processor scheduler. The cascade of graph streaming processors further includes a plurality of shared command buffers, wherein each shared command buffer includes a buffer address, a write pointer, and a read pointer. For each of the plurality of shared command buffers, a first graph streaming processor writes commands to the shared command buffer as indicated by the write pointer, and a second graph streaming processor reads commands from the shared command buffer as indicated by the read pointer, wherein at least one graph streaming processor scheduler operates to manage the write pointer and the read pointer to avoid overwriting unused commands of the shared command buffer.
Methods, systems, and apparatuses for graph processing are disclosed. One graph streaming processor includes a thread manager, wherein the thread manager is operative to dispatch operation of a plurality of threads on a plurality of thread processors before dependencies of the dependent threads have been resolved, maintain a scorecard of the operation of the plurality of threads on the plurality of thread processors, and provide an indication to at least one of the plurality of thread processors of whether a dependency request has or has not been satisfied. Further, a producer thread provides a response to the dependency when the dependency has been satisfied, and each of the plurality of thread processors is operative to provide processing updates to the thread manager and provide queries to the thread manager upon reaching a dependency.
The claimed invention discloses a system comprising a plurality of logical nodes arranged in one or more stages, with defined properties and resources associated with each node, for reducing compute resources. The system further comprises at least one recirculating ring buffer holding only the control information, input data, and/or output data necessary to stream temporary data between nodes in an execution graph, thereby reducing the size of the recirculating ring buffer. The recirculating ring buffer is sufficiently reduced in size to reside in an on-chip cache, such that the control information, input data, and/or output data passed between nodes need not be stored in memory. The control information further comprises a command for invalidating input and/or output data held in a recirculating ring data buffer, clearing the buffer of tasked data. A producer is stalled from writing further control information into a recirculating ring command buffer when the buffer is full, preventing command-buffer overwriting and thereby reducing the compute resources associated with DRAM memory transactions.
Mechanism for minimal computation and power consumption for rendering synthetic 3D images, containing pixel overdraw and dynamically generated intermediate images
Embodiments disclosed include a mechanism in a system and method for significantly reducing power consumption by reducing computation and bandwidth. This mechanism is particularly applicable to modern 3D synthetic images, which contain high pixel overdraw and dynamically generated intermediate images. Only blocks of computation which contribute to the final image are performed. This is accomplished by rendering in reverse order and by performing multiple visibility sorts in a streaming fashion through the pipeline. Rendering of dynamically generated intermediate images is performed sparsely by projecting texture coordinates from a current image back into one or more dependent images in a recursive manner. The newly computed pixel values are then filtered, and control is returned to the sampling shader of the current image. When only visible pixels are projected, optimal computation is performed. Several implementations are presented with increasing efficiency. An acceleration structure, termed a Draw Buffer, simplifies the process of projecting backward and utilizes a hardware-managed dynamic memory object. This mechanism reduces computation by 50%, with significant bandwidth and power savings.
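The core idea of rendering in reverse order to skip occluded work can be shown with a toy one-dimensional model. This is not the patent's pipeline (no Draw Buffer, no recursive back-projection); it only illustrates why reverse-order traversal of opaque draws lets the expensive per-pixel shading be skipped for pixels already covered by something in front:

```python
def render_reverse(draws, width):
    """Render opaque draw commands in reverse submission order, shading a
    pixel only if nothing in front of it has been written yet. A toy 1-D
    sketch of overdraw elimination; shade functions stand in for
    arbitrarily expensive pixel shaders. (Hypothetical illustration.)"""
    image = [None] * width
    shaded = 0
    for pixels in reversed(draws):   # last-submitted draw is frontmost
        for x, shade in pixels:
            if image[x] is None:     # pixel not yet covered: shading contributes
                image[x] = shade()
                shaded += 1
            # else: occluded pixel, the shader is skipped entirely
    return image, shaded

# Two overlapping draws: the later ("front") draw covers pixels 1-2, so the
# earlier ("back") draw only needs to shade pixel 0.
back = [(0, lambda: "b0"), (1, lambda: "b1"), (2, lambda: "b2")]
front = [(1, lambda: "f1"), (2, lambda: "f2")]
image, shaded = render_reverse([back, front], 4)
```

Forward-order rendering would have invoked five shader evaluations for this scene; the reverse-order pass performs only three, one per visible pixel.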
Methods, systems and apparatuses for selecting graphics data of a server system for transmission are disclosed. One method includes reading data from memory of the server system, checking whether the data is being read for the first time, and checking whether the data was written by a processor of the server system during processing, which comprises checking whether the data is available on a client system or present in a transmit buffer. The data is placed in the transmit buffer if it is being read for the first time and was not written by the processor during processing, as determined by that checking. If the data is being read for the first time but was written by the processor of the server system during processing, the data is not placed in the transmit buffer.
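The selection rule (transmit only on first read, and only data not produced on the server during processing) reduces to a simple filter over the stream of memory reads. A minimal sketch, where the address-set representation and function signature are assumptions of this example:

```python
def select_for_transmit(read_ops, written_by_processor, transmit_buffer):
    """Decide, per memory read, whether the data must be placed in the
    transmit buffer for the client. Data is transmitted only on its first
    read, and only if the server's processor did not write it during
    processing (such data is regenerated client-side and need not be
    sent). (Hypothetical sketch.)"""
    seen = set()
    for addr in read_ops:
        first_read = addr not in seen
        seen.add(addr)
        if (first_read
                and addr not in written_by_processor
                and addr not in transmit_buffer):
            transmit_buffer.add(addr)
    return transmit_buffer

# "a" and "b" are source assets read for the first time; the repeat read of
# "a" and the processor-written intermediate "c" are both filtered out.
sent = select_for_transmit(["a", "b", "a", "c"], {"c"}, set())
```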
G09G 5/395 - Arrangements specially adapted for transferring the contents of the bit-mapped memory to the screen
G09G 5/36 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of individual graphic patterns using a bit-mapped memory
H04N 21/434 - Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of a packetised elementary stream
G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining
H04N 19/42 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
G06F 3/14 - Digital output to display device
H04N 21/236 - Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL in a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits in the multiplex stream, e.g. to obtain a constant bitrate; Assembling of a packetised elementary stream
74.
Processing of graphics data of a server system for transmission including multiple rendering passes
Methods, systems and apparatuses for selecting graphics data of a server system for transmission are disclosed. One method includes a plurality of graphics render passes, wherein one or more of the graphics render passes includes reading data from graphics memory of the server system. The data read from the graphics memory is placed in a transmit buffer if the data is being read for the first time, and was not written by a processor of the server system. One system includes a server system including graphics memory, a frame buffer and a processor. The server system is operable to read data from the graphics memory. The server system is operable to place the data in a transmit buffer if the data is being read for the first time, and was not written by the processor during rendering.
H04N 21/236 - Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL in a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits in the multiplex stream, e.g. to obtain a constant bitrate; Assembling of a packetised elementary stream
H04N 21/434 - Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of a packetised elementary stream
G09G 5/36 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of individual graphic patterns using a bit-mapped memory
H04N 19/42 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
75.
Processing of graphics data of a server system for transmission
Methods, systems and apparatuses for selecting graphics data of a server system for transmission are disclosed. One method includes reading data from graphics memory of the server system. The data read from the graphics memory is placed in a transmit buffer if the data is being read for the first time, and was not written by a processor of the server system. One system includes a server system including graphics memory, a frame buffer and a processor. The server system is operable to read data from the graphics memory. The server system is operable to place the data in a transmit buffer if the data is being read for the first time, and was not written by the processor during rendering.
G06F 13/00 - Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
G06F 15/00 - Digital computers in general; Data processing equipment in general
G06T 1/00 - General purpose image data processing
G09G 5/36 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of individual graphic patterns using a bit-mapped memory
G09G 5/37 - Details of the processing of graphic patterns
G06F 15/16 - Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
G06F 13/28 - Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access, cycle steal