Apparatuses, systems, and techniques are presented to generate one or more images. In at least one embodiment, one or more neural networks are used to generate one or more images based, at least in part, on one or more noise values.
Approaches are described for defect detection and analysis in semiconductor substrate (e.g., wafer/reticle) inspection using neural networks and computational models. Disclosed approaches can process images of a semiconductor substrate to identify expected features, such as alignment marks, interconnects, or functional geometries, which represent the intended characteristics of the patterned wafer/reticle's design. These expected features may be provided in various forms, including text-based prompts, CAD-derived reference images, or other design-related data. Such analysis can involve comparing input images with these benchmarks to detect deviations, such as missing features, misalignments, overlapping patterns, or geometric inconsistencies. An example system can allow for analysis of individual images, sub-regions, or multiple images captured across different fabrication stages. Real-time data streams from manufacturing equipment may also be processed for continuous monitoring. Outputs can include detailed descriptions of anomalies, classifications into defect categories, and optional confidence scores, leveraging domain-specific and out-of-domain datasets to maintain zero-shot detection capabilities.
Technologies for optimizing post-FEC bit error rate (BER) performance of a Forward Error Correction (FEC) system are described. The processing device evaluates a quality metric associated with a trained deep neural network (DNN) relative to a quality criterion, the DNN to estimate a post-FEC bit error rate of a FEC circuit. The processing device updates a feature set or a neural network configuration when the quality metric does not satisfy the quality criterion, retrains the DNN with the updated feature set or updated neural network configuration, and re-evaluates the quality metric. The processing device selects a final feature set or a final neural network configuration for DNN inference when the quality metric satisfies the quality criterion, and stores trained DNN model parameters corresponding to the final feature set or final neural network configuration.
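The evaluate, update, retrain loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: `select_feature_set`, `train_and_score`, the feature names, and the error threshold are all hypothetical, and the toy scoring function stands in for actual DNN training and evaluation.

```python
from typing import Callable, Optional, Sequence, Tuple

FeatureSet = Tuple[str, ...]


def select_feature_set(
    candidates: Sequence[FeatureSet],
    train_and_score: Callable[[FeatureSet], float],
    max_error: float,
) -> Optional[Tuple[FeatureSet, float]]:
    """Try candidate feature sets in order until the quality criterion is met.

    `train_and_score` is a hypothetical callback that retrains the model on
    the given features and returns a quality metric (lower is better).
    """
    for features in candidates:
        error = train_and_score(features)  # retrain, then re-evaluate
        if error <= max_error:             # quality criterion satisfied
            return features, error         # final feature set and its metric
    return None                            # no candidate met the criterion


def toy_score(features: FeatureSet) -> float:
    """Stand-in for training: pretend more features give a lower error."""
    return 1.0 / (1 + len(features))


# Hypothetical feature names, chosen only for illustration.
candidates = [
    ("snr",),
    ("snr", "pre_fec_ber"),
    ("snr", "pre_fec_ber", "eye_height"),
]
result = select_feature_set(candidates, toy_score, max_error=0.3)
```

In this toy run the first two candidates miss the threshold and the third is selected; a real system would also persist the trained model parameters for the selected configuration.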
Approaches presented herein may be used to generate audio data from an initial audio input. One or more neural audio codecs may be used to generate representations of an audio input that may be decoded by a decoder. The representations of the audio input may include conditioning based on speaker identity or phonemes within the audio input. The neural audio codecs may be provided as part of a content generation pipeline for use at inferencing as part of an encoder/decoder architecture.
G10L 19/032 - Quantisation or dequantisation of spectral components
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
5.
VISUAL TOKENIZATION ENABLING HIGH QUALITY VISUAL RECONSTRUCTION
The tokenization process of input text and visuals (e.g., images, videos, or frames of videos) can be separated into two stages. In a first stage, a large batch size can be used for text encoding and visual encoding while focusing on a first loss objective that combines an alignment loss and a mean square loss. In a second stage, the text encoder can be stopped, and the visual encoder can be prevented from making additional changes. The second stage focuses on a second loss objective that is a weighted sum of the mean square loss, the perceptual loss, and the generative adversarial network loss. In the second stage, a discrete set of tokens can be generated from the inputs, and the set of tokens can be further fine-tuned. A transformer, with an autoregressive model, can be applied to the set of discrete tokens.
Apparatuses, systems, and techniques to perform an application programming interface (API) to add one or more graph nodes to a software graph, wherein the API is to cause a semaphore wait node to be added to a software graph based, at least in part, on a dependency type indicated by the API. In at least one embodiment, one or more nodes are added to a graph in accordance with one or more dependency types.
Apparatuses, systems, and techniques to identify a three-dimensional (3D) point cloud. In at least one embodiment, one or more three-dimensional (3D) point clouds of one or more objects are generated using one or more neural networks based, at least in part, on one or more two-dimensional (2D) images and the one or more 3D point clouds.
In various examples, systems and techniques are provided that are directed to training and deployment of multi-golden sample inspection systems. The disclosed techniques include processing, using a backbone neural network, an image of a sample to generate a sample feature representative of the sample and obtaining reference features representative of reference images. The disclosed techniques further include generating differential features representative of differences between the sample feature and the reference features and predicting characteristics of the sample based on aggregated differential features.
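The differential-feature step above can be illustrated with a small sketch. It assumes the backbone network has already produced plain feature vectors; the function names, the mean-absolute-difference aggregation, and the fixed threshold are assumptions made for illustration, whereas the described system predicts characteristics from the aggregated features with a trained model.

```python
from typing import List, Sequence


def differential_features(
    sample: Sequence[float], references: Sequence[Sequence[float]]
) -> List[List[float]]:
    """One difference vector per reference ("golden") feature vector."""
    return [[s - r for s, r in zip(sample, ref)] for ref in references]


def aggregate(diffs: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of absolute differences across all references."""
    count = len(diffs)
    width = len(diffs[0])
    return [sum(abs(d[i]) for d in diffs) / count for i in range(width)]


def looks_defective(
    sample: Sequence[float],
    references: Sequence[Sequence[float]],
    threshold: float,
) -> bool:
    """Flag the sample if any aggregated difference exceeds the threshold."""
    return max(aggregate(differential_features(sample, references))) > threshold


# Toy feature vectors standing in for backbone outputs.
sample_feature = [1.0, 2.0, 5.0]
reference_features = [[1.0, 2.0, 1.0], [1.0, 2.0, 3.0]]
aggregated = aggregate(differential_features(sample_feature, reference_features))
flagged = looks_defective(sample_feature, reference_features, threshold=1.0)
```

Comparing against several references rather than one makes the decision less sensitive to benign variation in any single golden sample.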
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Apparatuses, systems, and techniques related to a dual lookahead pass for video encoding. In at least one embodiment, frames of the video are processed during a first lookahead pass to extract characteristics and determine a fixed group of pictures (GOP) structure. The fixed GOP structure is then optimized based on the extracted characteristics to generate an adaptive GOP structure that is processed during a second lookahead pass.
H04N 19/194 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding the adaptation method, adaptation tool or adaptation type being iterative or recursive involving only two passes
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/177 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a group of pictures [GOP]
Embodiments of the present disclosure relate to enhanced AI-based audio-visual processing. Various aspects integrate multimodal analysis functionality that seamlessly combines and/or selects from audio, video, and/or text data to provide a holistic understanding of multimedia content. Relative to existing technologies, such an approach enables more accurate and contextually aware interpretations by leveraging the full spectrum of available information in multimedia content. By integrating these disparate sources of data, various embodiments achieve a more nuanced analysis that captures the complexity and richness of real-world multimedia scenarios.
G06F 16/68 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
G06F 16/683 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
G06F 16/783 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
Disclosed is an integrated inductor package comprising a magnetic core, a first conductive coil coupled to the magnetic core and a second conductive coil coupled to the magnetic core. The first conductive coil is disposed on a first side of the integrated inductor package and coupled between a first input pin and a first output pin. The second conductive coil is disposed on a second side of the integrated inductor package and coupled between a second input pin and a second output pin.
Apparatuses, systems, and techniques related to a generative signal foundation model for temporal signal extraction and restoration. In at least one embodiment, a temporal signal is mapped into a latent space using an invertible transform, sampled and processed in the latent space, and the processed samples are inverse transformed to produce a restored or an output signal comprising an extracted target temporal signal or a signal with the target temporal signal removed. In at least one embodiment, the generative signal foundation model is pretrained using unlabeled training data and is then finetuned for a specific task.
Apparatuses, systems, and techniques related to a generative speech foundation model for speech extraction and restoration. In at least one embodiment, an audio signal is mapped into a latent space using an invertible transform, sampled and processed in the latent space, and the processed audio samples are inverse transformed to produce a restored (or extracted) audio signal. In at least one embodiment, the generative speech foundation model is pretrained using unlabeled training data and is then finetuned for a specific task.
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
14.
ENCODER-BASED MOTION DETECTION FOR IMAGE PROCESSING SYSTEMS AND APPLICATIONS
In various examples, techniques for encoder-based motion detection for image processing systems and applications are described herein. Systems and methods described herein may use an image processing pipeline that includes one or more encoder components that are configured to both encode image data and perform motion detection. For instance, in some embodiments, the encoded image data may represent at least motion information associated with frames, such as motion vectors associated with portions of the frames. As described in more detail herein, the encoder component(s) may then use this motion information to determine whether the frames represent motion of objects. Additionally, based on determining whether the frames represent motion, additional processing stages of the image processing pipeline may be performed, such as a processing stage associated with verifying the motion and/or a processing stage associated with storing the encoded image data representing the motion.
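One way to turn per-block motion vectors into a frame-level motion decision is sketched below. The decision rule, the function name `frame_has_motion`, and both thresholds are assumptions made for illustration; the abstract does not specify how the encoder component(s) evaluate the motion information.

```python
import math
from typing import Sequence, Tuple

MotionVector = Tuple[float, float]  # (dx, dy) displacement for one block


def frame_has_motion(
    motion_vectors: Sequence[MotionVector],
    magnitude_threshold: float = 2.0,
    min_fraction: float = 0.1,
) -> bool:
    """Report motion when enough blocks have a sufficiently large vector.

    A block counts as moving if its vector length reaches
    `magnitude_threshold`; the frame counts as moving if at least
    `min_fraction` of its blocks are moving.
    """
    if not motion_vectors:
        return False
    moving = sum(
        1 for dx, dy in motion_vectors
        if math.hypot(dx, dy) >= magnitude_threshold
    )
    return moving / len(motion_vectors) >= min_fraction


# Toy frames: mostly static blocks with two moving blocks, then jitter only.
active_frame = [(0.0, 0.0)] * 8 + [(3.0, 4.0)] * 2
quiet_frame = [(0.0, 1.0)] * 10
```

A positive result could then gate the later pipeline stages mentioned above, such as verifying the motion or storing the encoded data.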
Apparatuses, systems, and techniques are described to predict missing time series information. In at least one embodiment, one or more circuits use one or more neural networks to predict missing time series data to include in a time series based, at least in part, on interval information associated with the missing time series data.
Mechanisms to tune resonant frequencies of the resonators in a double resonator structure that includes a first resonator and a second resonator utilizing logic to operate tuning elements to align a dip in the circulating power in the first resonator with a peak in the drop port power of the second resonator.
G02B 6/293 - Optical coupling means having data bus means, i.e. plural waveguides interconnected and providing an inherently bidirectional system by mixing and splitting signals with wavelength selective means
17.
FAST AND SECURE MEMORY ADDRESS TRANSLATION IN A VIRTUAL MACHINE COMPUTING SYSTEM
Various embodiments include techniques for translating memory addresses in a virtualized computing system that hosts multiple virtual machines. In such a virtualized computing system, conventional approaches for translating a guest virtual address to a system physical address can involve a large number of memory accesses. With the disclosed techniques, address translation can be reduced to approximately 5 memory accesses. This performance savings results from storing two data structures in high-speed on-chip memory that indicate which system physical address segments are mapped to and valid for the virtual machine that is accessing the memory segment. A third data structure indicates whether a system physical address segment is a protected/secure system physical address segment. If the virtual machine already has access to the segment, then confidential/secure compute policies can be applied on the accesses from this virtual machine based on whether this system physical segment is protected/secure.
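The segment checks described above might be modeled as in the sketch below. Everything here is hypothetical: the abstract does not give the layout of the three data structures, so the split into a "mapped" table and a "valid" table, the use of Python sets instead of on-chip bitmaps, the 2 MiB segment size, and the returned policy strings are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

SEGMENT_SIZE = 1 << 21  # assumed segment size (2 MiB), not from the source


@dataclass
class SegmentTables:
    # Assumed structure 1: segments mapped to each virtual machine.
    mapped: Dict[int, Set[int]] = field(default_factory=dict)
    # Assumed structure 2: segments currently valid for each virtual machine.
    valid: Dict[int, Set[int]] = field(default_factory=dict)
    # Structure 3: segments marked protected/secure.
    protected: Set[int] = field(default_factory=set)


def check_access(tables: SegmentTables, vm_id: int, spa: int) -> str:
    """Return the policy outcome for a VM access to a system physical address."""
    segment = spa // SEGMENT_SIZE
    is_mapped = segment in tables.mapped.get(vm_id, set())
    is_valid = segment in tables.valid.get(vm_id, set())
    if not (is_mapped and is_valid):
        return "deny"
    # The VM has access; choose the policy by the segment's protection state.
    return "allow-secure" if segment in tables.protected else "allow"


tables = SegmentTables(
    mapped={1: {0, 1}},
    valid={1: {0, 1}},
    protected={1},
)
```

Keeping these lookups in small, fast structures is what lets the access check avoid additional walks through page tables in memory.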
G06F 12/14 - Protection against unauthorised use of memory
G06F 7/72 - Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations using residue arithmetic
Systems and methods are disclosed that relate to object detection and to generating detected object representations. Sensor data corresponding to a scene may be obtained that may represent one or more objects. An output may be generated based at least on the sensor data, where the output may represent the one or more objects and may include respective predicted 3D characteristics of the one or more objects.
Mechanisms to improve the thermal conductivity of three-dimensional integrated circuit (3DIC) structures by thinning the metal and dielectric layers in comparison to the dimensions of these layers in conventional 3DIC structures, resulting in a commensurate thinning of the oxide layers between active layers and metal layers, and a reduction in the number of metal and oxide layers utilized compared to conventional 3DIC structures.
In various examples, multilabel hierarchical classification of objects for autonomous systems and applications is described herein. Systems and methods are disclosed that use one or more neural networks to classify objects, such as traffic signs, using multilabel classification and/or hierarchical classification. For instance, a multilabel subnetwork of the neural network(s) may classify an object based at least on one or more attributes associated with the object. As such, the output from the multilabel subnetwork may include at least a classification associated with the object and an attribute classification(s) associated with the object. A hierarchical subnetwork of the neural network(s) may also classify the object using one or more class labels, where a class label indicates another classification and/or a class group associated with the object. The systems and methods may then use the classification, the attribute classification(s), and/or the class label(s) to determine a final classification associated with the object.
Voltage-Frequency domain switching circuits that include multiple stages each configured to receive a throttle code, each of the stages providing a fast-propagation path for the throttle code to a digitally controlled oscillator, and a frequency locked loop configured to (a) generate a code to the digitally controlled oscillator over a slow path, and (b) disable the fast path to the digitally controlled oscillator upon the code satisfying a match with the throttle code.
Apparatuses, systems, and techniques related to a programmable algorithm circuit within an image signal processor (ISP). In at least one embodiment, a neural network circuit can be selectively coupled at different locations in an image processing pipeline within an ISP. The neural network circuit comprises a programmable algorithm circuit (PAC), where the algorithm is defined by programmable parameters. The parameters for different tasks may be learned via training and dynamically updated to configure the PAC.
H04N 23/81 - Camera processing pipelines; Components thereof for suppressing or minimising disturbance in the image signal generation
H04N 23/84 - Camera processing pipelines; Components thereof for processing colour signals
23.
POST-FEC BER ESTIMATION AND ADAPTING FORWARD ERROR CORRECTION (FEC) OR COMMUNICATION LINK PARAMETERS USING DEEP NEURAL NETWORKS FOR IMPROVED POST-FEC PERFORMANCE
Technologies for optimizing post-FEC bit error rate (BER) performance of a Forward Error Correction (FEC) system are described. The processing device receives measurement data including transmitter settings and impairment properties associated with a transmitter circuit, channel properties and impairment properties associated with a channel between the transmitter circuit and a receiver circuit, link properties and impairment properties associated with a link between the transmitter circuit and the receiver circuit, and/or receiver settings and impairment properties associated with the receiver circuit. The processing device determines, using the measurement data and a deep neural network (DNN), a post-FEC BER estimation of a FEC circuit. The processing device adjusts, based on the post-FEC BER estimation, at least one of a FEC parameter of the FEC circuit or a link parameter of the transmitter or receiver circuit to improve the post-FEC performance of the FEC circuit.
An electrical device including a memory circuit and a general input output circuit. The memory circuit includes bit-cell circuits, each bit-cell circuit including a memory cell and a read port. The memory cell is electrically connected to a read word line at a read output voltage and is electrically connected to the read port by the read word line. The read port is further connected to a read bit-line complement signal at a write output logic voltage or at a read bit-line complement output voltage that is less than or equal to the write output logic voltage. The general input output circuit includes a bit-line pre-charge control circuit and a voltage level shifter circuit, wherein the read bit-line complement signal is connected to the bit-line pre-charge control circuit and to the voltage level shifter circuit.
An end-to-end system for data generation, map creation using the generated data, and localization to the created map is disclosed. Mapstreams—or streams of sensor data, perception outputs from deep neural networks (DNNs), and/or relative trajectory data—corresponding to any number of drives by any number of vehicles may be generated and uploaded to the cloud. The mapstreams may be used to generate map data—and ultimately a fused high definition (HD) map—that represents data generated over a plurality of drives. When localizing to the fused HD map, individual localization results may be generated based on comparisons of real-time data from a sensor modality to map data corresponding to the same sensor modality. This process may be repeated for any number of sensor modalities and the results may be fused together to determine a final fused localization result.
An integrated circuit die binning process involves forming a batch of singulated die, configuring intersection regions of a multi-dimensional control structure with settings indicating die grouping compatibility, identifying a first die with the lowest die grouping compatibility, and randomly selecting a second die for grouping with the first die or voiding the first die from one dimension of the control structure, repeating these actions for more die from the batch.
G05B 19/418 - Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM]
27.
ENCODING AND DECODING MEDIA CONTENT USING CYCLIC DOWNSAMPLING AND DEEP LEARNING RECONSTRUCTION
A server may apply multiple filters to one or more media frames, combine the outputs of the filters into a single combined output, and deinterlace the combined output for transmission over a network to a client device. The client device may receive the deinterlaced outputs and interlace the outputs into a full resolution frame.
Apparatuses, systems, and techniques are presented to generate images with one or more visual effects applied. In at least one embodiment, one or more visual effects are applied to one or more images having a resolution that is less than a first resolution and those visual effects approximated for one or more images having a resolution that is greater than or equal to the first resolution.
Embodiments of the present disclosure relate to neural material networks. Materials that have similar properties may be organized into material clusters, where each material cluster is associated with a set of parameters used by a neural material network.
In various examples, systems and methods are disclosed that relate to the evaluation of image data for imperfections using a vision language model (VLM). For example, a system can be configured to obtain image data associated with at least one image that includes one or more imperfections. The system can determine a query based at least on the at least one image that is usable to prompt the VLM when processing the image data. In some embodiments, the system can provide the image data and the query to the VLM to cause the VLM to generate the output in accordance with the query, indicating whether one or more imperfections are present in the at least one image.
G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 20/70 - Labelling scene content, e.g. deriving syntactic or semantic representations
31.
TEXT-TO-IMAGE DIFFUSION MODEL WITH COMPONENT LOCKING AND RANK-ONE EDITING
A text-to-image machine learning model takes a user input text and generates an image matching the given description. While text-to-image models currently exist, there is a desire to personalize these models on a per-user basis, including to configure the models to generate images of specific, unique user-provided concepts (via images of specific objects or styles) while allowing the user to use free text “prompts” to modify their appearance or compose them in new roles and novel scenes. Current personalization solutions either generate images with only coarse-grained resemblance to the provided concept(s) or require fine tuning of the entire model which is costly and can adversely affect the model. The present description employs component locking and/or rank-one editing for personalization of text-to-image diffusion models, which can improve the fine-grained details of the concepts in the generated images, reduce the memory footprint update of the underlying model instead of full fine-tuning, and reduce adverse effects to the model.
A first device includes training logic configured to autonomously train a link coupled between the first device and a second device without software intervention. The training logic transmits a message on a first portion of the link to the second device, the message associated with training a second portion of the link. The training logic receives training data on the second portion of the link from the second device in response to the message and trains the second portion of the link based on the training data. In some embodiments, a first device includes processing circuitry and a serializer/deserializer, wherein the processing circuitry trains a high-speed serial link in a first direction between the first device and a second device, and sends or receives a first sideband message on the high-speed serial link in a second direction that is opposite to the first direction.
Apparatuses, systems, and techniques to cause, use, and/or perform, one or more application programming interfaces, such as to indicate one or more software threads performing a backup. In at least one embodiment, one or more processors comprising one or more circuits are to perform an application programming interface (API) to indicate one or more identifiers of one or more first software threads performing a backup of one or more second software threads.
Hardware support for light-weight instances achieves the effect of multi-level instancing without incurring associated performance or area cost, enabling a Cluster-Level AS (CLAS) to be used for a faster build speed. The same subdivision mechanism can also be used to reduce the complexity of the TLAS build by dividing instances into groups. Pseudo-Instance Nodes (PIN) allow for an indirection from a BVH node to any arbitrary location of a child complet. Multi-Parent Root Complets (MPRC) allow for any CLAS to be reused across multiple BLASs.
In examples, autonomous vehicles are enabled to negotiate yield scenarios in a safe and predictable manner. In response to detecting a yield scenario, a wait element data structure is generated that encodes geometries of an ego path, a contender path that includes at least one contention point with the ego path, as well as a state of contention associated with the at least one contention point. Geometry of yield scenario context may also be encoded, such as inside ground of an intersection, entry or exit lines, etc. The wait element data structure is passed to a yield planner of the autonomous vehicle. The yield planner determines a yielding behavior for the autonomous vehicle based at least on the wait element data structure. A control system of the autonomous vehicle may operate the autonomous vehicle in accordance with the yield behavior, such that the autonomous vehicle safely negotiates the yield scenario.
Voltage-Frequency domain switching circuits that include multiple stages each configured to receive a throttle code, each of the stages providing a first fast-propagation path for the throttle code to a digitally controlled oscillator, and a frequency locked loop configured to (a) generate a code to the digitally controlled oscillator over a slow path, and (b) disable the fast path to the digitally controlled oscillator upon the code satisfying a match with the throttle code. A second fast-propagation path is configured to propagate a second code to the digitally controlled oscillator.
Systems and methods for enhancing image blocks of a frame of a video to be compressed are provided. The systems and methods enhance the image blocks to improve compression of the video to be compressed. In at least one embodiment, systems and methods are provided for using a learned prefiltering network, trained using a joint loss function, to enhance video content to improve compression.
H04N 19/80 - Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
H04N 19/105 - Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
H04N 19/109 - Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
H04N 19/11 - Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
H04N 19/13 - Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
H04N 19/147 - Data rate or code amount at the encoder output according to rate distortion criteria
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
H04N 19/50 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
H04N 19/85 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
Apparatuses, systems, and techniques to identify object distance with one or more cameras. In at least one embodiment, one or more cameras capture at least two images, where one image is transformed to the other, and a neural network determines whether said object is in front of or behind a known distance, whereby an object's distance may be determined after a set of known distances are analyzed.
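The idea of narrowing an object's distance by testing it against known distances can be sketched as a search over a sorted list. This is an illustration under assumptions: `is_behind` is a hypothetical stand-in for the neural network's in-front/behind decision, and the use of binary search is a choice made here, since the abstract only says that a set of known distances is analyzed.

```python
from typing import Callable, Optional, Sequence, Tuple


def bracket_distance(
    known_distances: Sequence[float],
    is_behind: Callable[[float], bool],
) -> Tuple[Optional[float], Optional[float]]:
    """Find the pair of adjacent known distances that bracket the object.

    `known_distances` must be sorted in ascending order. `is_behind(d)` is
    a hypothetical classifier call that returns True when the object lies
    farther away than distance `d`.
    """
    lo, hi = 0, len(known_distances)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_behind(known_distances[mid]):
            lo = mid + 1   # object is beyond this distance
        else:
            hi = mid       # object is at or in front of this distance
    nearer = known_distances[lo - 1] if lo > 0 else None
    farther = known_distances[lo] if lo < len(known_distances) else None
    return nearer, farther


# Toy oracle: the object is really 5.0 units away.
planes = [1.0, 2.0, 4.0, 8.0, 16.0]
bracket = bracket_distance(planes, lambda d: 5.0 > d)
```

A `None` on either side of the returned pair means the object lies nearer than the first known distance or beyond the last one.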
Various embodiments include techniques for translating memory addresses in a virtualized computing system that hosts multiple virtual machines. In such a virtualized computing system, conventional approaches for translating a guest virtual address to a system physical address can involve a large number of memory accesses. With the disclosed techniques, address translation can be reduced to approximately 5 memory accesses. This performance savings results from storing two data structures in high-speed on-chip memory that indicate which system physical address segments are mapped to and valid for the virtual machine that is accessing the memory segment. A third data structure indicates whether a system physical address segment is a protected/secure system physical address segment. If the virtual machine already has access to the segment, then confidential/secure compute policies can be applied on the accesses from this virtual machine based on whether this system physical segment is protected/secure.
G06F 12/109 - Address translation for multiple virtual address spaces, e.g. segmentation
G06F 12/14 - Protection against unauthorised use of memory
G06F 12/1036 - Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] for multiple virtual address spaces, e.g. segmentation
40.
APPLICATION PROGRAMMING INTERFACE TO INDICATE NULL-OPERATION DEPENDENCIES
Apparatuses, systems, and techniques to perform an application programming interface (API) to add one or more graph nodes to a software graph, wherein the API is to cause a null-operation graph node to be added to a software graph, based, at least in part, on a dependency type indicated by the API. In at least one embodiment, one or more nodes are added to a graph in accordance with one or more dependency types.
Apparatuses, systems, and techniques are presented to compress data. In at least one embodiment, one or more images are generated based, at least in part, upon one or more compressed images and image enhancement data generated using one or more neural networks.
Apparatuses, systems, and techniques to determine whether to remove one or more neural network layers. In at least one embodiment, one or more neural network layers are determined to be removed based on, for example, a neural architecture search (NAS).
42 - Scientific, technological and industrial services, research and design
Goods & Services
(Based on 44(d) Priority Application)(Based on Intent to Use) Infrastructure as a service (IAAS); infrastructure as a service (IAAS) being hosting software for operating virtual servers for use by others; application service provider (ASP), namely, hosting computer software applications of others; hosting of computer platforms; electronic storage of data; computer software technical support services, technical information and technical support regarding software patches, upgrades and updates; writing of computer software for others; providing information relating to computer technology; providing information relating to computer programming; providing information relating to computer programs; providing computer hardware and software information online; providing information in the fields of technology and software development via an on-line website; providing information relating to computer technology and programming via a website; quality control and authentication services; rental of humanoid robots with artificial intelligence (AI); technology consultation in the field of artificial intelligence (AI); artificial intelligence as a service (AIAAS) services featuring software using artificial intelligence for writing custom algorithms and implementing algorithms into dataflows; artificial intelligence as a service (AIAAS) services featuring software using artificial intelligence for creating and integrating computer models; providing temporary use of online non-downloadable chatbot software using large language models (LLMs); design and development of artificial intelligence (AI) software on an outsourcing basis; Design and development of computer hardware and software for aircraft, aerial vehicles, drones, unmanned aerial vehicles (UAVs), remotely piloted aircraft systems (RPAS), self-driving delivery robots, industrial robots, automated factories, and airborne platforms for space applications; Technological research, engineering design, testing, and product 
development in the fields of autonomous aerial vehicles, drones, robotics, physical AI, industrial automation, and aerospace systems; Providing temporary use of online non-downloadable software for autonomous navigation, localization, mapping, perception, sensor fusion, motion planning, obstacle avoidance, flight control, and fleet management for aircraft, aerial vehicles, drones, UAVs, RPAS, and self-driving delivery robots; Software as a service (SaaS) featuring software for factory automation, industrial automation, machine vision, industrial inspection, process control, predictive maintenance, and management of industrial robots, autonomous mobile robots (AMRs), and automated guided vehicles (AGVs); Platform as a service (PaaS) featuring computer software platforms for developing, training, testing, validating, deploying, monitoring, and managing artificial intelligence models for robotics, physical AI, autonomous vehicles, aerial vehicles, drones, and industrial automation applications; Design of drones; topographic surveying services by drone; cartographic measurement services by drone; geological prospecting using drones; motor vehicle parts design services; non-land vehicles design services; development and testing of occupant protection systems for motor vehicles; technological research in the field of Internet of Vehicles (IoV); technical research in the field of wireless electric vehicle charging; research on robotic process automation technology; rental of user-programmable humanoid robots, not configured; design and development of software for machine learning; computer technology consultancy in the field of machine learning; artificial intelligence as a service (AIAAS) services featuring software using artificial intelligence for analyzing data and interacting with humans; artificial intelligence as a service (AIAAS) services featuring software using artificial intelligence for data analytics; providing online non-downloadable machine learning 
software for enabling computers to learn to perform tasks autonomously; Providing online non-downloadable software for orchestrating and managing large-scale robotics deployments, including workflow scheduling, data pipeline coordination and fleet management of autonomous machines; software as a service (SaaS) featuring software for robot simulation and synthetic training data generation for use in training artificial intelligence models for robotics
(Based on Intent to Use) Software as a Service (SaaS) featuring software in the field of self-driving and autonomous land vehicle and self-driving and autonomous land vehicle component operation, control, maintenance, management and communication; Software as a Service (SaaS) featuring software in the field of electronic control systems for land motor vehicles; providing online non-downloadable software for the operation, control, maintenance, and remote management of land vehicles, self-driving and autonomous land vehicles and land vehicle components, for land vehicle navigation, for travel and trip planning, for communications between vehicles and mobile devices, and for collecting, tracking, analyzing, and reporting data and information in the field of self-driving and autonomous land vehicles
44.
DIFFERENTIABLE SENSITIVITY-BASED SKEW SCHEDULING FRAMEWORK FOR TIMING OPTIMIZATION
Embodiments of the present disclosure provide systems and methods for clock latency adjustment. A clock latency set and a timing report are obtained, and based on the clock latency set and the timing report, a set of critical registers are identified. An initial clock latency adjustment procedure is performed to provide a latency adjustment for one or more registers in the set of critical registers. Pursuant to a zero-mean constraint, the one or more latency adjustments are modified. The modified one or more latency adjustments are used to adjust one or more latencies in the clock latency set.
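The zero-mean step in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the constraint is enforced; the version below assumes it is done by subtracting the mean of the proposed adjustments, and the names `apply_zero_mean` and `adjust_latencies` are hypothetical.

```python
def apply_zero_mean(adjustments):
    """Shift proposed latency adjustments so that they sum to zero.

    adjustments: dict mapping register name -> proposed latency delta.
    Assumption: the zero-mean constraint is enforced by mean subtraction.
    """
    if not adjustments:
        return {}
    mean = sum(adjustments.values()) / len(adjustments)
    return {reg: delta - mean for reg, delta in adjustments.items()}


def adjust_latencies(latencies, adjustments):
    """Apply the zero-mean adjustments to the clock latency set."""
    centered = apply_zero_mean(adjustments)
    # Registers without an adjustment keep their original latency.
    return {reg: lat + centered.get(reg, 0.0) for reg, lat in latencies.items()}


latencies = {"r1": 100.0, "r2": 120.0, "r3": 110.0}  # illustrative values
proposed = {"r1": 6.0, "r2": -2.0}                   # critical registers only
centered = apply_zero_mean(proposed)
updated = adjust_latencies(latencies, proposed)
```

Under this assumption, latency added to one critical register is offset by latency removed from the others, so the adjustments do not shift the average latency of the adjusted registers.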
In various examples, multimodal data processing for content retrieval systems and applications is described herein. Systems and methods described herein may convert different modalities of data into a common type of modality. For instance, content data representing a video may be separated into audio data representing sound corresponding to the video—such as speech—along with video data representing frames of the video. The audio data may then be processed using one or more models to generate first text corresponding to a transcript of the speech. Additionally, the video data may be processed to identify specific keyframes that provide important information associated with the video. The keyframes may then be processed using one or more models to generate second text describing the keyframes. The systems and methods may then combine the text from the different modalities and generate data for storage in one or more databases.
Apparatuses, systems, and techniques to perform signal processing operations on a fifth generation (5G) new radio (NR) signal. In at least one embodiment, one or more processors process 5G NR signals according to one or more graph nodes.
The disclosed method for training a multimodal model includes performing one or more first operations to train a connector disposed between one or more vision encoders and a language model included in the multimodal model; performing one or more second operations to train the multimodal model using a first dataset; and performing one or more third operations to train the multimodal model using a second dataset to generate a trained multimodal model, where the second dataset is smaller than the first dataset, and where the trained multimodal model processes at least one of an input image or an input text to generate an output text.
G06F 40/40 - Processing or translation of natural language
G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
48.
DIFFERENTIABLE OBJECT INSERTION USING HYBRID LIGHTING VOLUMES FOR SYNTHETIC DATA GENERATION APPLICATIONS
Systems and methods generate a hybrid lighting model for rendering objects within an image. The hybrid lighting model includes lighting effects attributed to a first source, such as the sun, and to a second source, such as spatially-varying effects of objects within the image. The hybrid lighting model may be generated for an input image and then one or more virtual objects may be rendered to appear as if part of the input image, where the hybrid lighting model is used to apply one or more lighting effects to the one or more virtual objects.
Apparatuses, systems, and techniques to cause matrix multiplication to be performed using encoded representations of matrix operands. In at least one embodiment, one or more processors are caused, or otherwise used, to cause first and second matrices to be multiplied at least by generating a plurality of encodings of portions of the first matrix and a plurality of encodings of portions of the second matrix. In at least one embodiment, the one or more processors are caused, or otherwise used, to cause the first and second matrices to be multiplied at least by performing a plurality of matrix multiplication operations between the plurality of encodings of the portions of the first matrix and the plurality of encodings of the portions of the second matrix.
Optical sensors (e.g., cameras) and (e.g., IR) illumination sources may be distributed in an environment (e.g., an interior space such as a cabin or cockpit of an ego-machine) and synchronized to generate frames of sensor data. By positioning the optical sensors and assigning them corresponding frequency ranges, the resulting sensor data (e.g., images from different perspectives and with different illumination patterns) may be used to extract reflectance data, the reflectance data may be used to generate more accurate sensor data (e.g., HDR images, images re-rendered using an extracted bidirectional reflectance distribution function), and the resulting sensor data may be used in one or more downstream tasks, such as operator or occupant monitoring or detection tasks (e.g., gaze detection, pose detection, attentiveness or fatigue assessment, facial recognition, gesture recognition, occupant presence detection, child presence detection, seat belt detection, hands-on-wheel detection, etc.), generating visualizations (e.g., video conference calls), and/or otherwise.
G06T 7/55 - Depth or shape recovery from multiple images
B60Q 3/20 - Arrangement of lighting devices for vehicle interiors; Lighting devices specially adapted for vehicle interiors for lighting specific fittings of passenger or driving compartments; Arrangement of lighting devices for vehicle interiors; Lighting devices specially adapted for vehicle interiors mounted on specific fittings of passenger or driving compartments
G01S 17/894 - 3D imaging with simultaneous measurement of time-of-flight at a 2D array of receiver pixels, e.g. time-of-flight cameras or flash lidar
G01S 17/93 - Lidar systems, specially adapted for specific applications for anti-collision purposes
G06T 5/40 - Image enhancement or restoration using histogram techniques
G06T 5/92 - Dynamic range modification of images or parts thereof based on global image properties
H05B 47/11 - Controlling the light source in response to determined parameters by determining the brightness or colour temperature of ambient light
51.
NETWORK BANDWIDTH-AWARE SCHEDULING IN DATA CENTERS
A computing device determines an expected network bandwidth usage value for a set of processes. The computing device further determines an available network bandwidth value for one or more server racks amongst multiple server racks, and selects a first server rack having an available network bandwidth value closest to the expected network bandwidth usage value. The first server rack includes a first set of servers, and the computing device further determines whether a first server of the first set of servers is available. Responsive to determining the first server is available, the computing device assigns the set of processes to the first server.
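The rack selection described above can be sketched as follows. This is only an illustration of the "closest available bandwidth" rule: the dictionary layout, the field names, and the functions `pick_rack` and `schedule` are hypothetical, and the abstract does not say what happens when no server in the chosen rack is available.

```python
def pick_rack(racks, expected_bw):
    """Return the rack whose available bandwidth is closest to expected_bw."""
    return min(racks, key=lambda r: abs(r["available_bw"] - expected_bw))


def schedule(racks, expected_bw):
    """Assign the process set to the first available server of the chosen rack.

    Returns (rack name, server name), or None if the chosen rack has no
    available server (fallback behavior is not covered by the abstract).
    """
    rack = pick_rack(racks, expected_bw)
    for server in rack["servers"]:
        if server["available"]:
            return rack["name"], server["name"]
    return None


racks = [  # illustrative data, bandwidth in Gb/s
    {"name": "A", "available_bw": 40.0,
     "servers": [{"name": "a1", "available": True}]},
    {"name": "B", "available_bw": 12.0,
     "servers": [{"name": "b1", "available": False},
                 {"name": "b2", "available": True}]},
    {"name": "C", "available_bw": 25.0,
     "servers": [{"name": "c1", "available": True}]},
]
placement = schedule(racks, expected_bw=10.0)
```

Choosing the closest match rather than the largest value keeps high-bandwidth racks free for process sets that need them.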
Disclosed are apparatuses, systems, and techniques that may use machine learning for generating artificial speech. The techniques include obtaining a synthetic embedding using learned embeddings associated with different speakers. At least one learned embedding may be generated using a multi-stage training of a machine learning model (MLM) with progressively increasing quality of training speech utterances. The techniques may further include using the MLM and the synthetic embedding to generate synthetic audio data.
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
53.
TECHNIQUES FOR TRAINING A MACHINE LEARNING MODEL TO RECONSTRUCT DIFFERENT THREE-DIMENSIONAL SCENES
In various embodiments, a training application trains a machine learning model to generate three-dimensional (3D) representations of two-dimensional images. The training application maps a depth image and a viewpoint to signed distance function (SDF) values associated with 3D query points. The training application maps a red, green, and blue (RGB) image to radiance values associated with the 3D query points. The training application computes a red, green, blue, and depth (RGBD) reconstruction loss based on at least the SDF values and the radiance values. The training application modifies at least one of a pre-trained geometry encoder, a pre-trained geometry decoder, an untrained texture encoder, or an untrained texture decoder based on the RGBD reconstruction loss to generate a trained machine learning model that generates 3D representations of RGBD images.
A system includes a processing unit coupled with one or more additional devices. The processing unit determines a total power threshold value associated with the processing unit and the one or more additional devices. The processing unit also estimates a power consumption value associated with a first device of the one or more additional devices. Further, the processing unit determines that a combined power consumption of the power consumption value of the first device and a second power consumption value of the processing unit is below the total power threshold value. Responsive to this determination, the processing unit increases an amount of power supplied to the processing unit.
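The power check described above can be sketched in a few lines. The function name `maybe_boost`, the fixed step size, and the capping of the increase at the remaining headroom are assumptions added for illustration; the abstract only states that power to the processing unit is increased when the combined consumption is below the total threshold.

```python
def maybe_boost(total_threshold_w, unit_power_w, device_estimate_w, step_w=5.0):
    """Return the new power level for the processing unit, in watts.

    The unit's power is raised only when the combined consumption of the unit
    and the estimated device is below the total threshold. The step size and
    the headroom cap are hypothetical choices.
    """
    combined = unit_power_w + device_estimate_w
    if combined < total_threshold_w:
        headroom = total_threshold_w - combined
        return unit_power_w + min(step_w, headroom)
    return unit_power_w


boosted = maybe_boost(300.0, 200.0, 80.0)     # combined 280 W, below 300 W
unchanged = maybe_boost(300.0, 200.0, 100.0)  # combined 300 W, not below
```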
The present disclosure relates to determining a first illumination level corresponding to an area based at least on a first illumination detection obtained using a first illumination detector corresponding to a machine. A second illumination level corresponding to the area may be determined based at least on a second illumination detection obtained using a second illumination detector corresponding to the machine. Based at least on the first illumination level and the second illumination level, a scene illumination state of the area may be determined. Based at least on the scene illumination state, one or more lights of the machine may be controlled.
B60Q 1/14 - Arrangement of optical signalling or lighting devices, the mounting or supporting thereof or circuits therefor the devices being primarily intended to illuminate the way ahead or to illuminate other areas of way or environments the devices being headlights having dimming means
G06V 10/60 - Extraction of image or video features relating to illumination properties, e.g. using a reflectance or lighting model
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
56.
HINT-BASED POWER CONSUMPTION STATE MANAGEMENT OF A COMMUNICATION INTERFACE
Devices, systems, and techniques for managing transitions of power states of a communication interface of a computing node. The techniques include generating an instruction associated with transitioning the communication interface from a first power state to a second power state, where the communication interface communicatively couples a first processing unit and a second processing unit. The techniques further include inserting the instruction into a code set executable by the first processing unit, where the first processing unit executes the instruction to cause the communication interface to transition from the first power state to the second power state.
Apparatuses, systems, and techniques translate program instructions that use references to access objects stored in memory to program instructions that use register identifiers to access objects stored in registers. In at least one embodiment, a compiler translates one or more references to one or more corresponding register identifiers.
Apparatuses, systems, and techniques to dynamically assign neural networks to different hardware. In at least one embodiment, one or more portions of one or more neural networks are to be dynamically assigned to different hardware during inferencing based, at least in part, on changes in inferencing workloads of the one or more portions or capabilities of the different hardware.
Technologies for providing pseudo-random binary sequence (PRBS) error correction in a noisy channel are described. A receiver device includes an error correction circuit that receives an incoming PRBS, the incoming PRBS comprising an error at a specific bit position. The error correction circuit generates a plurality of PRBSs using the incoming PRBS and delayed versions of the incoming PRBS, each delayed version being delayed by a different amount such that each of the plurality of PRBSs comprises errors at different bit positions than the specific bit position. The error correction circuit generates a corrected PRBS using the incoming PRBS and the plurality of PRBSs.
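One way to read the scheme above is as a majority vote between the incoming bit and estimates of that bit rebuilt from delayed copies of the sequence. The sketch below is an assumption-laden illustration, not the claimed circuit: it assumes a 7-bit linear recurrence `s[n] = s[n-6] ^ s[n-7]`, uses the fact that over GF(2) the same sequence also satisfies `s[n] = s[n-12] ^ s[n-14]`, and picks majority voting as the combining rule, none of which is stated in the abstract.

```python
def prbs(n_bits, seed=(1, 0, 0, 0, 0, 0, 0)):
    """Generate n_bits of a sequence obeying s[n] = s[n-6] ^ s[n-7]."""
    s = list(seed)
    while len(s) < n_bits:
        s.append(s[-6] ^ s[-7])
    return s[:n_bits]


def correct(received):
    """Majority-vote each bit against two estimates built from delayed bits.

    A single bit error at position k corrupts received[k], the first estimate
    at k+6 and k+7, and the second estimate at k+12 and k+14, so at every
    position at most one of the three votes is wrong.
    """
    out = list(received)
    for n in range(14, len(received)):
        e1 = received[n - 6] ^ received[n - 7]    # recurrence estimate
        e2 = received[n - 12] ^ received[n - 14]  # squared-recurrence estimate
        out[n] = 1 if received[n] + e1 + e2 >= 2 else 0
    return out


clean = prbs(64)
noisy = list(clean)
noisy[30] ^= 1          # inject one bit error
fixed = correct(noisy)  # the vote restores the original sequence
```

The first 14 bits are passed through unchanged because the delayed estimates are not yet defined there.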
Approaches presented herein include dynamic authentication request routing based on storage locations for different attributes used to execute one or more authentication policies. Authentication policies may be executed at an instance associated with an application that uses the policy. When the authentication request is received for the application, a request router may be used to determine one or more attributes used for executing an appropriate policy and then route the authentication request to an instance that has stored at least a portion of the one or more attributes, which may reduce authentication latency associated with retrieving attributes from a remote network location.
Approaches presented herein include integrating custom code and functions prior to policy invocation in a unified access management (UAM) system. UAM may be used to implement attribute based access control to evaluate different attributes for a given request. Systems and methods may call one or more dependent endpoints to execute a custom function prior to invoking a given access policy responsive to a user request. The custom function may route an input request to one or more dependent endpoints and generate a modified, enriched output. The modified, enriched output may then be provided as an input to the policy for evaluation. By using the modified, enriched output as an input attribute for the different access policies, generic policies may be established that are called and executed using a variety of different input attributes.
Apparatuses, systems, and techniques to partition a dataset into a plurality of partitions, with some data elements of the dataset being duplicated. In at least one embodiment, neural network inferencing or training data is to be duplicated between partitions to be used by different accelerators based, at least in part, on an amount of activations shared between two or more of the different accelerators.
Disclosed are apparatuses, systems, and techniques implementing a language error correction system that leverages shared expertise of an ensemble of editor models. The techniques include processing, using a router network, an input representation of a text generated using text generation tool(s) to obtain an ensemble of scores, an individual score characterizing a degree of correspondence of the text to a field of expertise of a respective editor model of the ensemble. The techniques further include processing, using a plurality of editor models of the ensemble, the input representation of the text to generate a plurality of output representations of the text. The techniques further include generating a final representation of the text using a weighted combination of the plurality of output representations of the text, an individual output representation weighted using a score, of the ensemble of scores, obtained for a respective editor model of the plurality of editor models.
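The final weighted combination can be sketched with plain lists standing in for the output representations. The softmax normalization of the router scores and the function names `softmax` and `combine` are assumptions; the abstract only says each editor's output is weighted using its score.

```python
import math


def softmax(scores):
    """Normalize router scores into weights that sum to 1 (an assumption)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine(router_scores, editor_outputs):
    """Weighted sum of per-editor output vectors, one weight per editor."""
    weights = softmax(router_scores)
    dim = len(editor_outputs[0])
    return [
        sum(w * out[i] for w, out in zip(weights, editor_outputs))
        for i in range(dim)
    ]


# Two hypothetical editors with equal router scores contribute equally.
final = combine([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```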
Devices, systems, and techniques for assessing and managing network security risks associated with an artificial intelligence (AI) model executed by a primary computing system. The techniques include receiving one or more executable files implementing the AI model. The techniques include determining, based at least in part on the one or more attributes, a risk rating associated with the AI model. The techniques include causing execution of an action with respect to the AI model based at least in part on the risk rating.
H04L 41/16 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
65.
ENCRYPTION KEY ROTATION WITHOUT INBAND SYNCHRONIZATION OVER A COMMUNICATION INTERCONNECT
A device includes a transmitter coupled to a communication network, a first controller coupled to the transmitter via a control channel, and control logic coupled to the first controller and the transmitter, the control logic to: determine whether the first controller is synchronized with a respective controller of a second device; cause the transmitter to transmit a first number of communications based on a first encryption key in response to a determination that the first controller is synchronized with the respective controller; determine whether the first number of communications satisfies a first threshold condition based on a first encryption interval corresponding to the first encryption key; and cause the transmitter to transmit a second number of communications based on a second encryption key in response to a determination that the first number of communications satisfies the first threshold condition.
H04L 9/16 - Arrangements for secret or secure communications; Network security protocols using a plurality of keys or algorithms the keys or algorithms being changed during operation
H04L 9/06 - Arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for blockwise coding, e.g. D.E.S. systems
H04L 9/32 - Arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system
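The count-based key switch in the ENCRYPTION KEY ROTATION abstract above can be sketched as a counter that both endpoints advance independently. The class `KeyRotator`, the key labels, and the wrap-around to the first key are hypothetical; no actual encryption is performed, and the initial synchronization check between controllers is assumed to have already succeeded.

```python
class KeyRotator:
    """Select a key per communication based on a count, not in-band signals."""

    def __init__(self, keys, interval):
        self.keys = keys          # ordered key labels (hypothetical)
        self.interval = interval  # communications allowed per key
        self.index = 0
        self.sent = 0

    def key_for_next(self):
        # Switch keys once the count for the current key reaches the interval.
        if self.sent >= self.interval:
            self.index = (self.index + 1) % len(self.keys)
            self.sent = 0
        self.sent += 1
        return self.keys[self.index]


tx = KeyRotator(["k0", "k1"], interval=3)
rx = KeyRotator(["k0", "k1"], interval=3)
tx_keys = [tx.key_for_next() for _ in range(7)]
rx_keys = [rx.key_for_next() for _ in range(7)]
```

Because both sides derive the key from the same count and interval, they can stay on the same key without exchanging rotation messages, provided no communication is lost.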
In various examples, perception and planning techniques for interacting with objects are described herein. Systems and methods described herein may determine poses associated with objects located within a container or other partially enclosed space. For instance, image data representing one or more images of the objects may be segmented to generate segmentation masks associated with the objects. The segmentation masks may then be scored to select an object, such as the object that is associated with the highest-scoring segmentation mask. Additionally, one or more techniques may then be used to determine a pose associated with the object, where the pose includes at least a location and/or orientation of the object within the container. The pose may then be used to determine one or more operations for a machine to perform to interact with the object, such as removing the object from a container.
G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
Apparatuses, systems, and techniques to estimate exponent values to be used in data type conversions. In at least one embodiment, one or more largest exponent values of a plurality of floating point operands are identified based, at least in part, on calculating one or more probabilities of less than all of the plurality of floating point operands having the one or more largest exponent values.
Methods and systems are disclosed for estimating 3D poses and sizes of vehicle occupants using neural networks. An image of the vehicle's interior is captured and a monocular depth map is generated. Both the depth map and the image are fed into a 3D pose estimation network as a combined four-channel RGBD input. The neural network may include an occlusion-aware masking layer that generates occlusion scores for key points associated with the occupant. The occlusion scores help the network adjust the weighting of depth information, such that key points with higher occlusion scores, indicating a greater likelihood of being hidden or partially obscured, receive lower weight. Scaling functions that estimate scale factors integrate depth information with the occlusion-aware masks to estimate the absolute depth positions of key points.
Apparatuses, systems, and techniques to perform beamforming in a wireless network. In at least one embodiment, a processor identifies two or more directions to transmit two or more wireless signals based, at least in part, on one or more matrix operands each comprising two or more beamforming matrices, and transmits said wireless signals in said identified directions.
H04B 7/06 - Diversity systems; Multi-antenna systems, i.e. transmission or reception using multiple antennas using two or more spaced independent antennas at the transmitting station
H04B 7/08 - Diversity systems; Multi-antenna systems, i.e. transmission or reception using multiple antennas using two or more spaced independent antennas at the receiving station
70.
NEURAL NETWORK VARIATIONAL AUTOENCODER WITH MULTILEVEL FEATURE EXTRACTION
Processors, systems, and techniques to perform compression of video data using inferencing of neural networks are disclosed. In at least one embodiment, neural networks may independently extract both high-level and low-level feature information of image frames, which may be subsequently combined to generate joint feature information to be compressed to encode the respective frames of a data stream.
In Very Large Scale Integration (VLSI) design, representations of library cells, which are generally comprised of functional, electrical, and physical properties, are vital for effective machine learning (ML)-based circuit analysis and optimization, as library cells are the fundamental building blocks of circuit netlists. Traditional methods often rely on manually defined features, requiring extensive expertise and feature engineering, whereas one-hot encoding methods demand large amounts of domain-specific training data, which may not always be available. The present disclosure provides a self-supervised learning approach to generate library cell representations, including for example the learning of functional and electrical representations of library cells in a vector space which are compatible with diverse machine learning architectures, including transformers.
Apparatuses, systems, and techniques are presented to synthesize consistent images or video. In at least one embodiment, one or more neural networks are used to generate one or more second images based, at least in part, on one or more point cloud representations of one or more first images.
Apparatuses, systems, and techniques to facilitate execution graph control. In at least one embodiment, an application programming interface comprising one or more parameters is used to control which of one or more portions of graph code are to be performed.
Systems and methods to detect one or more segments of one or more objects within one or more images based, at least in part, on a neural network trained in an unsupervised manner to infer the one or more segments. Systems and methods to help train one or more neural networks to detect one or more segments of one or more objects within one or more images in an unsupervised manner.
G06N 3/088 - Non-supervised learning, e.g. competitive learning
G06T 7/143 - Segmentation; Edge detection involving probabilistic approaches, e.g. Markov random field [MRF] modelling
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/94 - Hardware or software architectures specially adapted for image or video understanding
G06V 20/40 - Scenes; Scene-specific elements in video content
Optical sensors (e.g., cameras) and (e.g., IR) illumination sources may be distributed in an environment (e.g., an interior space such as a cabin or cockpit of an ego-machine) and synchronized to generate frames of sensor data, which may be used to reconstruct 3D geometry and/or 3D pose of an occupant, operator, or other object in the environment. For example, stereo vision may be used to generate one or more depth maps from image data generated using different cameras, the depth map(s) may be transformed into a 3D point cloud, and surface reconstruction may be applied to reconstruct the 3D geometry of surface(s) in the environment. A 3D pose, one or more keypoints (e.g., facial landmarks), or some other representation of the shape of the reconstructed surface(s) may be extracted from the reconstructed surface and used in one or more downstream tasks, such as driver and/or occupant monitoring tasks.
G06T 7/55 - Depth or shape recovery from multiple images
B60Q 3/20 - Arrangement of lighting devices for vehicle interiors; Lighting devices specially adapted for vehicle interiors for lighting specific fittings of passenger or driving compartments; Arrangement of lighting devices for vehicle interiors; Lighting devices specially adapted for vehicle interiors mounted on specific fittings of passenger or driving compartments
G01S 17/894 - 3D imaging with simultaneous measurement of time-of-flight at a 2D array of receiver pixels, e.g. time-of-flight cameras or flash lidar
G01S 17/93 - Lidar systems, specially adapted for specific applications for anti-collision purposes
G06T 5/40 - Image enhancement or restoration using histogram techniques
G06T 5/92 - Dynamic range modification of images or parts thereof based on global image properties
H05B 47/11 - Controlling the light source in response to determined parameters by determining the brightness or colour temperature of ambient light
77.
APPLICATION PROGRAMMING INTERFACE TO SCHEDULE THREAD BLOCKS
Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to determine which of two or more blocks of threads are to be scheduled in parallel.
In various examples, table-classification-based query join reordering for relational database systems and applications is provided. In some embodiments, a relational database system is provided that includes a join optimizer that evaluates a join clause of a query and categorizes relational database tables as either fact tables or dimension tables based on a normalized cardinality statistic. The join optimizer uses the fact and dimension tables to deconstruct the query into a plurality of deconstructed query join trees. Individual deconstructed query join trees may be generated for each respective fact table. The deconstructed query join trees may be joined to generate a reordered join solution representing a sequential join of the plurality of deconstructed query join trees. An updated query may be generated based on the reordered join solution, and a query response generated that answers the query based at least on the updated query.
Disclosed are systems and techniques for training machine learning models. The techniques include training an unsupervised text-to-speech (TTS) model and an unsupervised automatic speech recognition (ASR) model via a cycle-consistency objective, generating a candidate text sample and a candidate audio sample using at least one of the trained unsupervised TTS model or the trained unsupervised ASR model, computing a confidence score for at least one of the candidate text sample or the candidate audio sample using a discriminator associated with the unsupervised TTS model or the unsupervised ASR model, and responsive to the confidence score exceeding a threshold, adding the candidate text sample and the candidate audio sample to a training dataset.
In various examples, one or more object detectors may regress bounding polygons for detected objects in systems (e.g., autonomous or semi-autonomous driving systems and applications) that provide object awareness, object identification, object avoidance, and/or object localization. The object detector may determine regression data representing a regressed polygon associated with a given shape of a detected object represented by classification data determined from a scene. The object detector may determine regression data for different regressed angles between different pairs of successive vertices of the regressed polygon and regressed lengths of vectors from a regressed geometric center of the regressed polygon to vertices of the regressed polygon. The object detector may generate, based at least in part on the regression data, a bounding shape for a detected object in the scene. In some embodiments, the object detector may be trained by deforming a regressed polygon to match a ground truth polygon.
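A minimal sketch of how a bounding polygon could be rebuilt from the regression outputs named above, assuming the regressed angles are measured at the geometric center between successive vertices and that the first vertex lies at a hypothetical `start_angle`. The function name and parameters are illustrative, not taken from the source.

```python
import math


def decode_polygon(center, angles, lengths, start_angle=0.0):
    """Turn a center, per-vertex lengths, and inter-vertex angles into vertices.

    angles[i] is assumed to be the angle (radians) swept at the center when
    moving from vertex i to vertex i + 1; lengths[i] is the distance from
    the center to vertex i.
    """
    cx, cy = center
    vertices = []
    theta = start_angle
    for delta, radius in zip(angles, lengths):
        vertices.append((cx + radius * math.cos(theta),
                         cy + radius * math.sin(theta)))
        theta += delta
    return vertices


# Four unit-length vectors separated by 90 degrees give a diamond.
diamond = decode_polygon((0.0, 0.0), [math.pi / 2] * 4, [1.0] * 4)
```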
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
81.
EFFICIENT PIXEL DENSITY MEASUREMENT OF STITCHED IMAGES
A method includes calibrating a stitching algorithm based on determining one or more feature densities for one or more regions of a composite image, the composite image comprising a stitching of two or more synthetic images associated with a configuration of two or more sensors and depicting two or more patterns. The method further includes generating, using sensor data captured by two or more sensors having the configuration, a stitched image using the calibrated stitching algorithm.
G06V 10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 20/70 - Labelling scene content, e.g. deriving syntactic or semantic representations
82.
COMPRESSION OF SIGNED DISTANCE FUNCTION GRIDS VIA A DISTANCE-GUIDED MULTI-RESOLUTION PREDICTOR-CORRECTOR SCHEME
Systems and methods are provided for compressing signed distance function (SDF) grids, for encoding an SDF grid into a data stream, and for decoding a data stream to generate an SDF grid. The systems and methods provided herein employ a prediction-correction scheme that repeatedly upsamples an SDF grid to generate predicted SDF values for new grid points and determines residuals for the SDF values for new grid points that satisfy a predetermined, distance-based condition.
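The prediction-correction idea can be illustrated on a one-dimensional grid. The sketch below is an assumption-laden simplification: it uses a single resolution level, linear interpolation as the predictor, and "predicted distance within `band` of the surface" as the distance-based condition. None of these specifics come from the abstract.

```python
def encode_level(fine, band):
    """Split a 1D SDF into a coarse grid plus residuals near the surface."""
    coarse = fine[::2]
    residuals = {}
    for i in range(1, len(fine), 2):
        right = fine[i + 1] if i + 1 < len(fine) else fine[i - 1]
        predicted = 0.5 * (fine[i - 1] + right)   # predictor: linear upsampling
        if abs(predicted) <= band:                # assumed distance-based condition
            residuals[i] = fine[i] - predicted    # corrector: stored residual
    return coarse, residuals


def decode_level(coarse, residuals, n):
    """Upsample the coarse grid and apply whatever residuals were stored."""
    fine = [0.0] * n
    fine[::2] = coarse
    for i in range(1, n, 2):
        right = fine[i + 1] if i + 1 < n else fine[i - 1]
        predicted = 0.5 * (fine[i - 1] + right)
        fine[i] = predicted + residuals.get(i, 0.0)
    return fine


# Signed distance to the interval [2.2, 4.2], sampled at x = 0..8.
sdf = [abs(x - 3.2) - 1.0 for x in range(9)]
coarse, residuals = encode_level(sdf, band=1.0)
decoded = decode_level(coarse, residuals, len(sdf))
```

Grid points far from the surface carry no residual, so their decoded values are whatever the predictor produces; only points near the zero level set are corrected.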
G06T 9/20 - Contour coding, e.g. using detection of edges
G06T 3/4076 - Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution using the original low-resolution images to iteratively correct the high-resolution images
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
83.
REAL-TIME FORECASTING AND ADAPTIVE SCALING OF DISTRIBUTED WORKERS USING ARTIFICIAL INTELLIGENCE
Systems and techniques for real-time forecasting and adaptive scaling of distributed workers using artificial intelligence (AI) are disclosed. The techniques include receiving first metrics including a first number of work requests for a first task received within a first predetermined duration. The techniques further include applying an AI model to the first number of work requests for the first task to obtain a first predicted number of future work requests for the first task, wherein the AI model comprises a prediction function comprising one or more autoregressive terms and Fourier series terms. The techniques further include causing the first predicted number of future work requests for the first task to be provided to a worker controller for managing a first plurality of workers deployed within a first compute environment to execute the first task.
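One way to read a prediction function with autoregressive and Fourier series terms is as a weighted sum of recent request counts plus periodic components. The sketch below only evaluates such a function; the coefficient values, the `period`, the `bias` term, and the function name are hypothetical, and how the coefficients are fitted is not covered.

```python
import math


def predict_requests(history, t, ar_coeffs, fourier_coeffs, period, bias=0.0):
    """Evaluate bias + autoregressive terms + Fourier series terms at time t.

    ar_coeffs[k] weights the (k + 1)-th most recent observation;
    fourier_coeffs[n - 1] = (a_n, b_n) weights the n-th harmonic.
    """
    ar_part = sum(c * history[-(k + 1)] for k, c in enumerate(ar_coeffs))
    fourier_part = 0.0
    for n, (a, b) in enumerate(fourier_coeffs, start=1):
        w = 2.0 * math.pi * n * t / period
        fourier_part += a * math.cos(w) + b * math.sin(w)
    # A request count cannot be negative, so clamp at zero.
    return max(0.0, bias + ar_part + fourier_part)


history = [100, 120, 110]          # requests per interval, oldest first
forecast = predict_requests(history, t=0, ar_coeffs=[0.5, 0.3],
                            fourier_coeffs=[(10.0, 0.0)], period=24, bias=5.0)
```

The returned value is what would be handed to the worker controller as the predicted number of future work requests.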
Various examples, systems, and methods are disclosed relating to resource and thermal management. A first computing system can obtain, from at least one monitor of at least one processing unit of or associated with the one or more processors, data corresponding to at least one metric of the at least one processing unit. The first computing system further can determine, using at least one artificial intelligence (AI) model, a predicted state of the at least one processing unit based at least on the at least one metric. The first computing system further can update, using the predicted state, at least one static data structure corresponding with a hardware configuration of the at least one processing unit to adjust resource management of the at least one processing unit for execution of a processing task by the at least one processing unit.
Metal track reduction mechanisms in systems utilizing an address bus wherein an address decoder is configured to operate on a first subset of applied address bits and a Gray Coder is configured to operate on a second subset of the applied address bits to generate a Gray Code. A sequence of inverters is configured to invert a bit of the Gray Code at multiple locations along the address bus, each location corresponding to a component to select with the applied address.
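The reason a single inverted bit per location can suffice is that consecutive Gray code values differ in exactly one bit. The sketch below shows the standard binary-reflected Gray code and the bit that changes between neighbours; mapping "location along the bus" to "consecutive index" is an assumption made for illustration.

```python
def to_gray(value):
    """Standard binary-reflected Gray code of a non-negative integer."""
    return value ^ (value >> 1)


def flipped_bit(index):
    """Bit position that differs between the Gray codes of index and index + 1."""
    diff = to_gray(index) ^ to_gray(index + 1)
    return diff.bit_length() - 1


codes = [to_gray(i) for i in range(8)]
flips = [flipped_bit(i) for i in range(7)]
```

With three Gray-coded address bits, `flips` lists which single bit an inverter would need to toggle between one selectable component and the next.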
Technologies for providing Gold code sequence error correction in a noisy channel are described. A receiver device includes an error correction circuit that receives an incoming Gold code sequence. The incoming Gold code sequence includes an error at a specific bit position. The error correction circuit can generate a plurality of Gold code sequences using the incoming Gold code sequence and delayed versions of the incoming Gold code sequence, each delayed version being delayed by a different amount such that each of the plurality of Gold code sequences comprises errors at different bit positions than the specific bit position. The error correction circuit can generate a corrected Gold code sequence using the plurality of Gold code sequences.
Apparatuses, systems, and techniques to identify reasons for users canceling a subscription to an online service. In at least one embodiment, one or more reasons one or more users stop using an online service are identified using one or more neural networks, based on one or more representative reasons among a plurality of reasons one or more users stopped using said online service. In addition, apparatuses, systems, and techniques to identify reasons for users canceling a subscription to an online service. In at least one embodiment, one or more reasons one or more users stop using an online service are identified using one or more neural networks, based on, for example, one or more interactions with said online service by one or more users.
Apparatuses, systems, and techniques to cause, use, and/or perform, one or more neural networks, such as to locate an object. In at least one embodiment, one or more processors comprising one or more circuits is to use one or more neural networks to identify one or more probabilities of one or more ranges of depth of one or more pixels of one or more objects in relation to one or more cameras.
Apparatuses, systems, and techniques to train a neural network using varying levels of supervision. In at least one embodiment, a neural network is trained using a unified task head to facilitate supervision by both weak and strong methods of annotating input data.
Apparatuses, systems, and techniques to train neural networks to perform image processing tasks. In at least one embodiment, one or more second neural networks are used to train one or more first neural networks based, at least in part, on a first object type in one or more images and a second object type in the one or more images, in parallel.
Apparatuses, systems, and techniques to adjust one or more discontinuous wireless communication patterns. In at least one embodiment, a processor includes one or more circuits to use one or more neural networks to adjust one or more discontinuous wireless communication patterns.
Embodiments of the present disclosure relate to a system and method used to transfer image data via Ethernet. The system may include memory for storing frame data that may be received via Ethernet packets. In particular, the Ethernet packets may include a payload that may include one or more segments and a header. The header may include a sequence number field indicating a respective sequence number that corresponds to the respective segment, and a byte offset field that may indicate a respective byte offset that may be applied to the segment. Further, the system may include hardware that may be configured to perform packet analysis operations including determining whether a previously transmitted segment was lost. The system may additionally include a processing system for performing data processing operations including storing individual segments at respective memory locations based on the respective byte offsets included in the Ethernet packets.
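A minimal sketch of the two operations described above: placing each segment at the memory location given by its byte offset, and using sequence numbers to notice a lost segment. The tuple layout `(sequence_number, byte_offset, data)` and the function name are assumptions; the actual header format is not given in the abstract.

```python
def store_segments(segments, frame_size):
    """Write segments into a frame buffer and report missing sequence numbers.

    segments: iterable of (sequence_number, byte_offset, data) tuples in
    arrival order. A gap in sequence numbers is treated as a lost segment.
    """
    frame = bytearray(frame_size)
    lost = []
    expected = None
    for seq, offset, data in segments:
        if expected is not None and seq > expected:
            lost.extend(range(expected, seq))      # sequence numbers never seen
        frame[offset:offset + len(data)] = data    # place by byte offset
        expected = seq + 1
    return bytes(frame), lost


segments = [(0, 0, b"AB"), (1, 2, b"CD"), (3, 6, b"GH")]
frame, lost = store_segments(segments, frame_size=8)
```

Because each segment carries its own byte offset, the segments that do arrive land in the right place even when an earlier one is missing; the gap simply stays zero-filled in this sketch.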
Processors, systems and techniques to encode an input data set as a sequence of encoded values are described. In at least one embodiment, an input data set is obtained and encoded using one or more neural networks as a sequence of encoded values based, at least in part, on similarity measurements between encoded values in the sequence.
The disclosed method for generating molecules includes selecting, based on one or more molecule properties, one or more hard molecule fragments and one or more soft molecule fragments; and processing, using a trained machine learning model, the one or more hard molecule fragments and the one or more soft molecule fragments to generate a molecule, where the molecule includes the one or more hard molecule fragments, and the trained machine learning model generates the molecule based on the one or more soft molecule fragments.
G16B 15/00 - ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
The disclosed method for training one or more robot control models includes performing, based on one or more demonstration trajectories of a robot performing one or more skills associated with a task, one or more training operations to generate one or more first trained machine learning models for controlling the robot; and performing one or more reinforcement learning operations using the one or more first trained machine learning models to generate one or more second trained machine learning models for controlling the robot.
Techniques for emergent scene decomposition from multi-traverse include receiving a plurality of images from multiple traversals of a scene; generating a plurality of 3D Gaussians from the plurality of images; projecting each of the plurality of 3D Gaussians to generate a plurality of rendered 2D images; extracting a feature map from each of the plurality of images and the plurality of rendered 2D images; generating ephemeral object masks for the plurality of images from the feature maps and the plurality of rendered 2D images; generating optimized 3D Gaussians from the plurality of images, the plurality of rendered 2D images, and the ephemeral object masks; and generating a 3D environment from the optimized 3D Gaussians.
For each image of the one or more images, a pose of the calibration target that is associated with the image is determined. The pose includes at least one of a position or an orientation of the calibration target in a local coordinate system of a local positioning system. The image capture device is calibrated based on determining a relationship between the position of the calibration target in the one or more images and the associated local pose of the calibration target in the local coordinate system.
Apparatuses, systems, and techniques are presented to generate images. In at least one embodiment, one or more neural networks are used to generate one or more images based, at least in part, upon one or more semantic features projected from a three-dimensional environment.
Feedforward reasoning models that include a video encoder configured to generate feature tokens from an input video, logic to condition the feature tokens with camera parameters, at least one sparse attention head with two-way attention to transform settings from the feature tokens into a tracking token, a depth token, and a visibility token in accordance with an input prompt, and logic configured to transform the tracking token, depth token, and visibility token into track predictions for an object specified by the input prompt.
G06V 10/771 - Feature selection, e.g. selecting representative features from a multi-dimensional feature space
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
Calibration of various sensors may be difficult without specialized software to process intrinsic and extrinsic information about the sensors. Certain types of input files, such as image files, may also lack certain information, like depth information, needed to effectively translate regions of interest between images taken from different perspectives. Landmarks can be used to establish points for associating regions of interest between images taken from different perspectives and provided as an overlay to verify sensor calibration.