Embodiments described herein provide a coolant distribution unit having one or more swappable components. In at least one embodiment, one or more coolant distribution units may contain one or more swappable filters coupled to one or more isolation valves that control one or more coolant flows through the one or more swappable filters.
F28F 19/01 - Preventing the formation of deposits or corrosion, e.g. by using filters by using means for separating solid materials from heat-exchange fluids, e.g. filters
H05K 7/20 - Modifications to facilitate cooling, ventilating, or heating
2.
DYNAMIC OBJECT DETECTION USING LIDAR DATA FOR AUTONOMOUS MACHINE SYSTEMS AND APPLICATIONS
In various examples, systems and methods of the present disclosure detect and/or track objects in an environment using projection images generated from LiDAR. For example, a machine learning model—such as a deep neural network (DNN)—may be used to compute a motion mask indicative of motion corresponding to points representing objects in an environment. Various input channels may be provided as input to the machine learning model to compute a motion mask. One or more comparison images may be generated based on comparing depth values projected from a current range image to a coordinate space of a previous range image to depth values of the previous range image. The machine learning model may use the one or more projection images, the one or more comparison images, and/or the one or more range images to compute a motion mask and/or a motion vector output representation.
G06T 7/254 - Analysis of motion involving subtraction of images
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
3.
VIRTUALIZED ROOT OF TRUST IN DISTRIBUTED COMPUTING SYSTEM
A system includes a plurality of application processors (APs), a plurality of flash memory devices associated with the plurality of APs, and a plurality of multiplexers, each to selectively couple a flash memory device of the plurality of flash memory devices to an AP of the plurality of APs. A controller is operatively coupled to the plurality of multiplexers and provides a trusted execution environment to execute a virtual root of trust (vROT) application for each respective AP of the plurality of APs. Each vROT application accesses a corresponding one or more of the plurality of flash memory devices via a corresponding one or more of the plurality of multiplexers.
G06F 21/53 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity, buffer overflow or preventing unwanted data erasure by executing in a restricted environment, e.g. sandbox or secure virtual machine
4.
MODEL CUSTOMIZATION AND DEPLOYMENT IN CONTAINERIZED ENVIRONMENTS
Various examples, systems, and methods are disclosed relating to a model customization pipeline. A first computing system can receive at least one customization of at least one artificial intelligence (AI) model corresponding to a base instance. The first computing system can generate a customized instance of the at least one AI model by updating the base instance of the at least one AI model based on the at least one customization. The first computing system can generate a software component configured to perform at least one operation using the customized instance of the at least one AI model. The first computing system can package the software component and the customized instance of the at least one AI model into a first container instance. The first computing system can deploy the software component within a runtime environment.
According to one or more embodiments, operations may relate to a point cloud being generated based at least on filtering out points from a first point cloud based at least on an analysis of which of the filtered out points correspond to respective points of one or more second point clouds different from the first point cloud.
In various examples, systems and methods for calibration for sensor lens shading using non-radial correction of residual radial shading error are provided. In some embodiments, a calibration flow includes computation of calibration parameters corresponding to radial lens shading correction, and computation of calibration parameters corresponding to non-radial lens shading correction. A lens shading profile may be computed that defines a gain mapping of lens shading effect appearing in an image frame of calibration sensor data. Parameters for radial lens shading correction may be computed from the lens shading profile, and parameters for non-radial lens shading correction may be computed based on a residual shading profile generated from the radial lens shading correction. Calibration parameters for radial and non-radial lens shading correction may be used to calibrate sensor data captured by an image sensor module to correct for lens shading.
G06T 7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
H04N 17/00 - Diagnosis, testing or measuring for television systems or their details
H04N 25/61 - Noise processing, e.g. detecting, correcting, reducing or removing noise the noise originating only from the lens unit, e.g. flare, shading, vignetting or "cos4"
Apparatuses, systems, and techniques to perform exclusive assignment of processing resources, during operation, to operating applications to allow for exclusive fault reporting between applications. In at least one embodiment, processors comprising one or more circuits to cause performance of one or more threads corresponding to one or more respective kernels to be selectively stopped based, at least in part, on at least one of the one or more threads encountering an error.
A parallel processing unit (PPU), operating in a traditional processing environment or in a virtualized processing environment, can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.
One embodiment of a method for animating characters includes receiving one or more goals specified in one or more modalities, generating, via a trained machine learning model and based on the one or more goals, a first action for a character to perform, where the trained machine learning model is trained to process inputs in multiple modalities, and causing the character to perform the first action within a computer-based or physical environment.
Apparatuses, systems, and techniques to compile and modify software programs. In at least one embodiment, a software program is to be modified to initialize information to be used by one or more application programming interfaces (APIs).
Bandwidth shaping mechanisms in a memory hierarchy operate a first bandwidth shaper on a first memory, such as a cache memory, to shape bandwidth to a second memory, such as a Dynamic Random Access Memory, and replenish the first bandwidth shaper from a second bandwidth shaper based on a hit bandwidth on the first memory.
As integrated circuit geometries have shrunk, lithography simulation has developed to ensure that the masks used to fabricate the circuits satisfy the chip yield and fabrication turnaround time targets. To manufacture an integrated circuit (chip), an initial layout for the integrated circuit design is processed to compute a wafer image (e.g., resist material “printed” on the wafer using photomasks). Lithography simulation processes the initial layout according to optical physics to compute an estimated wafer image without actually constructing the physical masks or consuming any wafer fabrication resources and may be used to confirm manufacturability of the design layout before it is fabricated. Performing lithography simulation using a dual-band neural network produces accurate results efficiently. Dual-band refers to a dual frequency band processing whereby the input layout (mask image) is separately processed by both a first and second branch to extract low-frequency (global) features and high-frequency (local) features, respectively.
G06F 30/398 - Design verification or optimisation, e.g. using design rule check [DRC], layout versus schematics [LVS] or finite element methods [FEM]
G06F 17/14 - Fourier, Walsh or analogous domain transformations
G06F 30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
G06F 119/18 - Manufacturability analysis or optimisation for manufacturability
13.
EMBEDDED COOLING FOR AN INTEGRATED CIRCUIT CONFIGURED WITH A BACKSIDE POWER RAIL
According to various embodiments, a packaged integrated circuit device includes: a first semiconductor substrate and a second semiconductor substrate. The first semiconductor substrate includes: an integrated circuit, a region of signal layers residing on a first side of the first semiconductor substrate, and a region of power delivery layers residing on a second side of the first semiconductor substrate. The second semiconductor substrate is coupled to the region of signal layers, wherein a first side of the second semiconductor substrate is coupled to the region of signal layers, and a second side of the second semiconductor substrate includes a plurality of fluidic channels.
Apparatuses, systems, and techniques to scale values. In at least one embodiment, a processor comprising one or more circuits to cause a largest value of each portion of two or more portions of an array to be identified and to use the largest value of each portion to scale one or more values within each portion sequentially.
G06F 5/01 - Methods or arrangements for data conversion without changing the order or content of the data handled for shifting, e.g. justifying, scaling, normalising
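The per-portion scaling named in the abstract above can be illustrated with a minimal sketch. It assumes fixed-size portions and that "largest value" means largest magnitude; the function name `scale_by_block_max` and the `block_size` parameter are hypothetical and do not come from the abstract.

```python
def scale_by_block_max(values, block_size):
    """Scale each fixed-size portion of `values` by that portion's largest magnitude.

    Assumption: portions are contiguous runs of `block_size` elements, and the
    scale factor is the largest absolute value found in the portion.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    scaled = []
    # Portions are processed one after another (sequentially).
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        largest = max(abs(v) for v in block)
        # An all-zero portion is left unchanged to avoid dividing by zero.
        scale = largest if largest != 0 else 1.0
        scaled.extend(v / scale for v in block)
    return scaled
```

Scaling each portion by its own maximum keeps small-valued portions from being crushed by a single large value elsewhere in the array, which is the usual motivation for block-wise scaling.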
In one embodiment of the present invention, a convolution engine configures a parallel processing pipeline to perform multi-convolution operations. More specifically, the convolution engine configures the parallel processing pipeline to independently generate and process individual image tiles. In operation, for each image tile, the pipeline calculates source locations included in an input image batch based on one or more start addresses and one or more offsets. Subsequently, the pipeline copies data from the source locations to the image tile. The pipeline then performs matrix multiplication operations between the image tile and a filter tile to generate a contribution of the image tile to an output matrix. To optimize the amount of memory used, the pipeline creates each image tile in shared memory as needed. Further, to optimize the throughput of the matrix multiplication operations, the values of the offsets are precomputed by a convolution preprocessor.
G06V 10/50 - Extraction of image or video features by performing operations within image blocks; Extraction of image or video features by using histograms, e.g. histogram of oriented gradients [HoG]; Extraction of image or video features by summing image-intensity values; Projection analysis
G06V 10/56 - Extraction of image or video features relating to colour
G06V 10/94 - Hardware or software architectures specially adapted for image or video understanding
G06V 30/142 - Image acquisition using hand-held instruments; Constructional details of the instruments
16.
USER CONTENT SENTIMENT ANALYSIS USING LARGE LANGUAGE MODELS
Approaches presented herein relate to performing of sentiment analysis on various types of content, such as product reviews. The resulting sentiment data can be provided for various uses, such as to allow for sentiment-based search or to make sentiment-based recommendations. An example system collects and processes product reviews from various sources. A first language model (such as an LLM or VLM) may be used to analyze a review to generate a summary and perform sentiment analysis. A second language model may be used to perform further analysis based on the sentiment data to infer correlation between comments and reviews and an influence of the review and the comments. Such an approach can analyze the influence of user comments on subsequent reviews from the same commentator, top concerns and issues highlighted by the commentator, inherent bias, features evaluated, etc. The generated sentiment data and associated timestamp data can be stored and indexed in a database for subsequent retrieval or analysis.
In various examples, activation criteria and/or braking profiles corresponding to automatic emergency braking (AEB) systems and/or collision mitigation warning (CMW) systems may be determined using sensor data representative of an environment to a front, side, and/or rear of a vehicle. For example, activation criteria for triggering an AEB system and/or CMW system may be adjusted by leveraging the availability of additional information with regard to the surrounding environment of a vehicle, such as the presence of a trailing vehicle. In addition, the braking profile for the AEB activation may be adjusted based on information about the presence of and/or location of vehicles to the front, rear, and/or side of the vehicle. By adjusting the activation criteria and/or braking profiles of an AEB system, the potential for collisions with dynamic objects in the environment is reduced and the overall safety of the vehicle and its passengers is increased.
B60T 7/22 - Brake-action initiating means for automatic initiation; Brake-action initiating means for initiation not subject to will of driver or passenger initiated by contact of vehicle, e.g. bumper, with an external object, e.g. another vehicle
B60Q 9/00 - Arrangement or adaptation of signal devices not provided for in one of main groups
B60T 8/171 - Detecting parameters used in the regulation; Measuring values used in the regulation
B60T 8/172 - Determining control parameters used in the regulation, e.g. by calculations involving measured or detected parameters
18.
INCREMENTAL APPLICATION SAVES FOR CONTENT STREAMING SYSTEMS
In various examples, incremental application saves for content streaming systems are described herein. Systems and methods are disclosed that perform incremental saves during sessions associated with applications, such as a gaming application, and then use the incremental saves to retrieve most current states associated with the applications if one or more termination events occur. For instance, during a session of an application, data packages may be generated that represent changes to the application, such as changes to one or more files and/or one or more portions (e.g., a block(s)) of the file(s). As such, if a termination event occurs during the session, such as the session crashing, user data may be used to reload an initial state of the application at the beginning of the session while the data packages are then used to reload a most current state of the application before the termination event(s) occurred.
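The recovery flow in the abstract above (reload the initial state, then replay the data packages) can be sketched as follows. This is only an illustration under stated assumptions: files are modelled as lists of byte blocks, and each data package is a dictionary keyed by a hypothetical `(file_name, block_index)` pair. None of these structures or names come from the abstract.

```python
def restore_state(initial_files, data_packages):
    """Rebuild the most recent state from a session-start snapshot plus deltas.

    initial_files: {file_name: [block, block, ...]} captured at session start.
    data_packages: list of {(file_name, block_index): new_block}, oldest first.
    Both layouts are assumptions made for this sketch.
    """
    # Copy the snapshot so the initial state itself is never mutated.
    state = {name: list(blocks) for name, blocks in initial_files.items()}
    for package in data_packages:
        for (name, block_index), block in package.items():
            blocks = state.setdefault(name, [])
            # Grow the file with empty blocks if the change lies past its end.
            while len(blocks) <= block_index:
                blocks.append(b"")
            blocks[block_index] = block
    return state
```

Replaying packages oldest first means a later change to the same block overwrites an earlier one, so the result reflects the last state recorded before the termination event.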
Systems and methods are disclosed that interleave federated learning of multiple machine learning models across multiple data centers or other networks, which may be located in distinct geographic locations, regions, or zones. This interleaving of the federated learning of multiple machine learning models may comprise designating which machine learning models are to be trained at which data centers (or other location types), and when to trigger rounds of concurrent training in different data centers. For example, the beginning of a first round of training of a corresponding machine learning model may be triggered at each corresponding data center, a determination may be made that the first round of training has been completed, model update data may be rotated to the next scheduled data centers, and the next scheduled machine learning models may be loaded and trained.
Apparatuses, systems, and techniques to compile and modify software programs. In at least one embodiment, a software program is to be modified to initialize information to be used by one or more application programming interfaces (APIs).
Apparatuses, systems, and techniques to perform signal processing operations in a fifth generation (“5G”) radio signal. In at least one embodiment, one or more processors equalize, in parallel, one or more 5G radio signals.
In various examples, determining lane localization using two-way outputs for autonomous and/or semi-autonomous systems and applications is described herein. Systems and methods described herein may determine multiple outputs (e.g., vectors) associated with lanes of a driving surface (e.g., a road), where the outputs are indexed starting at different locations with respect to the driving surface, and then use the multiple outputs to determine a lane in which a machine is navigating. In some examples, an output may include a vector that includes a number of elements, where a respective element is associated with at least a lane of the driving surface and indicates a probability that the machine is located within the lane. Additionally, in some examples, the outputs may be indexed starting at different sides of the driving surface, such as the right and left sides of the driving surface.
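One way to read the two-way indexing above is sketched below. It assumes two probability vectors, one counted from the left edge and one from the right edge, and takes the most likely element of each. The function names and the returned dictionary are hypothetical, and the lane-count estimate is simple arithmetic added for illustration rather than a step stated in the abstract.

```python
def argmax(values):
    """Index of the largest element (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def localize_lane(left_indexed, right_indexed):
    """Combine two lane-probability vectors indexed from opposite road edges.

    left_indexed[i]  : probability the machine is in the i-th lane from the left.
    right_indexed[j] : probability the machine is in the j-th lane from the right.
    Indices are zero-based; this layout is an assumption of the sketch.
    """
    from_left = argmax(left_indexed)
    from_right = argmax(right_indexed)
    return {
        "lane_from_left": from_left,
        "lane_from_right": from_right,
        # With zero-based counts from both edges, the lanes to the left, the
        # lanes to the right, and the occupied lane add up to the road width.
        "estimated_lane_count": from_left + from_right + 1,
    }
```

Indexing from both sides gives a position that does not depend on knowing the total number of lanes in advance, since each vector is anchored to a road edge rather than to the road width.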
In various examples, depth predictions obtained using machine learning models may be improved by leveraging relationships associated with two-dimensional (2D) images and three-dimensional (3D) environments. For instance, systems and methods are disclosed that may generate and use a depth distribution map as an additional input channel to a machine learning model. This depth distribution channel may represent average depth values for respective pixels of 2D images generated using a sensor. Additionally, or alternatively, the disclosed systems and methods may generate and use a 2D coordinate channel (e.g., Y coordinate channel) that is aligned with depth in 3D space. For example, the 2D coordinate channel may include points having values that increase in magnitude from a bottom portion of a frame to a top portion of the frame. One or more of these channels may then be applied to the machine learning model to improve depth predictions.
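The 2D coordinate channel described above can be sketched as a plain nested list. The normalization to the range 0 to 1 and the function name `y_coordinate_channel` are assumptions for illustration; the abstract only states that values increase in magnitude from the bottom of the frame to the top.

```python
def y_coordinate_channel(height, width):
    """Build a height x width channel whose values grow from bottom row to top row.

    Row 0 is the top of the frame. Values are normalized to [0, 1], which is
    an assumption of this sketch.
    """
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    # Avoid dividing by zero for a single-row frame.
    denom = max(height - 1, 1)
    return [[(height - 1 - row) / denom] * width for row in range(height)]
```

In a forward-facing camera view of a ground plane, pixels higher in the frame tend to correspond to points farther away, which is why a channel like this can act as a cue aligned with depth.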
Approaches presented herein provide for efficient rendering of high quality, novel views of a scene, in this case achieved through a combination of volumetric particle representations and ray tracing. An object can be represented using a set of volumetric particles (e.g., 3D distributions) that are aligned to the underlying structure or geometry of the object. Volumetric particles can be encapsulated in a bounding mesh or proxy geometry that can be used to efficiently compute ray-particle intersections. For a view to be rendered, ray tracing can be performed to determine an intersection of the rays with the proxy geometry. When a hit is determined, the precise intersection location with the volumetric particle is computed and the value of the distribution returned for that ray. If a ray passes through multiple semi-transparent volumetric particles then the color value is determined based upon the values returned from those particles.
One embodiment of a method for animating characters includes receiving one or more goals specified in one or more modalities, generating, via a trained machine learning model and based on the one or more goals, a first action for a character to perform, where the trained machine learning model is trained to process inputs in multiple modalities, and causing the character to perform the first action within a computer-based or physical environment.
A first device includes a transceiver to communicate with a second device over an interposer, the interposer comprising a plurality of conductive traces between the first device and the second device. The first device also includes control logic, coupled to the transceiver associated with the first device, configured to send first data to the second device over a conductive trace of the plurality of conductive traces, simultaneously receive second data from the second device over the conductive trace, and extract the second data from a combined signal comprising the first data and the second data.
H04L 7/00 - Arrangements for synchronising receiver with transmitter
H04L 7/033 - Speed or phase control by the received code signals, the signals containing no special synchronisation information using the transitions of the received signal to control the phase of the synchronising-signal-generating means, e.g. using a phase-locked loop
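The extraction step in the simultaneous bidirectional link above can be illustrated with a minimal sketch. It assumes the combined signal on the trace is the sample-by-sample sum of the two transmitted signals, so the local device recovers the remote data by subtracting what it sent. The linear-sum model and the function name `extract_remote` are assumptions, not details from the abstract.

```python
def extract_remote(combined, local):
    """Recover the remote device's samples from a superimposed signal.

    Assumption: combined[i] == local[i] + remote[i] for every sample.
    """
    if len(combined) != len(local):
        raise ValueError("signals must have the same number of samples")
    return [c - l for c, l in zip(combined, local)]
```

For example, if the first device sends `[1, 0, 1, 1]` while the second sends `[0, 0, 1, 0]`, the trace carries `[1, 0, 2, 1]` under this model, and subtracting the local samples returns the second device's data.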
27.
TARGET SPACE DETECTION FOR AUTONOMOUS AND SEMI-AUTONOMOUS SYSTEMS AND APPLICATIONS
A neural network may be used to determine corner points of a skewed polygon (e.g., as displacement values to anchor box corner points) that accurately delineate a region in an image that defines a parking space. Further, the neural network may output confidence values predicting likelihoods that corner points of an anchor box correspond to an entrance to the parking spot. The confidence values may be used to select a subset of the corner points of the anchor box and/or skewed polygon in order to define the entrance to the parking spot. A minimum aggregate distance between corner points of a skewed polygon predicted using the CNN(s) and ground truth corner points of a parking spot may be used to simplify a determination as to whether an anchor box should be used as a positive sample for training.
G06T 17/30 - Surface description, e.g. polynomial surface description
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; Image or video pattern matching; Proximity measures in feature spaces using context analysis; Selection of dictionaries
G06V 10/772 - Determining representative reference patterns, e.g. averaging or distorting patterns; Generating dictionaries
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
28.
FAULT DETECTION FOR AUTONOMOUS AND SEMI-AUTONOMOUS SYSTEMS AND APPLICATIONS
Systems and methods for detecting hardware faults in computer-based feedback control systems. Multiple instances of the system control program(s) are run on system processors. System sensor data are input to each instance, and the control commands output by each instance are compared. As instantiations of the same programs receive largely the same sensor data, differences between output commands may indicate the presence of one or more hardware faults.
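The comparison step described above can be sketched as follows, assuming each program instance emits a fixed-length list of numeric control commands and that small numerical differences are tolerated. The function name `commands_disagree` and the `tolerance` parameter are hypothetical.

```python
def commands_disagree(outputs, tolerance=1e-3):
    """Return True if redundant instances produced materially different commands.

    outputs: one list of numeric control commands per program instance.
    tolerance: largest per-command difference treated as benign (an assumption).
    """
    if not outputs:
        raise ValueError("at least one instance output is required")
    reference = outputs[0]
    for other in outputs[1:]:
        if len(other) != len(reference):
            return True
        if any(abs(a - b) > tolerance for a, b in zip(reference, other)):
            return True
    return False
```

A disagreement only suggests that a fault may be present. Since the instances receive largely, not exactly, the same sensor data, a tolerance of some kind is needed to avoid flagging ordinary variation.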
A broad bandwidth polarization splitting grating coupler is provided. The grating coupler includes a substrate (or at least a portion thereof) and a plurality of features formed on the substrate in a two-dimensional arrangement. The two-dimensional arrangement is defined by a regular two-dimensional lattice defining a plurality of lattice sites. Each position of the two-dimensional arrangement is displaced by a non-zero displacement from a corresponding lattice site of the plurality of lattice sites.
One embodiment of a method for animating characters includes receiving one or more goals specified in one or more modalities, generating, via a trained machine learning model and based on the one or more goals, a first action for a character to perform, where the trained machine learning model is trained to process inputs in multiple modalities, and causing the character to perform the first action within a computer-based or physical environment.
Apparatuses, systems, and techniques to report predicted channel state information (CSI). In at least one embodiment, a system includes one or more circuits to compare a predicted channel state information (CSI) to a measured CSI and to cause the predicted CSI to more closely match the measured CSI.
H04B 7/06 - Diversity systems; Multi-antenna systems, i.e. transmission or reception using multiple antennas using two or more spaced independent antennas at the transmitting station
32.
TECHNIQUES TO PERFORM OPERATIONS WITH MATRIX MULTIPLY ACCUMULATE (MMA) CIRCUITRY
Apparatuses, systems, and techniques to perform operations with matrix multiply accumulate (MMA) circuitry. In at least one embodiment, a processor includes MMA circuitry to perform a first portion of a mathematical operation, one or more first circuits to perform a second portion of the mathematical operation not performed by the MMA circuitry, and one or more second circuits to cause a result of the first portion and the second portion to be combined.
Systems and methods in accordance with the present disclosure can prevent interference of memory operations being performed, for example, by high priority or high safety applications. In various examples, a memory management unit (MMU) can receive, from a client, a request to perform a process. The MMU can select, based at least on an identifier of at least one of the client or the process, a target memory manager of a plurality of memory managers of the MMU. The MMU can cause the target memory manager to perform a memory translation operation for the request.
Silicon wafers including multiple wafer test structures, each comprising a ring oscillator comprising multiple bit-storing cells configured such that the discharge of a bitline triggers charging of a first adjacent bitline and discharge of a second adjacent bitline. The oscillation frequency of the ring oscillator changes in accordance with the discharge rate of the bitlines, which is affected by factors such as word line under-drive and aging. The silicon wafers include at least one frequency monitor coupled to one or more of the ring oscillators.
In various examples, systems and methods are disclosed relating to language model jailbreak detection using length-perplexity metrics. A system can identify a prompt for a language model—such as an LLM, VLM, etc.—and generate a perplexity score for the prompt. The system can determine, based at least on the perplexity score and a length of the prompt, that the prompt is indicative of a jailbreak attempt for the large language model. The system can restrict the prompt from input to the large language model—or block an output generated based on the prompt from being shared—responsive to determining that the prompt is indicative of the jailbreak attempt.
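A length-perplexity check of the kind described above can be sketched as follows. Perplexity is computed with its standard definition, the exponential of the negative mean token log-probability. The decision rule (both a minimum length and a perplexity threshold), the threshold values, and the function names are assumptions for illustration, and the per-token log-probabilities are assumed to come from some language model that is not shown here.

```python
import math

def perplexity(token_logprobs):
    """Perplexity from natural-log token probabilities: exp(-mean(logprob))."""
    if not token_logprobs:
        raise ValueError("at least one token log-probability is required")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def looks_like_jailbreak(token_logprobs, max_perplexity=1000.0, min_length=20):
    """Flag prompts that are both long and highly perplexing.

    The thresholds and the AND-combination are hypothetical choices; the
    abstract only says the decision uses the perplexity score and the length.
    """
    long_enough = len(token_logprobs) >= min_length
    return long_enough and perplexity(token_logprobs) > max_perplexity
```

Combining the two signals is meant to avoid flagging short, unusual prompts whose perplexity is high only because there are few tokens to average over.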
Embodiments of the present disclosure may relate to a method of generating a model of a system (e.g., a computing system), where the model may include one or more of a structural or architectural model and a behavioral or dynamic model. In some embodiments, the method may include obtaining data that may indicate one or more elements associated with functionality of a system (e.g., one or more software elements and/or hardware components). In some embodiments, the method may additionally include determining one or more operational dependencies corresponding to the one or more elements associated with the functionality of the system. Further, the method may include generating a model of the system based at least on the obtained data, the one or more elements, and the determined operational dependencies.
Embodiments of the present disclosure relate to language instructed temporal localization in videos, and provide multimodal large language models (LLMs) for performing language instructed temporal localization in video, as well as methods for training and implementing such models. In contrast to conventional systems, models according to embodiments of the present disclosure are designed to answer “when?” questions, while simultaneously improving other relevant capabilities of multimodal LLMs. Additionally, and/or alternatively, embodiments of the present disclosure may utilize a soft cross entropy loss and/or a dynamic sampling strategy to further improve the model, which allows the model to better understand temporal information and perform event localization tasks. For example, embodiments of the present disclosure may perform a dynamic sampling strategy and utilize video tokens and image tokens and/or utilize a soft cross entropy loss that applies a Gaussian distribution to the loss.
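The soft cross entropy loss with a Gaussian distribution mentioned above can be sketched as follows. This assumes time is discretized into bins, that the hard target bin is replaced by a normalized Gaussian centred on it, and that the model outputs a probability per bin. The function names, the `sigma` parameter, and the binning are assumptions rather than details from the abstract.

```python
import math

def gaussian_soft_targets(num_bins, center, sigma=1.0):
    """Normalized Gaussian weights over time bins, peaked at the true bin."""
    weights = [math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(num_bins)]
    total = sum(weights)
    return [w / total for w in weights]

def soft_cross_entropy(predicted_probs, soft_targets, eps=1e-12):
    """Cross entropy between predicted bin probabilities and soft targets."""
    if len(predicted_probs) != len(soft_targets):
        raise ValueError("inputs must have the same number of bins")
    # eps guards against log(0) for bins the model assigns zero probability.
    return -sum(q * math.log(p + eps) for p, q in zip(predicted_probs, soft_targets))
```

Compared with a one-hot target, the soft target still rewards predictions that land in a neighbouring bin, so a near miss in time is penalized less than a distant one.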
Triplanes are data representations used in computer graphics to encode scenes into compact feature representations that balance expressiveness with efficiency. Despite their efficiency, triplanes still suffer from large data bandwidth size, precluding use in streaming or dynamic settings. Methods which aim to compress triplanes, however, must be trained alongside the model and as a result are not generalizable among different scenes. The present disclosure provides a generalizable solution for triplane compression that can be applied to various triplanes without scene-specific training or finetuning.
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
In examples, a VM may receive and aggregate a first attestation report corresponding to a CPU and a second attestation report corresponding to a GPU. The aggregated data may be provided to an attestation service, which may verify the attestation reports indicate a TCB is to include the VM and GPU state data and is to isolate the GPU state data and the VM from an untrusted host OS. Based at least on the TCB being verified, the VM may perform one or more operations using the TCB. The TCB may include a trusted hypervisor to isolate the VM and GPU state data within the GPU(s) from the untrusted host OS. The trusted hypervisor may prevent the host OS from accessing device memory assigned to the VM based at least on controlling an IOMMU and/or second-level address translation (SLAT) used to access the data.
G06F 21/57 - Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
A63F 13/73 - Authorising game programs or game devices, e.g. checking authenticity
40.
HIGH-SPEED SIGNALING SYSTEM WITH GROUND REFERENCED SIGNALING (GRS) OVER SUBSTRATE
A system includes a first integrated circuit coupled to a printed circuit board (PCB) and a second integrated circuit coupled to the PCB. The system further includes a ground referenced signaling (GRS) link coupled between the first integrated circuit and the second integrated circuit through the PCB. Unencoded data is transmitted on the GRS link according to a memory coherence protocol.
A text-to-image machine learning model takes a user input text and generates an image matching the given description. As an extension to this concept, text-to-3D content models can take a user input text to generate 3D content. However, existing text-to-3D content models require different views to be individually generated and optimized in order to form the content in 3D, which is costly in terms of computation and time, and are typically limited to the generation of 3D objects as opposed to large 3D scenes. The present description enables the creation of 3D scenes in a less costly manner by using a feed-forward neural network that can generate a 3D representation of a scene from a plurality of labeled voxels that describe the scene in 3D.
Apparatuses, systems, and techniques to generate neural networks optimized for different hardware. In at least one embodiment, a processor using circuits is to adjust a neural network architecture based on computing resources that use said neural network in inference.
Random sequence generators utilizing one or more noise generators that include an inverter chain with at least one input stage inverter configured with resistive feedback and additional inverters configured in series with the at least one input stage inverter, wherein an output of the inverter chain is coupled to a bit sequence generator.
In various examples, systems and methods are disclosed relating to transforming sensor measurements according to configurable atmospheric conditions. One or more circuits can identify a point cloud comprising a plurality of points and a parameter of a weather condition to simulate and modify an intensity of at least one of the plurality of points according to the parameter of the weather condition. The one or more circuits can update, based at least on a subset of the plurality of points and the parameter of the weather condition, the point cloud to include one or more additional points.
Methods and systems are provided for active thermal control (ATC) of an integrated circuit (IC) during a testing procedure. The methods and systems described herein involve obtaining a plurality of temperature measurements of a die of the IC. The plurality of temperature measurements are provided by a plurality of temperature sensors integrated with the die. Each individual sensor can, for example, be integrated with an individual compute unit of a graphics processing unit (GPU) or with an individual core of a central processing unit (CPU). The methods and systems described herein further involve controlling, based on the plurality of temperature measurements, a temperature forcing system to implement ATC. Control of the temperature forcing system involves supplying heat to the IC when a temperature falls below a desired test temperature range and/or removing heat from the IC when a temperature exceeds the desired test temperature range.
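The heat-or-cool decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `TempRange` and `atc_action` are hypothetical, and the rule for combining several sensor readings (react to the hottest sensor first, then to the coldest) is an assumption, since the abstract does not say how the measurements are combined.

```python
from dataclasses import dataclass


@dataclass
class TempRange:
    """Desired test temperature range in degrees Celsius (hypothetical type)."""
    low_c: float
    high_c: float


def atc_action(sensor_temps_c, temp_range):
    """Return 'cool', 'heat', or 'hold' from per-sensor die temperatures."""
    if not sensor_temps_c:
        raise ValueError("at least one temperature measurement is required")
    # Assumption: an over-temperature reading takes priority over an
    # under-temperature reading when both occur on the same die.
    if max(sensor_temps_c) > temp_range.high_c:
        return "cool"  # remove heat from the IC
    if min(sensor_temps_c) < temp_range.low_c:
        return "heat"  # supply heat to the IC
    return "hold"      # every sensor is inside the desired range
```

A real temperature forcing system would presumably apply a continuous control law rather than a three-way switch; the sketch only shows the range comparison.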
Approaches presented herein provide systems and methods for generating one or more parameters for a content generation environment. One or more trained models may be used to generate parameters for the content generation environment based on a provided input. The input may be evaluated and then parsed or otherwise formatted to generate a deterministic output from the one or more trained models. A modified input generated from the input may then be provided to the one or more models for generation of the requested parameters. A configuration file may be generated and/or the parameters may be directly provided to an environment to configure different components based on the generated parameters.
Apparatuses, systems, and techniques to generate neural networks optimized for different hardware. In at least one embodiment, a processor using circuits is to adjust a neural network architecture based on computing resources that use said neural network in inference.
Apparatuses, systems, and techniques for enhancing autonomous driving systems. In at least one embodiment, visual input corresponding to an observable environment is tokenized into object-level knowledge and provided to a large language model (LLM). Object-level tokens are processed by the LLM to enhance autonomous vehicle route-planning, reducing trajectory error and decreasing collision rates.
B60W 60/00 - Drive control systems specially adapted for autonomous road vehicles
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
G06V 20/70 - Labelling scene content, e.g. deriving syntactic or semantic representations
Frame buffer compression schemes used for image compression are oftentimes lossless so that the image can be decompressed as close as possible back to its original state. However, lossless compression schemes require that any image data that cannot be successfully compressed (i.e., without losing significant data) be kept in a non-compressed state for transmission and storage. As a result, lossless compression can reduce bandwidth requirements but not memory requirements. The more recently introduced lossy frame buffer compression schemes do allow for some data loss and therefore can save both bandwidth and memory; however, they are limited, particularly in the amount by which image data can practically be reduced. The present disclosure provides lossy frame buffer compression which involves an additional compression step, thereby allowing image data to be compressed to a lower rate. This lossy frame buffer compression can reduce both bandwidth and memory usage.
An IC package includes a die and an elongate conductive trace formed adjacent to at least one peripheral edge of the die. Test logic in the package performs a die crack test by applying a test pattern to a first end of the conductive trace and sensing a response pattern at a second end of the conductive trace. The package is configured to operate in at least two distinct modes, including a manufacturing test mode in which a die crack test result is communicated from the package out of a JTAG port, and a field test mode in which the die crack test result is communicated from the package out of a data transfer port. A vehicle computer system may perform a fail safe action based on the die crack test result.
B60K 28/10 - Safety devices for propulsion-unit control, specially adapted for, or arranged in, vehicles, e.g. preventing fuel supply or ignition in the event of potentially dangerous conditions responsive to conditions relating to the vehicle
B60R 16/03 - Electric or fluid circuits specially adapted for vehicles and not otherwise provided for; Arrangement of elements of electric or fluid circuits specially adapted for vehicles and not otherwise provided for electric for supply of electrical power to vehicle subsystems
G01R 31/3185 - Reconfiguring for testing, e.g. LSSD, partitioning
G07C 5/08 - Registering or indicating performance data other than driving, working, idle, or waiting time, with or without registering driving, working, idle, or waiting time
51.
HAZARD DETECTION FOR AUTONOMOUS AND SEMI-AUTONOMOUS SYSTEMS AND APPLICATIONS
In various examples, systems and methods are disclosed that detect hazards on a roadway by identifying discontinuities between pixels on a depth map. For example, two synchronized stereo cameras mounted on an ego-machine may generate images that may be used to extract depth or disparity information. Because a hazard's height may cause an occlusion of the driving surface behind the hazard from a perspective of a camera(s), a discontinuity in disparity values may indicate the presence of a hazard. For example, the system may analyze pairs of pixels on the depth map and, when the system determines that a disparity between a pair of pixels satisfies a disparity threshold, the system may identify the pixel nearest the ego-machine as a hazard pixel.
B60W 40/02 - Estimation or calculation of driving parameters for road vehicle drive control systems not related to the control of a particular sub-unit related to ambient conditions
B60W 60/00 - Drive control systems specially adapted for autonomous road vehicles
G06T 7/30 - Determination of transform parameters for the alignment of images, i.e. image registration
G06T 7/593 - Depth or shape recovery from multiple images from stereo images
G06T 7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
H04N 13/271 - Image signal generators wherein the generated image signals comprise depth maps or disparity maps
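The disparity-discontinuity check described in the hazard detection abstract above can be illustrated with a small Python sketch. Several details are assumptions rather than anything stated in the abstract: pixel pairs are taken to be vertically adjacent pixels in the same column, "satisfies a disparity threshold" is read as "differs by at least the threshold", and the function name `find_hazard_pixels` is hypothetical.

```python
def find_hazard_pixels(disparity, threshold):
    """Return (row, col) positions flagged as hazard pixels.

    `disparity` is a 2D list of disparity values (rows ordered top to bottom).
    """
    hazards = []
    rows = len(disparity)
    cols = len(disparity[0]) if rows else 0
    for c in range(cols):
        for r in range(rows - 1):
            upper, lower = disparity[r][c], disparity[r + 1][c]
            # Assumption: a pair "satisfies" the threshold when the absolute
            # difference in disparity meets or exceeds it.
            if abs(upper - lower) >= threshold:
                # In stereo vision a larger disparity means a closer point,
                # so the pixel with the larger value is nearest the cameras.
                hazards.append((r, c) if upper > lower else (r + 1, c))
    return hazards
```

On a flat road the disparity changes smoothly from row to row, so only a jump, such as the top edge of an object occluding the road behind it, is flagged.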
A carrier for insertion of a device under test into a tester includes engagement structures and an integrated activator. The engagement structures are shaped to engage and hold the device under test in the carrier for insertion in the tester. The activator is positioned to automatically contact and activate a configuration component on the device under test. The activator may particularly include a shunt positioned to electrically contact and short together configuration pins. The shorting or other activation sets an operating mode of the device under test during testing.
Apparatuses, systems, and techniques to generate one or more data items related to an object from a set of multiple data items related to the same object using a generative adversarial network. In at least one embodiment, data objects in a set are disentangled into common and unique components and encoded such that they may be used to train one or more neural networks to generate missing data from the data objects in the set, such as generating missing medical images from a set of related medical images.
Apparatuses, systems, and techniques to detect uplink data in a multi-user multiple input multiple output wireless communication system. In at least one embodiment, uplink data is detected using one or more graphics processing units to perform parallel computations in order to form a consensus belief about information received by one or more base stations in a wireless network.
In various examples, systems and methods are disclosed that relate to managing interactions with generative artificial intelligence models. For example, a system can receive data associated with a system prompt, the system prompt including a first string of text. The system can then receive data associated with a task prompt, the task prompt including a second string of text configured to cause a large language model (LLM) to generate an output. The system can generate a model prompt including a third string of text based at least on the first string of text and the second string of text. In examples, the system can provide the model prompt to the LLM to cause the LLM to generate the output, the output including an answer that is determined based at least on a context associated with the third string of text.
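As a rough illustration of how a model prompt might be derived from a system prompt and a task prompt, consider the Python sketch below. The section labels, the concatenation rule, and the names `build_model_prompt` and `answer` are all assumptions made for illustration; the abstract only states that the third string is based on the first two. The `llm` argument is a hypothetical callable that maps a prompt string to an output string.

```python
def build_model_prompt(system_prompt, task_prompt):
    """Combine a system prompt and a task prompt into one model prompt."""
    # Assumption: labeled sections separated by a blank line.
    return (
        "[SYSTEM]\n" + system_prompt.strip()
        + "\n\n[TASK]\n" + task_prompt.strip()
    )


def answer(llm, system_prompt, task_prompt):
    """Send the combined prompt to `llm`, a hypothetical callable str -> str."""
    return llm(build_model_prompt(system_prompt, task_prompt))
```

Keeping the combination step in its own function makes it possible to inspect or log the exact third string before it reaches the model.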
Apparatuses, systems, and frameworks for provisioning of efficient pipelines capable of multi-model inference and data processing using multiple processing units, including streaming data applications. The disclosed techniques include, during an initialization stage, assigning a plurality of machine learning models (MLMs) for execution on graphics processing units (GPUs), allocating memory space, on a hub GPU, to the plurality of MLMs, storing input data on the hub GPU before transferring the input data to other GPUs for execution. During an execution stage, output data is initially stored on GPUs that generated the output data before transferring the output data to the hub GPU.
Approaches presented herein provide for the selection of tracks of data to be used to generate, or update, a digital representation or reconstruction of a physical environment. Tracks of data may be obtained that correspond to roads or other features of a region, but there may be more tracks of data obtained for certain features than are needed, and few tracks obtained for other features. A selection process can cluster track segments into buckets, and attempt to select tracks so that the number of tracks for each bucket is above a minimum track threshold and below a maximum track threshold. An interactive selection process can be used, where selection of a track causes that track to be selected for all associated buckets that have not yet reached the maximum track threshold. Once at least a minimum number of tracks have been selected for each bucket, the tracks can be registered and provided for generation of the digital representation.
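One way to picture the bucket-based selection is the greedy Python sketch below. It is not the disclosed method: the greedy rule (pick the track that covers the most under-filled buckets, breaking ties by track id), the stopping condition, and the name `select_tracks` are assumptions. What it does take from the abstract is that a selected track counts toward every associated bucket that has not yet reached the maximum track threshold.

```python
def select_tracks(track_buckets, min_tracks, max_tracks):
    """Greedily select tracks until each bucket has at least `min_tracks`.

    `track_buckets` maps a track id to the set of bucket ids it covers.
    Returns (assigned, counts): the buckets each selected track was used for,
    and the number of selected tracks per bucket.
    """
    counts = {b: 0 for buckets in track_buckets.values() for b in buckets}
    assigned = {}
    remaining = dict(track_buckets)
    while remaining:
        deficient = {b for b, n in counts.items() if n < min_tracks}
        if not deficient:
            break  # every bucket has reached the minimum
        # Assumption: prefer the track that helps the most deficient buckets.
        best = max(sorted(remaining), key=lambda t: len(remaining[t] & deficient))
        if not remaining[best] & deficient:
            break  # no remaining track can help any deficient bucket
        buckets = remaining.pop(best)
        # The track is selected only for buckets still below the maximum.
        open_buckets = {b for b in buckets if counts[b] < max_tracks}
        for b in open_buckets:
            counts[b] += 1
        assigned[best] = open_buckets
    return assigned, counts
```

For example, with tracks `t1` covering buckets `a` and `b`, `t2` covering `a`, `t3` covering `b` and `c`, and `t4` covering `a` and `c`, a minimum of 1 and a maximum of 2 are satisfied after selecting only `t1` and `t3`.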
Systems and methods are disclosed that perform global human and camera motion estimation using a motion diffusion model that is attached to a control branch. For instance, using a controlled motion denoiser that comprises the motion diffusion model and the control branch, global human motions and the corresponding camera motions from “in-the-wild” videos may be estimated. Initially, SLAM may be used to initialize the camera motion and a pose estimation model may be used to estimate the local human motion. Combining the two, embodiments of the present disclosure initialize the global human motion. Then, during optimization and using a COIN system that includes the controlled motion denoiser and/or using a COIN algorithm, embodiments of the present disclosure enforce the global human and camera motion to satisfy a two-dimensional (2D) projection on videos and the motion distribution from the motion diffusion model.
Apparatuses, systems, and techniques for cross-modality alignment for large language models (LLMs), enabling enhanced multi-modal interaction. In at least one embodiment, a textual embedding is obtained by encoding a multi-modal input and aligning the encoded results into a textual embedding space. A visual embedding is obtained based on features extracted from visual data in the multi-modal input using visual encoders. A multi-modal output is generated based on the textual embedding and the visual embedding.
Apparatuses, systems, and techniques to execute one or more application programming interfaces (APIs) to perform one or more operations for one or more accelerators within a heterogeneous processor. In at least one embodiment, one or more processors are to perform one or more instructions in response to one or more APIs to indicate one or more functions to be performed in response to one or more errors from one or more accelerators within a heterogeneous processor.
Heat transfer apparatus may include an integrated circuit (“IC”) package that itself includes a package substrate and one or more dies coupled to the package substrate such that bottom surfaces of the dies face toward the package substrate and top surfaces of the dies define die top areas. A metal member, such as a package lid, a heat sink, or a cold plate, may be designed to fit above at least part of the package substrate. A heat transfer member having a higher thermal conductivity than the metal member is attached to the metal member such that the heat transfer member is disposed over a die top area when the metal member is placed over the package substrate. In some embodiments, the heat transfer member may comprise a diamond material.
H01L 23/367 - Cooling facilitated by shape of device
H01L 21/50 - Assembly of semiconductor devices using processes or apparatus not provided for in a single one of the groups or
H01L 23/00 - Details of semiconductor or other solid state devices
H01L 23/373 - Cooling facilitated by selection of materials for the device
H01L 25/07 - Assemblies consisting of a plurality of individual semiconductor or other solid-state devices all the devices being of a type provided for in a single subclass of subclasses , , , , or , e.g. assemblies of rectifier diodes the devices not having separate containers the devices being of a type provided for in subclass
62.
DEFINING STREAMING PIPELINES FOR AI-BASED PROCESSING USING DECLARATIVE CONFIGURATIONS
In various examples, a declarative configuration format defines a machine learning pipeline using nodes and edges that define the data flow between those nodes. Partial pipelines may be defined independently and referenced by multiple full pipelines. The configuration format may allow for constructing applications for a range of hardware and deployment environments. The system may support both graphical and programmatic interfaces for defining, modifying, and visualizing pipelines. The machine learning pipelines may be implemented on or interact with an underlying machine learning pipeline framework that provides runtime objects and a plugin architecture. The configuration data may be used to generate an application that includes the processing components and corresponding runtime dependencies of a machine learning pipeline in a containerized image.
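A declarative description of nodes and edges can be illustrated with a small Python sketch. The schema below (a `nodes` mapping and an `edges` list of source and destination pairs), the node type names, and the function `execution_order` are hypothetical and are not the configuration format of the disclosure. The sketch only shows how an execution order could be derived from such a description, using the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical declarative configuration: nodes plus the edges between them.
PIPELINE_CONFIG = {
    "nodes": {
        "source": {"type": "video_source"},
        "detector": {"type": "inference"},
        "sink": {"type": "display"},
    },
    "edges": [("source", "detector"), ("detector", "sink")],
}


def execution_order(config):
    """Return node names ordered so each node runs after its inputs."""
    graph = {name: set() for name in config["nodes"]}
    for src, dst in config["edges"]:
        if src not in graph or dst not in graph:
            raise KeyError(f"edge references unknown node: {src} -> {dst}")
        graph[dst].add(src)  # dst depends on src
    # static_order() raises graphlib.CycleError if the edges form a cycle.
    return list(TopologicalSorter(graph).static_order())
```

Because the configuration is plain data, the same description could in principle be produced by a graphical editor or by a program, which matches the two kinds of interface the abstract mentions.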
At least one embodiment is directed towards a computer-implemented method for generating compressed video content. The computer-implemented method includes the steps of receiving a plurality of triplanes associated with video content; extracting channel range values from each triplane included in the plurality of triplanes; normalizing the plurality of triplanes based on the channel range values to generate a plurality of normalized triplanes; storing the channel range values with the plurality of normalized triplanes; generating a plurality of tiled triplanes based on the plurality of normalized triplanes; compressing the plurality of tiled triplanes to generate compressed video content; and transmitting the compressed video content to an endpoint device. Another embodiment is directed towards a computer-implemented method for rendering video content. Yet another embodiment is directed towards a computer-implemented method for training generative artificial intelligence (AI) models.
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
G06T 15/00 - 3D [Three Dimensional] image rendering
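The normalization step in the triplane compression abstract above, together with its inverse on the rendering side, can be sketched as follows. The sketch assumes min-max normalization of each channel to the range [0, 1]; the abstract only says that triplanes are normalized based on channel range values and that those values are stored, so the exact mapping and the function names are assumptions. A channel is represented here as a flat list of floats.

```python
def normalize_channel(values):
    """Normalize one channel to [0, 1] and return it with its (min, max) range."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Constant channel: nothing to scale, keep the range for recovery.
        return [0.0 for _ in values], (lo, hi)
    return [(v - lo) / span for v in values], (lo, hi)


def denormalize_channel(normalized, channel_range):
    """Invert normalize_channel using the stored (min, max) range."""
    lo, hi = channel_range
    return [lo + n * (hi - lo) for n in normalized]
```

Storing the range alongside the normalized data is what allows a receiver to recover the original scale after decompression, up to whatever loss the codec introduces.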
64.
TECHNIQUES FOR REDUCING SECURITY RISKS ASSOCIATED WITH SHARED STORAGE
In various embodiments, a proxy application processes requests to access a storage system. The proxy application receives a client request from a proxy driver executing on a client node. The client request is associated with a client buffer and a location within the storage system. The proxy application converts the client request to a proxy request that is associated with a proxy buffer and the same location within the storage system. The proxy application transmits the proxy request to a storage driver that is associated with the storage system. The storage driver causes a file server to perform at least one operation at the location in accordance with the proxy request.
G06F 21/62 - Protecting access to data via a platform, e.g. using keys or access control rules
G06F 21/54 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity, buffer overflow or preventing unwanted data erasure by adding security routines or objects to programs
65.
POLARIZATION-MULTIPLEXED OPTICAL BI-DIRECTIONAL LINKS USING SYMMETRICAL HARDWARE
One embodiment includes an optical communication channel, a first network device connected to a first end of the optical communication channel, and a second network device connected to a second end of the optical communication channel. At least one of the first network device or the second network device performs polarization tracking of packets of polarization multiplexed bidirectional communications through the optical communication channel.
Processors, systems, methods, and computer program products to detect unauthorized modifications to security information in a security device. In at least one embodiment, a processor comprises one or more circuits that use information stored in one or more storage locations of one or more security devices to detect whether the security devices have been rebooted.
G06F 21/57 - Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
G06F 21/55 - Detecting local intrusion or implementing counter-measures
67.
APPLICATION PROGRAMMING INTERFACE TO INDICATE OBJECT INTERSECTED BY RAY
Apparatuses, systems, and techniques to indicate one or more objects intersected by one or more rays. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to cause one or more objects intersected by one or more rays to be indicated to a user.
Apparatuses, systems, and techniques to select frequency of a processing unit. In at least one embodiment, an operating frequency of one or more integrated circuits is dynamically adjusted based, at least in part, on a dynamically measured maximum throughput of the one or more integrated circuits.
One critical objective of robotic learning is building a universal agent capable of performing a vast number of tasks across a diverse set of environments. Currently, an agent policy for performing a task can be learned from video depicting performance of the task. However, because the learning is susceptible to focusing on areas of the video that do not depict the actual performance of the task, errors can be introduced into the policy. The present disclosure provides video diffusion for a specified task with a focus on an active region in which the task is being performed, such that an agent policy then trained on the video will correctly learn the actions needed to be taken to perform the task.
Apparatuses, systems, and techniques to power balance multiple chips. In at least one embodiment, a system includes a plurality of processors having substantially equal performance capability and different power consumption capability, where a cumulative power consumption of the processors is not to exceed a system power threshold if each processor is operated at substantially peak performance.
At least one embodiment is directed towards a computer-implemented method for rendering video content. The computer-implemented method includes the steps of decompressing compressed video content to generate decompressed video content, wherein the decompressed video content includes a plurality of normalized triplanes; de-normalizing the plurality of normalized triplanes to generate a plurality of modified triplanes; performing neural rendering operations to generate a plurality of final images via ray tracing based on the plurality of modified triplanes; and displaying the plurality of final images as rendered video content via a display device. Another embodiment is directed towards a computer-implemented method for generating compressed video content. Yet another embodiment is directed towards a computer-implemented method for training generative artificial intelligence (AI) models.
At least one embodiment is directed towards a computer-implemented method for training generative artificial intelligence (AI) models. The computer-implemented method includes the steps of receiving a plurality of training images; rendering, via a generative AI model, a plurality of synthetic images based on the plurality of training images; generating triplane loss metrics for the plurality of synthetic images by comparing the plurality of synthetic images against the plurality of training images; generating total variation (TV) loss metrics based on the triplane loss metrics; generating triplane compression loss metrics based on the triplane loss metrics; generating total loss metrics based on the TV loss metrics and the triplane compression loss metrics; and performing at least one backpropagation operation based on the total loss metrics to update weights associated with the generative AI model to generate an updated generative AI model.
Disclosed are apparatuses, systems, and techniques that implement efficient staggered authentication of sensor data in real-time streaming applications. In one embodiment, a processing device establishes an authentication schedule for a plurality of sensors and receives units of data from the sensors. The units of data are received over multiple times, with the processing device receiving, from respective sensors, a plurality of sub-units of data; selecting, using the authentication schedule, a number of one or more sub-units of data from the received sub-units of data; and performing an authentication of the one or more selected sub-units of data. The processing device determines authenticity of the units of data using the performed authentications of the sub-units of data.
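A minimal Python sketch of staggered authentication follows. It rests on several assumptions that the abstract does not state: sub-units carry HMAC-SHA256 tags, the schedule is a simple round-robin in which each sensor has every N-th sub-unit checked at a different offset, and a unit is treated as authentic when all of its checked sub-units verify. The function names are hypothetical.

```python
import hashlib
import hmac


def make_tag(key, payload):
    """Compute an HMAC-SHA256 tag (assumed authentication primitive)."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def scheduled_indices(sensor_index, num_sensors, num_subunits):
    """Sub-unit indices to authenticate for one sensor (assumed round-robin)."""
    return [i for i in range(num_subunits) if i % num_sensors == sensor_index]


def unit_is_authentic(key, subunits, sensor_index, num_sensors):
    """Check only the scheduled sub-units of a unit of data.

    `subunits` is a list of (payload, tag) pairs received from one sensor.
    """
    picks = scheduled_indices(sensor_index, num_sensors, len(subunits))
    return all(
        hmac.compare_digest(make_tag(key, subunits[i][0]), subunits[i][1])
        for i in picks
    )
```

Offsetting the checked indices per sensor spreads the verification work over time instead of authenticating every sub-unit from every sensor at once. The trade-off is that tampering in an unchecked sub-unit goes unnoticed in that unit.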
A processing device including a first cache is coupled to a system memory and a parallel processing unit (PPU) including a second cache. An operation to modify cache lines of the second cache associated with a first aperture of the system memory is received. A first subset of cache lines of the second cache is identified. The first subset of cache lines is associated with the first aperture of the system memory and is different from a second subset of cache lines of a second aperture of the system memory. The first subset of cache lines is modified as specified by the cache operation.
Apparatuses, systems, and techniques to detect objects in images including digital representations of those objects. In at least one embodiment, one or more objects are detected in an image based, at least in part, on points corresponding to a surface of one or more objects.
Approaches presented herein provide for the determination of realistic lighting parameters for a scene represented in an image. Realistic lighting parameters can allow for the insertion of one or more virtual objects into a scene image, where the lighting or shading applied to the virtual object(s) can be consistent with those for other objects in the scene. A machine learning model such as a discriminator or diffusion model can be used to analyze a composed image generated by a differential renderer, for example, in which at least one virtual object has been inserted into a scene image and had lighting effects applied in accordance with a set of lighting parameters. A loss value can be determined based on the results of this machine learning model, which can be used to optimize the lighting parameters and/or adjust the weights or parameters of a model used to generate the lighting parameters. Once fine-tuned or optimized, the lighting parameters can represent an accurate light map for the scene or environment that can be used to generate composed images.
In various examples, ground truth data for training machine learning models may be improved using other sources of information, such as outputs from neural networks and/or other vision-based algorithms. For instance, sensor data that is to be used as a ground truth for training/validating a machine learning model may be obtained using one or more sensors. However, instead of automatically using the sensor data as a presumed accurate version of the ground truth, the sensor data may be evaluated for inaccuracies and, in some instances, updated to reduce one or more of the inaccuracies. For example, a neural network, a vision-based algorithm, and/or another learned process may be used to generate validation data for comparing with the sensor data, identifying the inaccuracies, and/or refining the sensor data to generate a more accurate version of the ground truth.
Processors configured to execute instructions to enable more efficient computation of distances, collisions, and other common engineering tasks, including instructions to share register values among threads executing in a partition of the processor, instructions to compute a distance between surfaces of a sphere, and instructions to obtain identifiers of threads associated with minimal or maximal values of local registers.
Embodiments of voltage regulators that include a power transistor configured to provide a load current, a voltage follower configured to provide a gate voltage of the power transistor, and a circuit configured to transform the gate voltage of the power transistor into a mirror current of the load current and to adjust a current flow through the voltage follower based on the mirror current.
G05F 1/46 - Regulating voltage or current wherein the variable actually regulated by the final control device is DC
G05F 1/575 - Regulating voltage or current wherein the variable actually regulated by the final control device is DC using semiconductor devices in series with the load as final control devices characterised by the feedback circuit
G05F 1/59 - Regulating voltage or current wherein the variable actually regulated by the final control device is DC using semiconductor devices in series with the load as final control devices including plural semiconductor devices as final control devices for a single load
80.
SYSTEM MODIFICATION BASED ON CORRELATIONS BETWEEN OPERATIONAL DOMAIN PARAMETERS AND PERFORMANCE INDICATORS IN AUTONOMOUS SYSTEMS AND APPLICATIONS
Embodiments of the present disclosure may relate to a method of modifying system behavior based on one or more determined correlations. In some embodiments, the method may include obtaining first data and second data where the first data may include one or more performance indicator values that may correspond to the performance of the system and where the second data may include one or more operational domain parameters that may correspond to the system. In some embodiments, the method may additionally include assembling a data structure based on the first data and the second data. In some embodiments, the method may additionally include determining one or more correlations between individual operational domain parameters and individual performance indicator values based on the assembled data structure. In some embodiments, the method may additionally include modifying one or more aspects of the system based on the determined correlations.
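The correlation step can be illustrated with a short Python sketch. The abstract does not name a correlation measure, so the Pearson coefficient used here is an assumption, as are the record layout (one dictionary per observation) and the function names.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # convention chosen here: a constant series has no correlation
    return cov / (sx * sy)


def correlate(records, parameters, indicators):
    """Correlate each operational domain parameter with each indicator.

    `records` is the assembled data structure: one dict per observation,
    keyed by parameter and indicator names (hypothetical layout).
    """
    return {
        (p, k): pearson([r[p] for r in records], [r[k] for r in records])
        for p in parameters
        for k in indicators
    }
```

A strongly positive or negative coefficient for a pair such as rainfall and lane-keeping error would then point to the aspect of the system worth modifying, which is the final step the abstract describes.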
In various examples, updates may be published for maps used by machines to navigate an environment without restricting use of previous versions of the maps that remain valid. For example, a map for a region of an environment may be updated to a new version, and compatibility information for the new version of the map may be published. The compatibility information may indicate whether the new version of the map is compatible with other maps for other regions. Additional compatibility information for the other maps may also be updated to indicate whether those other maps are compatible with the new version of the map and/or the previous versions of the map. In some examples, an index may be updated to indicate the new version of the map is the most current version, and the new version of the map may be made available for use by the machines.
Disclosed are apparatuses, systems, and techniques that leverage one or more artificial intelligence models for efficient automatic speech recognition (ASR) of speech in a diacritized language. The techniques include processing, using an ASR model, audio frame(s) encoding a speech in the diacritized language to generate, for a transcription token (TT) of the speech, likelihoods that the TT corresponds to various vocabulary tokens that include both non-diacritized and diacritized tokens of the language, and generating, using the likelihoods, a transcription of the speech.
One embodiment of a method for controlling vehicles includes generating, based on sensor data, a first plan for controlling a vehicle, generating, using a trained visual language model (VLM), a final plan for controlling the vehicle based on the first plan and a second plan, and controlling the vehicle based on the final plan.
B60W 60/00 - Drive control systems specially adapted for autonomous road vehicles
G06F 40/40 - Processing or translation of natural language
G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
84.
POLARIZATION-MULTIPLEXED OPTICAL BI-DIRECTIONAL LINKS USING SYMMETRICAL HARDWARE
One embodiment includes an optical communication channel, a first network device connected to a first end of the optical communication channel, and a second network device connected to a second end of the optical communication channel. At least one of the first network device or the second network device performs polarization tracking of packets of polarization multiplexed bidirectional communications through the optical communication channel.
Embodiments of the present disclosure relate to performance by a machine of one or more planning, control, or navigation operations using a point cloud. The point cloud is generated using sensor data selected from a sensor data set obtained using one or more external sensors of the machine. The sensor data is selected for inclusion in the point cloud based at least on one or more criteria individually corresponding to generation of the point cloud.
G01S 13/04 - Systems determining presence of a target
B60W 40/10 - Estimation or calculation of driving parameters for road vehicle drive control systems not related to the control of a particular sub-unit related to vehicle motion
B60W 40/12 - Estimation or calculation of driving parameters for road vehicle drive control systems not related to the control of a particular sub-unit related to parameters of the vehicle itself
B60W 60/00 - Drive control systems specially adapted for autonomous road vehicles
G01S 7/00 - Details of systems according to groups , ,
G01S 13/86 - Combinations of radar systems with non-radar systems, e.g. sonar, direction finder
G01S 13/89 - Radar or analogous systems, specially adapted for specific applications for mapping or imaging
G01S 13/931 - Radar or analogous systems, specially adapted for specific applications for anti-collision purposes of land vehicles
G01S 17/04 - Systems determining the presence of a target
G01S 17/931 - Lidar systems, specially adapted for specific applications for anti-collision purposes of land vehicles
G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
G06V 10/28 - Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
Apparatuses, techniques, and/or software to reduce a number of weight parameter values for one or more neural networks by causing said one or more neural networks to use one or more tensors in two or more portions of the one or more neural networks. For example, apparatuses, techniques, processors, and/or software to generate a linear combination of tensors to approximate two or more layers of a neural network, where the same tensors can be used or repeated to approximate different portions of the neural network. In one or more embodiments, said two or more portions are generated and provided to said one or more neural networks using knowledge distillation techniques.
In various examples, systems and methods relating to high resolution and low latency video streaming are disclosed. A system can capture, from data generated by an application, a plurality of partial frames according to a sampling rate. The system can generate a plurality of packet groups, where each group includes one or more packets storing a respective partial frame of the plurality of partial frames and respective location metadata for the partial frame. The system can transmit the plurality of packet groups to a receiver system accessing the application, where each group is transmitted at a respective time.
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/132 - Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
H04N 19/136 - Incoming video signal characteristics or properties
H04N 19/167 - Position within a video image, e.g. region of interest [ROI]
H04N 19/423 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements
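The packet grouping described in the video streaming abstract above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: a partial frame is assumed to be a horizontal strip of rows, and the `Packet` fields, the function name, and the payload limit are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    frame_id: int
    top: int        # location metadata: first row of the strip in the full frame
    height: int     # location metadata: number of rows in the strip
    offset: int     # byte offset of this payload within the strip
    payload: bytes


def make_packet_groups(frame_id, rows, strip_height, max_payload):
    """Split a frame (a list of byte rows) into strips, one packet group per strip."""
    groups = []
    for top in range(0, len(rows), strip_height):
        strip_rows = rows[top:top + strip_height]
        strip = b"".join(strip_rows)
        # Each packet carries the strip's location so the receiver can place it.
        group = [
            Packet(frame_id, top, len(strip_rows), offset,
                   strip[offset:offset + max_payload])
            for offset in range(0, len(strip), max_payload)
        ]
        groups.append(group)
    return groups


# A 4-row frame of 8 bytes per row, split into 2-row strips of 10-byte packets.
rows = [bytes([i]) * 8 for i in range(4)]
groups = make_packet_groups(frame_id=0, rows=rows, strip_height=2, max_payload=10)
```

Each group could then be sent at its own time; the scheduling itself is not shown here.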
88.
LEARNING MONOTONIC ALIGNMENT FOR LANGUAGE MODELS IN AI SYSTEMS AND APPLICATIONS
In various examples, learning monotonic alignment for language models in AI systems and applications is described herein. Systems and methods are disclosed that train one or more language models—such as LLMs or VLMs—using one or more techniques that improve the ability of the language model(s) to align inputs (e.g., text tokens) with outputs (e.g., speech tokens). For instance, to learn a stricter alignment and improve robustness of the language model(s), the training may encourage monotonic cross-attention scores using one or more attention priors and/or using one or more connectionist temporal classification (CTC) losses when updating the language model(s). For instance, the attention prior(s) may initialize the cross-attention scores to a monotonic heuristic while the CTC loss(es) may ensure the learned alignment attends over one or more text tokens (e.g., all text tokens) sequentially.
G10L 13/02 - Methods for producing synthetic speech; Speech synthesisers
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
89.
ADAPTIVE SCALING FOR MULTI-RESOLUTION PROCESSING IN MACHINE LEARNING SYSTEMS AND APPLICATIONS
In various examples, sizes of images that are to be applied to machine learning models may be evaluated with respect to thresholds that correspond to input resolutions of the models. In some examples, if the size of an image is smaller than a threshold corresponding to a certain model's input resolution, the image may be incorporated into a frame that has the same resolution as that model's input resolution. For instance, the image may be copied into the frame at the image's original size, and the rest of the frame may include padding around the image to maintain the frame's resolution at the input resolution. This frame, which may include both the image and the padding, may then be applied to the model. In this way, the image may still be applied to the model at the model's input resolution without scaling and potentially distorting the image.
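A minimal sketch of the padding step described above, using nested lists as a stand-in for image data. The function name, the choice to centre the image, and the zero pad value are assumptions made for illustration; the abstract only states that padding surrounds the unscaled image.

```python
def fit_to_input(image, input_h, input_w, pad_value=0):
    """Place `image` unscaled inside an input_h x input_w frame.

    Returns None when the image exceeds the model's input resolution,
    since that case would need scaling and is outside this sketch.
    """
    h = len(image)
    w = len(image[0]) if h else 0
    if h > input_h or w > input_w:
        return None
    # Centring is an assumption; any placement keeps the original pixel size.
    top = (input_h - h) // 2
    left = (input_w - w) // 2
    frame = [[pad_value] * input_w for _ in range(input_h)]
    for r in range(h):
        frame[top + r][left:left + w] = image[r]
    return frame


small = [[1, 2],
         [3, 4]]
frame = fit_to_input(small, input_h=4, input_w=4)
```

The resulting frame has the model's input resolution, and the original pixels are copied without resampling.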
Embodiments of the present disclosure relate to differentiation of ray tracing of radio maps. Systems and methods are disclosed for using path replay backpropagation to efficiently compute a radio map. In an embodiment, an electric field of a propagating wave and its interaction with the environment is represented using the Stokes-Müller formalism. Instead of storing information needed for conventional backpropagation during the forward pass, in an embodiment, only the loss, loss gradients, and optionally information needed to retrace the paths that contribute to the loss are stored because replay backpropagation propagates gradients in a second forward pass. The loss gradients from the first forward pass are used during the second forward pass when paths are replayed to accumulate the loss gradients with additional gradients resulting from interactions with scattering surfaces.
Apparatuses, systems, and techniques to losslessly compress neural networks via semi-structured sparsity. In at least one embodiment, a weighted average of candidate masks for semi-structured sparsity is learned for each parameter block of a neural network, and a composite mask is determined by selecting candidate masks based on the learned weighted averages. In at least one embodiment, computational resources required for inference are reduced, thereby contributing to more sustainable and environmentally friendly AI applications.
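The mask selection described above can be illustrated with a small sketch. It assumes a 2:4 pattern (two of every four weights kept) and a softmax over per-block scores as the weighting; neither choice is stated in the abstract, and the training loop that would learn the scores is omitted.

```python
from itertools import combinations
import math

N, M = 2, 4  # keep N of every M weights; the 2:4 pattern is an assumed example

# All candidate masks for one block: every way to keep N of M positions.
CANDIDATES = [
    tuple(1 if i in keep else 0 for i in range(M))
    for keep in combinations(range(M), N)
]


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_mask(logits):
    """Weighted average of the candidate masks for one block."""
    weights = softmax(logits)
    return [sum(w * c[i] for w, c in zip(weights, CANDIDATES)) for i in range(M)]


def hard_mask(logits):
    """Select the candidate whose learned weight is largest."""
    best = max(range(len(CANDIDATES)), key=lambda k: logits[k])
    return CANDIDATES[best]


def composite_mask(block_logits):
    """Concatenate the selected candidate mask of every block."""
    return [bit for logits in block_logits for bit in hard_mask(logits)]
```

For example, `composite_mask([[5, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 5]])` keeps the first two weights of the first block and the last two of the second.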
Apparatuses, systems, and techniques to perform computational operations in response to one or more compute uniform device architecture (CUDA) programs. In at least one embodiment, one or more computational operations are to cause one or more other computational operations to wait until a portion of matrix multiply-accumulate (MMA) operations have been performed.
In various examples, a framework is or provides an end-to-end solution that includes multi-sensor capture, data processing, inferencing, synchronization, alignment, and 3D rendering for multi-modal perception fusion pipelines. A multi-modal perception fusion pipeline may include a mixer, an aligner, an inference environment, and a multi-view renderer. The mixer may merge sensor data from different data sources into a single HashMap frame. The aligner may use calibration data for sensor-to-sensor coordinate transformations. The inference environment may receive multi-modality data and use custom preprocessing and custom postprocessing to generate inference results. The renderer may generate different sensor data renderings. The framework may include an application that uses configuration data to generate or configure a custom multi-modal perception fusion pipeline. The inference environment may access inference models using a uniform inference interface and support remote inference, allowing the pipeline to become an API client of the inference models.
G06V 10/74 - Image or video pattern matching; Proximity measures in feature spaces
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
94.
VISIBILITY-BASED ENVIRONMENT IMPORTANCE SAMPLING FOR LIGHT TRANSPORT SIMULATION SYSTEMS AND APPLICATIONS
Systems and methods to implement a technique for determining an environment importance sampling function. An environment map may be provided where lighting information about the environment is known, but where certain pixels within a scene associated with the environment map are shaded. From these shaded pixels, rays may be drawn in random directions to determine whether the rays are occluded or can interact with the environment map, which provides an indication of a source of lighting that can be used for light transport simulations. A mask may be generated based on these occlusions and used to update the environment importance sampling function.
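The occlusion test and mask update described above might be sketched as follows. The environment map is reduced to a list of luminance values per direction bin, and `is_occluded` is a hypothetical callback standing in for a ray tracer; both are simplifying assumptions rather than details from the abstract.

```python
import random


def visibility_mask(points, num_bins, is_occluded, rays_per_point, rng):
    """Mark a direction bin visible if any random ray toward it is unoccluded."""
    mask = [0] * num_bins
    for point in points:
        for _ in range(rays_per_point):
            direction_bin = rng.randrange(num_bins)  # random direction, discretized
            if not is_occluded(point, direction_bin):
                mask[direction_bin] = 1
    return mask


def update_sampling_pdf(luminance, mask):
    """Restrict the luminance-based sampling function to visible bins."""
    weights = [lum * m for lum, m in zip(luminance, mask)]
    total = sum(weights)
    if total == 0:
        # Nothing visible carries light: fall back to uniform sampling.
        return [1.0 / len(luminance)] * len(luminance)
    return [w / total for w in weights]


# Toy scene: bins 2 and 3 of the environment map are always blocked.
rng = random.Random(0)
mask = visibility_mask(
    points=[(0.0, 0.0, 0.0)],
    num_bins=4,
    is_occluded=lambda point, direction_bin: direction_bin >= 2,
    rays_per_point=200,
    rng=rng,
)
pdf = update_sampling_pdf([1.0, 3.0, 4.0, 2.0], mask)
```

In the toy scene the brightest bin is occluded, so the updated function stops spending samples on it.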
In various examples, a three-dimensional (3D) data processing pipeline for autonomous systems and applications is presented. Systems and methods are disclosed for 3D point cloud data processing fused with video analysis applications. Using the systems and methods described herein, processing of 3D data may be performed in different multimedia frameworks, allowing a user to use common libraries and/or to implement custom libraries on top of the existing system design. As a result, conventional 2D video processing may be combined with 3D data processing, to allow for data representing a flat 2D world to represent a rich 3D world. In this way, the fused 3D depth and/or range data with 2D camera image data allows for perception and/or vision that is more powerful, accurate, and precise.
An integrated circuit includes a signal detection circuit comprising an activation circuit and coupled to a first receiver and a first transmitter. The signal detection circuit monitors bit transitions in data signals arriving, via the first receiver, over a channel from a second transmitter. The signal detection circuit detects that the second transmitter has exited an idle mode based on the bit transitions. Upon detecting that the second transmitter has exited the idle mode, the signal detection circuit activates the first receiver using the activation circuit.
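A software model of the detection logic described above, offered only as an illustration of a hardware circuit. The window length, the transition threshold, and the class name are assumptions; the abstract states only that exit from idle mode is detected from bit transitions.

```python
from collections import deque


class SignalDetector:
    """Counts bit transitions in a sliding window of received samples."""

    def __init__(self, window=8, threshold=3):
        self.samples = deque(maxlen=window)
        self.threshold = threshold
        self.receiver_active = False

    def observe(self, bit):
        """Record one sample and return whether the receiver is active."""
        self.samples.append(bit)
        recent = list(self.samples)
        transitions = sum(a != b for a, b in zip(recent, recent[1:]))
        if not self.receiver_active and transitions >= self.threshold:
            # Stands in for the activation circuit enabling the first receiver.
            self.receiver_active = True
        return self.receiver_active


detector = SignalDetector(window=8, threshold=3)
idle = [detector.observe(bit) for bit in [0, 0, 0, 0]]   # constant line: idle
waking = [detector.observe(bit) for bit in [1, 0, 1]]    # toggling: traffic resumes
```

A constant line produces no transitions, so the receiver stays off until the remote transmitter starts toggling.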
The invention relates to a method for controlling vehicles, including generating (804), based on sensor data, a first plan for controlling a vehicle, generating (812), using a trained visual language model, VLM, a final plan for controlling the vehicle based on the first plan and a second plan, and controlling (814) the vehicle based on the final plan.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Downloadable software; downloadable software libraries; downloadable software and software libraries for data compression and decompression; downloadable software and software libraries for data compression and decompression using graphics processing units (GPUs); downloadable software development tools for compression and decompression; downloadable software development tools for data compression and decompression using graphics processing units (GPUs)
Providing non-downloadable software; providing non-downloadable software libraries; providing non-downloadable software and software libraries for data compression and decompression; providing non-downloadable software and software libraries for data compression and decompression using graphics processing units (GPUs); providing non-downloadable software development tools for compression and decompression; providing non-downloadable software development tools for data compression and decompression using graphics processing units (GPUs); design and development of computer software
99.
VULNERABILITY REMEDIATION FOR DIGITAL CERTIFICATES
Systems and methods herein are for a host machine to include memory having instructions and at least one processor to execute instructions, which can cause the host machine to communicate with a verification server using a vulnerability request associated with a digital certificate, which can also cause the host machine to receive and parse a vulnerability response which includes a completion indicator, a status indicator, and an information or reference indicator associated with the status indicator, and which can cause the host machine to use the information or reference indicator to determine or perform a response to a vulnerability associated with the digital certificate.
G06F 21/57 - Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
G06F 21/33 - User authentication using certificates
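The response handling described in the vulnerability remediation abstract above can be sketched as follows. The JSON encoding, the field names (`complete`, `status`, `info`), and the action strings are hypothetical; the abstract names only a completion indicator, a status indicator, and an information or reference indicator.

```python
import json
from dataclasses import dataclass


@dataclass
class VulnerabilityResponse:
    complete: bool   # completion indicator
    status: str      # status indicator
    info: str        # information or reference indicator tied to the status


def parse_response(raw):
    """Parse a response from the verification server (assumed to be JSON)."""
    data = json.loads(raw)
    return VulnerabilityResponse(
        complete=bool(data["complete"]),
        status=data["status"],
        info=data.get("info", ""),
    )


def choose_action(response):
    """Decide what the host machine does about the certificate."""
    if not response.complete:
        return "retry-later"
    if response.status == "vulnerable":
        # The info field points at the remediation to apply.
        return f"remediate:{response.info}"
    return "none"


raw = '{"complete": true, "status": "vulnerable", "info": "rotate-key"}'
action = choose_action(parse_response(raw))
```

Keeping parsing separate from the decision makes it easy to swap in a different wire format.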
Apparatuses, systems, and techniques to select optimizations to be performed by compilers. In at least one embodiment, a processor includes one or more circuits to perform a compiler to select one or more optimizations to one or more first versions of a program based, at least in part, on a result of performing said one or more optimizations on one or more second versions of said program.