09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Computer hardware; computer hardware for controlling
integrated circuits, semiconductors and computer chipsets;
computer hardware for controlling graphics processing units
(GPUs); computer hardware for controlling quantum
processors; computer hardware for allowing quantum
processors to interface with supercomputing systems and
quantum computer hardware; integrated circuits,
semiconductors and computer chipsets; embedded processors
for computers; computer networking hardware; computer
hardware for communication among central processing units
(CPUs); computer hardware for enabling connections among
central processing units (CPUs), servers and data storage
devices; digital data processing equipment; digital data
conversion equipment; downloadable software; downloadable
software for controlling integrated circuits, semiconductors
and computer chipsets; downloadable software for controlling
graphics processing units (GPUs); downloadable software for
controlling quantum processors; downloadable software for
allowing quantum processors to interface with supercomputing
systems and quantum computer hardware; downloadable software
for data conversion.
Providing online non-downloadable software; providing online
non-downloadable software for controlling integrated
circuits, semiconductors and computer chipsets; providing
online non-downloadable software for controlling graphics
processing units (GPUs); providing online non-downloadable
software for controlling quantum processors; providing
online non-downloadable software for allowing quantum
processors to interface with supercomputing systems and
quantum computer hardware; providing online non-downloadable
software for data conversion; design and development of
computer hardware; design and development of computer
hardware for controlling integrated circuits, semiconductors
and computer chipsets; design and development of computer
hardware for controlling graphics processing units (GPUs);
design and development of computer hardware for controlling
quantum processors; design and development of computer
hardware for allowing quantum processors to interface with
supercomputing systems; design and development of integrated
circuits, semiconductors and computer chipsets; design and
development of embedded processors for computers; design and
development of computer networking hardware; design and
development of computer hardware for communication among
central processing units (CPUs); design and development of
computer hardware for enabling connections among central
processing units (CPUs), servers and data storage devices;
design and development of digital data processing equipment;
design and development of computer hardware for artificial
intelligence, machine learning, deep learning, natural
language generation, statistical learning, supervised
learning, un-supervised learning, data mining, predictive
analytics and business intelligence.
A power delivery system includes a first power connector including a first feature having a first configuration. The system further includes a second power connector including a second feature having a second configuration. The system further includes a first cable terminal configured to couple to the first power connector, the first cable terminal including a third feature configured to mate with the first feature and to not mate with the second feature. The system further includes a second cable terminal configured to couple to the second power connector, the second cable terminal including a fourth feature configured to mate with the second feature and to not mate with the first feature.
The disclosed method for generating a virtual object includes processing a language embedding associated with a natural language description of an object using a trained diffusion model to generate a first object geometry embedding, processing the first object geometry embedding using a trained decoder to generate an object surface representation, and converting the object surface representation into a first object geometry of the virtual object.
The disclosed method for training machine learning models for object generation includes performing, based on object data, one or more operations to train an untrained machine learning model to generate a trained machine learning model that comprises a trained encoder and a trained decoder, wherein the trained machine learning model is trained to generate an object surface representation, performing, based on the object data and natural language data, one or more operations to train an untrained diffusion model to generate a trained diffusion model, where the trained diffusion model is trained to generate an object geometry embedding, and where the trained diffusion model and the trained decoder are used to generate a virtual object based on natural language input.
Apparatuses, systems, and techniques for designing a data path circuit such as a parallel prefix circuit with reinforcement learning are described. A method can include receiving a first design state of a data path circuit, inputting the first design state of the data path circuit into a machine learning model, and performing reinforcement learning using the machine learning model to output a final design state of the data path circuit, wherein the final design state of the data path circuit has decreased area, power consumption and/or delay as compared to conventionally designed data path circuits.
Systems and methods are disclosed related to a 3D grounded video foundation model. A video generation method and system provide 3D conditioning information to a video diffusion model to improve generated video quality (object and temporal consistency) that is grounded in three dimensions (3D). The video generation method and system also enable precise camera control, cinematic effects, and scene editing. Video output corresponding to a set of camera specifications is generated for a scene from input image(s) including one or more images of a static scene or a sequence of images (video) for a dynamic scene. The input image(s) are used to calculate a 3D cache representing the scene. The 3D cache is rendered according to the set of camera specifications to produce a frame sequence and a mask sequence that identifies missing pixels in each frame. The frame sequence is encoded and masked to generate the output video.
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 21/431 - Generation of visual interfaces; Content or additional data rendering
7.
GENERATING ANIMATABLE THREE-DIMENSIONAL CHARACTERS USING COMPOSITIONAL MULTI-VIEW DIFFUSION
The disclosed method of training a machine learning model and a diffusion model includes generating, based on multi-camera video data, one or more first input views and one or more target views, the first input view(s) comprising a first input image of a first character and the first target view(s) comprising a first target image of the first character; and performing, based on the first input view(s) and the first target view(s), training operations to train an untrained diffusion model and an untrained machine learning model to generate a trained diffusion model and a trained machine learning model, the trained diffusion model being trained to generate one or more predicted target image latents and the trained machine learning model being trained to generate a global representation of the first character. An animatable representation of a second character is generated using the trained diffusion model and the trained machine learning model.
Apparatuses, systems, and techniques to detect errors in content recognized by neural networks. In at least one embodiment, content is recognized in input data along with descriptive information that describes the recognized content in order to evaluate the descriptive information to detect an error in the recognized content generated by one or more neural networks.
The disclosed method of generating an animatable representation of a character includes generating, using a trained diffusion model, one or more predicted target image latents and a diffusion timestep, generating, using a trained machine learning model and based on the diffusion timestep and the one or more predicted target image latents, a first global representation of the character at the diffusion timestep, determining, based on the first global representation of the character and the diffusion timestep, a second global representation of the character, and generating, based on the second global representation of the character, the animatable representation of the character.
Systems and methods for cooling a datacenter are disclosed. In at least one embodiment, a refrigerant-to-air (R2A) heat exchanger is interfaced with at least one cold plate to absorb heat from at least one computing device using a two-phase fluid and is interfaced with a compressor or condensing unit that causes dissipation of at least part of the heat within a datacenter.
Apparatuses, systems, and techniques to detect errors in content recognized by neural networks. In at least one embodiment, respective document transcriptions of one or more document images are generated using one or more neural networks. The respective document transcriptions may include document content and descriptive information of the document content. An error may be detected in the document content of at least one document transcription of the respective document transcriptions based, at least in part, on a syntax error identified in the descriptive information of the at least one document transcription.
Apparatuses, systems, and techniques to generate a document transcription of a document image. In at least one embodiment, one or more neural networks generate a document transcription of a document image according to a configurable combination of annotation types input to the one or more neural networks. The document transcription may include respective annotations of the annotation types for corresponding portions of content included in the document transcription.
G06F 40/169 - Annotation, e.g. comment data or footnotes
G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; Image or video pattern matching; Proximity measures in feature spaces using context analysis; Selection of dictionaries
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 30/416 - Extracting the logical structure, e.g. chapters, sections or page numbers; Identifying elements of the document, e.g. authors
One or more embodiments of the present disclosure relate to identifying reference portions corresponding to a bounding shape that corresponds to an object. Additionally, the reference portions may include a first reference edge, a second reference edge, and a reference where the first reference edge and the second reference edge intersect. In some embodiments, operations may further include obtaining a first state estimate corresponding to the object and receiving first sensor data corresponding to a first portion of the object, the first sensor data including a first position measurement. Further, operations may further include determining that the first position measurement corresponds to a first reference portion that is one of the reference portions corresponding to the bounding shape and determining a first expected position corresponding to the first portion based at least on the first reference portion. Embodiments may additionally include determining a second position estimate corresponding to the object.
G01S 13/72 - Radar-tracking systems; Analogous systems for two-dimensional tracking, e.g. combination of angle and range tracking, track-while-scan radar
G01S 7/41 - Details of systems according to groups , , of systems according to group using analysis of echo signal for target characterisation; Target signature; Target cross-section
G01S 13/58 - Velocity or trajectory determination systems; Sense-of-movement determination systems
G01S 13/931 - Radar or analogous systems, specially adapted for specific applications for anti-collision purposes of land vehicles
Neural network architectures and machine learning techniques that support tokenization of raw visual input to generate a compact representation in a latent feature space as well as de-tokenization to generate raw visual output. In at least one embodiment, tokenization systems and methods leverage wavelet transforms and causal operations to capture spatial and temporal dependencies in the raw visual input.
Apparatuses, systems, and techniques to execute software programs. In at least one embodiment, an application programming interface (API) is performed to cause one or more kernel attributes to be indicated to one or more users based, at least in part, on one or more user-provided identifiers of the one or more kernel attributes.
Apparatuses, systems, and techniques are presented to reconstruct one or more images. In at least one embodiment, one or more objects in an image are caused to be generated based, at least in part, on a motion of the one or more objects between two or more frames of the image.
G06T 3/40 - Scaling of whole images or parts thereof, e.g. expanding or contracting
G06N 3/04 - Architecture, e.g. interconnection topology
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
Apparatuses, systems, and techniques are presented to reduce noise in audio. In at least one embodiment, a sequence of neural networks is used to remove foreground and background noise from audio including a primary audio signal.
G10L 15/16 - Speech classification or search using artificial neural networks
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
G10L 25/84 - Detection of presence or absence of voice signals for discriminating voice from noise
Apparatuses, systems, and techniques to detect memory errors and isolate or migrate partitions on a parallel processing unit using an application programming interface to facilitate parallel computing, such as CUDA. In at least one embodiment, interrupts are intercepted and processed on a graphics processing unit indicating a memory error for one or more partitions, and a policy is applied to isolate that memory error from other partitions.
G06F 11/20 - Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining
19.
EFFICIENT FILTERING FOR LIGHT TRANSPORT SIMULATION
In examples, threads of a schedulable unit (e.g., a warp or wavefront) of a parallel processor may be used to sample visibility of pixels with respect to one or more light sources. The threads may receive the results of the sampling performed by other threads in the schedulable unit to compute a value that indicates whether a region corresponds to a penumbra (e.g., using a wave intrinsic function). Each thread may correspond to a respective pixel and the region may correspond to the pixels of the schedulable unit. A frame may be divided into the regions with each region corresponding to a respective schedulable unit. In denoising ray-traced shadow information, the values for the regions may be used to avoid applying a denoising filter to pixels of regions that are outside of a penumbra while applying the denoising filter to pixels of regions that are within a penumbra.
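The per-region penumbra test described above can be illustrated on the CPU with a minimal sketch. It assumes that a region counts as penumbra when the visibility samples of its threads disagree, which is one plausible reading of the abstract and not a confirmed detail. The names `WARP_SIZE`, `region_in_penumbra`, `denoise_shadows` and `box_blur` are hypothetical, and the plain Python sum stands in for a wave intrinsic.

```python
WARP_SIZE = 4  # hypothetical region size; real schedulable units are typically larger


def region_in_penumbra(visibility):
    """Stand-in for a wave-wide vote: penumbra when samples are mixed lit/shadowed."""
    lit = sum(visibility)
    return 0 < lit < len(visibility)


def box_blur(region):
    """Toy denoising filter: replace every pixel with the region mean."""
    mean = sum(region) / len(region)
    return [mean] * len(region)


def denoise_shadows(visibility, blur):
    """visibility: flat list of 0/1 samples, one per pixel, grouped by region."""
    out = []
    for start in range(0, len(visibility), WARP_SIZE):
        region = visibility[start:start + WARP_SIZE]
        if region_in_penumbra(region):
            out.extend(blur(region))  # filter only where shadow edges occur
        else:
            out.extend(float(v) for v in region)  # fully lit or fully shadowed: skip
    return out


# Three regions: fully lit, mixed (penumbra), fully shadowed.
samples = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
result = denoise_shadows(samples, box_blur)
print(result)
```

Only the middle region is filtered in this example; the other two pass through unchanged, which is where the saving in filtering work would come from.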
Apparatuses, systems, and techniques to perform substantiation of task pipelines for sequential tasks performed by simultaneous, sequential kernels. In at least one embodiment, processors comprise one or more circuits to cause a compiler to indicate one or more portions of one or more software programs to be performed by one or more processors concurrently.
Stochastic texture filtering introduces randomness into texel sampling and/or filtering. Instead of computing a closest texel for the texture coordinates, randomness is introduced by stochastic sampling to obtain one texel. Stochastic sampling is also applied for filtering the texels when multiple samples are used and/or to perform temporal filtering. A first technique is used for discrete filters and filter-specific sample weights are generated. In contrast with conventional techniques, the sample weights are not applied directly to the single texel value. The single texel is randomly selected for each pixel, with probability proportional to an associated sample weight. A second technique is used for continuous filters and weights are not generated. Instead, the texture coordinates are perturbed with a random offset, which is drawn from a filter-specific probability distribution. Stochastic texture filtering improves the performance of texture filtering in terms of speed and quality and is compatible with image reconstruction techniques.
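The first technique (discrete filters) can be sketched as follows, assuming a bilinear filter as the example. The function names, the tiny texture, and the seed are illustrative assumptions, and the sketch does no clamping or wrapping at texture borders. It picks one texel per lookup with probability proportional to its bilinear weight, so the average of many lookups approaches the conventionally filtered value.

```python
import math
import random


def bilinear_taps(u, v):
    """Return the four texel coordinates around (u, v) with their bilinear weights."""
    x0, y0 = math.floor(u), math.floor(v)
    fx, fy = u - x0, v - y0
    return [
        ((x0, y0), (1 - fx) * (1 - fy)),
        ((x0 + 1, y0), fx * (1 - fy)),
        ((x0, y0 + 1), (1 - fx) * fy),
        ((x0 + 1, y0 + 1), fx * fy),
    ]


def filtered_lookup(texture, u, v):
    """Conventional filtering: weights are applied directly to the texel values."""
    return sum(w * texture[y][x] for (x, y), w in bilinear_taps(u, v))


def stochastic_lookup(texture, u, v, rng):
    """Stochastic filtering: fetch a single texel chosen with probability ~ weight."""
    taps = bilinear_taps(u, v)
    coords = [c for c, _ in taps]
    weights = [w for _, w in taps]
    x, y = rng.choices(coords, weights=weights, k=1)[0]
    return texture[y][x]


texture = [[0.0, 1.0], [2.0, 3.0]]  # indexed as texture[y][x]
u, v = 0.25, 0.5
rng = random.Random(0)  # fixed seed so the run is repeatable

exact = filtered_lookup(texture, u, v)
n = 20000
estimate = sum(stochastic_lookup(texture, u, v, rng) for _ in range(n)) / n
print(exact, estimate)
```

Each stochastic lookup reads one texel instead of four; the noise this introduces is what a later reconstruction or temporal filtering pass would be expected to average out.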
Disclosed are systems and techniques for three-dimensional (3D) visualization of datacenter entities, connections, and metrics. The techniques include receiving datacenter state information representing a plurality of entities, one or more connections between the plurality of entities, and one or more entity properties for at least a first entity of the plurality of entities. The techniques further include generating a first view of a three-dimensional (3D) visualization of the datacenter state information. The 3D visualization of the datacenter includes at least first visual elements representing a first subset of the plurality of entities, second visual elements representing a second subset of the plurality of entities, and a third visual element representing a first connection of the one or more connections. A spatial position of at least a first visual element of the first visual elements is determined based on the one or more entity properties.
The disclosed method for training a first machine learning model includes generating, based on training data, first output data using a first teacher machine learning model included in one or more teacher machine learning models, generating, based on the training data, second output data using the first machine learning model, wherein the first machine learning model comprises a second machine learning model and one or more low-rank adaptation (LoRA) towers, calculating, based on the first output data and the second output data, a loss, generating, based on the loss, one or more gradients, generating, based on the one or more gradients, one or more LoRA tower ranks, and updating, based on the loss and the one or more LoRA tower ranks, one or more parameters of the one or more LoRA towers.
Systems and methods are disclosed related to a 3D grounded video foundation model. A video generation method and system provide 3D conditioning information to a video diffusion model to improve generated video quality (object and temporal consistency) that is grounded in three dimensions (3D). The video generation method and system also enable precise camera control, cinematic effects, and scene editing. Video output corresponding to a set of camera specifications is generated for a scene from input image(s) including one or more images of a static scene or a sequence of images (video) for a dynamic scene. The input image(s) are used to calculate a 3D cache representing the scene. The 3D cache is rendered according to the set of camera specifications to produce a frame sequence and a mask sequence that identifies missing pixels in each frame. The frame sequence is encoded and masked to generate the output video.
The disclosed method of generating an animatable representation of a character includes generating, based on a global representation of the character, one or more local views, generating, based on the global representation of the character and the one or more local views, one or more local ray maps, generating, using a trained diffusion model and a trained machine learning model and based on the one or more local views and the one or more local ray maps, one or more multi-part local views, and generating, based on the global representation of the character and the one or more multi-part local views, a refined representation of the character.
In various examples, action models for interactive applications and systems are described herein. Systems and methods are disclosed that generate a training dataset using data from one or more sources, such as application services and/or content sharing services. As described herein, the training dataset may include videos, input information (e.g., actions taken), textual information, and/or any other type of information that is retrieved and/or generated using one or more processing pipelines. Systems and methods are also disclosed that use the training dataset to train one or more machine learning models—such as one or more vision-language-action (VLA) models—to perform one or more tasks. For example, after training, the VLA model(s) may process input data associated with an application, such as video frames, received inputs and/or actions, and/or previous instructions, and predict at least additional instructions to perform with regard to the application.
Apparatuses, systems, and techniques to access one or more non-uniform memory access (NUMA) nodes. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to cause one or more NUMA nodes or one or more physical addresses allocated to one or more graphics processing units (GPUs) to be accessed based, at least in part, on one or more indications within the API.
Managing memory when processing a large language model (LLM) using a multi-turn interaction framework can be difficult as the LLM can produce significantly more key-value (KV) pairs than can be stored in a processor's memory. The multi-turn framework allows the LLM to process information more efficiently using the KV pairs. The KV pairs can be cached, such as in a KV cache. Policies can be used to identify KV pairs that should remain in the cache, KV pairs that can be moved to a more distant cache, or KV pairs that can be discarded. These policies can assist in managing the memory so the most valuable KV pairs for LLM processing efficiency remain in the processor's local cache memory. More distant cache can be memory locations outside of the processor, or in memory stacks connected via a communication bus.
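A minimal sketch of such a policy follows, under loudly stated assumptions: each KV pair carries a single numeric importance score, the lowest-scoring pair is demoted first, and both tiers have fixed capacities. The class `TieredKVCache`, its methods, and the scoring scheme are hypothetical and are not taken from the source.

```python
class TieredKVCache:
    """Toy two-tier cache: a small local tier, a distant tier, then discard."""

    def __init__(self, local_capacity, distant_capacity):
        self.local_capacity = local_capacity
        self.distant_capacity = distant_capacity
        self.local = {}    # key -> (value, score)
        self.distant = {}  # key -> (value, score)

    def put(self, key, value, score):
        """Insert a KV pair into the local tier, then enforce the policy."""
        self.distant.pop(key, None)  # avoid holding the same key in both tiers
        self.local[key] = (value, score)
        self._rebalance()

    def _rebalance(self):
        # Demote the least valuable local entries to the distant tier.
        while len(self.local) > self.local_capacity:
            key = min(self.local, key=lambda k: self.local[k][1])
            self.distant[key] = self.local.pop(key)
        # Discard the least valuable distant entries when that tier overflows.
        while len(self.distant) > self.distant_capacity:
            key = min(self.distant, key=lambda k: self.distant[k][1])
            del self.distant[key]

    def tier_of(self, key):
        if key in self.local:
            return "local"
        if key in self.distant:
            return "distant"
        return "discarded"


cache = TieredKVCache(local_capacity=2, distant_capacity=1)
for token, score in [("t0", 0.9), ("t1", 0.1), ("t2", 0.5), ("t3", 0.3)]:
    cache.put(token, value=f"kv({token})", score=score)

tiers = {t: cache.tier_of(t) for t in ("t0", "t1", "t2", "t3")}
print(tiers)
```

In this run the two highest-scoring pairs stay local, the next one moves to the distant tier, and the lowest is dropped. A real policy would likely derive the score from usage signals rather than a fixed number.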
Apparatuses, systems, and techniques to facilitate memory management. In at least one embodiment, an application programming interface is performed to enable access to shared virtual memory by a plurality of processors.
Apparatuses, systems, and techniques to generate a surface. In at least one embodiment, one or more neural networks are used to generate a surface of an object based, at least in part, on motion of the object.
Apparatuses, systems, and techniques to optimize processor performance. In at least one embodiment, a method causes a maximum operating voltage (Vmax) of one or more processors to be dynamically adjusted based, at least in part, on one or more indications of processor usage.
Apparatuses, systems, and techniques to perform graph nodes. In at least one embodiment, a processor comprises one or more circuits to perform an API to cause one or more first graph nodes to be performed independently with respect to two or more second graph nodes, which have a dependency relationship with respect to each other.
Apparatuses, systems, and techniques to perform channel estimation. In at least one embodiment, a processor includes one or more circuits to perform channel estimation corresponding to one or more wireless signals without using a reference signal.
Apparatuses, systems, and techniques to execute one or more application programming interface (API) functions to facilitate parallel computing. In at least one embodiment, one or more APIs are to indicate one or more storage locations using various novel techniques described herein.
Disclosed are apparatuses, systems, and techniques for a multimodal interaction system for digital humans with real-time engagement and pose analysis, which receive a video stream comprising a plurality of frames depicting at least a portion of a user, wherein the video stream is associated with an interaction of the user with an avatar; determine, for at least one frame of the plurality of frames, a pose orientation corresponding to at least one of one or more body landmarks of the user represented in the corresponding frame; determine, based on at least one of a series of pose orientations corresponding to the plurality of frames, an engagement metric of the user; and cause a representation of the avatar performing an action based on the engagement metric to be generated.
Approaches are presented for training an inverse graphics network. An image synthesis network can generate training data for an inverse graphics network. In turn, the inverse graphics network can teach the synthesis network about the physical three-dimensional (3D) controls. Such an approach can provide for accurate 3D reconstruction of objects from 2D images using the trained inverse graphics network, while requiring little annotation of the provided training data. Such an approach can extract and disentangle 3D knowledge learned by generative models by utilizing differentiable renderers, enabling a disentangled generative model to function as a controllable 3D “neural renderer,” complementing traditional graphics renderers.
Approaches in accordance with various illustrative embodiments provide for the generation of synthetic communications for use in training and fine-tuning threat detection models for various categories of recipients. In at least one embodiment, guidelines can be determined for a category of recipient that can be used to generate multiple types of content using generative artificial intelligence (AI), as may include text, image, and file content. A training communication can be generated using these types of content, such as to generate an email message that corresponds to a potential spear phishing attack. The generated messages can be checked for quality, and any messages that are caught by existing filters can be deleted or regenerated so that only high quality examples of spear phishing are provided as output. These training communications can be used to train a spear phishing detector for a specific category of recipient, in order to accurately flag and prevent access to actual spear phishing communications.
Systems and methods to support on-demand deployment of pre-configured containers are disclosed. Exemplary implementations may store information electronically, including a particular artificial intelligence (AI) model and corresponding installation information; effectuate a presentation to a user, through a user interface, of a selectable user interface element, wherein the selectable user interface element is associated with the particular artificial intelligence model; responsive to the user selecting the selectable user interface element, provision a particular server that includes a particular Graphics Processing Unit (GPU), launch a container instance on the particular server such that the user has access to the particular GPU, install software in the container instance in accordance with the corresponding installation information, and install the particular AI model in the container instance; and/or perform other actions.
In various examples, feature detection models for autonomous and/or semi-autonomous systems and applications are described herein. Systems and methods described herein may use one or more trained machine learning models to automatically generate representations of traffic features corresponding to a map, such as road markings and/or road edges. For instance, the model(s) may take, as input, an image representing at least a portion of a map that includes one or more traffic features along with one or more indications of one or more points associated with the traffic feature(s) as represented by the image. Based at least on processing the inputs, the model(s) may generate and/or output data representing additional points associated with the traffic feature(s) and/or a heatmap representing one or more lines representing the traffic feature(s). This output data may then be used to determine the representation(s) of the traffic feature(s) for annotating the map.
Approaches presented herein provide for receiving liquid coolant from external sources to cold plates of a server or other liquid-cooled computer system. In at least one embodiment, initial standalone manifolds of the server can be forgone or bypassed, with the flow of liquid coolant received by the server at the cold plates. Some components of the server can be provided a source of cooling from the cold plates and other components can be provided the flow of liquid coolant distributed from the cold plates. The flow of liquid coolant can be provided from the cold plates to some of the components as a source of cooling, as well as to a separate manifold to be further distributed. The cold plates can be connected together to provide the appropriate flow of liquid coolant for the server.
Approaches presented herein provide for generation of alternate views from disparity data captured for one or more objects in a scene. The generation can be performed using an embedded processor with DMA memory access, or other limited capacity hardware. An intermediate representation can be generated that is a 2D histogram view of the disparity data. This intermediate representation can be transformed, using the embedded processor, to an alternate view image, such as a bird's eye view image. Morphological or similar filtering can be performed on the one or more objects in the intermediate representation using the same size filter, regardless of distance from a camera plane used to capture the disparity data.
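One way to picture the intermediate representation is as a column-versus-disparity histogram that is then mapped to ground-plane coordinates. The sketch below assumes integer disparities, a pinhole stereo model (depth = focal length × baseline / disparity), and a support threshold `min_count`; all function names and camera parameters are hypothetical, and the morphological filtering step is omitted.

```python
def u_disparity_histogram(disparity, max_disparity):
    """hist[d][u] counts pixels in image column u whose disparity equals d."""
    width = len(disparity[0])
    hist = [[0] * width for _ in range(max_disparity + 1)]
    for row in disparity:
        for u, d in enumerate(row):
            if 0 < d <= max_disparity:  # 0 marks an invalid disparity in this sketch
                hist[d][u] += 1
    return hist


def to_birds_eye(hist, focal_px, baseline_m, cx, min_count):
    """Return (lateral X, depth Z) in metres for well-supported histogram cells."""
    points = []
    for d, row in enumerate(hist):
        if d == 0:
            continue
        depth = focal_px * baseline_m / d  # pinhole stereo: Z = f * B / d
        for u, count in enumerate(row):
            if count >= min_count:
                lateral = (u - cx) * depth / focal_px  # X = (u - cx) * Z / f
                points.append((lateral, depth))
    return points


# Tiny 3x4 disparity image: a near object in columns 1-2, a farther one in column 3.
disparity = [
    [0, 4, 4, 0],
    [0, 4, 4, 2],
    [0, 4, 4, 2],
]
hist = u_disparity_histogram(disparity, max_disparity=4)
bev = to_birds_eye(hist, focal_px=100.0, baseline_m=0.5, cx=2.0, min_count=2)
print(bev)
```

Because the histogram is indexed by disparity instead of metric depth, a fixed-size filter applied to it covers a comparable image footprint at every distance, which is consistent with the same-size-filter remark in the abstract.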
H04N 13/111 - Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining
G06T 5/40 - Image enhancement or restoration using histogram techniques
G06T 7/285 - Analysis of motion using a sequence of stereo image pairs
G06T 7/593 - Depth or shape recovery from multiple images from stereo images
G06T 7/66 - Analysis of geometric attributes of image moments or centre of gravity
H04N 13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
42.
GENERATING ALTERNATIVE IMAGE VIEWS FROM STEREO DISPARITY DATA
Approaches presented herein provide for generation of alternate views from disparity data captured for one or more objects in a scene. The generation can be performed using an embedded processor with DMA memory access, or other limited capacity hardware. An intermediate representation can be generated that is a 2D histogram view of the disparity data. This intermediate representation can be transformed, using the embedded processor, to an alternate view image, such as a bird's eye view image. Morphological or similar filtering can be performed on the one or more objects in the intermediate representation using the same size filter, regardless of distance from a camera plane used to capture the disparity data.
H04N 13/117 - Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation the virtual viewpoint locations being selected by the viewers or determined by viewer tracking
G06T 5/20 - Image enhancement or restoration using local operators
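The intermediate representation described in the abstract above can be illustrated with a small sketch. This is not the patented implementation: it assumes the "2D histogram view" is a column-versus-disparity count (often called a u-disparity map) and that the bird's eye view follows from the standard pinhole stereo relations z = f * B / d and x = (u - cx) * z / f. The function names and the parameters `focal_px`, `baseline_m`, `cx`, and `min_count` are hypothetical.

```python
def u_disparity_histogram(disparity, max_disparity):
    """Count, for each image column u, how many pixels have each integer disparity d.

    `disparity` is a list of rows; the result has max_disparity + 1 rows and
    one column per image column. Assumption: disparity 0 means "no measurement".
    """
    height = len(disparity)
    width = len(disparity[0]) if height else 0
    hist = [[0] * width for _ in range(max_disparity + 1)]
    for row in disparity:
        for u, value in enumerate(row):
            d = int(round(value))
            if 0 < d <= max_disparity:
                hist[d][u] += 1
    return hist


def histogram_to_birds_eye(hist, focal_px, baseline_m, cx, min_count=1):
    """Map occupied histogram cells to (x, z) points on the ground plane.

    Uses z = f * B / d and x = (u - cx) * z / f. Row 0 is skipped because
    a disparity of zero has no finite depth.
    """
    points = []
    for d in range(1, len(hist)):
        for u, count in enumerate(hist[d]):
            if count >= min_count:
                z = focal_px * baseline_m / d
                x = (u - cx) * z / focal_px
                points.append((x, z))
    return points


# A 4x6 disparity image with one vertical object in column 2 at disparity 8.
disparity_image = [[0, 0, 8, 0, 0, 0] for _ in range(4)]
histogram = u_disparity_histogram(disparity_image, max_disparity=16)
top_down = histogram_to_birds_eye(histogram, focal_px=800.0, baseline_m=0.1, cx=3.0)
```

Because the histogram only needs one pass over the image and a small fixed-size accumulator, this style of computation suits hardware with limited memory, which is the setting the abstract describes.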
In various examples, the systems and methods of the present disclosure may train and use machine learning models to determine attributes and, in some instances, classifications associated with traffic lights to determine traffic rules for operating a machine (e.g., an autonomous or semi-autonomous machine or vehicle) in an environment. For instance, an image depicting a traffic light device may be applied to a machine learning model that includes a plurality of component heads. Each one of the component heads may be trained to detect different attributes and/or combinations of attributes associated with the traffic light device. Additionally, in some examples, the machine learning model may include a fusion head that is trained to classify the traffic light device. For instance, the fusion head may classify the traffic light device using the detected attributes and/or using a combined feature vector of multiple feature vectors applied to the plurality of component heads.
G08G 1/097 - Supervising of traffic control systems, e.g. by giving an alarm if two crossing streets have green light simultaneously
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
44.
SENSOR CALIBRATION USING PROJECTED TARGETING FOR VEHICLE OCCUPANT MONITORING
In various examples, systems and methods are provided for sensor calibration using projected targeting for vehicle occupant monitoring. A target projector may be used to cause a projection of a target to appear at predefined points on boundaries of the gaze regions. Region mapping data that includes 3D coordinates of the predefined points on the boundaries of the gaze regions is generated in the coordinate system of the target projector by pointing the target projector at each of the predefined points on the boundaries of the gaze regions. One or more sensors may be calibrated based at least on a transformation of the region mapping data from the coordinate system of the target projector to a coordinate system of the one or more sensors.
Various examples, systems, and methods are disclosed relating to domain-specific document retrieval that incorporates custom vocabulary integration and embedding model updates. A computing system can extract multiple segments from a collection of documents and generate queries that correspond to at least one segment. The computing system can identify terms that satisfy a uniqueness criterion and input the terms into a tokenizer to create a vocabulary dataset. The vocabulary dataset, the document segments, and the queries can be used to update an embedding model to support retrieval and semantic alignment within private documents.
Systems and methods are disclosed that generate blob video representations such as blob video parameters and blob video descriptions and use the blob video representations to generate videos. For example, embodiments of the present disclosure may decompose videos into visual primitives (e.g., blob video representations, which may be general representations for controllable video generation). Based on the blob video representations, a blob-grounded text-to-video diffusion model that includes masked three-dimensional (3D) self-attention layers and/or masked spatial cross-attention layers may be developed. The masked 3D self-attention layers and/or masked spatial cross-attention layers may effectively improve regional consistency across frames. Additionally, and/or alternatively, embodiments of the present disclosure may utilize context interpolation that may interpolate text embeddings. Additionally, and/or alternatively, the blob-grounded text-to-video diffusion model may be model-agnostic and may include and/or be associated with a U-Net and/or a diffusion transformer.
One embodiment of a method for modifying program code includes processing, using a first trained language model, program code to identify one or more modifications to the program code; processing, using a second trained language model, the program code and the one or more modifications to generate a plan for applying the one or more modifications; and processing, using a third trained language model, the program code and the plan to generate a modified program code.
Apparatuses, systems, and techniques to cause one or more directions of travel to be indicated to a user in order to improve wireless signal strength. In at least one embodiment, one or more directions of travel are indicated to a device in order to improve wireless signal strength, based on, for example, wireless signal strength values obtained by said device at one or more locations.
Apparatuses, systems, and methods cause one or more neural networks to generate one or more representations of one or more integrated circuits. In at least one embodiment, a processor comprises one or more circuits to use one or more neural networks to generate one or more representations of one or more integrated circuits to perform one or more instructions, wherein the one or more representations are based, at least in part, on one or more different instruction operands and one or more representations resulting from the one or more different instruction operands.
In various examples, a technique for managing data uploads from location-aware systems includes determining a set of attributes associated with a set of data uploaded using a set of location-aware systems in a geographic region. The technique also includes computing a set of upload control parameters for the geographic region based at least on the set of attributes. The technique further includes receiving, from a location-aware system, a request indicating the geographic region. The technique additionally includes sending, to the location-aware system in response to the request, the set of upload control parameters within one or more control layers included in map data for the geographic region, wherein the location-aware system controls upload of additional data associated with the geographic region based at least on the one or more control layers.
Embodiments of the present disclosure relate to a method of encrypting a secret storage structure. The method may include storing a secret in a secret storage structure. The secret storage structure may be encrypted by encrypting the secret using a wrap key that is generated based at least on a hardware-based root key and a first context. The secret storage structure may additionally be encrypted by encrypting the secret storage structure using an authentication key that is generated based at least on the hardware-based root key and a second context.
G06F 21/78 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data
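The two-key arrangement in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the hardware-based root key is replaced by a software constant, HMAC-SHA256 stands in for whatever key derivation function the disclosure actually uses, and the context labels are invented. The sketch covers key derivation only and performs no encryption.

```python
import hashlib
import hmac


def derive_key(root_key: bytes, context: bytes, length: int = 32) -> bytes:
    """Derive a context-bound key from a root key using HMAC-SHA256.

    Assumption: a single HMAC invocation is enough, so length is capped at
    the 32-byte digest size.
    """
    if not 0 < length <= 32:
        raise ValueError("length must be between 1 and 32 bytes")
    return hmac.new(root_key, context, hashlib.sha256).digest()[:length]


# Stand-in for a hardware-based root key; real hardware would never expose it.
ROOT_KEY = bytes(range(32))

# Hypothetical context labels: one for wrapping the secret, one for the structure.
wrap_key = derive_key(ROOT_KEY, b"context-1/wrap")
auth_key = derive_key(ROOT_KEY, b"context-2/authenticate")
```

Binding each key to its own context means that the same root key yields unrelated keys for the two roles, so exposure of one derived key does not reveal the other.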
Some embodiments described herein provide intelligent movable racks for a data center and a central system for monitoring and directing the positioning of such racks within the data center. For example, a rack may include computing equipment as well as a power system, a cooling system, and a cabling system (e.g., for data communication). The rack may include a controller in communication with the computing equipment, the power system, the cooling system, and the cabling system. The rack may also include a rack interface for physically supporting the rack and operatively connecting the systems of the rack to power, cooling, and cabling infrastructure of the data center. The rack interface may receive an autonomous robot for moving the rack within the data center. The controller may control the power system and the cooling system based in part on the autonomous movement of the rack.
In various examples, systems and methods are disclosed that use one or more machine learning models (MLMs), such as deep neural networks (DNNs), to compute outputs indicative of an estimated visibility distance corresponding to sensor data generated using one or more sensors of an autonomous or semi-autonomous machine. Once the visibility distance is computed using the one or more MLMs, a determination of the usability of the sensor data for one or more downstream tasks of the machine may be evaluated. As such, where an estimated visibility distance is low, the corresponding sensor data may be relied upon for fewer tasks than when the visibility distance is high.
B60W 30/095 - Predicting travel path or likelihood of collision
B60W 30/09 - Taking automatic action to avoid collision, e.g. braking and steering
B60W 40/02 - Estimation or calculation of driving parameters for road vehicle drive control systems not related to the control of a particular sub-unit related to ambient conditions
G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
An optical apparatus, with an optical interconnect, the optical interconnect including a first optical transceiver having a first notch filter, the first notch filter including first and second optical add drop multiplexer demultiplexers connected to receive a continuous wave light beam and send first and second filtered wavelengths to first and second resonant modulators, which send first and second modulated optical signals through a light propagation path. The second filtered wavelength is different from the first filtered wavelength, and the second modulated optical signal has a polarity that is orthogonal to a polarity of the first modulated optical signal. Methods of communicating using the apparatus and an optical filter for use in an optical transceiver are also
H04B 10/80 - Optical aspects relating to the use of optical transmission for specific applications, not provided for in groups , e.g. optical power feeding or optical transmission through water
H04J 14/02 - Wavelength-division multiplex systems
55.
FLEXIBLE PRINTED CIRCUIT BOARD FOR SMALL BENDING-RADIUS APPLICATIONS
According to various embodiments, a flexible printed circuit board includes: a first flexible dielectric layer that includes reinforcing fibers; a second flexible dielectric layer that includes no reinforcing fibers; and a first conductive layer that is disposed between the first dielectric layer and the second dielectric layer and contacts the first dielectric layer and the second dielectric layer.
In various examples, a vision language model may be prompted to select a supported environment visualization pipeline (e.g., a bowl visualization pipeline that models the surrounding environment as a 3D bowl, or a surface topology visualization pipeline that models the surrounding environment as a detected 3D surface topology), one or more parameters of a supported environment visualization pipeline (e.g., for a bowl visualization pipeline, a parametrization of the shape of the 3D bowl model, stitching parameters such as seam placement, blend width, or blend area, etc.), and/or a rendering viewport (e.g., a virtual camera position and orientation). As such, the selected and/or configured technique may be used to visualize an environment around an ego-machine, such as a vehicle, robot, and/or other type of object, in systems such as parking visualization systems, Surround View Systems, and/or others.
Systems and methods herein are for dual purpose cooling for a computer module that may include a device cooling loop which may be configured to cool computing features of the computer module. The systems and methods herein may include an interconnect cooling loop, provided together with the device cooling loop, where the interconnect cooling loop may be configured to reduce, by at least a predetermined threshold, electrical resistance of interconnect features of the computer module.
A method receives a batch of one or more first job requests to be performed by a high-performance computing (HPC) cluster. The batch of first job requests is received from a container orchestration platform. The batch of one or more first job requests is translated into one or more second job requests. The second job requests are interpretable by a scheduler corresponding to the HPC cluster. The second job requests are sent to the scheduler.
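The translation step can be pictured with a minimal sketch. The abstract names neither the orchestration platform nor the scheduler, so the following assumes a simplified dictionary as the incoming request and Slurm's `sbatch` script format as the target. The field names (`name`, `nodes`, `cpus`, `time_limit`, `command`) are hypothetical.

```python
import shlex


def translate_job(request: dict) -> str:
    """Render one orchestration-style job request as an sbatch script.

    Assumption: the scheduler is Slurm, and missing resource fields fall
    back to small illustrative defaults.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={request['name']}",
        f"#SBATCH --nodes={request.get('nodes', 1)}",
        f"#SBATCH --cpus-per-task={request.get('cpus', 1)}",
        f"#SBATCH --time={request.get('time_limit', '01:00:00')}",
        # Quote each argument so the command survives shell parsing.
        " ".join(shlex.quote(arg) for arg in request["command"]),
    ]
    return "\n".join(lines) + "\n"


def translate_batch(requests):
    """Translate a batch of first job requests into scheduler-ready scripts."""
    return [translate_job(request) for request in requests]


scripts = translate_batch([
    {"name": "train", "nodes": 2, "command": ["python", "train.py", "--epochs", "3"]},
])
```

Sending each script to the scheduler (for example, by piping it to `sbatch`) is left out, since that step depends on the cluster.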
Approaches presented herein may be used to generate captions using raw caption information. Raw caption information may be used, with an associated image, to generate a detailed image caption. Object lists may then be generated from the image and/or the detailed image caption to produce an image including bounding box proposals for objects within the image. One or more trained machine learning systems may then be used to generate region of interest captions that infuse the global caption context associated with the raw caption information.
Apparatuses, systems, and methods cause one or more neural networks to generate one or more representations of one or more integrated circuits. In at least one embodiment, a processor comprises one or more circuits to use one or more neural networks to generate one or more representations of one or more integrated circuits to perform one or more instructions, wherein the one or more representations are based, at least in part, on one or more different instruction operands and one or more representations resulting from the one or more different instruction operands.
Apparatuses, systems, and techniques are presented to perform segmentation on images. In at least one embodiment, one or more neural networks are used to segment an image based, at least in part, on one or more visual modifications of the image.
G06V 10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
G06V 10/40 - Extraction of image or video features
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
09 - Scientific and electric apparatus and instruments
Goods & Services
Artificial intelligence supercomputers; high performance computers and computer hardware for artificial intelligence, machine learning, deep learning, natural language generation, statistical learning, supervised learning, un-supervised learning, data mining, predictive analytics and business intelligence; high performance computers and computer hardware with specialized features for software development; high performance computers and computer hardware with specialized features for developing, testing, and validating artificial intelligence models and software applications; high performance computers and computer hardware with specialized features for data analytics, data management, data integration, data processing, and data visualization; high performance computer hardware with specialized features for development of edge applications; high performance computers and computer hardware with specialized features for development of robotics, smart cities, and computer vision solutions; computers; computer hardware; downloadable software; recorded computer software; integrated circuit components for graphics and video systems, namely, multimedia accelerators, graphic accelerators and peripheral units; computer software for operating and managing the aforementioned integrated circuit components; computer software for the display of digital media; computer software for management, storage and network management of digital media, and enhancement of graphical and video display; computer servers; computer network servers; servers
63.
TECHNIQUES FOR COMPILER LOWERING OF TASK-BASED PROGRAMS TO ASYNCHRONOUS ACCELERATORS
A computer-implemented technique for compiling program code includes receiving first program code, generating a first dependence graph based on the first program code, removing one or more parallel loops in the first dependence graph to generate a second dependence graph, removing one or more copy operations from the second dependence graph to generate a third dependence graph, allocating memory to one or more data objects in the third dependence graph, assigning one or more sub-graphs of the third dependence graph to one or more corresponding warps, and generating second program code based on the one or more sub-graphs of the third dependence graph.
One embodiment of a method for training a robot grasp diffusion model includes performing, based on grasp data that includes one or more first robot grasp poses, one or more operations to train an untrained diffusion model to generate a trained diffusion model; generating, using the trained diffusion model, one or more second robot grasp poses; simulating the one or more second robot grasp poses to generate one or more labels indicating if the one or more second robot grasp poses are successful robot grasp poses; and performing, based on the one or more second robot grasp poses and the one or more labels, one or more operations to train an untrained machine learning model to generate a trained machine learning model.
Apparatuses, systems, and techniques including APIs to enable one or more fifth generation new radio (5G-NR) network components to write, read, send, transmit, load, or otherwise obtain packaging, synchronization, and/or management information. For example, a processor comprising one or more circuits to perform an application programming interface (API) to cause fifth generation new radio (5G-NR) packaging, synchronization, or management information to be indicated to one or more accelerators.
Text-to-image transformers configured in one aspect to associate an input text token with the specific object, apply latent blending with attention to a combination of keys and values for the input text token and a background image upon which to add the object; and which in another aspect perform latent blending with attention to keys and values for the object to add, keys and values for the background, and keys and values for a text prompt.
A new transaction barrier synchronization primitive enables executing threads and asynchronous transactions to synchronize across parallel processors. The asynchronous transactions may include transactions resulting from, for example, hardware data movement units such as direct memory units, etc. A hardware synchronization circuit may provide for the synchronization primitive to be stored in a cache memory so that barrier operations may be accelerated by the circuit. A new wait mechanism reduces software overhead associated with waiting on a barrier.
One embodiment of a method includes applying a first generator model to a semantic representation of an image to generate an affine transformation, where the affine transformation represents a bounding box associated with at least one region within the image. The method further includes applying a second generator model to the affine transformation and the semantic representation to generate a shape of an object. The method further includes inserting the object into the image based on the bounding box and the shape.
G06T 7/30 - Determination of transform parameters for the alignment of images, i.e. image registration
G06T 7/70 - Determining position or orientation of objects or cameras
G06V 30/262 - Techniques for post-processing, e.g. correcting the recognition result using context analysis, e.g. lexical, syntactic or semantic context
Processors, systems, and techniques to predict a query to a database are described. In at least one embodiment, one or more prior query results are obtained from a database and one or more neural networks are utilized to predict a query to the database based, at least in part, on one or more prior query results.
In various examples, machine learning models may be trained and used to determine associations between traffic control devices (e.g., traffic signs, traffic lights, etc.) and lane segments of a driving surface. The systems and methods of the present disclosure may effectively combine rule-based methods and machine-learning based methods for traffic control device to lane association. For instance, training data may be synthetically generated based on traffic regulations relating to placement of traffic lights, and machine learning models may be trained to associate traffic lights to respective lanes using the training data with ground truth generated by rules. As such, image data may not be needed for the machine learning models to predict light to lane associations. Instead, given a set of non-image features indicative of lane segment and traffic light geometry and/or semantics, the machine learning models may predict the associated lane segment for each traffic light.
Disclosed are apparatuses, systems, and techniques that use one or more artificial intelligence models for time-aligned automatic speech recognition (ASR) of speech. The techniques include processing, using an ASR model, one or more audio frames representative of a speech to generate, for a transcription unit (TU) of the speech, a first set of likelihood values and a second set of likelihood values. An individual likelihood value of the first set characterizes a probability that the TU corresponds to a vocabulary token. An individual likelihood value of the second set characterizes a probability that the TU corresponds to a timestamp token. The techniques further include generating, using the first set of likelihood values and the second set of likelihood values, a timed transcription of the speech.
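How two likelihood sets might combine into a timed transcription can be shown with a toy sketch. The abstract does not say how the sets are decoded, so the following assumes greedy selection: each transcription unit takes its most likely vocabulary token and its most likely timestamp token. The data layout (two dictionaries per unit) is hypothetical.

```python
def timed_transcription(units):
    """Pick the most likely token and timestamp for each transcription unit.

    `units` is a list of (vocab_scores, time_scores) pairs, where each
    element maps a candidate to its likelihood. Greedy argmax decoding is
    an assumption made for illustration.
    """
    result = []
    for vocab_scores, time_scores in units:
        token = max(vocab_scores, key=vocab_scores.get)
        timestamp = max(time_scores, key=time_scores.get)
        result.append((token, timestamp))
    return result


# Two transcription units with candidate tokens and candidate start times in seconds.
example_units = [
    ({"hello": 0.9, "yellow": 0.1}, {0.0: 0.2, 0.4: 0.8}),
    ({"world": 0.7, "word": 0.3}, {0.8: 0.6, 1.2: 0.4}),
]
transcript = timed_transcription(example_units)
```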
The disclosure provides a method, apparatus, and system for operating chips that satisfy operating at a minimum operating temperature but include circuitry that has not been validated to operate at the minimum operating temperature. In one example the disclosure provides a method of booting a chip that includes: (1) initiating a booting sequence for a chip in response to receiving a boot-up signal, (2) determining a chip temperature using a temperature sensor, (3) activating warming circuitry of the chip during the booting sequence when the chip temperature is less than a first temperature, wherein the warming circuitry is configured to operate at a second temperature, and (4) when activated, deactivating the warming circuitry when the chip temperature is equal to or greater than the first temperature.
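Steps (2) through (4) of the boot method can be simulated in a few lines. This is a behavioural sketch only: `SimulatedChip` is a hypothetical model in which each warming step raises the temperature by a fixed amount, and the temperatures are arbitrary.

```python
class SimulatedChip:
    """Hypothetical chip model: each warming step raises the temperature by step_c."""

    def __init__(self, temperature_c, step_c=5.0):
        self.temperature_c = temperature_c
        self.step_c = step_c
        self.warming_active = False

    def read_temperature(self):
        return self.temperature_c

    def run_warming_step(self):
        self.temperature_c += self.step_c


def boot_chip(chip, first_temperature_c, max_steps=1000):
    """Warm the chip during boot until it reaches the first temperature.

    Returns the number of warming steps that were needed.
    """
    steps = 0
    # Step (3): activate warming only when the chip is below the first temperature.
    if chip.read_temperature() < first_temperature_c:
        chip.warming_active = True
    while chip.warming_active:
        if steps >= max_steps:
            raise RuntimeError("chip did not reach the first temperature")
        chip.run_warming_step()
        steps += 1
        # Step (4): deactivate once the temperature is at or above the threshold.
        if chip.read_temperature() >= first_temperature_c:
            chip.warming_active = False
    return steps


cold_chip = SimulatedChip(temperature_c=-52.0)
steps_needed = boot_chip(cold_chip, first_temperature_c=-40.0)
```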
Apparatuses, systems, and techniques to perform an application programming interface (API) to cause configuration information of one or more radio units to be stored. In at least one embodiment, configuration information is obtained based, at least in part, on one or more values received from at least one of said radio unit(s) and said configuration information is used to enable communication between said radio unit(s) and one or more distribution units.
Apparatuses, systems, and techniques to perform an application programming interface (API) to cause configuration information of one or more radio units to be stored. In at least one embodiment, configuration information is obtained based, at least in part, on one or more values received from at least one of said radio unit(s) and said configuration information is used to enable communication between said radio unit(s) and one or more distribution units.
Systems and methods are disclosed that perform efficient training of a graph neural network (GNN) using graph structure-aware randomized mini-batching. For example, nodes from a graph may be obtained. Subsequently, the nodes of the graph may be grouped into communities and then the order of the communities as well as the nodes within each of the communities may be shuffled. Based on shuffling the order of the communities and the nodes within the communities, mini-batches for training the GNN may be determined. Following, based on a sampling bias, a sub-graph may be constructed for each of the mini-batches to obtain a plurality of sub-graphs. The sampling bias may indicate a bias for sampling intra-connections instead of inter-connections. After, the GNN may be trained based on the constructed sub-graphs.
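The shuffling and batching steps above can be sketched as follows. The sketch assumes communities are already given as lists of node identifiers and covers only the mini-batch construction. Community detection, the biased sub-graph sampling, and the GNN training itself are out of scope. The function name and the `seed` parameter are hypothetical.

```python
import random


def community_minibatches(communities, batch_size, seed=None):
    """Build mini-batches that keep nodes of the same community close together.

    The community order is shuffled, then the nodes inside each community,
    and the resulting node sequence is cut into consecutive batches.
    """
    rng = random.Random(seed)
    shuffled = [list(nodes) for nodes in communities]
    rng.shuffle(shuffled)        # randomize the order of the communities
    for nodes in shuffled:
        rng.shuffle(nodes)       # randomize the nodes within each community
    ordered = [node for nodes in shuffled for node in nodes]
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


example_communities = [[0, 1, 2], [3, 4, 5], [6, 7]]
batches = community_minibatches(example_communities, batch_size=3, seed=0)
```

Compared with shuffling all nodes uniformly, this keeps most of a batch inside one community, which is what makes sampling intra-community connections for each sub-graph worthwhile.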
In some embodiments, a generative DNN trained as part of a probabilistic state simulation stack may be sampled generatively to generate ground truth recovery scenarios for other navigation policies or other supervised DNNs (e.g., a neural planner) that were not part of the probabilistic state simulation stack. For example, an initial trajectory that drifts from an optimal or target trajectory may be generated (e.g., using a neural planner to control navigation of an ego-machine in a simulation environment or in a latent space of a probabilistic state simulation stack). As such, a control stack trained as part of a probabilistic state simulation stack may be used to recover from the initial trajectory, and the resulting recovery trajectory may be recorded and used to train a navigation policy or other supervised DNN such as a neural planner (e.g., the neural planner that generated the initial trajectory).
G05B 13/02 - Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
77.
APPLICATION PROGRAMMING INTERFACE TO STORE CONFIGURATION INFORMATION OF RADIO UNITS
Apparatuses, systems, and techniques to perform an application programming interface (API) to cause configuration information of one or more radio units to be stored. In at least one embodiment, configuration information is obtained based, at least in part, on one or more values received from at least one of said radio unit(s) and said configuration information is used to enable communication between said radio unit(s) and one or more distribution units.
Apparatuses, systems, and techniques to assign a processing resource to an inference request directed to a neural network based on an amount of information to be inferenced indicated by said request. In at least one embodiment, an AI application is deployed with a software wrapper that intercepts inference requests and dynamically distributes such requests among available processing resources such as host processor(s) and/or AI accelerator(s) to improve execution performance of said AI application.
Apparatuses, systems, and techniques to transfer grammar between sentences. In at least one embodiment, one or more first sentences are translated into one or more second sentences having different grammar using one or more neural networks.
The disclosed method for controlling a robot to grasp an object includes receiving sensor data from one or more sensors, generating, based on the sensor data and using a first trained machine learning model, one or more grasp poses, selecting, from the one or more grasp poses and using a first trained machine learning model, one or more filtered grasp poses, generating, based on the one or more filtered grasp poses, a grasping plan, and causing the robot to grasp the object based on the grasping plan.
Adaptive clock mechanisms for serial links utilizing a delay-chain-based edge generation circuit to generate a clock that is a faster (higher-frequency) version of an incoming digital clock. The base frequency of the link clock utilized by the line transmitters is determined by the (slower) clock utilized by the digital circuitry supplying data to the line transmitters. An edge generator that may be composed of only non-synchronous circuit elements multiplies the edges of the slower clock to generate the link clock and also a clock forwarded to the receiver at a phase offset from the link clock.
H04L 7/00 - Arrangements for synchronising receiver with transmitter
H04L 7/033 - Speed or phase control by the received code signals, the signals containing no special synchronisation information using the transitions of the received signal to control the phase of the synchronising-signal-generating means, e.g. using a phase-locked loop
A machine-learning control system is trained to perform a task using a simulation. The simulation is governed by parameters that, in various embodiments, are not precisely known. In an embodiment, the parameters are specified with an initial value and expected range. After training on the simulation, the machine-learning control system attempts to perform the task in the real world. In an embodiment, the results of the attempt are compared to the expected results of the simulation, and the parameters that govern the simulation are adjusted so that the simulated result matches the real-world attempt. In an embodiment, the machine-learning control system is retrained on the updated simulation. In an embodiment, as additional real-world attempts are made, the simulation parameters are refined and the control system is retrained until the simulation is accurate and the control system is able to successfully perform the task in the real world.
G05B 13/02 - Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
G05B 13/04 - Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
G05D 1/00 - Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
G05D 101/15 - Details of software or hardware architectures used for the control of position using artificial intelligence [AI] techniques using machine learning, e.g. neural networks
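The parameter-adjustment step of the simulation-to-real loop described above can be illustrated with a one-parameter sketch. It assumes the simulated outcome increases monotonically with the parameter inside its expected range, so bisection is enough. The disclosure does not state which fitting method is used, and retraining of the control system is not shown. All names are hypothetical.

```python
def calibrate_parameter(simulate, observed, low, high, tol=1e-6, max_iter=100):
    """Adjust one simulation parameter until the simulated result matches reality.

    Assumption: simulate(p) is monotonically increasing in p over [low, high],
    and the observed real-world result lies within that range.
    """
    for _ in range(max_iter):
        mid = (low + high) / 2.0
        simulated = simulate(mid)
        if abs(simulated - observed) <= tol:
            return mid
        if simulated < observed:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


def toy_simulation(gain):
    """Toy model: the task outcome is 10 times the unknown gain parameter."""
    return 10.0 * gain


# The real-world attempt produced 3.0, and the gain is expected to lie in [0, 1].
fitted_gain = calibrate_parameter(toy_simulation, observed=3.0, low=0.0, high=1.0)
```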
Apparatuses, systems, and techniques to obtain one or more captions for a video using machine learning. In at least one embodiment, at least one machine learning process is used to generate at least one output caption using at least one image-level caption, at least one video-level caption, and/or at least one motion caption. In at least one embodiment, the video-level caption(s) is/are generated by one or more second machine learning processes using the video, and the image-level caption(s) is/are generated by one or more third machine learning processes using one or more images sampled from the video.
In a system including a processing unit and a set of one or more stacked memory chips, the processing unit can request data. When the data is distributed such that there is at least one non-contiguous memory sector in the smallest unit of memory segments usable by the system, then a gather operation can be utilized to instruct the set of one or more stacked memory chips to gather the requested data into a virtual address space, e.g., a gather accelerated address space. The requested data can be aligned to the byte chunk size used by the processing unit and at least some of the unneeded memory segments can be skipped, e.g., not copied into the virtual address space. The requested data in the virtual address space can be communicated to the processing unit using less bandwidth resources than when not using the gather operation.
Approaches presented herein provide for the management of resources to be used to process a request, such as may involve orchestration of nodes for an inference request. Upon receiving an inference request, an orchestrator can determine a sequence of context nodes and inference nodes to be used to process the inference request, based in part upon a determined class of inferencing to be performed. The orchestrator can append metadata to the inference request that identifies the sequence, and can transmit the appended request to one or more first nodes in the sequence. If the nodes have a network programmable device, or similar capability, the request can be forwarded to the nodes in sequence without having to go back to the orchestrator between nodes.
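The metadata-driven routing above can be sketched in a few lines. This is a simplified illustration, not the disclosed system: requests are plain dictionaries, nodes are local functions rather than network devices, and the names `route`, `hop`, and `trace` are hypothetical.

```python
def orchestrate(request, inference_class, routes):
    """Append the node sequence for this class of inferencing to the request."""
    annotated = dict(request)
    annotated["route"] = list(routes[inference_class])
    annotated["hop"] = 0
    return annotated


def forward(request, handlers):
    """Pass the request from node to node using only its own metadata.

    The orchestrator is not consulted between hops; each node reads the
    route and hop counter carried by the request itself.
    """
    current = dict(request)
    while current["hop"] < len(current["route"]):
        node = current["route"][current["hop"]]
        current = dict(handlers[node](current))
        current["hop"] += 1
    return current


def make_handler(name):
    """Build a stand-in node that records its name in the request trace."""
    def handler(req):
        return {**req, "trace": req.get("trace", []) + [name]}
    return handler


routes = {"chat": ["context-a", "inference-b"]}
handlers = {name: make_handler(name) for name in routes["chat"]}
result = forward(orchestrate({"prompt": "hi"}, "chat", routes), handlers)
```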
Embodiments described herein provide a hybrid data center cooling system. In at least one embodiment, a data center cooling system includes one or more immersive cooling chambers having one or more heat exchangers to transfer heat from one or more immersive fluids to one or more refrigerant flows received from one or more computing hardware inside the one or more immersive cooling chambers.
Disclosed are systems and techniques for dynamic L1 cache reconfiguration. The techniques include executing a first task by a processor in a first mode. The processor has a first memory configuration. The techniques further include receiving, at a hardware controller operatively coupled to the processor, a second task with memory metadata. The techniques further include determining a second memory configuration of the processor based on the memory metadata and the first memory configuration of the processor. The techniques further include reconfiguring a memory of the processor based on the second memory configuration. The techniques further include executing the second task by the processor in the first mode.
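One way to picture the configuration step is the sketch below. It assumes a unified on-chip memory split between L1 cache and shared memory, a fixed set of supported split sizes, and task metadata that states a required shared-memory size. All of these names and numbers are illustrative assumptions, not details from the abstract.

```python
from dataclasses import dataclass

TOTAL_KB = 128                            # assumed unified L1/shared capacity
SHARED_OPTIONS_KB = (0, 16, 32, 64, 100)  # assumed supported shared-memory sizes


@dataclass(frozen=True)
class MemoryConfig:
    shared_kb: int

    @property
    def l1_kb(self) -> int:
        # Whatever is not reserved as shared memory is treated as L1 cache.
        return TOTAL_KB - self.shared_kb


def next_config(current: MemoryConfig, required_shared_kb: int) -> MemoryConfig:
    """Choose the smallest supported split that fits the task's metadata.

    Returns the current configuration unchanged when no reconfiguration is needed.
    """
    for option in SHARED_OPTIONS_KB:
        if option >= required_shared_kb:
            chosen = option
            break
    else:
        raise ValueError("task needs more shared memory than any supported split")
    return current if chosen == current.shared_kb else MemoryConfig(chosen)


first = MemoryConfig(shared_kb=16)
second = next_config(first, required_shared_kb=40)
```

Here the second task's metadata asks for 40 KB, so the sketch moves from a 16 KB split to a 64 KB split; a task that fits within the current split leaves the configuration untouched.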
Reverse offload mechanisms that utilize a second processor to receive a workload from a first processor, the workload including multiple tasks, where the second processor collects portions of the tasks from a set of co-executing threads in the second processor and dispatches portions of the tasks to queues for threads of the first processor, and in response to one or more status indications satisfying a completion condition for the first portions of the tasks, combines first partial results of the tasks from the set of co-executing threads with second partial results of the portions of the tasks from the first processor.
In various examples, automatic document analysis and modification systems and applications are described herein. Systems and methods are disclosed that automatically identify clauses that potentially need updating in documents—such as templates—using one or more language models. Systems and methods are further disclosed that provide information associated with updating the identified documents to users. For instance, user interfaces are provided that allow users to view at least the clauses that potentially need updating, reasons the clauses potentially need updating, techniques for updating the clauses, and/or text showing the clauses as updated. Systems and methods are then further disclosed that use the language model(s) to automatically update the clauses in the documents. For instance, once the updates are verified, the language model(s) may process input data associated with the documents and the updated clauses in order to apply the updates to the documents.
Apparatuses, systems, and techniques to perform one or more memory copy operations via a single application programming interface. In at least one embodiment, said memory copy operations are performed with a single set of startup and/or shutdown operations between them. In at least one embodiment, one or more processors comprise one or more circuits to perform an application programming interface (API) to cause information to be copied from two or more first storage locations to two or more second storage locations based, at least in part, on one or more parameters of the API to indicate the two or more first storage locations and the two or more second storage locations.
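A toy model of the batched copy idea is sketched below. The `Copier` class and its counters are hypothetical stand-ins for whatever startup and shutdown work a real copy engine performs; the point is only that several source/destination pairs passed as parameters to one call share a single startup and a single shutdown.

```python
from typing import List, Tuple


class Copier:
    """Toy copy engine that counts how often it is started and shut down."""

    def __init__(self) -> None:
        self.startups = 0
        self.shutdowns = 0

    def copy_batch(self, memory: bytearray, copies: List[Tuple[int, int, int]]) -> None:
        """Run every (src, dst, length) copy between one startup and one shutdown."""
        self.startups += 1  # stand-in for one-time setup work
        try:
            for src, dst, length in copies:
                memory[dst:dst + length] = memory[src:src + length]
        finally:
            self.shutdowns += 1  # stand-in for one-time teardown work


memory = bytearray(b"abcdefgh" + bytes(8))
copier = Copier()
# Two copies, each with its own source and destination, issued through one call.
copier.copy_batch(memory, [(0, 8, 2), (4, 12, 3)])
```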
A computing system includes a main processor and an accelerator. The main processor includes a cache. The main processor is to assign a computing task to the accelerator. The accelerator is to select a sub-task of the computing task, and to assign the sub-task back to the main processor by stashing the sub-task directly into the cache of the main processor.
Apparatuses, systems, methods, and techniques to generate new demonstrations by using machine learning to generate trajectories for segments in which an agent is to interact with object(s), and using at least one motion planner for segments in which the agent is not to interact with the object(s). In at least one embodiment, a system generates a first trajectory for a modified first segment, obtained by modifying a first segment of a demonstration. In at least one embodiment, a first agent is to interact with object(s) in the first segment. In at least one embodiment, the system uses motion planner(s) to generate a second trajectory for a second segment of the demonstration that is adjacent to the first segment and in which the first agent did not interact with the object(s). In at least one embodiment, the system generates a new demonstration by combining the first and second trajectories.
In various examples, to support training a deep neural network (DNN) to predict a dense representation of a 3D surface structure of interest, a training dataset is generated using a simulated environment. For example, a simulation may be run to simulate a virtual world or environment, render frames of virtual sensor data (e.g., images), and generate corresponding depth maps and segmentation masks (identifying a component of the simulated environment such as a road). To generate input training data, 3D structure estimation may be performed on a rendered frame to generate a representation of a 3D surface structure of the road. To generate corresponding ground truth training data, a corresponding depth map and segmentation mask may be used to generate a dense representation of the 3D surface structure.
Apparatuses, systems, and techniques to train one or more neural networks using unannotated images. In at least one embodiment, the one or more neural networks are trained based, at least in part, on one or more loss functions calculated using a randomly selected patch pair on a same image and a spatial relationship between two patches within the randomly selected patch pair on the same image.
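The abstract does not specify the loss, so the following is only one plausible reading: a self-supervised pretext task in which two patches are drawn from the same image and a classifier is asked to predict their relative position. The grid size, the label encoding, and the use of cross-entropy are all assumptions made for this sketch.

```python
import math
import random
from typing import List, Tuple

GRID = 3  # assumed: the image is divided into GRID x GRID patches


def sample_patch_pair(rng: random.Random) -> Tuple[Tuple[int, int], Tuple[int, int], int]:
    """Pick two distinct grid cells; the label encodes their relative offset."""
    cells = [(r, c) for r in range(GRID) for c in range(GRID)]
    a, b = rng.sample(cells, 2)
    d_row, d_col = b[0] - a[0], b[1] - a[1]
    # Offsets lie in [-(GRID-1), GRID-1]; shift them to form a single class index.
    label = (d_row + GRID - 1) * (2 * GRID - 1) + (d_col + GRID - 1)
    return a, b, label


def cross_entropy(logits: List[float], label: int) -> float:
    """Numerically stable cross-entropy of one prediction against the true label."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[label]


rng = random.Random(0)
patch_a, patch_b, label = sample_patch_pair(rng)
num_classes = (2 * GRID - 1) ** 2
# An untrained model with uniform logits gives a loss of log(num_classes).
loss = cross_entropy([0.0] * num_classes, label)
```

No annotations are needed because the label comes from where the patches were sampled, not from a human.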
Apparatuses, systems, and techniques to train one or more neural networks using unannotated images. In at least one embodiment, the one or more neural networks are trained based, at least in part, on one or more loss functions calculated using a randomly selected portion pair from two different images and a randomly selected portion pair from the same image.
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
Methods and systems for using artificial intelligence to recommend optimal parking spaces for vehicles. Using artificial intelligence includes using computer vision systems to analyze images of a vehicle and/or of the vehicle's occupants. This can include using an image description model to automatically generate natural language descriptions of the images. These natural language descriptions can be further processed using a large language model. Information from these natural language descriptions is used as inputs to a parking recommendation system. Based on information about the vehicle, occupants, and available parking spaces, the system can automatically recommend a parking space that is optimal for the vehicle and passengers. The system can also provide navigation information for the space to the driver and/or onboard vehicle systems.
Generation of three-dimensional (3D) object models may be challenging for users without a sufficient skill set for content creation and may also be resource intensive. One or more style transfer networks may be combined with a generative network to generate objects based on parameters associated with a textual input. An input including a 3D mesh and texture may be provided to a trained system along with a textual input that includes parameters for object generation. Features of the input object may be identified and then tuned in accordance with the textual input to generate a modified 3D object that includes a new texture along with one or more geometric adjustments.
One embodiment of a method for controlling a robot includes computing, using a trained machine learning model and based on sensor data, one or more costs associated with one or more trajectories; determining an action based on the one or more costs; and controlling the robot to move based on the action.
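The three steps in this method (score trajectories, pick an action, move) can be sketched as below. The hand-written `toy_cost` function is only a stand-in for the trained machine learning model, and the trajectory format, sensor dictionary, and penalty values are assumptions introduced for illustration.

```python
import math
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]
Trajectory = List[Point]


def toy_cost(sensor_data: Dict[str, Point], trajectory: Trajectory) -> float:
    """Stand-in for a learned cost model: distance to goal plus an obstacle penalty."""
    goal, obstacle = sensor_data["goal"], sensor_data["obstacle"]
    cost = math.dist(trajectory[-1], goal)
    if any(math.dist(p, obstacle) < 0.5 for p in trajectory):
        cost += 10.0  # assumed penalty for passing too close to the obstacle
    return cost


def select_action(
    trajectories: List[Trajectory],
    cost_model: Callable[[Dict[str, Point], Trajectory], float],
    sensor_data: Dict[str, Point],
) -> Tuple[Point, List[float]]:
    """Score every candidate and return the first waypoint of the cheapest one."""
    costs = [cost_model(sensor_data, t) for t in trajectories]
    best = min(range(len(costs)), key=costs.__getitem__)
    return trajectories[best][0], costs


sensor_data = {"goal": (2.0, 0.0), "obstacle": (1.0, 1.0)}
candidates = [
    [(1.0, 1.0), (2.0, 1.0)],  # passes through the obstacle
    [(1.0, 0.0), (2.0, 0.0)],  # reaches the goal directly
]
action, costs = select_action(candidates, toy_cost, sensor_data)
```

The returned action is the next waypoint the robot would be commanded toward; in a real controller this selection would be repeated as new sensor data arrives.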
Apparatuses, systems, and techniques to perform collision-free motion generation (e.g., to operate a real-world or virtual robot). In at least one embodiment, at least a portion of the collision-free motion generation is performed in parallel.
Apparatuses, systems, and techniques to deterministically classify data. In at least one embodiment, inference classes with weights within a threshold range are treated as equivalent and one representative inference class is selected.
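A minimal sketch of this tie-handling idea follows, assuming that the representative class is simply the lowest class index among those within the threshold of the top weight. The abstract does not say how the representative is chosen, so that rule and the `classify` function are illustrative assumptions.

```python
from typing import List


def classify(weights: List[float], threshold: float) -> int:
    """Return one representative class among those within `threshold` of the top weight.

    Classes whose weights are this close are treated as equivalent, and the lowest
    index is chosen so the result does not depend on tiny numerical differences.
    """
    top = max(weights)
    candidates = [i for i, w in enumerate(weights) if top - w <= threshold]
    return min(candidates)


# Two runs whose weights differ only by small numerical noise in classes 1 and 2.
run_a = classify([0.30, 0.3499, 0.3500], threshold=0.001)
run_b = classify([0.30, 0.3500, 0.3499], threshold=0.001)
```

A plain argmax would pick class 2 in the first run and class 1 in the second; with the threshold, both runs select the same class.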