A transceiver circuit is disclosed, the transceiver circuit including a first register circuit, configured to receive serial stimulus data and to generate multi-bit parallel stimulus data, a serializer circuit configured to receive the multi-bit parallel stimulus data and to generate serialized data based on the multi-bit parallel stimulus data, where the serializer circuit includes a serializer data storage device, and where the serializer data storage device lacks circuit structures for scanability, a deserializer circuit configured to receive serial receiver data corresponding with the serialized data and to generate multi-bit parallel response data based on the serial receiver data, where the deserializer circuit includes a deserializer data storage device, and where the deserializer data storage device lacks circuit structures for scanability, and a second register circuit, configured to receive the multi-bit parallel response data and to generate serial response data.
Embodiments herein describe a content adaptive array that can include different types of data. In content adaptive arrays, the datatype of the array can vary depending on the actual values of the data in the array. For example, for arrays where the data values have a small dynamic range, an INT4 datatype may be preferred since it can provide the most accuracy and still avoid underflow. For arrays where the data values have larger dynamic ranges, an FP datatype may be preferred since it provides more dynamic range. The content adaptive array can include metadata (e.g., type selector bits) that indicates the datatype of the data in the array. Thus, when the hardware receives the array, it can use the metadata to identify the datatype of the data and then process the array accordingly.
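The selection described in this abstract can be illustrated with a minimal Python sketch. The underflow test, the shared `scale` factor, and the names `choose_datatype`, `encode`, and `decode` are assumptions made for illustration; the abstract does not specify how the datatype is chosen or how the type selector bits are laid out.

```python
INT4_MAX = 7  # largest magnitude of a signed 4-bit integer (range -8..7)

def choose_datatype(values):
    """Pick INT4 when no nonzero value would underflow to 0, else FP (assumed rule)."""
    magnitudes = [abs(v) for v in values if v != 0]
    if not magnitudes:
        return "INT4"
    scale = max(magnitudes) / INT4_MAX  # the largest magnitude maps to 7
    underflow = any(round(m / scale) == 0 for m in magnitudes)
    return "FP" if underflow else "INT4"

def encode(values):
    """Return the array together with metadata naming its datatype."""
    selector = choose_datatype(values)
    if selector == "INT4":
        peak = max((abs(v) for v in values), default=0)
        scale = peak / INT4_MAX if peak else 1.0
        data = [round(v / scale) for v in values]
    else:
        scale = 1.0
        data = [float(v) for v in values]
    return {"type_selector": selector, "scale": scale, "data": data}

def decode(array):
    """Consumer side: read the type selector first, then interpret the data."""
    if array["type_selector"] == "INT4":
        return [q * array["scale"] for q in array["data"]]
    return list(array["data"])
```

In this sketch a narrow-range array such as `[1, 2, 3, 7]` is tagged `INT4`, while `[0.001, 100.0]` is tagged `FP` because the small value would round to zero on the 4-bit grid.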
In response to one or more conditions, a processing system determines whether transferring one or more experts to different processing units would improve load balancing at the processing system. The processing system determines an amount of variance between the utilization for each expert relative to the average utilization of all experts at their currently-assigned processing units. The processing system then measures the amount of variance under one or more different configurations of expert-processing unit assignments. If one of those configurations would improve load balancing, the processing system transfers one or more of the experts to different processing units.
G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
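The variance comparison in the expert load-balancing abstract above can be sketched as follows. This is only one possible reading: it assumes the variance is taken over the summed utilization of each processing unit, and the names `unit_loads` and `rebalance` as well as the dictionary-based assignment format are hypothetical.

```python
from statistics import pvariance

def unit_loads(assignment, utilization, num_units):
    """Sum the utilization of the experts assigned to each processing unit."""
    loads = [0.0] * num_units
    for expert, unit in assignment.items():
        loads[unit] += utilization[expert]
    return loads

def rebalance(assignment, utilization, num_units, candidates):
    """Return the assignment with the lowest load variance.

    The current assignment is kept unless a candidate strictly lowers the variance.
    """
    best = assignment
    best_var = pvariance(unit_loads(assignment, utilization, num_units))
    for candidate in candidates:
        var = pvariance(unit_loads(candidate, utilization, num_units))
        if var < best_var:
            best, best_var = candidate, var
    return best
```

With two heavily used experts on one unit and two lightly used experts on another, a candidate that pairs one heavy expert with one light expert on each unit has lower variance and would be selected.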
A command processor dispatches instructions to a processing unit and a systolic array. The command processor receives a packet including instructions for execution on the systolic array. In response to determining that reconfiguration of the systolic array is to be performed in order to process the instructions, the command processor determines whether a previously dispatched packet is executing on the systolic array. The command processor dispatches reconfiguration instructions for execution concurrently with the processing unit executing the previously dispatched packet in response to determining that there is no conflict between the reconfiguration instructions and a current configuration used by the previously dispatched packet. If a conflict exists between the reconfiguration instructions and the current configuration, the command processor waits for an acknowledgment indicating that execution of the previously dispatched packet is complete and then dispatches the reconfiguration instructions.
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
G06F 15/78 - Architectures of general purpose stored program computers comprising a single central processing unit
G06F 9/48 - Program initiating; Program switching, e.g. by interrupt
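The dispatch decision in the command processor abstract above can be sketched in Python. The model is an assumption: a conflict is taken to mean that the reconfiguration touches a resource the running packet is still using, and `dispatch_reconfiguration`, the resource-name sets, and the `wait_for_ack` callback are hypothetical.

```python
def dispatch_reconfiguration(reconfig_resources, running_resources, wait_for_ack):
    """Dispatch reconfiguration early when safe, otherwise wait for completion.

    reconfig_resources: set of resource names the reconfiguration will change.
    running_resources:  set of resource names used by the previously dispatched
                        packet, or None if nothing is executing.
    wait_for_ack:       callable that blocks until the running packet completes.
    """
    events = []
    conflict = running_resources is not None and bool(
        reconfig_resources & running_resources
    )
    if conflict:
        # The running packet depends on something the reconfiguration would change.
        wait_for_ack()
        events.append("waited_for_ack")
    # Without a conflict this overlaps with the packet that is still executing.
    events.append("dispatched_reconfig")
    return events
```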
Dynamically pooled allocation of memory buffer on spatial compute architectures, including analyzing, at compile-time, access patterns (e.g., cyclo-static execution/firing rules) of consumer and/or producer processes that have shared access to local data memory of one or more compute tiles, and identifying situations in which multiple buffers can be replaced with a pooled buffer having a memory footprint that is less than a sum of the memory footprints of the multiple buffers. A compiler may identify instances of mutual exclusiveness in the execution patterns of the processes, differences in execution times between compute kernels of the processes, and/or variations in execution times of the kernels. The compiler may generate controller code and/or configuration parameters to enforce memory allocation/mapping at application run-time.
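To make the footprint saving concrete, the sketch below computes the size of a pooled buffer from buffer lifetimes. It assumes each buffer is live only during a known set of execution phases (a simplified stand-in for cyclo-static firing rules); the function name `pooled_footprint` and the input format are hypothetical.

```python
def pooled_footprint(buffers):
    """Return the peak number of bytes that are live at the same time.

    buffers: dict mapping buffer name -> (size_bytes, set of phases in which
             the buffer is live). Buffers whose phase sets are disjoint are
             mutually exclusive and can share the same pooled memory.
    """
    phases = set().union(*(live for _, live in buffers.values()))
    return max(
        (sum(size for size, live in buffers.values() if p in live) for p in phases),
        default=0,
    )
```

For buffers of 1024, 2048, and 512 bytes that are live in phases {0, 1}, {2, 3}, and {1, 2} respectively, separate allocation needs 3584 bytes, while the peak simultaneous demand is 2560 bytes.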
Systems and methods for transmitting and processing data can use representations of data portions (e.g., blocks, chunks, or other subunits of data) that match a specified pattern, such as zero gradients in a machine learning training algorithm. These representations can allow different parts of a system to communicate the existence of these data portions to each other without actually transmitting the data portions while also allowing for the transmission of data portions that do not match the specified pattern. Processing of data can also use these representations or indicators as placeholders for the omitted data and perform calculations based on tallies, skipped memory locations, or other ways of accounting for the omitted data. This can in some cases reduce computing resources used to process data, such as data that may have been communicated using such representations.
An inference server is capable of receiving a plurality of inference requests from one or more client systems. Each inference request specifies one of a plurality of different endpoints. The inference server can generate a plurality of batches each including one or more of the plurality of inference requests directed to a same endpoint. The inference server also can process the plurality of batches using a plurality of workers executing in an execution layer therein. Each batch is processed by a worker of the plurality of workers indicated by the endpoint of the batch.
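A minimal sketch of the batching step follows, assuming requests are dictionaries with an `endpoint` field and workers are plain callables keyed by endpoint. The names `make_batches` and `process_batches` and the `max_batch_size` parameter are illustrative and do not come from the abstract.

```python
from collections import defaultdict

def make_batches(requests, max_batch_size=4):
    """Group requests by endpoint, then split each group into batches."""
    by_endpoint = defaultdict(list)
    for request in requests:
        by_endpoint[request["endpoint"]].append(request)
    batches = []
    for endpoint, group in by_endpoint.items():
        for start in range(0, len(group), max_batch_size):
            batches.append((endpoint, group[start:start + max_batch_size]))
    return batches

def process_batches(batches, workers):
    """Hand each batch to the worker registered for the batch's endpoint."""
    return [workers[endpoint](batch) for endpoint, batch in batches]
```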
Dynamically pooled allocation of memory buffer on spatial compute architectures, including analyzing, at compile-time, access patterns (e.g., cyclo-static execution/firing rules) of consumer and/or producer processes that have shared access to local memory of one or more compute tiles, and identifying situations in which multiple buffers can be replaced with a pooled buffer having a memory footprint that is less than a sum of the memory footprints of the multiple buffers. A compiler may identify instances of mutual exclusiveness in the execution patterns of the processes, differences in execution times between compute kernels of the processes, and/or variations in execution times of the kernels. The compiler may generate controller code and/or configuration parameters to enforce memory allocation/mapping at application run-time.
A memory circuit is disclosed. The memory circuit includes a plurality of bit lines; a plurality of memory cells arranged in columns, each memory cell connected to a pair of bit lines; and a plurality of clamp circuits, each including a first clamp and logic circuit connected to a first bit line and a second clamp and logic circuit connected to a second bit line, where the first clamp and logic circuit is configured to selectively clamp the first bit line in response to the memory circuit operating in a particular mode, and where the second clamp and logic circuit is configured to selectively clamp the second bit line in response to the memory circuit operating in the particular mode.
G11C 7/12 - Bit line control circuits, e.g. drivers, boosters, pull-up circuits, pull-down circuits, precharging circuits, equalising circuits, for bit lines
G11C 7/06 - Sense amplifiers; Associated circuits
G11C 7/10 - Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers
11.
TRANSMISSION AND PROCESSING OF DATA IN PARALLEL SYSTEMS
Systems and methods for transmitting and processing data can use representations of data portions (e.g., blocks, chunks, or other subunits of data) that match a specified pattern, such as zero gradients in a machine learning training algorithm. These representations can allow different parts of a system to communicate the existence of these data portions to each other without actually transmitting the data portions while also allowing for the transmission of data portions that do not match the specified pattern. Processing of data can also use these representations or indicators as placeholders for the omitted data and perform calculations based on tallies, skipped memory locations, or other ways of accounting for the omitted data. This can in some cases reduce computing resources used to process data, such as data that may have been communicated using such representations.
H04L 41/16 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
H04L 69/22 - Parsing or analysis of headers
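The placeholder idea in the abstract above can be sketched for the all-zero pattern. The tuple encoding (`"ZERO"` with a length, `"DATA"` with a payload) and the names `encode_blocks` and `sum_encoded` are assumptions for illustration; the abstract does not define a wire format.

```python
def encode_blocks(blocks):
    """Replace each all-zero block with a short marker instead of its data."""
    encoded = []
    for block in blocks:
        if all(x == 0 for x in block):
            encoded.append(("ZERO", len(block)))  # only the block's existence is sent
        else:
            encoded.append(("DATA", list(block)))
    return encoded

def sum_encoded(encoded):
    """Reduce over the encoded stream, tallying omitted elements instead of reading them."""
    total = 0
    skipped = 0
    for kind, payload in encoded:
        if kind == "ZERO":
            skipped += payload  # zeros add nothing to the sum
        else:
            total += sum(payload)
    return total, skipped
```

The receiver still learns how many elements were omitted, so later calculations such as averages can account for them without the zeros ever being transmitted.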
Embodiments herein describe a security ring for an integrated circuit. In an example, a first die includes functional circuitry within an inner region, and a first security ring surrounding the functional circuitry. A second die includes protection circuitry (e.g., tamper detection circuitry) within an inner region, and a second security ring surrounding the protection circuitry. The first security ring sends probing signals to the protection circuitry via the second security ring, receives probing responses from the protection circuitry via the second security ring, determines a physical status of the protection circuitry based on the probing responses, and initiates a remedial action if the physical status of the protection circuitry indicates physical tampering of the protection circuitry.
An electronic device includes a plurality of integrated circuits (ICs), each IC comprising an array of resources, and a regional clock circuitry comprising horizontal routing tracks located on each horizontal edge of each of the resources, and vertical routing tracks located on each vertical edge of each of the resources, and a global clock circuitry formed using the horizontal routing tracks and the vertical routing tracks. At least one pair of the horizontal routing tracks located on horizontal IC interface circuitries or the vertical routing tracks located on vertical IC interface circuitries of at least two adjacent ICs of the plurality of ICs are tied together and the global clock circuitry is configured to route a clock signal to each of the plurality of ICs.
H03K 19/177 - Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components arranged in matrix form
G06F 30/34 - Circuit design for reconfigurable circuits, e.g. field programmable gate arrays [FPGA] or programmable logic devices [PLD]
14.
OVERSAMPLED CHANNELIZER CIRCUITRY HAVING TIME-VARYING FILTER COEFFICIENTS
A signal processing system includes channelizer circuitry that includes first delay circuitry that receives first data. The channelizer circuitry generates first combined data based on the first data and a first coefficient set and second combined data based on the first data and a second coefficient set. The first coefficient set differs from the second coefficient set. Further, the channelizer circuitry outputs a first signal based on at least one of the first combined data and the second combined data.
Redundant data storage includes performing a first data transfer to a first storage device as part of a redundant write operation. A first error detection code is generated for the first data transfer. A second data transfer is performed to a second storage device as part of the redundant write operation. A second error detection code is generated for the second data transfer. The first error detection code is compared with the second error detection code for a match indicating that data of the first data transfer matches data of the second data transfer.
G06F 11/10 - Error detection or correction by redundancy in data representation, e.g. by using checking codes adding special bits or symbols to the coded information, e.g. parity check, casting out nines or elevens
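The comparison of error detection codes in the redundant-write abstract above can be sketched with CRC-32 from the Python standard library. Using CRC-32 is an assumption, since the abstract does not name a code; `redundant_write` and the two transfer callables, which return the bytes as observed on each transfer, are hypothetical.

```python
import zlib

def redundant_write(data, transfer_first, transfer_second):
    """Perform both transfers and compare their error detection codes.

    Each transfer callable models one data transfer and returns the bytes
    that were actually sent to its storage device.
    """
    crc_first = zlib.crc32(transfer_first(data))
    crc_second = zlib.crc32(transfer_second(data))
    # Matching codes indicate that both devices received the same data.
    return crc_first == crc_second
```

Comparing two short codes avoids reading both copies back and comparing them byte by byte.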
16.
3D INTEGRATED CIRCUIT WITH ENHANCED DEBUGGING CAPABILITY
An integrated circuit includes a plurality of layers. A subset of the plurality of layers is reserved for implementing user circuitry. At least a portion of a selected layer of the plurality of layers is reserved for debugging.
G06F 30/367 - Design verification, e.g. using simulation, simulation program with integrated circuit emphasis [SPICE], direct methods or relaxation methods
H01L 21/66 - Testing or measuring during manufacture or treatment
H01L 21/768 - Applying interconnections to be used for carrying current between separate components within a device
H01L 23/528 - Layout of the interconnection structure
17.
FIREWALLING COMMUNICATION PORTS IN A MULTI-PORT SYSTEM
Handling port resets in a multi-port system includes monitoring, using a plurality of firewall circuits, a plurality of controllers corresponding to different communication ports for a reset condition. The plurality of controllers are coupled to a direct memory access (DMA) system through a plurality of bridge circuits. A selected firewall circuit detects a reset condition on a selected controller coupled thereto. The selected controller is coupled to a selected bridge circuit of the plurality of bridge circuits. In response to detecting the reset condition, the selected firewall circuit implements a firewall operating mode. While operating in the firewall operating mode, the selected firewall circuit is configured to control operation of the selected bridge circuit, thereby isolating the selected controller from the DMA system. The firewall operating mode of the firewall circuits may also be initiated proactively by a management processor.
A method for providing power integrity to a semiconductor device can include providing one or more die of a semiconductor device that contains functional circuitry of the semiconductor device. The method can also include stacking one or more semiconductor device layers with the one or more die. The method can additionally include providing, in the one or more semiconductor device layers, metal layers that are configured to provide power integrity to the functional circuitry of the semiconductor device. Various other methods and systems are also disclosed.
H10B 80/00 - Assemblies of multiple devices comprising at least one memory device covered by this subclass
H01L 23/00 - Details of semiconductor or other solid state devices
H01L 25/00 - Assemblies consisting of a plurality of semiconductor or other solid state devices
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid state devices, the devices all being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
H01L 25/18 - Assemblies consisting of a plurality of semiconductor or other solid state devices, the devices being of types provided for in two or more different main groups of the same subclass , , , , or
An electronic device includes a plurality of integrated circuits (ICs), each IC comprising an array of resources, and a regional clock circuitry comprising horizontal routing tracks located on each horizontal edge of each of the resources, and vertical routing tracks located on each vertical edge of each of the resources, and a global clock circuitry formed using the horizontal routing tracks and the vertical routing tracks. At least one pair of the horizontal routing tracks located on horizontal IC interface circuitries or the vertical routing tracks located on vertical IC interface circuitries of at least two adjacent ICs of the plurality of ICs are tied together and the global clock circuitry is configured to route a clock signal to each of the plurality of ICs.
Embodiments herein describe a security ring for an integrated circuit. In an example, a first die includes functional circuitry within an inner region, and a first security ring surrounding the functional circuitry. A second die includes protection circuitry (e.g., tamper detection circuitry) within an inner region, and a second security ring surrounding the protection circuitry. The first security ring sends probing signals to the protection circuitry via the second security ring, receives probing responses from the protection circuitry via the second security ring, determines a physical status of the protection circuitry based on the probing responses, and initiates a remedial action if the physical status of the protection circuitry indicates physical tampering of the protection circuitry.
G06F 21/75 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer, to assure secure computing or processing of information by inhibiting the analysis of circuitry or operation, e.g. to counteract reverse engineering
Multiple semiconductor dice are disposed on a silicon interposer and are communicatively coupled via the interposer. A first die includes a first memory and a readback circuit, which is coupled to the first memory and coupled to receive a readback command communicated through the interposer. A hash circuit on the first die is configured to generate a message digest from data in the first memory, and an encryption circuit on the first die is configured to encrypt the message digest into an encrypted message digest. The encrypted message digest is accessible through the interposer.
Disclosed herein are an interposer, an integrated circuit (IC) chip assembly including the interposer, and a method for making the IC chip assembly. The interposer includes a transparent core having a cavity and an optical waveguide formed in a surface of the cavity, an optical source disposed within the cavity and coupled with the optical waveguide, and a photonic integrated circuit disposed within the cavity and coupled with the optical waveguide. The photonic integrated circuit, the optical source, and the transparent core are co-planar with each other. The interposer further includes a redistribution layer disposed on the transparent core and having metal traces coupled with the optical source. The chip assembly includes an IC die stack having a plurality of IC dies and a package substrate disposed under the interposer. The interposer is disposed under the IC die stack and couples the IC die stack with the package substrate.
Implementing a data movement network includes tiling one or more layers of a machine learning model based, at least in part, on amounts of addressable memory available in different memory levels of a memory architecture of an electronic system. Logical connections specifying compute tiles of the electronic system and logical address spaces corresponding to the compute tiles are generated. Physical connections are generated within the memory architecture by binding ports of direct memory access circuits of the memory architecture to the logical connections. Data transfers for memories between the different memory levels are scheduled based, at least in part, on a loop order of the tiling. Buffers for data of the data transfers are placed within the memories based on the scheduling.
Embodiments herein describe storing unaligned data structures in local memory that are then loaded into cores. For example, the data structures may have a length that is not a power of 2 so that they do not align with the width (or the bandwidth) of the local memories. A load unit in the core can receive multiple data chunks from the local memory and identify an unaligned data structure that spans across the data chunks. The data structures can then be stored in a register as an aligned data structure as the width of the register may match the length of the data structure.
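The load path described in this abstract can be sketched with byte strings. The 6-byte structure length, the 8-byte chunk width, and the name `load_struct` are assumptions chosen so that the structure length is not a power of 2; the abstract gives no concrete sizes.

```python
def load_struct(memory, index, struct_len=6, chunk_width=8):
    """Return structure number `index` from densely packed local memory.

    The memory is read only in whole chunks of `chunk_width` bytes, so a
    structure may span two chunks and must be cut out of the joined chunks.
    """
    start = index * struct_len
    end = start + struct_len
    first_chunk = start // chunk_width
    last_chunk = (end - 1) // chunk_width
    chunks = [
        memory[c * chunk_width:(c + 1) * chunk_width]
        for c in range(first_chunk, last_chunk + 1)
    ]
    joined = b"".join(chunks)
    offset = start - first_chunk * chunk_width
    # The result is exactly struct_len bytes, i.e. aligned to a register of that width.
    return joined[offset:offset + struct_len]
```

With these sizes, structure 1 occupies bytes 6 to 11 and therefore straddles the first two 8-byte chunks.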
Embodiments herein describe a hardware accelerator that includes multiple clock domains. For example, the hardware accelerator can include data processing engines (DPEs) which include circuitry for performing acceleration tasks (e.g., artificial intelligence (AI) tasks, data encryption tasks, data compression tasks, and the like). The DPEs are interconnected to permit them to share data when performing the acceleration tasks. In addition to the DPEs, the hardware accelerator can include interface circuitry such as an interconnect, a controller, address translation circuitry, etc. The DPEs may be in a first clock domain while the other circuitry is in a second clock domain. The two clock domains can use different frequency clock circuits, for example, to generate more bandwidth for moving data into and out of the hardware accelerator while reducing power consumption.
G06F 15/78 - Architectures of general purpose stored program computers comprising a single central processing unit
G06F 15/173 - Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star or snowflake
G06F 1/04 - Generating or distributing clock signals or signals derived directly therefrom
G06F 12/1045 - Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache
G06F 12/1081 - Address translation for peripheral access to main memory, e.g. direct memory access [DMA]
G06F 13/28 - Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access, cycle steal
26.
DECOUPLING PROCESSING AND INTERFACE CLOCKS IN AN IPU
Embodiments herein describe a hardware accelerator that includes multiple clock domains. For example, the hardware accelerator can include data processing engines (DPEs) which include circuitry for performing acceleration tasks (e.g., artificial intelligence (AI) tasks, data encryption tasks, data compression tasks, and the like). The DPEs are interconnected to permit them to share data when performing the acceleration tasks. In addition to the DPEs, the hardware accelerator can include interface circuitry such as an interconnect, a controller, address translation circuitry, etc. The DPEs may be in a first clock domain while the other circuitry is in a second clock domain. The two clock domains can use different frequency clock circuits, for example, to generate more bandwidth for moving data into and out of the hardware accelerator while reducing power consumption.
Embodiments herein describe using DMA circuitry in multiple tiles in a hardware accelerator array to program the DMA operations within the array. For example, a system on a chip (SoC) may include a controller that is external to the hardware accelerator array. While the controller can be used to program the DMA circuitry within the array, this can be slow since the controller may be compute limited. Instead, the embodiments herein describe techniques where the controller is provided pointers to the register reads and writes corresponding to the DMA operations. The controller can provide these pointers to multiple DMA engines in the hardware accelerator array (e.g., DMA circuitry in interface tiles) which fetch the DMA operations and program themselves, as well as other DMA circuitry in the array.
G06F 13/28 - Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access, cycle steal
G06F 9/30 - Arrangements for executing machine instructions, e.g. instruction decode
G06F 13/16 - Handling requests for interconnection or transfer for access to memory bus
Embodiments herein describe storing unaligned data structures in local memory that are then loaded into cores. For example, the data structures may have a length that is not a power of 2 so that they do not align with the width (or the bandwidth) of the local memories. A load unit in the core can receive multiple data chunks from the local memory and identify an unaligned data structure that spans across the data chunks. The data structures can then be stored in a register as an aligned data structure as the width of the register may match the length of the data structure.
Embodiments herein describe a system including a first optical device disposed adjacent a photonics integrated circuit (PIC), wherein the first optical device includes a first mirror to receive a light beam from the PIC and deflect the light beam in a first direction, a second optical device including a second mirror to receive the light beam deflected in the first direction and deflect the light beam deflected in the first direction toward a second direction, and a multi-channel fiber array to receive the light beam deflected in the second direction.
G02B 6/12 - Light guides; Structural details of arrangements comprising light guides and other optical elements, e.g. couplings of the optical waveguide type of the integrated circuit kind
An apparatus includes a data processing array having a plurality of array tiles. Each array tile can include a random-access memory (RAM) having a local memory interface accessible by circuitry within the array tile and an adjacent memory interface accessible by circuitry disposed within an adjacent array tile. Each adjacent memory interface of each array tile can include isolation logic that is programmable to allow the circuitry disposed within the adjacent array tile to access the RAM or prevent the circuitry disposed within the adjacent array tile from accessing the RAM. The data processing array can be subdivided into a plurality of partitions wherein the isolation logic of the adjacent memory interfaces is programmed to prevent array tiles from accessing RAMs across a boundary between the plurality of partitions.
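The partitioning rule in this abstract can be sketched for a single row of array tiles. The one-dimensional layout, the name `program_isolation`, and the dictionary of allow bits are simplifying assumptions; a real array would be two-dimensional and the isolation logic would be programmed through registers.

```python
def program_isolation(partition_of):
    """Compute the isolation setting of every adjacent memory interface.

    partition_of: list mapping tile index -> partition id for a row of tiles.
    Returns a dict keyed by (owner_tile, neighbor_tile); True means the
    neighbor may access the owner's RAM, False means the access is blocked.
    """
    allow = {}
    count = len(partition_of)
    for tile in range(count):
        for neighbor in (tile - 1, tile + 1):
            if 0 <= neighbor < count:
                # Access is permitted only when both tiles share a partition.
                allow[(tile, neighbor)] = partition_of[tile] == partition_of[neighbor]
    return allow
```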
Embodiments herein describe secure solutions for resource-restrictions on integrated circuits. In an example, dedicated compliance circuitry monitors resource metrics of functional circuitry over dedicated communication infrastructure based on a hardware-embedded authentication metric, and performs a remedial action if the resource metrics exceed resource restrictions (e.g., disables the functional circuitry). The compliance circuitry may include a dedicated processor, non-reprogrammable storage circuitry encoded with first instructions and the authentication metric, and reprogrammable storage circuitry encoded with second instructions. The processor executes the first instructions on power-up. The first instructions cause the processor to authenticate the second instructions based on the authentication metric, and execute the second instructions if the second instructions are authenticated. The second instructions cause the processor to monitor the resource metric and perform the remedial action. The second instructions may be modified but will not pass authentication if the modification is not encoded based on the authentication metric.
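The two-stage flow in this abstract can be sketched with an HMAC check. Treating the hardware-embedded authentication metric as an HMAC-SHA256 key is an assumption, as are the names `power_up` and `monitor` and the returned status strings; the abstract does not say how authentication is performed.

```python
import hashlib
import hmac

def power_up(auth_metric, second_instructions, tag):
    """First-stage logic: authenticate the reprogrammable second instructions."""
    expected = hmac.new(auth_metric, second_instructions, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return "rejected"  # modified without knowledge of the authentication metric
    return "authenticated"

def monitor(resource_metric, restriction):
    """Second-stage logic: act when the resource metric exceeds its restriction."""
    if resource_metric > restriction:
        return "disable_functional_circuitry"
    return "ok"
```

Second instructions that are updated and re-tagged with the same metric still authenticate, while an update with a stale or forged tag is rejected.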
Embodiments herein describe a host that polls a network adapter to receive data from a network. That is, the host/CPU/application thread polls the network adapter (e.g., the network card, NIC, or SmartNIC) to determine whether a packet has been received. If so, the host informs the network adapter to store the packet (or a portion of the packet) in a CPU register. If the requested data has not yet been received by the network adapter from the network, the network adapter can delay responding to the request to provide extra time for the adapter to receive the data from the network.
H04L 43/103 - Active monitoring, e.g. heartbeat, ping or trace-route with adaptive polling, i.e. dynamically adapting the polling rate
H04L 67/1097 - Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
An integrated circuit die stack capable of detecting a physical tampering event, and a method thereof, are described herein. The integrated circuit die stack includes a first integrated circuit die including a sensor network that extends substantially across an entire top surface of the first integrated circuit die, and a second integrated circuit die stacked below the first integrated circuit die. The second integrated circuit die is configured to receive sensing signals generated by the sensor network via a plurality of through-silicon-vias coupled with the first integrated circuit die and the second integrated circuit die.
G11C 19/28 - Digital stores in which the information is moved stepwise, e.g. shift registers using semiconductor elements
H01L 23/00 - Details of semiconductor or other solid state devices
H01L 23/48 - Arrangements for conducting electric current to or from the solid state body in operation, e.g. leads or terminal arrangements
H01L 25/18 - Assemblies consisting of a plurality of semiconductor or other solid state devices, the devices being of types provided for in two or more different main groups of the same subclass , , , , or
H10B 80/00 - Assemblies of multiple devices comprising at least one memory device covered by this subclass
35.
LOCALIZED AND RELOCATABLE SOFTWARE PLACEMENT AND NOC-BASED ACCESS TO MEMORY CONTROLLERS
An integrated circuit device includes a processing element, a plurality of memory controllers, and a network on chip (NoC). The NoC has a first network including a plurality of interconnected switches having routing tables and a second network coupled to the first network. The second network includes a crossbar. The NoC is configured to implement a path coupling the processing element and the plurality of memory controllers in which a first portion of the path is implemented in the first network and a second portion of the path is implemented in the second network. The crossbar connects the processing element to any memory controller of the plurality of memory controllers while maintaining a same delay for the path.
A system includes a high bandwidth memory (HBM) and a convolutional neural network (CNN) engine. The HBM includes a virtual bank portion and a system memory portion. The virtual bank portion is configured to store feature map data and the system memory portion is configured to support data exchanges with a host. The CNN engine includes a convolutional unit configured to execute convolutional layer instructions, a depthwise convolutional unit configured to execute depthwise layer instructions, and a first on-chip buffer. The first on-chip buffer is configured to receive and store the feature map data from the virtual bank portion or receive and store data results from the convolutional unit. The first on-chip buffer is further configured to send the feature map data or the data results from the convolutional unit to the depthwise convolutional unit for processing.
G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
Systems and methods are described for bypassing subsequent lookups in packet processing pipelines in which multiple circuit blocks include pre-processing circuitry that determines keys based on parsed contents of packets and retrieves responses from respective look-up tables (LUTs) based on the keys. The LUT of a first one of the blocks may be programmed with keys and/or responses for other ones of the circuit blocks, and the first circuit block may provide the keys and/or responses in metadata of the packets. Alternatively, or additionally, the first circuit block may provide parsed contents of the packets in the metadata of the respective packets. The other circuit blocks may selectively bypass the respective pre-processing circuitry based on the metadata.
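The bypass behaviour described in the abstract above can be illustrated with a small software model. This is only a sketch under stated assumptions: the names `derive_key`, `run_pipeline`, the `next_keys` field, and the dictionary-based LUTs are all hypothetical and are not taken from the source, which describes hardware circuit blocks.

```python
def derive_key(packet, block):
    # Hypothetical stand-in for a block's pre-processing circuitry:
    # build a lookup key from parsed packet contents.
    return (block, packet["dst"])


def run_pipeline(packet, blocks, luts):
    """Run a packet through the blocks in order.

    A LUT response may carry "next_keys" (an assumed field) holding keys for
    later blocks. Those keys travel in the packet metadata, so a later block
    can skip its own key derivation.
    """
    metadata = {}
    results = []
    for block in blocks:
        if block in metadata:
            key, bypassed = metadata[block], True    # pre-processing bypassed
        else:
            key, bypassed = derive_key(packet, block), False
        response = luts[block].get(key, {})
        metadata.update(response.get("next_keys", {}))
        results.append((block, bypassed, response.get("action")))
    return results


luts = {
    "b0": {("b0", "10.0.0.1"): {"action": "forward",
                                "next_keys": {"b1": ("b1", "10.0.0.1")}}},
    "b1": {("b1", "10.0.0.1"): {"action": "count"}},
}
trace = run_pipeline({"dst": "10.0.0.1"}, ["b0", "b1"], luts)
```

In this model the second block reports that it bypassed key derivation because the first block's response already supplied its key.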
Described herein are systems and methods for programmable, hardware-accelerated congestion monitoring and/or control. At least one circuit can configure a plurality of hardware circuits with one or more rules that, when satisfied, cause the plurality of hardware circuits to generate one or more congestion events indicative of congestion in a network. The at least one circuit can receive the one or more congestion events generated by the plurality of hardware circuits based on one or more network signals in the network satisfying the one or more rules. In response to receipt of the one or more congestion events from the plurality of hardware circuits configured with the one or more rules to detect the congestion in the network, the at least one circuit can analyze the one or more congestion events to address the congestion in the network. Various other methods, systems, and computer-readable media are also disclosed.
A system for packing data includes a controller configured to receive compressed data. The compressed data includes data items and qualifier bits for the data items. The controller is configured to discard the data items designated as invalid by the qualifier bits. The controller is configured to generate data type bits specifying data type information for the data items designated as valid by the qualifier bits. The system includes a buffer. The controller is configured to store the data items designated as valid by the qualifier bits and the data type bits in the buffer. A system can include one or more decoders configured to decode encoded data and output literals, lengths, and distances.
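As a rough software analogue of the packing step above, the following sketch drops items whose qualifier bit marks them invalid and stores each surviving item together with a data type tag. The function name `pack_valid_items`, the `classify` callback, and the tag values are assumptions made for illustration; the source does not specify how data type bits are derived.

```python
def pack_valid_items(items, qualifier_bits, classify):
    """Keep only items whose qualifier bit is 1 and tag each with type bits.

    classify is a hypothetical callback that returns the data type bits
    for a valid item.
    """
    buffer = []
    for item, valid in zip(items, qualifier_bits):
        if not valid:
            continue  # discard items designated invalid
        buffer.append((classify(item), item))
    return buffer


# Assumed tagging rule for the example: 0 for values >= 32, 1 otherwise.
packed = pack_valid_items([65, 3, 66, 7], [1, 0, 1, 1],
                          lambda x: 0 if x >= 32 else 1)
```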
To perform matrix multiplication operations for one or more applications, a processing system includes an acceleration unit (AU) having a block-scaled dot-product circuitry configured to multiply a first matrix by a second matrix. To this end, the block-scaled dot-product circuitry first partitions the first matrix into one or more multi-dimensional scaled blocks and the second matrix also into one or more multi-dimensional scaled blocks. The block-scaled dot-product circuitry next determines dot products of respective portions of the first matrix and corresponding portions of the second matrix using the multi-dimensional scaled blocks of the matrices and then combines these dot products to determine the dot product of the first matrix and the second matrix.
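The idea of combining per-block dot products can be sketched for the simplest case of two vectors. This is a simplified illustration, not the circuitry described above: it uses one-dimensional blocks rather than multi-dimensional ones, and the 8-bit symmetric quantization, the block size, and the helper names are assumptions.

```python
def quantize_block(values, bits=8):
    """Return (scale, integers) so that each value is roughly scale * integer."""
    max_abs = max(abs(v) for v in values)
    qmax = 2 ** (bits - 1) - 1
    scale = max_abs / qmax if max_abs else 1.0
    return scale, [round(v / scale) for v in values]


def block_scaled_dot(a, b, block_size=4):
    """Approximate dot(a, b) by summing scaled integer dot products per block."""
    assert len(a) == len(b)
    total = 0.0
    for start in range(0, len(a), block_size):
        scale_a, q_a = quantize_block(a[start:start + block_size])
        scale_b, q_b = quantize_block(b[start:start + block_size])
        int_dot = sum(x * y for x, y in zip(q_a, q_b))  # integer-only work
        total += scale_a * scale_b * int_dot            # apply the shared scales once
    return total


a = [1.0, 2.0, 3.0, 4.0, 0.5, 0.25, 0.125, 0.0]
b = [1.0] * 8
approx = block_scaled_dot(a, b)
exact = sum(x * y for x, y in zip(a, b))
```

Because each block carries its own scale, the second block of small values keeps its precision even though the first block contains much larger values.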
G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
Examples herein describe pulse generation circuitry. The pulse generation circuitry includes a first pulse generator circuit configured to generate a first pulsed output by sampling a data input using a first clock signal having first pulses and a second clock signal having second pulses that do not overlap the first pulses. The first and second clock signals are separated by a phase shift. The pulse generation circuitry also includes a second pulse generator circuit configured to generate a second pulsed output by sampling the data input using a third clock signal having third pulses and a fourth clock signal having fourth pulses that do not overlap the third pulses. The third and fourth clock signals are separated by the phase shift. A multiplexor is configured to output a third pulsed output based on the first pulsed output and the second pulsed output.
To perform matrix multiplication operations for one or more applications, a processing system includes an acceleration unit (AU) having a block-scaled dot-product circuitry configured to multiply a first matrix by a second matrix. To this end, the block-scaled dot-product circuitry first partitions the first matrix into one or more multi-dimensional scaled blocks and the second matrix also into one or more multi-dimensional scaled blocks. The block-scaled dot-product circuitry next determines dot products of respective portions of the first matrix and corresponding portions of the second matrix using the multi-dimensional scaled blocks of the matrices and then combines these dot products to determine the dot product of the first matrix and the second matrix.
G06F 5/01 - Methods or arrangements for data conversion without changing the order or content of the data handled for shifting, e.g. justifying, scaling, normalising
43.
EMULATING A CIRCUIT DESIGN IN COMMUNICATION WITH A PERIPHERAL
Emulating a circuit design in communication with a peripheral includes an emulator including at least a portion of the circuit design. The portion of the circuit design includes a processor circuit and a first bridge circuit coupled to the processor circuit. The first bridge circuit is configured to receive first data from the processor circuit, generate packetized first data from the first data, and convey the packetized first data over a network to a peripheral. The peripheral is remotely located from the emulator and is controlled by signals derived from the packetized first data.
G06F 30/331 - Design verification, e.g. functional simulation or model checking using simulation with hardware acceleration, e.g. by using field programmable gate array [FPGA] or emulation
G06F 30/333 - Design for testability [DFT], e.g. scan chain or built-in self-test [BIST]
Embodiments herein describe a computer architecture including a device having a test vector memory (TVM) space configured to store test patterns and a deterministic built-in self-test (DBIST) direct memory access (DMA) controller configured to receive the test patterns directly from the TVM space and apply the test patterns to at least one hardware block under test. The DBIST DMA controller sends a scan bus signal and a scan clock signal to the at least one hardware block under test, the scan bus signal providing the test patterns to the at least one hardware block under test. The test patterns are generated by a manufacturer of the at least one hardware block under test.
Embodiments herein describe a methodology to achieve transaction redundancy in memory-constrained devices. In an example, an initiator circuit issues an original transaction that includes a memory access request and an address of a first region of memory cells. Transaction redundancy circuitry generates a redundant transaction having an address of a second region of the memory cells (e.g., at a fixed offset from the address of the original transaction). Address transformer circuitry transforms the initial target address of the original and/or redundant transaction to ensure that a bit fault in the initial address results in an incorrect transformed address that is separated from a desired address, which will result in a data mismatch when original data and redundant data are retrieved and compared. The initial target address may be transformed based on a Hamming, SECDED, CRC, and/or other code.
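A toy model can make the redundancy scheme above more concrete. It is a sketch under several assumptions: the fixed offset value, the use of a single appended parity bit as the address transform (standing in for the Hamming, SECDED, or CRC codes the abstract mentions), and the `fault_mask` parameter used to inject an address fault are all hypothetical.

```python
OFFSET = 0x100  # assumed fixed offset between original and redundant regions


def parity(x):
    return bin(x).count("1") & 1


def transform(addr):
    # Assumed transform: append a parity bit, so a single-bit address fault
    # changes at least two bits of the transformed address.
    return (addr << 1) | parity(addr)


class RedundantMemory:
    def __init__(self):
        self.cells = {}

    def write(self, addr, data, fault_mask=0):
        # Original transaction.
        self.cells[transform(addr)] = data
        # Redundant transaction; fault_mask models a bit fault in its address.
        self.cells[transform((addr + OFFSET) ^ fault_mask)] = data

    def read(self, addr):
        original = self.cells.get(transform(addr))
        redundant = self.cells.get(transform(addr + OFFSET))
        return original, original == redundant  # (data, copies match?)


mem = RedundantMemory()
mem.write(0x10, 0xAB)
mem.write(0x20, 0xCD, fault_mask=0x4)
ok_read = mem.read(0x10)
faulty_read = mem.read(0x20)
```

The faulty write lands at a different transformed address, so the later comparison of the original and redundant copies fails and exposes the fault.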
Examples herein describe memory lifecycle state sensors. A memory lifecycle state sensor includes a memory and a processor. The processor is configured to write a first value to a cell of the memory at a first voltage, and the cell is storing a second value written to the cell at a second voltage that is greater than the first voltage. A value is read from the cell and compared with the first value. An indication of a lifecycle state for the cell is generated based on comparing the value with the first value, the first voltage, and the second voltage.
Embodiments herein describe a 3D splintered physical unclonable function (3D-sPUF). In an example, an integrated circuit (IC) device includes multiple dies in a stacked configuration, and a PUF circuit generates a set of bits that is unique to the PUF circuit based on physical variations of elements of the PUF circuit, where the PUF circuit is distributed amongst two or more of the dies.
H04L 9/32 - Arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system
An implementation includes an integrated circuit, a network-on-chip (NoC), and a plurality of first circuits. Each first circuit may include a compressor and a decompressor, the compressor being configured to compress datastreams and the decompressor being configured to decompress the compressed datastreams. The compressed datastreams may include symbols, and the decompressor may include a plurality of symbol decoders configured to decode the compressed datastreams in parallel. The integrated circuit also includes a memory circuit. The circuit also includes a plurality of switches, the plurality of switches being interconnected and communicatively linking the plurality of first circuits with the memory circuit.
H03M 7/30 - Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
H03M 7/46 - Conversion to or from run-length codes, i.e. by representing the number of consecutive digits, or groups of digits, of the same kind by a code word and a digit indicative of that kind
Embodiments herein describe CRAM validation using an external device (ED). The ED selects unused addresses of CRAM as challenge registers (CRs), determines challenge bits for the CRs, and provides the selected addresses and the challenge bits to challenge circuitry of the IC device. The challenge circuitry initiates storage of the challenge bits at the selected CRAM addresses and invokes scan circuitry to scan the CRAM. The scan circuitry retrieves contents of CRAM addresses used to store configuration bits and contents of the selected CRAM addresses, and provides the contents or a code determined from the contents to the challenge circuitry (i.e., bypassing validation circuitry of scan logic). The challenge circuitry forwards the contents or the code to the ED as a challenge response, and the ED validates the CRAM based on the challenge response and a golden copy of the configuration bits.
G06F 21/79 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data in semiconductor storage media, e.g. directly-addressable memories
Updating machine learning models with user data includes executing, by a data processing system, a container including a first machine learning (ML) model, training data for the first ML model, and a library of machine learning functions. The data processing system executes one or more of the machine learning functions of the library. The one or more of the machine learning functions are configured to build a second ML model trained, at least in part, on user training data and to compare accuracy of the first ML model with accuracy of the second ML model. An ML model also may be trained to predict compilation time for circuit designs using training data that includes circuit design features, hardware features of a data processing system, and runtime features from the data processing system.
An integrated circuit device includes logic circuitry and a Network-on-Chip (NoC). The logic circuitry performs one or more operations of an application. The NoC is coupled to the logic circuitry via a NoC master unit (NMU). The NoC includes cache controller circuitry that receives a first memory command from the logic circuitry via the NMU. Further, in response to data associated with the first memory command being stored within a cache memory, the cache controller circuitry executes the first memory command on the data of the cache memory.
G06F 12/0802 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
53.
MISMATCH RESISTANT THERMAL CONTROL LOOP FOR CASCADED OPTICAL RING RESONATORS
An integrated circuit (IC) device includes optoelectronic circuitry having a first heater and a second heater, and controller circuitry having an input coupled to a photodiode of the optoelectronic circuitry and an output coupled to the first heater and the second heater of the optoelectronic circuitry, the controller circuitry configured to determine an offset from a baseline heater control signal code based on a transimpedance amplifier (TIA) control signal code of an input signal received from the photodiode, and provide a first heater control signal to the first heater and a second heater control signal to the second heater based on the offset of the optoelectronic circuitry.
G02F 1/21 - Devices or arrangements for the control of the intensity, colour, phase, polarisation or direction of light arriving from an independent light source, e.g. switching, gating or modulating; Non-linear optics for the control of the intensity, phase, polarisation or colour by interference
G02B 6/293 - Optical coupling means having data bus means, i.e. plural waveguides interconnected and providing an inherently bidirectional system by mixing and splitting signals with wavelength selective means
G02F 1/025 - Devices or arrangements for the control of the intensity, colour, phase, polarisation or direction of light arriving from an independent light source, e.g. switching, gating or modulating; Non-linear optics for the control of the intensity, phase, polarisation or colour based on semiconductor elements having potential barriers, e.g. a PN or PIN junction in an optical waveguide structure
G02F 1/01 - Devices or arrangements for the control of the intensity, colour, phase, polarisation or direction of light arriving from an independent light source, e.g. switching, gating or modulating; Non-linear optics for the control of the intensity, phase, polarisation or colour
Examples herein describe optical receiver circuitry. The optical receiver circuitry includes a polarization diversifier and first and second waveguides. The polarization diversifier is configured to receive an input optical signal, output a first component of the input optical signal into a first end of an optical path, and output a second component of the input optical signal into a second end of the optical path. An add-drop ring resonator filter is disposed in the optical path. The first waveguide is configured to transmit the first optical component from the add-drop ring resonator filter to a photodetector circuit. The second waveguide is configured to transmit the second optical component from the add-drop ring resonator filter to the photodetector circuit. The first waveguide has a first length and the second waveguide has a second length that is greater than the first length.
G02B 6/12 - Light guides; Structural details of arrangements comprising light guides and other optical elements, e.g. couplings of the optical waveguide type of the integrated circuit kind
G02B 6/126 - Light guides; Structural details of arrangements comprising light guides and other optical elements, e.g. couplings of the optical waveguide type of the integrated circuit kind using polarisation effects
H04J 14/02 - Wavelength-division multiplex systems
An injection locked ring oscillator (ILRO) system is disclosed. The ILRO system includes an ILRO circuit configured to receive a plurality of injection control signals and a phase control signal, and to generate a plurality of output clock signals; a phase detector circuit configured to receive the output clock signals and to generate a phase output signal based on phase differences of particular pairs of the output clock signals; and a phase to voltage circuit configured to receive the phase output signal from the phase detector circuit, and to generate the phase control signal based on the phase output signal, where the phase control signal presents a negative feedback phase signal to the ILRO circuit for the phase differences in the particular pairs of the output clock signals.
Disclosed herein are thermal control devices suitable for thermally regulating chip packages, and electronic devices having the same. In one example, a multi-cavity thermal control device is provided that includes a body having a center cavity disposed between inlet and outlet cavities. The inlet cavity has an inlet port formed proximate a first side of the body. The outlet cavity has an outlet port formed proximate a second side of the body. The center cavity has an inlet coupled to an outlet of the inlet cavity. The inlet of the center cavity is disposed closer to a center of the center cavity than an edge of the center cavity. The center cavity has an outlet configured to flow fluid into the outlet cavity.
H01L 23/427 - Cooling by change of state, e.g. heat pipes
H01L 23/40 - Mountings or securing means for detachable cooling or heating arrangements
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid state devices the devices all being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes the devices not having separate containers the devices being of a type provided for in group
57.
Configuration of manager-subordinate connectivity paths of a system-on-chip
A connectivity tool validates a connectivity configuration of each manager of a plurality of managers in a system-on-chip (SoC) architecture based on data that indicate excluded interfaces of a plurality of interfaces of the SoC architecture and excluded subordinates of a plurality of subordinates of the SoC architecture. The connectivity configuration specifies a connectivity status of the manager and one or more of the plurality of interfaces. The connectivity tool configures connectivity paths of the SoC to include a path from a manager of the plurality of managers to a subordinate of the plurality of subordinates for each valid connectivity configuration.
A method of using an oscillator circuit is disclosed. The method includes: with an oscillator, generating a plurality of output clock signals based on an override control signal; with a phase detector circuit, generating a phase error signal based on the output clock signals; with a gain stage circuit, modifying a feedback control signal based on the phase error signal and an offset compensation code; with a controller, modifying the offset compensation code; with a comparator, generating an equality signal indicating that the feedback control signal is equal to the override control signal; and with the controller, in response to the equality signal, causing the modified offset compensation code to be stored.
H03L 7/085 - Details of the phase-locked loop concerning mainly the frequency- or phase-detection arrangement, including the filtering or amplification of its output signal
H03L 7/099 - Details of the phase-locked loop concerning mainly the controlled oscillator of the loop
An integrated circuit (IC) includes clock modulation circuitry including delay hierarchy circuitry coupled to a register, the delay hierarchy circuitry configured to receive a clock (CLK) signal, provide a delayed master clock (CLKM) signal to a master latch of the register, and provide a delayed slave clock (CLKS) signal to a slave latch of the register.
Embodiments herein describe a circuit including a passive intermodulation (PIM) model circuit configured to process first data to generate a PIM interference model output to be concatenated with second data, the second data including a first carrier frequency and a second carrier frequency, and the circuit further including a PIM model adapt circuit configured to receive frequency shifted captured data and frequency shifted PIM models to generate updated values to compensate for PIM interference after the PIM interference model output is concatenated with the second data.
H04B 1/525 - Hybrid arrangements, i.e. arrangements for transition from single-path two-direction transmission to single-direction transmission on each of two paths or vice versa with means for reducing leakage of transmitter signal into the receiver
H04B 1/00 - Details of transmission systems, not covered by a single one of groups; Details of transmission systems not characterised by the medium used for transmission
Embodiments herein describe a computer architecture including at least one core including a first cache and a second cache, a shared cache, and an accelerator comprising circuitry configured to manage data and instructions transferred between the first and second caches and the shared cache, wherein the accelerator platform is configured to allow an implementation of a user task to perform multi-level prefetching to timely obtain address translation mappings. Address translation mappings are mappings between virtual addresses and physical addresses stored in a page table. The multi-level prefetching includes a first prefetching request (far request), a second prefetching request (near request), and a third prefetching request (now request).
G06F 12/0862 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
G06F 12/084 - Multiuser, multiprocessor or multiprocessing cache systems with a shared cache
G06F 12/0873 - Mapping of cache memory to specific storage devices or parts thereof
G06F 12/1009 - Address translation using page tables, e.g. page table structures
Embodiments herein describe a circuit including a user domain configured to execute user functions and a hardened domain configured to communicate with the user domain. The hardened domain includes peripheral component interconnect express (PCIe) function decoding logic having a plurality of register bits and a Trusted Execution Environment (TEE) Device Interface Security Protocol (TDISP) core communicating with the PCIe function decoding logic. The TDISP core supports a plurality of PCIe functions. Each register bit of the plurality of register bits is assigned to a respective PCIe function of the plurality of PCIe functions.
G06F 13/42 - Bus transfer protocol, e.g. handshake; Synchronisation
G06F 9/30 - Arrangements for executing machine instructions, e.g. instruction decode
G06F 9/455 - Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
A processing system identifies and removes stuck channels in a quantized neural network (QNN), where a stuck channel is one whose outputs are always mapped to the same quantized number. The processing system identifies, at a layer of the neural network, a first channel as a stuck channel based on the first channel having a constant output. In response to identifying the first channel as a stuck channel, the processing system adjusts a first operator of the layer.
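Detecting a stuck channel as defined above can be sketched as follows. The sketch assumes a uniform 8-bit quantizer with a single scale and a small set of calibration samples; the function names and the quantizer parameters are illustrative assumptions, and the operator adjustment that follows detection is not modelled because the abstract does not specify it.

```python
def quantize(x, scale, zero_point=0, qmin=-128, qmax=127):
    """Uniform quantizer (assumed): scale, round, then clamp to the int8 range."""
    return max(qmin, min(qmax, round(x / scale) + zero_point))


def find_stuck_channels(activations, scale):
    """Return indices of channels whose quantized output never changes.

    activations: list of samples, each a list with one float per channel.
    """
    num_channels = len(activations[0])
    stuck = []
    for c in range(num_channels):
        quantized = {quantize(sample[c], scale) for sample in activations}
        if len(quantized) == 1:  # every sample maps to the same quantized number
            stuck.append(c)
    return stuck


samples = [
    [0.01, 1.0, 500.0],
    [0.02, -2.0, 900.0],
    [-0.03, 3.5, 700.0],
]
stuck = find_stuck_channels(samples, scale=0.5)
```

Channel 0 is stuck because its values all round to zero, and channel 2 is stuck because its values all saturate at the upper clamp.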
Embodiments herein describe a hardware accelerator that includes multiple power or clock domains. For example, the hardware accelerator can include an array of data processing engines (DPEs) where different subsets of the DPEs (e.g., different columns, rows, or blocks) are disposed in different power or clock domains within the hardware accelerator. When one or more subsets of the DPEs are idle (e.g., the hardware accelerator has not assigned any tasks to those DPEs), the accelerator can deactivate the corresponding power or clock domain (or domains), which deactivates the DPEs in those domains while the DPEs in the other power or clock domains remain operational. As such, idle DPEs can be deactivated to conserve energy while DPEs with work can remain operational.
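The gating decision described above reduces to finding domains in which no DPE has work. A minimal sketch, assuming domains are given as a mapping from a domain name to its DPE identifiers (the names `idle_domains`, `col0`, `dpe0`, and so on are hypothetical):

```python
def idle_domains(domains, busy_dpes):
    """Return the power or clock domains in which no DPE currently has a task."""
    return [
        name
        for name, dpes in domains.items()
        if not any(dpe in busy_dpes for dpe in dpes)
    ]


domains = {
    "col0": ["dpe0", "dpe1"],
    "col1": ["dpe2", "dpe3"],
    "col2": ["dpe4"],
}
to_deactivate = idle_domains(domains, busy_dpes={"dpe1"})
```

Only `col0` stays powered in this example, since it is the only domain containing a DPE with assigned work.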
G06F 1/3237 - Power saving characterised by the action undertaken by disabling clock generation or distribution
G06F 1/04 - Generating or distributing clock signals or signals derived directly therefrom
G06F 1/3228 - Monitoring task completion, e.g. by use of idle timers, stop commands or wait commands
G06F 1/3234 - Power saving characterised by the action undertaken
G06F 1/3287 - Power saving characterised by the action undertaken by switching off individual functional units in the computer system
65.
CONTROLLER FOR AN ARRAY OF DATA PROCESSING ENGINES
Embodiments herein describe integrating an accelerator into a same SoC (or same chip or IC) as a CPU. The SoC also includes a controller (e.g., a microcontroller) that orchestrates data processing engines (DPEs) in the accelerator. The controller (or orchestrator) receives a task from the CPU and then configures the DPEs to perform the task. For example, the controller may divide the task into a sequence of operations that are performed by one or more of the DPEs. The controller can then report back to the CPU when the task is complete.
G06F 9/48 - Program initiating; Program switching, e.g. by interrupt
G06F 12/1027 - Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
Embodiments herein describe integrating an AI accelerator into a same SoC (or same chip or IC) as a CPU. Thus, instead of relying on off-chip communication techniques, on-chip communication techniques such as an interconnect (e.g., a NoC) can be used to facilitate communication. This can result in faster communication between the AI accelerator and the CPU. Moreover, a tighter integration between the CPU and AI accelerator can make it easier for the CPU to offload AI tasks to the AI accelerator. In one embodiment, the AI accelerator includes address translation circuitry for translating virtual addresses used in the AI accelerator to physical addresses used to store the data.
G06F 15/78 - Architectures of general purpose stored program computers comprising a single central processing unit
G06F 12/1036 - Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] for multiple virtual address spaces, e.g. segmentation
67.
CONTROLLER FOR AN ARRAY OF DATA PROCESSING ENGINES
Embodiments herein describe integrating an accelerator into a same SoC (or same chip or IC) as a CPU. The SoC also includes a controller (e.g., a microcontroller) that orchestrates data processing engines (DPEs) in the accelerator. The controller (or orchestrator) receives a task from the CPU and then configures the DPEs to perform the task. For example, the controller may divide the task into a sequence of operations that are performed by one or more of the DPEs. The controller can then report back to the CPU when the task is complete.
G06F 15/78 - Architectures of general purpose stored program computers comprising a single central processing unit
G06F 13/42 - Bus transfer protocol, e.g. handshake; Synchronisation
G06F 13/28 - Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access, cycle steal
G06F 12/1036 - Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] for multiple virtual address spaces, e.g. segmentation
G06F 12/0831 - Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means
G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
68.
AN AREA AND POWER EFFICIENT CLOCK DATA RECOVERY (CDR) AND ADAPTATION IMPLEMENTATION FOR DENSE WAVELENGTH-DIVISION MULTIPLEXING (DWDM) OPTICAL LINKS
Embodiments herein describe techniques for area and power efficient clock data recovery (CDR) and adaptation implementations for dense wavelength-division multiplexing (DWDM) optical links and other types of links. One example is a system that includes a plurality of receiver circuits that sample signals based on respective receiver clocks, where the receiver circuits include a reference receiver circuit and remaining receiver circuits, and where the receiver clock of the reference receiver circuit comprises a reference clock. The system further includes a clock and data recovery (CDR) circuit that controls a phase of the reference clock based on outputs of the reference receiver circuit, and time-multiplexed de-skew circuitry configured to determine time-multiplexed phase offsets for the remaining receiver circuits based on time-multiplexed outputs of the remaining receiver circuits, where the remaining receiver circuits phase-shift the reference clock based on the respective time-multiplexed phase offsets to provide the respective receiver clocks.
H04J 14/02 - Systèmes multiplex à division de longueur d'onde
G02B 6/12 - Guides de lumièreDétails de structure de dispositions comprenant des guides de lumière et d'autres éléments optiques, p. ex. des moyens de couplage du type guide d'ondes optiques du genre à circuit intégré
69.
Prediction-based Extrapolation of Pixels for Improved Video Compression
Methods and systems for generating missing reference pixels for intra prediction of coding units are described. A pattern amongst a plurality of available reference pixel samples from a set of reference pixel samples is computed. The pattern can be determined based on a computed difference between actual pixel values of the available reference pixel samples. The patterns are learned based on a comparison of the computed difference between the actual pixel values to a predetermined threshold. The unavailable pixel values are then generated based on the learned pattern. Further, one or more image effects corresponding to the available reference pixel samples are automatically replicated in the generated pixels as well.
H04N 19/105 - Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
H04N 19/132 - Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
H04N 19/136 - Incoming video signal characteristics or properties
H04N 19/159 - Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
H04N 19/182 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being a pixel
H04N 19/593 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
A transceiver circuit is disclosed, the transceiver circuit including a first register circuit, configured to receive serial stimulus data and to generate multi-bit parallel stimulus data, a serializer circuit configured to receive the multi-bit parallel stimulus data and to generate serialized data based on the multi-bit parallel stimulus data, where the serializer circuit includes a serializer data storage device, and where the serializer data storage device lacks circuit structures for scanability, a deserializer circuit configured to receive serial receiver data corresponding with the serialized data and to generate multi-bit parallel response data based on the serial receiver data, where the deserializer circuit includes a deserializer data storage device, and where the deserializer data storage device lacks circuit structures for scanability, and a second register circuit, configured to receive the multi-bit parallel response data and to generate serial response data.
Embodiments herein describe techniques for area and power efficient clock data recovery (CDR) and adaptation implementations for dense wavelength-division multiplexing (DWDM) optical links and other types of links. One example is a system that includes a plurality of receiver circuits that sample signals based on respective receiver clocks, where the receiver circuits include a reference receiver circuit and remaining receiver circuits, and where the receiver clock of the reference receiver circuit comprises a reference clock. The system further includes a clock and data recovery (CDR) circuit that controls a phase of the reference clock based on outputs of the reference receiver circuit, and time-multiplexed de-skew circuitry configured to determine time-multiplexed phase offsets for the remaining receiver circuits based on time-multiplexed outputs of the remaining receiver circuits, where the remaining receiver circuits phase-shift the reference clock based on the respective time-multiplexed phase offsets to provide the respective receiver clocks.
H04L 7/00 - Arrangements for synchronising receiver with transmitter
H04L 1/00 - Arrangements for detecting or preventing errors in the information received
H04L 7/02 - Speed or phase control by the received code signals, the signals containing no special synchronisation information
Described herein are systems and methods for scalable communications. A circuit can receive a request from an application to communicate with a destination over a network. The circuit can identify the destination from information included in the request. In a first case that resources have been allocated for communicating with the destination identified from the request, the circuit can communicate data to the destination over the network using the resources that have been allocated. In a second case that resources have not been allocated for communicating with the destination identified from the request, the circuit can allocate resources for communicating the data with the destination. The circuit can communicate the data to the destination over the network using the resources that have been allocated.
H04L 47/76 - Admission control; Resource allocation using dynamic resource allocation, e.g. in-call renegotiation requested by the user or requested by the network in response to changing network conditions
H04L 67/1097 - Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
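The two cases in the abstract of this entry (resources already allocated for a destination versus not yet allocated) can be illustrated with a minimal Python sketch. The class name `Circuit`, the request dictionary layout, and the `queue_id` resource are hypothetical names chosen for illustration; the abstract does not say what the allocated resources are.

```python
class Circuit:
    """Toy model of on-demand resource allocation per destination (illustrative only)."""

    def __init__(self):
        self.resources = {}   # destination -> allocated resources
        self.sent = []        # log of (destination, queue_id, data) transmissions

    def communicate(self, request):
        # Identify the destination from information included in the request.
        dest = request["destination"]
        # Second case: nothing allocated yet for this destination, so allocate now.
        if dest not in self.resources:
            self.resources[dest] = {"queue_id": len(self.resources)}
        # First case (and continuation of the second): use the allocated resources.
        res = self.resources[dest]
        self.sent.append((dest, res["queue_id"], request["data"]))
        return res


circuit = Circuit()
circuit.communicate({"destination": "node-a", "data": b"hello"})
circuit.communicate({"destination": "node-a", "data": b"again"})  # reuses the allocation
circuit.communicate({"destination": "node-b", "data": b"other"})  # allocates a new one
```

Allocating only on first use keeps the amount of held resources proportional to the number of destinations actually contacted, which is one plausible reading of "scalable" here.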
Embodiments herein describe a hardware accelerator that includes multiple power or clock domains. For example, the hardware accelerator can include data processing engines (DPEs) which include circuitry for performing acceleration tasks (e.g., artificial intelligence (AI) tasks, data encryption tasks, data compression tasks, and the like). The DPEs are interconnected to permit them to share data when performing the acceleration tasks. In addition to the DPEs, the hardware accelerator can include other circuitry such as an interconnect, a controller, address translation circuitry, etc. The DPEs may be in a first power or clock domain while the other circuitry is in a second power or clock domain. That way, when the DPEs are idle (e.g., the hardware accelerator currently has no tasks assigned to it), the first power or clock domain can be powered down while the second power or clock domain can remain powered.
Embodiments herein describe integrating an AI accelerator into a same SoC (or same chip or IC) as a CPU. Thus, instead of relying on off-chip communication techniques, on-chip communication techniques such as an interconnect (e.g., a NoC) can be used to facilitate communication. This can result in faster communication between the AI accelerator and the CPU. Moreover, a tighter integration between the CPU and AI accelerator can make it easier for the CPU to offload AI tasks to the AI accelerator. In one embodiment, the AI accelerator includes address translation circuitry for translating virtual addresses used in the AI accelerator to physical addresses used to store the data.
G06F 15/78 - Architectures of general purpose stored program computers comprising a single central processing unit
G06F 12/1027 - Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
G06F 12/1081 - Address translation for peripheral access to main memory, e.g. direct memory access [DMA]
G06F 13/28 - Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access, cycle steal
G06F 13/42 - Bus transfer protocol, e.g. handshake; Synchronisation
G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
75.
Systems and methods for decentralized address translation
The disclosed computer-implemented method for decentralized address translation can include receiving, by at least one processor implemented outside a processor core, a virtual address translation request. The method can additionally include retrieving, by the at least one processor and in response to the virtual address translation request, a physical address. The method can also include returning, by the at least one processor, the physical address. Various other methods, systems, and computer-readable media are also disclosed.
G06F 12/1045 - Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache
G06F 12/0897 - Caches characterised by their organisation or structure with two or more cache hierarchy levels
76.
MITIGATION OF CONTROL SET PACKING RESTRICTIONS FOR INTEGRATED CIRCUITS
Mitigation of control set packing restrictions includes generating an Observability Don't Care (ODC) expression for a target register of a circuit design. The target register has an original reset signal that is a constant. A plurality of supports of the ODC expression that are driven by driver registers are grouped into a plurality of groups. Each group of the plurality of groups includes only supports that are driven by driver registers having a same reset signal. A control set of each group is different from a control set of the target register. The reset signal of a selected group of the plurality of groups is designated as a candidate reset signal for the target register based on an evaluation of the ODC expression. The circuit design is modified by connecting the candidate reset signal to the target register in place of the original reset signal.
Disclosed approaches for rendering event data from subsystems in different clock domains according to a system-level timeline include, for each of multiple subsystems, sampling a system timer in a first clock domain for a first timestamp by a host processor. A host processor requests a subsystem timestamp from a subsystem timer in each of the subsystems. The subsystem timestamp is associated with the first timestamp, and the subsystem timer operates in a clock domain different from the first clock domain. The host processor translates timestamps in traced event data of the subsystems to a timeline of the system timer using the first timestamp and associated subsystem timestamps.
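The timestamp translation in the preceding abstract can be sketched as a linear mapping anchored at a paired (system timestamp, subsystem timestamp) sample. This is a minimal Python illustration, not the disclosed implementation: the function name `to_system_timeline` and the assumption that both clock frequencies are known and constant are introduced here for illustration.

```python
from fractions import Fraction


def to_system_timeline(event_ticks, sub_ref_ticks, sys_ref_ticks, sub_hz, sys_hz):
    """Map a subsystem-domain event timestamp onto the system timer's timeline.

    sub_ref_ticks and sys_ref_ticks are the associated pair sampled by the host.
    Assumes (hypothetically) that both clocks run at known, constant frequencies.
    """
    elapsed_sub = event_ticks - sub_ref_ticks          # ticks since the reference sample
    return sys_ref_ticks + elapsed_sub * Fraction(sys_hz, sub_hz)


# Two subsystems in different clock domains, each with its own reference pair.
trace = [
    ("dsp", 1500, dict(sub_ref_ticks=1000, sys_ref_ticks=20000, sub_hz=100, sys_hz=1000)),
    ("gpu", 9000, dict(sub_ref_ticks=8000, sys_ref_ticks=20010, sub_hz=2000, sys_hz=1000)),
]
# Sort all events on the common system-level timeline.
timeline = sorted((to_system_timeline(t, **ref), name) for name, t, ref in trace)
```

Exact rational arithmetic (`Fraction`) is used so the sketch avoids floating-point drift when the frequency ratio is not an integer; a real tool would likely round to the system timer's resolution.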
A network-on-chip (NoC) includes a switch. The switch includes a first sub-switch, a second sub-switch, and a synchronization channel coupled to the first sub-switch and the second sub-switch. The first sub-switch and the second sub-switch are coupled to corresponding sub-switches in at least one other switch included in the NoC. Each of the first sub-switch and the second sub-switch includes ports in north, south, east, and west directions. The first sub-switch and the second sub-switch exchange flits of data through an additional port of the first sub-switch coupled to an additional port of the second sub-switch.
H04L 49/109 - Packet switching elements characterised by the switching fabric construction integrated on microchip, e.g. switch-on-chip
79.
SYSTEMS AND METHODS FOR PARALLELIZATION OF EMBEDDING OPERATIONS
A disclosed method may include initializing a deep learning recommendation model (DLRM) comprising a plurality of embedding tables, each embedding table comprising a plurality of embeddings. The method may also include receiving input data associated with accessing embeddings from the plurality of embedding tables and applying a parallelization strategy to process the plurality of embedding tables, the parallelization strategy configured to improve performance by distributing computational workloads and optimizing memory access. The method may also include processing the embeddings based on the input data in accordance with the parallelization strategy, the processing comprising aggregating embeddings accessed from the plurality of embedding tables. The method may also include generating, for further processing, output data based on the processed embeddings. Various other methods, systems, and computer-readable media are also disclosed.
Techniques are described for fine-tuning a neural network. A plurality of fine-tuning layers of a neural network are executed, each corresponding to a respective reference layer of a reference neural network. For each of the fine-tuning layers, a fine-tuning weight matrix is generated based on a reference weight matrix associated with the corresponding reference layer. One or more weights of the fine-tuning weight matrix are then iteratively adjusted based on a comparison of the output of the fine-tuning layer with the output of the corresponding reference layer.
A method includes receiving, by a compiler of a host of a computing system, input code, generating, by the compiler, pipelined input code by adding first tokens in a loop iteration argument field of a loop in the input code to pipeline the loop, the first tokens configured to sequentialize and serialize loop operations, a quantity of the first tokens based on a quantity of pipeline stages, and providing, by the host, the pipelined input code to a controller of an integrated circuit (IC) of the computing system.
Techniques for substrate noise isolation structures for electronic devices are provided. The disclosed techniques greatly reduce substrate noise induced by circuits in integrated circuits (ICs) that include Fin Field Effect Transistors (FinFETs). In an example, an electronic device is provided that includes a first circuit and a second circuit formed on a substrate, a first guard structure formed in the substrate, and a plurality of vias formed through the substrate. The first guard structure formed in the substrate is disposed between the first circuit and the second circuit. The plurality of vias formed through the substrate contact the first guard structure.
H01L 21/768 - Applying interconnections to be used for carrying current between separate components within a device
H01L 23/48 - Arrangements for conducting electric current to or from the solid state body in operation, e.g. leads or terminal arrangements
H01L 23/522 - Arrangements for conducting electric current within the device in operation from one component to another including external interconnections consisting of a multilayer structure of conductive and insulating layers inseparably formed on the semiconductor body
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid state devices, all the devices being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
Disclosed herein are thermal management devices and electronic devices that utilize a plurality of plunger assemblies to route heat efficiently from chip packages. In some examples, the thermal management devices may also be used in electronic devices to route heat efficiently from a power delivery layer residing below chip packages. In one example, a thermal management device is provided that includes a plurality of plunger assemblies retained to a metal plate. Each plunger assembly includes a metal body extending normally through an aperture formed between first and second sides of the plate and a spring biasing a distal end of the metal body away from the second side of the plate.
Embodiments herein describe a method for selectively filtering different wavelengths of optical signals received from an optical channel using cascaded ring resonators, each of the cascaded ring resonators having a first ring and a second ring. The first ring has a varying waveguide width along its length configured to form a first waveguide width portion and a second waveguide width portion, the first waveguide width portion having a greater width than the second waveguide width portion. The second ring has a varying waveguide width along its length configured to form a third waveguide width portion and a fourth waveguide width portion, the fourth waveguide width portion having a greater width than the third waveguide width portion. The method further connects receivers to respective cascaded ring resonators, each of the receivers having a photodetector configured to differentiate between the optical signals.
G02B 6/293 - Optical coupling means having data bus means, i.e. plural waveguides interconnected and providing an inherently bidirectional system by mixing and splitting signals with wavelength selective means
Disclosed herein are thermal management devices and electronic devices that utilize a plurality of plunger assemblies to route heat efficiently from chip packages. In some examples, the thermal management devices may also be used in electronic devices to route heat efficiently from a power delivery layer residing below chip packages. In one example, a thermal management device is provided that includes a plurality of plunger assemblies retained to a metal plate. Each plunger assembly includes a metal body extending normally through an aperture formed between first and second sides of the plate and a spring biasing a distal end of the metal body away from the second side of the plate.
H01L 23/46 - Arrangements for cooling, heating, ventilating or temperature compensation involving the transfer of heat by flowing fluids
H01L 23/433 - Auxiliary members characterised by their shape, e.g. pistons
H01L 23/52 - Arrangements for conducting electric current within the device in operation from one component to another
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid state devices, all the devices being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
Examples herein describe polynomial root search circuitry. The polynomial root search circuitry includes a search circuit configured to identify distinct roots of a first locator polynomial using parallel processing elements. A first subset of the parallel processing elements is configured to output terms of a second locator polynomial based on a first candidate root of the second locator polynomial. A second subset of the parallel processing elements is configured to output the terms of the second locator polynomial based on a second candidate root of the second locator polynomial.
G06F 7/556 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation, using non-contact-making devices, e.g. tube, solid state device; Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation, using unspecified devices for evaluating functions by calculation of logarithmic or exponential functions
G06F 7/552 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation, using non-contact-making devices, e.g. tube, solid state device; Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation, using unspecified devices for evaluating functions by calculation of powers or roots
A method for probing power contact pads on an integrated circuit (IC) die is disclosed. The method includes depositing a probing bump over multiple vias. The vias may be directly exposed or include an exposed contact pad. The method also includes forming a probing bump over and in electric contact with multiple vias. Optionally, the probing bump may be removed after probing.
Approaches for determining quantization scale factors include generating a population of chromosomes. Each chromosome has multiple genes, and each gene specifies a scale factor associated with a layer of a machine learning model. The population of chromosomes is evaluated, and the evaluating includes, for each chromosome in the population, quantizing floating point weights and floating point values of a representative dataset using the scale factors of the chromosome to produce quantized weights and a quantized dataset in the memory arrangement, initiating processing of the quantized dataset using the quantized weights according to the machine learning model, and gauging a level of accuracy of results produced by the processing of the quantized dataset. Satisfaction of termination criteria is determined based on the levels of accuracy associated with the chromosomes in the population. The population of chromosomes is evolved and the evaluating repeated in response to the termination criteria not being satisfied.
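The evaluate-and-evolve loop in the preceding abstract can be sketched in Python as follows. Everything specific here is an assumption made for illustration: the toy "model" is a chain of scalar multiplications (one weight per layer), accuracy is approximated by the negative mean error against the floating point result, and the selection, crossover, and mutation rules are generic genetic-algorithm choices rather than the disclosed ones.

```python
import random


def quantize(x, scale, bits=8):
    """Symmetric fake-quantization: snap x to a grid with step `scale`."""
    qmax = 2 ** (bits - 1) - 1
    return max(-qmax - 1, min(qmax, round(x / scale))) * scale


def accuracy(chromosome, weights, dataset):
    """Proxy accuracy: negative mean error between float and quantized outputs."""
    total = 0.0
    for x in dataset:
        ref, out = x, x
        for w, scale in zip(weights, chromosome):   # one gene (scale) per layer
            ref = ref * w
            out = quantize(out, scale) * quantize(w, scale)
        total += abs(ref - out)
    return -total / len(dataset)


def evolve_scales(weights, dataset, pop_size=8, max_gens=20, target=-0.01, seed=0):
    rng = random.Random(seed)
    # Each chromosome holds one scale-factor gene per layer.
    pop = [[rng.uniform(0.001, 0.5) for _ in weights] for _ in range(pop_size)]
    history = []
    for _ in range(max_gens):
        pop.sort(key=lambda c: accuracy(c, weights, dataset), reverse=True)
        history.append(accuracy(pop[0], weights, dataset))
        if history[-1] >= target:                   # termination criterion satisfied
            break
        parents = pop[: pop_size // 2]              # keep the fitter half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover followed by multiplicative mutation of every gene.
            children.append([rng.choice(pair) * rng.uniform(0.5, 2.0) for pair in zip(a, b)])
        pop = parents + children
    return pop[0], history


best_scales, best_per_generation = evolve_scales([0.5, -1.25, 2.0], [0.1, -0.7, 0.9])
```

Because the fitter half of each generation is carried over unchanged, the best accuracy recorded in `best_per_generation` never decreases from one generation to the next.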
Embodiments herein describe a method for selectively filtering different wavelengths of optical signals received from an optical channel using cascaded ring resonators, each of the cascaded ring resonators having a first ring and a second ring. The first ring has a varying waveguide width along its length configured to form a first waveguide width portion and a second waveguide width portion, the first waveguide width portion having a greater width than the second waveguide width portion. The second ring has a varying waveguide width along its length configured to form a third waveguide width portion and a fourth waveguide width portion, the fourth waveguide width portion having a greater width than the third waveguide width portion. The method further connects receivers to respective cascaded ring resonators, each of the receivers having a photodetector configured to differentiate between the optical signals.
G02B 6/293 - Optical coupling means having data bus means, i.e. plural waveguides interconnected and providing an inherently bidirectional system by mixing and splitting signals with wavelength selective means
90.
MODULAR INTERCONNECT FOR AN INTEGRATED CIRCUIT DEVICE
An integrated circuit device includes a network-on-chip (NoC). Connections for the NoC are generated from a circuit design for the corresponding integrated circuit device. Connections within the NoC are generated by analyzing the circuit design to detect a first connection attribute. The first connection attribute defines a first NoC master unit (NMU) and a first NoC slave unit (NSU). Further, a first NoC configuration is generated. The first NoC configuration includes the connections determined based on the first NMU and the first NSU.
Embodiments herein describe a multiple die system that includes an interposer that connects a first die to a second die. Each die has a bump interface structure that is connected to the other structure using traces in the interposer. However, the bump interface structures may have different orientations relative to each other, or one of the interface structures defines fewer signals than the other. Directly connecting the corresponding signals defined by the structures to each other may be impossible to do in the interposer, or make the interposer too costly. Instead, the embodiments here simplify routing in the interposer by connecting the signals in the bump interface structures in a way that simplifies the routing but jumbles the signals. The jumbled signals can then be corrected using reordering circuitry in the dies (e.g., in the link layer and physical layer).
H01L 23/00 - Details of semiconductor or other solid state devices
G11C 5/06 - Arrangements for interconnecting storage elements electrically
H01L 23/538 - Arrangements for conducting electric current within the device in operation from one component to another, the interconnection structure between a plurality of semiconductor chips being formed on, or in, insulating substrates
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid state devices, all the devices being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
92.
Dynamic adjustment of floating point exponent bias for exponent compression
Approaches for compressing exponents of floating point values include accumulating a distribution of values of exponents of a first set of floating point values, and compressing the exponents of the first set of floating point values into a compressed exponent bit-width as a function of a compressed exponent bias. The compressed exponent bit-width and the compressed exponent bias are adjusted based on the distribution of values of exponents of the first set of floating point values. The distribution of values of exponents of the first set of floating point values is accumulated with values of exponents of a second set of floating point values that is input in a subsequent time period. The exponents of the second set of floating point values are compressed into the compressed exponent bit-width as a function of the compressed exponent bias after the adjusting of the compressed exponent bit-width and the compressed exponent bias.
G06F 7/483 - Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
G06F 7/499 - Denomination or exception handling, e.g. rounding or overflow
H03M 7/24 - Conversion to or from floating-point codes
H03M 7/30 - Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
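The accumulate, adjust, and compress sequence in the abstract of this entry can be illustrated with a short Python sketch. The specific rules below are assumptions made for illustration, not the disclosed method: the bias is set to the smallest exponent observed so far, the bit-width to the fewest bits that cover the observed range, and out-of-range exponents saturate. The class name `ExponentCompressor` is hypothetical, and inputs are assumed to be nonzero finite floats.

```python
import math
from collections import Counter


def exponent_of(x):
    """Unbiased base-2 exponent e such that |x| = m * 2**e with 1 <= m < 2."""
    _, e = math.frexp(x)      # frexp returns a mantissa in [0.5, 1), hence the -1
    return e - 1


class ExponentCompressor:
    def __init__(self, width=8, bias=-127):
        self.hist = Counter()         # running distribution of exponent values
        self.width, self.bias = width, bias

    def accumulate(self, values):
        self.hist.update(exponent_of(v) for v in values)

    def adjust(self):
        lo, hi = min(self.hist), max(self.hist)
        self.bias = lo                                   # assumed rule: bias = smallest exponent
        self.width = max(1, (hi - lo).bit_length())      # fewest bits covering lo..hi

    def compress(self, values):
        top = (1 << self.width) - 1
        # Compressed exponent = exponent - bias, saturated to the available bit-width.
        return [min(max(exponent_of(v) - self.bias, 0), top) for v in values]


comp = ExponentCompressor()
comp.accumulate([1.0, 2.0, 8.0])          # first set: exponents 0, 1, 3
comp.adjust()                             # width 2, bias 0
second = comp.compress([1.0, 4.0, 16.0])  # second set compressed after the adjustment
comp.accumulate([1.0, 4.0, 16.0])         # distribution keeps accumulating
```

With these rules the exponent 4 of `16.0` saturates at the 2-bit maximum until a later adjustment widens the range, which shows why the distribution keeps being accumulated across time periods.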
93.
PROTECTION OF A CIRCUIT DESIGN WITHIN A DESIGN CONTAINER
A key block can be generated from a session key used by a computer-based design tool for a circuit design by encrypting the session key using computer hardware. The key block can be divided, by the computer hardware, into a plurality of sub-blocks. A plurality of enhanced sub-blocks can be generated by the computer hardware by encrypting each sub-block of the plurality of sub-blocks with a different key of a plurality of keys corresponding to a plurality of Intellectual Property (IP) cores of the circuit design. The plurality of enhanced sub-blocks can be stored in a memory.
G06F 21/72 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer, to assure secure computing or processing of information in cryptographic circuits
G06F 30/392 - Floor-planning or layout, e.g. partitioning or placement
G06F 115/08 - Intellectual property [IP] blocks or IP cores
Low-latency gigabit transceiver PHY-based signal switching for emulation, prototyping, and high performance computing (HPC) in a computing platform that includes multiple ICs, where a first one of the ICs includes functional circuitry, a receiver that receives a signal from a second one of the ICs, a transmitter that transmits outgoing data to a third one of the ICs, and a bypass circuit that provides an output of the receiver to one of the functional circuitry and the transmitter (e.g., based on a destination address). The bypass circuit may bypass the functional circuitry, and may further bypass a receive-side media access controller (MAC) and a transmit-side MAC. The IC may multiplex outgoing data to the transmitters. Selectable functions of PHY circuitry may be disabled in bypass mode. The ICs may include field-programmable gate arrays, which may be programmed to emulate respective partitions of a circuit design and/or to perform other functions.
Examples herein describe alignment detection circuitry. The alignment detection circuitry includes a buffer, a first set of correlators, and a second set of correlators. The buffer is configured to output a data stream of multiplexed groups of symbols from multiple data lanes. The first set of correlators is configured to search a candidate data lane of the data stream for bits matching bits of a reference alignment marker based on a first search method. The second set of correlators is configured to search the candidate data lane of the data stream for bits matching the bits of the reference alignment marker based on a second search method.
Computer-based co-simulation includes simulating a circuit design and a co-simulation model configured to model circuitry that operates in coordination with a hardware implementation of the circuit design. In response to a request for a data transfer received by the co-simulation model from the circuit design, a ready signal is provided from the co-simulation model to the circuit design after a first predetermined number of simulation clock cycles corresponding to an initiation interval of the circuitry modeled by the co-simulation model. In response to receiving state information for the data transfer, a response from the co-simulation model is provided to the circuit design after a second predetermined number of simulation clock cycles corresponding to a response time of the circuitry modeled by the co-simulation model.
Methods for fabricating an integrated circuit (IC) device, an IC die configured for probe testing, and an IC device are described herein. In one example, the method includes: forming a conductive cap above and in electrical contact with two or more pillars, each pillar coupled to a power contact pad of an IC die; removing the cap after testing; and depositing a hybrid bonding layer over the IC die device, the hybrid bonding layer having hybrid bond pads coupled to the plurality of power contact pads and the signal contact pads of the IC die.
H01L 21/66 - Testing or measuring during manufacture or treatment
H01L 23/00 - Details of semiconductor or other solid state devices
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid state devices, all the devices being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
98.
DYNAMIC DATA CONVERSION FOR NETWORK COMPUTER SYSTEMS
A computing node for a computing system includes a processor, conversion circuitry, and routing circuitry. The processor generates a data signal based on a function of an application executed by the computing system. The data signal has a first precision format and a first sparse representation. The conversion circuitry receives the data signal from the processor and generates a converted data signal by at least one of converting the first precision format to a second precision format and converting the first sparse representation to a second sparse representation. The routing circuitry transmits the converted data signal to switch circuitry of the computing system.
A computer-implemented method for task management can include managing performance of a task on a message by a plurality of circuits. In some aspects, the task can comprise a sequence of processings to be performed on the message, with each circuit of the plurality of circuits performing a processing of the sequence of processings. In some aspects, the method can include routing, based on the sequence of processings identified for the task, first information regarding the task to a first circuit of the plurality of circuits to perform a first processing of the sequence of processings on the message; receiving, from the first circuit, an output of the first processing; and routing, based on the sequence of processings identified for the task, second information regarding the task to a second circuit of the plurality of circuits to perform a second processing that follows the first processing in the sequence of processings.
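The routing loop described here can be sketched as a small dispatcher that hands the message to one circuit at a time and feeds each output into the next step. Plain Python callables stand in for the circuits; the circuit names and the `run_task` helper are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def run_task(
    message: str,
    sequence: List[str],
    circuits: Dict[str, Callable[[str], str]],
) -> Tuple[str, List[Tuple[str, str]]]:
    """Route a message through circuits in the order the task's sequence names."""
    data = message
    trace = []
    for step in sequence:
        circuit = circuits[step]    # select the circuit for this processing
        data = circuit(data)        # receive that circuit's output
        trace.append((step, data))  # record it before routing to the next step
    return data, trace


# Hypothetical circuits, modeled as string transformations.
circuits = {
    "strip": str.strip,
    "upper": str.upper,
}

result, trace = run_task("  hi ", ["strip", "upper"], circuits)
```

Keeping the sequence in the manager rather than in the circuits means each circuit needs to know only its own processing, which is one plausible reading of the arrangement in the abstract.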
A memory includes a read circuit having a first primitive configured to output a first data item based on least significant bits (LSBs) of a read address and a multiplexer coupled to the first primitive. The multiplexer outputs a selected bit from the first data item as read data based on most significant bits (MSBs) of the read address. The memory includes a write circuit having a second primitive that outputs a second data item based on LSBs of a write address and a modifier circuit that generates a third data item by modifying a bit of the second data item to correspond to write data. The bit is at a location within the second data item selected based on MSBs of the write address. The modifier circuit writes the third data item to a location in the second primitive based on the LSBs of the write address.
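The address split can be shown with a minimal behavioral sketch. It assumes a single word array standing in for both primitives, which simplifies the two-primitive arrangement in the abstract. The class name `BitMemory` and its parameters are hypothetical.

```python
class BitMemory:
    """Bit-addressable memory built from wider words (behavioral sketch)."""

    def __init__(self, lsb_bits: int, msb_bits: int):
        self.lsb_bits = lsb_bits
        self.msb_bits = msb_bits
        # One word per LSB value; each word holds 2**msb_bits selectable bits.
        self.words = [0] * (1 << lsb_bits)

    def _split(self, addr: int):
        if not 0 <= addr < (1 << (self.lsb_bits + self.msb_bits)):
            raise ValueError("address out of range")
        lsb = addr & ((1 << self.lsb_bits) - 1)  # selects the word
        msb = addr >> self.lsb_bits              # selects the bit within the word
        return lsb, msb

    def read(self, addr: int) -> int:
        lsb, msb = self._split(addr)
        word = self.words[lsb]      # primitive output (first data item)
        return (word >> msb) & 1    # multiplexer picks one bit using the MSBs

    def write(self, addr: int, bit: int) -> None:
        lsb, msb = self._split(addr)
        word = self.words[lsb]                                 # second data item
        modified = (word & ~(1 << msb)) | ((bit & 1) << msb)   # third data item
        self.words[lsb] = modified                             # write back at the LSB location


mem = BitMemory(lsb_bits=2, msb_bits=2)
mem.write(0b1101, 1)  # word index 0b01, bit position 0b11
```

The write path is a read-modify-write: only the addressed bit changes, and the remaining bits of the word are preserved.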
G06F 30/327 - Logic synthesis; Behaviour synthesis, e.g. mapping logic, hardware description language [HDL] to netlist, high-level language to register transfer language [RTL] or netlist
G06F 30/323 - Translation or migration, e.g. logic to logic, hardware description language [HDL] translation or netlist translation