An event aggregation fabric, implemented over an Integrated Circuit (IC), may include a Directed Acyclic Graph (DAG) that includes one or more aggregation nodes. Each aggregation node may receive input cue signals that originate from one or more source modules in the IC, via respective one or more lean (e.g., single-wire) connections, and that indicate occurrence of events in the respective source modules. The aggregation node may maintain a deficit count of the input cue signals, and generate an output cue signal based on the deficit count. When the aggregation node is a terminal node of the DAG, it may transfer the output cue signal as an aggregated indication, representing occurrence of events in the source modules, to a target module in the IC. Otherwise, the aggregation node may transfer the output cue signal via a lean connection to a subsequent aggregation node of the DAG, towards the terminal node.
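The abstract above does not define how the deficit count is updated, so the following is only a minimal sketch under one assumed interpretation: the count rises by one for every input cue received and falls by one for every output cue emitted, which lets several simultaneous events be forwarded one pulse at a time over a single wire. The class name `AggregationNode` and the method `tick` are hypothetical.

```python
from typing import Iterable


class AggregationNode:
    """Hypothetical model of one aggregation node (assumed semantics)."""

    def __init__(self) -> None:
        # Cues received so far that have not yet been forwarded.
        self.deficit = 0

    def tick(self, input_cues: Iterable[bool]) -> bool:
        """Advance one cycle: absorb input cues, emit at most one output cue."""
        self.deficit += sum(1 for cue in input_cues if cue)
        if self.deficit > 0:
            self.deficit -= 1
            return True   # one pulse on the lean output connection
        return False


# Two events arrive in the same cycle; they leave as two consecutive pulses.
node = AggregationNode()
pulses = [node.tick([True, True, False]),
          node.tick([False, False, False]),
          node.tick([False, False, False])]
```

In a DAG, the output of one such node would simply be one of the input cues of the next node towards the terminal node.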
An Integrated Circuit (IC) device, and a method of utilizing thereof, may include: a plurality of Processing Elements (PEs), each comprising one or more configurable hardware logic blocks. The IC may further include a plurality of configuration memory elements, each associated with a respective PE, and adapted to maintain two or more configuration settings of the respective PE. The IC may further include a configuration manager circuit, configured to: receive a reconfiguration instruction, dictating a required function of the IC device; based on the reconfiguration instruction, identify at least one target PE of the plurality of PEs as a target for reconfiguration; based on the required function, select a specific configuration setting in the configuration memory element associated with the at least one target PE; and reconfigure at least one hardware logic block of the at least one target PE, according to the selected configuration setting.
A system for accessing memory, comprising: transformation circuitry configured to: receive a memory access request; access a transformation mode value associated with the memory access request and indicative of an address transformation function; apply the address transformation function, indicated by the transformation mode value, to a memory address of the memory access request to compute a transformed memory address; and generate a new memory access request using the memory access request and the transformed memory address; and at least one memory area configured to serve the new memory access request according to the transformed memory address.
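As a rough illustration of the flow described above, the sketch below looks up an address transformation function by a mode value, applies it, and builds a new request carrying the transformed address. The mode numbering and the three transformation functions are assumptions made for the example; the abstract does not name any specific function.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict


@dataclass(frozen=True)
class MemoryRequest:
    address: int
    mode: int  # transformation mode value associated with the request


def identity(addr: int) -> int:
    return addr


def xor_swizzle(addr: int) -> int:
    # Assumed example: XOR the low byte with the next byte up.
    return addr ^ ((addr >> 8) & 0xFF)


def add_offset(addr: int) -> int:
    # Assumed example: relocate by a fixed offset.
    return addr + 0x1000


# Hypothetical mode table; the real encoding is not given in the source.
TRANSFORMS: Dict[int, Callable[[int], int]] = {
    0: identity,
    1: xor_swizzle,
    2: add_offset,
}


def transform_request(req: MemoryRequest) -> MemoryRequest:
    """Build a new request whose address is the transformed address."""
    fn = TRANSFORMS[req.mode]
    return replace(req, address=fn(req.address))


new_req = transform_request(MemoryRequest(address=0x1234, mode=1))
```

The memory area would then serve `new_req` at `new_req.address` instead of the original address.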
An apparatus for computing functions using polynomial-based approximation, comprising one or more processing circuitries configured for computing a polynomial-based approximant approximating a function by executing one or more iterations. Each iteration comprising computing the polynomial-based approximant using scaled fixed-point unit(s) according to a constructed set of coefficients, minimizing an approximation error of the computed polynomial-based approximant compared to the function while complying with one or more constraints selected from a group comprising at least: an accuracy, a compute graph size, a computation complexity, and a hardware utilization of the processing circuitry(s), adjusting one or more of the coefficients in case the approximation error is incompliant with the constraint(s) and initiating another iteration. The polynomial-based approximant and its adjusted set of coefficients for which the computed polynomial-based approximant complies with the constraint(s) may be output to one or more processing circuitries configured to approximate the function by computing the polynomial-based approximant.
An apparatus for computing, comprising a processing circuitry configured for computing an outcome of executing a set of computer instructions comprising a group of data variables, by: identifying an initial state of the processing circuitry; executing a set of anticipated computer instructions produced based on the set of computer instructions and a likely data value, where the likely data value is a value of one of the group of data variables anticipated to be computed by executing the set of computer instructions and computed using at least one program data value; and when identifying, while executing the set of anticipated computer instructions, a failed prediction where the data variable is not equal to the likely data value: restoring the initial state of the processing circuitry; and executing a set of alternative computer instructions, produced based on the set of computer instructions and the at least one likely data value.
A digital hardware circuit and method for performing reduction operations that achieves constant timing depth regardless of input count. The invention replaces conventional sequential binary tree approaches with a parallel matrix comparison architecture where multiple comparator circuits simultaneously compare input signals against each other. Combinatorial logic circuits process comparison outputs to generate dominance signals indicating which input satisfies the reduction criteria, and selection logic outputs the final result. The parallel approach maintains exactly three logic levels regardless of number of processing inputs, enabling significantly higher clock frequencies than conventional methods whose timing depth increases logarithmically with input count. Applications include matrix multiplication engines, floating-point arithmetic units, and artificial intelligence accelerators where reduction operations for maximum/minimum finding, normalization, and other computations are performed millions of times per second. The constant timing depth enables operation at frequencies exceeding 1 GHz while providing scalable performance.
G06F 5/01 - Methods or arrangements for data conversion without changing the order or content of the data handled for shifting, e.g. justifying, scaling, normalising
H03K 19/20 - Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits characterised by logic function, e.g. AND, OR, NOR, NOT circuits
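The parallel matrix comparison described in the reduction-circuit abstract above can be modelled in software as three stages: all pairwise comparisons, a per-input dominance signal, and a final selection. The sketch below does this for a maximum reduction. It is a behavioural model only and says nothing about gate-level timing; the tie-breaking rule (lowest index wins) and the function name `matrix_max` are assumptions.

```python
from typing import Sequence, Tuple


def matrix_max(values: Sequence[float]) -> Tuple[int, float]:
    """Return (index, value) of the maximum using an all-pairs comparison."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one input is required")

    # Stage 1: every input is compared against every input. In hardware
    # these n*n comparators would all operate simultaneously.
    # Ties are broken in favour of the lower index (assumed rule).
    wins = [
        [values[i] > values[j] or (values[i] == values[j] and i <= j)
         for j in range(n)]
        for i in range(n)
    ]

    # Stage 2: an input is dominant when it wins every comparison in its row.
    dominance = [all(row) for row in wins]

    # Stage 3: selection logic picks the single dominant input.
    index = dominance.index(True)
    return index, values[index]


result = matrix_max([3, 9, 2, 9])
```

Because of the tie-breaking rule, exactly one dominance signal is asserted even when the maximum value appears more than once.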
7.
JOINT SCHEDULER FOR HIGH BANDWIDTH MULTI-SHOT PREFETCHING
A joint scheduler adapted for dispatching prefetch and demand accesses of data relating to a plurality of instructions loaded in an execution pipeline of processing circuit(s). Each prefetch access comprises checking whether a respective data is cached in a cache entry and each demand access comprises accessing a respective data. The joint scheduler is adapted to, responsive to each hit prefetch access dispatched for a respective data relating to a respective instruction, associate the respective instruction with a valid indication and a pointer to a respective cache entry storing the respective data such that the demand access relating to the respective instruction uses the associated pointer to access the respective data in the cache, and responsive to each missed prefetch access dispatched for a respective data relating to a respective instruction, initiate a read cycle for loading the respective data from next level memory and cache it in the cache.
G06F 12/0862 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
G06F 12/0897 - Caches characterised by their organisation or structure with two or more cache hierarchy levels
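The hit and miss handling in the joint scheduler abstract above can be sketched as follows. On a prefetch hit the instruction is tagged with a valid indication and a pointer to the cache entry, and the later demand access uses that pointer directly. On a miss the data is loaded from next-level memory and cached. The class and method names, the dictionary-based memory, and the fallback path for a demand access without a valid tag are assumptions made for this example.

```python
from typing import Dict, List, Tuple


class JointScheduler:
    """Hypothetical software model of prefetch/demand dispatch."""

    def __init__(self, next_level: Dict[int, int]) -> None:
        self.next_level = next_level                 # address -> data
        self.cache: List[Tuple[int, int]] = []       # entries: (address, data)
        self.lookup: Dict[int, int] = {}             # address -> entry index
        self.tags: Dict[str, Tuple[bool, int]] = {}  # instr -> (valid, pointer)

    def prefetch(self, instr: str, address: int) -> bool:
        """Check the cache; tag the instruction on a hit, fill on a miss."""
        entry = self.lookup.get(address)
        if entry is None:
            # Miss: read from next-level memory and cache the data.
            self.cache.append((address, self.next_level[address]))
            self.lookup[address] = len(self.cache) - 1
            return False
        # Hit: associate the instruction with a valid bit and a pointer.
        self.tags[instr] = (True, entry)
        return True

    def demand(self, instr: str, address: int) -> int:
        """Access the data, using the stored pointer when it is valid."""
        valid, pointer = self.tags.get(instr, (False, -1))
        if valid:
            return self.cache[pointer][1]  # no cache lookup needed
        # Assumed fallback: ordinary lookup, filling the cache if necessary.
        if address not in self.lookup:
            self.prefetch(instr, address)
        return self.cache[self.lookup[address]][1]


sched = JointScheduler(next_level={0x40: 111, 0x80: 222})
first = sched.prefetch("load_a", 0x40)   # miss, triggers a fill
second = sched.prefetch("load_a", 0x40)  # hit, instruction is tagged
value = sched.demand("load_a", 0x40)
```

Dispatching the prefetch more than once, as in the usage lines, is one way to read "multi-shot": the second shot converts the earlier miss into a tagged hit.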
8.
Device and method of computing an output value of a mathematical function, and method of designing an integrated circuit for the same
The present invention relates to the technological field of microelectronics and electronic engineering, specifically to advanced liquid-cooling-based thermal management systems for electronic components on printed circuit boards. The present invention represents a liquid cooling assembly that provides an improvement to the technological field of microelectronics and electronic engineering, by providing an effective cooling solution for systems employing vertical power delivery. Specifically, it provides liquid cooling for electronic components located on both sides of the PCB while fitting within the spatial constraints of modern electronic devices. The invention further provides a computing device that includes such an improved liquid cooling assembly, thereby improving the aforementioned technological field by increasing device operation efficiency, specifically enabling prolonged and stable operation in computationally intensive regimes due to highly efficient thermal management.
A system for executing a software program comprising processing units and a hardware processor configured to: for at least one set of blocks, each set comprising a calling block and a target block of an intermediate representation of the software program, generate control-transfer information describing at least one value of the software program at an exit of the calling block (out-value) and at least one other value of the software program at an entry to the target block (in-value); select a set of blocks according to at least one statistical value collected while executing the software program; generate a target set of instructions using the target block and the control-transfer information; generate a calling set of instructions using the calling block and the control-transfer information; configure a calling processing unit to execute the calling set of instructions; and configure a target processing unit to execute the target set of instructions.
09 - Scientific and electric apparatus and instruments
Goods & Services
Computer hardware and peripherals, namely, control-flow processors in the nature of central processing units (CPU), peripheral component interface (PCI) accelerator cards, RAM [random access memory] card, chipsets, semiconductor processors, semiconductor ASICs (application-specific integrated circuits); microprocessors, microcontrollers, electronic circuit boards and computer hardware for high-speed processing; recorded and/or downloadable computer software and/or downloadable open-source computer software, namely, computer application software and algorithms for computer acceleration and for managing hardware or software components used for processing; all for implementation in various market segments and applications, namely, in data centers, supercomputers, appliances, servers, big data analytics, neural networks, optimization and computation of intensive applications, artificial intelligence and machine learning, network processors, image analysis, cybersecurity, financial trading.
12.
MATCHING BINARY CODE TO INTERMEDIATE REPRESENTATION CODE
A system for generating executable code of a software program that is matched with an intermediate representation (IR) of a source code of the software program. The system comprises a processor adapted for adding one or more annotation entries, each for a location in the IR, to program data in the IR. An internal annotation entry is generated for an internal location in the IR that is not referenced by an IR symbol in the global IR symbol table of the IR. The processor is further adapted for compiling the IR to produce a binary object comprising the annotation entries, and providing the binary object to a linker or to a dynamic loader to update in an executable object an executable internal annotation entry associated with an internal annotation entry to reference a run-time location in the executable object.
09 - Scientific and electric apparatus and instruments
Goods & Services
Computer hardware and peripherals, namely, control-flow processors in the nature of central processing units (CPU), peripheral component interface (PCI) accelerator cards, Compute Express Link (CXL) accelerator cards, chipsets, semiconductor processors, semiconductor ASICs (application-specific integrated circuits); microprocessors, microcontrollers, electronic circuit boards and computer hardware for high-speed processing; recorded and/or downloadable computer software and/or downloadable open-source computer software, namely, computer application software and algorithms for computer acceleration and for managing hardware or software components used for processing; all for implementation in various market segments and applications, namely, in data centers, supercomputers, appliances, servers, big data analytics, neural networks, optimization and computation of intensive applications, artificial intelligence and machine learning, network processors, image analysis, cybersecurity, financial trading.
14.
Liquid cooling assembly for electronic components on printed circuit boards and computing device including same
The present invention relates to the technological field of microelectronics and electronic engineering, specifically to advanced liquid-cooling-based thermal management systems for electronic components on printed circuit boards. The claimed invention represents a liquid cooling assembly and a computing device that provide an improvement to the technological field of microelectronics and electronic engineering. Specifically, the suggested solution is easily adjustable to mitigate mechanical tolerances between the different electrical components of the target PCB and adaptable to address the diverse thermal loads presented by them, while maintaining the compactness of the design. The suggested solution thereby increases the overall heat dissipation efficiency of thermal management systems.
A system for executing multiple concurrent threads, comprising: context storages, each configured to store thread contexts, each context for one of the multiple threads, each of the context storages associated with an operation of the threads; and processing circuitry configured to: while a first and a second thread are executed simultaneously by the circuitry: store a first context of the first thread in a first storage, identified in the first storage by a value; store a second context of the second thread in a second storage, identified in the second storage by the value; and upon completing execution of a first operation of the first thread, the operation associated with the first storage, when applying a test to the value indicates that the value is available in the second storage, store the first context in the second storage, the first context identified in the second storage by the value.
A method for cache coherency in a reconfigurable cache architecture is provided. The method includes receiving a memory access command, wherein the memory access command includes at least an address of a memory to access; determining at least one access parameter based on the memory access command; and determining a target cache bin for serving the memory access command based in part on the at least one access parameter and the address.
A system for processing a plurality of concurrent threads comprising: a reconfigurable processing grid, comprising logical elements and a context storage for storing thread contexts, each thread context for one of a plurality of concurrent threads, each implementing a dataflow graph comprising an identified operation; and a hardware processor configured for configuring the reconfigurable processing grid for: executing a first thread of the plurality of concurrent threads; and while executing the first thread: storing a runtime context value of the first thread in the context storage; while waiting for completion of the identified operation by identified logical elements, executing the identified operation of a second thread by the identified logical element; and when execution of the identified operation of the first thread completes: retrieving the runtime context value of the first thread from the context storage; and executing another operation of the first thread.
A method of computing logarithms, comprising receiving a number, computing an exponent and significand of the received number, selecting a breakpoint value from a plurality of breakpoint values segmenting a range of the significand wherein the selected breakpoint value is the significand's greatest lower bound or lowest upper bound, computing a multiplication of the exponent and a logarithm value of two, computing a first intermediate value based on a least significant portion of the significand and an inverse value of the selected breakpoint value, computing an approximated logarithm value of a second intermediate value derived from the first intermediate value, computing a logarithm value of the significand by summing the approximated logarithm value and a logarithm value of the selected breakpoint value, computing a logarithm value of the received number by summing the logarithm value of the significand and the multiplication of the exponent and the logarithm value of two.
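The steps of the logarithm method above map onto the identity ln(x) = e·ln 2 + ln(b) + ln(1 + r), where x = m·2^e with significand m in [1, 2), b is the selected breakpoint, and r = (m − b)·(1/b). The sketch below follows that decomposition. The choice of 16 evenly spaced breakpoints, the use of the greatest lower bound, and the three-term Taylor polynomial for ln(1 + r) are assumptions for illustration; the abstract does not specify them.

```python
import math

SEGMENTS = 16  # assumed number of breakpoints over [1, 2)
BREAKPOINTS = [1 + k / SEGMENTS for k in range(SEGMENTS)]
INV_BREAKPOINTS = [1 / b for b in BREAKPOINTS]       # precomputed 1/b
LOG_BREAKPOINTS = [math.log(b) for b in BREAKPOINTS]  # precomputed ln(b)
LN2 = math.log(2)


def approx_log(x: float) -> float:
    """Approximate the natural logarithm of a positive finite number."""
    if not (x > 0 and math.isfinite(x)):
        raise ValueError("x must be positive and finite")

    # Exponent and significand: math.frexp gives m in [0.5, 1),
    # so rescale to the conventional m in [1, 2).
    m, e = math.frexp(x)
    m, e = m * 2, e - 1

    # Select the breakpoint that is the greatest lower bound of m.
    k = int((m - 1) * SEGMENTS)

    # First intermediate value: residual above the breakpoint times 1/b,
    # so that m = b * (1 + r) with 0 <= r < 1/SEGMENTS.
    r = (m - BREAKPOINTS[k]) * INV_BREAKPOINTS[k]

    # Approximate ln(1 + r) with a short polynomial (assumed: Taylor terms).
    log1p_r = r - r * r / 2 + r ** 3 / 3

    # ln(m) = ln(b) + ln(1 + r); ln(x) = ln(m) + e * ln(2).
    return e * LN2 + LOG_BREAKPOINTS[k] + log1p_r


samples = [0.001, 0.5, 1.0, 1.99, 10.0, 12345.678]
errors = [abs(approx_log(v) - math.log(v)) for v in samples]
```

Since r stays below 1/16, the first dropped Taylor term r^4/4 bounds the polynomial error at roughly 4e-6; more breakpoints or more polynomial terms would tighten it.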
A method of generating automatically architecture-specific algorithms, comprising receiving an architecture independent algorithm and one or more algorithm parameters defining at least a target processing architecture and a format of an output of an architecture-specific algorithm implementing the received algorithm, determining automatically a functionality of the algorithm by analyzing the algorithm, selecting one or more architecture-specific computing blocks of the target processing architecture according to the functionality of the algorithm and the algorithm parameter(s) wherein each computing block is dynamically reconfigurable in runtime and associated with (1) simulation code simulating its functionality, and (2) execution code executing its functionality, testing an emulated architecture-specific algorithm constructed using the simulation code of the selected architecture-specific computing block(s) to verify compliance with the algorithm parameter(s), and, responsive to successful compliance verification, generating automatically an architecture-specific code segment implementing the architecture-specific algorithm based on the execution code of the selected architecture-specific computing blocks.
A method of processing incoming packets prior to complete reception, comprising receiving a pointer to one or more memory blocks allocated for storing one or more incoming packets to be written by one or more other controllers, where each packet comprises one or more packet segments, determining all valid data values of fields contained in the packet segments, initializing one or more memory sections in the memory blocks which are mapped to the fields with predefined data patterns which are different from any of the valid values of the fields, checking continuously content of the memory sections, determining that packet segment(s) were written in the memory block(s) responsive to detecting that the content of one or more of the memory sections does not match the one or more predefined data patterns, and processing one or more of the packets according to at least part of the received packet segment(s).
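The detection scheme above can be sketched with a byte buffer: the memory sections that map to a packet field are pre-filled with a pattern that no valid field value can take, and a segment counts as written once its section no longer holds that pattern. The segment size, the field position, and the sentinel value `0xFF` are assumptions chosen for the example.

```python
from typing import List, Sequence

SENTINEL = 0xFF    # assumed: never a valid value of the monitored field
SEGMENT_SIZE = 8   # assumed segment size in bytes


def init_sections(block: bytearray, field_offsets: Sequence[int]) -> None:
    """Fill each monitored memory section with the predefined pattern."""
    for offset in field_offsets:
        block[offset] = SENTINEL


def written_segments(block: bytearray, field_offsets: Sequence[int]) -> List[int]:
    """Return indices of segments whose monitored field was overwritten."""
    return [i for i, offset in enumerate(field_offsets)
            if block[offset] != SENTINEL]


# A block holding three segments; the monitored field is each first byte.
block = bytearray(3 * SEGMENT_SIZE)
offsets = [0, SEGMENT_SIZE, 2 * SEGMENT_SIZE]
init_sections(block, offsets)
before = written_segments(block, offsets)

# Simulate another controller writing the second segment only.
block[SEGMENT_SIZE:2 * SEGMENT_SIZE] = bytes([0x01, 2, 3, 4, 5, 6, 7, 8])
after = written_segments(block, offsets)
```

A polling loop around `written_segments` would let processing start on segment 1 while segments 0 and 2 are still in flight.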
A system for accessing memory comprising a memory management component configured to: mark each of a plurality of memory areas as pending in response to identifying at least one data retrieval instruction directed towards a target memory comprising the plurality of memory areas, each memory area associated with a range of memory addresses that is mapped thereto; and while at least one of the plurality of memory areas is marked as pending: remove the marking as pending for at least one first memory area of the plurality of memory areas upon the at least one first memory area being ready for access; and access at least one first value in the at least one first memory area in response to at least one first memory access instruction, subject to the removal of the marking as pending of the at least one first memory area; and a manufacturing process thereof.
G05B 19/418 - Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM]
22.
SYSTEM AND METHOD OF MANAGING MEMORY ACCESS AMONG ONE OR MORE COMPUTING ENTITIES
A method, a system and an accelerator for managing memory access among one or more computing entities may continuously monitor data access to at least one memory module, associated with a respective source computing entity. Based on the monitoring, embodiments may identify a memory area of the source memory module that contains a predetermined quantity of data that is expected to be used by a sink computing entity, and transmit a first version of content of the identified memory area to the sink computing entity. Embodiments may then identify an explicit request, from the sink computing entity to the source computing entity, for accessing data of the identified memory area. Embodiments may subsequently calculate a difference between the first version of content of the identified memory area and a current content of the identified memory area, and transmit the calculated difference to the requesting sink computing entity.
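The last two steps of the abstract above, calculating a difference against the already transmitted first version and sending only that difference, can be sketched as below. The byte-wise `(offset, value)` delta format and the function names are assumptions; the source does not say how the difference is encoded.

```python
from typing import List, Tuple

Delta = List[Tuple[int, int]]  # (offset, new byte value), assumed encoding


def compute_delta(first_version: bytes, current: bytes) -> Delta:
    """List the byte positions whose content changed since the first version."""
    if len(first_version) != len(current):
        raise ValueError("this sketch assumes a fixed-size memory area")
    return [(i, new) for i, (old, new) in enumerate(zip(first_version, current))
            if old != new]


def apply_delta(first_version: bytes, delta: Delta) -> bytes:
    """Reconstruct the current content on the sink side."""
    buf = bytearray(first_version)
    for offset, value in delta:
        buf[offset] = value
    return bytes(buf)


# The sink already holds the first version, sent speculatively.
first = b"hello world"
current = b"hellO worlD"             # source content at the explicit request
delta = compute_delta(first, current)
reconstructed = apply_delta(first, delta)
```

Here only two `(offset, value)` pairs cross the link on the explicit request instead of the full eleven bytes.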
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Computer hardware and peripherals, namely, control-flow processors in the nature of central processing units (CPUs), and graphic processing units (GPUs), Tensor Processing Units (TSPs), application-specific hardware accelerators, peripheral component interface (PCI) accelerator cards, chipsets, semiconductor processors, semiconductor ASICs (application-specific integrated circuits), microprocessors, microcontrollers, electronic circuit boards and computer hardware for high-speed processing; downloadable computer software, namely, computer application software and algorithms for computer acceleration and for managing hardware or software components used for processing; all of the aforesaid goods for implementation in various market segments and applications, namely, in data centers, supercomputers, appliances, servers, big data analytics, neural networks, optimization and computation of intensive applications, artificial intelligence and machine learning, network processors, image analysis, cybersecurity, financial trading. 
Computer design, engineering, development, implementation and consulting for others in connection with computer hardware and software, namely, control-flow processors in the nature of central processing units (CPUs), graphic processing units (GPUs), and tensor processing units (TSPs), application-specific hardware accelerators, peripheral component interface (PCI) accelerator cards, semiconductor processors, semiconductor ASICs (application-specific integrated circuits), microprocessors, microcontrollers, electronic circuit boards and other computer hardware for high-speed processing; temporary use of non-downloadable computer software, namely, to computer application software for processing acceleration and for managing hardware or software components used for processing; all of the aforesaid services provided in various market segments and applications, namely, big data analytics, neural networks, optimization and computation of intensive applications, artificial intelligence and machine learning, network processors, image analysis, cybersecurity, financial trading.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Computer hardware and peripherals, namely, control-flow processors in the nature of central processing units (CPUs), and graphic processing units (GPUs), Tensor Processing Units (TSPs), application-specific hardware accelerators, peripheral component interface (PCI) accelerator cards, chipsets, semiconductor processors, semiconductor ASICs (application-specific integrated circuits), microprocessors, microcontrollers, electronic circuit boards and computer hardware for high-speed data processing; downloadable computer software, namely, computer application software and algorithms for computer acceleration and for managing hardware or software components used for data processing; all of the foregoing for implementation in various market segments and applications, namely, in data centers, supercomputers, appliances, servers, big data analytics, neural networks, optimization and computation of intensive applications, artificial intelligence and machine learning, network processors, image analysis, cybersecurity, financial trading.
Computer design, engineering, development, implementation of computer technologies and technology consulting for others in connection with computer hardware and software, namely, control-flow processors in the nature of central processing units (CPUs), graphic processing units (GPUs), and tensor processing units (TSPs), application-specific hardware accelerators, peripheral component interface (PCI) accelerator cards, semiconductor processors, semiconductor ASICs (application-specific integrated circuits), microprocessors, microcontrollers, electronic circuit boards and other computer hardware for high-speed data processing; providing temporary use of non-downloadable computer software, namely, to computer application software for processing acceleration and for managing hardware or software components used for data processing provided in various market segments and applications, namely, big data analytics, neural networks, optimization and computation of intensive applications, artificial intelligence and machine learning, network processors, image analysis, cybersecurity, financial trading.
09 - Scientific and electric apparatus and instruments
Goods & Services
Computer hardware and peripherals, namely, control-flow processors in the nature of central processing units (CPUs), and graphic processing units (GPUs), Tensor Processing Units (TSPs), application-specific hardware accelerators, peripheral component interface (PCI) accelerator cards, chipsets, semiconductor processors, semiconductor ASICs (application-specific integrated circuits), microprocessors, microcontrollers, electronic circuit boards and computer hardware for high-speed data processing; downloadable computer software, namely, computer application software and algorithms for computer acceleration and for managing hardware or software components used for data processing; all of the foregoing for implementation in various market segments and applications, namely, in data centers, supercomputers, appliances, servers, big data analytics, neural networks, optimization and computation of intensive applications, artificial intelligence and machine learning, network processors, image analysis, cybersecurity, financial trading
26.
Joint scheduler for high bandwidth multi-shot prefetching
A joint scheduler adapted for dispatching prefetch and demand accesses of data relating to a plurality of instructions loaded in an execution pipeline of processing circuit(s). Each prefetch access comprises checking whether a respective data is cached in a cache entry and each demand access comprises accessing a respective data. The joint scheduler is adapted to, responsive to each hit prefetch access dispatched for a respective data relating to a respective instruction, associate the respective instruction with a valid indication and a pointer to a respective cache entry storing the respective data such that the demand access relating to the respective instruction uses the associated pointer to access the respective data in the cache, and responsive to each missed prefetch access dispatched for a respective data relating to a respective instruction, initiate a read cycle for loading the respective data from next level memory and cache it in the cache.
G06F 12/08 - Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
G06F 12/0862 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
G06F 12/0897 - Caches characterised by their organisation or structure with two or more cache hierarchy levels
27.
Reconfigurable integrated circuit (IC) device and a system and method of configuring thereof
An Integrated Circuit (IC) device, and a method of utilizing thereof, may include: a plurality of Processing Elements (PEs), each comprising one or more configurable hardware logic blocks. The IC may further include a plurality of configuration memory elements, each associated with a respective PE, and adapted to maintain two or more configuration settings of the respective PE. The IC may further include a configuration manager circuit, configured to: receive a reconfiguration instruction, dictating a required function of the IC device; based on the reconfiguration instruction, identify at least one target PE of the plurality of PEs as a target for reconfiguration; based on the required function, select a specific configuration setting in the configuration memory element associated with the at least one target PE; and reconfigure at least one hardware logic block of the at least one target PE, according to the selected configuration setting.
There is provided a method, comprising simultaneously presenting in a GUI, a source code and an interactive graph of nodes connected by edges representing the source code mapped to physical configurable elements of computational cluster(s) of a processor each configurable to execute mathematical operations, each node represents operation(s) mapped to physical configurable elements, and edges represent dependencies between the operations, mapped to physical dependency links between the configurable elements, receiving, via the GUI, a user selection of a portion of the source code, determining node(s) and/or edge(s) of the interactive graph corresponding to the portion, and updating the GUI for visually distinguishing the node(s) and/or edge(s), wherein the visually distinguished node(s) represents a mapping to certain physical configurable elements and the visually distinguished edge(s) represents certain dependency links between the certain physical configurable elements of the processor configured to execute the user selected portion of the source code.
An apparatus for executing a software program, comprising processing units and a hardware processor adapted for: in an intermediate representation of the software program, where the intermediate representation comprises blocks, each associated with an execution block of the software program and comprising intermediate instructions, identifying a calling block and a target block, where the calling block comprises a control-flow intermediate instruction to execute a target intermediate instruction of the target block; generating target instructions using the target block; generating calling instructions using the calling block and a computer control instruction for invoking the target instructions, when the calling instructions are executed by a calling processing unit and the target instructions are executed by a target processing unit; configuring the calling processing unit for executing the calling instructions; and configuring the target processing unit for executing the target instructions.
A method of generating automatically architecture-specific algorithms, comprising receiving an architecture independent algorithm and one or more algorithm parameters defining at least a target processing architecture and a format of an output of an architecture-specific algorithm implementing the received algorithm, determining automatically a functionality of the algorithm by analyzing the algorithm, selecting one or more architecture-specific computing blocks of the target processing architecture according to the functionality of the algorithm and the algorithm parameter(s) wherein each computing block is dynamically reconfigurable in runtime and associated with (1) simulation code simulating its functionality, and (2) execution code executing its functionality, testing an emulated architecture-specific algorithm constructed using the simulation code of the selected architecture-specific computing block(s) to verify compliance with the algorithm parameter(s), and, responsive to successful compliance verification, generating automatically an architecture-specific code segment implementing the architecture-specific algorithm based on the execution code of the selected architecture-specific computing blocks.
A hardware acceleration circuit, comprising a communication interface for connecting to one or more event-driven circuits, a memory, an event handling circuit, and a hardware acceleration engine. The event handling circuit is adapted to detect one or more events triggered by one or more of the event-driven circuits, update one or more pointers pointing to one or more event handling routines stored in the memory and to a context memory segment in the memory storing a plurality of context parameters relating to the one or more events, and transmit the pointer(s) to the hardware acceleration engine. The hardware acceleration engine is adapted to receive the pointer(s) from the event handling circuit, and execute the event handling routine(s) pointed to by the pointer(s) to process data relating to the event(s) according to at least some of the context parameters retrieved from the context memory segment using the pointer(s).
A method for caching memory comprising caching two data values, each of one of two ranges of application memory addresses, each associated with one of a set of threads, by: organizing a plurality of sequences of consecutive address sub-ranges in an interleaved sequence of address sub-ranges by alternately selecting, for each thread in an identified order of threads, a next sub-range in the respective sequence of sub-ranges associated therewith; generating a mapping of the interleaved sequence of sub-ranges to a range of physical memory addresses in order of the interleaved sequence of sub-ranges; and when a thread accesses an application memory address of the respective range of application addresses associated thereof: computing a target address according to the mapping using the application address; and storing the two data values in one cache-line of a plurality of cache-lines of a cache by accessing the physical memory area using the target address.
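The interleaved mapping in the abstract above can be illustrated with a minimal sketch. It assumes equal-sized sub-ranges, a fixed thread order, and a known base address for each thread's application range; `SUB_RANGE`, `CACHE_LINE`, `make_mapper`, and the example base addresses are hypothetical names and values, not taken from the abstract.

```python
SUB_RANGE = 32    # bytes per address sub-range (assumed)
CACHE_LINE = 64   # bytes per cache line (assumed)

def make_mapper(thread_bases, phys_base=0):
    """Build a function mapping (thread index, application address) to a target address.

    thread_bases[i] is the assumed start of thread i's application address range.
    Sub-range k of thread i is placed at interleaved slot k * n_threads + i.
    """
    n_threads = len(thread_bases)

    def target_address(thread, app_addr):
        offset = app_addr - thread_bases[thread]
        k, within = divmod(offset, SUB_RANGE)   # sub-range index and offset inside it
        slot = k * n_threads + thread           # position in the interleaved sequence
        return phys_base + slot * SUB_RANGE + within

    return target_address

mapper = make_mapper([0x1000, 0x8000])
a0 = mapper(0, 0x1000)   # first sub-range of thread 0
a1 = mapper(1, 0x8000)   # first sub-range of thread 1
same_line = a0 // CACHE_LINE == a1 // CACHE_LINE
```

With two threads and these sizes, the first sub-range of each thread lands in the same 64-byte line, which is the effect the abstract describes as storing the two data values in one cache-line.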
A system for executing a software program comprising processing units and a hardware processor configured to: for at least one set of blocks, each set comprising a calling block and a target block of an intermediate representation of the software program, generate control-transfer information describing at least one value of the software program at an exit of the calling block (out-value) and at least one other value of the software program at an entry to the target block (in-value); select a set of blocks according to at least one statistical value collected while executing the software program; generate a target set of instructions using the target block and the control-transfer information; generate a calling set of instructions using the calling block and the control-transfer information; configure a calling processing unit to execute the calling set of instructions; and configure a target processing unit to execute the target set of instructions.
A system for processing a plurality of concurrent threads comprising: a reconfigurable processing grid, comprising logical elements and a context storage for storing thread contexts, each thread context for one of a plurality of concurrent threads, each implementing a dataflow graph comprising an identified operation; and a hardware processor configured for configuring the reconfigurable processing grid for: executing a first thread of the plurality of concurrent threads; and while executing the first thread: storing a runtime context value of the first thread in the context storage; while waiting for completion of the identified operation by identified logical elements, executing the identified operation of a second thread by the identified logical elements; and when execution of the identified operation of the first thread completes: retrieving the runtime context value of the first thread from the context storage; and executing another operation of the first thread.
A method of processing incoming packets prior to complete reception, comprising receiving a pointer to one or more memory blocks allocated for storing one or more incoming packets to be written by one or more other controllers, where each packet comprises one or more packet segments, determining all valid data values of fields contained in the packet segments, initializing one or more memory sections in the memory blocks which are mapped to the fields with predefined data patterns which are different from any of the valid values of the fields, checking continuously the content of the memory sections, determining that packet segment(s) were written in the memory block(s) responsive to detecting that the content of one or more of the memory sections does not match the one or more predefined data patterns, and processing one or more of the packets according to at least part of the received packet segment(s).
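The detection step in this abstract can be sketched as follows. The sketch assumes one watched single-byte field per packet segment and a sentinel value that is never a valid field value; `SENTINEL`, the offsets, and the helper names are hypothetical.

```python
SENTINEL = 0xFF  # assumed to differ from every valid value of the watched field

def init_buffer(size, field_offsets):
    """Allocate a memory block and fill each watched field with the sentinel."""
    buf = bytearray(size)
    for off in field_offsets:
        buf[off] = SENTINEL
    return buf

def written_segments(buf, field_offsets):
    """Return indices of segments whose watched field no longer holds the sentinel."""
    return [i for i, off in enumerate(field_offsets) if buf[off] != SENTINEL]

offsets = [0, 16, 32]                 # one watched field per segment (assumed layout)
buf = init_buffer(48, offsets)
before = written_segments(buf, offsets)
buf[16:20] = b"\x02abc"               # simulate another controller writing segment 1
after = written_segments(buf, offsets)
```

A real consumer would poll `written_segments` repeatedly and begin processing each segment as soon as it appears, without waiting for the whole packet.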
There is provided a method of allocation of memory, comprising: issuing an allocation operation for allocation of a region of a pool of a memory by a first process of a plurality of first processes executed in parallel on a first processor, sending a message to a second processor indicating the allocation of the region of the pool of the memory, issuing a free operation for release of the allocated region of the pool of the memory by a second process of a plurality of second processes executed in parallel on a second processor, and releasing, by the first processor, the allocated region of the pool of the memory as indicated in the free operation, wherein a same region of memory is allocated by the first process and released by the second process, wherein the first processes are concurrently attempting to issue the allocation operation and the second processes are concurrently attempting to issue the free operation.
An apparatus for computing functions using polynomial-based approximation, comprising one or more processing circuitries configured for computing a polynomial-based approximant approximating a function by executing one or more iterations. Each iteration comprises computing the polynomial-based approximant using scaled fixed-point unit(s) according to a constructed set of coefficients, minimizing an approximation error of the computed polynomial-based approximant compared to the function while complying with one or more constraints selected from a group comprising at least: an accuracy, a compute graph size, a computation complexity, and a hardware utilization of the processing circuitries, adjusting one or more of the coefficients in case the approximation error is noncompliant with the constraint(s) and initiating another iteration. The polynomial-based approximant and its adjusted set of coefficients for which the computed polynomial-based approximant complies with the constraint(s) may be output to one or more processing circuitries configured to approximate the function by computing the polynomial-based approximant.
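The iterate-until-compliant loop in this abstract can be illustrated with a deliberately simplified sketch. It swaps in a plain technique that the abstract does not name: the coefficient adjustment is reduced to appending the next Taylor coefficient of the function, and the only constraint checked is accuracy. The fixed-point width, the sampling grid, and all function names are assumptions.

```python
import math

def to_fixed(x, frac_bits):
    """Quantize a real value to a scaled integer (fixed-point) representation."""
    return round(x * (1 << frac_bits))

def horner_fixed(coeffs_fx, x_fx, frac_bits):
    """Evaluate a polynomial using integer arithmetic only (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs_fx):
        acc = ((acc * x_fx) >> frac_bits) + c
    return acc

def max_error(coeffs, frac_bits, f, samples=65):
    """Worst absolute error of the fixed-point approximant on [0, 1]."""
    scale = 1 << frac_bits
    coeffs_fx = [to_fixed(c, frac_bits) for c in coeffs]
    worst = 0.0
    for i in range(samples):
        x = i / (samples - 1)
        approx = horner_fixed(coeffs_fx, to_fixed(x, frac_bits), frac_bits) / scale
        worst = max(worst, abs(approx - f(x)))
    return worst

def fit(f, taylor_coeff, tolerance, frac_bits=16, max_degree=12):
    """Grow the coefficient set until the accuracy constraint is met."""
    coeffs = [taylor_coeff(0)]
    for degree in range(1, max_degree + 1):
        coeffs.append(taylor_coeff(degree))      # simplified adjustment step
        err = max_error(coeffs, frac_bits, f)
        if err <= tolerance:                     # compliant: stop iterating
            return coeffs, err
    raise ValueError("no compliant approximant within max_degree")

coeffs, err = fit(math.exp, lambda k: 1 / math.factorial(k), tolerance=1e-3)
```

The returned coefficient list is what would be handed to the circuitry that evaluates the approximant. A fuller version would also weigh compute graph size and hardware utilization when deciding whether to stop.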
A joint scheduler adapted for dispatching prefetch and demand accesses of data relating to a plurality of instructions loaded in an execution pipeline of processing circuit(s). Each prefetch access comprises checking whether a respective data is cached in a cache entry and each demand access comprises accessing a respective data. The joint scheduler is adapted to, responsive to each hit prefetch access dispatched for a respective data relating to a respective instruction, associate the respective instruction with a valid indication and a pointer to a respective cache entry storing the respective data such that the demand access relating to the respective instruction uses the associated pointer to access the respective data in the cache, and responsive to each missed prefetch access dispatched for a respective data relating to a respective instruction, initiate a read cycle for loading the respective data from next level memory and cache it in the cache.
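The hit and miss handling described in this abstract can be sketched in software. The class below is a toy model under stated assumptions: a dictionary stands in for next-level memory, the cache never evicts, and the fallback path in `demand` for untagged instructions is an addition for illustration that the abstract does not describe.

```python
class JointScheduler:
    def __init__(self, next_level):
        self.next_level = next_level  # address -> data, stand-in for next-level memory
        self.cache = []               # cache entries as (address, data) tuples
        self.index = {}               # address -> position of its cache entry
        self.tags = {}                # instruction id -> (valid flag, entry pointer)

    def prefetch(self, instr, addr):
        """Check whether addr is cached; tag the instruction on a hit, fill on a miss."""
        if addr in self.index:
            self.tags[instr] = (True, self.index[addr])
            return True
        # Miss: read from next-level memory and cache the data.
        self.cache.append((addr, self.next_level[addr]))
        self.index[addr] = len(self.cache) - 1
        return False

    def demand(self, instr, addr):
        """Access the data, using the stored pointer when the instruction is tagged valid."""
        valid, ptr = self.tags.get(instr, (False, None))
        if valid:
            return self.cache[ptr][1]          # no second lookup needed
        if addr not in self.index:             # assumed fallback: ordinary fill and lookup
            self.prefetch(instr, addr)
        return self.cache[self.index[addr]][1]

sched = JointScheduler({0x10: "a", 0x20: "b"})
first = sched.prefetch(1, 0x10)    # miss, fills the cache
second = sched.prefetch(2, 0x10)   # hit, tags instruction 2
value = sched.demand(2, 0x10)      # served through the stored pointer
```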
G06F 12/08 - Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
G06F 12/0862 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
G06F 12/0897 - Caches characterised by their organisation or structure with two or more cache hierarchy levels
39.
Graphical user interface for code to dataflow graph representation
There is provided a method, comprising simultaneously presenting in a GUI, a source code and an interactive graph of nodes connected by edges representing the source code mapped to physical configurable elements of computational cluster(s) of a processor each configurable to execute mathematical operations, each node represents operation(s) mapped to physical configurable elements, and edges represent dependencies between the operations, mapped to physical dependency links between the configurable elements, receiving, via the GUI, a user selection of a portion of the source code, determining node(s) and/or edge(s) of the interactive graph corresponding to the portion, and updating the GUI for visually distinguishing the node(s) and/or edge(s), wherein the visually distinguished node(s) represents a mapping to certain physical configurable elements and the visually distinguished edge(s) represents certain dependency links between the certain physical configurable elements of the processor configured to execute the user selected portion of the source code.
A system for generating executable code of a software program that is matched with an intermediate representation (IR) of a source code of the software program. The system comprises a processor adapted for adding one or more annotation entries, each for a location in the IR, to program data in the IR. An internal annotation entry is generated for an internal location in the IR that is not referenced by an IR symbol in the global IR symbol table of the IR. The processor is further adapted for compiling the IR to produce a binary object comprising the annotation entries, and providing the binary object to a linker or to a dynamic loader to update in an executable object an executable internal annotation entry associated with an internal annotation entry to reference a run-time location in the executable object.
There is provided a computer implemented method of allocation of memory, comprising: issuing an allocation operation for allocation of a region of a pool of a memory by a first process executed on a first processor, sending a message to a second processor indicating the allocation of the region of the pool of the memory, wherein the first processor and the second processor access the region of the pool of the memory, issuing a free operation for release of the allocated region of the pool of the memory by a second process executed on a second processor, and releasing, by the first processor, the allocated region of the pool of the memory as indicated in the free operation, wherein the region of the pool of the memory allocated by the first process and released by the second process is a same region of memory.
There is provided a device for allocation of memory configured for: in response to a request for allocation of a region of a pool of a memory by a process being executed by a processor, a memory allocator is configured to perform in a single atomic operation: read a data structure indicating a state of allocation of the pool, check the data structure for a condition, update the data structure according to an outcome of the check, and return an address of the allocated region of the memory.
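The read, check, update, and return sequence in this abstract can be sketched as a small pool allocator. Python exposes no user-level atomic compare-and-swap, so a lock stands in for the single atomic operation here; that substitution, the bitmap layout, and the names `PoolAllocator` and `free_mask` are assumptions for illustration.

```python
import threading

class PoolAllocator:
    def __init__(self, base, region_size, n_regions):
        self.base = base
        self.region_size = region_size
        self.free_mask = (1 << n_regions) - 1   # bit i set means region i is free
        self._atomic = threading.Lock()         # stand-in for a hardware atomic operation

    def allocate(self):
        """Read, check, update, and return an address as one indivisible step."""
        with self._atomic:
            mask = self.free_mask                    # read the allocation state
            if mask == 0:                            # check: is any region free?
                return None
            i = (mask & -mask).bit_length() - 1      # index of the lowest free region
            self.free_mask = mask & ~(1 << i)        # update the state
            return self.base + i * self.region_size  # address of the allocated region

pool = PoolAllocator(base=0x1000, region_size=64, n_regions=2)
r0 = pool.allocate()
r1 = pool.allocate()
r2 = pool.allocate()   # pool exhausted
```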
A system for executing a plurality of software threads, comprising: a plurality of processing circuitries; a plurality of memory areas connected to the processing circuitries, each memory area associated with at least one of the processing circuitries; and at least one hardware processor, connected to the processing circuitries and configured for: in each of a plurality of iterations: while the processing circuitries execute the software threads, collecting for each thread a plurality of thread statistical values indicative of a plurality of memory accesses to at least some of the memory areas performed when executing the thread; for at least one thread, performing an analysis comprising the thread statistical values thereof to identify a preferred memory area of the plurality of memory areas; and configuring one of the at least one processing circuitry associated with the preferred memory area to execute the at least one thread.
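One iteration of the placement analysis above might look like the following sketch. It assumes the collected statistics are simple per-thread access counts per memory area and that the preferred area is the most-accessed one; the abstract does not fix either choice, and all names are hypothetical.

```python
def preferred_area(access_counts):
    """Pick the memory area a thread accessed most often."""
    return max(access_counts, key=access_counts.get)

def place_threads(stats, area_to_circuit):
    """Map each thread to the processing circuitry associated with its preferred area.

    stats: thread id -> {memory area -> access count}
    area_to_circuit: memory area -> processing circuitry id
    """
    return {thread: area_to_circuit[preferred_area(counts)]
            for thread, counts in stats.items()}

stats = {
    "t0": {"mem_a": 120, "mem_b": 15},
    "t1": {"mem_a": 3, "mem_b": 98},
}
placement = place_threads(stats, {"mem_a": "pc0", "mem_b": "pc1"})
```

Repeating this as statistics accumulate gives the iterative behavior the abstract describes, with threads migrating toward the circuitry closest to the data they use.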
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Computer hardware and peripherals, namely, control-flow processors in the nature of central processing units (CPUs), and graphic processing units (GPUs), Tensor Processing Units (TSPs), application-specific hardware accelerators, peripheral component interface (PCI) accelerator cards, chipsets, semiconductor processors, semiconductor ASICs (application-specific integrated circuits), microprocessors, microcontrollers, electronic circuit boards and computer hardware for high-speed processing; downloadable computer software, namely, computer application software and algorithms for computer acceleration and for managing hardware or software components used for processing; all for implementation in various market segments and applications, namely, in data centers, supercomputers, appliances, servers, big data analytics, neural networks, optimization and computation of intensive applications, artificial intelligence and machine learning, network processors, image analysis, cybersecurity, financial trading.
Computer design, engineering, development, implementation and consulting for others in connection with computer hardware and software, namely, control-flow processors in the nature of central processing units (CPUs), graphic processing units (GPUs), and tensor processing units (TSPs), application-specific hardware accelerators, peripheral component interface (PCI) accelerator cards, semiconductor processors, semiconductor ASICs (application-specific integrated circuits), microprocessors, microcontrollers, electronic circuit boards and other computer hardware for high-speed processing; temporary use of non-downloadable computer software, namely, computer application software for processing acceleration and for managing hardware or software components used for processing; provided in various market segments and applications, namely, big data analytics, neural networks, optimization and computation of intensive applications, artificial intelligence and machine learning, network processors, image analysis, cybersecurity, financial trading.
45.
Executing concurrent threads on a reconfigurable processing grid
A system for processing a plurality of concurrent threads comprising: a reconfigurable processing grid, comprising logical elements and a context storage for storing thread contexts, each thread context for one of a plurality of concurrent threads, each implementing a dataflow graph comprising an identified operation; and a hardware processor configured for configuring the reconfigurable processing grid for: executing a first thread of the plurality of concurrent threads; and while executing the first thread: storing a runtime context value of the first thread in the context storage; while waiting for completion of the identified operation by identified logical elements, executing the identified operation of a second thread by the identified logical elements; and when execution of the identified operation of the first thread completes: retrieving the runtime context value of the first thread from the context storage; and executing another operation of the first thread.
A method for caching memory comprising caching two data values, each of one of two ranges of application memory addresses, each associated with one of a set of threads, by: organizing a plurality of sequences of consecutive address sub-ranges in an interleaved sequence of address sub-ranges by alternately selecting, for each thread in an identified order of threads, a next sub-range in the respective sequence of sub-ranges associated therewith; generating a mapping of the interleaved sequence of sub-ranges to a range of physical memory addresses in order of the interleaved sequence of sub-ranges; and when a thread accesses an application memory address of the respective range of application addresses associated thereof: computing a target address according to the mapping using the application address; and storing the two data values in one cache-line of a plurality of cache-lines of a cache by accessing the physical memory area using the target address.
A method for cache coherency in a reconfigurable cache architecture is provided. The method includes receiving a memory access command, wherein the memory access command includes at least an address of a memory to access; determining at least one access parameter based on the memory access command; and determining a target cache bin for serving the memory access command based in part on the at least one access parameter and the address.
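A minimal sketch of the bin selection step follows. The abstract does not say what the access parameter is or how it combines with the address, so this sketch assumes the parameter selects a partition of bins and the address selects a bin within that partition; `partitions`, `line_bits`, and the requester labels are hypothetical.

```python
def target_bin(address, access_param, partitions, line_bits=6):
    """Choose a cache bin from the access parameter and the address.

    partitions: access parameter -> (first bin index, number of bins), an assumed layout.
    line_bits: low address bits ignored so that one cache line maps to one bin.
    """
    first, count = partitions[access_param]
    return first + ((address >> line_bits) % count)

partitions = {"requester_0": (0, 4), "requester_1": (4, 2)}
bin_a = target_bin(0x1040, "requester_0", partitions)
bin_b = target_bin(0x1040, "requester_1", partitions)
```

Because the mapping in this sketch is a pure function of its two inputs, repeated accesses with the same address and parameter always resolve to the same bin.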
An apparatus for computing, comprising a processing circuitry configured for computing an outcome of executing a set of computer instructions comprising a group of data variables, by: identifying an initial state of the processing circuitry; executing a set of anticipated computer instructions produced based on the set of computer instructions and a likely data value, where the likely data value is a value of one of the group of data variables anticipated to be computed by executing the set of computer instructions and computed using at least one program data value; and when identifying, while executing the set of anticipated computer instructions, a failed prediction where the data variable is not equal to the likely data value: restoring the initial state of the processing circuitry; and executing a set of alternative computer instructions, produced based on the set of computer instructions and the at least one likely data value.
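The predict, check, and roll back flow in this abstract can be sketched in software. The sketch models processor state as a dictionary and the instruction sets as plain functions; `run_speculative` and its parameters are hypothetical names, and the example computation is chosen only to make the two paths visibly different.

```python
import copy

def run_speculative(state, compute_value, likely, anticipated, alternative):
    """Run the anticipated path; on a failed prediction restore state and run the alternative."""
    snapshot = copy.deepcopy(state)        # identify and save the initial state
    anticipated(state)                     # instructions specialized for the likely value
    actual = compute_value(snapshot)       # the value the data variable really takes
    if actual != likely:                   # failed prediction
        state.clear()
        state.update(snapshot)             # restore the initial state
        alternative(state)                 # general-purpose fallback instructions
    return state

def parity(s):
    return s["n"] % 2

def even_path(s):
    s["scratch"] = "speculative"           # side effect that must vanish on rollback
    s["out"] = s["n"] // 2

def general_path(s):
    s["out"] = s["n"] // 2 if s["n"] % 2 == 0 else 3 * s["n"] + 1

hit = run_speculative({"n": 8}, parity, 0, even_path, general_path)
miss = run_speculative({"n": 7}, parity, 0, even_path, general_path)
```

In the mispredicted case the `scratch` entry written by the anticipated path is gone after the restore, so the fallback path starts from the saved initial state.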
A device for executing a software program by at least one computational device, comprising an interconnected computing grid, connected to the at least one computational device, comprising an interconnected memory grid comprising a plurality of memory units connected by a plurality of memory network nodes, each connected to at least one of the plurality of memory units; wherein configuring the interconnected memory comprises: identifying a bypassable memory unit; selecting a backup memory unit connected to a backup memory network node; configuring the respective memory network node connected to the bypassable memory unit to forward at least one memory access request, comprising an address in a first address range, to the backup memory network node; and configuring the backup memory network node to access the backup memory unit in response to the at least one memory access request, in addition to accessing the respective at least one memory unit connected thereto.
A system for generating executable code of a software program that is matched with an intermediate representation (IR) of a source code of the software program. The system comprises a processor adapted for adding one or more annotation entries, each for a location in the IR, to program data in the IR. An internal annotation entry is generated for an internal location in the IR that is not referenced by an IR symbol in the global IR symbol table of the IR. The processor is further adapted for compiling the IR to produce a binary object comprising the annotation entries, and providing the binary object to a linker or to a dynamic loader to update in an executable object an executable internal annotation entry associated with an internal annotation entry to reference a run-time location in the executable object.
An apparatus for executing a software program, comprising processing units and a hardware processor adapted for: in an intermediate representation of the software program, where the intermediate representation comprises blocks, each associated with an execution block of the software program and comprising intermediate instructions, identifying a calling block and a target block, where the calling block comprises a control-flow intermediate instruction to execute a target intermediate instruction of the target block; generating target instructions using the target block; generating calling instructions using the calling block and a computer control instruction for invoking the target instructions, when the calling instructions are executed by a calling processing unit and the target instructions are executed by a target processing unit; configuring the calling processing unit for executing the calling instructions; and configuring the target processing unit for executing the target instructions.
A method for executing a software program, comprising: identifying in a program a plurality of host threads, each for performing some of a plurality of parallel sub-tasks of a task; and for each of the host threads: generating device threads, each associated with the host thread, each for one of the parallel tasks associated thereof; generating a parent thread associated with the host thread for communicating with the device threads; configuring a host processing circuitry to execute the parent thread; and configuring at least one other processing circuitry to execute in parallel the device threads while the host processing circuitry executes the parent thread; and for at least one of the host threads: receiving by the parent thread a value from the at least one other processing circuitry, the value generated when executing at least one of the device threads associated with the at least one host thread.
An apparatus for computing functions using polynomial-based approximation, comprising one or more processing circuitries configured for computing a polynomial-based approximant approximating a function by executing one or more iterations. Each iteration comprises computing the polynomial-based approximant using scaled fixed-point unit(s) according to a constructed set of coefficients, minimizing an approximation error of the computed polynomial-based approximant compared to the function while complying with one or more constraints selected from a group comprising at least: an accuracy, a compute graph size, a computation complexity, and a hardware utilization of the processing circuitries, adjusting one or more of the coefficients in case the approximation error is noncompliant with the constraint(s) and initiating another iteration. The polynomial-based approximant and its adjusted set of coefficients for which the computed polynomial-based approximant complies with the constraint(s) may be output to one or more processing circuitries configured to approximate the function by computing the polynomial-based approximant.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Computer hardware, software and peripherals, namely, control-flow processors (CPUs/GPUs), application-specific hardware accelerators, PCI accelerator cards, chipsets, semiconductor processors, semiconductor ASICs (application-specific integrated circuits), microprocessors, microcontrollers, electronic circuit boards and other computer hardware for high-speed processing; computer software namely, computer application software and algorithms for compute acceleration and for managing hardware or software components used for processing; all for implementation in various market segments and applications, namely, in data centers, supercomputers, appliances, servers, big data analytics, neural networks, optimization and computation of intensive applications, artificial intelligence and machine learning, network processors, image analysis, cybersecurity, financial trading.
Assessing technical requirements for others in connection with computer hardware and software, namely, control-flow processors (CPUs/GPUs), application-specific hardware accelerators, PCI accelerator cards, semiconductor processors, semiconductor ASICs (application-specific integrated circuits), microprocessors; Assessing technical requirements for others in connection with computer hardware and software, namely, microcontrollers, electronic circuit boards and other computer hardware for high-speed processing, computer software namely, computer application software for processing acceleration and for managing hardware or software components used for processing; all the aforesaid services provided in various market segments and applications, namely, big data analytics, neural networks, optimization and computation of intensive applications, artificial intelligence and machine learning, network processors, image analysis, cybersecurity, financial trading; design, engineering, development, implementation and consulting for others in connection with computer hardware and software, namely, control-flow processors (CPUs/GPUs), application-specific hardware accelerators, PCI accelerator cards, semiconductor processors, semiconductor ASICs (application-specific integrated circuits), microprocessors; design, engineering, development, implementation and consulting for others in connection with computer hardware and software, namely, microcontrollers, electronic circuit boards and other computer hardware for high-speed processing, computer software namely, computer application software for processing acceleration and for managing hardware or software components used for processing; all the aforesaid services provided in various market segments and applications, namely, big data analytics, neural networks, optimization and computation of intensive applications, artificial intelligence and machine learning, network processors, image analysis, cybersecurity, financial trading.
56.
DYNAMIC ALLOCATION OF EXECUTABLE CODE FOR MULTI-ARCHITECTURE HETEROGENEOUS COMPUTING
An apparatus for executing a software program, comprising processing units and a hardware processor adapted for: in an intermediate representation of the software program, where the intermediate representation comprises blocks, each associated with an execution block of the software program and comprising intermediate instructions, identifying a calling block and a target block, where the calling block comprises a control-flow intermediate instruction to execute a target intermediate instruction of the target block; generating target instructions using the target block; generating calling instructions using the calling block and a computer control instruction for invoking the target instructions, when the calling instructions are executed by a calling processing unit and the target instructions are executed by a target processing unit; configuring the calling processing unit for executing the calling instructions; and configuring the target processing unit for executing the target instructions.
An apparatus for executing a software program, comprising at least one hardware processor configured for: identifying in a plurality of computer instructions at least one remote memory access instruction and a following instruction following the at least one remote memory access instruction; executing after the at least one remote memory access instruction a sequence of other instructions, where the sequence of other instructions comprises a return instruction to execute the following instruction; and executing the following instruction; wherein executing the sequence of other instructions comprises executing an updated plurality of computer instructions produced by at least one of: inserting into the plurality of computer instructions the sequence of other instructions or at least one flow-control instruction to execute the sequence of other instructions; and replacing the at least one remote memory access instruction with at least one non-blocking memory access instruction.
An apparatus for computing, comprising a processing circuitry configured for computing an outcome of executing a set of computer instructions comprising a group of data variables, by: identifying an initial state of the processing circuitry; executing a set of anticipated computer instructions produced based on the set of computer instructions and a likely data value, where the likely data value is a value of one of the group of data variables anticipated to be computed by executing the set of computer instructions and computed using at least one program data value; and when identifying, while executing the set of anticipated computer instructions, a failed prediction where the data variable is not equal to the likely data value: restoring the initial state of the processing circuitry; and executing a set of alternative computer instructions, produced based on the set of computer instructions and the at least one likely data value.
A device for executing a software program by at least one computational device, comprising an interconnected computing grid, connected to the at least one computational device, comprising an interconnected memory grid comprising a plurality of memory units connected by a plurality of memory network nodes, each connected to at least one of the plurality of memory units; wherein configuring the interconnected memory comprises: identifying a bypassable memory unit; selecting a backup memory unit connected to a backup memory network node; configuring the respective memory network node connected to the bypassable memory unit to forward at least one memory access request, comprising an address in a first address range, to the backup memory network node; and configuring the backup memory network node to access the backup memory unit in response to the at least one memory access request, in addition to accessing the respective at least one memory unit connected thereto.
An apparatus for computing, comprising a processing circuitry configured for computing an outcome of executing a set of computer instructions comprising a group of data variables, by: identifying an initial state of the processing circuitry; executing a set of anticipated computer instructions produced based on the set of computer instructions and a likely data value, where the likely data value is a value of one of the group of data variables anticipated while executing the set of computer instructions; and when identifying, while executing the set of anticipated computer instructions, a failed prediction where the data variable is not equal to the likely data value: restoring the initial state of the processing circuitry; and executing a set of alternative computer instructions, produced based on the set of computer instructions and the at least one likely data value.
A method for cache coherency in a reconfigurable cache architecture is provided. The method includes receiving a memory access command, wherein the memory access command includes at least an address of a memory to access; determining at least one access parameter based on the memory access command; and determining a target cache bin for serving the memory access command based in part on the at least one access parameter and the address.
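One way to picture the bin selection step is a deterministic function of the address and an access parameter, so that every request for the same data reaches the same bin. The sketch below is only an assumption about how such a mapping might look: `AccessCommand`, `process_id`, `target_bin`, the 64-byte line size and the XOR mix are illustrative choices, not taken from the abstract.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AccessCommand:
    address: int
    process_id: int   # hypothetical access parameter derived from the command

def target_bin(cmd, num_bins, line_bytes=64):
    """Map a command to a cache bin using its address and access parameter."""
    line = cmd.address // line_bytes             # all bytes of one line share a bin
    return (line ^ cmd.process_id) % num_bins    # simple deterministic mix (assumption)
```

Because the mapping is a pure function of the command, two accesses to the same line with the same parameter always resolve to a single bin, so no second copy of the line needs to be kept coherent.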
G06F 12/00 - Accessing, addressing or allocating within memory systems or architectures
G06F 3/06 - Digital input from, or digital output to, record carriers
G06F 13/00 - Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
G06F 13/28 - Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access, cycle steal
66.
Dynamic allocation of executable code for multi-architecture heterogeneous computing
An apparatus for executing a software program, comprising processing units and a hardware processor adapted for: in an intermediate representation of the software program, where the intermediate representation comprises blocks, each associated with an execution block of the software program and comprising intermediate instructions, identifying a calling block and a target block, where the calling block comprises a control-flow intermediate instruction to execute a target intermediate instruction of the target block; generating target instructions using the target block; generating calling instructions using the calling block and a computer control instruction for invoking the target instructions, when the calling instructions are executed by a calling processing unit and the target instructions are executed by a target processing unit; configuring the calling processing unit for executing the calling instructions; and configuring the target processing unit for executing the target instructions.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Computer hardware, software and peripherals, namely,
control-flow processors (CPUs/GPUs), application-specific
hardware accelerators, PCI accelerator cards, chipsets,
semiconductor processors, semiconductor ASICs
(application-specific integrated circuits), microprocessors,
microcontrollers, electronic circuit boards and other
computer hardware for high-speed processing; all for
implementation in various market segments and applications,
namely, in data centers, supercomputers, appliances,
servers, big data analytics, neural networks, optimization
and computation of intensive applications, artificial
intelligence and machine learning, network processors, image
analysis, cybersecurity, financial trading.
Characterization, design, engineering, development,
implementation and consulting for others in connection with
computer hardware and software, namely, control-flow
processors (CPUs/GPUs), application-specific hardware
accelerators, PCI accelerator cards, semiconductor
processors, semiconductor ASICs (application-specific
integrated circuits), microprocessors, microcontrollers,
electronic circuit boards and other computer hardware for
high-speed processing; computer software, namely, computer
application software for processing computing acceleration
and for managing hardware or software components used for
processing provided in various market segments and
applications, namely, big data analytics, neural networks,
optimization and computation of intensive applications,
artificial intelligence and machine learning, network
processors, image analysis, cybersecurity, financial
trading; computer software namely, computer application
software and algorithms programs for computing acceleration
and for managing hardware or software components used for
processing.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Computer hardware, software and peripherals, namely,
control-flow processors (CPUs/GPUs), application-specific
hardware accelerators, PCI accelerator cards, chipsets,
semiconductor processors, semiconductor ASICs
(application-specific integrated circuits), microprocessors,
microcontrollers, electronic circuit boards and other
computer hardware for high-speed processing; computer
software namely, computer application software and
algorithms programs for computing acceleration and for
managing hardware or software components used for
processing; all for implementation in various market
segments and applications, namely, in data centers,
supercomputers, appliances, servers, big data analytics,
neural networks, optimization and computation of intensive
applications, artificial intelligence and machine learning,
network processors, image analysis, cybersecurity, financial
trading.
Characterization, design, engineering, development,
implementation and consulting for others in connection with
computer hardware and software, namely, control-flow
processors (CPUs/GPUs), application-specific hardware
accelerators, PCI accelerator cards, semiconductor
processors, semiconductor ASICs (application-specific
integrated circuits), microprocessors, microcontrollers,
electronic circuit boards and other computer hardware for
high-speed processing; provided in various market segments
and applications, namely, big data analytics, neural
networks, optimization and computation of intensive
applications, artificial intelligence and machine learning,
network processors, image analysis, cybersecurity,
financial trading; providing computer software namely,
computer application software for processing computing
acceleration and for managing hardware or software
components used for processing provided in various market
segments and applications, namely, big data analytics,
neural networks, optimization and computation of intensive
applications, artificial intelligence and machine learning,
network processors, image analysis, cybersecurity, financial
trading.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Computer hardware, software and peripherals, namely,
control-flow processors (CPUs/GPUs), application-specific
hardware accelerators, PCI accelerator cards, chipsets,
semiconductor processors, semiconductor ASICs
(application-specific integrated circuits); microprocessors,
microcontrollers, electronic circuit boards and other
computer hardware for high-speed processing; computer
software namely, computer application software and
algorithms for computer acceleration and for managing
hardware or software components used for processing; all for
implementation in various market segments and applications,
namely, in data centers, supercomputers, appliances,
servers, big data analytics, neural networks, optimization
and computation of intensive applications, artificial
intelligence and machine learning, network processors, image
analysis, cybersecurity, financial trading.
Characterization, design, engineering, development,
implementation and consulting for others in connection with
computer hardware and software, namely, control-flow
processors (CPUs/GPUs), application-specific hardware
accelerators, PCI accelerator cards, semiconductor
processors, semiconductor ASICs (application-specific
integrated circuits), microprocessors, microcontrollers,
electronic circuit boards and other computer hardware for
high-speed processing, computer software, namely, computer
application software for processing acceleration and for
managing hardware or software components used for
processing; all the aforementioned services provided in
various market segments and applications, namely, big data
analytics, neural networks, optimization and computation of
intensive applications, artificial intelligence and machine
learning, network processors, image analysis, cybersecurity,
financial trading.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Computer hardware and peripherals, namely, control-flow processors in the nature of central processing units (CPU) and graphic processing units (GPU), application-specific computer hardware accelerators, peripheral component interface (PCI) accelerator cards, chipsets, semiconductor processors, semiconductor ASICs (application-specific integrated circuits); microprocessors, microcontrollers, electronic circuit boards and computer hardware for high-speed processing; downloadable computer software, namely, computer application software and algorithms for computer acceleration and for managing hardware or software components used for processing; all for implementation in various market segments and applications, namely, in data centers, supercomputers, appliances, servers, big data analytics, neural networks, optimization and computation of intensive applications, artificial intelligence and machine learning, network processors, image analysis, cybersecurity, financial trading.
Computer design, engineering, development, implementation and consulting for others in connection with computer hardware and software, namely, control-flow processors in the nature of central processing units (CPU) and graphic processing units (GPU), application-specific hardware accelerators, peripheral component interface (PCI) accelerator cards, semiconductor processors, semiconductor ASICs (application-specific integrated circuits), microprocessors, microcontrollers, electronic circuit boards and other computer hardware for high-speed processing, computer software, namely, computer application software for processing acceleration and for managing hardware or software components used for processing; all the aforementioned services provided in various market segments and applications, namely, big data analytics, neural networks, optimization and computation of intensive applications, artificial intelligence and machine learning, network processors, image analysis, cybersecurity, financial trading.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Computer hardware, software and peripherals, namely, control-flow processors in the nature of central processing units (CPU) and graphic processing units (GPU), application-specific computer hardware accelerators, peripheral component interface (PCI) accelerator cards, chipsets, semiconductor processors, semiconductor ASICs (application-specific integrated circuits); microprocessors, microcontrollers, electronic circuit boards and computer hardware for high-speed processing; downloadable computer software, namely, computer application software and algorithms for computer acceleration and for managing hardware or software components used for processing; all for implementation in various market segments and applications, namely, in data centers, supercomputers, appliances, servers, big data analytics, neural networks, optimization and computation of intensive applications, artificial intelligence and machine learning, network processors, image analysis, cybersecurity, financial trading.
Computer design, engineering, development, implementation and consulting for others in connection with computer hardware and software, namely, control-flow processors in the nature of central processing units (CPU) and graphic processing units (GPU), application-specific hardware accelerators, peripheral component interface (PCI) accelerator cards, semiconductor processors, semiconductor ASICs (application-specific integrated circuits), microprocessors, microcontrollers, electronic circuit boards and other computer hardware for high-speed processing, computer software, namely, computer application software for processing acceleration and for managing hardware or software components used for processing; all the aforementioned services provided in various market segments and applications, namely, big data analytics, neural networks, optimization and computation of intensive applications, artificial intelligence and machine learning, network processors, image analysis, cybersecurity, financial trading.
76.
DIRECTED AND INTERCONNECTED GRID DATAFLOW ARCHITECTURE
A computing grid including an interconnect network including input ports and output ports; a plurality of egress ports; a plurality of configurable data routing junctions; a plurality of logical elements interconnected using the plurality of configurable data routing junctions; and a plurality of ingress ports. In an embodiment, at least one compute graph is projected onto the computing grid as a configuration of various elements of the computing grid.
A method for runtime optimization of a configurable processing architecture is disclosed. The method comprises receiving a plurality of calls for running at least one function; identifying at least one pattern among the plurality of received calls; and based on the at least one pattern, manipulating at least a portion of the configurable processing architecture, to compute the at least one function.
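A simple reading of "identifying a pattern among received calls" is call-frequency counting, followed by giving frequently called functions a proportional share of the configurable resources. The sketch below assumes exactly that; `plan_allocation`, `hot_threshold` and the proportional split are hypothetical and far simpler than any real reconfiguration policy.

```python
from collections import Counter

def plan_allocation(calls, total_units, hot_threshold=3):
    """Return how many processing units each frequently called function should get."""
    counts = Counter(calls)                                   # observe the received calls
    hot = {f: n for f, n in counts.items() if n >= hot_threshold}  # the detected pattern
    if not hot:
        return {}
    total = sum(hot.values())
    # Proportional share, rounded down so the plan never exceeds total_units.
    return {f: total_units * n // total for f, n in hot.items()}
```

For a call trace of six `fft` calls, three `sort` calls and one `crc` call on nine units, the plan assigns six units to `fft` and three to `sort`, and leaves the rarely called `crc` unaccelerated.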
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors