A computing system can include a plurality of deterministic processors. Each deterministic processor can be configured to execute one or more assembled programs in a program order of the one or more assembled programs. The computing system can include a plurality of communication links providing communication between pairs of processors of the plurality of deterministic processors.
Systems and methods described herein provide for: determining a first amount of power demanded by a load device in a first scheduled operating state; provisioning, by a deterministic energy provisioning device, the first amount of power to the load device during the first scheduled operating state; determining a second amount of power demanded by the load device in a second scheduled operating state, the second amount of power being greater than the first amount of power; and provisioning, by the deterministic energy provisioning device, the second amount of power to the load device during the second scheduled operating state.
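The scheduled provisioning flow above can be sketched as a small model: the energy provisioning device looks up the load's demanded power for each scheduled operating state and provisions exactly that amount. The state names and wattages here are illustrative, not taken from the patent.

```python
# Demand per scheduled operating state, in watts (values are made up).
demand_by_state = {"idle": 40.0, "compute": 275.0}

def provision(device_log, state):
    """Determine the load's demand for the given scheduled state and
    provision that amount, recording the action."""
    watts = demand_by_state[state]     # determine demand for this state
    device_log.append((state, watts))  # provision that amount of power
    return watts

log = []
provision(log, "idle")      # first scheduled state: smaller amount
provision(log, "compute")   # second scheduled state: greater amount
print(log)
```

Because the states are scheduled ahead of time, the device can provision the larger second amount before the transition rather than reacting to a demand spike.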
H02M 3/158 - Conversion of DC power input into DC power output without intermediate conversion into AC by static converters using discharge tubes with control electrode or semiconductor devices with control electrode using devices of a triode or transistor type requiring continuous application of a control signal using semiconductor devices only with automatic control of output voltage or current, e.g. switching regulators including plural semiconductor devices as final control devices for a single load
3.
POWER MANAGEMENT IN DETERMINISTIC TENSOR STREAMING PROCESSORS
Embodiments pertain to reducing power consumption in a computing system comprising one or more deterministic processors. A controller generates a plurality of control signals for a voltage regulator to regulate a supply voltage of a respective one of the one or more deterministic processors. A power management module determines an initial profile for power consumption and performance of an algorithm executed on the respective deterministic processor having an initial value for the supply voltage and an initial value for a clock frequency. The power management module further determines a target profile for power consumption and performance of the algorithm executed on the respective deterministic processor. The controller modifies the plurality of control signals based on the initial profile and the target profile. The respective deterministic processor executes the algorithm while the supply voltage is dynamically modified by the voltage regulator based on the modified plurality of control signals.
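The profile-driven control loop above can be sketched as follows, assuming a simple linear ramp between operating points. The `Profile` structure, step count, and interpolation scheme are illustrative assumptions, not the patented control method.

```python
from dataclasses import dataclass

@dataclass
class Profile:
    """A power/performance operating point (hypothetical model)."""
    voltage_v: float   # supply voltage in volts
    freq_mhz: float    # clock frequency in MHz

def control_signals(initial: Profile, target: Profile, steps: int = 4):
    """Produce a sequence of voltage-regulator setpoints moving from the
    initial profile toward the target profile in fixed steps."""
    signals = []
    for i in range(1, steps + 1):
        t = i / steps
        signals.append(Profile(
            voltage_v=initial.voltage_v + t * (target.voltage_v - initial.voltage_v),
            freq_mhz=initial.freq_mhz + t * (target.freq_mhz - initial.freq_mhz),
        ))
    return signals

# Ramp down from a high-power profile to a lower-power target while the
# algorithm keeps executing; the regulator applies each setpoint in turn.
ramp = control_signals(Profile(0.90, 900), Profile(0.75, 600))
print([(round(p.voltage_v, 4), round(p.freq_mhz, 1)) for p in ramp])
```

A real controller would also respect regulator slew-rate limits and hold frequency changes until the voltage is safe for the new clock rate.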
A system contains a network of processors arranged in a plurality of nodes. Each node comprises a respective plurality of processors connected via local links, and different nodes are connected via global links. The processors of the network communicate with each other to establish a global counter for the network, enabling deterministic communication between the processors of the network. A compiler is configured to explicitly schedule communication traffic across the global and local links of the network of processors based upon the deterministic links between the processors, which enable software-scheduled networking with explicit send or receive instructions executed by functional units of the processors at specific times, to establish a specific ordering of operations performed by the network of processors. In some embodiments, the processors of the network of processors are tensor streaming processors (TSPs).
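The software-scheduled networking described above can be illustrated with a toy compile-time scheduler: the "compiler" fixes, against the global counter, the cycle on which each processor executes a send or receive, so no runtime arbitration is needed. The data structures and the one-injection-per-cycle policy are illustrative assumptions, not Groq's actual scheduling algorithm.

```python
def compile_schedule(transfers):
    """transfers: list of (src, dst, link_latency) tuples.
    Returns per-processor instruction streams keyed by processor name,
    each entry tagged with the global-counter value at which it runs."""
    schedule = {}
    cycle = 0
    for src, dst, latency in transfers:
        schedule.setdefault(src, []).append((cycle, ("SEND", dst)))
        schedule.setdefault(dst, []).append((cycle + latency, ("RECV", src)))
        cycle += 1  # stagger injections so each link carries one transfer at a time
    return schedule

# Two hops with a 3-cycle link latency each; every send and receive gets
# an explicit global-counter timestamp at compile time.
sched = compile_schedule([("p0", "p1", 3), ("p1", "p2", 3)])
print(sched["p1"])  # p1 sends at cycle 1 and receives at cycle 3
```

Because both endpoints agree on the global counter, the receive at cycle 3 deterministically matches the data injected at cycle 0.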
A deterministic apparatus comprising a deterministic near-compute memory communicatively coupled with and proximate to a deterministic processor. The deterministic near-compute memory comprises a plurality of data banks having a global memory address space, a control bus, a data input bus and a data output bus for each data bank. The deterministic processor is configured to initiate, via the control bus, retrieval of a set of data from the plurality of data banks. The retrieved set of data comprises at least one row of a selected one of the data banks passed via the data output bus onto a plurality of stream registers of the deterministic processor.
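A minimal model of the banked near-compute memory above: the processor issues a global row address on the control bus, the address decodes to a (bank, row) pair, and the selected row is returned as if driven over the data output bus onto stream registers. Bank counts, row widths, and the address-decode scheme are illustrative assumptions.

```python
class NearComputeMemory:
    """Toy banked memory with a flat global row address space."""

    def __init__(self, n_banks=4, rows_per_bank=8, row_width=4):
        self.rows_per_bank = rows_per_bank
        # Fill each row with a recognisable pattern: bank*1000 + row*10 + col.
        self.banks = [
            [[b * 1000 + r * 10 + c for c in range(row_width)]
             for r in range(rows_per_bank)]
            for b in range(n_banks)
        ]

    def read_row(self, global_row):
        """Control-bus request: decode the global row address into a
        (bank, row) pair and return that row of the selected bank."""
        bank, row = divmod(global_row, self.rows_per_bank)
        return list(self.banks[bank][row])

mem = NearComputeMemory()
stream_registers = mem.read_row(10)   # decodes to bank 1, row 2
print(stream_registers)               # → [1020, 1021, 1022, 1023]
```

The key property is that an entire row moves in one transaction, matching the stream-register width rather than a narrow word interface.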
Improved placement of memory and functional modules ('tiles') within a tiled processor architecture is disclosed for linear algebra calculations involving vectors and matrices comprising large amounts of data. The improved placement puts the data in close proximity to the functional modules that perform calculations using the data, enabling those calculations to be performed more quickly while using less energy and, in particular, improving the efficiency of the training and application of deep learning and artificial neural network systems. This Abstract and the independent Claims are concise signifiers of embodiments of the claimed inventions. The Abstract does not limit the scope of the claimed inventions.

Embodiments are directed to an integrated circuit with multiple dies connected in a die-to-die (D2D) configuration. The integrated circuit can include a first die and a second die connected to the first die via a D2D interface circuit in the D2D configuration forming a D2D structure with the first die. The D2D interface can connect a first plurality of superlanes of the first die with a second plurality of superlanes of the second die for streaming data between the first die and the second die along a first direction or a second direction orthogonal to the first direction.
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
G06F 9/30 - Arrangements for executing machine instructions, e.g. instruction decode
H01L 25/07 - Assemblies consisting of a plurality of individual semiconductor or other solid-state devices, all the devices being of a type provided for in a single subclass, e.g. assemblies of rectifier diodes, the devices not having separate containers
8.
PERMISSION CONTROL VIA DATA REDUNDANCY IN DETERMINISTIC STREAMING SYSTEM
Embodiments are directed to a computing system with permission control via data redundancy. The computing system includes a memory and a permission control circuit coupled to the memory. The permission control circuit encodes a first data vector by using a bit position register with a first permission control code for a first user, writes the encoded first data vector into the memory, and updates content of the bit position register from the first permission control code to a second permission control code for a second user. The encoded first data vector written into the memory is inaccessible for the second user based on the updated content of the bit position register.
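One way to picture the bit-position-register mechanism above: a data word is encoded by permuting its bit positions according to the permission control code held in the register, so a read performed while a different code is in the register yields scrambled data. The code-keyed permutation used here is an illustrative stand-in, not the patented encoding.

```python
import random

WIDTH = 8  # toy word width

def permutation(code):
    """Bit-position placement derived deterministically from a permission code."""
    positions = list(range(WIDTH))
    random.Random(code).shuffle(positions)
    return positions

def encode(word, code):
    """Scatter each bit of word to the position the code dictates."""
    perm = permutation(code)
    out = 0
    for src, dst in enumerate(perm):
        out |= ((word >> src) & 1) << dst
    return out

def decode(word, code):
    """Gather bits back, which only works with the matching code."""
    perm = permutation(code)
    out = 0
    for src, dst in enumerate(perm):
        out |= ((word >> dst) & 1) << src
    return out

stored = encode(0b1011_0010, code=1)            # user 1 writes to memory
assert decode(stored, code=1) == 0b1011_0010    # user 1 reads it back
print(decode(stored, code=2))                   # user 2: scrambled, with high probability
```

Updating the register to user 2's code thus renders user 1's stored vectors inaccessible without erasing them.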
Embodiments are directed to a processor having a functional slice architecture. The processor is divided into tiles (or functional units) organized into a plurality of functional slices. The functional slices are configured to perform specific operations within the processor, which includes memory slices for storing operand data and arithmetic logic slices for performing operations on received operand data (e.g., vector processing, matrix manipulation). The processor includes a plurality of functional slices of a module type, each functional slice having a plurality of tiles. The processor further includes a plurality of data transport lanes for transporting data in a direction indicated in a corresponding instruction. The processor also includes a plurality of instruction queues, each instruction queue associated with a corresponding functional slice of the plurality of functional slices, wherein the instructions in the instruction queues comprise a functional slice specific operation code.
A system receives a predictive model and one or more runtime constraints. The system generates a directed acyclic graph (DAG) of the predictive model indicating dependencies. The system compiles the predictive model into first instructions for a first processor based on the one or more runtime constraints and the DAG, and packages the first instructions, the one or more runtime constraints, and the DAG of the predictive model in a first binary. The system recompiles the predictive model into second instructions for a second processor based on the runtime constraints and the DAG stored in the first binary, and packages the second instructions, the DAG, and the runtime constraints in a second binary.
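The compile-and-package flow above can be sketched as follows: because the dependency DAG and the runtime constraints travel inside each binary, the model can be retargeted to a second processor without returning to the original source. The structures and the one-instruction-per-op "compiler" are illustrative assumptions.

```python
from graphlib import TopologicalSorter

def compile_model(dag, constraints, target):
    """'Compile' = emit one pseudo-instruction per op, in dependency order."""
    order = TopologicalSorter(dag).static_order()
    return [f"{target}:{op}" for op in order]

def package(instructions, dag, constraints):
    """Bundle instructions together with the DAG and constraints."""
    return {"instructions": instructions, "dag": dag, "constraints": constraints}

dag = {"matmul": {"load_w", "load_x"}, "relu": {"matmul"}}  # op -> dependencies
constraints = {"max_latency_ms": 5}

first = package(compile_model(dag, constraints, "tsp0"), dag, constraints)

# Later: recompile for a second processor using only what the first
# binary carries, then package the second binary the same way.
second = package(compile_model(first["dag"], first["constraints"], "tsp1"),
                 first["dag"], first["constraints"])
print(second["instructions"])
```

Carrying the DAG in the binary is what makes the recompilation self-contained: the dependency order is recoverable without re-analysing the model.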
42 - Scientific, technological and industrial services, research and design
Goods & Services
Providing temporary use of online non-downloadable software for use as an application programming interface (API) for integrating natural language processing, machine learning, and artificial intelligence software into third-party computer programs; providing temporary use of online non-downloadable software for use as an application programming interface (API) for access to open-source large language models (LLMs); platform as a service (PaaS) featuring computer software platforms providing integration of natural language processing, machine learning, and artificial intelligence software into third-party computer programs; platform as a service (PaaS) featuring computer software platforms providing access to open-source large language models (LLMs); platform as a service (PaaS) featuring software for using artificial intelligence for the generation and processing of natural language into machine-executable commands; platform as a service (PaaS) featuring software for conversion between speech or language recognition and text.
12.
BALANCED BINARY TREE STRUCTURES FOR STREAM REDUCING OPERATIONS
Methods, systems, and other embodiments are described for incorporating a balanced binary tree into the multiplication modules of a tensor processor to execute sequences of instructions more efficiently for Stream Reducing operations. This Abstract and the independent Claims are concise signifiers of embodiments of the claimed inventions. The Abstract does not limit the scope of the claimed inventions.
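The benefit of the balanced binary tree can be shown with a small reduction model: summing n partial products through a linear chain requires n-1 dependent steps, while a balanced tree needs only about log2(n) levels, so the reduction pipelines far better inside the multiplication modules. This sketch is an illustration of tree reduction in general, not the patented hardware structure.

```python
def tree_reduce(values, op):
    """Reduce pairwise, level by level, returning (result, n_levels).
    A leftover element at an odd level is carried up unchanged."""
    levels = 0
    while len(values) > 1:
        values = [op(values[i], values[i + 1]) if i + 1 < len(values)
                  else values[i]
                  for i in range(0, len(values), 2)]
        levels += 1
    return values[0], levels

# 16 partial products: 4 tree levels instead of 15 sequential additions.
result, depth = tree_reduce(list(range(16)), lambda a, b: a + b)
print(result, depth)  # → 120 4
```

For a stream of partial products this matters because each tree level can process a new wave of inputs while deeper levels finish earlier waves.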
Methods are provided using lattices (representing graph data structures that encode a machine learning application program) and lattice images and algorithms to accelerate optimization problems that arise during the mapping of tensor processing algorithms to specialized processors, such as for machine learning (ML) or artificial intelligence (AI), optimizations such as the scheduling of instructions and the allocation of memory. These optimizations are obtained with improved compiler methods that use lattice image and transformation algorithms to accelerate, among other tasks, memory partitioning, operation splitting, loop fusion, and optimal allocation of intermediate results of matrix mathematical operations.
A method comprises receiving a kernel used to convolve with an input tensor. For a first dimension of the kernel, a square block of values is generated for each single-dimensional vector of the kernel, the block including all rotations of that single-dimensional vector. For each additional dimension of the kernel, blocks of the immediately preceding dimension are grouped into sets of blocks, each set including blocks of the immediately preceding dimension that are aligned along a vector parallel to the axis of the dimension, and one or more blocks of values are generated for the additional dimension, each block including all rotations of blocks within each of the sets of blocks of the immediately preceding dimension. The block of values corresponding to the last dimension of the kernel is output as the expanded kernel.
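The first step of the method above, for a 1-D kernel vector, produces a square block whose rows are all rotations of the vector (a circulant block); higher dimensions then group and rotate these blocks block-wise in the same fashion. A minimal sketch of the 1-D rotation block, with an arbitrary rotation direction chosen for illustration:

```python
def rotations_block(vec):
    """Square block whose rows are the successive rotations of vec."""
    n = len(vec)
    return [[vec[(j - i) % n] for j in range(n)] for i in range(n)]

block = rotations_block([1, 2, 3])
for row in block:
    print(row)
# → [1, 2, 3]
#   [3, 1, 2]
#   [2, 3, 1]
```

Each row shifts the kernel by one position, which is what lets a single expanded kernel cover every alignment of the convolution window at once.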
G06F 7/544 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation, using non-contact-making devices, e.g. tube, solid state devices; Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation, using unspecified devices for evaluating functions by calculation
G06F 7/76 - Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data
G06F 18/2137 - Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods, based on criteria of topology preservation, e.g. multidimensional scaling or self-organising maps
G06N 3/04 - Architecture, e.g. interconnection topology
A processor comprises a computational array of computational elements and an instruction dispatch circuit. The computational elements receive data operands via data lanes extending along a first dimension, and processes the operands based upon instructions received from the instruction dispatch circuit via instruction lanes extending along a second dimension. The instruction dispatch circuit receives raw instructions, and comprises an instruction dispatch unit (IDU) processor that processes a set of raw instructions to generate processed instructions for dispatch to the computational elements, where the number of processed instructions is not equal to the number of instructions of the set of raw instructions. The processed instructions are dispatched to columns of the computational array via a plurality of instruction queues, wherein an instruction vector of instructions is shifted between adjacent instruction queues in a first direction, and dispatches instructions to the computational elements in a second direction.
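The two-direction dispatch pattern above can be modelled simply: each processed instruction enters the first column's queue and is shifted one column per cycle in the first direction, while each queue dispatches down its column of computational elements in the second direction, so column c executes instruction i at cycle i + c. The timing model and names are illustrative assumptions.

```python
def staggered_dispatch(instructions, n_columns):
    """Return (cycle, column, instruction) events for an instruction
    vector that shifts across adjacent column queues one step per cycle."""
    events = []
    for i, ins in enumerate(instructions):
        for c in range(n_columns):
            events.append((i + c, c, ins))  # instruction i reaches column c at cycle i+c
    return sorted(events)

# Two instructions sweeping across three columns: a diagonal wavefront.
for cycle, col, ins in staggered_dispatch(["LD", "MUL"], 3):
    print(cycle, col, ins)
```

The resulting diagonal wavefront is what lets data flowing along the lanes meet the correct instruction at each column without any per-element control logic.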
G06F 9/48 - Program initiating; Program switching, e.g. by interrupt
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
16.
OPTIMIZING INSTRUCTION SCHEDULING AND MEMORY ALLOCATION FOR TENSOR AND GRAPHICAL PROCESSORS USING LATTICE IMAGE DATA STRUCTURE OPTIMIZATIONS
Optimizing instruction scheduling and memory allocation for tensor and graphical processors using lattice image data structure optimizations is provided. A method of using loop fusion by a compiler in interconnected accelerator units to simplify a machine learning (ML) graph representing a program to be compiled is provided. The method includes (A) lowering a plurality of operations in an initial first program to an original plurality of at least two loops. The method also includes (B) inferring a fused loop structure from the original plurality of at least two loops in the initial first program, thereby creating a second program having the fused loop. The fused loop in the second program pipelines the multiplication and addition operations, significantly reducing memory bandwidth requirements and improving cache locality.
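The loop fusion above can be illustrated with a toy multiply-add: the unfused program materialises the intermediate product vector in memory and re-reads it for the addition, while the fused loop keeps each product live in a register and performs the multiply-add in a single pass, eliminating the intermediate's memory traffic. This is a generic illustration of loop fusion, not the patented compiler method.

```python
def unfused(a, b, c):
    tmp = [x * y for x, y in zip(a, b)]      # loop 1: multiply, store tmp to memory
    return [t + z for t, z in zip(tmp, c)]   # loop 2: reload tmp, then add

def fused(a, b, c):
    # One fused loop: the product never leaves a register.
    return [x * y + z for x, y, z in zip(a, b, c)]

a, b, c = [1, 2, 3], [4, 5, 6], [7, 8, 9]
print(unfused(a, b, c), fused(a, b, c))  # → [11, 18, 27] [11, 18, 27]
```

Both versions compute the same result; the fused form simply removes one full write and one full read of the intermediate vector, which is the bandwidth saving the abstract refers to.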
09 - Scientific and electric apparatus and instruments
35 - Advertising and business services
38 - Telecommunications services
39 - Transport, packaging, storage and travel services
41 - Education, entertainment, sporting and cultural services
42 - Scientific, technological and industrial services, research and design
Goods & Services
(1) Downloadable and recorded artificial intelligence software for speech, voice, and language recognition, generation, and translation; downloadable and recorded software for training artificial intelligence models; downloadable and recorded artificial intelligence software for generating text, images, audio, video and 3D models; downloadable and recorded artificial intelligence software for machine learning, natural language generation, predictive analytics and business intelligence; downloadable and recorded artificial intelligence inference software; downloadable and recorded software for use in natural language processing, machine learning, voice command and recognition, converting speech to text, data analytics, predictive analytics, and artificial intelligence; downloadable and recorded software for use in natural language processing, generation, understanding and analysis relating to automatically generating computer software; downloadable and recorded speculative decoding software; downloadable and recorded computer software development tools; downloadable software development kits (SDKs); downloadable databases, namely, open source libraries of software tools for software development; downloadable and recorded computer software for deploying and optimizing models in the fields of data science, artificial intelligence, and machine learning; downloadable and recorded computer software for developing, integrating and deploying software scripts; downloadable and recorded computer performance software, for operating integrated circuits, semiconductors, computer chip sets and microprocessors; downloadable and recorded application programming interface (API) software; downloadable and recorded application programming interface (API) software for software development in the fields of artificial intelligence, machine learning, neural networks, high performance computing, and distributed computing systems; computer hardware and recorded and downloadable computer 
software for use in machine learning and artificial intelligence applications; downloadable and recorded computer software for use as an application programming interface (API); computer hardware and recorded and downloadable software for use in managing and operating data centers and data storage; downloadable and recorded compiler software; downloadable and recorded cloud computing software for deploying and managing virtual machines on a cloud computing platform; computer network server; computer software for accessing, browsing, sharing, defining, maintaining, virtualizing and communicating information over computer networks, servers and secure private networks; computer software for use in creating, managing, controlling, changing, replicating, deploying, naming and linking information over computer networks, servers and secure private networks; downloadable and recorded virtualization software for cloud access and cloud computing; network connection devices for the transportation of data (network fabrics); downloadable and recorded computer software for running development programs and application programs; computer hardware and peripherals; computer memory hardware; microprocessors, integrated circuits, and computer chips; semiconductors; host channel adapters; target channel adapters; computer network adapters, switches, routers, and hubs; backplanes; host bus adapters; computer hardware and downloadable and recorded computer software to enable computing devices to communicate and share data with other computing devices; software stacks; server racks and network racks namely hardware to support computing devices; computer hardware for enabling connections among central processing units (CPUs), servers and data storage devices; peripheral component interconnect express (PCIE) expansion cards; computer hardware that enables peripheral component interconnect express (PCIE) access to multiple central processing units (CPUs); computer hardware and downloadable 
and recorded computer software for creating, facilitating, and managing remote access to and communication with local area networks (LANs), virtual private networks (VPN), wide area networks (WANs) and global computer networks; central processing units (CPU); computer cards; computer nodes; system-on-chip processors; tensor computing processors; tensor processing units (TCU); data processing equipment; data visualization software; tokens for use in quantifying units of artificial intelligence; downloadable and recorded computer software for generating digital tokens for use in quantifying units of artificial intelligence; downloadable and recorded cryptography software; downloadable and recorded data mining software; downloadable and recorded system maintenance software; downloadable and recorded computer software for retrieval-augmented generation; graphics processor units (GPU); downloadable computer software for AI voice modelling; downloadable computer software for cloud services, artificial intelligence (AI), mobile applications, natural language processing (NLP), large language models (LLMs), question-answering systems, chatbots, knowledge retrieval systems, AI personal assistants, AI coaching apps, AI technical support agents, AI sales representatives, and development frameworks; downloadable computer software, namely, software development tools for the creation of mobile internet applications; downloadable computer software for natural language processing (NLP); downloadable computer software for large language models (LLMs) for use in artificial intelligence computer software solutions; downloadable computer software for question-answering systems based on artificial intelligence in the field(s) of artificial intelligence with natural language processing technology; downloadable computer chatbot software for simulating conversations; downloadable publications, namely, letters, newsletters, articles, news stories, blogs, case studies, research papers, and 
product spec sheets in the field of hardware and software development, digital technology, information technology, and artificial intelligence (1) Promoting the exchange of information and resources within the computer developer community to achieve advances in the field of artificial intelligence technology; providing consumer product information for the purpose of selecting artificial intelligence (AI) hardware and software to meet the consumer's specification; consulting services in the field of hardware and software development, digital technology, information technology, and artificial intelligence; arranging and conducting business conferences and special events in the field of hardware and software development, digital technology, information technology, and artificial intelligence; professional networking services; providing online databases featuring information in the field of hardware and software development, digital technology, information technology, and artificial intelligence; creating and facilitating a community of professionals in the field of hardware and software development, digital technology, information technology, and artificial intelligence; association services that promote the interests of professionals and businesses in the field of hardware and software development, digital technology, information technology, and artificial intelligence
(2) Providing user and multi-user access to global computer networks; communications by computer terminals, fiber optic networks, cell phones, telephone; computer aided transmission of messages and images; geolocation services; providing telecommunications connections to a global computer network; satellite transmission; telecommunications routing and junction services; teleconferencing services; video-on-demand transmission; videoconferencing services; voice mail services; wireless broadcasting; providing access to online platforms and portals; communication via virtual private networks; providing online virtual communities; providing online forums with access to community professionals and experts to provide technical support and problem solving; providing access to online libraries with content on computer software, computer hardware, artificial intelligence, computer programming, software development and technical advice; providing access to online databases of audio, text and images created by artificial intelligence; telecommunication services, namely, transmission and streaming of voice, data, images, audio, entertainment content and information by means of telecommunications networks, wireless communication networks, and the internet; providing online forums, chat rooms and electronic bulletin boards for transmission of information, messages, videos, images, and sound among users in the field of general interest; providing an electronic bulletin board for transmission of messages between computer users and artificial intelligence chatbots and virtual assistants for facilitating communication, conversation and discussions; rental of access time to global computer networks; rental of telecommunications equipment; workgroup communications services over computer networks; instant messaging services
(3) Arranging and conducting seminars, workshops, colloquia, lectures, non-downloadable webinars, multimedia presentations and trainings in the field of hardware and software development, digital technology, information technology, and artificial intelligence; educational services, namely, providing educational conferences, symposiums and round-table discussions in the field of hardware and software development, digital technology, information technology, and artificial intelligence; providing a website featuring blogs and non-downloadable publications, namely, letters, newsletters, articles, news stories, blogs, case studies, research papers, and product spec sheets in the field of hardware and software development, digital technology, information technology, and artificial intelligence; providing research, information and news in the field of hardware and software development, digital technology, information technology, and artificial intelligence; providing online videos featuring research, information, news, and trainings in the field of hardware and software development, digital technology, information technology, and artificial intelligence; non-downloadable electronic publications, namely, letters, newsletters, articles, news stories, blogs, case studies, research papers, and product spec sheets in the field of hardware and software development, digital technology, information technology, and artificial intelligence
(4) Research, design and development of computer hardware and software; research and development services in the field of artificial intelligence; product research and development; computer programming; providing online non-downloadable software, infrastructure as a service (IaaS) services, software as a service (SaaS) services featuring software, and platform as a service (PaaS) services featuring computer software platforms for natural language processing, machine learning, voice command and recognition, speech to text conversion, data analytics, data processing, and artificial intelligence; providing online non-downloadable software, infrastructure as a service (IaaS) services, software as a service (SaaS) services featuring software, and platform as a service (PaaS) services featuring computer software platforms for use in natural language processing, generation, understanding and analysis relating to automatically generating computer software; infrastructure as a service (IaaS) services, software as a service (SaaS) services, and platform as a service (PaaS) services, namely, computer software platforms with software for use by others in developing software applications; infrastructure as a service (IaaS) services, software as a service (SaaS) services featuring software, and platform as a service (PaaS) services with knowledge-based software platforms for developing artificial intelligence; technical support services, namely, troubleshooting in the nature of diagnosing computer hardware and software problems; technical support services, namely, remote and on-site infrastructure management services for monitoring, administration and management of public and private cloud computing it and application systems; provision of non-downloadable software for creating, facilitating, and managing remote access to and communication with local area networks (LANs), virtual private networks (VPNs), wide area networks (WANs) and global computer networks (GPNs); 
infrastructure as a service (IaaS) services, software as a service (SaaS) services featuring software, and platform as a service (PaaS) services featuring computer software for use in automatically generating computer software; application service provider (ASP), namely, hosting computer operating systems and computer software applications for use by others; provision of non-downloadable software for artificial intelligence development, machine learning, natural language generation, data retrieval, predictive analytics and business intelligence; provision of non-downloadable software for processing images, graphics, audio, video and text; artificial intelligence inference as a service (IaaS); provision of non-downloadable software for speculative decoding algorithms; provision of non-downloadable software to enable computing devices to communicate and share data with other computing devices; provision of non-downloadable software for running development programs and application programs in a common development environment; provision of online open source libraries to assist software developers; provision of non-downloadable educational software in the field of software development, computer programming, hardware development and design, cloud computing and artificial intelligence; provision of non-downloadable computer software for retrieval-augmented generation; provision of non-downloadable software platforms for assisting programming developers to collaborate on computer programming code as used in multiple application programs; provision of non-downloadable software for assisting developers in developing computer software programming code; provision of non-downloadable compiler software; provision of non-downloadable data visualization software; providing temporary use of online non-downloadable software development tools; provision of non-downloadable software for generation of digital tokens for use in quantifying units of artificial intelligence; provision of 
non-downloadable data mining software; provision of non-downloadable system maintenance software; providing online non-downloadable software for parallel computing; server hosting; cloud storage services for electronic data; providing temporary use of on-line non-downloadable operating software for accessing and using cloud computing networks and cloud servers; provision of non-downloadable virtualization software to create virtual representations of servers, storage, networks, and other physical machines; provision of non-downloadable computer performance software, for operating integrated circuits, semiconductors, computer chip sets, microprocessors and host channel adapters; providing online non-downloadable software for cloud infrastructure management and automation; providing online non-downloadable software for deploying, distributing, configuring, testing, installing, upgrading, updating, customizing, debugging, and managing other software; installation, maintenance, and updating of computer software; testing, analysis and evaluation of computer hardware and software created by developers for the purpose of certification; application service provider featuring application programming interface (API) software for artificial intelligence methods, natural language processing, natural language understanding, dialog systems, voice and speech recognition and text to speech systems, natural language human-machine interfaces and predictive assistance technologies; application service provider featuring application programming interface (API) software to provide natural language processing, artificial intelligence, machine learning, deep learning, natural language generation, statistical learning, supervised learning, unsupervised learning, data mining, predictive analytics and business intelligence; providing on-line non-downloadable software for use in natural language processing, artificial intelligence, machine learning, deep learning, natural language 
generation, statistical learning, supervised learning, unsupervised learning, data mining, predictive analytics and business intelligence; artificial intelligence as a service (AIaaS) services featuring software using artificial intelligence for speech, voice, and language recognition, generation, and translation; artificial intelligence as a service (AIaaS) services featuring software using artificial intelligence for conversation; rental of computer hardware and computer peripherals; rental of computer software; providing online non-downloadable chatbot software for simulating conversations; user authentication services using single sign-on technology for online software applications
09 - Scientific and electric apparatus and instruments
35 - Advertising and business services
38 - Telecommunications services
41 - Education, entertainment, sporting and cultural services
42 - Scientific, technological and industrial services, research and design
Goods & Services
Downloadable and recorded artificial intelligence software for speech, voice, and language recognition, generation, and translation; Downloadable and recorded software for training artificial intelligence models; Downloadable and recorded artificial intelligence software for generating text, images, audio, video and 3D models; Downloadable and recorded artificial intelligence software for machine learning, natural language generation, predictive analytics and business intelligence; Downloadable and recorded artificial intelligence inference software; Downloadable and recorded software for use in natural language processing, machine learning, voice command and recognition, converting speech to text, data analytics, predictive analytics, and artificial intelligence; Downloadable and recorded software for use in natural language processing, generation, understanding and analysis relating to automatically generating computer software; Downloadable and recorded speculative decoding software; Downloadable and recorded computer software development tools; Downloadable software development kits (SDKs); Downloadable databases, namely, open source libraries of software tools for software development; Downloadable and recorded computer software for deploying and optimizing models in the fields of data science, artificial intelligence, and machine learning; Downloadable and recorded computer software for developing, integrating and deploying software scripts; Downloadable and recorded computer performance software, for operating integrated circuits, semiconductors, computer chip sets and microprocessors; Downloadable and recorded application programming interface (API) software; Downloadable and recorded application programming interface (API) software for software development in the fields of artificial intelligence, machine learning, neural networks, high performance computing, and distributed computing systems; Computer hardware and recorded and downloadable computer 
software for use in machine learning and artificial intelligence applications; Downloadable and recorded computer software for use as an application programming interface (API); Computer hardware and recorded and downloadable software for use in managing and operating data centers and data storage; Downloadable and recorded compiler software; Downloadable and recorded cloud computing software for deploying and managing virtual machines on a cloud computing platform; Computer network server; Computer software for accessing, browsing, sharing, defining, maintaining, virtualizing and communicating information over computer networks, servers and secure private networks; Computer software for use in creating, managing, controlling, changing, replicating, deploying, naming and linking information over computer networks, servers and secure private networks; Downloadable and recorded virtualization software for cloud access and cloud computing; Network connection devices for the transportation of data (network fabrics); Downloadable and recorded computer software for running development programs and application programs; Computer hardware and peripherals; Computer memory hardware; Microprocessors, integrated circuits, and computer chips; Semiconductors; Host channel adapters; Target channel adapters; Computer network adapters, switches, routers, and hubs; Backplanes; Host bus adapters; Computer hardware and downloadable and recorded computer software to enable computing devices to communicate and share data with other computing devices; Software stacks; Server racks and network racks, namely, hardware to support computing devices; Computer hardware for enabling connections among central processing units (CPUs), servers and data storage devices; Peripheral component interconnect express (PCIe) expansion cards; Computer hardware that enables peripheral component interconnect express (PCIe) access to multiple central processing units (CPUs); Computer hardware and downloadable 
and recorded computer software for creating, facilitating, and managing remote access to and communication with local area networks (LANs), virtual private networks (VPNs), wide area networks (WANs) and global computer networks; Central processing units (CPU); Computer cards; Computer nodes; System-on-chip processors; Tensor computing processors; Tensor processing units (TCU); Data processing equipment; Data visualization software; Downloadable and recorded computer software for generating digital tokens for use in quantifying units of artificial intelligence; Downloadable and recorded cryptography software; Downloadable and recorded data mining software; Downloadable and recorded system maintenance software; Downloadable and recorded computer software for retrieval-augmented generation; Graphics processor units (GPU); Downloadable computer software for AI voice modelling; Downloadable computer software for cloud services, artificial intelligence (AI), mobile applications, natural language processing (NLP), Large Language Models (LLMs), question-answering systems, chatbots, knowledge retrieval systems, AI personal assistants, AI coaching apps, AI technical support agents, AI sales representatives, and development frameworks; Downloadable computer software, namely, software development tools for the creation of mobile internet applications; Downloadable computer software for natural language processing (NLP); Downloadable computer software for Large Language Models (LLMs) for use in artificial intelligence computer software solutions; Downloadable computer software for question-answering systems based on artificial intelligence in the field(s) of artificial intelligence with natural language processing technology; Downloadable computer chatbot software for simulating conversations; Downloadable publications, namely, letters, newsletters, articles, news stories, blogs, case studies, research papers, and product spec sheets in the field of hardware and software 
development, digital technology, information technology, and artificial intelligence. Promoting the exchange of information and resources within the computer developer community to achieve advances in the field of artificial intelligence technology; Providing consumer product information for the purpose of selecting artificial intelligence (AI) hardware and software to meet the consumer's specification; Arranging and conducting special events in the field of hardware and software development, digital technology, information technology, and artificial intelligence; Professional networking services; Creating and facilitating a community of professionals in the field of hardware and software development, digital technology, information technology, and artificial intelligence [business networking services]; Association services, namely, marketing and management assistance services that promote the interests of professionals and businesses in the field of hardware and software development, digital technology, information technology, and artificial intelligence. 
Telecommunication services; Providing user and multi-user access to global computer networks; Communications by computer terminals, fiber optic networks, cell phones, telephone; Computer aided transmission of messages and images; Geolocation services; Providing telecommunications connections to a global computer network; Satellite transmission; Telecommunications routing and junction services; Teleconferencing services; Video-on-demand transmission; Videoconferencing services; Voice mail services; Wireless broadcasting; Providing access to online platforms and portals; Communication via virtual private networks; Providing online virtual forums and chatrooms; Providing online forums with access to community professionals and experts to provide technical support and problem solving; Providing access to online libraries with content on computer software, computer hardware, artificial intelligence, computer programming, software development and technical advice; Providing access to online databases of audio, text and images created by artificial intelligence; Telecommunication services, namely, transmission and streaming of voice, data, images, audio, entertainment content and information by means of telecommunications networks, wireless communication networks, and the internet; Providing online forums, chat rooms and electronic bulletin boards for transmission of information, messages, videos, images, and sound among users in the field of general interest; Providing an electronic bulletin board for transmission of messages between computer users and artificial intelligence chatbots and virtual assistants for facilitating communication, conversation and discussions; Rental of access time to global computer networks; Rental of telecommunications equipment; Workgroup communications services over computer networks; Instant messaging services; providing access to online computer databases featuring information in the field of hardware and software development, digital 
technology, information technology, and artificial intelligence. Arranging and conducting seminars, workshops, colloquia, lectures, non-downloadable webinars, multimedia presentations and trainings in the field of hardware and software development, digital technology, information technology, and artificial intelligence; Educational services, namely, providing educational conferences, symposiums and round-table discussions in the field of hardware and software development, digital technology, information technology, and artificial intelligence; Providing blogs and non-downloadable publications, namely, letters, newsletters, articles, news stories, blogs, case studies, research papers, and product spec sheets in the field of hardware and software development, digital technology, information technology, and artificial intelligence, all provided via a website; news reporting services in the field of hardware and software development, digital technology, information technology, and artificial intelligence; Providing online videos featuring research, information, news, and trainings in the field of hardware and software development, digital technology, information technology, and artificial intelligence; Non-downloadable electronic publications, namely, letters, newsletters, articles, news stories, blogs, case studies, research papers, and product spec sheets in the field of hardware and software development, digital technology, information technology, and artificial intelligence; Arranging and conducting business conferences in the field of hardware and software development, digital technology, information technology, and artificial intelligence; Provision of online open source libraries to assist software developers. 
Research, design and development of computer hardware and software; Research and development services in the field of artificial intelligence; Product research and development; Computer programming; Providing online non-downloadable software, infrastructure as a service (IaaS) services, software as a service (SaaS) services featuring software, and platform as a service (PaaS) services featuring computer software platforms for natural language processing, machine learning, voice command and recognition, speech to text conversion, data analytics, data processing, and artificial intelligence; Providing online non-downloadable software, infrastructure as a service (IaaS) services, software as a service (SaaS) services featuring software, and platform as a service (PaaS) services featuring computer software platforms for use in natural language processing, generation, understanding and analysis relating to automatically generating computer software; Infrastructure as a service (IaaS) services, software as a service (SaaS) services, and platform as a service (PaaS) services, namely, computer software platforms with software for use by others in developing software applications; Infrastructure as a service (IaaS) services, software as a service (SaaS) services featuring software, and platform as a service (PaaS) services with knowledge-based software platforms for developing artificial intelligence; Technical support services, namely, troubleshooting in the nature of diagnosing computer hardware and software problems; Technical support services, namely, remote and on-site infrastructure management services for monitoring, administration and management of public and private cloud computing IT and application systems; Provision of non-downloadable software for creating, facilitating, and managing remote access to and communication with local area networks (LANs), virtual private networks (VPNs), wide area networks (WANs) and global computer networks (GPNs); Infrastructure 
as a service (IaaS) services, software as a service (SaaS) services featuring software, and platform as a service (PaaS) services featuring computer software for use in automatically generating computer software; Application service provider (ASP), namely, hosting computer operating systems and computer software applications for use by others; Provision of non-downloadable software for artificial intelligence development, machine learning, natural language generation, data retrieval, predictive analytics and business intelligence; Provision of non-downloadable software for processing images, graphics, audio, video and text; Artificial intelligence inference as a service (IaaS); Provision of non-downloadable software for speculative decoding algorithms; Provision of non-downloadable software to enable computing devices to communicate and share data with other computing devices; Provision of non-downloadable software for running development programs and application programs in a common development environment; Provision of non-downloadable educational software in the field of software development, computer programming, hardware development and design, cloud computing and artificial intelligence; Provision of non-downloadable computer software for retrieval-augmented generation; Provision of non-downloadable software platforms for assisting programming developers to collaborate on computer programming code as used in multiple application programs; Provision of non-downloadable software for assisting developers in developing computer software programming code; Provision of non-downloadable compiler software; Provision of non-downloadable data visualization software; Providing temporary use of online non-downloadable software development tools; Provision of non-downloadable software for generation of digital tokens for use in quantifying units of artificial intelligence; Provision of non-downloadable data mining software; Provision of non-downloadable system maintenance 
software; Providing online non-downloadable software for parallel computing; Server hosting; Cloud storage services for electronic data; Providing temporary use of on-line non-downloadable operating software for accessing and using cloud computing networks and cloud servers; Provision of non-downloadable virtualization software to create virtual representations of servers, storage, networks, and other physical machines; Provision of non-downloadable computer performance software, for operating integrated circuits, semiconductors, computer chip sets, microprocessors and host channel adapters; Providing online non-downloadable software for cloud infrastructure management and automation; Providing online non-downloadable software for deploying, distributing, configuring, testing, installing, upgrading, updating, customizing, debugging, and managing other software; Installation, maintenance, and updating of computer software; Testing, analysis and evaluation of computer hardware and software created by developers for the purpose of certification; Application service provider featuring application programming interface (API) software for artificial intelligence methods, natural language processing, natural language understanding, dialog systems, voice and speech recognition and text to speech systems, natural language human-machine interfaces and predictive assistance technologies; Application service provider featuring application programming interface (API) software to provide natural language processing, artificial intelligence, machine learning, deep learning, natural language generation, statistical learning, supervised learning, unsupervised learning, data mining, predictive analytics and business intelligence; Providing on-line non-downloadable software for use in natural language processing, artificial intelligence, machine learning, deep learning, natural language generation, statistical learning, supervised learning, unsupervised learning, data mining, 
predictive analytics and business intelligence; Artificial intelligence as a service (AIaaS) services featuring software using artificial intelligence for speech, voice, and language recognition, generation, and translation; Artificial intelligence as a service (AIaaS) services featuring software using artificial intelligence for conversation; Rental of computer hardware and computer peripherals; Rental of computer software; Providing online non-downloadable chatbot software for simulating conversations; User authentication services using single sign-on technology for online software applications; Consulting services in the field of hardware and software development, digital technology, information technology, and artificial intelligence; Providing information in the field of hardware and software development, digital technology, information technology, and artificial intelligence from a computer database; Providing research and information in the field of hardware and software development, digital technology, information technology, and artificial intelligence; Providing information in the form of news in the field of hardware and software development, digital technology, information technology, and artificial intelligence.
09 - Scientific and electric apparatus and instruments
41 - Education, entertainment, sporting and cultural services
42 - Scientific, technological and industrial services, research and design
Goods & Services
Promoting the exchange of information and resources within the computer developer community to achieve advances in the field of artificial intelligence technology; Providing consumer product information for the purpose of selecting artificial intelligence (AI) hardware and software to meet the consumer's specification; Consulting services in the field of hardware and software development, digital technology, information technology, and artificial intelligence; Arranging and conducting business conferences and special events in the field of hardware and software development, digital technology, information technology, and artificial intelligence; Professional networking services; Providing online databases featuring information in the field of hardware and software development, digital technology, information technology, and artificial intelligence; Creating and facilitating a community of professionals in the field of hardware and software development, digital technology, information technology, and artificial intelligence; Association services that promote the interests of professionals and businesses in the field of hardware and software development, digital technology, information technology, and artificial intelligence. Providing user and multi-user access to global computer networks; Communications by computer terminals, fiber optic networks, cell phones, telephone; Computer aided transmission of messages and images; Geolocation services; Providing telecommunications connections to a global computer network; Satellite transmission; Telecommunications routing and junction services; Teleconferencing services; Video-on-demand transmission; Videoconferencing services; Voice mail services; Wireless broadcasting; Providing access to online platforms and portals; Communication via virtual private networks; Providing online virtual communities; Providing online forums with access to community professionals and experts to provide technical support and problem solving; 
Providing access to online libraries with content on computer software, computer hardware, artificial intelligence, computer programming, software development and technical advice; Providing access to online databases of audio, text and images created by artificial intelligence; Telecommunication services, namely, transmission and streaming of voice, data, images, audio, entertainment content and information by means of telecommunications networks, wireless communication networks, and the internet; Providing online forums, chat rooms and electronic bulletin boards for transmission of information, messages, videos, images, and sound among users in the field of general interest; Providing an electronic bulletin board for transmission of messages between computer users and artificial intelligence chatbots and virtual assistants for facilitating communication, conversation and discussions; Rental of access time to global computer networks; Rental of telecommunications equipment; Workgroup communications services over computer networks; Instant messaging services. Downloadable and recorded artificial intelligence software for speech, voice, and language recognition, generation, and translation; Downloadable and recorded software for training artificial intelligence models; Downloadable and recorded artificial intelligence software for generating text, images, audio, video and 3D models; Downloadable and recorded artificial intelligence software for machine learning, natural language generation, predictive analytics and business intelligence; Downloadable and recorded artificial intelligence inference software; Downloadable and recorded software for use in natural language processing, machine learning, voice command and recognition, converting speech to text, data analytics, predictive analytics, and artificial intelligence; Downloadable and recorded software for use in natural language processing, generation, understanding and analysis relating to automatically 
generating computer software; Downloadable and recorded speculative decoding software; Downloadable and recorded computer software development tools; Downloadable software development kits (SDKs); Downloadable databases, namely, open source libraries of software tools for software development; Downloadable and recorded computer software for deploying and optimizing models in the fields of data science, artificial intelligence, and machine learning; Downloadable and recorded computer software for developing, integrating and deploying software scripts; Downloadable and recorded computer performance software, for operating integrated circuits, semiconductors, computer chip sets and microprocessors; Downloadable and recorded application programming interface (API) software; Downloadable and recorded application programming interface (API) software for software development in the fields of artificial intelligence, machine learning, neural networks, high performance computing, and distributed computing systems; Computer hardware and recorded and downloadable computer software for use in machine learning and artificial intelligence applications; Downloadable and recorded computer software for use as an application programming interface (API); Computer hardware and recorded and downloadable software for use in managing and operating data centers and data storage; Downloadable and recorded compiler software; Downloadable and recorded cloud computing software for deploying and managing virtual machines on a cloud computing platform; Computer network server; Computer software for accessing, browsing, sharing, defining, maintaining, virtualizing and communicating information over computer networks, servers and secure private networks; Computer software for use in creating, managing, controlling, changing, replicating, deploying, naming and linking information over computer networks, servers and secure private networks; Downloadable and recorded virtualization software for 
cloud access and cloud computing; Network connection devices for the transportation of data (network fabrics); Downloadable and recorded computer software for running development programs and application programs; Computer hardware and peripherals; Computer memory hardware; Microprocessors, integrated circuits, and computer chips; Semiconductors; Host channel adapters; Target channel adapters; Computer network adapters, switches, routers, and hubs; Backplanes; Host bus adapters; Computer hardware and downloadable and recorded computer software to enable computing devices to communicate and share data with other computing devices; Software stacks; Server racks and network racks, namely, hardware to support computing devices; Computer hardware for enabling connections among central processing units (CPUs), servers and data storage devices; Peripheral component interconnect express (PCIe) expansion cards; Computer hardware that enables peripheral component interconnect express (PCIe) access to multiple central processing units (CPUs); Computer hardware and downloadable and recorded computer software for creating, facilitating, and managing remote access to and communication with local area networks (LANs), virtual private networks (VPNs), wide area networks (WANs) and global computer networks; Central processing units (CPU); Computer cards; Computer nodes; System-on-chip processors; Tensor computing processors; Tensor processing units (TCU); Data processing equipment; Data visualization software; Tokens for use in quantifying units of artificial intelligence; Downloadable and recorded computer software for generating digital tokens for use in quantifying units of artificial intelligence; Downloadable and recorded cryptography software; Downloadable and recorded data mining software; Downloadable and recorded system maintenance software; Downloadable and recorded computer software for retrieval-augmented generation; Graphics processor units (GPU); Downloadable computer 
software for AI voice modelling; Downloadable computer software for cloud services, artificial intelligence (AI), mobile applications, natural language processing (NLP), Large Language Models (LLMs), question-answering systems, chatbots, knowledge retrieval systems, AI personal assistants, AI coaching apps, AI technical support agents, AI sales representatives, and development frameworks; Downloadable computer software, namely, software development tools for the creation of mobile internet applications; Downloadable computer software for natural language processing (NLP); Downloadable computer software for Large Language Models (LLMs) for use in artificial intelligence computer software solutions; Downloadable computer software for question-answering systems based on artificial intelligence in the field(s) of artificial intelligence with natural language processing technology; Downloadable computer chatbot software for simulating conversations; Downloadable publications, namely, letters, newsletters, articles, news stories, blogs, case studies, research papers, and product spec sheets in the field of hardware and software development, digital technology, information technology, and artificial intelligence. Arranging and conducting seminars, workshops, colloquia, lectures, non-downloadable webinars, multimedia presentations and trainings in the field of hardware and software development, digital technology, information technology, and artificial intelligence; Educational services, namely, providing educational conferences, symposiums and round-table discussions in the field of hardware and software development, digital technology, information technology, and artificial intelligence; Providing a website featuring blogs and non-downloadable publications, namely, letters, newsletters, articles, news stories, blogs, case studies, research papers, and product spec sheets in the field of hardware and software development, digital technology, information technology, and 
artificial intelligence; Providing research, information and news in the field of hardware and software development, digital technology, information technology, and artificial intelligence; Providing online videos featuring research, information, news, and trainings in the field of hardware and software development, digital technology, information technology, and artificial intelligence; Non-downloadable electronic publications, namely, letters, newsletters, articles, news stories, blogs, case studies, research papers, and product spec sheets in the field of hardware and software development, digital technology, information technology, and artificial intelligence. Research, design and development of computer hardware and software; Research and development services in the field of artificial intelligence; Product research and development; Computer programming; Providing online non-downloadable software, infrastructure as a service (IaaS) services, software as a service (SaaS) services featuring software, and platform as a service (PaaS) services featuring computer software platforms for natural language processing, machine learning, voice command and recognition, speech to text conversion, data analytics, data processing, and artificial intelligence; Providing online non-downloadable software, infrastructure as a service (IaaS) services, software as a service (SaaS) services featuring software, and platform as a service (PaaS) services featuring computer software platforms for use in natural language processing, generation, understanding and analysis relating to automatically generating computer software; Infrastructure as a service (IaaS) services, software as a service (SaaS) services, and platform as a service (PaaS) services, namely, computer software platforms with software for use by others in developing software applications; Infrastructure as a service (IaaS) services, software as a service (SaaS) services featuring software, and platform as a service (PaaS) 
services with knowledge-based software platforms for developing artificial intelligence; Technical support services, namely, troubleshooting in the nature of diagnosing computer hardware and software problems; Technical support services, namely, remote and on-site infrastructure management services for monitoring, administration and management of public and private cloud computing IT and application systems; Provision of non-downloadable software for creating, facilitating, and managing remote access to and communication with local area networks (LANs), virtual private networks (VPNs), wide area networks (WANs) and global computer networks (GPNs); Infrastructure as a service (IaaS) services, software as a service (SaaS) services featuring software, and platform as a service (PaaS) services featuring computer software for use in automatically generating computer software; Application service provider (ASP), namely, hosting computer operating systems and computer software applications for use by others; Provision of non-downloadable software for artificial intelligence development, machine learning, natural language generation, data retrieval, predictive analytics and business intelligence; Provision of non-downloadable software for processing images, graphics, audio, video and text; Artificial intelligence inference as a service (IaaS); Provision of non-downloadable software for speculative decoding algorithms; Provision of non-downloadable software to enable computing devices to communicate and share data with other computing devices; Provision of non-downloadable software for running development programs and application programs in a common development environment; Provision of online open source libraries to assist software developers; Provision of non-downloadable educational software in the field of software development, computer programming, hardware development and design, cloud computing and artificial intelligence; Provision of non-downloadable computer 
software for retrieval-augmented generation; Provision of non-downloadable software platforms for assisting programming developers to collaborate on computer programming code as used in multiple application programs; Provision of non-downloadable software for assisting developers in developing computer software programming code; Provision of non-downloadable compiler software; Provision of non-downloadable data visualization software; Providing temporary use of online non-downloadable software development tools; Provision of non-downloadable software for generation of digital tokens for use in quantifying units of artificial intelligence; Provision of non-downloadable data mining software; Provision of non-downloadable system maintenance software; Providing online non-downloadable software for parallel computing; Server hosting; Cloud storage services for electronic data; Providing temporary use of on-line non-downloadable operating software for accessing and using cloud computing networks and cloud servers; Provision of non-downloadable virtualization software to create virtual representations of servers, storage, networks, and other physical machines; Provision of non-downloadable computer performance software, for operating integrated circuits, semiconductors, computer chip sets, microprocessors and host channel adapters; Providing online non-downloadable software for cloud infrastructure management and automation; Providing online non-downloadable software for deploying, distributing, configuring, testing, installing, upgrading, updating, customizing, debugging, and managing other software; Installation, maintenance, and updating of computer software; Testing, analysis and evaluation of computer hardware and software created by developers for the purpose of certification; Application service provider featuring application programming interface (API) software for artificial intelligence methods, natural language processing, natural language understanding, dialog 
systems, voice and speech recognition and text to speech systems, natural language human-machine interfaces and predictive assistance technologies; Application service provider featuring application programming interface (API) software to provide natural language processing, artificial intelligence, machine learning, deep learning, natural language generation, statistical learning, supervised learning, unsupervised learning, data mining, predictive analytics and business intelligence; Providing on-line non-downloadable software for use in natural language processing, artificial intelligence, machine learning, deep learning, natural language generation, statistical learning, supervised learning, unsupervised learning, data mining, predictive analytics and business intelligence; Artificial intelligence as a service (AIAAS) services featuring software using artificial intelligence for speech, voice, and language recognition, generation, and translation; Artificial intelligence as a service (AIAAS) services featuring software using artificial intelligence for conversation; Rental of computer hardware and computer peripherals; Rental of computer software; Providing online non-downloadable chatbot software for simulating conversations; User authentication services using single sign-on technology for online software applications
Systems and circuits are disclosed for synchronizing processor clocks in a network of processors by using the method of Flit Rate Synchronization, where counts of flits (a unit of data) sent and received by processors are used to determine if a child processor in a network needs to increase or decrease the speed of its clock. Other systems and circuits are disclosed for Beacon Rate Synchronization that use a periodic beacon to eliminate relative drift between processors by applying a small adjustment to selected clock periods on a child processor in order to maintain a constant distance between arrival times of a periodic time beacon sent by a parent processor. This Abstract and the independent Claims are concise signifiers of embodiments of the claimed inventions. The Abstract does not limit the scope of the claimed inventions.
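The flit-count comparison described above can be sketched as follows. The measurement window, the tolerance band, and the ±1 trim encoding are illustrative assumptions for this sketch, not values from the disclosure.

```python
def clock_trim(flits_sent_by_parent: int, flits_received_by_child: int,
               tolerance: int = 2) -> int:
    """Return a trim for the child clock: +1 speed up, -1 slow down, 0 hold.

    Over a fixed measurement window, if the child consumed fewer flits than
    the parent sent, the child clock is running slow relative to the parent
    and should be sped up; the converse means it should be slowed down.
    """
    backlog = flits_sent_by_parent - flits_received_by_child
    if backlog > tolerance:
        return +1   # child lagging: increase its clock speed
    if backlog < -tolerance:
        return -1   # child ahead: decrease its clock speed
    return 0        # within tolerance: leave the clock alone
```

In hardware the trim would drive a small adjustment to selected clock periods rather than a discrete step; the sketch only shows the decision rule.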
Described are embodiments related to countering a threat vector. A system includes two Linear Feedback Shift Registers (LFSRs) of equal size in number of bits, initialized at power-up or reset to an initial condition, wherein the values of the two LFSRs are compared during each clock cycle and, if there is a mismatch, an error is reported and threat mitigation is initiated. This Abstract and the independent Claims are concise signifiers of embodiments of the claimed inventions. The Abstract does not limit the scope of the claimed inventions.
G06F 21/75 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information by inhibiting the analysis of circuitry or operation, e.g. to counteract reverse engineering
G06F 21/57 - Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
Systems and circuits are disclosed for synchronizing processor clocks in a network of processors by using the method of Flit Rate Synchronization, where counts of flits (a unit of data) sent and received by processors are used to determine if a child processor in a network needs to increase or decrease the speed of its clock. Other systems and circuits are disclosed for Beacon Rate Synchronization that use a periodic beacon to eliminate relative drift between processors by applying a small adjustment to selected clock periods on a child processor in order to maintain a constant distance between arrival times of a periodic time beacon sent by a parent processor. This Abstract and the independent Claims are concise signifiers of embodiments of the claimed inventions. The Abstract does not limit the scope of the claimed inventions.
Described are embodiments related to countering a threat vector. A system includes two Linear Feedback Shift Registers (LFSRs) of equal size in number of bits, initialized at power-up or reset to an initial condition, wherein the values of the two LFSRs are compared during each clock cycle and, if there is a mismatch, an error is reported and threat mitigation is initiated. This Abstract and the independent Claims are concise signifiers of embodiments of the claimed inventions. The Abstract does not limit the scope of the claimed inventions.
G06F 21/75 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information by inhibiting the analysis of circuitry or operation, e.g. to counteract reverse engineering
G06F 21/70 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
Methods and other embodiments are described for enabling clock waveform synthesis for, in some embodiments, tensor or graphical processors that enable shorter runtime latency, higher computational job throughput, more efficient power management, and a lower implementation cost than alternative clock waveform methods. This Abstract and the independent Claims are concise signifiers of embodiments of the claimed inventions. The Abstract does not limit the scope of the claimed inventions.
Clock period synthesis for fine-grain power management is provided. Methods are described for enabling clock waveform synthesis for, in some embodiments, tensor or graphical processors that enable shorter runtime latency, higher computational job throughput, more efficient power management, and a lower implementation cost than alternative clock waveform methods. This Abstract and the independent Claims are concise signifiers of embodiments of the claimed inventions. The Abstract does not limit the scope of the claimed inventions.
Embodiments are directed to a deterministic streaming system with a scheduler, a compiler, and a plurality of deterministic streaming processors. The scheduler evaluates a latency for each task of a plurality of tasks to be run at the deterministic streaming system, and adjusts at least one of an accuracy metric and a quality metric for an output of each task based on the evaluated latency until the plurality of tasks can be completed before expiration of contractual deadlines. At least a subset of the plurality of deterministic streaming processors runs the plurality of tasks, each having the output with the adjusted accuracy metric and/or the adjusted quality metric. The compiler performs partial compilation of at least one model into an intermediate representation before requiring more information from the scheduler on how to finish the compilation. The scheduler generates the information for the compiler during a static capacity planning process.
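The scheduler's latency-versus-quality trade can be sketched as below, under the illustrative assumption that a task's latency scales roughly linearly with a shared quality knob; the knob, the degradation factor, and the floor are all assumptions for this sketch, not the disclosure's mechanism.

```python
def fit_to_deadline(latencies, deadline, degrade_factor=0.8,
                    quality=1.0, floor=0.5):
    """Scale a shared quality knob down until total latency fits the deadline.

    Assumes latency scales roughly linearly with the quality setting, and
    stops degrading at a quality floor even if the deadline is still missed.
    """
    while sum(l * quality for l in latencies) > deadline and quality > floor:
        quality *= degrade_factor
    return quality
```

A real scheduler would adjust accuracy and quality per task rather than through one shared knob; the sketch only shows the feedback shape of the loop.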
Embodiments are directed to a deterministic streaming system with one or more deterministic streaming processors each having an array of processing elements and a first deterministic memory coupled to the processing elements. The deterministic streaming system further includes a second deterministic memory with multiple data banks having a global memory address space, and a controller. The controller initiates retrieval of first data from the data banks of the second deterministic memory as a first plurality of streams, each stream of the first plurality of streams streaming toward a respective group of processing elements of the array of processing elements. The controller further initiates writing of second data to the data banks of the second deterministic memory as a second plurality of streams, each stream of the second plurality of streams streaming from the respective group of processing elements toward a respective data bank of the second deterministic memory.
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
Proactive thermal management of a deterministic processor to improve latency, throughput, and reliability is provided herein. Embodiments improve the compiler of a deterministic processor to calculate the estimated power and temperature profile of a program and/or workload over time, to then proactively schedule necessary and adequate cooling resources, and/or add dead compute cycles and other power reduction methods, to maintain the processor temperature within a specific “safe” range of operation while maximizing performance, efficiency and/or other figures of merit. Proactive thermal management by the compiler will increase processor throughput, and avoid unsafe temperature excursions, to improve the reliability and extend the lifetime of the processor. This Abstract and the independent Claims are concise signifiers of embodiments of the claimed inventions. The Abstract does not limit the scope of the claimed inventions.
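The dead-cycle insertion idea above can be sketched with a toy thermal model. The first-order exponential temperature update, the thermal resistance, and all numeric thresholds here are illustrative assumptions, not the compiler's actual model.

```python
def insert_dead_cycles(power_trace, t_max=95.0, t_amb=25.0,
                       r_th=0.5, alpha=0.1):
    """Insert idle cycles so a projected temperature never exceeds t_max.

    Uses a first-order model: each cycle the die temperature moves a
    fraction alpha toward its steady state t_amb + r_th * power.
    """
    temp = t_amb
    stream = []
    for p in power_trace:
        # cool with dead (idle) cycles until executing this cycle is safe
        while temp + alpha * (t_amb + r_th * p - temp) > t_max:
            stream.append(("idle", 0.0))
            temp += alpha * (t_amb - temp)
        stream.append(("exec", p))
        temp += alpha * (t_amb + r_th * p - temp)
    return stream
```

Because the compiler of a deterministic processor knows the power trace ahead of time, the idle cycles can be scheduled statically rather than reacting to a thermal sensor at runtime, which is the point of the proactive scheme.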
One or more embodiments of a regulator circuit for providing power to a load device having a first power demand profile over time. The regulator circuit comprises a regulator and an energy storage device coupled to the regulator and the load device. The regulator circuit is configured to scavenge provided energy that is available beyond the first power demand profile. Further, the regulator circuit is configured to store that energy in the energy storage device, and the energy storage device is configured to augment deliverable peak power to the load device when the load device requires more power than is provided by the regulator circuit.
H02M 3/158 - Conversion of DC power input into DC power output without intermediate conversion into AC by static converters using discharge tubes with control electrode or semiconductor devices with control electrode using devices of a triode or transistor type requiring continuous application of a control signal using semiconductor devices only with automatic control of output voltage or current, e.g. switching regulators including plural semiconductor devices as final control devices for a single load
31.
MODEL SUBSTITUTION FOR EFFICIENT DEVELOPMENT OF A SOLUTION
A neural network architecture for natural language processing is provided. The neural network architecture comprises: a speech-to-text encoder configured to encode an input speech signal; a Bifrost speech recognizable engine configured for processing the encoded speech signal to generate a speech recognized signal corresponding to the input speech signal; and a decoder configured to decode the speech recognized signal and to generate the output sequence.
Methods are described for enabling clock waveform synthesis for, in one embodiment, tensor processors, that enable more efficient power management, shorter runtime latency, higher computational job throughput, and a lower implementation cost than alternative clock waveform methods. Further embodiments describe modifications to power regulators to enable programmatic control of power management. This Abstract and the independent Claims are concise signifiers of embodiments of the claimed inventions. The Abstract does not limit the scope of the claimed inventions.
When reading and writing DRAM (dynamic random-access memory), the latency and bandwidth are often unpredictable, with large variations. One reason is that all DRAM memory banks require periodic refresh and maintenance cycles that interrupt these accesses. Here, DRAM refresh and maintenance cycles are synchronized with the read/write accesses in a mutually exclusive manner, preventing the accesses from being interfered with by a refresh or maintenance cycle and resulting in predictable latency and bandwidth during read/write operations.
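The mutually exclusive scheduling above can be sketched as a static slot plan: refresh commands are placed only in slots the access schedule leaves idle, so a read or write is never stalled by a refresh. The slot layout and refresh budget are illustrative assumptions.

```python
def build_schedule(access_slots: set, n_slots: int, refreshes_needed: int):
    """Place refresh cycles only in slots with no scheduled read/write."""
    schedule = []
    placed = 0
    for slot in range(n_slots):
        if slot in access_slots:
            schedule.append("access")        # read/write proceeds undisturbed
        elif placed < refreshes_needed:
            schedule.append("refresh")       # hide refresh in an idle slot
            placed += 1
        else:
            schedule.append("idle")
    if placed < refreshes_needed:
        raise ValueError("not enough idle slots to hide all refreshes")
    return schedule
```

Since every access slot is guaranteed refresh-free, access latency becomes a constant of the schedule rather than a function of when the DRAM controller last issued a refresh.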
A processor having a functional slice architecture is divided into a plurality of functional units (“tiles”) organized into a plurality of slices. Each slice is configured to perform specific functions within the processor, which may include memory slices (MEM) for storing operand data, and arithmetic logic slices for performing operations on received operand data. The tiles of the processor are configured to stream operand data across a first dimension, and receive instructions across a second dimension orthogonal to the first dimension. The timing of data and instruction flows are configured such that corresponding data and instructions are received at each tile with a predetermined temporal relationship, allowing operand data to be transmitted between the slices of the processor without any accompanying metadata. Instead, each slice is able to determine what operations to perform on received data based upon the timing at which the data is received.
Compilers for some processor architectures, in particular, deterministic processors, can predict exact processor current demands for a time period as brief as a few nanoseconds. Information generated by such compilers of future excessive current demand is used by the embodiments disclosed herein for predictive mitigation of voltage overshoot and undershoot. This Abstract and the independent Claims are concise signifiers of embodiments of the claimed inventions. The Abstract does not limit the scope of the claimed inventions.
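Predictive mitigation of this kind can be sketched as scanning the compiler's current-demand trace and scheduling a regulator pre-boost a few cycles ahead of any large step. The step threshold and lead time are illustrative assumptions, not values from the disclosure.

```python
def boost_cycles(current_trace, step_threshold=10.0, lead=3):
    """Return the set of cycles at which the regulator should pre-boost.

    current_trace is the compiler's predicted per-cycle current demand;
    a step larger than step_threshold would cause undershoot if the
    regulator only reacted after the fact.
    """
    boosts = set()
    for t in range(1, len(current_trace)):
        if current_trace[t] - current_trace[t - 1] > step_threshold:
            boosts.add(max(0, t - lead))   # act ahead of the predicted surge
    return boosts
```

The same scan, run with a negated threshold, would locate load-release points where overshoot mitigation is needed instead.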
42 - Scientific, technological and industrial services, research and design
Goods & Services
Application service provider featuring application programming interface (API) software providing integration of natural language processing, machine learning, and artificial intelligence software into third-party computer programs; Application service provider featuring application programming interface (API) software providing access to open-source large language models (LLMs); Platform as a service (PaaS) featuring computer software platforms providing integration of natural language processing, machine learning, and artificial intelligence software into third-party computer programs; Platform as a service (PaaS) featuring computer software platforms providing access to open-source large language models (LLMs); Platform as a service (PaaS) featuring software for using artificial intelligence for the generation and processing of natural language into machine-executable commands; Platform as a service (PaaS) featuring software for conversion between speech or language recognition and text
37.
Power optimization in an artificial intelligence processor
In one embodiment, the present disclosure includes a method of reducing power in an artificial intelligence processor. For each cycle, over a plurality of cycles, an AI model is translated into operations executable on an artificial intelligence processor. The translating is based on power parameters that correspond to power consumption and performance of the artificial intelligence processor. The AI processor is configured with the executable operations, and input activation data sets are processed. Accordingly, result sets, power consumption data, and performance data are generated and stored over the plurality of cycles. The method further includes training an AI algorithm using the stored parameters, the power consumption data, and the performance data. A trained AI algorithm outputs a plurality of optimized parameters to reduce power consumption of the AI processor. The AI model is then translated into optimized executable operations based on the plurality of optimized parameters.
A system receives a predictive model and receives one or more runtime constraints. The system generates a directed acyclic graph (DAG) of the predictive model indicating dependencies. The system compiles the predictive model into first instructions for a first processor based on the one or more runtime constraints and the DAG. The system packages first instructions, the one or more runtime constraints, and the DAG of the predictive model in a first binary. The system recompiles the predictive model into second instructions for a second processor based on the runtime constraints and the DAG stored in the first processor. The system packages the second instructions, the DAG, and the runtime constraints in a second binary.
A method comprises receiving a kernel used to convolve with an input tensor. For a first dimension of the kernel, a square block of values is generated for each single-dimensional vector of the kernel, the block including all rotations of that single-dimensional vector. For each additional dimension of the kernel, blocks of the immediately preceding dimension are grouped into sets of blocks, each set including blocks of the immediately preceding dimension that are aligned along a vector parallel to the axis of that dimension; and one or more blocks of values are generated for the additional dimension, each block including all rotations of the blocks within each of the sets of blocks of the immediately preceding dimension. The block of values corresponding to the last of the additional dimensions of the kernel is output as the expanded kernel.
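The first-dimension step can be sketched concretely, under the assumption that "rotations" means cyclic shifts: each row of the square block is the kernel vector rotated by one more position (a circulant block). The function name and shift direction are illustrative.

```python
def rotations_block(vec):
    """Square block whose i-th row is vec cyclically rotated by i positions."""
    n = len(vec)
    return [vec[-i:] + vec[:-i] for i in range(n)]
```

For the additional dimensions, the same construction is applied at the block level: each set of aligned blocks is arranged into a larger block whose "rows" are cyclic rotations of those blocks, yielding a block-circulant expansion.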
G06F 7/544 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using unspecified devices for evaluating functions by calculation
G06F 7/76 - Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data
G06F 18/2137 - Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on criteria of topology preservation, e.g. multidimensional scaling or self-organising maps
G06N 3/04 - Architecture, e.g. interconnection topology
One or more embodiments of a regulator circuit are disclosed for providing power to a load device having a first power demand profile over time. The regulator circuit comprises a regulator and an energy storage device coupled to the regulator and the load device. The regulator circuit is configured to scavenge provided energy that is available beyond the first power demand profile. Further, the regulator circuit is configured to store that energy in the energy storage device, and the energy storage device is configured to augment deliverable peak power to the load device when the load device requires more power than is provided by the regulator circuit.
H02M 3/158 - Conversion of DC power input into DC power output without intermediate conversion into AC by static converters using discharge tubes with control electrode or semiconductor devices with control electrode using devices of a triode or transistor type requiring continuous application of a control signal using semiconductor devices only with automatic control of output voltage or current, e.g. switching regulators including plural semiconductor devices as final control devices for a single load
A processor comprises a computational array of computational elements and an instruction dispatch circuit. The computational elements receive data operands via data lanes extending along a first dimension, and process the operands based upon instructions received from the instruction dispatch circuit via instruction lanes extending along a second dimension. The instruction dispatch circuit receives raw instructions and comprises an instruction dispatch unit (IDU) processor that processes a set of raw instructions to generate processed instructions for dispatch to the computational elements, where the number of processed instructions is not equal to the number of instructions in the set of raw instructions. The processed instructions are dispatched to columns of the computational array via a plurality of instruction queues, wherein an instruction vector of instructions is shifted between adjacent instruction queues in a first direction, and each instruction queue dispatches instructions to the computational elements in a second direction.
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
G06F 9/48 - Program initiating; Program switching, e.g. by interrupt
Introduced here is a technique to create small compressed image files while preserving data quality upon decompression. Upon receiving uncompressed data, such as an image, a video, audio, and/or structured data, a machine learning model identifies an object in the uncompressed data, such as a house, a dog, text, a distinct audio signal, a unique data pattern, etc. The identified object is compressed using a compression treatment optimized for the identified object. The identified object, either before or after the compression, is removed from the uncompressed data. The uncompressed data with the identified object removed is compressed using a standard compression treatment.
H04N 19/625 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]
42 - Scientific, technological and industrial services, research and design
Goods & Services
Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS) services featuring computer software platforms and software for providing medical professionals access to patient's health history, medications, current and past health status, and medical information on diseases, disease management, and prognoses for care; Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS) services featuring computer software platforms and software that allows users to review machine learning-assisted predictive analysis of medical or prescription outcomes for patients; Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS) services featuring computer software platforms and software that provides users with information relating to the interactions between diseases, infirmities, prescribed and potential medications, and specific patient information to determine proper diagnoses, prognoses, and treatment options for patients.
44.
Instruction format and instruction set architecture for tensor streaming processor
Embodiments are directed to a processor having a functional slice architecture. The processor is divided into tiles (or functional units) organized into a plurality of functional slices. The functional slices are configured to perform specific operations within the processor, which includes memory slices for storing operand data and arithmetic logic slices for performing operations on received operand data (e.g., vector processing, matrix manipulation). The processor includes a plurality of functional slices of a module type, each functional slice having a plurality of tiles. The processor further includes a plurality of data transport lanes for transporting data in a direction indicated in a corresponding instruction. The processor also includes a plurality of instruction queues, each instruction queue associated with a corresponding functional slice of the plurality of functional slices, wherein the instructions in the instruction queues comprise a functional slice specific operation code.
A processor architecture and model exploration system for deep learning is provided. A method of improving performance of a processor system and associated software includes selecting a set of performance parameter targets for a processor architecture having a set of functional units and an AI model. The method also includes evaluating performance of the processor architecture and the AI model and adjusting at least one of the functional units of the processor architecture to form a new processor architecture prior to iteratively evaluating the combination of the new processor architecture and the AI model. Further, the method includes repeating the evaluating and adjusting steps until the performance evaluation of the processor architecture and the AI model meets the set of performance parameter targets.
A system and method of generating an efficient neural network model architecture and an efficient processor for deep learning in an artificial intelligence (AI) processor are provided. The system and method create the processor architecture as a companion to the neural network model by composing a plurality of processor architectures to enable architectural exploration. The compilation can be implemented for any arbitrary spatial processor architecture using either ASIC or FPGA devices. The processor architecture can be uniquely defined for a selected ML or AI model without having to update the software compiler.
A processor having a functional slice architecture is divided into a plurality of functional units (“tiles”) organized into a plurality of slices. Each slice is configured to perform specific functions within the processor, which may include memory slices (MEM) for storing operand data, and arithmetic logic slices for performing operations on received operand data. The tiles of the processor are configured to stream operand data across a first dimension, and receive instructions across a second dimension orthogonal to the first dimension. The timing of data and instruction flows are configured such that corresponding data and instructions are received at each tile with a predetermined temporal relationship, allowing operand data to be transmitted between the slices of the processor without any accompanying metadata. Instead, each slice is able to determine what operations to perform on received data based upon the timing at which the data is received.
A processor comprises a computational array of computational elements and an instruction dispatch circuit. The computational elements receive data operands via data lanes extending along a first dimension, and process the operands based upon instructions received from the instruction dispatch circuit via instruction lanes extending along a second dimension. The instruction dispatch circuit receives raw instructions and comprises an instruction dispatch unit (IDU) processor that processes a set of raw instructions to generate processed instructions for dispatch to the computational elements, where the number of processed instructions is not equal to the number of instructions in the set of raw instructions. The processed instructions are dispatched to columns of the computational array via a plurality of instruction queues, wherein an instruction vector of instructions is shifted between adjacent instruction queues in a first direction, and each instruction queue dispatches instructions to the computational elements in a second direction.
G06F 9/30 - Arrangements for executing machine instructions, e.g. instruction decode
G06F 9/48 - Program initiating; Program switching, e.g. by interrupt
G06F 5/01 - Methods or arrangements for data conversion without changing the order or content of the data handled for shifting, e.g. justifying, scaling, normalising
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
49.
GRAPH PARTITIONING AND IMPLEMENTATION OF LARGE MODELS ON TENSOR STREAMING PROCESSORS
A graph partitioning compiler partitions an AI program or model for execution on multiple TSP modules configured for accelerating deep learning workloads.
Embodiments are directed to a processor having a functional slice architecture. The processor is divided into tiles (or functional units) organized into a plurality of functional slices. The functional slices are configured to perform specific operations within the processor, which includes memory slices for storing operand data and arithmetic logic slices for performing operations on received operand data (e.g., vector processing, matrix manipulation). The processor includes a plurality of functional slices of a module type, each functional slice having a plurality of tiles. The processor further includes a plurality of data transport lanes for transporting data in a direction indicated in a corresponding instruction. The processor also includes a plurality of instruction queues, each instruction queue associated with a corresponding functional slice of the plurality of functional slices, wherein the instructions in the instruction queues comprise a functional slice specific operation code.
09 - Scientific and electric apparatus and instruments
Goods & Services
Recorded software implemented on a semiconductor chip or collection of semiconductor chips for use in natural language processing, machine learning, voice command and recognition, converting speech to text, data analytics, and artificial intelligence; Recorded software implemented on a semiconductor chip or collection of semiconductor chips for use in natural language processing, generation, understanding and analysis relating to automatically generating computer software; Downloadable software for natural language processing, machine learning, voice command and recognition, converting speech to text, data analytics, and artificial intelligence; Downloadable software for natural language processing, generation, understanding and analysis relating to automatically generating computer software
42 - Scientific, technological and industrial services, research and design
Goods & Services
Online non-downloadable software for use in natural language processing, machine learning, voice command and recognition, converting speech to text, data analytics, and artificial intelligence; Online non-downloadable software for use in natural language processing, generation, understanding and analysis relating to automatically generating computer software; Software as a service (SaaS) featuring computer software for use in natural language processing, machine learning, voice command and recognition, converting speech to text, data analytics, and artificial intelligence; Software as a service (SaaS) featuring computer software for use in natural language processing, generation, understanding and analysis relating to automatically generating computer software; Platform as a service (PaaS) featuring computer software platforms for use in natural language processing, machine learning, voice command and recognition, converting speech to text, data analytics, and artificial intelligence; Platform as a service (PaaS) featuring computer software platforms for use in natural language processing, generation, understanding and analysis relating to automatically generating computer software
53.
Compiler operations for tensor streaming processor
Embodiments are directed to a processor having a functional slice architecture. The processor is divided into tiles (or functional units) organized into a plurality of functional slices. The functional slices are configured to perform specific operations within the processor, which includes memory slices for storing operand data and arithmetic logic slices for performing operations on received operand data (e.g., vector processing, matrix manipulation). The processor includes a plurality of functional slices of a module type, each functional slice having a plurality of tiles. The processor further includes a plurality of data transport lanes for transporting data in a direction indicated in a corresponding instruction. The processor also includes a plurality of instruction queues, each instruction queue associated with a corresponding functional slice of the plurality of functional slices, wherein the instructions in the instruction queues comprise a functional slice specific operation code.
Improved placement of workload requests in a hosted compute resource uses a ‘friendly’ cuckoo hash algorithm to assign each workload request to an appropriately configured compute resource. When a first workload request is received, the workload is assigned to the compute resource module that has been pre-configured to execute that workload. Subsequent requests for a similar workload are either assigned to a second pre-configured compute resource or queued behind the first workload request.
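The placement behavior the abstract describes can be sketched as follows. This is a hypothetical model, not the patented implementation: the class name, the two-hash choice, and the queueing policy are assumptions made for illustration of the "friendly" (non-evicting) cuckoo-style assignment.

```python
# Hypothetical sketch of "friendly" cuckoo-style placement: each workload
# type has two candidate resources (two hash functions); if both candidates
# are busy, the request queues behind a matching workload rather than
# evicting one. All names are illustrative, not from the patent.
from collections import deque

class FriendlyPlacer:
    def __init__(self, n_resources):
        self.n = n_resources
        self.running = [None] * n_resources      # workload type per resource
        self.queues = [deque() for _ in range(n_resources)]

    def _slots(self, kind):
        # Two independent hash choices for the workload type.
        h1 = hash(("a", kind)) % self.n
        h2 = hash(("b", kind)) % self.n
        return h1, h2

    def place(self, kind, request):
        h1, h2 = self._slots(kind)
        for slot in (h1, h2):
            if self.running[slot] is None:
                self.running[slot] = kind        # pre-configure and run
                return ("run", slot)
        # Both candidates busy: queue behind the matching workload if any,
        # otherwise behind the first choice ("friendly": no eviction).
        slot = h1 if self.running[h1] == kind else (
            h2 if self.running[h2] == kind else h1)
        self.queues[slot].append((kind, request))
        return ("queued", slot)
```

A first request of a given type runs on a pre-configured resource; once both candidate resources are occupied, later requests of that type queue behind them instead of displacing other work.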
09 - Scientific and electric apparatus and instruments
Goods & Services
Computer hardware, namely, computer server arrays for use in artificial intelligence, machine learning, deep learning, natural language generation, statistical learning, supervised learning, un-supervised learning, data mining, predictive analytics and business intelligence
09 - Scientific and electric apparatus and instruments
Goods & Services
Computer hardware, namely, microprocessor chips for use in artificial intelligence, machine learning, deep learning, natural language generation, statistical learning, supervised learning, un-supervised learning, data mining, predictive analytics and business intelligence
09 - Scientific and electric apparatus and instruments
Goods & Services
Computer hardware, namely, accelerator expansion cards for use in artificial intelligence, machine learning, deep learning, natural language generation, statistical learning, supervised learning, un-supervised learning, data mining, predictive analytics and business intelligence; computer hardware, namely, accelerator cards for use in deep learning
09 - Scientific and electric apparatus and instruments
Goods & Services
Computer hardware, namely, computer servers for use in artificial intelligence, machine learning, deep learning, natural language generation, statistical learning, supervised learning, un-supervised learning, data mining, predictive analytics and business intelligence
59.
DIE-TO-DIE DENSE PACKAGING OF DETERMINISTIC STREAMING PROCESSORS
Embodiments are directed to an integrated circuit with multiple dies connected in a die-to-die (D2D) configuration. The integrated circuit can include a first die and a second die connected to the first die via a D2D interface circuit in the D2D configuration forming a D2D structure with the first die. The D2D interface can connect a first plurality of superlanes of the first die with a second plurality of superlanes of the second die for streaming data between the first die and the second die along a first direction or a second direction orthogonal to the first direction.
H01L 25/065 - Assemblies consisting of a plurality of individual semiconductor or other solid-state devices, all the devices being of a type provided for in a single subclass, e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
H01L 23/538 - Arrangements for conducting electric current within the device in operation from one component to another the interconnection structure between a plurality of semiconductor chips being formed on, or in, insulating substrates
G06F 13/42 - Bus transfer protocol, e.g. handshake; Synchronisation
G06F 15/173 - Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star or snowflake
09 - Scientific and electric apparatus and instruments
Goods & Services
Computer hardware featuring linear algebraic accelerators implemented on a semiconductor chip or collection of semiconductor chips to process transformers based in artificial or machine learning models; Integrated circuit modules; integrated circuit cards and electrical computer components and printed manuals associated therewith sold as a unit
42 - Scientific, technological and industrial services, research and design
Goods & Services
Platform as a service (PaaS) featuring computer software platforms for natural language processing, machine learning, voice command and recognition, converting speech to text, data analytics, and artificial intelligence; Platform as a service (PaaS) featuring computer software platforms for natural language processing, machine learning, voice command and recognition, converting speech to text, data analytics, and artificial intelligence using linear algebraic accelerators implemented on a semiconductor chip or collection of semiconductor chips to process transformers based in artificial or machine learning models
09 - Scientific and electric apparatus and instruments
Goods & Services
Computer hardware featuring linear algebraic accelerators implemented on a semiconductor chip or collection of semiconductor chips to process transformers based in artificial or machine learning models; Integrated circuit modules; integrated circuit cards and electrical computer components and printed manuals associated therewith sold as a unit.
42 - Scientific, technological and industrial services, research and design
Goods & Services
Platform as a service (PaaS) featuring computer software platforms for natural language processing, machine learning, voice command and recognition, converting speech to text, data analytics, and artificial intelligence; Platform as a service (PaaS) featuring computer software platforms for natural language processing, machine learning, voice command and recognition, converting speech to text, data analytics, and artificial intelligence using linear algebraic accelerators implemented on a semiconductor chip or collection of semiconductor chips to process transformers based in artificial or machine learning models
42 - Scientific, technological and industrial services, research and design
Goods & Services
Platform as a service (PaaS) featuring computer software platforms for natural language processing, machine learning, voice command and recognition, converting speech to text, data analytics, and artificial intelligence; Platform as a service (PaaS) featuring computer software platforms for natural language processing, machine learning, voice command and recognition, converting speech to text, data analytics, and artificial intelligence using linear algebraic accelerators implemented on a semiconductor chip or collection of semiconductor chips to process transformers based in artificial or machine learning models
09 - Scientific and electric apparatus and instruments
Goods & Services
Computer hardware featuring linear algebraic accelerators implemented on a semiconductor chip or collection of semiconductor chips to process transformers based in artificial or machine learning models; Integrated circuit modules; integrated circuit cards and electrical computer components and printed manuals associated therewith sold as a unit
66.
Software-defined tensor streaming multiprocessor for large-scale machine learning
A system contains a network of processors arranged in a plurality of nodes. Each node comprises a respective plurality of processors connected via local links, and different nodes are connected via global links. The processors of the network communicate with each other to establish a global counter for the network, enabling deterministic communication between the processors of the network. A compiler is configured to explicitly schedule communication traffic across the global and local links of the network of processors based upon the deterministic links between the processors. This enables software-scheduled networking, with explicit send or receive instructions executed by functional units of the processors at specific times, to establish a specific ordering of operations performed by the network of processors. In some embodiments, the processors of the network of processors are tensor streaming processors (TSPs).
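The software-scheduled networking described above can be modeled as a toy simulation: the compiler fixes, per global-counter tick, which processor sends on which link, and receivers read at their scheduled tick with no runtime handshaking. The schedule format and names below are invented for the sketch.

```python
# Toy model of software-scheduled networking: a compile-time schedule maps
# each global-counter tick to a list of (src, dst, tag) transfers. Because
# link latencies are deterministic, arrival ticks are known at compile
# time and no handshake is needed at runtime.

def run_schedule(schedule, n_ticks, values):
    # schedule: {tick: [(src, dst, tag), ...]}
    # values:   {(src, tag): payload}
    inbox = {}
    for tick in range(n_ticks):
        for src, dst, tag in schedule.get(tick, []):
            # arrival is implied by the schedule itself
            inbox[(dst, tag)] = values[(src, tag)]
    return inbox
```

Running a two-transfer schedule delivers each payload to its scheduled destination without any runtime coordination.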
A visualizer receives a compiled program to be run on a tensor streaming processor, which indicates a predetermined timing at which each functional unit of the processor receives instructions for processing data, and generates a visualization model used to display a schedule comprising elements corresponding to instructions received by each functional unit of a data path of the processor, arranged based upon the time at which each instruction is executed by its respective functional unit in accordance with the generated model. Due to the deterministic nature of the tensor streaming processor, the visualizer infers the flow of data across communication lanes of the processor and predicts the location of data within the processor for a given cycle during execution of the compiled program, without the need to actually execute the compiled program or to implement breakpoints within the program at specific cycles.
Embodiments are directed to a processor having a functional slice architecture. The processor is divided into tiles (or functional units) organized into a plurality of functional slices. The functional slices are configured to perform specific operations within the processor, which includes memory slices for storing operand data and arithmetic logic slices for performing operations on received operand data (e.g., vector processing, matrix manipulation). The processor includes a plurality of functional slices of a module type, each functional slice having a plurality of tiles. The processor further includes a plurality of data transport lanes for transporting data in a direction indicated in a corresponding instruction. The processor also includes a plurality of instruction queues, each instruction queue associated with a corresponding functional slice of the plurality of functional slices, wherein the instructions in the instruction queues comprise a functional slice specific operation code.
Embodiments are directed to a computing system with permission control via data redundancy. The computing system includes a memory and a permission control circuit coupled to the memory. The permission control circuit encodes a first data vector by using a bit position register with a first permission control code for a first user, writes the encoded first data vector into the memory, and updates content of the bit position register from the first permission control code to a second permission control code for a second user. The encoded first data vector written into the memory is inaccessible for the second user based on the updated content of the bit position register.
A system receives a predictive model and receives one or more runtime constraints. The system generates a directed acyclic graph (DAG) of the predictive model indicating dependencies. The system compiles the predictive model into first instructions for a first processor based on the one or more runtime constraints and the DAG. The system packages first instructions, the one or more runtime constraints, and the DAG of the predictive model in a first binary. The system recompiles the predictive model into second instructions for a second processor based on the runtime constraints and the DAG stored in the first processor. The system packages the second instructions, the DAG, and the runtime constraints in a second binary.
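The front half of that flow, deriving a dependency DAG from a model's operations and a compile order from the DAG, can be sketched with the standard library. The model representation (op name mapped to its input ops) is invented for the example.

```python
# Sketch of DAG construction and ordering for compilation: each op lists
# its predecessors; a topological order gives a valid compile/schedule
# sequence respecting the dependencies.
from graphlib import TopologicalSorter

def model_dag(ops):
    # ops: {op_name: [names of ops it depends on]}
    ts = TopologicalSorter(ops)
    return list(ts.static_order())
```

For a three-op chain, the resulting order always schedules producers before consumers.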
A deterministic apparatus comprising a deterministic near-compute memory communicatively coupled with and proximate to a deterministic processor. The deterministic near-compute memory comprises a plurality of data banks having a global memory address space, a control bus, a data input bus and a data output bus for each data bank. The deterministic processor is configured to initiate, via the control bus, retrieval of a set of data from the plurality of data banks. The retrieved set of data comprises at least one row of a selected one of the data banks passed via the data output bus onto a plurality of stream registers of the deterministic processor.
A system receives a predictive model and receives one or more runtime constraints. The system generates a directed acyclic graph (DAG) of the predictive model indicating dependencies. The system compiles the predictive model into first instructions for a first processor based on the one or more runtime constraints and the DAG. The system packages first instructions, the one or more runtime constraints, and the DAG of the predictive model in a first binary. The system recompiles the predictive model into second instructions for a second processor based on the runtime constraints and the DAG stored in the first processor. The system packages the second instructions, the DAG, and the runtime constraints in a second binary.
A system receives a predictive model and receives one or more runtime constraints. The system generates a directed acyclic graph (DAG) of the predictive model indicating dependencies. The system compiles the predictive model into first instructions for a first processor based on the one or more runtime constraints and the DAG. The system packages first instructions, the one or more runtime constraints, and the DAG of the predictive model in a first binary. The system recompiles the predictive model into second instructions for a second processor based on the runtime constraints and the DAG stored in the first processor. The system packages the second instructions, the DAG, and the runtime constraints in a second binary.
Embodiments are directed to a deterministic streaming system with a scheduler, a compiler, and a plurality of deterministic streaming processors. The scheduler evaluates a latency for each task of a plurality of tasks to be run at the deterministic streaming system, and adjusts at least one of an accuracy metric and a quality metric for an output of each task based on the evaluated latency until the plurality of tasks can be completed before expiration of contractual deadlines. At least a subset of the plurality of deterministic streaming processors runs the plurality of tasks each having the output with the adjusted accuracy metric and/or the adjusted quality metric. The compiler performs partial compilation of at least one model into an intermediate representation before requiring more information from the scheduler on how to finish the compilation. The scheduler generates the information for the compiler during a static capacity planning process.
G06F 15/82 - Architectures of general purpose stored program computers; data or demand driven
G06F 7/57 - Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups or for performing logical operations
G06F 9/48 - Program initiating; Program switching, e.g. by interrupt
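The scheduler's adjust-until-it-fits behavior described in the abstract above can be modeled as a simple loop. The latency/quality model here (each quality step down halves a task's latency) is an assumption made purely for illustration.

```python
# Toy sketch of the scheduler loop: lower the quality metric of the
# slowest task one notch at a time until the batch of tasks completes
# before the deadline, or report infeasibility. The halving model is an
# invented stand-in for the real latency evaluation.

def fit_to_deadline(latencies, deadline, min_quality=0, start_quality=3):
    quality = [start_quality] * len(latencies)
    lat = list(latencies)
    while sum(lat) > deadline:
        i = max(range(len(lat)), key=lambda k: lat[k])   # slowest task
        if quality[i] == min_quality:
            return None                                  # cannot meet deadline
        quality[i] -= 1
        lat[i] /= 2          # assumed: one quality notch halves latency
    return quality
```

With two tasks of latency 8 and 2 against a deadline of 7, one quality reduction on the slow task suffices.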
A system receives a predictive model and receives one or more runtime constraints. The system generates a directed acyclic graph (DAG) of the predictive model indicating dependencies. The system compiles the predictive model into first instructions for a first processor based on the one or more runtime constraints and the DAG. The system packages first instructions, the one or more runtime constraints, and the DAG of the predictive model in a first binary. The system recompiles the predictive model into second instructions for a second processor based on the runtime constraints and the DAG stored in the first processor. The system packages the second instructions, the DAG, and the runtime constraints in a second binary.
Embodiments are directed to a deterministic streaming system with one or more deterministic streaming processors each having an array of processing elements and a first deterministic memory coupled to the processing elements. The deterministic streaming system further includes a second deterministic memory with multiple data banks having a global memory address space, and a controller. The controller initiates retrieval of first data from the data banks of the second deterministic memory as a first plurality of streams, each stream of the first plurality of streams streaming toward a respective group of processing elements of the array of processing elements. The controller further initiates writing of second data to the data banks of the second deterministic memory as a second plurality of streams, each stream of the second plurality of streams streaming from the respective group of processing elements toward a respective data bank of the second deterministic memory.
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
Introduced here is a technique to detect and/or correct errors in computation. The ability to correct errors in computation can increase the speed of the processor, reduce the power consumption of the processor, and reduce the distance between the transistors within the processor, because any errors thus generated can be detected and corrected. In one embodiment, an error correcting module, running either in software or in hardware, can detect an error in matrix multiplication by calculating an expected sum of all elements in the resulting matrix and an actual sum of all elements in the resulting matrix. When there is a difference between the expected sum and the actual sum, the error correcting module detects an error. In another embodiment, in addition to detecting the error, the error correcting module can determine the location and the magnitude of the error, thus correcting the erroneous computation.
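A minimal sketch of this checksum idea: row checksums of the product can be computed cheaply from A and the row sums of B, and column checksums from the column sums of A and B, without recomputing the full product. A single mismatching row and column locates the error, and the checksum gap gives its magnitude. Function names are ours, not the patent's.

```python
# Checksum-based detection and single-error correction for matrix
# multiply (the classic algorithm-based fault tolerance construction the
# abstract alludes to). Integer matrices as lists of lists.

def matmul(A, B):
    n, kk, m = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(kk)) for j in range(m)]
            for i in range(n)]

def correct_single_error(A, B, C):
    """Return C with at most one wrong element fixed (C claims to be A @ B)."""
    kk = len(B)
    bsum = [sum(row) for row in B]                          # row sums of B
    acol = [sum(A[i][k] for i in range(len(A))) for k in range(kk)]
    bad_rows = [(i, sum(A[i][k] * bsum[k] for k in range(kk)) - sum(C[i]))
                for i in range(len(C))]
    bad_rows = [(i, d) for i, d in bad_rows if d]
    bad_cols = [(j, sum(acol[k] * B[k][j] for k in range(kk))
                 - sum(C[i][j] for i in range(len(C))))
                for j in range(len(C[0]))]
    bad_cols = [(j, d) for j, d in bad_cols if d]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        i, d = bad_rows[0]                 # row mismatch -> row index
        j, _ = bad_cols[0]                 # column mismatch -> column index
        C = [row[:] for row in C]
        C[i][j] += d                       # checksum gap = error magnitude
    return C
```

Corrupting one element of a 2x2 product and running the corrector restores the original value.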
42 - Scientific, technological and industrial services, research and design
Goods & Services
Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS) services featuring computer software platforms and software for providing medical professionals access to patient's health history, medications, current and past health status, and medical information on diseases, disease management, and prognoses for care; Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS) services featuring computer software platforms and software that allows users to review machine learning-assisted predictive analysis of medical or prescription outcomes for patients; Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS) services featuring computer software platforms and software that provides users with information relating to the interactions between diseases, infirmities, prescribed and potential medications, and specific patient information to determine proper diagnoses, prognoses, and treatment options for patients
79.
Power grid distribution for tensor streaming processors
Embodiments are directed to a power grid distribution for a deterministic processor. The deterministic processor includes a plurality of functional slices, a plurality of data transport lanes for transporting data across the functional slices along a first spatial dimension, and a plurality of instruction control units (ICUs). An instruction in each subset of the ICUs includes a functional slice specific operation code and is transported to a corresponding functional slice along a second spatial dimension orthogonal to the first spatial dimension. A power supply grid of metal traces is spread across the first and second spatial dimensions for supplying power to the functional slices and the ICUs. At least a portion of the metal traces are routed as discontinuous stubs along the first spatial dimension or the second spatial dimension.
Embodiments of the present disclosure pertain to switch matrix circuit including a data permutation circuit. In one embodiment, the switch matrix comprises a plurality of adjacent switching blocks configured along a first axis, wherein the plurality of adjacent switching blocks each receive data and switch control settings along a second axis. The switch matrix includes a permutation circuit comprising, in each switching block, a plurality of switching stages spanning a plurality of adjacent switching blocks and at least one switching stage that does not span to adjacent switching blocks. The permutation circuit receives data in a first pattern and outputs the data in a second pattern. The data permutation performed by the switching stages is based on the particular switch control settings received in the adjacent switching blocks along the second axis.
G06F 7/76 - Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data
G06F 7/78 - Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data for changing the order of data flow, e.g. matrix transposition or LIFO buffers; Overflow or underflow handling therefor
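The multi-stage permutation behavior of the switch matrix described above can be modeled behaviorally: data enters in one pattern and each switching stage applies a control-selected index permutation, yielding the output pattern. The stage/control encoding below is invented for the sketch; it does not reflect the block layout of the claimed circuit.

```python
# Rough behavioral model of a multi-stage permutation network: each stage
# is an index permutation chosen by its control settings; composing the
# stages maps the input pattern to the output pattern.

def apply_stages(data, stages):
    # stages: list of index permutations, one per switching stage
    out = list(data)
    for perm in stages:
        out = [out[i] for i in perm]
    return out
```

Two swap stages composed this way fully reverse a four-element input.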
81.
NUMERICAL PRECISION IN DIGITAL MULTIPLIER CIRCUITRY
In one embodiment, multiplier circuitry multiplies operands of a first format. One or more storage register circuits store digital bits corresponding to an operand and another operand of the first format. A decomposing circuit decomposes the operand into a first plurality of operands, and the other operand into a second plurality of operands. Each multiplier circuit multiplies a respective first operand of the first plurality of operands with a respective second operand of the second plurality of operands to generate a corresponding partial result of a plurality of partial results. An accumulator circuit accumulates the plurality of partial results using a second format to generate a complete result of the second format that is stored in the accumulator circuit. A conversion circuit truncates the complete result of the second format and converts the truncated result into an output result of an output format.
G06F 7/544 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using unspecified devices for evaluating functions by calculation
G06F 7/527 - Multiplying only in serial-parallel fashion, i.e. one operand being entered serially and the other in parallel
G06F 7/498 - Computations with decimal numbers using counter-type accumulators
G06F 5/08 - Methods or arrangements for data conversion without changing the order or content of the data handled for changing the speed of data flow, i.e. speed regularising having a sequence of storage locations, the intermediate ones not being accessible for either enqueue or dequeue operations, e.g. using a shift register
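The decompose-multiply-accumulate-truncate flow in the abstract above can be sketched numerically: a 16-bit unsigned multiply built from four 8-bit partial products accumulated in a wide register, then truncated to an output width. The bit widths and names are assumptions for the example, not taken from the claims.

```python
# Decomposition multiply: split each 16-bit operand into 8-bit halves,
# form the four partial products, accumulate them (with shifts) in a
# 32-bit-wide accumulator, then truncate to the output format.

def mul16_from_8bit_partials(a, b):
    assert 0 <= a < 1 << 16 and 0 <= b < 1 << 16
    a_lo, a_hi = a & 0xFF, a >> 8          # decompose each operand
    b_lo, b_hi = b & 0xFF, b >> 8
    partials = [
        (a_lo * b_lo, 0),                  # (8x8 product, shift in bits)
        (a_lo * b_hi, 8),
        (a_hi * b_lo, 8),
        (a_hi * b_hi, 16),
    ]
    acc = 0                                 # wide (32-bit) accumulator
    for p, sh in partials:
        acc += p << sh
    return acc                              # complete 32-bit result

def truncate_to_output(acc, out_bits=16):
    # keep the most significant out_bits of the 32-bit complete result
    return acc >> (32 - out_bits)
```

The accumulated partial products reproduce the full-width product exactly; truncation then converts it to the narrower output format.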
Embodiments are directed to a processor having a functional slice architecture. The processor is divided into tiles (or functional units) organized into a plurality of functional slices. The functional slices are configured to perform specific operations within the processor, which includes memory slices for storing operand data and arithmetic logic slices for performing operations on received operand data (e.g., vector processing, matrix manipulation). The processor includes a plurality of functional slices of a module type, each functional slice having a plurality of tiles. The processor further includes a plurality of data transport lanes for transporting data in a direction indicated in a corresponding instruction. The processor also includes a plurality of instruction queues, each instruction queue associated with a corresponding functional slice of the plurality of functional slices, wherein the instructions in the instruction queues comprise a functional slice specific operation code.
A method comprises receiving a kernel used to convolve with an input tensor. For a first dimension of the kernel, a square block of values for each single dimensional vector of the kernel that includes all rotations of that single dimensional vector is generated. For each additional dimension of the kernel, group blocks of an immediately preceding dimension into sets of blocks, each set of blocks including blocks of the immediately preceding dimension that are aligned along a vector that is parallel to the axis of the dimension; and generate, for the additional dimension, one or more blocks of values, each block including all rotations of blocks within each of the sets of blocks of the immediately preceding dimension. The block of values corresponding to the last dimension in the additional dimensions of the kernel is output as the expanded kernel.
G06F 7/76 - Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data
G06N 7/00 - Computing arrangements based on specific mathematical models
G06F 7/544 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using unspecified devices for evaluating functions by calculation
G06N 3/04 - Architecture, e.g. interconnection topology
G06F 18/2137 - Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on criteria of topology preservation, e.g. multidimensional scaling or self-organising maps
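The expanded-kernel construction described in the abstract above starts with a square block holding every rotation of a single-dimensional kernel vector (a circulant block); higher dimensions then rotate whole blocks the same way. A minimal sketch of the 1-D step and the 2-D blocks-of-blocks step:

```python
# Expanded kernel sketch: rotations_block builds the square block of all
# rotations of a 1-D vector; expand_2d groups the per-row blocks and
# rotates them along the second axis, mirroring the "blocks of blocks"
# construction for an additional dimension.

def rotations_block(vec):
    n = len(vec)
    return [vec[i:] + vec[:i] for i in range(n)]

def expand_2d(kernel):
    row_blocks = [rotations_block(row) for row in kernel]
    return [row_blocks[i:] + row_blocks[:i] for i in range(len(row_blocks))]
```

For a 3-element vector this yields the familiar circulant layout, and for a 2-D kernel the outer structure rotates the row blocks themselves.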
84.
Multimodal digital multiplication circuits and methods
Embodiments of the present disclosure pertain to multimodal digital multiplier circuits and methods. In one embodiment, partial product outputs of digital multiplication circuits are selectively inverted based on a mode control signal. The mode control signal may be set based on a format of the operands input to the multiplier. Example embodiments of the disclosure may multiply combinations of signed and unsigned input operands using different modes.
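The mode-controlled behavior can be modeled over the same raw bit patterns: the mode signal decides whether each 8-bit operand is read as signed (two's complement) or unsigned before a shared multiply. This models the external behavior only, not the gate-level selective inversion of partial products; the names are ours.

```python
# Multimodal multiply sketch: identical input bits, different results
# depending on the signed/unsigned mode controls.

def as_signed8(x):
    # reinterpret an 8-bit pattern as two's complement
    return x - 256 if x & 0x80 else x

def multimodal_mul(a_bits, b_bits, a_signed, b_signed):
    a = as_signed8(a_bits) if a_signed else a_bits
    b = as_signed8(b_bits) if b_signed else b_bits
    return a * b
```

The same bit pattern 0xFF multiplies as -1 in signed mode and as 255 in unsigned mode.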
m-1-port data structures, such that each port of the pair can access data entries of a first half of the n data entries either by accessing the structure storing that half directly, or by accessing both the difference structure and the structure containing the second half to reconstruct the data entries of the first half, thus allowing a pair of ports to concurrently access any of the stored data entries in parallel.
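The difference-structure reconstruction can be sketched with XOR: store halves A and B in single-port banks plus a third bank D with D[i] = A[i] XOR B[i]. Two ports can then read A[i] in the same cycle: one directly from bank A, the other by combining B and D. The bank layout and names are illustrative assumptions.

```python
# Two-port reads from single-port banks via a difference structure:
# port 1 reads A directly; port 2 reconstructs A[i] = B[i] ^ D[i]
# touching only banks B and D, so the two reads never share a bank.

class TwoPortEmulation:
    def __init__(self, first_half, second_half):
        assert len(first_half) == len(second_half)
        self.a = list(first_half)
        self.b = list(second_half)
        self.d = [x ^ y for x, y in zip(first_half, second_half)]

    def read_direct(self, i):
        return self.a[i]                      # port 1: bank A only

    def read_reconstructed(self, i):
        return self.b[i] ^ self.d[i]          # port 2: banks B and D only
```

Both access paths return the same entry, which is what lets a pair of ports hit the same half concurrently.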
In one embodiment, multiplier circuitry multiplies operands of a first format. One or more storage register circuits store digital bits corresponding to an operand and another operand of the first format. A decomposing circuit decomposes the operand into a first plurality of operands, and the other operand into a second plurality of operands. Each multiplier circuit multiplies a respective first operand of the first plurality of operands with a respective second operand of the second plurality of operands to generate a corresponding partial result of a plurality of partial results. An accumulator circuit accumulates the plurality of partial results using a second format to generate a complete result of the second format that is stored in the accumulator circuit. A conversion circuit truncates the complete result of the second format and converts the truncated result into an output result of an output format.
G06F 7/544 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using unspecified devices for evaluating functions by calculation
G06F 5/01 - Methods or arrangements for data conversion without changing the order or content of the data handled for shifting, e.g. justifying, scaling, normalising
A processor having a functional slice architecture is divided into a plurality of functional units (“tiles”) organized into a plurality of slices. Each slice is configured to perform specific functions within the processor, which may include memory slices (MEM) for storing operand data, and arithmetic logic slices for performing operations on received operand data. The tiles of the processor are configured to stream operand data across a first dimension, and receive instructions across a second dimension orthogonal to the first dimension. The timing of data and instruction flows are configured such that corresponding data and instructions are received at each tile with a predetermined temporal relationship, allowing operand data to be transmitted between the slices of the processor without any accompanying metadata. Instead, each slice is able to determine what operations to perform on received data based upon the timing at which the data is received.
A processor having a functional slice architecture is divided into a plurality of functional units (“tiles”) organized into a plurality of slices. Each slice is configured to perform specific functions within the processor, which may include memory slices (MEM) for storing operand data, and arithmetic logic slices for performing operations on received operand data. The tiles of the processor are configured to stream operand data across a first dimension, and receive instructions across a second dimension orthogonal to the first dimension. The timing of data and instruction flows are configured such that corresponding data and instructions are received at each tile with a predetermined temporal relationship, allowing operand data to be transmitted between the slices of the processor without any accompanying metadata. Instead, each slice is able to determine what operations to perform on received data based upon the timing at which the data is received.
A system receives a predictive model and receives one or more runtime constraints. The system generates a directed acyclic graph (DAG) of the predictive model indicating dependencies. The system compiles the predictive model into first instructions for a first processor based on the one or more runtime constraints and the DAG. The system packages first instructions, the one or more runtime constraints, and the DAG of the predictive model in a first binary. The system recompiles the predictive model into second instructions for a second processor based on the runtime constraints and the DAG stored in the first processor. The system packages the second instructions, the DAG, and the runtime constraints in a second binary.
A system receives a predictive model and receives one or more runtime constraints. The system generates a directed acyclic graph (DAG) of the predictive model indicating dependencies. The system compiles the predictive model into first instructions for a first processor based on the one or more runtime constraints and the DAG. The system packages first instructions, the one or more runtime constraints, and the DAG of the predictive model in a first binary. The system recompiles the predictive model into second instructions for a second processor based on the runtime constraints and the DAG stored in the first processor. The system packages the second instructions, the DAG, and the runtime constraints in a second binary.
A deterministic apparatus comprising a deterministic near-compute memory communicatively coupled with and proximate to a deterministic processor. The deterministic near-compute memory comprises a plurality of data banks having a global memory address space, a control bus, a data input bus and a data output bus for each data bank. The deterministic processor is configured to initiate, via the control bus, retrieval of a set of data from the plurality of data banks. The retrieved set of data comprises at least one row of a selected one of the data banks passed via the data output bus onto a plurality of stream registers of the deterministic processor.
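A rough model of the banked near-compute memory: a flat global address decodes to a (bank, row) pair, and a read returns the entire selected row, which lands on the processor's stream registers. Bank count, row count, and row width below are invented for illustration:

```python
# Illustrative geometry -- not the actual memory organization.
NUM_BANKS, ROWS_PER_BANK, ROW_WIDTH = 4, 8, 16

banks = [[[0] * ROW_WIDTH for _ in range(ROWS_PER_BANK)]
         for _ in range(NUM_BANKS)]

def decode(addr):
    """Map a global row address onto a (bank, row) pair."""
    return addr // ROWS_PER_BANK, addr % ROWS_PER_BANK

def read_row(addr):
    """Retrieve a whole row via the selected bank's data output bus;
    the result would be deposited onto the stream registers."""
    bank, row = decode(addr)
    return list(banks[bank][row])

banks[2][3] = list(range(ROW_WIDTH))
stream_regs = read_row(2 * ROWS_PER_BANK + 3)
assert stream_regs == list(range(ROW_WIDTH))
```

Transferring a full row per access matches the streaming model: the processor initiates one control-bus transaction and receives a row-wide burst rather than word-at-a-time reads.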
The present disclosure provides circuits and methods that can be used to update configurations. An example circuit can include a plurality of hLUTs and a plurality of registers configured to propagate a set of data or a portion thereof to the plurality of hLUTs. An hLUT of the plurality of hLUTs can have a transformation unit comprising transformation circuitry configured to (i) receive the set of data or the portion thereof from a register of the plurality of registers and (ii) transform the set of data or the portion thereof into configurations for the hLUT.
H03K 19/173 - Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components
H03K 19/17728 - Reconfigurable logic blocks, e.g. lookup tables
H03K 19/21 - EXCLUSIVE-OR circuits, i.e. giving output if input signal exists at only one input; COINCIDENCE circuits, i.e. giving output only if all input signals are identical
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
H03K 19/017 - Modifications for accelerating switching in field-effect transistor circuits
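The transformation step — a register delivers a data word, and the hLUT's transformation circuitry expands it into the LUT's configuration — can be modeled in a few lines. Here the word is taken to encode the truth table of a 2-input function in its low 4 bits; this encoding is an assumption for illustration, not the actual circuit's format:

```python
def transform(word):
    """Expand a 4-bit configuration word into a 2-input LUT truth table
    (bit i of the word is the output for input combination i)."""
    return [(word >> i) & 1 for i in range(4)]

def lut_eval(table, a, b):
    """Evaluate the configured LUT on inputs a and b."""
    return table[(b << 1) | a]

# 0b0110 encodes XOR: outputs for (a,b) = (0,0),(1,0),(0,1),(1,1).
config = transform(0b0110)
assert [lut_eval(config, a, b)
        for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [0, 1, 1, 0]
```

The point of the transformation unit is that the propagated data word need not be in the LUT's native configuration layout; the circuitry local to each hLUT performs the expansion.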
Introduced here is a technique to create small compressed image files while preserving data quality upon decompression. Upon receiving uncompressed data, such as an image, a video, an audio signal, and/or structured data, a machine learning model identifies an object in the uncompressed data, such as a house, a dog, text, a distinct audio signal, a unique data pattern, etc. The identified object is compressed using a compression treatment optimized for the identified object. The identified object, either before or after the compression, is removed from the uncompressed data. The uncompressed data with the identified object removed is compressed using a standard compression treatment.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
H04N 19/625 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]
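A toy sketch of the two-path scheme: the "object" region is cut out and compressed with its own treatment, and the remainder gets a standard treatment. Here `zlib` stands in for both codecs, and object detection is faked with a fixed byte range rather than a machine learning model:

```python
import zlib

data = b"A" * 100 + b"DOG" * 20 + b"B" * 100
obj_start, obj_end = 100, 160          # pretend a model located the object

obj = data[obj_start:obj_end]          # identified object
remainder = data[:obj_start] + data[obj_end:]   # object removed

obj_blob = zlib.compress(obj, level=9)  # treatment tuned for the object
rest_blob = zlib.compress(remainder)    # standard treatment

# Decompression reinserts the object at its recorded offset.
restored = zlib.decompress(rest_blob)
restored = restored[:obj_start] + zlib.decompress(obj_blob) + restored[obj_start:]
assert restored == data
```

In a real implementation the per-object treatment could differ in kind, not just level (e.g., a learned codec for faces, a text codec for embedded text); the reassembly step only needs the object's location.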
In one embodiment, in a first mode, first and second input operands having a first data type are multiplied using one or more of a plurality of multipliers, and in a second mode, a plurality of input operands having a second data type are multiplied using the plurality of multipliers. Accordingly, multiplier circuitry may process different input data types and share circuitry across the different modes. In some embodiments, in the first mode, products may be converted to a third data type, and in the second mode, multiple products may be concatenated. Values in the third data type, in the first mode, and concatenated values having the second data type, in the second mode, may be added across different multimodal multipliers to form a multiply-accumulator. In some embodiments, the plurality of multiply-accumulators may be configured in series.
G06F 7/544 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using unspecified devices for evaluating functions by calculation
G06F 5/01 - Methods or arrangements for data conversion without changing the order or content of the data handled for shifting, e.g. justifying, scaling, normalising
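The sharing idea can be shown arithmetically: four narrow multipliers either combine into one wide product (first mode) or produce four independent narrow products that are concatenated (second mode). The 8-bit/16-bit widths below are an assumption for illustration:

```python
def mul8(a, b):
    """One shared 8x8 hardware multiplier."""
    return (a & 0xFF) * (b & 0xFF)

def mode1_16x16(x, y):
    """First mode: compose a 16x16 multiply from four 8x8 partial
    products, shifted and summed by significance."""
    xh, xl, yh, yl = x >> 8, x & 0xFF, y >> 8, y & 0xFF
    return ((mul8(xh, yh) << 16)
            + ((mul8(xh, yl) + mul8(xl, yh)) << 8)
            + mul8(xl, yl))

def mode2_packed(xs, ys):
    """Second mode: independent 8x8 products concatenated into one word,
    16 bits per product."""
    out = 0
    for a, b in zip(xs, ys):
        out = (out << 16) | mul8(a, b)
    return out

assert mode1_16x16(300, 500) == 300 * 500
```

The same four `mul8` units serve both modes; only the shift-and-add (mode 1) versus concatenate (mode 2) reduction differs, which is the circuitry-sharing claim in miniature.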
96.
LOADING OPERANDS AND OUTPUTTING RESULTS FROM A MULTI-DIMENSIONAL ARRAY USING ONLY A SINGLE SIDE
A computational array is implemented in which all operands and results are loaded or output from a single side of the array. The computational array comprises a plurality of cells arranged in n rows and m columns, each configured to produce a processed value based upon a weight value and an activation value. The cells receive weight and activation values via colinear weight and activation transmission channels that each extend across a first side edge of the computational array to provide weight values and activation values to the cells of the array. In addition, result values produced at a top cell of each of the m columns of the array are routed through the array to be output from the same first side edge of the array at a same relative timing at which the result values were produced.
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
G06F 9/30 - Arrangements for executing machine instructions, e.g. instruction decode
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
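The "same relative timing" property can be illustrated with a toy timing model: a result from column c needs c hops to return to the edge, so giving each column a compensating delay of (m − c) stages makes the total routing delay uniform and preserves production spacing at the exit. The uniform-delay scheme and all cycle counts here are assumptions for illustration, not the patented routing:

```python
M = 4  # number of columns (illustrative)

def result_exit(col, produced_cycle):
    """Column `col` travels `col` hops back to the edge plus (M - col)
    compensation stages, so every result sees the same M-cycle delay."""
    return produced_cycle + (M - col) + col

# Columns produce results one cycle apart; exit times keep that spacing.
produced = [10 + c for c in range(M)]
exits = [result_exit(c, t) for c, t in zip(range(M), produced)]
assert [b - a for a, b in zip(exits, exits[1:])] == [1, 1, 1]
```

Equalizing the return-path delay is what lets results leave the single loading edge in the same order and spacing in which the column tops produced them.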
Introduced here is a technique to create small compressed image files while preserving data quality upon decompression. Upon receiving an uncompressed data, such as an image, a video, an audio, and/or a structured data, a machine learning model identifies an object in the uncompressed data such as a house, a dog, a text, a distinct audio signal, a unique data pattern, etc. The identified object is compressed using a compression treatment optimized for the identified object. The identified object, either before or after the compression, is removed from the uncompressed data. The uncompressed data with the identified object removed is compressed using a standard compression treatment.
H04N 19/625 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]
H04N 19/103 - Selection of coding mode or of prediction mode
In one embodiment, the present disclosure includes a method of reducing power in an artificial intelligence processor. For each cycle, over a plurality of cycles, an AI model is translated into operations executable on an artificial intelligence processor. The translating is based on power parameters that correspond to power consumption and performance of the artificial intelligence processor. The AI processor is configured with the executable operations, and input activation data sets are processed. Accordingly, result sets, power consumption data, and performance data are generated and stored over the plurality of cycles. The method further includes training an AI algorithm using the stored power parameters, the power consumption data, and the performance data. The trained AI algorithm outputs a plurality of optimized parameters to reduce power consumption of the AI processor. The AI model is then translated into optimized executable operations based on the plurality of optimized parameters.
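The closed loop above can be condensed into a sketch: run under different power parameters, log power and performance, then select parameters meeting a performance floor at minimum power. The toy power/performance models and the argmin "selector" (standing in for the trained AI algorithm) are invented for illustration:

```python
def run_cycle(freq_ghz, vdd):
    """Toy models: dynamic power ~ f * V^2, throughput ~ f."""
    power = 10.0 * freq_ghz * vdd ** 2
    perf = 1000.0 * freq_ghz
    return power, perf

# Sweep power parameters over a plurality of cycles, logging results.
log = []
for freq in (0.6, 0.8, 1.0):
    for vdd in (0.7, 0.8, 0.9):
        power, perf = run_cycle(freq, vdd)
        log.append(((freq, vdd), power, perf))

def train_selector(log, min_perf):
    """Stand-in for the trained AI algorithm: choose the logged
    parameters that satisfy the performance floor at minimum power."""
    feasible = [entry for entry in log if entry[2] >= min_perf]
    return min(feasible, key=lambda e: e[1])[0]

best = train_selector(log, min_perf=800.0)
assert run_cycle(*best)[1] >= 800.0
```

A real system would replace the exhaustive sweep and argmin with a learned model generalizing beyond logged points, but the data flow — parameters in, (power, performance) logged, optimized parameters out — is the same.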
A system may comprise a processor integrated circuit (IC) and a vector mapping sub-system that is separate from the processor IC and includes one or more ICs. The system may receive input data for processing by a predictive model and generate at least one memory address from the input data. The at least one memory address may be provided to the vector mapping sub-system. The vector mapping sub-system generates a resulting vector of numbers based on the at least one memory address. The resulting vector can be a fixed length vector representation of the input data. The resulting vector is provided from the vector mapping sub-system to the processor IC. The processor IC executes one or more instructions for the predictive model using the resulting vector to generate a prediction. A corresponding method is also disclosed.
G06F 9/30 - Arrangements for executing machine instructions, e.g. instruction decode
G06F 15/173 - Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star or snowflake
G06F 12/0864 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches using pseudo-associative means, e.g. set-associative or hashing
G06F 9/34 - Addressing or accessing the instruction operand or the result
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
G06F 8/35 - Creation or generation of source code model driven
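The split responsibility — the host derives a memory address from the input, the external sub-system maps it to a fixed-length vector, and the processor consumes that vector — resembles an off-chip embedding lookup. A minimal model, where the hash and in-memory table stand in for the sub-system's ICs and the sizes are invented:

```python
import hashlib

TABLE_SIZE, VEC_LEN = 1024, 4  # illustrative dimensions

# The vector-mapping sub-system's store: one fixed-length vector per address.
table = [[(i + j) % 7 for j in range(VEC_LEN)] for i in range(TABLE_SIZE)]

def to_address(token: str) -> int:
    """Host side: derive a memory address from the input data."""
    digest = hashlib.sha256(token.encode()).digest()
    return int.from_bytes(digest[:4], "big") % TABLE_SIZE

def map_vector(addr: int):
    """Sub-system side: resolve an address to a fixed-length vector,
    which is then handed to the processor IC."""
    return table[addr]

vec = map_vector(to_address("hello"))
assert len(vec) == VEC_LEN
```

Keeping the mapping off the processor IC means the (potentially very large) table never occupies on-chip memory; the processor only ever sees fixed-length vectors.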