An image annotation data generation method and apparatus, a model training method and apparatus, a device and a medium. The image annotation data generation method comprises: acquiring a sample image, a target recognition object in the sample image, and visual description information of the target recognition object (S101); verifying words in the visual description information on the basis of the sample image and the target recognition object to obtain a verification result of the visual description information (S102); adjusting the words in the visual description information on the basis of the verification result of the visual description information to obtain target description information of the target recognition object (S103); and taking the target description information of the target recognition object as annotation data of the target recognition object (S104).
Provided are a training method for an image generation model, an image generation method, apparatus, and a device. The training method includes extracting reference keypoints of a character from a sample reference image; based on a model to be trained, performing motion estimation using sample audio data and the reference keypoints to obtain predicted keypoints that match the sample audio data; performing parameter estimation using the reference keypoints and the predicted keypoints to obtain motion parameters of the predicted keypoints, and performing prior motion estimation using the motion parameters of the predicted keypoints to obtain optical flow of non-key pixel points; performing image prediction using the sample reference image and dense optical flow to obtain predicted image data that matches the sample audio data; performing model training using the predicted image data and annotated image data to obtain the image generation model.
G06T 7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
G06V 30/18 - Extraction of features or characteristics of the image
G10L 25/57 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for processing of video signals
3.
METHOD FOR CONSTRUCTING MAP BASED ON LARGE MODEL, VEHICLE CONTROL METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM
A method of constructing a map based on a large model, a vehicle control method, an electronic device, and a storage medium are provided, which relate to a field of artificial intelligence technology, in particular to fields of computer vision, deep learning, large model and generative model technologies. The method includes: acquiring an associated-region lane attribute and an image to be detected, which is collected by an onboard sensor and represents a road region to be detected; constructing a target prompt information based on the associated-region lane attribute; and processing the target prompt information and the image to be detected by using the large model to obtain a regional road map for the road region to be detected. The associated-region lane attribute corresponds to an associated road region, and the associated road region and the road region to be detected meet a predetermined similarity condition.
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
4.
INTERACTIVE METHOD BASED ON LARGE MODEL, TRAINING METHOD, INTELLIGENT AGENT, DEVICE, AND MEDIUM
An interactive method based on a large model, a training method, and an intelligent agent, which relate to fields of artificial intelligence, speech recognition, speech interaction, deep learning, large models, and application scenarios of knowledge search, autonomous driving, intelligent customer service, intelligent speech control, smart e-commerce, AI healthcare. The interactive method includes: acquiring a request speech; performing a speech recognition on the request speech to obtain a speech recognition feature representing a request semantics; and processing the speech recognition feature using the large model to obtain a response text, where the response text includes response words arranged in sequence, a target response word among the response words is determined by processing the speech recognition feature and an associated response word feature using an attention fusion layer of the large model, and the associated response word feature is related to an associated response word arranged before the target response word.
A task execution method and apparatus for a large model, an electronic device, and a storage medium are provided, which relate to a field of artificial intelligence, and in particular to fields of deep learning and large model technologies. The method includes: executing, according to a target feature to be processed, a collaborative computing task using a target computing unit, where the collaborative computing task includes a first collaborative task and a second collaborative task, the first collaborative task is used to process the target feature to be processed and a first collaborative sub-weight to obtain an intermediate collaborative feature, and the second collaborative task is used to process the intermediate collaborative feature and a second collaborative sub-weight to obtain a target collaborative feature; and fusing a target basic feature and the target collaborative feature to obtain a next target feature to be processed.
A method, electronic device and computer-readable storage medium for extracting entity relationships, which relates to artificial intelligence technologies such as natural language processing, knowledge graphs, deep learning, and large language models. The method for extracting entity relationships includes: inputting a target long text into a target large language model to obtain a target keyword list based on an output result of the target large language model; inputting the target keyword list into multiple target relationship agents respectively to obtain multiple target regular expressions corresponding to different entity relationships based on output results of the multiple target relationship agents; and processing texts in a preset text set using the multiple target regular expressions to obtain entity relationship extraction results.
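The final step of the entity relationship method above, applying several relationship-specific regular expressions to a text set, can be sketched as follows. This is only an illustration: the two patterns, the relation names, and the sample texts are hypothetical stand-ins for what the relationship agents would generate, and the keyword-extraction and agent steps are not modelled.

```python
import re

# Hypothetical patterns, one per entity relationship. In the described method
# these would come from the relationship agents; here they are hand-written.
relation_patterns = {
    "founded_by": re.compile(r"(?P<head>[A-Z]\w+) was founded by (?P<tail>[A-Z]\w+)"),
    "located_in": re.compile(r"(?P<head>[A-Z]\w+) is located in (?P<tail>[A-Z]\w+)"),
}


def extract_relations(texts, patterns):
    """Apply every relation pattern to every text and collect (head, relation, tail)."""
    triples = []
    for text in texts:
        for relation, pattern in patterns.items():
            for match in pattern.finditer(text):
                triples.append((match.group("head"), relation, match.group("tail")))
    return triples


# Hypothetical preset text set.
texts = ["Acme was founded by Alice.", "Acme is located in Berlin."]
results = extract_relations(texts, relation_patterns)
```

Keeping one compiled pattern per relationship means a new relationship type only adds an entry to the dictionary, without changing the extraction loop.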
Provided is a method for editing an online document, an electronic device, and a storage medium, relating to the fields of computer technology, document processing, document editing, and artificial intelligence. The method includes: a client sends a data acquisition request for a target file to a server in response to an open request for the target file; receives a data entity of the target file returned by the server, where the data entity comprises multiple pieces of log data; loads the data entity in a preset web view container, and displays a web view with the data entity loaded; and receives an editing instruction for the web view, displays response result data in the web view in response to the editing instruction, and sends the response result data to the server so that the server updates the data entity of the target file according to the response result data.
The present disclosure provides an acquisition method and apparatus for attribute information of a directory of a distributed file system, and a device, and relates to the field of computer technologies and, in particular, to the field of big data and a distributed file system. The specific implementation is as follows: when an acquisition instruction is acquired, judging whether a log index of a directory indicated by the acquisition instruction is less than a transaction index of the directory, and whether a queue index of the directory is greater than or equal to the transaction index; if both of the above conditions are met, reading attribute modification information of a directory from a memory queue; and determining the attribute information of the current directory according to the attribute modification information and an attribute file of the directory.
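The two-part index check in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the representation of the memory queue as `(index, changes)` pairs, the rule that only entries between the log index and the transaction index are applied, and the fallback to the attribute file alone when the check fails are all hypothetical and not taken from the source.

```python
def should_read_from_queue(log_index, transaction_index, queue_index):
    """Both conditions from the abstract: the log lags the transaction index,
    and the memory queue has caught up to (or passed) the transaction index."""
    return log_index < transaction_index and queue_index >= transaction_index


def current_attributes(attribute_file, memory_queue,
                       log_index, transaction_index, queue_index):
    """Combine the persisted attribute file with queued modifications.

    attribute_file: dict of persisted attributes (assumed shape).
    memory_queue: list of (index, changes) pairs in index order (assumed shape).
    """
    attrs = dict(attribute_file)
    if should_read_from_queue(log_index, transaction_index, queue_index):
        for entry_index, changes in memory_queue:
            # Assumption: only modifications not yet reflected in the log,
            # and not newer than the transaction index, are applied.
            if log_index < entry_index <= transaction_index:
                attrs.update(changes)
    return attrs


# Hypothetical data for illustration.
attribute_file = {"size": 10, "mtime": 100}
memory_queue = [(5, {"size": 12}), (6, {"mtime": 130}), (7, {"size": 99})]
merged = current_attributes(attribute_file, memory_queue,
                            log_index=4, transaction_index=6, queue_index=7)
```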
A communication method, apparatus, electronic device and storage medium for a computing power cluster are provided. An implementation of the method includes: during a process of communicating with a communication receiver using Remote Direct Memory Access (RDMA) protocol, obtaining a first packet loss rate corresponding to the RDMA protocol; in response to the first packet loss rate being higher than a first preset packet loss rate, initiating a first handshake request to the communication receiver for requesting to switch to Transmission Control Protocol (TCP) for communication; receiving a first handshake response returned by the communication receiver for the first handshake request, and determining a first starting transmission position of data according to a last data receiving position in the first handshake response; and communicating, by using the TCP, with the communication receiver starting from data corresponding to the first starting transmission position.
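The fallback decision and the resume position described above can be sketched as follows. No real RDMA or TCP calls are made. The `HandshakeResponse` type, the byte-offset interpretation of the "last data receiving position", and the choice to resume at the next byte are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class HandshakeResponse:
    # Assumed meaning: zero-based offset of the last byte the receiver holds.
    last_received_position: int


def should_switch_to_tcp(loss_rate, preset_loss_rate):
    """Switch only when the observed RDMA loss rate is strictly higher than the preset."""
    return loss_rate > preset_loss_rate


def starting_position(response):
    """Assumption: transmission resumes with the byte after the last one received."""
    return response.last_received_position + 1


def remaining_payload(data, response):
    """Return the part of the payload that still has to be sent over TCP."""
    return data[starting_position(response):]


# Hypothetical values for illustration.
payload = b"abcdefgh"
response = HandshakeResponse(last_received_position=3)
to_send = remaining_payload(payload, response)
```

Deriving the start position from the receiver's report, rather than from the sender's own bookkeeping, avoids resending data that arrived before the switch.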
An image search method, an intelligent agent, an electronic device, and a storage medium are provided, which relate to a field of artificial intelligence technology. The method includes: acquiring at least one first candidate image matched with an input text information; performing a semantic analysis on the input text information by using a first large model to generate at least one question-answer pair which includes a question information and a first answer information; performing an image-text analysis on the question information and the at least one first candidate image by using a second large model to generate a second answer information for answering each question information; and determining at least one target image matched with the image search requirement from the at least one first candidate image according to a comparison result between the at least one first answer information and the at least one second answer information.
The present disclosure provides a method and an apparatus for intent recognition based on a large language model (LLM), an electronic device, and a storage medium, relating to a field of computer technology, specifically to a field of artificial intelligence technology, such as natural language processing and an LLM. A specific implementation solution is as follows: obtaining a query statement, a preset intent, and descriptive information of the preset intent; obtaining a first candidate intent corresponding to the query statement by matching the query statement with the preset intent and the descriptive information of the preset intent; generating first prompt information based on the query statement, the first candidate intent, and descriptive information of the first candidate intent; and determining a first target intent corresponding to the query statement from the first candidate intent by inputting the first prompt information into the LLM.
A large model-based video processing method, device and storage medium in the field of artificial intelligence technology, particularly in the fields of deep learning and large models are disclosed. The specific solution includes: collecting an imitation video made by a user based on a target video; extracting three-dimensional postures of the imitation video using a pre-trained large model based on the imitation video; and performing posture assessment on the imitation video using the pre-trained large model based on the three-dimensional postures of the imitation video and pre-obtained three-dimensional postures of the target video to obtain an assessment result.
Provided is a method for controlling a virtual machine, an electronic device and a storage medium, relating to the field of computer technology, and in particular to application fields of cloud service, cloud computing and big data processing. The method includes: obtaining a virtual machine control request; generating a control operation instruction for a target virtual machine based on the virtual machine control request by using a computing node service deployed on the smart card; and sending the control operation instruction to a virtualization control program deployed on the physical machine; where the virtualization control program is used to perform a control operation on the target virtual machine based on the control operation instruction.
A method and apparatus for cross-computing power cluster communication, an electronic device, and a computer readable storage medium are provided. An implementation of the method includes: in response to a communication initiator and a communication receiver respectively belonging to different computing power clusters, increasing an RDMA connection count on the basis of an initial connection count until a detected actual bandwidth value no longer increases, to obtain a target connection count; increasing, with the RDMA connection count being maintained at the target connection count, a buffer size on the basis of an initial size until a detected actual bandwidth value no longer increases, to obtain a target size; determining cross-cluster transmission parameters based on the target connection count and the target size; and communicating, using the cross-cluster transmission parameters, with the communication receiver belonging to a different computing power cluster.
H04L 67/1097 - Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
H04L 49/9005 - Buffering arrangements using dynamic buffer space allocation
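The two-stage tuning in the cross-cluster communication entry above (grow the connection count until bandwidth stops improving, then grow the buffer size with the connection count held fixed) can be sketched as follows. The bandwidth probe is simulated by a hypothetical `fake_bandwidth` function. The step sizes and the `max_steps` safety cap are assumptions, not details from the source.

```python
def grow_until_plateau(measure, initial, step, max_steps=64):
    """Increase a parameter from `initial` by `step` until the measured
    bandwidth no longer increases; return the last value that improved it."""
    best_value, best_bandwidth = initial, measure(initial)
    for _ in range(max_steps):  # safety cap (assumption) to guarantee termination
        candidate = best_value + step
        bandwidth = measure(candidate)
        if bandwidth <= best_bandwidth:
            break
        best_value, best_bandwidth = candidate, bandwidth
    return best_value


def tune_cross_cluster(measure, initial_connections, initial_buffer,
                       connection_step=1, buffer_step=2):
    """Stage 1 tunes the connection count; stage 2 tunes the buffer size
    while the connection count stays at its target value."""
    target_connections = grow_until_plateau(
        lambda c: measure(c, initial_buffer), initial_connections, connection_step)
    target_buffer = grow_until_plateau(
        lambda b: measure(target_connections, b), initial_buffer, buffer_step)
    return target_connections, target_buffer


def fake_bandwidth(connections, buffer_size):
    """Hypothetical probe: saturates at 4 connections and a buffer size of 8."""
    return min(connections, 4) * min(buffer_size, 8)


params = tune_cross_cluster(fake_bandwidth, initial_connections=1, initial_buffer=2)
```

Tuning one parameter at a time keeps the number of bandwidth probes linear in the number of steps, at the cost of possibly missing a better joint setting.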
15.
METHOD AND APPARATUS FOR PROCESSING DIRECTORY METADATA IN A DISTRIBUTED FILE SYSTEM, AND DEVICE
Baidu International Technology (Shenzhen) Co., Ltd. (China)
Inventor
Cao, Biao
Chen, Qiushi
Jian, Jielong
Duan, Liguo
Zheng, Ran
Abstract
The present disclosure provides a method for processing directory metadata in a distributed file system, an apparatus and a device, which are related to the field of computer technologies, in particular to the field of big data and distributed file systems. The specific solution is as follows: when a processing request is acquired, a first index shard corresponding to directory metadata indicated by the processing request may be determined according to the processing request; the first index shard may be loaded, and a metadata shard corresponding to the directory metadata indicated by the processing request is determined according to the first index shard. Information in the determined metadata shard is adjusted. A second index shard related to the directory metadata indicated by the processing request is determined according to the processing request, and index information corresponding to the directory metadata in the second index shard is adjusted.
Provided are an inference method and apparatus for a large language model, a device, and a storage medium. The inference method for the large language model includes: performing encryption on a target input text to obtain a target input ciphertext; sending the target input ciphertext to a server so that an encrypted model is used for performing inference on the target input ciphertext by the server to obtain a target result ciphertext; receiving the target result ciphertext sent by the server; and performing decryption on the target result ciphertext to obtain a target result plaintext.
H04L 9/06 - Arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for blockwise coding, e.g. D.E.S. systems
An image labeling method and apparatus, a device, and a storage medium. The method comprises: using a preset labeling model to perform initial labeling processing on a target image to be labeled, to obtain target position information of a target object in the target image, and description information to be verified (S101); inputting the target image and target prompt information into a generative pre-trained transformer (GPT) model, to obtain an object search result, the target prompt information being used for prompting the GPT model to search, on the basis of the target description information, the target image for the target object and/or a virtual object associated with the target object, and the target description information being determined on the basis of the description information to be verified (S102); and when it is determined on the basis of the object search result that the verification of the description information to be verified is passed, determining labeling information of the target image on the basis of the description information to be verified and the target position information (S103).
The present application provides a method and apparatus for transmitting audio and video data, a device, and a medium. The method for transmitting audio and video data comprises: determining a plurality of candidate transmission sequences for data frames in audio and video data to be transmitted, the data frames comprising audio frames and video frames; on the basis of frame attributes of the data frames, determining a loss parameter of each candidate transmission sequence; and, on the basis of the loss parameters, determining a target transmission sequence from the plurality of candidate transmission sequences, and transmitting the data frames to a client by using the target transmission sequence.
H04N 21/262 - Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission or generating play-lists
H04N 21/8547 - Content authoring involving timestamps for synchronizing content
20.
DIRECTORY METADATA OPERATION METHOD AND APPARATUS, ELECTRONIC DEVICE, AND READABLE STORAGE MEDIUM
A directory metadata operation method, including: in response to receiving a directory query request, obtaining a query path carried therein; obtaining a prebuilt prefix directory data table and a prebuilt full load directory data table, where a key of the prefix directory data table is a prefix path of the directory metadata, which is a subpath corresponding to a hierarchy other than a default hierarchy counting from end in a path corresponding to the directory metadata, a value of the prefix directory data table is the directory metadata, a key of the full load directory data table is identification information of the directory metadata and a value thereof is the directory metadata; determining a matching result between the query path and the key of the prefix directory data table; and determining a query result based on the matching result and the full load directory data table.
An information prediction method, a method of training an autonomous driving model, a device, a medium, and an autonomous driving vehicle, which relate to a field of artificial intelligence technology, and in particular, to fields of computer vision technology and deep learning technology, which may be applied to scenarios such as autonomous driving. Specific implementation scheme of the information prediction method is: acquiring perception data including image data acquired by a sensor in a vehicle and driving data of the vehicle; encoding the image data to obtain an image token sequence corresponding to the image data; encoding the driving data to obtain a driving feature corresponding to the driving data; and generating, using a generative model, a predicted token sequence corresponding to the image token sequence and a control information for the vehicle based on the driving feature and the image token sequence.
B60W 60/00 - Drive control systems specially adapted for autonomous road vehicles
G05B 13/02 - Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
G06V 10/40 - Extraction of image or video features
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
22.
METHOD AND APPARATUS FOR CONFIGURING SCREEN, SYSTEM, ELECTRONIC DEVICE, AND STORAGE MEDIUM
The present disclosure provides a method and apparatus for configuring a screen, a system, an electronic device, a storage medium, and a program product, relating to the technical field of data processing, in particular to the technical field of display control and the technical field of large models. The specific implementation scheme comprises: on the basis of a received screen configuration statement, determining a display area to be configured and corresponding operation information; on the basis of layout information and the operation information of the display area, generating a screen configuration instruction; and, on the basis of the screen configuration instruction, determining display content and outputting the display content to the display area.
The disclosure provides an audio and video synchronization detection method, an audio and video synchronization detection device, an electronic equipment and a terminal, and relates to a field of image processing, in particular to the technical fields of computer vision, artificial intelligence and the like. The method includes: extracting image data and audio data of a video segment of a target length; obtaining a plurality of face image lists by performing face detection and tracking based on the extracted image data; extracting mouth features corresponding to each face image list based on a traversal result of the face image list, in which the mouth features are used for representing changes in lip shape; and determining a synchronization result of the video segment based on the audio data and the mouth features.
H04N 21/439 - Processing of audio elementary streams
G06T 7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronizing decoder's clock; Client middleware
24.
IMAGE ENCODING METHOD, DEVICE AND ELECTRONIC EQUIPMENT
The disclosure provides a method for coding an image. The method includes: obtaining a contour mask image of an original image; obtaining an enhancement image by performing enhancement on the original image based on the contour mask image; and obtaining a coding stream and a reconstructed image of the original image based on the contour mask image and the enhancement image.
The present disclosure relates to the technical field of data processing, in particular to the technical fields such as map technology, indexing technology, trajectory positioning, and Internet of things technology, and provides a data processing method and apparatus, an electronic device and a computer-readable storage medium. The specific implementation solution comprises: acquiring index data, wherein the index data is obtained by clustering a plurality of area of interest spatial surfaces, the index data comprises a global index and a plurality of local indexes, the local indexes have one-to-one correspondence to clustering results and are relational indexes of one or more area of interest spatial surfaces comprised in the clustering results, and the global index is a relational index of the plurality of clustering results; and on the basis of the index data, acquiring from among full trajectory data the trajectory data corresponding to at least one area of interest spatial surface.
The present disclosure relates to the technical field of computers, and particularly relates to the technical fields of autonomous driving and artificial intelligence. Provided are an autonomous driving model based on temporal recursive autoregressive inference, and a method, an apparatus and a vehicle. In the autonomous driving model, a coding layer is configured to code sensor information of a current moment, so as to obtain a current scenario representation; a trajectory planning layer is configured to determine a driving trajectory from the current moment to a future moment on the basis of a historical scenario representation of the current moment; and an inference layer is configured to determine a predicted scenario representation of at least one future moment and a historical scenario representation of the future moment on the basis of the current scenario representation, the historical scenario representation and current prompt information, wherein the prompt information at least comprises the driving trajectory. Therefore, the autonomous driving model can use an integrated inference layer to realize the learning of historical information and the prediction of future information, so that the model can perform future prediction while learning the history, and thus the prediction effect of the model is improved.
The present disclosure relates to the technical field of computers, in particular to the technical field of autonomous driving and artificial intelligence, and provides a generative diffusion model-based autonomous driving model, a method, an apparatus, and a vehicle. An encoding layer in an autonomous driving model is configured to encode current perception information of an autonomous driving vehicle, to obtain a discrete spatial representation of a current scene, a prediction layer is configured to perform discrete diffusion on the basis of at least one scene discrete spatial representation comprising the discrete spatial representation of the current scene, to determine a prediction spatial representation at a future moment, and a decoding layer is configured to decode the prediction spatial representation, to obtain autonomous driving decision information at the future moment. Thus, according to the autonomous driving model, a generative diffusion model-based output can be used to determine the autonomous driving decision of the autonomous driving model. By improving the accuracy of future prediction, autonomous driving decision and prediction effects are further improved.
Provided are a map updating method and apparatus, a model training method and apparatus, an electronic device, and a medium, relating to the technical field of artificial intelligence, and in particular to the fields such as automatic driving, intelligent traffic, computer vision, and image processing. A specific implementation solution is: for a same piece of position information, determining associated historical map data and M pieces of collected image data based on a time sequence, wherein M is an integer greater than or equal to 1; respectively encoding the historical map data and the M pieces of collected image data to obtain a historical map feature and M collected image features; determining position information and category information of a target instance on the basis of the historical map feature and the M collected image features, wherein the target instance represents lane information; and updating the map data on the basis of the position information and the category information of the target instance.
G06F 16/56 - Information retrieval; Database structures therefor; File system structures therefor of still image data having vectorial format
G06F 16/587 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
G06V 20/70 - Labelling scene content, e.g. deriving syntactic or semantic representations
G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
G01C 21/00 - Navigation; Navigational instruments not provided for in groups
29.
VIDEO RENDERING METHOD FOR LIVE BROADCAST SCENE, ELECTRONIC DEVICE AND STORAGE MEDIUM
Provided is a video rendering method for a live broadcast scene, relating to the field of live broadcast and the field of large model. The method includes: recording a live broadcast of an anchor to obtain a first video stream; performing speech recognition on live speech in the first video stream to obtain first text information; determining topic popularity of the live broadcast based on audience response information in a process of recording the live broadcast and the first text information; determining corresponding reply text information based on the first text information when the topic popularity of the live broadcast meets a first set condition; rendering virtual characters based on the reply text information to obtain a second video stream; and generating a third video stream of the anchor chatting with the virtual characters based on the first video stream and the second video stream.
Provided are a resource processing method and apparatus based on a hybrid content delivery network (CDN) system and a device. The method is performed by a CDN edge node in the hybrid CDN system and includes: in response to a user request, requesting a resource from a CDN parent node, and determining, by the CDN parent node, whether the user request satisfies a hybrid pull condition for pulling the resource from a CDN hybrid node; and when the user request satisfies the hybrid pull condition, acquiring a to-be-accessed first resource from the CDN hybrid node.
A method for training a multimodal large model includes: obtaining first training data and second training data; obtaining an initial multimodal large model, wherein the multimodal large model comprises a backbone network and multiple codec networks corresponding to the multiple non-textual modalities; and the multiple codec networks perform encoding and decoding based on a same multimodal word list; performing a joint training on the multiple codec networks and the multimodal word list based on the data under the multiple non-textual modalities; and training the backbone network based on the multimodal sample reference data and the sample generation data under the target task in the second training data. The multiple codec networks perform the encoding and decoding based on the same multimodal word list, which reduces the difficulty and the cost of the model training.
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
32.
AUTONOMOUS DRIVING MODEL, METHOD, APPARATUS AND VEHICLE CAPABLE OF ACHIEVING MULTI-MODAL INTERACTION
Provided are an autonomous driving model, method, apparatus and vehicle capable of achieving multi-modal interaction. An input layer (210) in an autonomous driving model (200) is configured to receive historical decision information (201), sensing information (202), traffic information (203) and interaction information (204) at a current moment; an encoding layer (220) is configured to encode input information; an autoregressive inference layer (230) is configured to obtain a hidden state for a next moment; and a decoding layer (240) is configured to perform decoding so as to obtain interaction information for the next moment and autonomous driving decision-making information for the next moment. The autonomous driving model (200) may understand a current driving environment by inferring the sensing information (202), the traffic information (203) and the interaction information (204), and gains a better understanding of the influence of historical operations on the autonomous driving process by inferring historical decision-making data, thereby making an output result of the autonomous driving model achieve better interpretability and controllability.
The present disclosure relates to the field of computer technology and particularly relates to the technical fields of autonomous driving and artificial intelligence. Provided are an autonomous driving method, apparatus and vehicle capable of following instructions for self-recovery. The autonomous driving method comprises: acquiring input information; encoding the input information to obtain an input tensor corresponding to the input information; performing autoregressive inference on the input tensor to obtain a hidden state for a first moment after a current moment; and performing decoding on the basis of the hidden state of the first moment to obtain interaction information for the first moment and autonomous driving decision-making information for the first moment, wherein the interaction information for the first moment comprises a signal used for indicating that assistance is required during autonomous driving. In this way, prompt information for natural language interaction can be used to instruct an autonomous driving model to control a vehicle, thereby achieving the rapid generation of a self-recovery scheme. In addition, the vehicle can perform autonomous self-recovery on the basis of self-recovery scheme instructions.
The present disclosure relates to the technical field of artificial intelligence such as intelligent transportation and computer vision, and provides a method and apparatus for detecting failure to yield to pedestrians of a vehicle, an electronic device, and a readable storage medium. The method for detecting failure to yield to pedestrians of a vehicle comprises: acquiring a marked crosswalk area on the basis of surveillance images in a surveillance video stream; for each surveillance image frame in the surveillance video stream, acquiring the intersection over union between a target vehicle and the marked crosswalk area in the surveillance image frame, and selecting multiple surveillance image frames having intersection over union greater than a preset intersection over union threshold as target surveillance images corresponding to the target vehicle; and sequentially performing human detection on each target surveillance image frame in descending order of intersection over union, and when it is determined that a human detection result corresponding to a current target surveillance image satisfies a preset requirement, determining that the target vehicle is a regulation-violating vehicle which does not yield to pedestrians. The present disclosure can expand the detection scenario, reduce the computing resources required for detection, and improve the detection efficiency while detecting whether vehicles yield to pedestrians.
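The frame selection step of the yield detection method above can be sketched as follows. This is a minimal illustration, assuming the marked crosswalk area and the vehicle detections are axis-aligned boxes; `iou`, `find_violation_frame`, and the `pedestrian_in_frame` callback are hypothetical names and are not taken from the disclosure.

```python
from typing import Callable, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed box format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def find_violation_frame(
    vehicle_boxes: List[Box],
    crosswalk: Box,
    threshold: float,
    pedestrian_in_frame: Callable[[int], bool],
) -> Optional[int]:
    """Return the index of the first frame, in descending IoU order,
    whose human detection result satisfies the (hypothetical) callback."""
    scored = [(iou(box, crosswalk), idx) for idx, box in enumerate(vehicle_boxes)]
    # Keep only frames whose IoU exceeds the preset threshold, highest first.
    targets = sorted((s for s in scored if s[0] > threshold), reverse=True)
    for _, idx in targets:
        if pedestrian_in_frame(idx):
            return idx  # the vehicle is flagged; remaining frames are skipped
    return None
```

Checking frames in descending IoU order lets the loop stop at the first frame that satisfies the pedestrian condition, so the human detector runs only on frames that passed the threshold and stops as soon as one qualifies.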
In the field of cloud networks and network security, which may be applied to intelligent cloud scenarios, a cloud network message processing method includes: obtaining a cloud network message; determining, from at least one type of pre-configured candidate security device, a target type of candidate security device corresponding to the cloud network message; in the case that there are multiple candidate security devices of the target type, determining a target security device from the multiple candidate security devices of the target type based on session information included in the cloud network message, where cloud network messages with same session information correspond to a same target security device; sending the cloud network message to the target security device for security processing, and sending the cloud network message having been security processed by the target security device to a destination.
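One way to satisfy the requirement that messages with the same session information reach the same security device is to hash the session information. The sketch below assumes this hashing approach, which the abstract does not specify; `pick_device`, `route`, and the message keys `device_type` and `session` are hypothetical.

```python
import hashlib
from typing import Dict, List


def pick_device(session_info: str, devices: List[str]) -> str:
    """Map session information to one device deterministically."""
    digest = hashlib.sha256(session_info.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(devices)
    return devices[index]


def route(message: Dict[str, str], pool: Dict[str, List[str]]) -> str:
    """Choose the target security device for a cloud network message."""
    # Step 1: select the candidates of the target device type.
    candidates = pool[message["device_type"]]
    if len(candidates) == 1:
        return candidates[0]
    # Step 2: several candidates exist, so use the session information
    # to keep every message of one session on the same device.
    return pick_device(message["session"], candidates)
```

A plain modulo over the hash remaps most sessions when the device list changes; a deployment that needs stable mappings under scaling would likely use a different scheme.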
The present disclosure relates to the technical fields of artificial intelligence, such as cloud services, big data and large language models, and provides an algorithm service deployment method and apparatus, an electronic device, and a readable storage medium. The algorithm service deployment method comprises: a first server acquiring service operation data and service information of a target algorithm service, wherein the service operation data comprises one of an application program interface, a mirror image file and a model file; determining a service access type on the basis of the service property of the target algorithm service, and acquiring a target access method corresponding to the service access type, wherein the service access type comprises one of application program interface access, mirror image access and model access; and deploying the target algorithm service at the first server on the basis of the target access method, the service operation data and the service information. According to the present disclosure, the first server supports deployment of algorithm services corresponding to different service access types, so that the first server can have stronger service deployment performance.
A motion estimation method, apparatus, electronic device, storage medium, and computer program product are disclosed, which relate to the field of artificial intelligence, specifically cloud storage, cloud computing, and video encoding. A method for motion estimation comprises: determining candidate search spaces and candidate search starting points based on a lookahead motion vector and a predicted search starting point of a current block; determining a target search starting point from the candidate search starting points and determining a target search space from the candidate search spaces; performing a search based on the target search starting point and the target search space to obtain an initial motion estimation result for the current block; and obtaining a target motion estimation result for the current block based on the initial motion estimation result.
The present disclosure relates to the technical field of artificial intelligence, in particular to the technical fields of computer vision, deep learning and the like, and provides a target identification method and device. The specific implementation comprises: obtaining a video stream collected by a rotatable camera device; determining a preset position of the camera device on the basis of a target in an image frame corresponding to the video stream; determining a still image frame and a fixed still object in the still image frame on the basis of the preset position and the video stream; and determining a position change result of the target in the video stream on the basis of the still image frame and the fixed still object. The implementation improves the accuracy of position change identification of targets.
Provided are a query processing method based on a large language model, an electronic device, and a storage medium. The query processing method based on a large language model includes acquiring a to-be-processed target query; generating a prompt based on a to-be-used target data model, target format information of a specified data format, and the target query; inputting the prompt into the large language model to obtain a target parsing result of the specified data format outputted by the large language model; and modifying the target parsing result based on the target data model.
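The final step above, modifying the parsing result based on the data model, could take many forms. As a minimal sketch, assume the specified data format is a JSON object and that "modifying" means dropping keys the data model does not define; both are assumptions, and `repair_parse_result` is a hypothetical name.

```python
import json
from typing import Any, Dict, Set


def repair_parse_result(raw_output: str, model_fields: Set[str]) -> Dict[str, Any]:
    """Parse the model output and keep only fields known to the data model."""
    parsed = json.loads(raw_output)
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object from the language model")
    # Discard keys the language model invented that the data model lacks.
    return {key: value for key, value in parsed.items() if key in model_fields}
```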
A method of training a deep learning model and a method of synthesizing a speech are provided, which relate to a field of artificial intelligence technology, in particular to fields of large model, large language model, generative model, deep learning, and speech processing technologies. The method of training a deep learning model includes: determining a reference speech feature of a sample speech, the reference speech feature being associated with a prosodic feature of the sample speech; retrieving a speech library using a sample text corresponding to the sample speech, so as to obtain a pronunciation expression feature of the sample text; inputting the pronunciation expression feature into the deep learning model to obtain an output speech feature; determining a loss of the deep learning model according to the reference speech feature and the output speech feature; and adjusting a parameter of the deep learning model according to the loss.
The present disclosure relates to the technical field of artificial intelligence, and specifically to technical fields such as computer vision, deep learning and big data. Provided are a rainfall identification method and apparatus, a model training method and apparatus, and a device and a storage medium, which may be applied in scenarios such as smart cities and emergency management. The rainfall identification method comprises: processing a target video collected by a target camera, so as to obtain an initial rainfall identification result; determining a target rainfall amount station, the distance between which and the target camera meets a preset condition, and acquiring target rainfall amount data of the target rainfall amount station; and on the basis of the initial rainfall identification result and the target rainfall amount data, determining a target rainfall identification result. The present disclosure may improve the accuracy of rainfall identification.
Provided are a query processing method based on a large language model, a prompt construction method, an electronic device, and a storage medium. The query processing method includes acquiring a to-be-processed target query; acquiring a data field in a target data model and acquiring target format information of a specified data format; constructing a prompt based on the data field in the target data model, the target format information, and the target query; and inputting the prompt into the large language model to obtain a target format result outputted by the large language model.
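The prompt construction step above can be illustrated as follows. This is a sketch only: the wording of the prompt, the function name `build_prompt`, and the representation of data fields as a name-to-description mapping are assumptions, not details from the abstract.

```python
from typing import Dict


def build_prompt(fields: Dict[str, str], format_info: str, query: str) -> str:
    """Combine data fields, target format information, and the query."""
    field_lines = "\n".join(f"- {name}: {desc}" for name, desc in fields.items())
    return (
        "Translate the user query into a structured request.\n"
        f"Available data fields:\n{field_lines}\n"
        f"Respond only in this format: {format_info}\n"
        f"Query: {query}"
    )
```

The returned string would then be passed to whichever large language model the system uses; that call is omitted because the abstract names no specific model or API.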
A method for predicting a structure of a protein complex includes: obtaining an initial coordinate of each amino acid residue in a target protein complex, and obtaining a target residue pair feature, a first multiple sequence alignment (MSA) feature and a second MSA feature of each protein monomer in the target protein complex; and inputting the initial coordinate of each amino acid residue, and the target residue pair feature, the first MSA feature and the second MSA feature of each protein monomer into an N-level fold iteration network layer, and obtaining a target coordinate of each amino acid residue by predicting a torsion angle, a position transformation at residue level and a position transformation at monomer chain level of each amino acid residue via the N level fold iteration network layer, to obtain a predicted structure of the protein complex.
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
44.
HUMAN-COMPUTER INTERACTION METHOD AND APPARATUS, ELECTRONIC DEVICE, AND STORAGE MEDIUM
Provided are a human-computer interaction method and apparatus, an electronic device and a storage medium, relating to the technical field of artificial intelligence, in particular to the technical fields of deep learning, natural language processing and large models. The specific implementation solution comprises: in response to a human-computer interaction request and on the basis of a first dialogue text comprised in the human-computer interaction request, determining from among a plurality of plug-ins registered in a large language model a first target plug-in related to the first dialogue text; obtaining a second dialogue text on the basis of the first dialogue text and a description text of the first target plug-in; and inputting the second dialogue text into the large language model to obtain a reply text.
A method for reference frame selection, an apparatus for reference frame selection, an electronic device and a storage medium are provided, which relate to the field of data processing technology, in particular to the fields of video coding technology and unsupervised learning technology. The method includes: acquiring a current frame to be processed and determining attribute information of the current frame; selecting candidate reference frames from a reference frame set according to the attribute information; clustering the selected candidate reference frames to obtain at least one cluster; and selecting one candidate reference frame from each of the at least one cluster and adding each selected candidate reference frame to a reference frame list associated with the current frame. The technical solution provided herein can improve the accuracy of reference frame selection.
H04N 19/105 - Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
46.
MODEL OPERATOR PROCESSING METHOD AND DEVICE, ELECTRONIC EQUIPMENT AND STORAGE MEDIUM
A method for processing a model operator includes: determining an operator set for model networking, wherein the operator set comprises a plurality of operators; determining a storage amount occupied by an output tensor of each operator in the operator set and a computation time period consumed in a forward computation of each operator in the operator set; and determining a first operator participating in recomputation in a model from the operator set, based on the storage amounts and the computation time periods of the plurality of operators.
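The abstract states only that recomputation candidates are chosen from the storage amounts and forward computation times. One plausible heuristic, shown here as an assumption rather than the claimed method, is to prefer operators that free the most memory per unit of recomputation time; `Op`, `choose_recompute_ops`, and the memory target are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Op:
    name: str
    output_bytes: int   # storage occupied by the operator's output tensor
    forward_ms: float   # time consumed by one forward computation


def choose_recompute_ops(ops: List[Op], bytes_to_free: int) -> List[str]:
    """Greedily pick operators with the best memory-saved-per-time ratio."""
    ranked = sorted(
        ops,
        key=lambda op: op.output_bytes / max(op.forward_ms, 1e-9),
        reverse=True,
    )
    chosen, freed = [], 0
    for op in ranked:
        if freed >= bytes_to_free:
            break
        chosen.append(op.name)  # this output is dropped and recomputed later
        freed += op.output_bytes
    return chosen
```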
A method is provided that includes: obtaining first urban data of a first sample urban region; inputting the first urban data into a multi-modal foundation model to obtain respective predicted vector representations of a plurality of first data segments; obtaining a plurality of general-purpose foundation models that are pre-trained; for each general-purpose foundation model: generating a vector representation label of a first data segment of a corresponding data modality by using the general-purpose foundation model; and determining a knowledge distillation loss of the general-purpose foundation model based on the vector representation label and a predicted vector representation of the first data segment; and adjusting parameters of the multi-modal foundation model based on at least respective knowledge distillation losses of the plurality of general-purpose foundation models.
The present disclosure relates to the technical field of computers, and in particular to the technical fields of artificial intelligence (AI), neural network models, and smart city, and provides a target detection method based on a multi-task AI large model, and a model training method based on a multi-task AI large model. The specific implementation scheme of the target detection method comprises: recognizing a target object of an image under test to obtain a first recognition result; on the basis of the confidence level of the first recognition result and a first threshold corresponding to a first precision, determining a first alarm object from the first recognition result as a detection result; when a trigger condition is met, performing target detection on an image under supplementary test corresponding to the first recognition result to obtain a second recognition result; on the basis of the confidence level of the second recognition result and a second threshold corresponding to a second precision, determining a second alarm object from the second recognition result; and updating the detection result on the basis of the second alarm object. The present disclosure can ensure high precision of target detection and reduce missed detections.
The present application relates to the technical field of computers, and particularly relates to the field of source codes. Provided are a method and apparatus for generating simulated data, and an electronic device and a storage medium. The specific implementation solution comprises: acquiring a database table of a project to be processed, and determining data simulation configuration information corresponding to a field in the database table, wherein the database table comprises at least one field, which represents the type of simulated data required by said project, and the data simulation configuration information represents a generation means for the simulated data and the format of the simulated data; and on the basis of the data simulation configuration information, generating the simulated data under the field in the database table. Different pieces of configuration information are set for different fields, and when it is necessary to simulate data, corresponding configuration information is searched for to automatically generate simulated data, thereby improving the simulation efficiency of the data.
Disclosed are a sentiment analysis method and apparatus, a large language model training method and apparatus, an electronic device, a storage medium, a computer program product and a computer program. The sentiment analysis method comprises: acquiring first target text; extracting from the first target text an object to be analyzed; generating second target text on the basis of the first target text and said object, wherein the second target text comprises task prompt text, and the task prompt text is used for prompting a large language model to execute a sentiment analysis task on said object on the basis of the first target text; and inputting the second target text into the large language model to obtain the sentiment polarity of said object.
The present application provides a large language model-based event processing method and apparatus, a device and a medium. The large language model-based event processing method comprises: acquiring a question input by a user; acquiring pre-generated event information; fusing the question and the event information to obtain a fusion result; and inputting the fusion result into a pre-trained large language model to obtain reply content corresponding to the question.
The present disclosure relates to the technical field of artificial intelligence, and specifically to the technical fields such as large models and natural language understanding, and provides a chart generation method and apparatus, a device, and a storage medium. The chart generation method comprises: acquiring target text content and target prompt information; on the basis of the target text content and the target prompt information, using a first pre-trained language model to generate structured information, wherein the structured information is used for generating a target chart; on the basis of the structured information, using a second pre-trained language model to generate the target chart; and displaying the target chart. The present disclosure can improve the chart generation efficiency and accuracy.
The present disclosure relates to the technical field of video processing, in particular to the technical field of monitoring video processing, and provides a monitoring video processing method, a monitoring video processing apparatus, an electronic device, a storage medium, and a program product. The specific implementation solution is: acquiring a monitoring video stream to be processed; performing semantic segmentation on video frames in said monitoring video stream to obtain semantic tags of the video frames; on the basis of the semantic tags of the video frames and scenario determination rules, determining service scenarios to which the video frames are applicable; and determining a scenario tag of said monitoring video stream on the basis of the service scenarios to which the plurality of video frames in said monitoring video stream are applicable.
The present disclosure relates to the technical field of code generation, and provides a code generation method and apparatus, a device and a storage medium. The method comprises: in response to receiving operation information of a user, requesting a node backend to create a corresponding project and a project file; in response to determining that the project and the project file are created, requesting a java backend to create a data source; calling the java backend to perform project engineering initialization and code assembly; and in response to determining that the code assembly is completed, performing project verification and code export.
Disclosed are a student model generation method and apparatus based on a large model, and an electronic device, a storage medium, a computer program product and a computer program. The method comprises: acquiring a sample data set; inputting input data and prompt information into a large model, so as to acquire first content generated by the large model; converting the first content into a first prediction result which has the same type as a labeling result; inputting the input data into an initial student model, so as to acquire a second prediction result that is output by the initial student model; determining a correction gradient on the basis of respective differences between the second prediction result and the first prediction result and between the second prediction result and the labeling result; and on the basis of the correction gradient, correcting the initial student model, so as to acquire a target student model.
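The correction gradient above combines two differences: student versus large model, and student versus label. A minimal sketch with a one-parameter student `w * x` and squared-error terms follows; the squared error, the mixing weight `alpha`, and the function names are assumptions made for illustration.

```python
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (input x, large-model prediction, label)


def correction_gradient(w: float, x: float, teacher_pred: float,
                        label: float, alpha: float = 0.5) -> float:
    """Gradient of alpha*(s - teacher)^2 + (1 - alpha)*(s - label)^2 w.r.t. w,
    where s = w * x is the student's prediction."""
    student_pred = w * x
    d_teacher = student_pred - teacher_pred
    d_label = student_pred - label
    return 2.0 * x * (alpha * d_teacher + (1.0 - alpha) * d_label)


def train_student(samples: List[Sample], w: float = 0.0,
                  lr: float = 0.1, steps: int = 200) -> float:
    """Correct the initial student parameter with plain gradient descent."""
    for _ in range(steps):
        for x, teacher_pred, label in samples:
            w -= lr * correction_gradient(w, x, teacher_pred, label)
    return w
```

When the large model's prediction and the label disagree, the student settles between them, with `alpha` controlling which source it trusts more.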
The present disclosure relates to the technical field of artificial intelligence, and specifically relates to the fields of deep learning and computer vision. Provided are a multi-objective optimization method and apparatus, and a device and a storage medium. The method comprises: acquiring a set of values to be quantized and a set of objectives to be optimized; for each value to be quantized in said set of values, acquiring a quantization coefficient corresponding to said value, and determining a set of adjacent quantization coefficients of the quantization coefficient; on the basis of a set of reconstruction values corresponding to the set of adjacent quantization coefficients, determining a set of reconstruction distortion values; and on the basis of the set of reconstruction distortion values and said set of objectives, determining a target quantization coefficient. The multi-objective optimization method provided in the present disclosure improves the performance of other optimization objectives while ensuring the performance of traditional optimization objectives.
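The selection among adjacent quantization coefficients can be sketched as below. The sketch assumes uniform scalar quantization with step size `step`, adjacent coefficients `q - 1`, `q`, `q + 1`, and a single extra objective approximated by the coefficient magnitude; none of these specifics come from the abstract, and `choose_coefficient` is a hypothetical name.

```python
def choose_coefficient(value: float, step: float, weight: float) -> int:
    """Pick the adjacent quantization coefficient with the lowest combined cost."""
    q = round(value / step)          # initial quantization coefficient
    candidates = [q - 1, q, q + 1]   # assumed set of adjacent coefficients

    def cost(c: int) -> float:
        reconstruction = c * step
        distortion = (value - reconstruction) ** 2
        # abs(c) stands in for a second objective such as coding cost.
        return distortion + weight * abs(c)

    return min(candidates, key=cost)
```

With `weight = 0` the function reduces to nearest-level quantization; a larger weight trades reconstruction distortion for the second objective.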
G06V 10/74 - Image or video pattern matching; Proximity measures in feature spaces
G06V 10/762 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
The present disclosure provides a data operation method, apparatus, device, and storage medium, which relate to the technical field of distributed file systems, and in particular to the technical fields of multi-version concurrency control and log-structured merge trees. The specific implementation scheme is as follows: obtaining a plurality of operation records on at least one piece of data in a file system; determining a target operation record of target data in the plurality of operation records, where the target operation record is a deletion record, and the target operation record is a latest operation record of the target data; and deleting at least one version of the target data in the file system according to the target operation record.
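The version cleanup above can be sketched with an in-memory stand-in for the file system. The record layout `(sequence number, key, operation)`, the `versions` mapping, and the choice to remove every stored version of the deleted data are assumptions; the abstract only requires that at least one version be deleted.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, str, str]  # (sequence number, data key, "put" or "delete")


def purge_deleted(records: List[Record], versions: Dict[str, List[int]]) -> List[str]:
    """Remove stored versions of data whose latest operation record is a deletion."""
    latest: Dict[str, Tuple[int, str]] = {}
    for seq, key, op in records:
        if key not in latest or seq > latest[key][0]:
            latest[key] = (seq, op)

    purged = []
    for key, (_, op) in latest.items():
        # Only data whose newest record is a deletion is a purge target.
        if op == "delete" and key in versions:
            del versions[key]
            purged.append(key)
    return sorted(purged)
```

Data that was deleted and later rewritten is left alone, because its latest operation record is no longer a deletion.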
G06F 16/16 - File or folder operations, e.g. details of user interfaces specifically adapted to file systems
G06F 16/11 - File system administration, e.g. details of archiving or snapshots
58.
METHOD AND APPARATUS FOR GENERATING IMAGE ADVERSARIAL SAMPLE, METHOD AND APPARATUS FOR TRAINING IMAGE PROCESSING MODEL, IMAGE PROCESSING METHOD AND APPARATUS, AND DEVICE AND MEDIUM
Provided in the present application are a method and apparatus for generating an image adversarial sample, a method and apparatus for training an image processing model, an image processing method and apparatus, and a device and a medium. The method for generating an image adversarial sample comprises: acquiring an original image sample, and acquiring a feature vector map corresponding to the original image sample (S110); performing image scaling processing on the feature vector map according to an image scale of the original image sample, so as to obtain a standard-scale feature map (S120); using an attention mechanism network of a target type to process the standard-scale feature map, so as to obtain an attention influence map (S130); and on the basis of the attention influence map, adding a disturbance to the original image sample, so as to obtain an image adversarial sample corresponding to the original image sample (S140).
The present disclosure provides a method and apparatus for generating a 3D scene based on a large language model (LLM), an electronic device, and a storage medium, which relate to the field of artificial intelligence technologies, particularly the fields of three-dimensional modeling technologies, large language model technologies, or the like. The three-dimensional scene generating method based on an LLM includes: processing description information of a target three-dimensional scene to obtain label information in the description information; generating a query operation prompt for the LLM based on the label information, and acquiring, by the LLM based on the query operation prompt, a target asset set matched with the label information, the target asset set including a target asset in the target three-dimensional scene, target material information of the target asset and target scene attribute information of the target asset; and generating the target three-dimensional scene based on the target asset set.
The present disclosure provides a method and apparatus for transferring a facial expression of a digital human, an electronic device, and a storage medium, which relate to the fields of augmented reality technologies, virtual reality technologies, computer vision technologies, deep learning technologies, or the like, and can be applied to scenarios such as the metaverse, a virtual digital human, or the like. An implementation includes: screening an identification of a target reference model matched with an object model from a preset reference model library, the reference model library including a plurality of reference models; acquiring an expression library of the target reference model based on the identification of the target reference model; and transferring a last frame of an expression in the expression library of the target reference model into the object model to obtain a last frame of an expression of the object model.
Provided is a method for processing video coding. The method includes: according to neighboring image blocks of a target image block in a video frame, determining whether the target image block belongs to a candidate caption region; in response to determining that the target image block belongs to the candidate caption region, generating a pixel histogram of the target image block; according to the pixel histogram of the target image block, determining a region type to which the target image block belongs, where the region type is a caption region or a non-caption region; and according to the region type to which the target image block belongs, determining a target coding mode for the target image block.
H04N 19/132 - Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
H04N 19/136 - Incoming video signal characteristics or properties
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
H04N 19/593 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
62.
METHOD AND APPARATUS FOR IMAGE PROCESSING, ELECTRONIC DEVICE AND STORAGE MEDIUM
A method for image processing, including: obtaining an image to be processed; determining a portrait area in the image to be processed, and cropping, based on the portrait area, a target background image from the image to be processed; obtaining a portrait by performing portrait matting on the image to be processed and obtaining an enlarged portrait by enlarging the portrait, wherein a height of the enlarged portrait is greater than a height of the target background image; and generating, based on the target background image and the enlarged portrait, a target image.
Provided is a data query optimization method, an electronic device and a storage medium, relating to the field of data processing technology and in particular to the technical fields of distributed database, big data, cloud computing and others. The method includes: determining a plurality of candidate execution plans for a target query request; determining execution costs of the plurality of candidate execution plans; updating the execution costs of the plurality of candidate execution plans based on monitoring data of data nodes involved in the plurality of candidate execution plans, to obtain final costs of the plurality of candidate execution plans; and screening out a final execution plan for the target query request from the plurality of candidate execution plans based on the final costs of the plurality of candidate execution plans.
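The cost update above can be illustrated with a simple rule: scale each plan's estimated cost by the load of its busiest data node. The scaling rule, the load values in the range 0 to 1, and the name `pick_plan` are assumptions for this sketch; the abstract does not say how monitoring data changes the cost.

```python
from typing import Dict, List, Tuple

Plan = Tuple[float, List[str]]  # (estimated execution cost, data nodes involved)


def pick_plan(plans: Dict[str, Plan],
              node_load: Dict[str, float]) -> Tuple[str, Dict[str, float]]:
    """Return the plan with the lowest final cost and all final costs."""
    final: Dict[str, float] = {}
    for name, (base_cost, nodes) in plans.items():
        # Monitoring data: the busiest node involved penalizes the whole plan.
        worst = max((node_load.get(n, 0.0) for n in nodes), default=0.0)
        final[name] = base_cost * (1.0 + worst)
    best = min(final, key=final.get)
    return best, final
```

A plan that looks cheaper on static estimates can lose to a nominally costlier one when its data nodes are heavily loaded, which is the situation the update step is meant to catch.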
Provided is a method for processing an oracle region cache, an electronic device and a storage medium, relating to the field of data processing technology, and in particular to the fields of big data, cloud computing, distributed database, intelligent search and other technologies. The method includes: obtaining a benefit parameter of a region to be processed, wherein the benefit parameter is used to represent a difference between benefit and cost of setting the region to be processed in the oracle region cache; and selecting the region to be processed to update the oracle region cache when the benefit parameter of the region to be processed meets a target condition.
A digital human generation method, an electronic device and a storage medium are disclosed. The solution relates to the fields of augmented reality technologies, virtual reality technologies, computer vision technologies, deep learning technologies, or the like, and can be applied to scenarios, such as metaverse, a virtual digital human, or the like. An implementation includes: acquiring a corresponding target object model based on a picture of a to-be-generated digital human; acquiring a corresponding point cloud of a head key feature in the picture from a pre-configured feature library based on the head key feature; and fusing the point cloud of the head key feature in the target object model to obtain a digital human figure.
The disclosure provides a code completion method based on a big model. The method includes: determining a first code element where a position to be completed is located in a first code file to be completed; determining a second code file having a dependency relationship with the first code file from a development project to which the first code file belongs; determining, according to the first code element, a second code element whose correlation with the first code element meets a preset condition, in which the second code element belongs to at least one of the first code file or the second code file; and generating a target code corresponding to the position to be completed through a big model based on a signature of the second code element.
The present disclosure relates to the technical field of artificial intelligence, in particular to the technical fields such as intelligent office, cloud computing, generative dialogue systems, and large language models (LLMs), and provides an LLM-based data query method and apparatus, a device, and a storage medium. The LLM-based data query method comprises: determining a target data table from among candidate data tables on the basis of a query question, wherein the target data table comprises candidate attributes; determining a target attribute from among the candidate attributes on the basis of the query question; generating query instruction prompt information of an LLM on the basis of the query question, table information of the target data table, and attribute information of the target attribute, and using the LLM to generate a query instruction on the basis of the query instruction prompt information; and on the basis of the query instruction, querying from within the target data table to obtain a query answer corresponding to the query question. The present disclosure can improve the data query efficiency and accuracy.
A method and apparatus for processing an access request, and a computer readable storage medium are provided. The method includes acquiring identification information and an IP address of an access account from an authentication message; determining permission configuration information matching the identification information; generating an access control entry based on the permission configuration information and the IP address; and processing an access request of the access account based on the access control entry.
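The four steps of this method can be sketched as below. The permission table, the message fields (`account_id`, `ip`) and the port-based permission model are assumptions made for illustration; the abstract does not say what the permission configuration contains.

```python
from dataclasses import dataclass

# Hypothetical permission configuration, keyed by account identifier.
PERMISSIONS = {
    "alice": {"allowed_ports": {22, 443}},
    "bob": {"allowed_ports": {443}},
}


@dataclass(frozen=True)
class AccessControlEntry:
    ip: str
    allowed_ports: frozenset


def build_entry(auth_message: dict) -> AccessControlEntry:
    # Steps 1-3: read the identifier and IP address, look up the matching
    # permission configuration, and combine both into one entry.
    config = PERMISSIONS[auth_message["account_id"]]
    return AccessControlEntry(
        ip=auth_message["ip"],
        allowed_ports=frozenset(config["allowed_ports"]),
    )


def is_allowed(entry: AccessControlEntry, source_ip: str, port: int) -> bool:
    # Step 4: process an access request against the generated entry.
    return source_ip == entry.ip and port in entry.allowed_ports


entry = build_entry({"account_id": "bob", "ip": "10.0.0.7"})
```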
Provided is a method of deploying a multimodal large model, an electronic device and a storage medium, relating to the field of artificial intelligence technology, and in particular to the fields of deep learning and model deployment. The method includes: splitting a first multimodal large model into a visual part and a linguistic part; determining a first static graph model corresponding to the visual part and a second static graph model corresponding to the linguistic part; and deploying the first multimodal large model based on the first static graph model and the second static graph model.
A method for obtaining a cover image includes: obtaining a plurality of first cropped images of an original image corresponding to a candidate resource; obtaining an aesthetic score of each of the plurality of first cropped images; and determining a target cover image of the candidate resource from the plurality of first cropped images based on the aesthetic score of each first cropped image.
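The crop-score-select flow can be sketched as follows. The sliding-window crop generator and the `centre_score` function are stand-ins chosen for illustration; the abstract does not specify how crops are produced or how the aesthetic score is computed, and a real system would presumably use a learned scorer.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def candidate_crops(width: int, height: int, crop_w: int, crop_h: int,
                    stride: int) -> List[Box]:
    """Enumerate sliding-window crops that fit inside the original image."""
    boxes = []
    for top in range(0, height - crop_h + 1, stride):
        for left in range(0, width - crop_w + 1, stride):
            boxes.append((left, top, left + crop_w, top + crop_h))
    return boxes


def centre_score(box: Box, width: int = 400, height: int = 200) -> float:
    # Stand-in for an aesthetic model: prefer crops near the image centre.
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    return -abs(cx - width / 2) - abs(cy - height / 2)


def pick_cover(boxes: List[Box], score: Callable[[Box], float]) -> Box:
    """Return the crop with the highest score as the target cover image."""
    return max(boxes, key=score)


boxes = candidate_crops(400, 200, 200, 200, 100)
cover = pick_cover(boxes, centre_score)
```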
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 30/18 - Extraction of features or characteristics of the image
G06V 30/262 - Techniques for post-processing, e.g. correcting the recognition result using context analysis, e.g. lexical, syntactic or semantic context
71.
METHOD AND APPARATUS FOR GENERATING COMMENT INFORMATION BASED ON LARGE MODEL, ELECTRONIC DEVICE AND STORAGE MEDIUM
The disclosure provides a method and an apparatus for generating comment information based on a large model, an electronic device and a storage medium, relates to a technical field of artificial intelligence, and in particular to the technical fields of deep learning, large model, and natural language processing, and the like. The specific technical solution includes: obtaining description information of a resource to be commented on by understanding, based on the large model, the resource to be commented on; obtaining, based on the description information, comment information of the resource to be commented on, in which the comment information includes at least a comment video of the resource to be commented on; and displaying the comment video in a comment section. The intelligent generation of comment videos and texts is realized, improving the accuracy of the comment information, simplifying the comment generation process, and improving the speed of generating comments. Further, by introducing a video comment format, more diverse comment formats are provided for users to select from, greatly enhancing the user experience.
H04N 21/4788 - Supplemental services, e.g. displaying phone caller identification or shopping application communicating with other users, e.g. chatting
G11B 27/031 - Electronic editing of digitised analogue information signals, e.g. audio or video signals
H04L 47/125 - Avoiding congestion; Recovering from congestion by balancing the load, e.g. traffic engineering
72.
TRAFFIC LIGHT PREDICTION METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM
A traffic light prediction method, an apparatus, and an autonomous vehicle are provided. The method includes determining lane line information of a lane where the vehicle is located and information of a target traffic light corresponding to the lane based on current position information of the vehicle, and recording the lane line information and the information of the target traffic light as element information; recognizing an obstacle in the image acquired by the vehicle to obtain obstacle information; associating the element information with the obstacle information to generate topology information, where the topology information is used to represent a binding relationship among a target traffic light, a lane line, and an obstacle; and generating a prediction result of the target traffic light based on the element information, the obstacle information, and the topology information.
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
73.
METHOD FOR TRAINING IMAGE CROPPING MODEL, METHOD FOR PROCESSING IMAGE, ELECTRONIC DEVICE AND STORAGE
Provided is a method for training an image cropping model, a method for processing an image, an electronic device and a storage medium, relating to the field of deep learning and image processing technology. The training method includes: obtaining sample data, wherein the sample data at least includes: a sample image, a first cropped image obtained by cropping the sample image in a first manner, and a second cropped image obtained by cropping the sample image in a second manner; determining a target loss function; and using at least the sample data and the target loss function to perform model training on a preset image cropping model to obtain a target image cropping model.
G06V 10/32 - Normalisation of the pattern dimensions
G06V 10/42 - Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
G06V 10/74 - Image or video pattern matching; Proximity measures in feature spaces
G06V 10/766 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using regression, e.g. by projecting features on hyperplanes
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
74.
METHOD FOR INFORMATION PROCESSING, ELECTRONIC DEVICE, AND STORAGE MEDIUM
A computer-implemented method for information processing includes: obtaining text information, in which the text information includes first text information of a resource to be commented on and second text information of a candidate prompt; selecting a target prompt from the candidate prompts based on the text information; and generating comment information of the resource to be commented on, based on the resource to be commented on and the target prompt.
A method for model training based on a large model includes: determining a first large model as a teacher model of a language model, and performing distillation learning on the language model based on the first large model; inputting a first prompt text into the language model, and obtaining a plurality of first response texts for the first prompt text output by the language model; determining a reference response text for the first prompt text from the plurality of first response texts; and training the language model based on the reference response text for the first prompt text.
A large model-based recommendation method includes: determining description information of interested content corresponding to a target user; inputting a content to be recommended, the description information of interested content and current popular search sentences into a large model to generate at least one recommendation card corresponding to the content to be recommended, in which the recommendation card contains a recommendation word associated with the content to be recommended; obtaining a current behavior characteristic of the target user; and in response to the current behavior characteristic satisfying a display condition of the recommendation card, displaying the recommendation card corresponding to at least one content to be recommended.
There is provided a method for video processing, an electronic device, and a storage medium, which relates to the technical field of image processing, specifically to technical fields such as digital video and image display, which may be used in intelligent cloud and cloud computing scenarios. A specific implementation solution involves: acquiring ambient brightness data of a display device, the display device adopting a standard dynamic range (SDR) technology; obtaining screen brightness data of the display device according to video brightness data of to-be-displayed high dynamic range (HDR) video, metadata of the HDR video, and the ambient brightness data; wherein the video brightness data is obtained by tone mapping according to the metadata; and controlling, by using the screen brightness data, the display device to display the HDR video.
G06T 5/92 - Dynamic range modification of images or parts thereof based on global image properties
G06T 5/90 - Dynamic range modification of images or parts thereof
G06V 10/60 - Extraction of image or video features relating to illumination properties, e.g. using a reflectance or lighting model
G09G 3/22 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes for presentation of an assembly of a number of characters, e.g. a page, by composing the assembly by combination of individual elements arranged in a matrix using controlled light sources
78.
METHOD OF DETERMINING METEOROLOGICAL INFORMATION, ELECTRONIC DEVICE AND STORAGE MEDIUM
A method of determining meteorological information, an electronic device and a storage medium are provided, which relate to a field of artificial intelligence technology, and in particular to fields of deep learning and large models. The method includes performing a feature extraction on meteorological raster data of a target region within a target time period to obtain a meteorological feature vector; inputting to-be-processed meteorological data of the target region within the target time period into a large language model to obtain a text summary including a meteorological information determination manner; performing an information enhancement processing on the meteorological feature vector by using the text summary to obtain an information enhancement result; and performing a self-attention processing on the information enhancement result to obtain a meteorological information determination result output for the to-be-processed meteorological data.
A method of generating a content based on a large model, an electronic device, and a storage medium are provided, which relate to a field of artificial intelligence technology, and in particular to fields of deep learning, natural language processing, computer vision, large models, etc. The method includes performing an intention recognition on an input information in response to receiving the input information; generating a painting knowledge text by invoking a multimodal large model based on an intention for painting knowledge acquisition in response to recognizing the intention for painting knowledge acquisition from the input information; generating a first driving voice and a first action instruction for driving a virtual character according to the painting knowledge text; and broadcasting the painting knowledge text by driving the virtual character according to the first driving voice and the first action instruction.
Provided is a performance optimization method for a model training device, an electronic device, and a storage medium, relating to the fields of deep learning, large model training, and distributed parallel strategies. The method includes: determining communication timing of a current model training device with respect to a target model block at a target sorting position, so as to be able to perform synchronously collective communication with other model training devices of a plurality of model training devices with respect to model blocks at the target sorting position; and performing the collective communication on a backward gradient of the target model block at the communication timing.
A method for processing a query-response information is provided, which relates to a field of artificial intelligence technology, and in particular to fields of deep learning, large models, intelligent query and response, etc. The method for processing a query-response information includes: generating at least one initial response information according to a query information provided by an object; acquiring at least one feedback information corresponding to the at least one initial response information, wherein the feedback information indicates a preference degree of the object for the initial response information; and generating a training sample according to the query information, the at least one initial response information and the at least one feedback information. The present disclosure further provides a method for training a conversational model, an electronic device, and a storage medium.
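One way to turn a query, its initial responses and the feedback into training samples is sketched below. The pairwise "chosen versus rejected" layout and the numeric preference scores are assumptions for illustration; the abstract only states that a training sample is generated from these three inputs.

```python
from itertools import combinations
from typing import Dict, List


def build_preference_samples(query: str, responses: List[str],
                             feedback: List[float]) -> List[Dict[str, str]]:
    """Pair up responses so each sample records which one was preferred."""
    if len(responses) != len(feedback):
        raise ValueError("each response needs exactly one feedback value")
    samples = []
    for (resp_a, pref_a), (resp_b, pref_b) in combinations(
            zip(responses, feedback), 2):
        if pref_a == pref_b:
            continue  # equal preference carries no training signal
        if pref_a > pref_b:
            chosen, rejected = resp_a, resp_b
        else:
            chosen, rejected = resp_b, resp_a
        samples.append({"query": query, "chosen": chosen, "rejected": rejected})
    return samples


samples = build_preference_samples("q", ["a", "b", "c"], [0.9, 0.2, 0.9])
```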
A method for information processing is performed by an electronic device, and the method includes: obtaining a residue sequence A_T that does not carry amino acid information and a first protein backbone structure B_T generated from pure noise; and performing iterative denoising on the residue sequence A_T and the first protein backbone structure B_T, in which, for a t-th denoising, coevolution information of a residue sequence A_{T+1−t} is obtained, and a residue sequence A_{T−t} and a first protein backbone structure B_{T−t} after the t-th denoising are obtained based on the coevolution information and a first protein backbone structure B_{T+1−t}, until the denoising is completed and a target amino acid sequence and a second protein backbone structure are obtained, where t is a positive integer, 1≤t≤T, and T is the number of denoising times.
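The index bookkeeping of the iterative denoising, where the t-th step maps state T+1−t to state T−t, can be sketched as follows. The `toy_step` function is a placeholder, not the disclosed model: it does not compute coevolution information and only mimics the shape of one denoising update.

```python
from typing import Callable, Iterator, List, Optional, Tuple


def denoise_schedule(T: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (t, input index, output index) for each of the T denoising steps."""
    for t in range(1, T + 1):
        yield t, T + 1 - t, T - t


def run_denoising(seq: List[Optional[str]], backbone: float, T: int,
                  step: Callable) -> Tuple[List[Optional[str]], float]:
    """Apply `step` T times, moving from (A_T, B_T) down to (A_0, B_0)."""
    for _t, idx_in, _idx_out in denoise_schedule(T):
        seq, backbone = step(seq, backbone, idx_in)
    return seq, backbone


def toy_step(seq, backbone, idx_in):
    # Placeholder update: reveal one masked residue and halve a noise scale.
    seq = list(seq)
    if None in seq:
        seq[seq.index(None)] = "A"
    return seq, backbone * 0.5


final_seq, final_backbone = run_denoising([None, None, None], 8.0, 3, toy_step)
```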
G06F 30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
G16B 30/00 - ICT specially adapted for sequence analysis involving nucleotides or amino acids
A data query method and apparatus based on a large model, an electronic device, and a storage medium are disclosed, which relate to the field of artificial intelligence, specifically natural language processing, deep learning, and large model technologies, and are applicable to scenarios such as dialogue systems and information retrieval. The method includes: performing entity recognition on a query to obtain a target entity in the query; obtaining a first related content associated with the target entity from internal information, and performing data analysis on the first related content using a large language model (LLM) to obtain a data analysis result; obtaining a second related content associated with the target entity from external information, and performing data generation on the second related content using the LLM to obtain a data generation result; and obtaining a query result corresponding to the query based on the data analysis result and the data generation result.
The disclosure provides a method for optimizing content generated by a large model, an apparatus for optimizing content generated by a large model, an electronic device and a storage medium, and relates to the technical field of artificial intelligence, especially to the technical fields of text processing, large language model and the like. It can be applied to official document processing, automatic contract generation, legal document writing, enterprise internal system management and so on. The method includes: obtaining a question entered by a user, wherein the question is used to instruct a generation of a text of a target type; obtaining a set of target rules corresponding to the target type from a plurality of preset sets of rules, in which the set of target rules includes a plurality of target rules, and the target rules are rules followed by the target type of text; and according to a sequence of the target rules, inputting the plurality of target rules into a large language model sequentially to obtain a target text of the target type generated by the large language model. In this way, the accuracy of generating text following certain rules by the large language model is improved.
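A rough sketch of feeding the target rules to the model one at a time is given below. The rule sets, the prompt wording and the draft-then-revise loop are assumptions for illustration; `llm` is any callable that maps a prompt string to a text string, and no particular model API is implied.

```python
from typing import Callable, Dict, List

# Hypothetical preset rule sets, keyed by text type; list order is the sequence.
RULE_SETS: Dict[str, List[str]] = {
    "contract": [
        "State the parties first.",
        "List obligations as numbered clauses.",
    ],
}


def generate_with_rules(question: str, target_type: str,
                        llm: Callable[[str], str]) -> str:
    """Pass each target rule to the model in order, carrying the draft along."""
    draft = ""
    for index, rule in enumerate(RULE_SETS[target_type], start=1):
        prompt = (
            f"Task: {question}\n"
            f"Text type: {target_type}\n"
            f"Current draft:\n{draft}\n"
            f"Rule {index}: {rule}\n"
            "Rewrite the draft so that it follows this rule."
        )
        draft = llm(prompt)
    return draft
```

Presenting one rule per call keeps each prompt short, at the cost of one model call per rule.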
A method for generating a dialogue includes acquiring a current first question statement and historical dialogue information associated with the first question statement; acquiring, from a knowledge base, a first knowledge item associated with the first question statement and a second knowledge item having a question-answer relationship with the first knowledge item; obtaining a first reply statement output by a generative model by inputting the first question statement, the first knowledge item, and the historical dialogue information into the generative model; evaluating the first reply statement based on the first question statement, the first knowledge item, and the second knowledge item; and outputting the first reply statement in response to the first reply statement passing evaluation.
A training method and apparatus for a full atomic structure prediction model. The method includes: obtaining structural information of a biomolecule and a first dynamic trajectory of the biomolecule, in which the first dynamic trajectory includes position information of atoms in the biomolecule at different time points; adding noise to the first dynamic trajectory to obtain a second dynamic trajectory; encoding the structural information to obtain encoded features; decoding the encoded features and the second dynamic trajectory to obtain a target dynamic trajectory; and training an initial full atomic structure prediction model based on a difference between the target dynamic trajectory and the first dynamic trajectory, to obtain the full atomic structure prediction model.
The present disclosure relates to the technical field of artificial intelligence, and in particular relates to the technical fields of deep learning, natural language processing, etc. Provided are a model compression method and apparatus, a training method and apparatus, and a text data processing method and apparatus. A specific implementation of the model compression method involves: according to the number of concurrently deployed calculation units, dividing initial model parameters of a model to be compressed, so as to obtain initial local model parameters corresponding to the plurality of calculation units, wherein the calculation units are used for processing the same task associated with the model to be compressed; according to an initial input activation value, rematching the correspondence between initial weight parameters and the calculation units, so as to obtain target local model parameters corresponding to the plurality of calculation units, wherein a target input activation value among the target local model parameters gradually increases in a data processing direction; and performing quantization on the target local model parameters, so as to obtain a compressed model.
A method is provided that includes: obtaining a reference image and a description text; extracting a text feature of the description text; and performing the following operations based on a pre-trained diffusion model to generate a target image: in each time step of the diffusion model: calculating a first cross-attention feature of a first image feature and the text feature; obtaining a second cross-attention feature of a second image feature of the reference image and the text feature; editing the first cross-attention feature based on the second cross-attention feature to obtain a third cross-attention feature; and generating a result image feature of the time step based on the third cross-attention feature and the text feature; and decoding a result image feature of a last time step to generate the target image.
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
89.
MODEL TRAINING METHOD, MODEL REASONING METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM
Provided is a model training method, a model reasoning method, an electronic device, and a storage medium, relating to the field of data processing, and especially to the technical fields of artificial intelligence, big data, deep learning and large models. The model training method includes: folding an initial token sequence for training a model based on a folding feature value for folding a token sequence to obtain at least a first token sequence subjected to the folding, wherein the initial token sequence represents a token sequence composed of T1 tokens, and the first token sequence has a sequence length less than that of the initial token sequence; and inputting at least the first token sequence into a preset model to train the preset model so as to obtain a target model.
Provided is a large language model training method, an electronic device and a storage medium, relating to the field of artificial intelligence technologies, and in particular, to the fields of deep learning, natural language processing and large model. The method includes: performing dimension reduction parameter fusion on a two-dimensional parameter matrix on each channel in each network layer in a first large language model, respectively, to obtain a second large language model; performing layer reduction parameter fusion on network layers in the second large language model based on a three-dimensional parameter matrix of each network layer in the second large language model to obtain a third large language model; and training the third large language model to obtain a target large language model under the condition that the target loss function determined based on the first and third large language models meets a preset first function condition.
A data processing method and apparatus, an electronic device, and a storage medium are disclosed, which relate to the field of artificial intelligence, such as distributed storage and cloud computing. The method includes: determining a priority of each placement group in a cache pool respectively, and dividing placement groups with the same priority into a same waiting queue; constructing a target queue which is initially empty, and in response to determining that a supplementary trigger condition is met, determining placement groups to be retrieved based on the principle that a placement group in a waiting queue with higher priority is retrieved first, retrieving the placement groups to be retrieved from the corresponding waiting queue and adding the placement groups to be retrieved to the target queue; and in response to determining that the target queue is not empty, iteratively traversing each placement group in the target queue, wherein when traversing each placement group, the placement group is used as a target placement group respectively, and the number of writable objects is determined as a first quantity, and the first quantity of objects retrieved from the target placement group is written to a backend pool.
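The queueing scheme can be sketched as below. Several details are assumed for illustration: a larger number means a higher priority, the caller decides when the supplementary trigger condition is met and how many slots to fill, and the first quantity is a fixed `writable` value per traversal. All function names are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List


def build_waiting_queues(priorities: Dict[str, int]) -> Dict[int, Deque[str]]:
    """Put placement groups with the same priority into the same waiting queue."""
    queues: Dict[int, Deque[str]] = {}
    for pg, priority in priorities.items():
        queues.setdefault(priority, deque()).append(pg)
    return queues


def refill_target(queues: Dict[int, Deque[str]], target: Deque[str],
                  slots: int) -> None:
    """Move up to `slots` groups into the target queue, highest priority first."""
    for priority in sorted(queues, reverse=True):
        while queues[priority] and slots > 0:
            target.append(queues[priority].popleft())
            slots -= 1


def flush_round(target: Deque[str], objects: Dict[str, List[str]],
                writable: int, backend: List[str]) -> None:
    """Traverse the target queue once, writing up to `writable` objects per group."""
    for pg in list(target):
        backend.extend(objects[pg][:writable])
        objects[pg] = objects[pg][writable:]
        if not objects[pg]:
            target.remove(pg)  # group fully flushed to the backend pool


queues = build_waiting_queues({"pg1": 1, "pg2": 3, "pg3": 3, "pg4": 2})
target: Deque[str] = deque()
refill_target(queues, target, slots=3)
```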
A multimodal data generation method is provided. The method includes: inputting a query data sequence into a multimodal model, to obtain a plurality of tokens in a response data sequence, where a current token is generated through the following operations: inputting the query data sequence and a current response data sequence into the multimodal model, so that the multimodal model generates the current token based on the query data sequence and the current response data sequence, in response to determining that the current token belongs to a first data modality; or inputting the query data sequence and a current response data sequence into the multimodal model, so that the multimodal model denoises an initial token sequence based on the query data sequence and the current response data sequence, to generate a result token sequence, in response to determining that the current token belongs to a second data modality.
A method for evaluating a large model, an electronic device and a computer readable storage medium are provided, which relate to a field of artificial intelligence technology, and in particular to fields of large models technology and deep learning technology. The method includes: evaluating a response information of each of M large language models for an input instruction based on a preset evaluation rule, so as to obtain a first evaluation information for each response information, where M is a positive integer greater than 1; evaluating, in response to the first evaluation information for the M large language models being consistent with each other, each response information in a plurality of evaluation dimensions, so as to obtain a second evaluation information for each response information; and determining an evaluation result representing a responsiveness of each large language model, according to the second evaluation information for each response information.
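The two-stage evaluation can be sketched as follows. The abstract does not say what happens when the first evaluation results differ, so the fallback here (using the rule result directly) is an assumption, as are the averaging of dimension scores and the function name `evaluate_models`.

```python
from typing import Callable, Dict


def evaluate_models(responses: Dict[str, str],
                    rule_check: Callable[[str], bool],
                    dimension_scorers: Dict[str, Callable[[str], float]]
                    ) -> Dict[str, float]:
    """Score each model's response; use dimension scores only to break ties."""
    first = {model: rule_check(text) for model, text in responses.items()}
    if len(set(first.values())) > 1:
        # Assumption: when rule-based results differ, they already rank the models.
        return {model: float(passed) for model, passed in first.items()}
    # First evaluations are consistent, so compare along several dimensions.
    return {
        model: sum(score(text) for score in dimension_scorers.values())
        / len(dimension_scorers)
        for model, text in responses.items()
    }
```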
A method and an apparatus for optimizing an mRNA sequence, an mRNA molecule, a pharmaceutical composition, and a use thereof are provided. The disclosure relates to the technical field of artificial intelligence, specifically to technical fields such as biological computing. The method for optimizing the mRNA sequence includes: obtaining a first mRNA sequence for synthesizing a protein of interest, where the first mRNA sequence includes a 5′ untranslated region sequence and a coding region sequence; and adjusting the 5′ untranslated region sequence and the coding region sequence with the goal of maximizing a first score of the first mRNA sequence, so as to obtain an optimized second mRNA sequence for synthesizing the protein of interest, where the first score reflects at least one of the following indicators of the first mRNA sequence: translation initiation efficiency, codon adaptation index, and minimum free energy.
A vehicle operating system (VOS) in an autonomous driving vehicle (ADV) can communicate with a cloud platform to automatically train AI models. The VOS collects real-time data from the ADV, generates first inference data based on the real-time data using a teacher edge model of an AI model, and generates second inference data based on the real-time data using a student edge model of the AI model. The VOS then obtains one or more differences between the first inference data and the second inference data, and retrains the student edge model of the AI model based on the one or more differences. Both the real-time data and the retrained student edge model are uploaded to the cloud platform for use in upgrading the student edge model and the teacher edge model on the cloud platform. The upgraded teacher edge model and student edge model can be redeployed over-the-air (OTA) through a software-defined process. The above process of training AI models can be repeated automatically in a closed loop without user intervention.
A method and apparatus for accessing a network function virtualization controller by a network element are provided. The method includes: creating at least one service unit in a region to which a network element belongs; associating the at least one service unit with a system VPC corresponding to the network element; associating at least one availability zone comprised in the service unit with at least one device pool and at least one subnet respectively, where the device pool is formed by aggregating at least one virtual network element device; associating the at least one device pool with the at least one subnet based on an IP correspondence; and accessing, by the at least one service unit, a network function virtualization controller deployed in the region to which the network element belongs.
H04L 41/342 - Signalling channels for network management communication between virtual entities, e.g. orchestrators, SDN or NFV entities
H04L 41/40 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using virtualisation of network functions or resources, e.g. SDN or NFV entities
H04L 67/60 - Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
97.
METHOD AND APPARATUS FOR ACCESSING VIRTUAL PRIVATE CLOUD, DEVICE AND STORAGE MEDIUM
The present disclosure provides a method and apparatus for accessing a virtual private cloud (VPC), a device and a storage medium, which are applied to cloud computing, intelligent search, Internet of Things and other technical fields in data processing. The method includes: receiving a first access request, where the first access request carries virtual address information, the first access request is used for indicating access to a target service node arranged in a target VPC; the target service node has actual address information; the virtual address information is used for indicating actual address information of the target VPC and the target service node, and the actual address information is a real address of a service node; and accessing the target service node according to the first access request.
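The role of the virtual address can be sketched as a lookup that yields both the target VPC and the node's actual address. The mapping table, the address values and the request fields below are hypothetical; the abstract does not describe how the mapping is stored or encoded.

```python
from typing import Dict, Tuple

# Hypothetical mapping: virtual address -> (target VPC id, actual node address).
VIRTUAL_TO_ACTUAL: Dict[str, Tuple[str, str]] = {
    "100.64.0.10": ("vpc-a", "192.168.1.5"),
    "100.64.0.11": ("vpc-b", "192.168.1.5"),
}


def resolve(request: dict) -> dict:
    """Translate the virtual address in a request into a routable destination."""
    vpc_id, actual_ip = VIRTUAL_TO_ACTUAL[request["virtual_address"]]
    return {"vpc": vpc_id, "dest": actual_ip, "payload": request["payload"]}


routed = resolve({"virtual_address": "100.64.0.11", "payload": b"ping"})
```

The two entries share the same actual address but sit in different VPCs, which is the situation a virtual address can disambiguate.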
A method of generating a code based on a large model, an electronic device and a storage medium are provided, which relate to the field of artificial intelligence technology, in particular to the fields of deep learning technology and large model technology. The method includes: acquiring a first descriptive text input by a user, where the first descriptive text is configured to characterize a code requirement; searching for a positive code and a negative code matching the first descriptive text, where each of the positive code and the negative code is determined based on a preference operation of the user for a historical code output by the large model; generating a second descriptive text according to the first descriptive text, the positive code, and the negative code; and inputting the second descriptive text into the large model to output a target code matching the code requirement.
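The two middle steps of this abstract, searching for matching positive and negative code and composing the second descriptive text, can be sketched as follows. This is an illustration under assumptions: the history format `(description, code, liked)`, the word-overlap matching, and the prompt wording are invented here, and the call to the large model itself is left out.

```python
def _overlap(a, b):
    """Jaccard similarity between the word sets of two texts."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def find_examples(first_text, history):
    """Return (positive_code, negative_code) best matching first_text.

    history: iterable of (description, code, liked) tuples, where `liked`
    records the user's preference for a code the model produced earlier.
    Either element is None if no entry of that kind exists.
    """
    best = {True: (None, -1.0), False: (None, -1.0)}
    for description, code, liked in history:
        score = _overlap(first_text, description)
        if score > best[liked][1]:
            best[liked] = (code, score)
    return best[True][0], best[False][0]


def build_second_text(first_text, positive_code, negative_code):
    """Compose the second descriptive text from the requirement and both examples."""
    parts = [
        "Requirement:",
        first_text.strip(),
        "",
        "Example the user accepted (follow this style):",
        positive_code.strip(),
        "",
        "Example the user rejected (avoid this style):",
        negative_code.strip(),
        "",
        "Write code that satisfies the requirement.",
    ]
    return "\n".join(parts)
```

A real system would more likely match on embeddings than on word overlap; the simple measure is used here only to keep the sketch dependency-free.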
A method for obtaining an antibody sequence includes: obtaining first features of amino acids at different sequence positions according to an antigen multiple sequence alignment (MSA) sequence, an antibody MSA sequence, and a concatenated sequence of the antigen MSA sequence and the antibody MSA sequence; obtaining second features of the amino acids at different 3D coordinates according to a graph constructed according to a reference antigen-antibody complex; fusing the first features of amino acids at different sequence positions with the second features of amino acids at 3D coordinates corresponding to the different sequence positions, and obtaining probability information of each of the amino acids at different positions in the antibody sequence according to fused features; and obtaining a target antibody sequence according to the amino acids and their probability information at different positions in the antibody sequence.
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
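The last two steps of the antibody abstract, fusing per-position features and reading off a sequence from the resulting probabilities, can be illustrated with a toy example. Everything specific here is an assumption: the features are taken to be 20-dimensional score vectors already aligned by position, fusion is an element-wise sum, probabilities come from a softmax, and the sequence is decoded by taking the most probable residue. The disclosure does not state any of these choices.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def design_sequence(seq_features, struct_features):
    """Fuse features per position and pick the most probable residue.

    seq_features / struct_features: lists of 20-element score vectors,
    one per antibody position (assumed to be aligned with each other).
    Returns the decoded sequence and the probability of each chosen residue.
    """
    sequence, probabilities = [], []
    for seq_vec, struct_vec in zip(seq_features, struct_features):
        fused = [a + b for a, b in zip(seq_vec, struct_vec)]  # assumed fusion: sum
        probs = softmax(fused)
        best = max(range(len(probs)), key=probs.__getitem__)
        sequence.append(AMINO_ACIDS[best])
        probabilities.append(probs[best])
    return "".join(sequence), probabilities
```

Sampling from the per-position distribution instead of taking the maximum would be an equally plausible way to use the probability information.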
100.
TASK EXECUTION METHOD FOR LARGE MODEL, DEVICE, AND MEDIUM
A task execution method for a large model, an electronic device, and a storage medium are provided, which relate to a field of artificial intelligence technology, particularly to fields of deep learning technology and large model technology. The method includes: executing a modality routing task by using a target computing unit based on a target feature to be processed to obtain a modality recognition result; executing a field routing task by using the target computing unit based on the target feature to be processed and a target field gating model parameter to obtain a field recognition result; and executing a feedforward task by using the target computing unit based on the target feature to be processed and a target feedforward task model parameter to obtain a task execution result.
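The three tasks in this abstract share one input feature and run on the same computing unit. The sketch below shows one way such a pipeline could look, with heavy assumptions: routing is modeled as an argmax over dot-product gate scores, the feedforward task as a single ReLU layer, and the modality gate matrix is invented for illustration (the abstract names a gating parameter only for the field routing task).

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def argmax_route(feature, gate_rows):
    """Score the feature against each gate row and return the winning index."""
    scores = [dot(feature, row) for row in gate_rows]
    return max(range(len(scores)), key=scores.__getitem__)


def feedforward(feature, weights):
    """Single linear layer followed by ReLU (assumed form of the feedforward task)."""
    return [max(0.0, dot(feature, row)) for row in weights]


def execute(feature, modality_gate, field_gate, ffn_weights):
    """Run the three tasks in order on the same feature.

    Returns (modality recognition result, field recognition result,
    task execution result).
    """
    modality = argmax_route(feature, modality_gate)  # modality routing task
    field = argmax_route(feature, field_gate)        # field routing task
    output = feedforward(feature, ffn_weights)       # feedforward task
    return modality, field, output
```

In a mixture-of-experts style design the two routing results would typically select which feedforward parameters to load; the abstract does not say so, so the sketch takes the feedforward weights as a plain argument.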