A cloud service system manages a filter repository including filters for encoding and decoding media content (e.g. text, image, audio, video, etc.). The cloud service system may receive a request from a client device to provide a filter for installation on a node such as an endpoint device (e.g. pipeline node). The request includes information such as a type of bitstream to be processed by the requested filter. The request may further include other information such as hardware configuration and functionality attribute. The cloud service system may access the filter repository that stores the plurality of filters including encoder filters and decoder filters and may select a filter that is configured to process the type of bitstream identified in the request and provide the selected filter to the client device.
H04N 19/42 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
2.
Dynamic control for a machine learning autoencoder
An autoencoder is configured to encode content at different quality levels. The autoencoder includes an encoding system and a decoding system with neural network layers forming an encoder network and a decoder network. The encoder network and decoder network are configured to include branching paths through the networks that include different subnetworks. During deployment, content is provided to the encoding system with a quality signal indicating a quality at which the content can be reconstructed. The quality signal determines which of the paths through the encoder network are activated for encoding the content into one or more tensors, which are compressed into a bitstream and later used by the decoding system to reconstruct the content. The autoencoder is trained by randomly or systematically selecting different combinations of tensors to use to encode content and backpropagating error values from loss functions through the network paths associated with the selected tensors.
H04N 19/42 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
G06V 10/774 - Generating sets of training patternsBootstrap methods, e.g. bagging or boosting
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/40 - ScenesScene-specific elements in video content
H04N 19/182 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
H04N 19/517 - Processing of motion vectors by encoding
3.
System for training and deploying filters for encoding and decoding
A cloud service system manages a filter repository including filters for encoding and decoding media content (e.g. text, image, audio, video, etc.). The cloud service system may receive a request from a client device to provide a filter for installation on a node such as an endpoint device (e.g. pipeline node). The request includes information such as a type of bitstream to be processed by the requested filter. The request may further include other information such as hardware configuration and functionality attribute. The cloud service system may access the filter repository that stores the plurality of filters including encoder filters and decoder filters and may select a filter that is configured to process the type of bitstream identified in the request and provide the selected filter to the client device.
H04N 19/42 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
H04N 19/117 - Filters, e.g. for pre-processing or post-processing
A compression system trains a machine-learned compression model that includes components for an encoder and decoder. In one embodiment, the compression model is trained to receive parameter information on how a target frame should be encoded with respect to one or more encoding parameters, and encodes the target frame according to the respective values of the encoding parameters for the target frame. In particular, the encoder of the compression model includes at least an encoding system configured to encode a target frame and generate compressed code that can be transmitted by, for example, a sender system to a receiver system. The decoder of the compression model includes a decoding system trained in conjunction with the encoding system. The decoding system is configured to receive the compressed code for the target frame and reconstruct the target frame for the receiver system.
H04N 19/52 - Processing of motion vectors by encoding by predictive encoding
H04N 19/105 - Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
H04N 19/30 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
H04N 19/114 - Adapting the group of pictures [GOP] structure, e.g. number of B-frames between two anchor frames
H04N 19/149 - Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
H04N 19/166 - Feedback from the receiver or from the transmission channel concerning the amount of transmission errors, e.g. bit error rate [BER]
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/65 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using error resilience
G06N 3/04 - Architecture, e.g. interconnection topology
5.
Machine-learned in-loop predictor for video compression
A compression system trains a compression model for an encoder and decoder. In one embodiment, the compression model includes a machine-learned in-loop flow predictor that generates a flow prediction from previously reconstructed frames. The machine-learned flow predictor is coupled to receive a set of previously reconstructed frames and output a flow prediction for a target frame that is an estimation of the flow for the target frame. In particular, since the flow prediction can be generated by the decoder using the set of previously reconstructed frames, the encoder may transmit a flow delta that indicates a difference between the flow prediction and the actual flow for the target frame, instead of transmitting the flow itself. In this manner, the encoder can transmit a significantly smaller number of bits to the receiver, improving computational efficiency.
H04N 19/52 - Processing of motion vectors by encoding by predictive encoding
H04N 19/105 - Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
H04N 19/30 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
H04N 19/114 - Adapting the group of pictures [GOP] structure, e.g. number of B-frames between two anchor frames
H04N 19/149 - Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
H04N 19/166 - Feedback from the receiver or from the transmission channel concerning the amount of transmission errors, e.g. bit error rate [BER]
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/65 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using error resilience
G06N 3/04 - Architecture, e.g. interconnection topology
6.
Deep learning based adaptive arithmetic coding and codelength regularization
A deep learning based compression (DLBC) system applies trained models to compress binary code of an input image to a target codelength. For a set of binary codes representing the quantized coefficents of an input image, the DLBC system applies a first model that is trained to predict feature probabilities based on the context of each bit of the binary codes. The DLBC system compresses the binary code via adaptive arithmetic coding based on the determined probability of each bit. The compressed binary code represents a balance between a reconstruction quality of a reconstruction of the input image and a target compression ratio of the compressed binary code.
H04N 19/126 - Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
H04N 19/167 - Position within a video image, e.g. region of interest [ROI]
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/196 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
H04N 19/91 - Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
H04N 19/13 - Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
H04N 19/149 - Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
G06N 3/04 - Architecture, e.g. interconnection topology
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersectionsConnectivity analysis, e.g. of connected components
G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video featuresCoarse-fine approaches, e.g. multi-scale approachesImage or video pattern matchingProximity measures in feature spaces using context analysisSelection of dictionaries
G06V 20/40 - ScenesScene-specific elements in video content
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
H04N 19/18 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
H04N 19/48 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using compressed domain processing techniques other than decoding, e.g. modification of transform coefficients, variable length coding [VLC] data or run-length data
H04N 19/154 - Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
H04N 19/33 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
7.
Deep learning based adaptive arithmetic coding and codelength regularization
A deep learning based compression (DLBC) system applies trained models to compress binary code of an input image to a target codelength. For a set of binary codes representing the quantized coefficents of an input image, the DLBC system applies a first model that is trained to predict feature probabilities based on the context of each bit of the binary codes. The DLBC system compresses the binary code via adaptive arithmetic coding based on the determined probability of each bit. The compressed binary code represents a balance between a reconstruction quality of a reconstruction of the input image and a target compression ratio of the compressed binary code.
G06K 9/62 - Methods or arrangements for recognition using electronic means
H04N 19/126 - Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
H04N 19/167 - Position within a video image, e.g. region of interest [ROI]
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/196 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
H04N 19/91 - Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
G06K 9/66 - Methods or arrangements for recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references, e.g. resistor matrix references adjustable by an adaptive method, e.g. learning
H04N 19/13 - Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
H04N 19/149 - Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
H04N 19/18 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
H04N 19/48 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using compressed domain processing techniques other than decoding, e.g. modification of transform coefficients, variable length coding [VLC] data or run-length data
H04N 19/154 - Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
H04N 19/33 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
G06N 3/04 - Architecture, e.g. interconnection topology
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/46 - Extraction of features or characteristics of the image
A deep learning based compression (DLBC) system applies trained models to compress binary code of an input image to a target codelength. For a set of binary codes representing the quantized coefficents of an input image, the DLBC system applies a first model that is trained to predict feature probabilities based on the context of each bit of the binary codes. The DLBC system compresses the binary code via adaptive arithmetic coding based on the determined probability of each bit. The compressed binary code represents a balance between a reconstruction quality of a reconstruction of the input image and a target compression ratio of the compressed binary code.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06K 9/46 - Extraction of features or characteristics of the image
H04N 19/126 - Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
H04N 19/167 - Position within a video image, e.g. region of interest [ROI]
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/196 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
H04N 19/91 - Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
G06K 9/66 - Methods or arrangements for recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references, e.g. resistor matrix references adjustable by an adaptive method, e.g. learning
H04N 19/13 - Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
H04N 19/149 - Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
H04N 19/18 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
H04N 19/48 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using compressed domain processing techniques other than decoding, e.g. modification of transform coefficients, variable length coding [VLC] data or run-length data
H04N 19/154 - Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
H04N 19/33 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
An encoder system trains a compression model that includes an autoencoder model and a frame extractor model. The encoding portion of the autoencoder is coupled to receive a set of target frames and a previous state tensor for the set of target frames and generate compressed code. The decoding portion of the autoencoder is coupled to receive the compressed code and the previous state tensor for the set of frames and generate a next state tensor for the set of target frames. The frame extractor model is coupled to receive the next state tensor and generate a set of reconstructed frames that correspond to the set of target frames by performing one or more operations on the state tensor. The state tensor for the set of frames includes information from frames of the video that can be used by the frame extractor to generate reconstructed frames.
H04N 19/182 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
H04N 19/517 - Processing of motion vectors by encoding
A compression system includes an encoder and a decoder. The encoder can be deployed by a sender system to encode a tensor for transmission to a receiver system, and the decoder can be deployed by the receiver system to decode and reconstruct the encoded tensor. The encoder receives a tensor for compression. The encoder also receives a quantization mask and probability data associated with the tensor. Each element of the tensor is quantized using an alphabet size allocated to that element by the quantization mask data. The encoder compresses the tensor by entropy coding each element using the probability data and alphabet size associated with the element. The decoder receives the quantization mask data, the probability data, and the compressed tensor data. The quantization mask and probabilities are used to entropy decode and subsequently reconstruct the tensor.
H03M 7/30 - CompressionExpansionSuppression of unnecessary data, e.g. redundancy reduction
G06F 17/18 - Complex mathematical operations for evaluating statistical data
H03M 7/02 - Conversion to or from weighted codes, i.e. the weight given to a digit depending on the position of the digit within the block or code word
An encoder system trains a compression model that includes an autoencoder model and a frame extractor model. The encoding portion of the autoencoder is coupled to receive a set of target frames and a previous state tensor for the set of target frames and generate compressed code. The decoding portion of the autoencoder is coupled to receive the compressed code and the previous state tensor for the set of frames and generate a next state tensor for the set of target frames. The frame extractor model is coupled to receive the next state tensor and generate a set of reconstructed frames that correspond to the set of target frames by performing one or more operations on the state tensor. The state tensor for the set of frames includes information from frames of the video that can be used by the frame extractor to generate reconstructed frames.
H04N 19/182 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
H04N 19/517 - Processing of motion vectors by encoding
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/62 - Methods or arrangements for recognition using electronic means
12.
Dynamic control for a machine learning autoencoder
An autoencoder is configured to encode content at different quality levels. The autoencoder includes an encoding system and a decoding system with neural network layers forming an encoder network and a decoder network. The encoder network and decoder network are configured to include branching paths through the networks that include different subnetworks. During deployment, content is provided to the encoding system with a quality signal indicating a quality at which the content can be reconstructed. The quality signal determines which of the paths through the encoder network are activated for encoding the content into one or more tensors, which are compressed into a bitstream and later used by the decoding system to reconstruct the content. The autoencoder is trained by randomly or systematically selecting different combinations of tensors to use to encode content and backpropagating error values from loss functions through the network paths associated with the selected tensors.
H04N 19/42 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
H04N 19/182 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
H04N 19/517 - Processing of motion vectors by encoding
G06V 20/40 - ScenesScene-specific elements in video content
G06F 18/214 - Generating training patternsBootstrap methods, e.g. bagging or boosting
G06V 10/774 - Generating sets of training patternsBootstrap methods, e.g. bagging or boosting
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06N 3/084 - Backpropagation, e.g. using gradient descent
A deep learning based compression (DLBC) system generates a progressive representation of the encoded input image such that a client device that requires the encoded input image at a particular target bitrate can readily be transmitted the appropriately encoded data. More specifically, the DLBC system computes a representation that includes channels and bitplanes that are ordered based on importance. For a given target rate, the DLBC system truncates the representation according to a trained zero mask to generate the progressive representation. Transmitting a first portion of the progressive representation enables a client device with the lowest target bitrate to appropriately playback the content. Each subsequent portion of the progressive representation allows the client device to playback the content with improved quality.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06K 9/46 - Extraction of features or characteristics of the image
H04N 19/126 - Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
H04N 19/167 - Position within a video image, e.g. region of interest [ROI]
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/196 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
H04N 19/91 - Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
G06K 9/66 - Methods or arrangements for recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references, e.g. resistor matrix references adjustable by an adaptive method, e.g. learning
H04N 19/13 - Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
H04N 19/149 - Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
H04N 19/18 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
H04N 19/48 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using compressed domain processing techniques other than decoding, e.g. modification of transform coefficients, variable length coding [VLC] data or run-length data
H04N 19/154 - Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
H04N 19/33 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
14.
Deep learning based on image encoding and decoding
A deep learning based compression (DLBC) system trains multiple models that, when deployed, generates a compressed binary encoding of an input image that achieves a reconstruction quality and a target compression ratio. The applied models effectively identifies structures of an input image, quantizes the input image to a target bit precision, and compresses the binary code of the input image via adaptive arithmetic coding to a target codelength. During training, the DLBC system reconstructs the input image from the compressed binary encoding and determines the loss in quality from the encoding process. Thus, the models can be continually trained to, when applied to an input image, minimize the loss in reconstruction quality that arises due to the encoding process while also achieving the target compression ratio.
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersectionsConnectivity analysis, e.g. of connected components
G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video featuresCoarse-fine approaches, e.g. multi-scale approachesImage or video pattern matchingProximity measures in feature spaces using context analysisSelection of dictionaries
G06V 20/40 - ScenesScene-specific elements in video content
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
G06V 30/194 - References adjustable by an adaptive method, e.g. learning
G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
H04N 19/126 - Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
H04N 19/167 - Position within a video image, e.g. region of interest [ROI]
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/196 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
H04N 19/91 - Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
H04N 19/13 - Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
H04N 19/149 - Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
G06N 3/084 - Backpropagation, e.g. using gradient descent
H04N 19/18 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
H04N 19/48 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using compressed domain processing techniques other than decoding, e.g. modification of transform coefficients, variable length coding [VLC] data or run-length data
H04N 19/154 - Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
H04N 19/33 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
A compression system trains a machine-learned encoder and decoder. The encoder can be deployed by a sender system to encode content for transmission to a receiver system, and the decoder can be deployed by the receiver system to decode the encoded content and reconstruct the original content. The encoder receives content and generates a tensor as a compact representation of the content. The content may be, for example, images, videos, or text. The decoder receives a tensor and generates a reconstructed version of the content. In one embodiment, the compression system trains one or more encoding components such that the encoder can adaptively encode different degrees of information for regions in the content that are associated with characteristic objects, such as human faces, texts, or buildings.
H04N 19/167 - Position within a video image, e.g. region of interest [ROI]
H04N 19/126 - Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/196 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
H04N 19/91 - Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
G06K 9/66 - Methods or arrangements for recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references, e.g. resistor matrix references adjustable by an adaptive method, e.g. learning
H04N 19/13 - Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
H04N 19/149 - Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
H04N 19/18 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
H04N 19/48 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using compressed domain processing techniques other than decoding, e.g. modification of transform coefficients, variable length coding [VLC] data or run-length data
H04N 19/154 - Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
H04N 19/33 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
G06K 9/46 - Extraction of features or characteristics of the image
A machine learning (ML) task system trains a neural network model that learns a compressed representation of acquired data and performs a ML task using the compressed representation. The neural network model is trained to generate a compressed representation that balances the objectives of achieving a target codelength and achieving a high accuracy of the output of the performed ML task. During deployment, an encoder portion and a task portion of the neural network model are separately deployed. A first system acquires data, applies the encoder portion to generate a compressed representation, performs an encoding process to generate compressed codes, and transmits the compressed codes. A second system regenerates the compressed representation from the compressed codes and applies the task model to determine the output of a ML task.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06K 9/46 - Extraction of features or characteristics of the image
H04N 19/126 - Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
H04N 19/167 - Position within a video image, e.g. region of interest [ROI]
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/196 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
H04N 19/91 - Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
G06K 9/66 - Methods or arrangements for recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references, e.g. resistor matrix references adjustable by an adaptive method, e.g. learning
H04N 19/13 - Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
H04N 19/149 - Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
H04N 19/18 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
H04N 19/48 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using compressed domain processing techniques other than decoding, e.g. modification of transform coefficients, variable length coding [VLC] data or run-length data
H04N 19/154 - Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
H04N 19/33 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
17.
Enhanced coding efficiency with progressive representation
A deep learning based compression (DLBC) system generates a progressive representation of the encoded input image such that a client device that requires the encoded input image at a particular target bitrate can readily be transmitted the appropriately encoded data. More specifically, the DLBC system computes a representation that includes channels and bitplanes that are ordered based on importance. For a given target rate, the DLBC system truncates the representation according to a trained zero mask to generate the progressive representation. Transmitting a first portion of the progressive representation enables a client device with the lowest target bitrate to appropriately playback the content. Each subsequent portion of the progressive representation allows the client device to playback the content with improved quality.
H04N 19/126 - Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
H04N 19/167 - Position within a video image, e.g. region of interest [ROI]
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/196 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
H04N 19/91 - Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/62 - Methods or arrangements for recognition using electronic means
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
G06K 9/66 - Methods or arrangements for recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references, e.g. resistor matrix references adjustable by an adaptive method, e.g. learning
H04N 19/13 - Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
H04N 19/149 - Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
H04N 19/18 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
H04N 19/48 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using compressed domain processing techniques other than decoding, e.g. modification of transform coefficients, variable length coding [VLC] data or run-length data
H04N 19/154 - Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
H04N 19/33 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
G06K 9/46 - Extraction of features or characteristics of the image
18.
Deep learning based adaptive arithmetic coding and codelength regularization
A deep learning based compression (DLBC) system applies trained models to compress binary code of an input image to a target codelength. For a set of binary codes representing the quantized coefficents of an input image, the DLBC system applies a first model that is trained to predict feature probabilities based on the context of each bit of the binary codes. The DLBC system compresses the binary code via adaptive arithmetic coding based on the determined probability of each bit. The compressed binary code represents a balance between a reconstruction quality of a reconstruction of the input image and a target compression ratio of the compressed binary code.
H04N 19/91 - Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
H04N 19/149 - Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
H04N 19/48 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using compressed domain processing techniques other than decoding, e.g. modification of transform coefficients, variable length coding [VLC] data or run-length data
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06K 9/46 - Extraction of features or characteristics of the image
H04N 19/126 - Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
H04N 19/167 - Position within a video image, e.g. region of interest [ROI]
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/196 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
G06K 9/66 - Methods or arrangements for recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references, e.g. resistor matrix references adjustable by an adaptive method, e.g. learning
H04N 19/13 - Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
H04N 19/18 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
H04N 19/154 - Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
H04N 19/33 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
19.
Using generative adversarial networks in compression
The compression system trains a machine-learned encoder and decoder through an autoencoder architecture. The encoder can be deployed by a sender system to encode content for transmission to a receiver system, and the decoder can be deployed by the receiver system to decode the encoded content and reconstruct the original content. The encoder is coupled to receive content and output a tensor as a compact representation of the content. The content may be, for example, images, videos, or text. The decoder is coupled to receive a tensor representing content and output a reconstructed version of the content. The compression system trains the autoencoder with a discriminator to reduce compression artifacts in the reconstructed content. The discriminator is coupled to receive one or more input content, and output a discrimination prediction that discriminates whether the input content is the original or reconstructed version of the content.
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06K 9/42 - Normalisation of the pattern dimensions
G06K 9/46 - Extraction of features or characteristics of the image
H04N 19/12 - Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
H04N 19/16 - Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter for a given display mode, e.g. for interlaced or progressive display mode
H04N 19/17 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
H04N 19/19 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding using optimisation based on Lagrange multipliers
H04N 19/91 - Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
G06K 9/66 - Methods or arrangements for recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references, e.g. resistor matrix references adjustable by an adaptive method, e.g. learning
H04N 19/13 - Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
H04N 19/14 - Coding unit complexity, e.g. amount of activity or edge presence estimation
H04N 19/18 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
H04N 19/48 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using compressed domain processing techniques other than decoding, e.g. modification of transform coefficients, variable length coding [VLC] data or run-length data
H04N 19/15 - Data rate or code amount at the encoder output by monitoring actual compressed data size at the memory before deciding storage at the transmission buffer
H04N 19/33 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
H04N 19/126 - Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
H04N 19/167 - Position within a video image, e.g. region of interest [ROI]
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/196 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
H04N 19/149 - Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
H04N 19/154 - Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
20.
Autoencoding image residuals for improving upsampled images
An enhanced encoder system generates residual bitstreams representing additional image information that can be used by an image enhancement system to improve a low quality image. The enhanced encoder system upsamples a low quality image and compares the upsampled image to a true high quality image to determine image inaccuracies that arise due to the upsampling process. The enhanced encoder system encodes the information describing the image inaccuracies using a trained encoder model as the residual bitstream. The image enhancement system upsamples the same low quality image to obtain a prediction of a high quality image that can include image inaccuracies. Given the residual bitstream, the image enhancement system decodes the residual bitstream using a trained decoder model and uses the additional image information to improve the predicted high quality image. The image enhancement system can provide an improved, high quality image for display.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06K 9/46 - Extraction of features or characteristics of the image
H04N 19/126 - Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
H04N 19/167 - Position within a video image, e.g. region of interest [ROI]
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/196 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
H04N 19/91 - Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
G06K 9/66 - Methods or arrangements for recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references, e.g. resistor matrix references adjustable by an adaptive method, e.g. learning
H04N 19/13 - Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
H04N 19/149 - Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
H04N 19/18 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
H04N 19/48 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using compressed domain processing techniques other than decoding, e.g. modification of transform coefficients, variable length coding [VLC] data or run-length data
H04N 19/154 - Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
H04N 19/33 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain