09 - Scientific and electric apparatus and instruments
Goods and services
Apparatus, such as integrated circuits, microchips, Register Transfer Level (RTL) designs, FPGAs, computer hardware, chips, chiplets or chipsets, for processing physical signals, wherein the signals comprise audio, still images, video, radar, near infrared, far infrared, ultrasound or mm-wave signals, and wherein the processing can be AI-based, computational signal processing, image processing or audio processing
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods and services
Apparatus for recording, transmission, manipulation, compression, enhancement, or reproduction of signals, including video, sound, or images; Computer software: Software for signal processing, specifically video, still imaging, and audio; Artificial intelligence software; Machine learning software; Image management software; Computer software, including downloadable software for use in sensor data analysis, including event sensors, digital photography, digital videography, infra-red, multispectral imaging, audio, and radio signals, namely, for analysis, categorization, object recognition, compression, enhancement, editing, and manipulation; Computer hardware: central processing units, graphic processing units, system on a chip, integrated circuits, multiprocessor chips, ASIC, for use in sensor data capture and analysis, including event sensors, digital photography, digital videography, infra-red, multispectral imaging, audio, radio signals, namely, for analysis, categorization, object recognition, compression, enhancement, editing, and manipulation. Software as a service; Platform as a service; Design and development of software and hardware, including compilers, for use in sensor data capture and analysis, including event sensors, digital photography, digital videography, infra-red, multispectral imaging, audio, radio signals, namely, for analysis, categorization, object recognition, compression, enhancement, editing, and manipulation.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods and services
Apparatus for recording, transmission, manipulation, compression, enhancement, or reproduction of signals, including video, sound, or images; Computer software, namely, software for signal processing, specifically video, still imaging, and audio; Artificial intelligence software; Machine learning software; Image management software; Computer software, including downloadable software for use in sensor data analysis, including event sensors, digital photography, digital videography, infra-red, multispectral imaging, audio, and radio signals, namely, for analysis, categorization, object recognition, compression, enhancement, editing, and manipulation; Computer hardware, namely, central processing units, graphic processing units, system on a chip, integrated circuits, multiprocessor chips, ASIC for use in sensor data capture and analysis, including event sensors, digital photography, digital videography, infra-red, multispectral imaging, audio, radio signals, namely for analysis, categorization, object recognition, compression, enhancement, editing, and manipulation; Software as a service; Platform as a service; Design and development of software and hardware, including compilers, for use in sensor data capture and analysis, including event sensors, digital photography, digital videography, infra-red, multispectral imaging, audio, radio signals, namely for analysis, categorization, object recognition, compression, enhancement, editing, and manipulation
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods and services
Apparatus for recording, transmission, manipulation, compression, enhancement, or reproduction of signals, including video, sound, or images; Computer software: Software for signal processing, specifically video, still imaging, and audio; Artificial intelligence software; Machine learning software; Image management software; Computer software, including downloadable software for use in sensor data analysis, including event sensors, digital photography, digital videography, infra-red, multispectral imaging, audio, and radio signals, namely, for analysis, categorization, object recognition, compression, enhancement, editing, and manipulation; Computer hardware: central processing units, graphic processing units, system on a chip, integrated circuits, multiprocessor chips, ASIC, for use in sensor data capture and analysis, including event sensors, digital photography, digital videography, infra-red, multispectral imaging, audio, radio signals, namely, for analysis, categorization, object recognition, compression, enhancement, editing, and manipulation. Software as a service; Platform as a service; Design and development of software and hardware, including compilers, for use in sensor data capture and analysis, including event sensors, digital photography, digital videography, infra-red, multispectral imaging, audio, radio signals, namely, for analysis, categorization, object recognition, compression, enhancement, editing, and manipulation.
5.
VEHICLE OCCUPANT MONITORING SYSTEM INCLUDING AN IMAGE ACQUISITION DEVICE WITH A ROLLING SHUTTER IMAGE SENSOR
A vehicle occupant monitoring system, OMS, comprises an image acquisition device with a rolling shutter image sensor configured to selectively operate in either: a full-resolution image mode where an image frame corresponding to the full image sensor is provided; or a region of interest, ROI, mode, where an image frame corresponding to a limited portion of the image sensor is provided. An object detector is configured to receive a full-resolution image from the image sensor and to identify a ROI corresponding to an object of interest within the image. A controller is configured to obtain an image corresponding to the ROI from the image sensor operating in ROI mode, the image having an exposure time long enough for all rows of the ROI to be illuminated by a common pulse of light from at least one infra-red light source and short enough to limit motion blur within the image.
G06V 10/143 - Sensing or illuminating at different wavelengths
G06V 10/147 - Details of sensors, e.g. sensor lenses
G06V 10/25 - Determination of a region of interest [ROI] or a volume of interest [VOI]
G06V 10/60 - Extraction of image or video features relating to illumination properties, e.g. using a reflectance or lighting model
G06V 20/59 - Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
G06V 10/98 - Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
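The exposure constraint in the occupant monitoring abstract above can be illustrated with basic rolling-shutter timing. This is a minimal sketch, assuming each successive row starts exposing one fixed line time after the previous row; the function name, parameters and numbers are hypothetical and are not taken from the patent.

```python
def min_common_exposure_us(rows: int, line_time_us: float, pulse_us: float) -> float:
    """Shortest exposure for which every row is exposing during the whole pulse.

    Row r exposes over [r * line_time, r * line_time + exposure]. The last row
    starts (rows - 1) * line_time after the first, so the window shared by all
    rows lasts exposure - (rows - 1) * line_time, and that window must be at
    least as long as the light pulse.
    """
    if rows < 1:
        raise ValueError("rows must be at least 1")
    return (rows - 1) * line_time_us + pulse_us


# Hypothetical sensor: 10 us line time, 100 us infra-red pulse.
full_frame = min_common_exposure_us(rows=1080, line_time_us=10.0, pulse_us=100.0)
roi_only = min_common_exposure_us(rows=200, line_time_us=10.0, pulse_us=100.0)
print(full_frame, roi_only)
```

Under these assumptions a 200-row ROI needs a much shorter exposure than the full frame to share one pulse, which is consistent with the stated aim of limiting motion blur.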
A method for identifying a gesture from one of a plurality of dynamic gestures, each dynamic gesture comprising a distinct movement made by a user over a period of time within a field of view of an image acquisition device comprises iteratively: acquiring a current image from said image acquisition device at a given time; and passing at least a portion of the current image through a bidirectionally recurrent multi-layer classifier. A final layer of the multi-layer classifier comprises an output indicating a probability that a gesture from the plurality of dynamic gestures is being made by a user during the time of acquiring the image.
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 40/20 - Movements or behaviour, e.g. gesture recognition
A method for monitoring occupants of a vehicle comprises identifying a respective body region for one or more occupants of the vehicle within an obtained image; identifying within the body regions, skeletal points including points on an arm of a body; identifying one or more hand regions; and determining a hand region to be either a left or a right hand of an occupant in accordance with its spatial relationship with identified skeletal points of the body region of an occupant. The left or right hand region for the occupant is provided to a pair of classifiers to provide an activity classification for the occupant, a first classifier being trained with images of hands of occupants in states where objects involved are not visible and a second classifier being trained with images of occupants in the states where the objects are visible in at least one hand region.
G06V 20/59 - Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
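The left/right hand decision in the occupant activity abstract above depends on a spatial relationship with skeletal points. The abstract does not say which relationship is used; the sketch below assumes a nearest-wrist rule purely for illustration, and `label_hand` and its arguments are hypothetical names.

```python
from math import dist


def label_hand(hand_centre, left_wrist, right_wrist):
    """Label a hand region 'left' or 'right' by its nearest wrist point.

    All arguments are (x, y) pixel coordinates. Nearest-wrist is an assumed
    rule; other spatial relationships would fit the abstract equally well.
    """
    if dist(hand_centre, left_wrist) <= dist(hand_centre, right_wrist):
        return "left"
    return "right"


# Hypothetical detections for one occupant.
print(label_hand(hand_centre=(10, 10), left_wrist=(12, 11), right_wrist=(40, 10)))
```

The labelled hand region could then be routed to whichever classifier of the pair is appropriate.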
A method for calibrating a vehicle cabin camera having: a pitch, yaw and roll angle; and a field of view capturing vehicle cabin features which are symmetric about a vehicle longitudinal axis comprises: selecting points from within an image of the vehicle cabin and projecting the points onto a 3D unit circle in accordance with a camera projection model. For each of one or more rotations of a set of candidate yaw and roll rotations, the method comprises: rotating the projected points with the rotation; flipping the rotated points about a pitch axis; counter-rotating the projected points with an inverse of the rotation; and mapping the counter-rotated points back into an image plane to provide a set of transformed points. A candidate rotation which provides a best match between the set of transformed points and the locations of the selected points in the image plane is selected.
A method comprises displaying a first image acquired from a camera having an input camera projection model including a first focal length and an optical axis parameter value. A portion of the first image is selected as a second image associated with an output camera projection model in which a focal length and/or an optical axis parameter value differs from the parameters of the input camera projection model. The method involves iteratively: adjusting the focal length and/or the optical axis parameter value for the camera lens so that it approaches the corresponding value of the output camera projection model; acquiring a subsequent image using the adjusted focal length or optical axis parameter value; mapping pixel coordinates in the second image, through a normalized 3D coordinate system, to respective locations in the subsequent image to determine respective values for the pixel coordinates; and displaying the second image.
A method for correcting an image divides an output image into a grid with vertical sections of width smaller than the image width but wide enough to allow efficient bursts when writing distortion corrected line sections into memory. A distortion correction engine includes a relatively small amount of memory for an input image buffer but without requiring unduly complex control. The input image buffer accommodates enough lines of an input image to cover the distortion of a single most vertically distorted line section of the input image. The memory required for the input image buffer can be significantly less than would be required to store all the lines of a distorted input image spanning a maximal distortion of a complete line within the input image.
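The memory saving described in the distortion correction abstract above can be illustrated by comparing how many input rows one output row touches across its full width with how many it touches within a single vertical section. The parabolic distortion model and all names and sizes below are assumptions for illustration only.

```python
def rows_spanned(src_row, y, x_start, x_end):
    """Number of input rows touched by output row y over columns [x_start, x_end)."""
    rows = [src_row(x, y) for x in range(x_start, x_end)]
    return max(rows) - min(rows) + 1


def buffer_rows(src_row, y, width, section_width):
    """Input rows to buffer when output row y is written one section at a time.

    The buffer only has to cover the most vertically distorted section.
    """
    return max(
        rows_spanned(src_row, y, x0, min(x0 + section_width, width))
        for x0 in range(0, width, section_width)
    )


# Hypothetical distortion: a straight output row maps to a parabola in the input.
def src_row(x, y, width=640):
    return y + round(0.0005 * (x - width // 2) ** 2)


whole_line = rows_spanned(src_row, 0, 0, 640)
per_section = buffer_rows(src_row, 0, 640, 64)
print(whole_line, per_section)
```

With this toy distortion the per-section buffer is well under half the size needed for a complete line, while 64-pixel sections are still wide enough to be written in bursts.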
Disclosed is a multi-modal convolutional neural network (CNN) for fusing image information from a frame-based camera, such as a near infra-red (NIR) camera, and an event camera for analysing facial characteristics in order to produce classifications such as head pose or eye gaze. The neural network processes image frames acquired from each camera through a plurality of convolutional layers to provide a respective set of one or more intermediate images. The network fuses at least one corresponding pair of intermediate images generated from each of the image frames through an array of fusing cells. Each fusing cell is connected to at least a respective element of each intermediate image and is trained to weight each element from each intermediate image to provide the fused output. The neural network further comprises at least one task network configured to generate one or more task outputs for the region of interest.
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
G06V 20/59 - Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
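The array of fusing cells in the multi-modal CNN abstract above weights corresponding elements of two intermediate images. A minimal sketch follows, assuming each cell is a plain per-element linear weighting with already trained weights; the real cells may be more elaborate, and `fuse` and its arguments are hypothetical.

```python
def fuse(map_a, map_b, weights_a, weights_b):
    """Element-wise weighted fusion of two equally sized intermediate images.

    Each output element plays the role of one fusing cell: it combines the
    matching elements of the two maps using that cell's own pair of weights.
    """
    return [
        [wa * a + wb * b for a, b, wa, wb in zip(row_a, row_b, row_wa, row_wb)]
        for row_a, row_b, row_wa, row_wb in zip(map_a, map_b, weights_a, weights_b)
    ]


# Hypothetical 1x2 maps from the NIR branch and the event branch.
nir = [[1.0, 2.0]]
event = [[3.0, 4.0]]
fused = fuse(nir, event, weights_a=[[0.5, 1.0]], weights_b=[[0.5, 0.0]])
print(fused)
```

Per-element weights let the network favour one camera in some image regions and the other camera elsewhere.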
12.
PRODUCING AN IMAGE FRAME USING DATA FROM AN EVENT CAMERA
A method of producing an image frame from event packets received from an event camera comprises: forming a tile buffer sized to accumulate event information for a subset of image tiles, the tile buffer having an associated tile table that determines a mapping between each tile of the image frame for which event information is accumulated in the tile buffer and the image frame. For each event packet: an image tile corresponding to the pixel location of the event packet is identified; responsive to the tile buffer storing information for one other event corresponding to the image tile, event information is added to the tile buffer; and responsive to the tile buffer not storing information for another event corresponding to the image tile and responsive to the tile buffer being capable of accumulating event information for at least one more tile, the image tile is added to the tile buffer.
H04N 5/335 - Transforming light or analogous information into electric information using solid-state image sensors [SSIS]
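The tile buffer and tile table in the event camera abstract above can be sketched as a small data structure. The class below is a hypothetical illustration: it accumulates net polarity per pixel, and it simply reports failure when the buffer is full, which is an assumption because the abstract as listed does not state what happens in that case.

```python
class TileBuffer:
    """Accumulates event information for a limited number of image tiles."""

    def __init__(self, max_tiles: int, tile_size: int):
        self.max_tiles = max_tiles
        self.tile_size = tile_size
        self.tile_table = {}  # (tile_col, tile_row) -> slot index in the buffer
        self.slots = []       # one dict per slot: (x, y) inside tile -> net polarity

    def add_event(self, x: int, y: int, polarity: int) -> bool:
        """Accumulate one event; return False if its tile cannot be stored."""
        tile = (x // self.tile_size, y // self.tile_size)
        slot = self.tile_table.get(tile)
        if slot is None:
            if len(self.slots) >= self.max_tiles:
                return False  # buffer full (handling is left to the caller)
            slot = len(self.slots)
            self.tile_table[tile] = slot  # map the new tile into the buffer
            self.slots.append({})
        local = (x % self.tile_size, y % self.tile_size)
        self.slots[slot][local] = self.slots[slot].get(local, 0) + polarity
        return True


buf = TileBuffer(max_tiles=2, tile_size=16)
results = [buf.add_event(3, 4, 1), buf.add_event(5, 4, -1),
           buf.add_event(20, 4, 1), buf.add_event(40, 40, 1)]
print(results, buf.tile_table)
```

Only tiles that actually receive events occupy memory, so the buffer can be much smaller than a full frame.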
A method to determine activity in a sequence of successively acquired images of a scene, comprises: acquiring the sequence of images; for each image in the sequence of images, forming a feature block of features extracted from the image and determining image specific information including a weighting for the image; normalizing the determined weightings to form a normalized weighting for each image in the sequence of images; for each image in the sequence of images, combining the associated normalized weighting and associated feature block to form a weighted feature block; passing a combination of the weighted feature blocks through a predictive module to determine an activity in the sequence of images; and outputting a result comprising the determined activity in the sequence of images.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/62 - Methods or arrangements for recognition using electronic means
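The weighting steps in the activity determination abstract above can be sketched as follows. The abstract does not specify the normalization or the combination; this sketch assumes division by the sum of the weightings and an element-wise sum of the weighted blocks, and all names are hypothetical.

```python
def weighted_feature_blocks(feature_blocks, weightings):
    """Scale each image's feature block by its normalized weighting."""
    total = sum(weightings)
    if total == 0:
        raise ValueError("weightings must not sum to zero")
    normalized = [w / total for w in weightings]
    return [[w * f for f in block] for w, block in zip(normalized, feature_blocks)]


def combine(blocks):
    """Element-wise sum of the weighted blocks (one assumed way to combine them)."""
    return [sum(values) for values in zip(*blocks)]


# Two images, two features each; the second image is weighted three times higher.
weighted = weighted_feature_blocks([[1.0, 2.0], [3.0, 4.0]], [1.0, 3.0])
combined = combine(weighted)
print(weighted, combined)
```

The combined block would then be passed to the predictive module, which is outside the scope of this sketch.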
A method for producing a textural image from event information generated by an event camera comprises: accumulating event information from a plurality of events occurring during successive event cycles across a field of view of the event camera, each event indicating an x,y location within the field of view, a polarity for a change of detected light intensity incident at the x,y location and an event cycle at which the event occurred; in response to selected event cycles, analysing event information for one or more preceding event cycles to identify one or more regions of interest bounding a respective object to be tracked; and responsive to a threshold event criterion for a region of interest being met, generating a textural image for the region of interest from event information accumulated from within the region of interest.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
15.
METHOD AND SYSTEM TO DETERMINE THE LOCATION AND/OR ORIENTATION OF A HEAD
A method for determining an absolute depth map to monitor the location and pose of a head (100) being imaged by a camera comprises: acquiring (20) an image from the camera (110) including a head with a facial region; determining (23) at least one distance from the camera (110) to a facial feature of the facial region using a distance measuring sub-system (120); determining (24) a relative depth map of facial features within the facial region; and combining (25) the relative depth map with the at least one distance to form an absolute depth map for the facial region.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
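Combining a relative depth map with one measured distance, as in the head location abstract above, can be sketched under a strong simplifying assumption: that the relative map is correct up to a single scale factor. The function and parameter names are hypothetical.

```python
def absolute_depth_map(relative_depth, feature_xy, measured_distance_mm):
    """Scale a relative depth map so one facial feature matches a measured distance.

    relative_depth is a list of rows; feature_xy is the (x, y) pixel of the
    feature whose distance was measured by the distance measuring sub-system.
    """
    x, y = feature_xy
    scale = measured_distance_mm / relative_depth[y][x]
    return [[scale * d for d in row] for row in relative_depth]


# Hypothetical 2x2 relative map; the feature at (0, 0) was measured at 600 mm.
absolute = absolute_depth_map([[1.0, 1.25], [0.5, 1.0]], (0, 0), 600.0)
print(absolute)
```

If several distances are measured, a fit over all of them could replace the single scale factor.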
16.
System and methods for calibration of an array camera
Systems and methods for calibrating an array camera are disclosed. Systems and methods for calibrating an array camera in accordance with embodiments of this invention include the capturing of an image of a test pattern with the array camera such that each imaging component in the array camera captures an image of the test pattern. The image of the test pattern captured by a reference imaging component is then used to derive calibration information for the reference component. A corrected image of the test pattern for the reference component is then generated from the calibration information and the image of the test pattern captured by the reference imaging component. The corrected image is then used with the images captured by each of the associate imaging components associated with the reference component to generate calibration information for the associate imaging components.
H04N 13/282 - Image signal generators for generating image signals corresponding to three or more geometrical viewpoints, e.g. multi-view systems
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
G06T 7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
H04N 17/02 - Diagnosis, testing or measuring for television systems or their details for colour television signals
A camera comprises a lens assembly coupled to an event-sensor, the lens assembly being configured to focus a light field onto a surface of the event-sensor, the event-sensor surface comprising a plurality of light-sensitive pixels, each of which causes an event to be generated when there is a change in light intensity greater than a threshold amount incident on the pixel. The camera further includes an actuator which can be triggered to cause a change in the light field incident on the surface of the event-sensor and to generate a set of events from a sub-set of pixels distributed across the surface of the event-sensor.
G06K 9/60 - Combination of image acquisition and preprocessing functions
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
H04N 5/341 - Extracting pixel data from an image sensor by controlling scanning circuits, e.g. by modifying the number of pixels having been sampled or to be sampled
18.
Systems and methods for synthesizing high resolution images using images captured by an array of independently controllable imagers
Systems and methods in accordance with embodiments of the invention are disclosed that use super-resolution (SR) processes to use information from a plurality of low resolution (LR) images captured by an array camera to produce a synthesized higher resolution image. One embodiment includes obtaining input images using the plurality of imagers, using a microprocessor to determine an initial estimate of at least a portion of a high resolution image using a plurality of pixels from the input images, and using a microprocessor to determine a high resolution image that when mapped through the forward imaging transformation matches the input images to within at least one predetermined criterion using the initial estimate of at least a portion of the high resolution image. In addition, each forward imaging transformation corresponds to the manner in which each imager in the imaging array generates the input images, and the high resolution image synthesized by the microprocessor has a resolution that is greater than any of the input images.
Systems and methods for estimating depth from projected texture using camera arrays are described. A camera array includes a conventional camera and at least one two-dimensional array of cameras, where the conventional camera has a higher resolution than the cameras in the at least one two-dimensional array of cameras, and an illumination system configured to illuminate a scene with a projected texture. An image processing pipeline application directs a processor to: utilize the illumination system controller application to control the illumination system to illuminate a scene with a projected texture, capture a set of images of the scene illuminated with the projected texture, and determine depth estimates for pixel locations in an image from a reference viewpoint using at least a subset of the set of images.
G01B 11/22 - Measuring arrangements characterised by the use of optical techniques for measuring depth
G01B 11/25 - Measuring arrangements characterised by the use of optical techniques for measuring contours or curvatures by projecting a pattern, e.g. moiré fringes, on the object
G06T 7/521 - Depth or shape recovery from laser ranging, e.g. using interferometry; Depth or shape recovery from the projection of structured light
G06T 7/593 - Depth or shape recovery from multiple images from stereo images
G06T 7/557 - Depth or shape recovery from multiple images from light fields, e.g. from plenoptic cameras
A method and system for detecting facial expressions in digital images and applications therefor are disclosed. Analysis of a digital image determines whether or not a smile and/or blink is present on a person's face. Face recognition, and/or a pose or illumination condition determination, permits application of a specific, relatively small classifier cascade.
Embodiments of the invention provide a camera array imaging architecture that computes depth maps for objects within a scene captured by the cameras, and uses a near-field sub-array of cameras to compute depth to near-field objects and a far-field sub-array of cameras to compute depth to far-field objects. In particular, a baseline distance between cameras in the near-field sub-array is less than a baseline distance between cameras in the far-field sub-array in order to increase the accuracy of the depth map. Some embodiments provide a near-IR illumination light source for use in computing depth maps.
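The reason for pairing a short baseline with near-field objects and a long baseline with far-field objects can be illustrated with the standard pinhole stereo relation, disparity = focal length × baseline / depth. The sketch below uses that textbook relation, not anything specific to the patent; the focal length, baselines and depths are hypothetical.

```python
def disparity_px(focal_px, baseline_mm, depth_mm):
    """Disparity in pixels of a point at depth_mm for a rectified stereo pair."""
    return focal_px * baseline_mm / depth_mm


def depth_error_mm(focal_px, baseline_mm, depth_mm, disparity_error_px=1.0):
    """First-order depth uncertainty caused by a given disparity error."""
    return depth_mm ** 2 * disparity_error_px / (focal_px * baseline_mm)


FOCAL_PX = 1000.0                  # hypothetical focal length in pixels
NEAR_BASE, FAR_BASE = 10.0, 50.0   # hypothetical baselines in mm

# A near object (200 mm): the long baseline gives a very large disparity to search.
near_disparities = (disparity_px(FOCAL_PX, NEAR_BASE, 200.0),
                    disparity_px(FOCAL_PX, FAR_BASE, 200.0))

# A far object (5000 mm): the long baseline gives a much smaller depth error.
far_errors = (depth_error_mm(FOCAL_PX, NEAR_BASE, 5000.0),
              depth_error_mm(FOCAL_PX, FAR_BASE, 5000.0))
print(near_disparities, far_errors)
```

In this toy setup the short baseline keeps near-field disparities manageable, while the long baseline reduces the depth error at range by the ratio of the two baselines.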
A method for correcting an image divides an output image into a grid with vertical sections of width smaller than the image width but wide enough to allow efficient bursts when writing distortion corrected line sections into memory. A distortion correction engine includes a relatively small amount of memory for an input image buffer but without requiring unduly complex control. The input image buffer accommodates enough lines of an input image to cover the distortion of a single most vertically distorted line section of the input image. The memory required for the input image buffer can be significantly less than would be required to store all the lines of a distorted input image spanning a maximal distortion of a complete line within the input image.
Systems and methods for implementing array cameras configured to perform super-resolution processing to generate higher resolution super-resolved images using a plurality of captured images and lens stack arrays that can be utilized in array cameras are disclosed. An imaging device in accordance with one embodiment of the invention includes at least one imager array, and each imager in the array comprises a plurality of light sensing elements and a lens stack including at least one lens surface, where the lens stack is configured to form an image on the light sensing elements, control circuitry configured to capture images formed on the light sensing elements of each of the imagers, and a super-resolution processing module configured to generate at least one higher resolution super-resolved image using a plurality of the captured images.
H04N 5/365 - Noise processing, e.g. detecting, correcting, reducing or removing noise applied to fixed-pattern noise, e.g. non-uniformity of response
H04N 13/128 - Adjusting depth or disparity
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
H04N 5/33 - Transforming infrared radiation
H04N 5/341 - Extracting pixel data from an image sensor by controlling scanning circuits, e.g. by modifying the number of pixels having been sampled or to be sampled
H04N 5/349 - Extracting pixel data from an image sensor by controlling scanning circuits, e.g. by modifying the number of pixels having been sampled or to be sampled for increasing resolution by shifting the sensor relative to the scene
H04N 5/357 - Noise processing, e.g. detecting, correcting, reducing or removing noise
G06T 7/557 - Depth or shape recovery from multiple images from light fields, e.g. from plenoptic cameras
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
G06T 7/50 - Depth or shape recovery
H04N 9/097 - Optical arrangements associated with the pick-up devices, e.g. for beam-splitting or for colour correction
G06T 19/20 - Editing of three-dimensional [3D] images, e.g. changing shapes or colours, aligning objects or positioning parts
H04N 9/09 - Picture signal generators with more than one pick-up device
H04N 9/73 - Circuits for colour balance, e.g. white balance circuits or colour temperature control
In an embodiment, a 3D facial modeling system includes a plurality of cameras configured to capture images from different viewpoints, a processor, and a memory containing a 3D facial modeling application and parameters defining a face detector, wherein the 3D facial modeling application directs the processor to obtain a plurality of images of a face captured from different viewpoints using the plurality of cameras, locate a face within each of the plurality of images using the face detector, wherein the face detector labels key feature points on the located face within each of the plurality of images, determine disparity between corresponding key feature points of located faces within the plurality of images, and generate a 3D model of the face using the depth of the key feature points.
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
G06T 7/149 - Segmentation; Edge detection involving deformable models, e.g. active contour models
G06T 7/593 - Depth or shape recovery from multiple images from stereo images
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
A neural network engine comprises a plurality of floating point multipliers, each having an input connected to an input map value and an input connected to a corresponding kernel value. Pairs of multipliers provide outputs to a tree of nodes, each node of the tree being configured to provide a floating point output corresponding to either: a larger of the inputs of the node; or a sum of the inputs, one output node of the tree providing a first input of an output module, and one of the multipliers providing an output to a second input of the output module. The engine is configured to process either a convolution layer of a neural network, an average pooling layer or a max pooling layer according to the kernel values and whether the nodes and output module are configured to output a larger or a sum of their inputs.
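The way one multiply-and-reduce structure can serve convolution, average pooling and max pooling, as in the neural network engine abstract above, can be shown in software. This is a simplified stand-in: it reduces sequentially rather than through a hardware tree, it omits the separate output module, and the function name and values are hypothetical.

```python
from functools import reduce


def engine(inputs, kernel, use_max: bool):
    """Multiply each input by its kernel value, then reduce with max or sum."""
    products = [x * k for x, k in zip(inputs, kernel)]
    combine = max if use_max else (lambda a, b: a + b)
    return reduce(combine, products)


window = [1.0, 4.0, 2.0, 3.0]

# Convolution: trained kernel values, nodes configured to sum.
conv = engine(window, [0.5, -1.0, 2.0, 0.0], use_max=False)
# Average pooling: every kernel value is 1/N, nodes configured to sum.
avg = engine(window, [0.25] * 4, use_max=False)
# Max pooling: every kernel value is 1, nodes configured to pass the larger input.
mx = engine(window, [1.0] * 4, use_max=True)
print(conv, avg, mx)
```

The layer type is therefore selected purely by the kernel values and the node configuration, with no change to the datapath.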
Systems and methods for calibrating an array camera are disclosed. Systems and methods for calibrating an array camera in accordance with embodiments of this invention include the capturing of an image of a test pattern with the array camera such that each imaging component in the array camera captures an image of the test pattern. The image of the test pattern captured by a reference imaging component is then used to derive calibration information for the reference component. A corrected image of the test pattern for the reference component is then generated from the calibration information and the image of the test pattern captured by the reference imaging component. The corrected image is then used with the images captured by each of the associate imaging components associated with the reference component to generate calibration information for the associate imaging components.
H04N 13/282 - Image signal generators for generating image signals corresponding to three or more geometrical viewpoints, e.g. multi-view systems
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
G06T 7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
H04N 17/02 - Diagnosis, testing or measuring for television systems or their details for colour television signals
27.
Systems and methods for hybrid depth regularization
Systems and methods for hybrid depth regularization in accordance with various embodiments of the invention are disclosed. In one embodiment of the invention, a depth sensing system comprises a plurality of cameras; a processor; and a memory containing an image processing application. The image processing application may direct the processor to obtain image data for a plurality of images from multiple viewpoints, the image data comprising a reference image and at least one alternate view image; generate a raw depth map using a first depth estimation process, and a confidence map; and generate a regularized depth map. The regularized depth map may be generated by computing a secondary depth map using a second different depth estimation process; and computing a composite depth map by selecting depth estimates from the raw depth map and the secondary depth map based on the confidence map.
G06T 7/593 - Depth or shape recovery from multiple images, from stereo images
G06T 7/44 - Analysis of texture based on statistical description of texture using image operators, e.g. filters, edge density metrics or local histograms
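The composite step of the hybrid depth regularization abstract above (selecting between the raw and secondary depth maps based on the confidence map) can be illustrated with a minimal sketch. The abstract does not state the selection rule, so the fixed `threshold` and the function name `composite_depth_map` are assumptions made for illustration only.

```python
def composite_depth_map(raw, secondary, confidence, threshold=0.5):
    """Build a composite depth map from two candidate maps.

    Assumed rule (not taken from the abstract): keep the raw estimate
    where its confidence reaches `threshold`, otherwise fall back to
    the secondary estimate.
    """
    return [
        [r if c >= threshold else s
         for r, s, c in zip(raw_row, sec_row, conf_row)]
        for raw_row, sec_row, conf_row in zip(raw, secondary, confidence)
    ]


# Tiny 2x2 example: the top-right raw estimate is an outlier with low confidence.
raw = [[1.0, 9.9], [2.0, 2.1]]
secondary = [[1.1, 1.2], [2.2, 2.3]]
confidence = [[0.9, 0.1], [0.8, 0.4]]
composite = composite_depth_map(raw, secondary, confidence)
print(composite)
```

Low-confidence pixels take the secondary estimate, so the outlier in the raw map does not reach the composite map.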
The present invention relates to an image processing apparatus which determines an order for calculating output image pixels that maximally reuses data in a local memory for computing all relevant output image pixels. Thus, the same set of data is re-used until it is no longer necessary. Output image pixel locations are browsed to determine pixel values in an order imposed by available input data, rather than in an order imposed by pixel positions in the output image. Consequently, the amount of storage required for local memory as well as the number of input image read requests and data read from memory containing the input image is minimized.
A convolutional neural network (CNN) for an image processing system comprises an image cache responsive to a request to read a block of N×M pixels extending from a specified location within an input map to provide a block of N×M pixels at an output port. A convolution engine reads blocks of pixels from the output port, combines blocks of pixels with a corresponding set of weights to provide a product, and subjects the product to an activation function to provide an output pixel value. The image cache comprises a plurality of interleaved memories capable of simultaneously providing the N×M pixels at the output port in a single clock cycle. A controller provides a set of weights to the convolution engine before processing an input map, causes the convolution engine to scan across the input map by incrementing a specified location for successive blocks of pixels and generates an output map within the image cache by writing output pixel values to successive locations within the image cache.
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06K 9/46 - Extraction of features or characteristics of the image
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06N 3/04 - Architecture, e.g. interconnection topology
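The data flow in the CNN abstract above (read an N×M block at a specified location, combine it with a set of weights, apply an activation function, then advance the location) can be sketched in software. This is only a functional model: the single-cycle block read from interleaved memories is a hardware property that the loop below does not reproduce, and the names `read_block`, `convolve_map` and the choice of ReLU as the activation are assumptions.

```python
def relu(x):
    """Assumed activation function; the abstract does not name one."""
    return x if x > 0.0 else 0.0


def read_block(image, row, col, n, m):
    """Return the n x m block whose top-left corner is at (row, col)."""
    return [image[row + i][col:col + m] for i in range(n)]


def convolve_map(image, weights, activation=relu):
    """Scan the input map block by block and build the output map."""
    n, m = len(weights), len(weights[0])
    output = []
    for row in range(len(image) - n + 1):
        out_row = []
        for col in range(len(image[0]) - m + 1):
            block = read_block(image, row, col, n, m)
            # Multiply each pixel by its weight and accumulate the products.
            acc = sum(block[i][j] * weights[i][j]
                      for i in range(n) for j in range(m))
            out_row.append(activation(acc))
        output.append(out_row)
    return output


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
weights = [[0, 1],
           [1, 0]]
output_map = convolve_map(image, weights)
print(output_map)
```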
Systems and methods in accordance with embodiments of the invention are configured to render images using light field image files containing an image synthesized from light field image data and metadata describing the image that includes a depth map. One embodiment of the invention includes a processor and memory containing a rendering application and a light field image file including an encoded image, a set of low resolution images, and metadata describing the encoded image, where the metadata comprises a depth map that specifies depths from the reference viewpoint for pixels in the encoded image. In addition, the rendering application configures the processor to: locate the encoded image within the light field image file; decode the encoded image; locate the metadata within the light field image file; and post process the decoded image by modifying the pixels based on the depths indicated within the depth map and the set of low resolution images to create a rendered image.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/36 - Image preprocessing, i.e. processing the image information without deciding about the identity of the image
H04N 13/128 - Adjusting depth or disparity
H04N 13/161 - Encoding, multiplexing or demultiplexing different image signal components
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
H04N 13/271 - Image signal generators wherein the generated image signals comprise depth maps or disparity maps
G06T 7/50 - Depth or shape recovery
G06T 9/20 - Contour coding, e.g. using detection of edges
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 19/625 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using a discrete cosine transform
H04N 19/136 - Incoming video signal characteristics or properties
G06T 3/40 - Scaling of whole images or parts thereof, e.g. expanding or contracting
H04N 19/85 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
Systems in accordance with embodiments of the invention can perform parallax detection and correction in images captured using array cameras. Due to the different viewpoints of the cameras, parallax results in variations in the position of objects within the captured images of the scene. Methods in accordance with embodiments of the invention provide an accurate account of the pixel disparity due to parallax between the different cameras in the array, so that appropriate scene-dependent geometric shifts can be applied to the pixels of the captured images when performing super-resolution processing. In a number of embodiments, generating depth estimates considers the similarity of pixels in multiple spectral channels. In certain embodiments, generating depth estimates involves generating a confidence map indicating the reliability of depth estimates.
H04N 13/232 - Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
H04N 9/097 - Optical arrangements associated with the analysing devices, e.g. for beam-splitting or for colour correction
Systems and methods for implementing array cameras configured to perform super-resolution processing to generate higher resolution super-resolved images using a plurality of captured images and lens stack arrays that can be utilized in array cameras are disclosed. An imaging device in accordance with one embodiment of the invention includes at least one imager array, and each imager in the array comprises a plurality of light sensing elements and a lens stack including at least one lens surface, where the lens stack is configured to form an image on the light sensing elements, control circuitry configured to capture images formed on the light sensing elements of each of the imagers, and a super-resolution processing module configured to generate at least one higher resolution super-resolved image using a plurality of the captured images.
H04N 13/239 - Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
H04N 5/247 - Arrangement of television cameras
G02B 13/00 - Optical objectives specially designed for the purposes specified below
H04N 5/365 - Noise processing, e.g. detecting, correcting, reducing or removing noise, applied to fixed-pattern noise, e.g. non-uniformity of response
H04N 13/128 - Adjusting depth or disparity
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
H04N 5/33 - Transforming infrared radiation
H04N 5/341 - Extracting pixel data from an image sensor by controlling scanning circuits, e.g. by modifying the number of pixels having been sampled or to be sampled
H04N 5/349 - Extracting pixel data from an image sensor by controlling scanning circuits, e.g. by modifying the number of pixels having been sampled or to be sampled, for increasing resolution by shifting the sensor relative to the scene
Systems and methods in accordance with embodiments of the invention are disclosed that use super-resolution (SR) processes to use information from a plurality of low resolution (LR) images captured by an array camera to produce a synthesized higher resolution image. One embodiment includes obtaining input images using the plurality of imagers, using a microprocessor to determine an initial estimate of at least a portion of a high resolution image using a plurality of pixels from the input images, and using a microprocessor to determine a high resolution image that when mapped through the forward imaging transformation matches the input images to within at least one predetermined criterion using the initial estimate of at least a portion of the high resolution image. In addition, each forward imaging transformation corresponds to the manner in which each imager in the imaging array generates the input images, and the high resolution image synthesized by the microprocessor has a resolution that is greater than any of the input images.
A neural network engine comprises a plurality of floating point multipliers, each having an input connected to an input map value and an input connected to a corresponding kernel value. Pairs of multipliers provide outputs to a tree of nodes, each node of the tree being configured to provide a floating point output corresponding to either: a larger of the inputs of the node; or a sum of the inputs, one output node of the tree providing a first input of an output module, and one of the multipliers providing an output to a second input of the output module. The engine is configured to process either a convolution layer of a neural network, an average pooling layer or a max pooling layer according to the kernel values and whether the nodes and output module are configured to output a larger or a sum of their inputs.
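The way one multiplier-and-tree structure can serve convolution, average pooling and max pooling, as described in the neural network engine abstract above, can be modelled functionally. In this sketch the node behaviour is chosen with a `mode` string and the layer type follows from the kernel values; the names `reduce_tree` and `engine` and the 3×3 window are assumptions, and ordinary Python floats stand in for the hardware floating point units.

```python
def reduce_tree(values, node_op):
    """Combine values pairwise, level by level, like a binary tree of nodes."""
    level = list(values)
    while len(level) > 1:
        nxt = [node_op(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # carry an unpaired value to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]


def engine(inputs, kernel, mode):
    """mode 'sum': nodes add their inputs; mode 'max': nodes keep the larger."""
    node_op = max if mode == "max" else (lambda a, b: a + b)
    products = [x * k for x, k in zip(inputs, kernel)]
    # All multipliers but one feed the tree; the remaining product goes
    # straight to the output module, which applies the same operation.
    tree_out = reduce_tree(products[:-1], node_op)
    return node_op(tree_out, products[-1])


window = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0]   # a 3x3 input window

conv = engine(window, [0.5] * 9, "sum")       # convolution: kernel = weights
avg = engine(window, [1.0 / 9] * 9, "sum")    # average pooling: kernel = 1/n
mx = engine(window, [1.0] * 9, "max")         # max pooling: kernel = 1
print(conv, avg, mx)
```

With nine inputs, eight products pair up cleanly in the tree and the ninth enters at the output module, which matches the arrangement the abstract describes.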
Systems and methods for implementing array cameras configured to perform super-resolution processing to generate higher resolution super-resolved images using a plurality of captured images and lens stack arrays that can be utilized in array cameras are disclosed. Lens stack arrays in accordance with many embodiments of the invention include lens elements formed on substrates separated by spacers, where the lens elements, substrates and spacers are configured to form a plurality of optical channels, at least one aperture located within each optical channel, at least one spectral filter located within each optical channel, where each spectral filter is configured to pass a specific spectral band of light, and light blocking materials located within the lens stack array to optically isolate the optical channels.
H04N 5/341 - Extracting pixel data from an image sensor by controlling scanning circuits, e.g. by modifying the number of pixels having been sampled or to be sampled
Systems and methods for calibrating an array camera are disclosed. Systems and methods for calibrating an array camera in accordance with embodiments of this invention include the capturing of an image of a test pattern with the array camera such that each imaging component in the array camera captures an image of the test pattern. The image of the test pattern captured by a reference imaging component is then used to derive calibration information for the reference component. A corrected image of the test pattern for the reference component is then generated from the calibration information and the image of the test pattern captured by the reference imaging component. The corrected image is then used with the images captured by each of the associate imaging components associated with the reference component to generate calibration information for the associate imaging components.
H04N 13/282 - Image signal generators for generating image signals corresponding to three or more geometrical viewpoints, e.g. multi-view systems
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
G06T 7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
H04N 17/02 - Diagnosis, testing or measuring for television systems or their details, for colour television signals
37.
Systems and methods for depth estimation using generative models
Systems and methods for depth estimation in accordance with embodiments of the invention are illustrated. One embodiment includes a method for estimating depth from images. The method includes steps for receiving a plurality of source images captured from a plurality of different viewpoints using a processing system configured by an image processing application, generating a target image from a target viewpoint that is different to the viewpoints of the plurality of source images based upon a set of generative model parameters using the processing system configured by the image processing application, and identifying depth information of at least one output image based on the predicted target image using the processing system configured by the image processing application.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06T 7/55 - Depth or shape recovery from multiple images
G06T 7/70 - Determining position or orientation of objects or cameras
G06N 3/04 - Architecture, e.g. interconnection topology
38.
Optical systems for cameras incorporating lens elements formed separately and subsequently bonded to low CTE substrates
Systems and methods in accordance with embodiments of the invention implement optical systems incorporating lens elements formed separately and subsequently bonded to low coefficient of thermal expansion substrates. Optical systems in accordance with various embodiments of the invention can be utilized in single aperture cameras, and multiple-aperture array cameras. In one embodiment, a robust optical system includes at least one carrier characterized by a low coefficient of thermal expansion to which at least a primary lens element formed from precision molded glass is bonded.
G02B 7/02 - Mountings, adjusting means, or light-tight connections, for optical elements, for lenses
B29D 11/00 - Producing optical elements, e.g. lenses or prisms
G02B 1/04 - Optical elements characterised by the material of which they are made; Optical coatings for optical elements made of organic materials, e.g. plastics
G03B 17/12 - Camera bodies with means for supporting objectives, supplementary lenses, filters, masks, or turrets
H04N 9/07 - Picture signal generators with one pick-up device only
Systems and methods in accordance with embodiments of the invention actively align a lens stack array with an array of focal planes to construct an array camera module. In one embodiment, a method for actively aligning a lens stack array with a sensor that has a focal plane array includes: aligning the lens stack array relative to the sensor in an initial position; varying the spatial relationship between the lens stack array and the sensor; capturing images of a known target that has a region of interest using a plurality of active focal planes at different spatial relationships; scoring the images based on the extent to which the region of interest is focused in the images; selecting a spatial relationship between the lens stack array and the sensor based on a comparison of the scores; and forming an array camera subassembly based on the selected spatial relationship.
H04N 17/00 - Diagnosis, testing or measuring for television systems or their details
G02B 7/00 - Mountings, adjusting means, or light-tight connections, for optical elements
H04N 5/341 - Extracting pixel data from an image sensor by controlling scanning circuits, e.g. by modifying the number of pixels having been sampled or to be sampled
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
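The score-and-select loop of the active alignment abstract above (capture images at several spatial relationships, score how well the region of interest is focused, and pick a relationship by comparing scores) can be sketched as follows. The focus metric (sum of squared differences between horizontally adjacent pixels), the summing of scores across focal planes, and the names `focus_score` and `select_position` are all assumptions; the abstract does not specify how images are scored.

```python
def focus_score(roi):
    """Assumed sharpness measure: sum of squared horizontal pixel differences."""
    return sum((row[i + 1] - row[i]) ** 2
               for row in roi
               for i in range(len(row) - 1))


def select_position(captures):
    """captures maps a candidate position to the ROI images captured there,
    one image per active focal plane. Returns the best position and all scores."""
    scores = {pos: sum(focus_score(roi) for roi in rois)
              for pos, rois in captures.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical one-row ROIs: a sharp edge pattern, a blurred one, and a flat one.
sharp = [[0, 10, 0, 10]]
blurry = [[4, 6, 4, 6]]
flat = [[5, 5, 5, 5]]

# Keys are hypothetical lens-to-sensor offsets; two focal planes per position.
captures = {0.0: [blurry, flat], 0.1: [sharp, sharp], 0.2: [blurry, blurry]}
best, scores = select_position(captures)
print(best, scores)
```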
A neural network image processing apparatus is arranged to acquire images from an image sensor and to: identify a ROI containing a face region in an image; determine a plurality of facial landmarks in the face region; use the facial landmarks to transform the face region within the ROI into a face region having a given pose; and use transformed landmarks within the transformed face region to identify a pair of eye regions within the transformed face region. Each identified eye region is fed to a respective first and second convolutional neural network, each network configured to produce a respective feature vector. Each feature vector is fed to respective eyelid opening level neural networks to obtain respective measures of eyelid opening for each eye region. The feature vectors are combined and fed to a gaze angle neural network to generate gaze yaw and pitch values substantially simultaneously with the eyelid opening values.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
41.
Method and system for correcting a distorted input image
A method for correcting an image divides an output image into a grid with vertical sections of width smaller than the image width but wide enough to allow efficient bursts when writing distortion corrected line sections into memory. A distortion correction engine includes a relatively small amount of memory for an input image buffer but without requiring unduly complex control. The input image buffer accommodates enough lines of an input image to cover the distortion of a single most vertically distorted line section of the input image. The memory required for the input image buffer can be significantly less than would be required to store all the lines of a distorted input image spanning a maximal distortion of a complete line within the input image.
Systems and methods for extended color processing on Pelican array cameras in accordance with embodiments of the invention are disclosed. In one embodiment, a method of generating a high resolution image includes obtaining input images, where a first set of images includes information in a first band of visible wavelengths and a second set of images includes information in a second band of visible wavelengths and non-visible wavelengths, determining an initial estimate by combining the first set of images into a first fused image, combining the second set of images into a second fused image, spatially registering the fused images, denoising the fused images using bilateral filters, normalizing the second fused image in the photometric reference space of the first fused image, combining the fused images, determining a high resolution image that when mapped through a forward imaging transformation matches the input images within at least one predetermined criterion.
Systems and methods for transmitting and receiving image data captured by an imager array including a plurality of focal planes are described. One embodiment of the invention includes capturing image data using a plurality of active focal planes in a camera module, where an image is formed on each active focal plane by a separate lens stack, generating lines of image data by interleaving the image data captured by the plurality of active focal planes, and transmitting the lines of image data and the additional data.
In an embodiment, a 3D facial modeling system includes a plurality of cameras configured to capture images from different viewpoints, a processor, and a memory containing a 3D facial modeling application and parameters defining a face detector, wherein the 3D facial modeling application directs the processor to obtain a plurality of images of a face captured from different viewpoints using the plurality of cameras, locate a face within each of the plurality of images using the face detector, wherein the face detector labels key feature points on the located face within each of the plurality of images, determine disparity between corresponding key feature points of located faces within the plurality of images, and generate a 3D model of the face using the depth of the key feature points.
In an embodiment, a 3D facial modeling system includes a plurality of cameras configured to capture images from different viewpoints, a processor, and a memory containing a 3D facial modeling application and parameters defining a face detector, wherein the 3D facial modeling application directs the processor to obtain a plurality of images of a face captured from different viewpoints using the plurality of cameras, locate a face within each of the plurality of images using the face detector, wherein the face detector labels key feature points on the located face within each of the plurality of images, determine disparity between corresponding key feature points of located faces within the plurality of images, and generate a 3D model of the face using the depth of the key feature points.
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
G06T 7/149 - Segmentation; Edge detection involving deformable models, e.g. active contour models
G06T 7/593 - Depth or shape recovery from multiple images, from stereo images
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
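The 3D facial modeling abstract above relies on turning the disparity between corresponding key feature points into depth. A minimal sketch of that conversion follows, assuming a rectified two-camera pair and the standard pinhole relation depth = focal length × baseline / disparity. The abstract does not give this formula; the function name, the camera parameters and the key point coordinates are hypothetical.

```python
def depth_from_disparity(x_ref, x_alt, focal_px, baseline_m):
    """Depth in metres of a key point seen at column x_ref in the reference
    image and at column x_alt in the alternate image (rectified pair assumed)."""
    disparity = x_ref - x_alt
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity


# Hypothetical key feature points: (column in reference view, column in alternate view).
key_points = {"nose_tip": (420.0, 380.0), "left_eye": (400.0, 368.0)}

depths = {name: depth_from_disparity(x_ref, x_alt, focal_px=800.0, baseline_m=0.05)
          for name, (x_ref, x_alt) in key_points.items()}
print(depths)
```

The nose tip has the larger disparity and therefore the smaller depth, which is the cue a 3D face model would be built from.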
46.
Method and system for correcting a distorted input image
A method for correcting an image divides an output image into a grid with vertical sections of width smaller than the image width but wide enough to allow efficient bursts when writing distortion corrected line sections into memory. A distortion correction engine includes a relatively small amount of memory for an input image buffer but without requiring unduly complex control. The input image buffer accommodates enough lines of an input image to cover the distortion of a single most vertically distorted line section of the input image. The memory required for the input image buffer can be significantly less than would be required to store all the lines of a distorted input image spanning a maximal distortion of a complete line within the input image.
Systems and methods for implementing array camera configurations that include a plurality of constituent array cameras, where each constituent array camera provides a distinct field of view and/or a distinct viewing direction, are described. In several embodiments, image data captured by the constituent array cameras is used to synthesize multiple images that are subsequently blended. In a number of embodiments, the blended images include a foveated region. In certain embodiments, the blended images possess a wider field of view than the fields of view of the multiple images.
H04N 5/369 - Transforming light or analogous information into electric information using solid-state image sensors [SSIS]; circuitry associated therewith
H04N 5/247 - Arrangement of television cameras
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
G06T 7/557 - Depth or shape recovery from multiple images, from light fields, e.g. from plenoptic cameras
G06T 3/40 - Scaling of whole images or parts thereof, e.g. expanding or contracting
H04N 17/00 - Diagnosis, testing or measuring for television systems or their details
G02B 13/02 - Telephoto objectives, i.e. systems of the type + - in which the distance from the front vertex to the image plane is less than the equivalent focal length
H04N 13/232 - Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
48.
Stereoscopic (3D) panorama creation on handheld device
A technique of generating a stereoscopic panorama image includes panning a portable camera device, and acquiring multiple image frames. Multiple at least partially overlapping image frames are acquired of portions of the scene. The method involves registering the image frames, including determining displacements of the imaging device between acquisitions of image frames. Multiple panorama images are generated including joining image frames of the scene according to spatial relationships and determining stereoscopic counterpart relationships between the multiple panorama images. The multiple panorama images are processed based on the stereoscopic counterpart relationships to form a stereoscopic panorama image.
Systems with an array camera augmented with a conventional camera in accordance with embodiments of the invention are disclosed. In some embodiments, the array camera is used to capture a first set of image data of a scene and a conventional camera is used to capture a second set of image data for the scene. An object of interest is identified in the first set of image data. A first depth measurement for the object of interest is determined and compared to a predetermined threshold. If the first depth measurement is above the threshold, a second set of image data captured using the conventional camera is obtained. The object of interest is identified in the second set of image data and a second depth measurement for the object of interest is determined using at least a portion of the first set of image data and at least a portion of the second set of image data.
H04N 13/271 - Image signal generators wherein the generated image signals comprise depth maps or disparity maps
G01P 3/38 - Devices characterised by the use of optical means, e.g. using infrared, visible or ultraviolet light, using photographic means
H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
H04N 13/232 - Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
A peripheral processing device comprises a physical interface for connecting the processing device to a host computing device through a communications protocol. A local controller connected to local memory across an internal bus provides input/output access to data stored on the processing device to the host through a file system API. A neural processor comprises at least one network processing engine for processing a layer of a neural network according to a network configuration. A memory at least temporarily stores network configuration information, input image information, intermediate image information and output information produced by each network processing engine. The local controller is arranged to receive network configuration information through a file system API write command, to receive input image information through a file system API write command; and to write output information to the local memory for retrieval by the host through a file system API read command.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06N 3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
A peripheral processing device comprises a physical interface for connecting the processing device to a host computing device through a communications protocol. A local controller connected to local memory across an internal bus provides input/output access to data stored on the processing device to the host through a file system API. A neural processor comprises at least one network processing engine for processing a layer of a neural network according to a network configuration. A memory at least temporarily stores network configuration information, input image information, intermediate image information and output information produced by each network processing engine. The local controller is arranged to receive network configuration information through a file system API write command, to receive input image information through a file system API write command; and to write output information to the local memory for retrieval by the host through a file system API read command.
G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons, using electronic means
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
52.
Systems and methods for estimating depth from projected texture using camera arrays
Systems and methods in accordance with embodiments of the invention estimate depth from projected texture using camera arrays. One embodiment of the invention includes: at least one two-dimensional array of cameras comprising a plurality of cameras; an illumination system configured to illuminate a scene with a projected texture; a processor; and memory containing an image processing pipeline application and an illumination system controller application. In addition, the illumination system controller application directs the processor to control the illumination system to illuminate a scene with a projected texture. Furthermore, the image processing pipeline application directs the processor to: utilize the illumination system controller application to control the illumination system to illuminate a scene with a projected texture capture a set of images of the scene illuminated with the projected texture; determining depth estimates for pixel locations in an image from a reference viewpoint using at least a subset of the set of images. Also, generating a depth estimate for a given pixel location in the image from the reference viewpoint includes: identifying pixels in the at least a subset of the set of images that correspond to the given pixel location in the image from the reference viewpoint based upon expected disparity at a plurality of depths along a plurality of epipolar lines aligned at different angles; comparing the similarity of the corresponding pixels identified at each of the plurality of depths; and selecting the depth from the plurality of depths at which the identified corresponding pixels have the highest degree of similarity as a depth estimate for the given pixel location in the image from the reference viewpoint.
G01B 11/22 - Dispositions pour la mesure caractérisées par l'utilisation de techniques optiques pour mesurer la profondeur
G01B 11/25 - Dispositions pour la mesure caractérisées par l'utilisation de techniques optiques pour mesurer des contours ou des courbes en projetant un motif, p. ex. des franges de moiré, sur l'objet
G06T 7/521 - Récupération de la profondeur ou de la forme à partir de la télémétrie laser, p. ex. par interférométrieRécupération de la profondeur ou de la forme à partir de la projection de lumière structurée
G06T 7/593 - Récupération de la profondeur ou de la forme à partir de plusieurs images à partir d’images stéréo
G06T 7/557 - Récupération de la profondeur ou de la forme à partir de plusieurs images à partir des champs de lumière, p. ex. de caméras plénoptiques
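The depth search described in the projected-texture abstract above (test several candidate depths, gather the pixels each camera would see at that depth, keep the depth where they agree best) can be sketched as follows. This is only an illustrative sketch, not the claimed implementation: the pinhole disparity model `focal * baseline / depth`, the use of variance as the similarity measure, and all function and parameter names are assumptions.

```python
def estimate_depth(ref, views, x, y, depths, focal=1.0):
    """Pick the candidate depth at which corresponding pixels agree best.

    ref    : reference image as a list of rows
    views  : list of (image, (bx, by)); (bx, by) is the hypothetical baseline
             of that camera relative to the reference camera, so cameras with
             different baselines search along differently oriented epipolar lines
    depths : candidate depths to test
    """
    best_depth, best_cost = None, float("inf")
    for d in depths:
        samples = [ref[y][x]]
        for img, (bx, by) in views:
            # Expected disparity at depth d along this camera's epipolar line
            # (assumed model: disparity = focal * baseline / depth).
            u = x + round(focal * bx / d)
            v = y + round(focal * by / d)
            if 0 <= v < len(img) and 0 <= u < len(img[0]):
                samples.append(img[v][u])
        if len(samples) < 2:
            continue  # no other camera sees this hypothesis
        mean = sum(samples) / len(samples)
        # Low variance across cameras means high similarity.
        cost = sum((s - mean) ** 2 for s in samples) / len(samples)
        if cost < best_cost:
            best_depth, best_cost = d, cost
    return best_depth
```

With one camera offset horizontally and one vertically, a pixel is only consistent across all views at the depth whose disparity matches the true shift along both epipolar directions.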
Systems and methods for hybrid depth regularization in accordance with various embodiments of the invention are disclosed. In one embodiment of the invention, a depth sensing system comprises a plurality of cameras; a processor; and a memory containing an image processing application. The image processing application may direct the processor to obtain image data for a plurality of images from multiple viewpoints, the image data comprising a reference image and at least one alternate view image; generate a raw depth map using a first depth estimation process, and a confidence map; and generate a regularized depth map. The regularized depth map may be generated by computing a secondary depth map using a second different depth estimation process; and computing a composite depth map by selecting depth estimates from the raw depth map and the secondary depth map based on the confidence map.
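The final step of the hybrid regularization abstract above, selecting per pixel between the raw and the secondary depth map according to a confidence map, might look like the sketch below. The threshold rule and the names `composite_depth` and `threshold` are assumptions made for illustration; the abstract does not state how the confidence map drives the selection.

```python
def composite_depth(raw, secondary, confidence, threshold=0.5):
    """Keep the raw estimate where confidence is high, else the secondary one.

    All three inputs are same-sized 2D lists. The fixed threshold is a
    hypothetical selection rule, not taken from the source.
    """
    return [
        [r if c >= threshold else s for r, s, c in zip(raw_row, sec_row, conf_row)]
        for raw_row, sec_row, conf_row in zip(raw, secondary, confidence)
    ]
```

For example, `composite_depth([[1.0, 2.0]], [[9.0, 8.0]], [[0.9, 0.1]])` keeps the first raw value and replaces the second with the secondary estimate.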
Systems and methods in accordance with embodiments of this invention perform depth regularization and semiautomatic interactive matting using images. In an embodiment of the invention, the image processing pipeline application directs a processor to receive (i) an image and (ii) an initial depth map corresponding to the depths of pixels within the image, regularize the initial depth map into a dense depth map using depth values of known pixels to compute depth values of unknown pixels, determine an object of interest to be extracted from the image, generate an initial trimap using the dense depth map and the object of interest to be extracted from the image, and apply color image matting to unknown regions of the initial trimap to generate a matte for image matting.
Systems and methods for reducing motion blur in images or video in ultra low light with array cameras in accordance with embodiments of the invention are disclosed. In one embodiment, a method for synthesizing an image from multiple images captured using an array camera includes capturing image data using active cameras within an array camera, where the active cameras are configured to capture image data and the image data includes pixel brightness values that form alternate view images captured from different viewpoints, determining sets of corresponding pixels in the alternate view images where each pixel in a set of corresponding pixels is chosen from a different alternate view image, summing the pixel brightness values for corresponding pixels to create pixel brightness sums for pixel locations in an output image, and synthesizing an output image from the viewpoint of the output image using the pixel brightness sums.
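The summation step in the low-light abstract above can be illustrated with a small sketch. It assumes, purely for illustration, that the set of corresponding pixels for each output location is given by one integer shift per alternate view; a real array camera would derive per-pixel correspondences from parallax. The names `sum_corresponding` and `shifts` are hypothetical.

```python
def sum_corresponding(views, shifts):
    """Sum brightness values of corresponding pixels across alternate views.

    views  : list of same-sized 2D brightness images
    shifts : one (dx, dy) integer offset per view that maps an output pixel
             to its corresponding pixel in that view (assumed to be known)
    """
    height, width = len(views[0]), len(views[0][0])
    sums = [[0] * width for _ in range(height)]
    for img, (dx, dy) in zip(views, shifts):
        for y in range(height):
            for x in range(width):
                sx, sy = x + dx, y + dy
                # Pixels whose correspondence falls outside the view add nothing.
                if 0 <= sx < width and 0 <= sy < height:
                    sums[y][x] += img[sy][sx]
    return sums
```

Summing several short exposures of the same scene point raises the signal level without lengthening any single exposure, which is the idea behind the reduced motion blur.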
Systems and methods in accordance with embodiments of the invention are configured to render images using light field image files containing an image synthesized from light field image data and metadata describing the image that includes a depth map. One embodiment of the invention includes a processor and memory containing a rendering application and a light field image file including an encoded image, a set of low resolution images, and metadata describing the encoded image, where the metadata comprises a depth map that specifies depths from the reference viewpoint for pixels in the encoded image. In addition, the rendering application configures the processor to: locate the encoded image within the light field image file; decode the encoded image; locate the metadata within the light field image file; and post process the decoded image by modifying the pixels based on the depths indicated within the depth map and the set of low resolution images to create a rendered image.
G06T 9/20 - Codage des contours, p. ex. utilisant la détection des contours
H04N 19/597 - Procédés ou dispositions pour le codage, le décodage, la compression ou la décompression de signaux vidéo numériques utilisant le codage prédictif spécialement adapté pour l’encodage de séquences vidéo multi-vues
H04N 19/625 - Procédés ou dispositions pour le codage, le décodage, la compression ou la décompression de signaux vidéo numériques utilisant un codage par transformée utilisant une transformée en cosinus discrète
H04N 19/136 - Caractéristiques ou propriétés du signal vidéo entrant
G06T 3/40 - Changement d'échelle d’images complètes ou de parties d’image, p. ex. agrandissement ou rétrécissement
H04N 19/85 - Procédés ou dispositions pour le codage, le décodage, la compression ou la décompression de signaux vidéo numériques utilisant le pré-traitement ou le post-traitement spécialement adaptés pour la compression vidéo
Systems and methods for synthesizing high resolution images using image deconvolution and depth information in accordance with embodiments of the invention are disclosed. In one embodiment, an array camera includes a processor and a memory, wherein an image deconvolution application configures the processor to obtain light field image data, determine motion data based on metadata contained in the light field image data, generate a depth-dependent point spread function based on the synthesized high resolution image, the depth map, and the motion data, measure the quality of the synthesized high resolution image based on the generated depth-dependent point spread function, and when the measured quality of the synthesized high resolution image is within a quality threshold, incorporate the synthesized high resolution image into the light field image data.
A method and system for detecting facial expressions in digital images and applications therefor are disclosed. Analysis of a digital image determines whether or not a smile and/or blink is present on a person's face. Face recognition, and/or a pose or illumination condition determination, permits application of a specific, relatively small classifier cascade.
H04N 5/232 - Dispositifs pour la commande des caméras de télévision, p.ex. commande à distance
G06K 9/00 - Méthodes ou dispositions pour la lecture ou la reconnaissance de caractères imprimés ou écrits ou pour la reconnaissance de formes, p.ex. d'empreintes digitales
59.
System and methods for calibration of an array camera
Systems and methods for calibrating an array camera are disclosed. Systems and methods for calibrating an array camera in accordance with embodiments of this invention include the capturing of an image of a test pattern with the array camera such that each imaging component in the array camera captures an image of the test pattern. The image of the test pattern captured by a reference imaging component is then used to derive calibration information for the reference component. A corrected image of the test pattern for the reference component is then generated from the calibration information and the image of the test pattern captured by the reference imaging component. The corrected image is then used with the images captured by each of the associate imaging components associated with the reference component to generate calibration information for the associate imaging components.
H04N 13/282 - Générateurs de signaux d’images pour la génération de signaux d’images correspondant à au moins trois points de vue géométriques, p. ex. systèmes multi-vues
Systems and methods for automatically correcting apparent distortions in close range photographs that are captured using an imaging system capable of capturing images and depth maps are disclosed. In many embodiments, faces are automatically detected and segmented from images using a depth-assisted alpha matting. The detected faces can then be re-rendered from a more distant viewpoint and composited with the background to create a new image in which apparent perspective distortion is reduced.
G06K 9/00 - Méthodes ou dispositions pour la lecture ou la reconnaissance de caractères imprimés ou écrits ou pour la reconnaissance de formes, p.ex. d'empreintes digitales
09 - Appareils et instruments scientifiques et électriques
Produits et services
Facial recognition computer software that uses computer vision techniques and depth information to securely unlock, personalize and enable payment features for use with computers, tablet computers, mobile devices, mobile phones and cameras.
Architectures for imager arrays configured for use in array cameras in accordance with embodiments of the invention are described. One embodiment of the invention includes a plurality of focal planes, where each focal plane comprises a two dimensional arrangement of pixels having at least two pixels in each dimension and each focal plane is contained within a region of the imager array that does not contain pixels from another focal plane, control circuitry configured to control the capture of image information by the pixels within the focal planes, where the control circuitry is configured so that the capture of image information by the pixels in at least two of the focal planes is separately controllable, sampling circuitry configured to convert pixel outputs into digital pixel data, and output interface circuitry configured to transmit pixel data via an output interface.
H04N 5/341 - Extraction de données de pixels provenant d'un capteur d'images en agissant sur les circuits de balayage, p.ex. en modifiant le nombre de pixels ayant été échantillonnés ou à échantillonner
H04N 5/345 - Extraction de données de pixels provenant d'un capteur d'images en agissant sur les circuits de balayage, p.ex. en modifiant le nombre de pixels ayant été échantillonnés ou à échantillonner en lisant partiellement une matrice de capteurs SSIS
H04N 5/365 - Traitement du bruit, p.ex. détection, correction, réduction ou élimination du bruit appliqué au bruit à motif fixe, p.ex. non-uniformité de la réponse
H04N 5/374 - Capteurs adressés, p.ex. capteurs MOS ou CMOS
H04N 5/378 - Circuits de lecture, p.ex. circuits d’échantillonnage double corrélé [CDS], amplificateurs de sortie ou convertisseurs A/N
H04N 5/232 - Dispositifs pour la commande des caméras de télévision, p.ex. commande à distance
H04N 9/09 - Générateurs de signaux d'image avec plusieurs têtes de lecture
H04N 5/3745 - Capteurs adressés, p.ex. capteurs MOS ou CMOS ayant des composants supplémentaires incorporés au sein d'un pixel ou connectés à un groupe de pixels au sein d'une matrice de capteurs, p.ex. mémoires, convertisseurs A/N, amplificateurs de pixels, circuits communs ou composants communs
Systems and methods for storing images synthesized from light field image data and metadata describing the images in electronic files in accordance with embodiments of the invention are disclosed. One embodiment includes a processor and memory containing an encoding application and light field image data, where the light field image data comprises a plurality of low resolution images of a scene captured from different viewpoints. In addition, the encoding application configures the processor to synthesize a higher resolution image of the scene from a reference viewpoint using the low resolution images, where synthesizing the higher resolution image involves creating a depth map that specifies depths from the reference viewpoint for pixels in the higher resolution image; encode the higher resolution image; and create a light field image file including the encoded image, the low resolution images, and metadata including the depth map.
G06T 9/20 - Codage des contours, p. ex. utilisant la détection des contours
H04N 19/597 - Procédés ou dispositions pour le codage, le décodage, la compression ou la décompression de signaux vidéo numériques utilisant le codage prédictif spécialement adapté pour l’encodage de séquences vidéo multi-vues
H04N 19/625 - Procédés ou dispositions pour le codage, le décodage, la compression ou la décompression de signaux vidéo numériques utilisant un codage par transformée utilisant une transformée en cosinus discrète
H04N 19/136 - Caractéristiques ou propriétés du signal vidéo entrant
G06T 3/40 - Changement d'échelle d’images complètes ou de parties d’image, p. ex. agrandissement ou rétrécissement
H04N 19/85 - Procédés ou dispositions pour le codage, le décodage, la compression ou la décompression de signaux vidéo numériques utilisant le pré-traitement ou le post-traitement spécialement adaptés pour la compression vidéo
Systems and methods for detecting defective camera arrays, optic arrays and/or sensors are described. One embodiment includes capturing image data using a camera array; dividing the captured images into a plurality of corresponding image regions; identifying the presence of localized defects in any of the cameras by evaluating the image regions in the captured images; and detecting a defective camera array using the image processing system when the number of localized defects in a specific set of image regions exceeds a predetermined threshold, where the specific set of image regions is formed by: a common corresponding image region from at least a subset of the captured images; and any additional image region in a given image that contains at least one pixel located within a predetermined maximum parallax shift distance along an epipolar line from a pixel within said common corresponding image region within the given image.
H04N 5/367 - Traitement du bruit, p.ex. détection, correction, réduction ou élimination du bruit appliqué au bruit à motif fixe, p.ex. non-uniformité de la réponse appliqué aux défauts, p.ex. pixels non réactifs
H04N 5/217 - Circuits pour la suppression ou la diminution de perturbations, p.ex. moiré ou halo lors de la production des signaux d'image
66.
Systems and methods for controlling aliasing in images captured by an array camera for use in super resolution processing using pixel apertures
Imager arrays, array camera modules, and array cameras in accordance with embodiments of the invention utilize pixel apertures to control the amount of aliasing present in captured images of a scene. One embodiment includes a plurality of focal planes, control circuitry configured to control the capture of image information by the pixels within the focal planes, and sampling circuitry configured to convert pixel outputs into digital pixel data. In addition, the pixels in the plurality of focal planes include a pixel stack including a microlens and an active area, where light incident on the surface of the microlens is focused onto the active area by the microlens and the active area samples the incident light to capture image information, and the pixel stack defines a pixel area and includes a pixel aperture, where the size of the pixel apertures is smaller than the pixel area.
H04N 5/232 - Dispositifs pour la commande des caméras de télévision, p.ex. commande à distance
H04N 5/341 - Extraction de données de pixels provenant d'un capteur d'images en agissant sur les circuits de balayage, p.ex. en modifiant le nombre de pixels ayant été échantillonnés ou à échantillonner
H04N 9/097 - Dispositions optiques associées aux dispositifs analyseurs, p.ex. pour partager des faisceaux, pour corriger la couleur
H04N 13/271 - Générateurs de signaux d’images où les signaux d’images générés comprennent des cartes de profondeur ou de disparité
H04N 13/232 - Générateurs de signaux d’images utilisant des caméras à images stéréoscopiques utilisant un seul capteur d’images 2D utilisant des lentilles du type œil de mouche, p. ex. dispositions de lentilles circulaires
67.
Systems and methods for manufacturing camera modules using active alignment of lens stack arrays and sensors
Systems and methods in accordance with embodiments of the invention actively align a lens stack array with an array of focal planes to construct an array camera module. In one embodiment, a method for actively aligning a lens stack array with a sensor that has a focal plane array includes: aligning the lens stack array relative to the sensor in an initial position; varying the spatial relationship between the lens stack array and the sensor; capturing images of a known target that has a region of interest using a plurality of active focal planes at different spatial relationships; scoring the images based on the extent to which the region of interest is focused in the images; selecting a spatial relationship between the lens stack array and the sensor based on a comparison of the scores; and forming an array camera subassembly based on the selected spatial relationship.
H04N 17/00 - Diagnostic, test ou mesure, ou leurs détails, pour les systèmes de télévision
G02B 7/00 - Montures, moyens de réglage ou raccords étanches à la lumière pour éléments optiques
H04N 5/232 - Dispositifs pour la commande des caméras de télévision, p.ex. commande à distance
H04N 5/341 - Extraction de données de pixels provenant d'un capteur d'images en agissant sur les circuits de balayage, p.ex. en modifiant le nombre de pixels ayant été échantillonnés ou à échantillonner
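The scoring and selection steps of the active alignment method above can be sketched as follows. The abstract does not say how focus is scored, so the gradient-energy metric used here is an assumption, as are the names `focus_score` and `select_alignment` and the idea of keying captures by a position label.

```python
def focus_score(roi):
    """Gradient energy of a region of interest: sharper images score higher.

    This metric is an assumed stand-in for the unspecified focus score.
    """
    score = 0
    for row in roi:  # horizontal neighbour differences
        for a, b in zip(row, row[1:]):
            score += (b - a) ** 2
    for upper, lower in zip(roi, roi[1:]):  # vertical neighbour differences
        for a, b in zip(upper, lower):
            score += (b - a) ** 2
    return score


def select_alignment(captures):
    """captures maps a spatial relationship (any hashable label, e.g. a
    lens-to-sensor distance) to the region of interest imaged there.
    Returns the label whose image is best focused."""
    return max(captures, key=lambda position: focus_score(captures[position]))
```

In an array camera module, one such score would be computed per active focal plane at each position and the per-plane scores compared before a position is chosen; the sketch shows a single focal plane only.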
68.
Capturing and processing of images including occlusions focused on an image sensor by a lens stack array
Systems and methods for implementing array cameras configured to perform super-resolution processing to generate higher resolution super-resolved images using a plurality of captured images and lens stack arrays that can be utilized in array cameras are disclosed. An imaging device in accordance with one embodiment of the invention includes at least one imager array, and each imager in the array comprises a plurality of light sensing elements and a lens stack including at least one lens surface, where the lens stack is configured to form an image on the light sensing elements, control circuitry configured to capture images formed on the light sensing elements of each of the imagers, and a super-resolution processing module configured to generate at least one higher resolution super-resolved image using a plurality of the captured images.
H04N 5/365 - Traitement du bruit, p.ex. détection, correction, réduction ou élimination du bruit appliqué au bruit à motif fixe, p.ex. non-uniformité de la réponse
H04N 13/128 - Ajustement de la profondeur ou de la disparité
H04N 5/232 - Dispositifs pour la commande des caméras de télévision, p.ex. commande à distance
H04N 5/33 - Transformation des rayonnements infrarouges
H04N 5/341 - Extraction de données de pixels provenant d'un capteur d'images en agissant sur les circuits de balayage, p.ex. en modifiant le nombre de pixels ayant été échantillonnés ou à échantillonner
G06T 7/557 - Récupération de la profondeur ou de la forme à partir de plusieurs images à partir des champs de lumière, p. ex. de caméras plénoptiques
H04N 5/349 - Extraction de données de pixels provenant d'un capteur d'images en agissant sur les circuits de balayage, p.ex. en modifiant le nombre de pixels ayant été échantillonnés ou à échantillonner pour accroître la résolution en déplaçant le capteur par rapport à la scène
Systems and methods for calibrating an array camera are disclosed. Systems and methods for calibrating an array camera in accordance with embodiments of this invention include the detecting of defects in the imaging components of the array camera and determining whether the detected defects may be tolerated by image processing algorithms. The calibration process also determines translation information between imaging components in the array camera for use in merging the image data from the various imaging components during image processing. Furthermore, the calibration process may include a process to improve photometric uniformity in the imaging components.
G06T 7/80 - Analyse des images capturées pour déterminer les paramètres de caméra intrinsèques ou extrinsèques, c.-à-d. étalonnage de caméra
H04N 5/247 - Disposition des caméras de télévision
H04N 5/367 - Traitement du bruit, p.ex. détection, correction, réduction ou élimination du bruit appliqué au bruit à motif fixe, p.ex. non-uniformité de la réponse appliqué aux défauts, p.ex. pixels non réactifs
H04N 5/357 - Traitement du bruit, p.ex. détection, correction, réduction ou élimination du bruit
A camera array, an imaging device, and/or a method for capturing an image that employ a plurality of imagers fabricated on a substrate are provided. Each imager includes a plurality of pixels. The plurality of imagers includes a first imager having first imaging characteristics and a second imager having second imaging characteristics. The images generated by the plurality of imagers are processed to obtain an enhanced image compared to images captured by the imagers. Each imager may be associated with an optical element fabricated using a wafer level optics (WLO) technology.
H04N 5/369 - Transformation d'informations lumineuses ou analogues en informations électriques utilisant des capteurs d'images à l'état solide [capteurs SSIS] circuits associés à cette dernière
H04N 3/14 - Détails des dispositifs de balayage des systèmes de télévision; Leur combinaison avec la production des tensions d'alimentation par des moyens non exclusivement optiques-mécaniques au moyen de dispositifs à l'état solide à balayage électronique
H04N 1/195 - Dispositions de balayage utilisant des ensembles composés de plusieurs éléments l'ensemble comprenant un ensemble à deux dimensions
H04N 5/335 - Transformation d'informations lumineuses ou analogues en informations électriques utilisant des capteurs d'images à l'état solide [capteurs SSIS]
G06T 1/20 - Architectures de processeurs; Configuration de processeurs p. ex. configuration en pipeline
G02B 27/12 - Systèmes divisant ou combinant des faisceaux fonctionnant uniquement par réfraction
G02B 27/10 - Systèmes divisant ou combinant des faisceaux
An iris image acquisition system comprises an image sensor comprising an array of pixels including pixels sensitive to NIR wavelengths; at least one NIR light source capable of selectively emitting light with different discrete NIR wavelengths; and a processor, operably connected to the image sensor and the at least one NIR light source, to acquire image information from the sensor under illumination at one of the different discrete NIR wavelengths. A lens assembly comprises a plurality of lens elements with a total track length no more than 4.7 mm, each lens element comprising a material with a refractive index inversely proportional to wavelength. The different discrete NIR wavelengths are matched with the refractive index of the material for the lens elements to balance axial image shift induced by a change in object distance with axial image shift due to change in illumination wavelength.
H04N 9/47 - Synchronisation de couleurs pour des signaux séquentiels
H04N 7/18 - Systèmes de télévision en circuit fermé [CCTV], c.-à-d. systèmes dans lesquels le signal vidéo n'est pas diffusé
H04N 5/33 - Transformation des rayonnements infrarouges
G06K 9/00 - Méthodes ou dispositions pour la lecture ou la reconnaissance de caractères imprimés ou écrits ou pour la reconnaissance de formes, p.ex. d'empreintes digitales
G02B 27/00 - Systèmes ou appareils optiques non prévus dans aucun des groupes ,
G02B 13/00 - Objectifs optiques spécialement conçus pour les emplois spécifiés ci-dessous
G02B 13/14 - Objectifs optiques spécialement conçus pour les emplois spécifiés ci-dessous à utiliser avec des radiations infrarouges ou ultraviolettes
G02B 13/18 - Objectifs optiques spécialement conçus pour les emplois spécifiés ci-dessous avec des lentilles ayant une ou plusieurs surfaces non sphériques, p. ex. pour réduire l'aberration géométrique
H04N 5/232 - Dispositifs pour la commande des caméras de télévision, p.ex. commande à distance
An iris image acquisition system (10) comprises an image sensor (14) comprising an array of pixels including pixels sensitive to NIR wavelengths; at least one NIR light source (16, 18) capable of selectively emitting light with different discrete NIR wavelengths; and a processor (20), operably connected to the image sensor (14) and the at least one NIR light source (16, 18), to acquire image information from the sensor (14) under illumination at one of the different discrete NIR wavelengths. A lens assembly (12) comprises a plurality of lens elements with a total track length of no more than 4.7mm, each lens element comprising a material with a refractive index inversely proportional to wavelength. The different discrete NIR wavelengths are matched with the refractive index of the material for the lens elements to balance axial image shift induced by a change in object distance with axial image shift due to change in illumination wavelength.
G02B 27/00 - Systèmes ou appareils optiques non prévus dans aucun des groupes ,
G02B 13/00 - Objectifs optiques spécialement conçus pour les emplois spécifiés ci-dessous
G02B 13/14 - Objectifs optiques spécialement conçus pour les emplois spécifiés ci-dessous à utiliser avec des radiations infrarouges ou ultraviolettes
G06K 9/00 - Méthodes ou dispositions pour la lecture ou la reconnaissance de caractères imprimés ou écrits ou pour la reconnaissance de formes, p.ex. d'empreintes digitales
H04N 5/232 - Dispositifs pour la commande des caméras de télévision, p.ex. commande à distance
G02B 13/18 - Objectifs optiques spécialement conçus pour les emplois spécifiés ci-dessous avec des lentilles ayant une ou plusieurs surfaces non sphériques, p. ex. pour réduire l'aberration géométrique
Sub-regions within one or more face images are identified within a digital image, and enhanced by applying an artificial glint symmetrically and/or synchronously to image data corresponding to sub-regions of eyes within the face image. An enhanced face image is generated including an enhanced version of the face that includes certain original pixels in combination with pixels corresponding to the one or more eye regions of the face with the artificial glint.
G06K 9/00 - Méthodes ou dispositions pour la lecture ou la reconnaissance de caractères imprimés ou écrits ou pour la reconnaissance de formes, p.ex. d'empreintes digitales
H04N 1/62 - Retouches, c.-à-d. modification de couleurs isolées uniquement ou dans des zones d'image isolées uniquement
H04N 5/232 - Dispositifs pour la commande des caméras de télévision, p.ex. commande à distance
H04N 5/262 - Circuits de studio, p. ex. pour mélanger, commuter, changer le caractère de l'image, pour d'autres effets spéciaux
G06T 7/181 - Découpage; Détection de bords impliquant des croissances de bords; Découpage; Détection de bords impliquant des liaisons de bords
A method for producing a histogram of oriented gradients (HOG) for at least a portion of an image comprises dividing the image portion into cells, each cell comprising a plurality of image pixels. Then, for each image pixel of a cell, obtaining a horizontal gradient component, gx, and a vertical gradient component, gy, based on differences in pixel values along at least a row of the image and a column of the image respectively including the pixel; and allocating a gradient to one of a plurality of sectors, where n is a sector index, each sector extending through a range of orientation angles and at least some of the sectors being divided from adjacent sectors according to the inequalities: b*16
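As a rough illustration of the gradient and sector steps in the HOG abstract above, the sketch below computes gx and gy from neighbouring pixels and assigns the gradient to one of n sectors. Because the inequalities that define the sector boundaries are cut off in this listing, the sketch substitutes a plain `atan2` angle computation with equal-width sectors; that substitution, the central-difference gradient, and the default of 16 sectors are all assumptions, not the claimed method.

```python
import math


def gradient(img, x, y):
    """Horizontal and vertical gradient components at an interior pixel,
    taken as central differences along the pixel's row and column."""
    gx = img[y][x + 1] - img[y][x - 1]
    gy = img[y + 1][x] - img[y - 1][x]
    return gx, gy


def sector(gx, gy, n_sectors=16):
    """Index of the equal-width orientation sector containing (gx, gy).

    Uses atan2 instead of the (truncated) inequality tests in the source.
    """
    angle = math.atan2(gy, gx) % (2 * math.pi)
    return int(angle / (2 * math.pi / n_sectors)) % n_sectors


def cell_histogram(img, n_sectors=16):
    """Magnitude-weighted histogram over the interior pixels of one cell."""
    hist = [0.0] * n_sectors
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            gx, gy = gradient(img, x, y)
            hist[sector(gx, gy, n_sectors)] += math.hypot(gx, gy)
    return hist
```

Comparison-based sector tests of the kind the abstract alludes to are typically preferred in hardware because they avoid evaluating a trigonometric function per pixel.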
A biometric recognition system for a hand-held computing device incorporating an inertial measurement unit (IMU) comprising a plurality of accelerometers and at least one gyroscope is disclosed. A tremor analysis component is arranged to: obtain from the IMU, accelerometer signals indicating device translational acceleration along each of X, Y and Z axes as well as a gyroscope signal indicating rotational velocity about the Y axis during a measurement window. Each of the IMU signals is filtered to provide filtered frequency components for the signals during the measurement window. The accelerometer signals are combined to provide a combined filtered accelerometer magnitude signal for the measurement window. A spectral density estimation is provided for each of the combined filtered accelerometer magnitude signal and the filtered gyroscope signal. An irregularity is determined for each spectral density estimation; and based on the determined irregularities, the tremor analysis component attempts to authenticate a user of the device.
G06F 21/32 - Authentification de l’utilisateur par données biométriques, p. ex. empreintes digitales, balayages de l’iris ou empreintes vocales
G06F 21/40 - Authentification de l’utilisateur sous réserve d’un quorum, c.-à-d. avec l’intervention nécessaire d’au moins deux responsables de la sécurité
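Two of the signal-processing steps in the tremor-based biometric abstract above, combining the three accelerometer axes into a magnitude signal and estimating a spectral density, can be sketched as below. The band-pass filtering and the irregularity measure are not specified in the abstract and are left out; the periodogram estimator and the function names are assumptions.

```python
import cmath
import math


def magnitude(ax, ay, az):
    """Combine per-axis accelerometer samples into one magnitude signal."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]


def periodogram(signal):
    """Naive periodogram (squared DFT magnitude / N) for bins 0..N/2.

    The mean is removed first so the DC bin does not dominate. This is one
    common spectral density estimator, chosen here as an assumption.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    psd = []
    for k in range(n // 2 + 1):
        acc = sum(c * cmath.exp(-2j * math.pi * k * i / n)
                  for i, c in enumerate(centered))
        psd.append(abs(acc) ** 2 / n)
    return psd
```

The same `periodogram` call would be applied to the filtered gyroscope signal, giving the two spectra from which the irregularities are then derived.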
76.
Systems and methods for generating compressed light field representation data using captured light fields, array geometry, and parallax information
Systems and methods for generating compressed light field representation data using captured light fields in accordance with embodiments of the invention are disclosed. In one embodiment, an array camera includes a processor and a memory connected to the processor and configured to store an image processing application, wherein the image processing application configures the processor to obtain image data, wherein the image data includes a set of images including a reference image and at least one alternate view image, generate a depth map based on the image data, determine at least one prediction image based on the reference image and the depth map, compute prediction error data based on the at least one prediction image and the at least one alternate view image, and generate compressed light field representation data based on the reference image, the prediction error data, and the depth map.
G06K 9/66 - Méthodes ou dispositions pour la reconnaissance utilisant des moyens électroniques utilisant des comparaisons ou corrélations simultanées de signaux images avec une pluralité de références, p.ex. matrice de résistances avec des références réglables par une méthode adaptative, p.ex. en s'instruisant
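The predict-then-encode-the-error idea in the compressed light field abstract above can be sketched in a few lines. For simplicity the sketch takes an integer horizontal disparity map in place of a depth map and ignores occlusion ordering; those simplifications and the function names are assumptions.

```python
def predict_view(reference, disparity):
    """Warp the reference image into an alternate view.

    disparity[y][x] is the assumed integer horizontal shift of each reference
    pixel (derived from depth in a real system). Unfilled pixels stay 0.
    """
    height, width = len(reference), len(reference[0])
    pred = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            tx = x + disparity[y][x]
            if 0 <= tx < width:
                pred[y][tx] = reference[y][x]
    return pred


def prediction_error(alternate, pred):
    """Residual that must be stored alongside the reference and depth map."""
    return [[a - p for a, p in zip(ar, pr)] for ar, pr in zip(alternate, pred)]


def reconstruct(pred, error):
    """Decoder side: prediction plus stored residual recovers the view."""
    return [[p + e for p, e in zip(pr, er)] for pr, er in zip(pred, error)]
```

Where the prediction is good the residual is mostly zeros, which is what makes the reference image, depth map, and residuals a compact representation.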
An image processing method for iris recognition of a predetermined subject comprises acquiring, through an image sensor, a probe image illuminated by an infra-red (IR) illumination source, wherein the probe image comprises one or more eye regions and is overexposed until skin portions of the image are saturated. One or more iris regions are identified within the one or more eye regions of said probe image; and the identified iris regions are analysed to detect whether they belong to the predetermined subject.
G06K 9/00 - Méthodes ou dispositions pour la lecture ou la reconnaissance de caractères imprimés ou écrits ou pour la reconnaissance de formes, p.ex. d'empreintes digitales
Systems and methods for dynamically calibrating an array camera to accommodate variations in geometry that can occur throughout its operational life are disclosed. The dynamic calibration processes can include acquiring a set of images of a scene and identifying corresponding features within the images. Geometric calibration data can be used to rectify the images and determine residual vectors for the geometric calibration data at locations where corresponding features are observed. The residual vectors can then be used to determine updated geometric calibration data for the camera array. In several embodiments, the residual vectors are used to generate a residual vector calibration data field that updates the geometric calibration data. In many embodiments, the residual vectors are used to select a set of geometric calibration from amongst a number of different sets of geometric calibration data that is the best fit for the current geometry of the camera array.
A method of correcting an image obtained by an image acquisition device includes obtaining successive measurements (Gn) of device movement during exposure of each row of an image. An integration range (idx) is selected in proportion to an exposure time (te) for each row of the image. Accumulated measurements (Cn) of device movement for each row of an image are averaged across the integration range to provide successive filtered measurements (Ḡ) of device movement during exposure of each row of an image. The image is corrected for device movement using the filtered measurements (Ḡ).
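One way to read the averaging step above is as a running mean computed from a cumulative sum, so that each filtered value costs one subtraction and one division regardless of the window length. The sketch below follows that reading; the `sample_rate` parameter, the rounding rule for idx, and the shortened window at the start of the sequence are assumptions.

```python
def integration_range(te, sample_rate):
    """Window length idx, proportional to exposure time te (seconds).

    sample_rate (measurements per second) is a hypothetical parameter.
    """
    return max(1, round(te * sample_rate))


def accumulate(g):
    """Cumulative sums C, with C[0] = 0 and C[n + 1] = G[0] + ... + G[n]."""
    c = [0]
    for value in g:
        c.append(c[-1] + value)
    return c


def filtered(g, idx):
    """Average of the last idx measurements at each step, via differences
    of accumulated values; the window is shorter for the first samples."""
    c = accumulate(g)
    out = []
    for n in range(len(g)):
        start = max(0, n + 1 - idx)
        out.append((c[n + 1] - c[start]) / (n + 1 - start))
    return out
```

For example, `filtered([1, 2, 3, 4], 2)` averages each measurement with its predecessor, which smooths row-to-row jitter in proportion to how long each row was exposed.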
A method operable within an image capture device for stabilizing a sequence of images captured by the image capture device is disclosed. The method comprises, using lens-based sensors indicating image capture device movement during image acquisition, performing optical image stabilization (OIS) during acquisition of each image of the sequence of images to provide a sequence of OIS corrected images. Movement of the device for each frame during which each OIS corrected image is captured is determined using inertial measurement sensors. At least an estimate of OIS control performed during acquisition of an image is obtained. The estimate is removed from the intra-frame movement determined for the frame during which the OIS corrected image was captured to provide a residual measurement of movement for the frame. Electronic image stabilization (EIS) of each OIS corrected image based on the residual measurement is performed to provide a stabilized sequence of images.
A method operable within an image capture device for stabilizing a sequence of images captured by the image capture device is disclosed. The method comprises, using lens based sensors indicating image capture device movement during image acquisition, performing optical image stabilization (OIS) during acquisition of each image of the sequence of images to provide a sequence of OIS corrected images. Frame-to-frame movement of the device for each frame during which each OIS corrected image is captured is determined using inertial measurement sensors. At least an estimate of OIS control performed during acquisition of an image is obtained. The estimate is removed from the frame-to-frame movement determined for the frame during which the OIS corrected image was captured to provide a residual measurement of movement for the frame. Electronic image stabilization (EIS) of each OIS corrected image based on the residual measurement is performed to provide a stabilized sequence of images.
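The core arithmetic shared by the two stabilization abstracts above is a subtraction: the part of the measured movement already corrected by OIS is removed, and EIS compensates only for what remains. A minimal sketch follows, assuming movement is expressed as a 2D pixel shift per frame; the function names and the simple "shift by the negative residual" EIS rule are illustrative assumptions, not the claimed method.

```python
def residual_motion(measured_shift, ois_shift):
    """Remove the estimated OIS correction from the inertially measured movement."""
    return tuple(m - o for m, o in zip(measured_shift, ois_shift))


def eis_offsets(measured_shifts, ois_shifts):
    """Per-frame crop offsets that cancel only the residual movement.

    Both arguments are sequences of (dx, dy) shifts, one entry per frame.
    """
    offsets = []
    for measured, ois in zip(measured_shifts, ois_shifts):
        rx, ry = residual_motion(measured, ois)
        offsets.append((-rx, -ry))  # assumed EIS rule: shift against the residual
    return offsets


print(eis_offsets([(5.0, -2.0), (1.0, 1.0)], [(3.0, -0.5), (1.0, 0.0)]))
```

Subtracting the OIS estimate first avoids correcting the same movement twice, once optically and once electronically.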
A convolutional neural network (CNN) for an image processing system comprises an image cache responsive to a request to read a block of N×M pixels extending from a specified location within an input map to provide a block of N×M pixels at an output port. A convolution engine reads blocks of pixels from the output port, combines blocks of pixels with a corresponding set of weights to provide a product, and subjects the product to an activation function to provide an output pixel value. The image cache comprises a plurality of interleaved memories capable of simultaneously providing the N×M pixels at the output port in a single clock cycle. A controller provides a set of weights to the convolution engine before processing an input map, causes the convolution engine to scan across the input map by incrementing a specified location for successive blocks of pixels and generates an output map within the image cache by writing output pixel values to successive locations within the image cache.
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/46 - Extraction of features or characteristics of the image
G06N 3/04 - Architecture, e.g. interconnection topology
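The data flow of the convolution engine described in the CNN abstract above can be sketched in software: read an N×M block at a specified location, combine it with the weights, apply an activation function, write the output pixel, then increment the location. This is a functional sketch only. It does not model the interleaved memories or single-cycle block reads, and the ReLU activation and the function names are assumptions made for illustration.

```python
def read_block(input_map, row, col, n, m):
    """Stand-in for the image cache: return the N x M block starting at (row, col)."""
    return [line[col:col + m] for line in input_map[row:row + n]]


def relu(x):
    """Assumed activation function; the abstract does not name one."""
    return max(0.0, x)


def convolve(input_map, weights, activation=relu):
    """Scan the input map block by block and build the output map."""
    n, m = len(weights), len(weights[0])
    out_rows = len(input_map) - n + 1
    out_cols = len(input_map[0]) - m + 1
    output_map = []
    for row in range(out_rows):  # the controller increments the block location
        out_line = []
        for col in range(out_cols):
            block = read_block(input_map, row, col, n, m)
            # Combine the block with the weights (sum of element-wise products).
            product = sum(block[i][j] * weights[i][j]
                          for i in range(n) for j in range(m))
            out_line.append(activation(product))
        output_map.append(out_line)
    return output_map


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(convolve(image, [[1, 0], [0, 1]]))
```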
A convolutional neural network (CNN) for an image processing system comprises an image cache responsive to a request to read a block of N×M pixels extending from a specified location within an input map to provide a block of N×M pixels at an output port. A convolution engine reads blocks of pixels from the output port, combines blocks of pixels with a corresponding set of weights to provide a product, and subjects the product to an activation function to provide an output pixel value. The image cache comprises a plurality of interleaved memories capable of simultaneously providing the N×M pixels at the output port in a single clock cycle. A controller provides a set of weights to the convolution engine before processing an input map, causes the convolution engine to scan across the input map by incrementing a specified location for successive blocks of pixels and generates an output map within the image cache by writing output pixel values to successive locations within the image cache.
Systems and methods for implementing array cameras configured to perform super-resolution processing to generate higher resolution super-resolved images using a plurality of captured images and lens stack arrays that can be utilized in array cameras are disclosed. Lens stack arrays in accordance with many embodiments of the invention include lens elements formed on substrates separated by spacers, where the lens elements, substrates and spacers are configured to form a plurality of optical channels, at least one aperture located within each optical channel, at least one spectral filter located within each optical channel, where each spectral filter is configured to pass a specific spectral band of light, and light blocking materials located within the lens stack array to optically isolate the optical channels.
H04N 5/341 - Extracting pixel data from an image sensor by controlling scanning circuits, e.g. by modifying the number of pixels having been sampled or to be sampled
An image processing system comprises a template matching engine (TME). The TME reads an image from the memory; and as each pixel of the image is being read, calculates a respective feature value of a plurality of feature maps as a function of the pixel value. A pre-filter is responsive to a current pixel location comprising a node within a limited detector cascade to be applied to a window within the image to: compare a feature value from a selected one of the plurality of feature maps corresponding to the pixel location to a threshold value; and responsive to pixels for all nodes within a limited detector cascade to be applied to the window having been read, determine a score for the window. A classifier, responsive to the pre-filter indicating that a score for a window is below a window threshold, does not apply a longer detector cascade to the window before indicating that the window does not comprise an object to be detected.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/46 - Extraction of features or characteristics of the image
G06K 9/62 - Methods or arrangements for recognition using electronic means
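The early-rejection logic of the template matching engine abstract above can be sketched as follows: a short cascade of threshold tests produces a window score, and the longer cascade is skipped when that score falls below the window threshold. The node layout `(map_index, dy, dx, threshold)`, the one-point-per-passing-node scoring, and the function names are assumptions made for illustration.

```python
def prefilter_score(feature_maps, origin, nodes):
    """Score a window using a limited detector cascade.

    feature_maps -- list of 2D feature maps (lists of rows)
    origin       -- (row, col) of the window's top-left corner
    nodes        -- hypothetical node layout: (map_index, dy, dx, threshold)
    """
    y0, x0 = origin
    score = 0
    for map_index, dy, dx, threshold in nodes:
        # Compare the selected feature value at the node location to its threshold.
        if feature_maps[map_index][y0 + dy][x0 + dx] >= threshold:
            score += 1
    return score


def window_contains_object(feature_maps, origin, nodes, window_threshold, long_cascade):
    """Apply the longer cascade only when the pre-filter score is high enough."""
    if prefilter_score(feature_maps, origin, nodes) < window_threshold:
        return False  # rejected early; the long cascade is never run
    return long_cascade(feature_maps, origin)


maps = [[[0, 9], [9, 0]]]  # a single 2 x 2 feature map
weak_nodes = [(0, 0, 0, 5), (0, 1, 1, 5)]
print(window_contains_object(maps, (0, 0), weak_nodes, 1, lambda m, o: True))
```

Since most windows in a typical image contain no object, rejecting them with a few comparisons saves most of the classification work.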
86.
Systems and methods for decoding image files containing depth maps stored as metadata
Systems and methods in accordance with embodiments of the invention are configured to render images using light field image files containing an image synthesized from light field image data and metadata describing the image that includes a depth map. One embodiment of the invention includes a processor and memory containing a rendering application and a light field image file including an encoded image, a set of low resolution images, and metadata describing the encoded image, where the metadata comprises a depth map that specifies depths from the reference viewpoint for pixels in the encoded image. In addition, the rendering application configures the processor to: locate the encoded image within the light field image file; decode the encoded image; locate the metadata within the light field image file; and post process the decoded image by modifying the pixels based on the depths indicated within the depth map and the set of low resolution images to create a rendered image.
A convolutional neural network (CNN) for an image processing system comprises an image cache responsive to a request to read a block of N×M pixels extending from a specified location within an input map to provide a block of N×M pixels at an output port. A convolution engine reads blocks of pixels from the output port, combines blocks of pixels with a corresponding set of weights to provide a product, and subjects the product to an activation function to provide an output pixel value. The image cache comprises a plurality of interleaved memories capable of simultaneously providing the N×M pixels at the output port in a single clock cycle. A controller provides a set of weights to the convolution engine before processing an input map, causes the convolution engine to scan across the input map by incrementing a specified location for successive blocks of pixels and generates an output map within the image cache by writing output pixel values to successive locations within the image cache.
Passive alignment of array camera modules constructed from lens stack arrays and sensors based upon alignment information obtained during manufacture of array camera modules using an active alignment process
Systems and methods in accordance with embodiments of the invention actively align a representative optic array with an imager array, and subsequently passively align constituent optic arrays with constituent imager arrays based on data from the active alignment. In one embodiment, a method of aligning a plurality of lens stack arrays with a corresponding plurality of sensors includes: aligning a first lens stack array relative to a first sensor, varying the spatial relationship between the first lens stack array and the first sensor; capturing images of a known target using the arrangement at different spatial relationships between the first lens stack array and the first sensor; scoring the quality of the captured images; and aligning at least a second lens stack array relative to at least a second sensor, based on the scored images and the corresponding spatial relationships by which the scored images were obtained.
A hand-held digital image capture device (digital camera) has a user-selectable mode in which, upon engaging the mode, the device detects a face in the field of view of the device and generates a face delimiter on a camera display screen, the delimiter surrounding the initial position of the image of the face on the screen. The device is arranged to indicate thereafter to the user if the device departs from movement along a predetermined concave path P with the optical axis of the device pointing towards the face, such indication being made by movement of the image of the face relative to the delimiter. The camera captures and stores a plurality of images at successive positions along the concave path.
H04N 13/221 - Image signal generators using stereoscopic image cameras using a single 2D image sensor using the relative movement between cameras and objects
90.
Extended color processing on Pelican array cameras
Systems and methods for extended color processing on Pelican array cameras in accordance with embodiments of the invention are disclosed. In one embodiment, a method of generating a high resolution image includes obtaining input images, where a first set of images includes information in a first band of visible wavelengths and a second set of images includes information in a second band of visible wavelengths and non-visible wavelengths, determining an initial estimate by combining the first set of images into a first fused image, combining the second set of images into a second fused image, spatially registering the fused images, denoising the fused images using bilateral filters, normalizing the second fused image in the photometric reference space of the first fused image, combining the fused images, determining a high resolution image that when mapped through a forward imaging transformation matches the input images within at least one predetermined criterion.
A method of tracking an object across a stream of images comprises determining a region of interest (ROI) bounding the object in an initial frame of an image stream. A HOG map is provided for the ROI by: dividing the ROI into an array of M×N cells, each cell comprising a plurality of image pixels; and determining a HOG for each of the cells. The HOG map is stored as indicative of the features of the object. Subsequent frames are acquired from the stream of images. The frames are scanned ROI by ROI to identify a candidate ROI having a HOG map best matching the stored HOG map features. If the match meets a threshold, the stored HOG map indicative of the features of the object is updated according to the HOG map for the best matching candidate ROI.
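The match-and-update step of the tracking abstract above can be sketched as below, taking already-computed HOG maps as flattened lists of numbers. The L1 distance, the `max_distance` form of the threshold, and the blending update with `learning_rate` are assumptions made for illustration; the abstract only says that the stored map is updated according to the best matching candidate.

```python
def hog_distance(a, b):
    """Assumed similarity measure: L1 distance between two flattened HOG maps."""
    return sum(abs(x - y) for x, y in zip(a, b))


def track_step(stored, candidates, max_distance, learning_rate=0.5):
    """Pick the best matching candidate ROI and update the stored HOG map.

    stored     -- flattened HOG map describing the tracked object
    candidates -- dict mapping an ROI (x, y, w, h) to its flattened HOG map
    Returns (best_roi or None, possibly updated stored map).
    """
    best_roi = min(candidates, key=lambda roi: hog_distance(stored, candidates[roi]))
    best_map = candidates[best_roi]
    if hog_distance(stored, best_map) > max_distance:
        return None, stored  # match too weak: keep the stored features unchanged
    # Assumed update rule: blend the stored map towards the best match.
    updated = [(1 - learning_rate) * s + learning_rate * b
               for s, b in zip(stored, best_map)]
    return best_roi, updated


candidates = {(0, 0, 2, 2): [0.0, 1.0], (1, 0, 2, 2): [1.0, 0.5]}
print(track_step([1.0, 0.0], candidates, max_distance=1.0))
```

Updating the stored map only on a sufficiently good match lets the model follow gradual changes in appearance without drifting onto the background when the object is lost.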
Systems and methods for synthesizing images from image data captured by an array camera using restricted depth of field depth maps in which depth estimation precision varies
Systems and methods are described for generating restricted depth of field depth maps. In one embodiment, an image processing pipeline application configures a processor to: determine a desired focal plane distance and a range of distances corresponding to a restricted depth of field for an image rendered from a reference viewpoint; generate a restricted depth of field depth map from the reference viewpoint using the set of images captured from different viewpoints, where depth estimation precision is higher for pixels with depth estimates within the range of distances corresponding to the restricted depth of field and lower for pixels with depth estimates outside of the range of distances corresponding to the restricted depth of field; and render a restricted depth of field image from the reference viewpoint using the set of images captured from different viewpoints and the restricted depth of field depth map.
Systems and methods in accordance with embodiments of the invention implement one-dimensional array cameras, as well as modular array cameras using sub-array modules. In one embodiment, a 1×N array camera module includes: a 1×N arrangement of focal planes, where N is greater than or equal to 2, each focal plane includes a plurality of rows of pixels that also form a plurality of columns of pixels, and each focal plane not including pixels from another focal plane; and a 1×N arrangement of lens stacks, the arrangement of lens stacks being disposed relative to the arrangement of focal planes so as to form a 1×N arrangement of cameras, each configured to independently capture an image of a scene, where each lens stack has a field of view that is shifted with respect to that of each other lens stack so that each shift includes a sub-pixel shifted view of the scene.
H04N 9/09 - Picture signal generators with more than one pick-up device
H04N 5/33 - Transforming infrared radiation
H04N 5/349 - Extracting pixel data from an image sensor by controlling scanning circuits, e.g. by modifying the number of pixels having been sampled or to be sampled, for increasing resolution by shifting the sensor relative to the scene
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
H04N 5/247 - Arrangement of television cameras
H04N 5/369 - Transforming light or analogous information into electric information using solid-state image sensors [SSIS sensors], circuitry associated therewith
H04N 5/378 - Readout circuits, e.g. correlated double sampling [CDS] circuits, output amplifiers or A/D converters
H04N 5/262 - Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects
94.
Array camera configurations incorporating constituent array cameras and constituent cameras
Systems and methods for implementing array camera configurations that include a plurality of constituent array cameras, where each constituent array camera provides a distinct field of view and/or a distinct viewing direction, are described. In several embodiments, image data captured by the constituent array cameras is used to synthesize multiple images that are subsequently blended. In a number of embodiments, the blended images include a foveated region. In certain embodiments, the blended images possess a wider field of view than the fields of view of the multiple images.
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
H04N 5/247 - Arrangement of television cameras
H04N 5/369 - Transforming light or analogous information into electric information using solid-state image sensors [SSIS sensors], circuitry associated therewith
G02B 13/02 - Telephoto objectives, i.e. systems of the type + - in which the distance from the front vertex to the image plane is less than the equivalent focal length
G06T 3/40 - Scaling of whole images or parts thereof, e.g. expanding or contracting
G06T 7/557 - Depth or shape recovery from multiple images from light fields, e.g. from plenoptic cameras
Array cameras, and array camera modules incorporating independently aligned lens stacks are disclosed. Processes for manufacturing array camera modules including independently aligned lens stacks can include: forming at least one hole in at least one carrier; mounting the at least one carrier relative to at least one sensor so that light passing through the at least one hole in the at least one carrier is incident on a plurality of focal planes formed by arrays of pixels on the at least one sensor; and independently mounting a plurality of lens barrels to the at least one carrier, so that a lens stack in each lens barrel directs light through the at least one hole in the at least one carrier and focuses the light onto one of the plurality of focal planes.
G02B 27/62 - Optical apparatus specially adapted for adjusting optical elements during the assembly of optical systems
G02B 7/02 - Mountings, adjusting means, or light-tight connections, for optical elements for lenses
H01L 25/04 - Assemblies consisting of a plurality of semiconductor or other solid-state devices, all the devices being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers
H04N 5/235 - Circuitry for compensating for variation in the brightness of the object
H04N 5/369 - Transforming light or analogous information into electric information using solid-state image sensors [SSIS sensors], circuitry associated therewith
Systems and methods for synthesizing high resolution images using image deconvolution and depth information in accordance with embodiments of the invention are disclosed. In one embodiment, an array camera includes a processor and a memory, wherein an image deconvolution application configures the processor to obtain light field image data, determine motion data based on metadata contained in the light field image data, generate a depth-dependent point spread function based on the synthesized high resolution image, the depth map, and the motion data, measure the quality of the synthesized high resolution image based on the generated depth-dependent point spread function, and when the measured quality of the synthesized high resolution image is within a quality threshold, incorporate the synthesized high resolution image into the light field image data.
An image processing apparatus comprises a normalisation module operatively connected across a bus to a memory storing an image in which a region of interest (ROI) has been identified within the image. The ROI is bound by a rectangle having a non-orthogonal orientation within the image. In one embodiment, the normalisation module is arranged to divide the ROI into one or more slices, each slice comprising a plurality of adjacent rectangular tiles. For each slice, the apparatus successively reads ROI information for each tile from the memory including: reading a portion of the image extending across at least a width of the slice line-by-line along an extent of a slice. For each tile, the apparatus downsamples the ROI information to a buffer to within a scale SD<2 of a required scale for a normalised version of the ROI. The apparatus then fractionally downsamples and rotates downsampled information for a tile within the buffer to produce a respective normalised portion of the ROI at the required scale for the normalised ROI. Downsampled and rotated information is accumulated for each tile within a normalised ROI buffer for subsequent processing by the image processing apparatus.
Systems and methods in accordance with embodiments of the invention are disclosed that use super-resolution (SR) processes to use information from a plurality of low resolution (LR) images captured by an array camera to produce a synthesized higher resolution image. One embodiment includes obtaining input images using the plurality of imagers, using a microprocessor to determine an initial estimate of at least a portion of a high resolution image using a plurality of pixels from the input images, and using a microprocessor to determine a high resolution image that when mapped through the forward imaging transformation matches the input images to within at least one predetermined criterion using the initial estimate of at least a portion of the high resolution image. In addition, each forward imaging transformation corresponds to the manner in which each imager in the imaging array generates the input images, and the high resolution image synthesized by the microprocessor has a resolution that is greater than any of the input images.
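The idea of refining an initial estimate until it matches the input images when mapped through the forward imaging transformation can be illustrated with a one-dimensional toy problem. Everything specific here is an assumption: the forward model (each imager averages neighbouring pairs of high resolution samples, circularly, at its own offset), the interleaving used for the initial estimate, the back-projection update, and the tolerance `tol` standing in for the predetermined criterion. A real array camera pipeline would also have to model blur, parallax and noise.

```python
def forward(hr, offset):
    """Hypothetical forward imaging transformation for the imager at `offset`."""
    n = len(hr)
    return [(hr[(i + offset) % n] + hr[(i + offset + 1) % n]) / 2
            for i in range(0, n, 2)]


def back_project(residual, offset, n):
    """Spread each low resolution residual back onto the samples that produced it."""
    out = [0.0] * n
    for k, r in enumerate(residual):
        i = 2 * k + offset
        out[i % n] += r / 2
        out[(i + 1) % n] += r / 2
    return out


def super_resolve(lr_images, n, tol=1e-6, max_iters=200):
    """Refine an HR estimate until forward(hr) matches every LR image within tol."""
    hr = [0.0] * n
    for offset, image in enumerate(lr_images):  # initial estimate: interleave pixels
        for k, value in enumerate(image):
            hr[(2 * k + offset) % n] = value
    for _ in range(max_iters):
        update, worst = [0.0] * n, 0.0
        for offset, image in enumerate(lr_images):
            residual = [y - f for y, f in zip(image, forward(hr, offset))]
            worst = max(worst, max(abs(r) for r in residual))
            correction = back_project(residual, offset, n)
            update = [u + c for u, c in zip(update, correction)]
        if worst <= tol:  # the estimate reproduces every input image closely enough
            break
        hr = [h + u for h, u in zip(hr, update)]
    return hr


truth = [1.0, 3.0, 2.0, 4.0]
lr_images = [forward(truth, 0), forward(truth, 1)]
print(super_resolve(lr_images, n=4))
```

The result reproduces the low resolution inputs but need not equal `truth`: with this toy forward model, several high resolution signals map to the same inputs, which is why the stopping rule is expressed in terms of the inputs rather than the unknown scene.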
A method and system for detecting facial expressions in digital images, and applications therefor, are disclosed. Analysis of a digital image determines whether or not a smile and/or blink is present on a person's face. Face recognition, and/or a pose or illumination condition determination, permits application of a specific, relatively small classifier cascade.
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
100.
Systems and methods for generating depth maps using camera arrays incorporating monochrome and color cameras
A camera array, an imaging device and/or a method for capturing images that employ a plurality of imagers fabricated on a substrate are provided. Each imager includes a plurality of pixels. The plurality of imagers includes a first imager having a first imaging characteristic and a second imager having a second imaging characteristic. The images generated by the plurality of imagers are processed to obtain an enhanced image compared to images captured by the imagers. Each imager may be associated with an optical element fabricated using a wafer level optics (WLO) technology.