Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for robot navigation. In some implementations, a method includes obtaining sensor data captured by one or more sensors located at a property over a time period; detecting an object represented in the sensor data; detecting, using the detected object and multiple subsets of the sensor data, a movement pattern of the object over the time period; determining an area navigable for a robot at the property using the detected movement pattern of the object over the time period; and providing, to the robot, an indication of the area navigable for the robot.
G05D 1/249 - Arrangements for determining position or orientation using signals provided by artificial sources external to the vehicle, e.g. navigation beacons; from positioning sensors located off-board the vehicle, e.g. from cameras
G05D 1/246 - Arrangements for determining position or orientation using environment maps, e.g. simultaneous localisation and mapping [SLAM]
G05D 101/20 - Details of software or hardware architectures used for the control of position using external object recognition
H04W 4/38 - Services specially adapted for particular environments, situations or purposes for collecting sensor information
2.
DUAL DESCRIPTOR DATA FOR OBJECT RECOGNITION IN LOW LIGHT CONDITIONS
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using dual descriptor data. One of the methods includes: detecting, using a first set of descriptor features included in dual descriptor data, a first representation within first image data collected by a camera; determining a change to an imaging modality of the camera; detecting, using a second set of features included in the dual descriptor data, a second representation within second image data collected by the camera; classifying the first representation and the second representation as associated with a same object using the dual descriptor data; and in response to classifying the first representation and the second representation as associated with the same object using the dual descriptor data, transmitting operational instructions to one or more appliances connected to the system.
G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; Image or video pattern matching; Proximity measures in feature spaces using context analysis; Selection of dictionaries
G06T 7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
G06V 10/60 - Extraction of image or video features relating to illumination properties, e.g. using a reflectance or lighting model
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/94 - Hardware or software architectures specially adapted for image or video understanding
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
H04N 23/11 - Cameras or camera modules comprising electronic image sensors; Control thereof for generating image signals from different wavelengths for generating image signals from visible and infrared light wavelengths
H04N 23/667 - Camera operation mode switching, e.g. between still and video, sport and normal or high and low resolution modes
3.
REDUCING FALSE DETECTIONS FOR NIGHT VISION CAMERAS
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reducing camera false detections. One of the methods includes providing, to a neural network of an image classifier that is trained to detect objects of two or more classification types, a feature vector for a respective training image; receiving, from the neural network, an output vector that indicates, for each of the two or more classification types, a likelihood that the respective training image depicts an object of the corresponding classification type; accessing, from two or more ground truth vectors each for one of the two or more classification types, a ground truth vector for the classification type of an object depicted in the training image; adjusting one or more weights in the neural network using the output vector and the ground truth vector; and storing, in a memory, the image classifier.
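The training loop described in the abstract above (comparing the network's output vector for a training image against the one-hot ground truth vector for its classification type, then adjusting weights) can be sketched for a single linear classification head; all names and data below are illustrative assumptions, not taken from the application:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D logit vector.
    e = np.exp(z - z.max())
    return e / e.sum()

def train_step(weights, feature_vec, class_index, n_classes, lr=0.1):
    """One update of a hypothetical linear classifier head: compare the
    output vector (a likelihood per classification type) against the
    one-hot ground truth vector for the labeled class, then adjust the
    weights by a cross-entropy gradient step."""
    output_vec = softmax(weights @ feature_vec)
    ground_truth = np.zeros(n_classes)
    ground_truth[class_index] = 1.0
    # Gradient of cross-entropy w.r.t. the logits is (output - ground truth).
    weights -= lr * np.outer(output_vec - ground_truth, feature_vec)
    return weights, output_vec

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4)) * 0.01                 # 3 classes, 4 features
x = np.array([1.0, 0.5, -0.2, 0.3])                # feature vector for one image
W, out = train_step(W, x, class_index=2, n_classes=3)
```

After the step, the classifier's likelihood for the labeled class on this same feature vector should increase, which is the sense in which the output and ground truth vectors drive the weight adjustment.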
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for retroactive event detection. The methods, systems, and apparatus include actions of obtaining an image captured by a camera at a current time, determining that the image depicts a change in a region from a previous image captured by the camera at a previous time, determining, based on determining that the image depicts the change in the region, whether the change depicted in the image is of a known object type, determining, based on the determination that the change depicted in the image is of a known object type, whether the change does not correspond to a previously detected event, and determining, based on the determination that the change does not correspond to a previously detected event, whether the images captured by the camera between the current time and the previous time depict an event.
Disclosed are methods, systems, and apparatus for object localization in video. A method includes obtaining a reference image of an object; generating, from the reference image, homographic adapted images showing the object at various locations with various orientations; determining interest points from the homographic adapted images; determining locations of an object center in the homographic adapted images relative to the interest points; obtaining a sample image of the object; identifying matched pairs of interest points, each matched pair including an interest point from the homographic adapted images and a matching interest point in the sample image; and determining a location of the object in the sample image based on the locations of the object center in the homographic adapted images relative to the matched pairs. The method includes generating a homography matrix; and projecting the reference image of the object to the sample image using the homography matrix.
G06V 10/24 - Aligning, centring, orientation detection or correction of the image
G06T 7/55 - Depth or shape recovery from multiple images
G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
G06V 10/22 - Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
G06V 10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
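A minimal sketch of the voting step in the object-localization abstract above, assuming each reference interest point stores a known offset to the object center (a representation the abstract implies but does not spell out); identifiers and coordinates are hypothetical:

```python
import numpy as np

def locate_object(matched_pairs, center_offsets):
    """Estimate the object center in a sample image. Each matched pair maps
    a reference interest point (with a stored offset to the object center)
    to its matching interest point in the sample image; every match casts a
    vote for where the center must lie, and the median vote is robust to a
    few bad matches."""
    votes = []
    for ref_idx, sample_xy in matched_pairs:
        votes.append(np.asarray(sample_xy) + center_offsets[ref_idx])
    return np.median(np.array(votes), axis=0)

# Hypothetical data: three reference interest points whose stored offsets
# all point at an object center of (5, 5) in the sample image.
center_offsets = {
    0: np.array([2.0, 0.0]),
    1: np.array([0.0, 3.0]),
    2: np.array([-1.0, -1.0]),
}
pairs = [(0, (3.0, 5.0)), (1, (5.0, 2.0)), (2, (6.0, 6.0))]
center = locate_object(pairs, center_offsets)
```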
Systems and methods for detecting and monitoring areas of water damage or water management problems in a property are described. Monitoring devices can be deployed at different locations of a property to obtain sensor data and image data regarding the environmental conditions in potentially problematic areas of the property. An image obtained by the monitoring devices can be processed and compared with reference images or other image data that is filtered in a different manner than the image to identify portions of the image in which water damage or water management problems exist.
Methods and systems, including computer programs encoded on a storage medium, are described for implementing item monitoring using a doorbell camera. A system generates an input video stream that has image frames corresponding to detection of activity at a property. Timing information is generated for the video stream and includes a timestamp for each image frame of the stream. Using the timing information, the system processes a pre-event image frame that precedes detection of the activity and a post-event image frame that coincides with detection of the activity. An image score is computed with respect to placement of a candidate item at the property in response to processing the pre-event and post-event image frames. The image score is used to determine that a first item was delivered to the property or that a second item was removed after being delivered to the property.
G06V 20/40 - Scenes; Scene-specific elements in video content
G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
G06F 18/22 - Matching criteria, e.g. proximity measures
G06V 10/22 - Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
G06V 10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
H04N 7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
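One plausible reading of the image-score computation in the item-monitoring abstract above, using a simple brightness difference between the pre-event and post-event frames over a candidate region; the region format, threshold, and labels are assumptions for illustration:

```python
import numpy as np

def item_change(pre_frame, post_frame, region, threshold=10.0):
    """Score placement of a candidate item by comparing a pre-event frame
    (preceding detection of activity) with a post-event frame (coinciding
    with it), restricted to the candidate region. A large brightness
    increase suggests an item was delivered; a large decrease suggests a
    previously delivered item was removed."""
    y0, y1, x0, x1 = region
    score = float(post_frame[y0:y1, x0:x1].mean() - pre_frame[y0:y1, x0:x1].mean())
    if score > threshold:
        return "delivered", score
    if score < -threshold:
        return "removed", score
    return "no_change", score

# Hypothetical frames: a bright package appears on an empty doorstep.
pre = np.zeros((8, 8))
post = np.zeros((8, 8))
post[2:6, 2:6] = 100.0
label, score = item_change(pre, post, region=(2, 6, 2, 6))
```

A real system would score appearance features rather than raw brightness, but the delivered/removed decision structure is the same.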
8.
ADJUSTING AREAS OF INTEREST FOR MOTION DETECTION IN CAMERA SCENES
Disclosed are methods, systems, and apparatus for adjusting areas of interest for motion detection in camera scenes. A method includes obtaining a map of false motion event detections using a first area of interest; identifying an overlap area between the map of false detections and the first area of interest; determining a second area of interest that includes portions of the first area of interest and excludes at least a part of the overlap area; obtaining a map of true motion event detections using the first area of interest; determining whether true detections using the second area of interest compared to true detections using the first area of interest satisfy performance criteria; and in response to determining that true detections using the second area of interest compared to true detections using the first area of interest satisfy performance criteria, providing the second area of interest for use in detecting events.
G08B 13/196 - Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems using television cameras
H04N 7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
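The acceptance check in the abstract above can be sketched on a grid of cells, with the second area of interest formed by excluding the overlap with the false-detection map and accepted only if it retains enough true detections; the retention criterion and all data are illustrative assumptions:

```python
def adjust_aoi(first_aoi, false_map, true_map, min_retention=0.9):
    """Grid-cell sketch: build a second area of interest by removing from
    the first AOI the cells where false motion events were detected, then
    provide it for event detection only if it retains enough of the true
    detections; otherwise keep the first AOI. Areas and maps are sets of
    (x, y) grid cells."""
    overlap = first_aoi & false_map          # overlap of false map and first AOI
    second_aoi = first_aoi - overlap         # exclude the overlap area
    true_first = len(true_map & first_aoi)
    true_second = len(true_map & second_aoi)
    retention = true_second / true_first if true_first else 1.0
    if retention >= min_retention:           # performance criterion satisfied
        return second_aoi, retention
    return first_aoi, retention

# Hypothetical scene: a 10x10 AOI, a corner that keeps triggering false
# motion events (say, passing headlights), and true detections elsewhere.
first = {(x, y) for x in range(10) for y in range(10)}
false_hits = {(0, 0), (0, 1), (1, 0)}
true_hits = {(5, 5), (6, 6), (7, 7)}
aoi, retention = adjust_aoi(first, false_hits, true_hits)
```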
9.
ENHANCING MONITORING SYSTEM WITH AUGMENTED REALITY
A computer-implemented method includes obtaining an image of an area of a property from an augmented reality device, identifying the area of the property based on the image obtained from the augmented reality device, determining that the area of the property corresponds to an event at the property or a configuration of a monitoring system of the property, and providing, in response to determining that the area of the property corresponds to the event or the configuration, information that represents the event or the configuration and that is configured to be displayed on the augmented reality device.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving multiple images from a camera, each image of the multiple images representative of a detection of an object within the image. For each image of the multiple images the methods include: determining a set of detected objects within the image, each object defined by a respective bounding box, and determining, from the set of detected objects within the image and ground truth labels, a false detection of a first object. The methods further include determining that a target object threshold is met based on a number of false detections of the first object in the multiple images, generating, based on the number of false detections for the first object meeting the target object threshold, an adversarial mask for the first object, and providing, to the camera, the adversarial mask.
G06V 10/98 - Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
G06V 20/20 - Scenes; Scene-specific elements in augmented reality scenes
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
Methods and systems for image-based abnormal event detection are disclosed. An example method includes obtaining a sequential set of images captured by a camera; generating a set of observed features for each of the images; generating a set of predicted features based on a portion of the sets of observed features that excludes the set of observed features for a last image in the sequential set of images; determining that a difference between the set of predicted features and the set of observed features for the last image in the sequential set of images satisfies abnormal event criteria; and in response to determining that the difference between the set of predicted features and the set of observed features for the last image in the sequential set of images satisfies abnormal event criteria, classifying the set of sequential images as showing an abnormal event.
G06V 20/40 - Scenes; Scene-specific elements in video content
G06F 18/22 - Matching criteria, e.g. proximity measures
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/74 - Image or video pattern matching; Proximity measures in feature spaces
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
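The prediction-versus-observation test in the abnormal-event abstract above can be sketched with a deliberately simple predictor (the mean of the earlier feature sets stands in for whatever learned model an implementation would use); the threshold and data are hypothetical:

```python
import numpy as np

def abnormal_event(feature_sets, threshold=5.0):
    """Generate predicted features for the last image from the earlier
    feature sets (excluding the set for the last image), then flag an
    abnormal event when the observed features of the last image differ
    too much from the prediction."""
    observed_last = feature_sets[-1]
    predicted = np.mean(feature_sets[:-1], axis=0)   # excludes the last image
    difference = float(np.linalg.norm(observed_last - predicted))
    return difference > threshold, difference

# Hypothetical feature sequences: a steady scene, then one with a spike.
normal = [np.array([1.0, 1.0])] * 5
calm, d1 = abnormal_event(np.array(normal))
spike = normal[:-1] + [np.array([10.0, 10.0])]
alarm, d2 = abnormal_event(np.array(spike))
```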
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for fetching and guarding delivered packages using robots. One of the methods includes: determining a characteristic of a package at a first location; determining a risk score for the package at the first location; determining, using the risk score and the characteristic of the package, to perform an action for the package; in response to determining to perform the action for the package, assigning a robot to perform the action; and deploying the robot to perform the action.
G05D 1/10 - Simultaneous control of position or course in three dimensions
G06Q 10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for object detection. One of the methods includes determining, using first sensor data, a detection result on whether to trigger an event alerting a presence of an object in a target area by executing one or more models; determining, using second sensor data, a ground truth for the event that indicates whether an object is present in the target area; determining a difference value by comparing the detection result and the ground truth; adjusting at least one parameter of the one or more models in response to determining that the difference value does not satisfy one or more threshold criteria; and determining a new detection result on whether to trigger a second event by executing the one or more models with adjusted parameters using new first sensor data.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for tracking objects of interest using distance-based thresholding. One of the methods includes detecting an object depicted in an image captured by a camera, determining a predicted physical distance between the object and the camera, selecting, from a plurality of predetermined confidence thresholds, a confidence threshold for the predicted physical distance, each confidence threshold in the plurality of predetermined confidence thresholds for a different physical distance range, the confidence threshold having a physical distance range that includes the predicted physical distance, and determining, using the confidence threshold and a confidence score that indicates a likelihood that the object is an object of interest, that the object is likely an object of interest.
G06T 7/70 - Determining position or orientation of objects or cameras
G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
G06F 18/2113 - Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
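The distance-based thresholding in the abstract above can be sketched as a lookup over predetermined (maximum distance, threshold) ranges; the specific ranges, and the choice to relax the threshold with distance (since small, distant objects tend to score lower), are illustrative assumptions:

```python
def select_threshold(distance_m,
                     thresholds=((5.0, 0.8), (15.0, 0.65), (float("inf"), 0.5))):
    """Select, from predetermined confidence thresholds, the one whose
    physical distance range includes the predicted distance between the
    detected object and the camera. Each entry is (max_distance_m,
    threshold) for a different distance range; values are hypothetical."""
    for max_distance, threshold in thresholds:
        if distance_m <= max_distance:
            return threshold
    return thresholds[-1][1]

def is_object_of_interest(confidence, distance_m):
    """Apply the selected threshold to the detector's confidence score."""
    return confidence >= select_threshold(distance_m)
```

With these illustrative ranges, a confidence of 0.6 passes for an object predicted 50 m away but fails for one 3 m away, where a sharper, closer view justifies demanding higher confidence.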
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for real-motion prediction. One of the methods includes: providing, as an input to a machine learning model, image frames of a scene for which the image frames were captured over a period of time; obtaining, as an output from the machine learning model, a temporally aggregated optical flow signature that includes a two-dimensional (2D) motion vector for a plurality of locations in the image frames of the scene; detecting, using the temporally aggregated optical flow signature, a real-motion event by comparing a magnitude of each 2D motion vector with a threshold; and performing an action for the real-motion event in response to detecting the real-motion event.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for real-motion prediction. One of the methods includes: providing, as an input to a machine learning model, image frames of a scene for which the image frames were captured over a period of time; obtaining, as an output from the machine learning model, a temporally aggregated optical flow signature that includes a two-dimensional (2D) motion vector for a plurality of locations in the image frames of the scene; detecting, using the temporally aggregated optical flow signature, a real-motion event by comparing a magnitude of each 2D motion vector with a threshold; and performing an action for the real-motion event in response to detecting the real-motion event.
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
G08B 13/196 - Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems using television cameras
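The flow-signature test in the real-motion abstracts above can be sketched with mean-over-time as one plausible temporal aggregation; the array shapes, threshold, and synthetic flow field are assumptions:

```python
import numpy as np

def detect_real_motion(flow_frames, magnitude_threshold=2.0):
    """Temporally aggregate per-frame 2D optical-flow fields (here by
    averaging over time) into a flow signature holding a 2D motion vector
    per location, then flag a real-motion event at every location whose
    aggregated vector magnitude exceeds the threshold. Averaging tends to
    suppress jitter that flips direction frame to frame while preserving
    consistent motion."""
    signature = np.mean(flow_frames, axis=0)          # (H, W, 2) signature
    magnitudes = np.linalg.norm(signature, axis=-1)   # per-location |vector|
    return magnitudes > magnitude_threshold, signature

# Hypothetical flow over 10 frames: a still 6x6 scene except a 4x4 patch
# moving right at 3 px/frame.
flow = np.zeros((10, 6, 6, 2))
flow[:, 1:5, 1:5, 0] = 3.0
events, signature = detect_real_motion(flow)
```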
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for robot navigation. One of the methods includes obtaining one or more images of an area from a robot; detecting two or more lines within the one or more images; identifying at least two of the two or more lines as vanishing lines; determining, using the vanishing lines, a correction maneuver; and controlling the robot to implement the correction maneuver.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for consolidation of alerts using a similarity criterion. One of the methods includes: receiving, from a sensor at a property for which a first action in a sequence of actions was detected by a second sensor at the property, data that identifies a second action at the property; determining whether the second action satisfies a similarity criterion for an action in the sequence of actions that define an event at the property; and in response to determining that the second action satisfies the similarity criterion for the action in the sequence of actions that define the event at the property, performing an event operation for the event at the property.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for Object Embedded Learning. One of the methods includes maintaining data that represents an image; providing, to a machine learning model, the data that represents the image; receiving, from the machine learning model, output data that includes i) an object detection result that indicates whether a target object is detected in the image and ii) an object embedding for the target object; and determining whether to perform an automated action using the output data.
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/74 - Image or video pattern matching; Proximity measures in feature spaces
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for updating a knowledge distillation training system. One of the methods includes: providing, to a teacher model in a knowledge distillation training system, first data representing an image to cause the teacher model to generate teacher output data that indicates whether the image depicts an object of interest; providing, to a student model in the knowledge distillation training system, second data representing the image to cause the student model to generate student output data that indicates whether the image depicts an object of interest; determining whether an accuracy of the teacher output data satisfies an accuracy threshold; and in response to determining that the accuracy of the teacher output data does not satisfy the accuracy threshold: determining to skip updating the student model with the teacher output data; and updating the student model using the student output data and ground truth data.
Methods, systems, and apparatus for ground plane filtering of video events are disclosed. A method includes obtaining a first set of images of a scene from a camera; determining a ground plane from the first set of images of the scene; obtaining a second set of images of the scene after the first set of images of the scene is obtained; determining that movement shown by a group of pixels in the second set of images of the scene satisfies motion criteria; determining that the ground plane corresponds with at least a portion of the group of pixels; and in response to determining that movement shown by the group of pixels in the second set of images of the scene satisfies motion criteria, and that the ground plane corresponds with at least a portion of the group of pixels, classifying the group of pixels as showing ground plane based motion.
G06T 7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
G06V 20/40 - Scenes; Scene-specific elements in video content
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
G06V 10/50 - Extraction of image or video features by performing operations within image blocks; Extraction of image or video features by using histograms, e.g. histogram of oriented gradients [HoG]; Extraction of image or video features by summing image-intensity values; Projection analysis
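The ground-plane filtering in the abstract above can be sketched with boolean masks, using an overlap ratio as one way to decide that the ground plane corresponds with at least a portion of the moving pixel group; the ratio and data are illustrative:

```python
import numpy as np

def classify_ground_motion(moving_pixels, ground_plane_mask, overlap_ratio=0.5):
    """Given a group of pixels whose movement satisfied the motion criteria
    and a boolean ground-plane mask estimated from an earlier set of images,
    classify the group as showing ground-plane-based motion (e.g. shadows or
    headlights sweeping the floor) when enough of it lies on the plane."""
    on_ground = sum(ground_plane_mask[y, x] for y, x in moving_pixels)
    return on_ground / len(moving_pixels) >= overlap_ratio

# Hypothetical 6x6 scene where the bottom two rows are the floor.
ground = np.zeros((6, 6), dtype=bool)
ground[4:, :] = True
shadow = [(4, 1), (4, 2), (5, 1), (5, 2)]    # moving group entirely on the floor
person = [(1, 1), (2, 1), (3, 1), (4, 1)]    # moving group mostly above the floor
is_shadow_ground = classify_ground_motion(shadow, ground)
is_person_ground = classify_ground_motion(person, ground)
```

Classifying ground-plane motion separately lets a system suppress alerts for floor-level appearance changes while still reporting upright moving objects.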
Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for generating rules for a monitoring system. In some implementations, a method includes presenting a virtual model of a monitoring system of a property; obtaining action data from a computer generated avatar in the virtual model; generating a security rule for the monitoring system using the action data from the computer generated avatar in the virtual model; and providing, to the monitoring system, the security rule to cause the monitoring system to generate, using the security rule and sensor data from the property received by the monitoring system, a security alert.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for estimating a robot pose. One of the methods includes the actions of obtaining two or more images captured at two or more locations on a property; detecting feature points at positions within the two or more images, including first feature points in a first image and second feature points in a second image; comparing the positions of the first feature points in the first image to positions of the second feature points in the second image; obtaining data indicating the two or more locations on the property; comparing the two or more locations; and generating depth data for the feature points for use by a robot navigating the property.
Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for generating a placement of a camera. In some implementations, a method includes maintaining monitoring information of a property; generating a virtual model of the property; determining an initial placement location for a virtual camera in the virtual model; obtaining image data generated in the virtual model from the virtual camera placed at the initial placement location; analyzing the image data; determining, using the analysis of the image data, whether to identify an updated placement location; and providing the placement location for a physical camera at the property to a device using a result of the determination whether to identify the updated placement location.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for estimating a robot pose. One of the methods includes the actions of obtaining two or more images captured at two or more locations on a property; detecting feature points at positions within the two or more images, including first feature points in a first image and second feature points in a second image; comparing the positions of the first feature points in the first image to positions of the second feature points in the second image; obtaining data indicating the two or more locations on the property; comparing the two or more locations; and generating depth data for the feature points for use by a robot navigating the property.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for object detection using positional prior filtering. One of the methods includes: obtaining, from a plurality of first images, person bounding boxes and face bounding boxes that each correspond to one of the person bounding boxes, each person bounding box identifying at least one portion of a respective image of the plurality of first images that likely represents a person; training a face location predictor to predict a location of a face in an image using the person bounding boxes and the face bounding boxes; training, using the face location predictor, an error model that determines a likelihood that an image depicts a face using output from the face location predictor; and storing, in memory, the trained error model, and the face location predictor for use by a device detecting faces depicted in an image.
G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
G07C 9/00 - Individual registration on entry or exit
Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for managing alerts based on predicted user activity. In some implementations, a request is received to send an alert to a target device of a target user. A status of the target user is determined. One or more properties of the alert are determined. Using at least the status of the target user and the one or more properties of the alert, the alert is presented on the target device according to a determined delay from a first time period during which the alert would normally be presented to a second, later time period.
H04W 4/16 - Communication-related supplementary services, e.g. call-transfer or call-hold
H04W 68/04 - User notification, e.g. alerting or paging, for incoming communication, change of service or the like multi-step notification using statistical or historical mobility data
G06Q 10/107 - Computer-aided management of electronic mailing [e-mailing]
G08B 3/10 - Audible signalling systems; Audible personal calling systems using electric transmission; Audible signalling systems; Audible personal calling systems using electromagnetic transmission
G08B 23/00 - Alarms responsive to unspecified undesired or abnormal conditions
H04M 1/72421 - User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality for supporting emergency services with automatic activation of emergency service functions, e.g. upon sensing an alarm
H04M 1/7243 - User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
H04M 1/72451 - User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions according to schedules, e.g. using calendar applications
Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for managing alerts based on predicted user activity. In some implementations, a request is received to send an alert to a target device of a target user. A status of the target user is determined. One or more properties of the alert are determined. Using at least the status of the target user and the one or more properties of the alert, the alert is presented on the target device according to a determined delay from a first time period during which the alert would normally be presented to a second, later time period.
Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for obtaining a sample Light Detection and Ranging (LIDAR) profile generated by a drone; selecting a reference position based on the sample LIDAR profile; determining a LIDAR profile-based translation and rotation relative to a reference LIDAR profile of the reference position; determining an image-based translation and rotation relative to a reference image of the reference position; determining whether the LIDAR profile-based translation and rotation and the image-based translation and rotation satisfy a similarity threshold; and verifying, using a result of the determination, a predicted position of the drone.
Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for obtaining a sample Light Detection and Ranging (LIDAR) profile generated by a drone; selecting a reference position based on the sample LIDAR profile; determining a LIDAR profile-based translation and rotation relative to a reference LIDAR profile of the reference position; determining an image-based translation and rotation relative to a reference image of the reference position; determining whether the LIDAR profile-based translation and rotation and the image-based translation and rotation satisfy a similarity threshold; and verifying, using a result of the determination, a predicted position of the drone.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using dual descriptor data. One of the methods includes: detecting, using a first set of descriptor features included in dual descriptor data, a first representation within first image data collected by a camera; determining a change to an imaging modality of the camera; detecting, using a second set of features included in the dual descriptor data, a second representation within second image data collected by the camera; classifying the first representation and the second representation as associated with a same object using the dual descriptor data; and in response to classifying the first representation and the second representation as associated with the same object using the dual descriptor data, transmitting operational instructions to one or more appliances connected to the system.
G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; Image or video pattern matching; Proximity measures in feature spaces using context analysis; Selection of dictionaries
G06T 7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
G06V 10/143 - Sensing or illuminating at different wavelengths
G06V 10/60 - Extraction of image or video features relating to illumination properties, e.g. using a reflectance or lighting model
G06V 10/74 - Image or video pattern matching; Proximity measures in feature spaces
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/94 - Hardware or software architectures specially adapted for image or video understanding
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
H04N 23/11 - Cameras or camera modules comprising electronic image sensors; Control thereof for generating image signals from different wavelengths; for generating image signals from visible and infrared light wavelengths
H04N 23/90 - Arrangement of cameras or camera modules, e.g. multiple cameras in TV studios or sports stadiums
H04N 23/667 - Camera operation mode switching, e.g. between still and video, sport and normal or high and low resolution modes
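The dual-descriptor approach described in the abstract above can be sketched as follows. This is a hypothetical illustration, not the patented method: one record holds two descriptor sets so an object detected under one imaging modality (e.g. visible light) can be re-identified after the camera switches to another (e.g. infrared). The class names, feature values, and the cosine-similarity matching rule are all illustrative assumptions.

```python
import math

def cosine(a, b):
    # Cosine similarity between two feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

class DualDescriptor:
    """Hypothetical record holding one descriptor set per imaging modality."""

    def __init__(self, obj_id, rgb_features, ir_features):
        self.obj_id = obj_id
        self.features = {"rgb": rgb_features, "ir": ir_features}

    def match(self, modality, observed, threshold=0.9):
        # True if the observed feature vector matches this object
        # under the camera's current imaging modality.
        return cosine(self.features[modality], observed) >= threshold

# One object enrolled with both descriptor sets.
sofa = DualDescriptor("sofa", rgb_features=[0.9, 0.1, 0.3],
                      ir_features=[0.2, 0.8, 0.5])

# Daytime detection (RGB), then the camera switches to IR at night:
day_hit = sofa.match("rgb", [0.88, 0.12, 0.31])
night_hit = sofa.match("ir", [0.21, 0.79, 0.52])
same_object = day_hit and night_hit  # both representations -> same object
```

When `same_object` holds, the system would classify the two representations as the same object and could transmit operational instructions to connected appliances.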
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for automated package management. One of the methods includes, for an image that depicts a package at a property and a person retrieving the package: determining, using data from the image or from a device for the person, whether the person is authorized to retrieve the package; and in response to determining that the person is authorized to retrieve the package, determining to skip performing an automated action that would have been performed if the person were not authorized to retrieve the package.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for selectively obfuscating audio. One of the methods includes detecting, by a security system for a premises, that a person who was within a threshold area for the premises was likely uttering a sound; determining, using data from one or more sensors, a likelihood that the person was communicating with the security system at the premises; determining whether the likelihood satisfies a threshold likelihood and that the person's voice should not be obfuscated; and in response to determining whether the likelihood satisfies the threshold likelihood and that the person's voice should not be obfuscated, selectively obfuscating the person's voice or maintaining, in memory, an audio signal i) that encodes the person's voice and ii) was captured by a microphone that was physically located at the premises.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing assistive actions using predictive human behavior. One of the methods includes obtaining a first image of a person within a first threshold distance of a property; predicting, using the first image of the person, a task flow including a sequence of activities to be performed by the person; obtaining a second image of the person within a second threshold distance of the property; determining, using the second image, that activities performed by the person do not match the task flow; and in response to determining that the activities performed by the person do not match the predicted task flow, performing one or more actions at the property.
Methods, systems, and apparatus for remote camera-assisted robot guidance are disclosed. A method includes obtaining images of objects approaching a door of a property; identifying candidate paths to the door based on the images of the objects approaching the door of the property; determining movement capabilities of the objects; storing the candidate paths to the door labeled by the movement capabilities of the objects that took the paths; determining capability information for a robot at the property that indicates movement capabilities of the robot; selecting, from the candidate paths, a path for the robot to take to the door based on the movement capabilities of the robot and the labels of the candidate paths; and providing guidance information to the robot that guides the robot to the door along the selected path.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for monitoring extended reality spaces. One of the methods includes selecting, from a physical space at a property, a first available portion of the physical space for representing an extended reality environment; causing presentation of a first portion of the extended reality environment at the first available portion of the physical space; predicting, using sensor data generated from one or more sensors at the property, that the first available portion of the physical space will likely be interfered with; in response to predicting that the first available portion of the physical space will likely be interfered with, selecting, from a plurality of available portions of the physical space, a second available portion for representing the environment; and causing presentation of a second portion of the extended reality environment at the second available portion of the physical space.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for monitoring extended reality spaces. One of the methods includes maintaining, for an extended reality environment generated by a first device operated by a person, data defining a three-dimensional space at a property; accessing sensor data generated by one or more sensors physically located at the property; predicting, using the sensor data, that an object will likely interfere with the three-dimensional space at the property; and in response to predicting that the object will likely interfere with the three-dimensional space at the property, providing a notification to a second device.
G06T 19/00 - Manipulating 3D models or images for computer graphics
G06F 3/04815 - Interaction with a metaphor-based environment or interaction object displayed as three-dimensional, e.g. changing the user viewpoint with respect to the environment or object
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for monitoring extended reality spaces. One of the methods includes determining, for a first extended reality environment generated by a first device, first data defining a first three-dimensional space at a property for the first environment; determining, for a second extended reality environment generated by a second device, second data defining a second three-dimensional space at the property for the second environment; determining whether the first space at least partially overlaps with the second space; in response to determining that the first space at least partially overlaps with the second space, determining that the first space has a higher priority than the second space; and in response to determining that the first space has the higher priority than the second space, providing, to the second device, a command to adjust an experience for the second extended reality environment.
Methods, systems, and apparatus, including computer programs encoded on a storage device, for training an image classifier. A method includes receiving an image that includes a depiction of an object; generating a set of poorly localized bounding boxes; and generating a set of accurately localized bounding boxes. The method includes training, at a first learning rate and using the poorly localized bounding boxes, an object classifier to classify the object; and training, at a second learning rate that is lower than the first learning rate, and using the accurately localized bounding boxes, the object classifier to classify the object. The method includes receiving a second image that includes a depiction of an object; and providing, to the trained object classifier, the second image. The method includes receiving an indication that the object classifier classified the object in the second image; and performing one or more actions.
Methods, systems, and apparatus for camera detection of human activity with co-occurrence are disclosed. A method includes detecting a person in an image captured by a camera; in response to detecting the person in the image, determining optical flow in portions of a first set of images; determining that particular portions of the first set of images satisfy optical flow criteria; in response to determining that the particular portions of the first set of images satisfy optical flow criteria, classifying the particular portions of the first set of images as indicative of human activity; receiving a second set of images captured by the camera after the first set of images; and determining that the second set of images likely shows human activity based on analyzing portions of the second set of images that correspond to the particular portions of the first set of images classified as indicative of human activity.
G08B 13/196 - Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems using television cameras
H04N 7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
H04N 23/68 - Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
42.
FEATURE SELECTION FOR OBJECT TRACKING USING MOTION MASK, MOTION PREDICTION, OR BOTH
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for feature selection for object tracking. One of the methods includes: obtaining first feature points of an object in a first image of a scene captured by a camera; obtaining a second image of the scene captured by the camera after the first image was captured; determining whether a motion prediction of the object is available that indicates an area of the second image where the object is likely located; in response to determining that the motion prediction of the object is available, identifying, in the area of the second image where the object is likely located, second feature points that satisfy a similarity threshold for the first feature points in the first image; and detecting the object in the second image using the identified second feature points.
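The search-restriction step in the abstract above can be sketched as follows, under illustrative assumptions: when a motion prediction is available, candidate feature points in the second image are only considered inside the predicted area, then matched to the first image's points by descriptor distance. The function names, the scalar descriptors, and the rectangular area format are hypothetical stand-ins.

```python
def in_area(point, area):
    # area is an axis-aligned rectangle (x0, y0, x1, y1).
    (x, y), (x0, y0, x1, y1) = point, area
    return x0 <= x <= x1 and y0 <= y <= y1

def match_features(first_feats, second_feats, predicted_area, max_dist=1.0):
    """first_feats / second_feats: lists of ((x, y), descriptor-float).
    Returns second-image points matching a first-image point within
    max_dist, restricted to the predicted area when one is given."""
    candidates = [f for f in second_feats
                  if predicted_area is None or in_area(f[0], predicted_area)]
    matched = []
    for _, d1 in first_feats:
        for pt, d2 in candidates:
            if abs(d1 - d2) <= max_dist:   # similarity threshold
                matched.append(pt)
    return matched

first = [((10, 10), 5.0), ((40, 40), 9.0)]
second = [((12, 11), 5.2),    # the tracked object, inside the predicted area
          ((80, 80), 5.1)]    # similar descriptor, but outside the area
hits = match_features(first, second, predicted_area=(0, 0, 30, 30))
# Only the point inside the predicted area is kept.
```

Restricting the candidate set to the predicted area is what prevents the distractor at (80, 80) from being matched despite its similar descriptor.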
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for predictive video conference actions. One of the methods includes accessing, for a video conference in progress in an area of a property, data indicating activity at the property; predicting, using the data indicating activity at the property, that a video conference interruption is likely to occur; and in response to determining that a video conference interruption is likely to occur, performing one or more actions to reduce a likelihood that the video conference interruption will be presented during the video conference.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for extrinsic camera calibration using a calibration object. One of the methods includes: determining physical locations of interest points of a calibration object in a calibration object centered coordinate system; determining pixel locations of the interest points in an image of the calibration object captured by a camera; determining, using the pixel locations and the physical locations, a transformation from the calibration object centered coordinate system to a camera centered coordinate system; and determining, using the transformation, a camera tilt angle and a camera mount height of the camera for use in analyzing images captured by the camera.
G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
G06T 3/60 - Rotation of whole images or parts thereof
G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; Image or video pattern matching; Proximity measures in feature spaces using context analysis; Selection of dictionaries
G06V 10/74 - Image or video pattern matching; Proximity measures in feature spaces
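The final step of the calibration abstract above — recovering a camera tilt angle and mount height from the estimated transformation — can be sketched as follows. This assumes a world frame centered on the calibration object with the z-axis up, and a world-to-camera rotation R and translation t (as a PnP-style solver would produce; that solver step is not shown). The conventions and the example extrinsics are illustrative.

```python
import numpy as np

def tilt_and_height(R, t):
    """R, t: world->camera rotation and translation, with the world frame's
    origin on the calibration object's ground plane and z pointing up.
    Returns the camera's downward tilt angle in degrees and its mount height."""
    C = -R.T @ t                             # camera centre in world coordinates
    axis = R.T @ np.array([0.0, 0.0, 1.0])   # optical axis in world coordinates
    tilt_deg = np.degrees(np.arcsin(-axis[2]))  # angle below the horizontal
    return tilt_deg, C[2]

# Example extrinsics: a camera mounted 2.5 m up, pitched 30 degrees downward.
th = np.radians(30.0)
R = np.array([[1.0, 0.0, 0.0],
              [0.0, -np.sin(th), -np.cos(th)],
              [0.0, np.cos(th), -np.sin(th)]])
t = -R @ np.array([0.0, -4.0, 2.5])   # t = -R C for camera centre C

tilt, height = tilt_and_height(R, t)   # ~30.0 degrees, ~2.5 m
```

The key identity is that the camera center in world coordinates is C = -Rᵀt, so the mount height falls out of the z-component and the tilt out of the optical axis Rᵀ[0, 0, 1]ᵀ.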
46.
SPATIAL MOTION ATTENTION FOR INTELLIGENT VIDEO ANALYTICS
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for spatial motion attention for intelligent video analytics. One of the methods includes: obtaining an input image of a region; generating a motion image that characterizes a difference between a value of a pixel at the pixel location in the input image and a value of a pixel at the pixel location in the reference image; generating a feature map using the input image; generating, using the motion image and the feature map, a motion enhanced feature map that has, for one or more pixels that likely indicate movement, a first value that a) indicates that the corresponding pixel in the motion enhanced feature map likely indicates movement and b) is different from a second value for a corresponding pixel in the feature map; and analyzing the motion enhanced feature map.
G06T 7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
G06T 3/40 - Scaling of whole images or parts thereof, e.g. expanding or contracting
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
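The motion-enhanced feature map described in the abstract above can be sketched as follows: the motion image is the per-pixel difference from a reference frame, and feature-map values at moving pixels are given a different (here, boosted) value so downstream analysis attends to them. The boost rule (scale by 1 + alpha × mask) and the thresholds are illustrative assumptions, not the claimed method.

```python
import numpy as np

def motion_enhanced(input_img, reference_img, feature_map,
                    motion_thresh=0.1, alpha=0.5):
    # Motion image: per-pixel difference between input and reference.
    motion_image = np.abs(input_img - reference_img)
    # Binary mask of pixels that likely indicate movement.
    mask = (motion_image > motion_thresh).astype(float)
    # Moving pixels get a value different from the plain feature map.
    return feature_map * (1.0 + alpha * mask)

inp = np.array([[0.2, 0.9], [0.2, 0.2]])
ref = np.array([[0.2, 0.2], [0.2, 0.2]])   # only pixel (0, 1) changed
feats = np.ones((2, 2))                    # stand-in feature map
enhanced = motion_enhanced(inp, ref, feats)
# The moving pixel's feature value differs from the static pixels'.
```

A real system would compute `feats` with a learned feature extractor; a constant map is used here only to make the enhancement visible.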
47.
SPATIAL MOTION ATTENTION FOR INTELLIGENT VIDEO ANALYTICS
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for spatial motion attention for intelligent video analytics. One of the methods includes: obtaining an input image of a region; generating a motion image that characterizes a difference between a value of a pixel at the pixel location in the input image and a value of a pixel at the pixel location in the reference image; generating a feature map using the input image; generating, using the motion image and the feature map, a motion enhanced feature map that has, for one or more pixels that likely indicate movement, a first value that a) indicates that the corresponding pixel in the motion enhanced feature map likely indicates movement and b) is different from a second value for a corresponding pixel in the feature map; and analyzing the motion enhanced feature map.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for motorized structure integration with augmented and virtual reality. One of the methods includes determining one or more physical objects for use with an extended reality environment; generating, for a physical object from the one or more physical objects, data for a representation of the physical object for use in the extended reality environment; providing, to a user device, at least some of the data for the representation of the physical object to cause the user device to present at least a portion of the representation in the extended reality environment; determining to change a presentation of the representation in the extended reality environment; and in response to determining to change the presentation of the representation, controlling a physical position of the physical object using the change to the presentation of the representation in the extended reality environment.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for object classification. One of the methods includes: obtaining a first set of images of objects that have a likelihood of being at a property that satisfies a likelihood threshold; generating, for each object, a binary classifier from a set of images of the respective object; determining, using at least one of the binary classifiers, that an image of an unknown object was classified as an object from the objects; in response to determining, using the binary classifiers, that the image of the unknown object was classified as an object from the objects, selecting a second set of images of unknown objects that does not include the image; and generating a multiclass classifier for use in classifying objects using i) the first set as respective classes and ii) the second set that does not include the image.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/771 - Feature selection, e.g. selecting representative features from a multi-dimensional feature space
Methods, systems, and apparatus for motion-based human video detection are disclosed. A method includes generating a representation of a difference between two frames of a video; providing, to an object detector, a particular frame of the two frames and the representation of the difference between two frames of the video; receiving an indication that the object detector detected an object in the particular frame; determining that detection of the object in the particular frame was a false positive detection; determining an amount of motion energy where the object was detected in the particular frame; and training the object detector based on penalization of the false positive detection in accordance with the amount of motion energy where the object was detected in the particular frame.
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/778 - Active pattern-learning, e.g. online learning of image or video features
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/98 - Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
G06V 40/20 - Movements or behaviour, e.g. gesture recognition
G06N 5/046 - Forward inferencing; Production systems
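The penalization rule described in the motion-based detection abstract above can be sketched as follows, under an illustrative assumption: a false-positive detection in a region with little motion energy is penalized more heavily than one in a high-motion region, where a mistaken detection is more understandable. The inverse weighting and the box format are hypothetical choices, not the claimed training procedure.

```python
import numpy as np

def fp_penalty(frame_a, frame_b, box, base_loss=1.0):
    """Motion energy = mean absolute frame difference inside the detected
    box (x0, y0, x1, y1); the penalty shrinks as motion energy grows."""
    x0, y0, x1, y1 = box
    diff = np.abs(frame_a - frame_b)[y0:y1, x0:x1]
    energy = float(diff.mean())
    return base_loss / (1.0 + energy)

a = np.zeros((4, 4))
b = np.zeros((4, 4))
b[0:2, 0:2] = 1.0                      # motion in the top-left corner

busy = fp_penalty(a, b, (0, 0, 2, 2))  # high motion -> smaller penalty
still = fp_penalty(a, b, (2, 2, 4, 4)) # no motion -> full penalty
```

The resulting scalar would scale the loss contribution of the false-positive example during training of the object detector.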
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for finding lost objects. In some implementations, a video frame is displayed. An input drawing a shape around an area of the video frame is received. A second video frame is displayed. An indication of the shape in the second video frame is displayed. An input to adjust the shape such that the shape is drawn around a second area is received.
G06V 20/40 - Scenes; Scene-specific elements in video content
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06V 10/22 - Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
53.
Vehicular access control based on virtual inductive loop
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for monitoring events using a Virtual Inductive Loop system. In some implementations, image data is obtained from cameras. A region depicted in the obtained image data is identified, the region comprising lines spaced by a distance that satisfies a distance threshold. For each line included in the region, it is determined whether an object depicted crossing the line satisfies a height criteria indicating that the line is activated. In response to determining that an object depicted crossing a line satisfies the height criteria, an event is determined to have likely occurred using data indicating (i) which of the lines were activated and (ii) an order in which each of the lines was activated. In response to determining that an event likely occurred, actions are performed using at least some of the data.
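The activation-order logic in the virtual-inductive-loop abstract above can be sketched as follows. The two-line entering/exiting rule, the line numbering, and the event labels are illustrative assumptions for a minimal case; a real system could use many lines and richer event types.

```python
def classify_event(activations):
    """activations: list of (line_id, timestamp) for lines that an object
    crossed while satisfying the height criteria, with line 0 nearest the
    street and line 1 nearest the gate (hypothetical layout)."""
    ordered = [line for line, _ in sorted(activations, key=lambda a: a[1])]
    if ordered == [0, 1]:
        return "vehicle entering"
    if ordered == [1, 0]:
        return "vehicle exiting"
    return "no event"

# Street-side line activated first, then the gate-side line:
event = classify_event([(0, 10.0), (1, 10.4)])
```

The order of activation, not just the set of activated lines, is what distinguishes an entering vehicle from an exiting one.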
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for fast user enrollment for facial recognition using face clustering. One of the methods includes identifying, from a set of face images of faces, clusters of face images, where the clusters of face images include a particular cluster; receiving, from a device, an indication that the particular cluster includes a first subcluster of face images that depict a first person and a second subcluster of face images that depict a second person; in response to receiving the indication, determining that a number of face images in the first subcluster of face images that depict the first person does not satisfy an enrollment criteria; identifying another cluster of face images that depict the first person; and enrolling, in a facial recognition database, the first person using the other cluster of face images.
G06V 40/50 - Maintenance of biometric data or enrolment thereof
G06V 10/762 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
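The enrollment check in the face-clustering abstract above can be sketched as follows: after a reviewer splits a mixed cluster into per-person subclusters, a person whose subcluster is too small for enrollment is topped up from another cluster of the same person. The threshold, counts, and string stand-ins for face images are illustrative assumptions.

```python
MIN_ENROLL_IMAGES = 10  # hypothetical enrollment criteria

def images_for_enrollment(subcluster, other_clusters):
    """subcluster / other_clusters: lists of face images (hashable
    stand-ins here). Returns enough images to enroll, or None."""
    pool = list(subcluster)
    for cluster in other_clusters:
        if len(pool) >= MIN_ENROLL_IMAGES:
            break
        pool.extend(cluster)   # same person identified in another cluster
    return pool if len(pool) >= MIN_ENROLL_IMAGES else None

first_person = [f"img{i}" for i in range(4)]        # too few on its own
extra = [[f"extra{i}" for i in range(8)]]           # same person, elsewhere
enrolled = images_for_enrollment(first_person, extra)
```

When `enrolled` is non-empty, those images would be used to enroll the person in the facial recognition database.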
Methods, systems, and apparatus, including computer programs encoded on a storage device, for tracking human movement in video images. A method includes obtaining a first image of a scene captured by a camera; identifying a bounding box around a human detected in the first image; determining a scale amount that corresponds to a size of the bounding box; obtaining a second image of the scene captured by the camera after the first image was captured; and detecting the human in the second image based on both the first image scaled by the scale amount and the second image scaled by the scale amount. Detecting the human in the second image can include identifying a second scaled bounding box around the human detected in the second image scaled by the scale amount.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for managing virtual surveillance windows for video surveillance. The methods, systems, and apparatus include actions of obtaining an original video, generating a downscaled video from the original video, detecting a first event at a location from the downscaled video using a first classifier, generating a windowed video from the original video based on the location, detecting a second event from the windowed video, and performing an action in response to detecting the second event.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for progressive deep metric learning. One of the methods includes maintaining training data for training a machine learning model that will include a plurality of blocks after training. A number of training stages is determined using the plurality of blocks in the machine learning model. The machine learning model is trained using the training data in a plurality of stages, including, for each stage: adding, from the plurality of blocks, a new block for a current stage to the machine learning model; and training the machine learning model using the training data. The trained machine learning model that includes the plurality of trained blocks is outputted.
A method for classifying activity based on multi-sensor input includes receiving, from two or more sensors, sensor data indicating activity within a building, determining, for each of the two or more sensors and based on the received sensor data, (i) an extracted feature vector for activity within the building and (ii) location data, labelling each of the extracted feature vectors with the location data, generating, using the extracted feature vectors, an integrated feature vector, detecting a particular activity based on the integrated feature vector, and in response to detecting the particular activity, performing a monitoring action.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for providing targeted notifications. One of the methods includes determining a likely identity of a visitor who arrived at a property that is associated with two or more people, selecting, from the two or more people, a target person using the likely identity of the visitor, and sending, to a device of the target person, a notification that indicates that the visitor arrived at the property.
H04M 11/02 - Telephonic communication systems specially adapted for combination with other electrical systems with bell or annunciator systems
G08B 3/10 - Audible signalling systems; Audible personal calling systems using electric transmission; Audible signalling systems; Audible personal calling systems using electromagnetic transmission
G08B 25/00 - Alarm systems in which the location of the alarm condition is signalled to a central station, e.g. fire or police telegraphic systems
H04M 11/04 - Telephonic communication systems specially adapted for combination with other electrical systems with alarm systems, e.g. fire, police or burglar alarm systems
60.
Adjusting areas of interest for motion detection in camera scenes
Disclosed are methods, systems, and apparatus for adjusting areas of interest for motion detection in camera scenes. A method includes obtaining a map of false motion event detections using a first area of interest; identifying an overlap area between the map of false detections and the first area of interest; determining a second area of interest that includes portions of the first area of interest and excludes at least a part of the overlap area; obtaining a map of true motion event detections using the first area of interest; determining whether true detections using the second area of interest, compared to true detections using the first area of interest, satisfy performance criteria; and in response to determining that true detections using the second area of interest, compared to true detections using the first area of interest, satisfy the performance criteria, providing the second area of interest for use in detecting events.
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
G08B 13/196 - Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems using television cameras
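The area-of-interest adjustment above can be sketched on a pixel grid as follows: the second area of interest is the first minus the cells that overlap the false-detection map, and it is kept only if it retains enough of the true detections. The 90% retention criterion and the set-of-cells representation are illustrative assumptions standing in for the claimed performance criteria.

```python
def adjust_aoi(first_aoi, false_map, true_map, min_retention=0.9):
    """All arguments are sets of (x, y) grid cells."""
    # Exclude the overlap between the false-detection map and the first AOI.
    second_aoi = first_aoi - (first_aoi & false_map)
    # Compare true detections retained by the second AOI vs the first.
    true_first = len(true_map & first_aoi)
    true_second = len(true_map & second_aoi)
    if true_first and true_second / true_first >= min_retention:
        return second_aoi          # performance criteria satisfied
    return first_aoi               # otherwise keep the original area

first = {(x, y) for x in range(4) for y in range(4)}
false_hits = {(0, 0), (0, 1)}          # e.g. a waving tree branch
true_hits = {(2, 2), (3, 3)}           # real motion events
second = adjust_aoi(first, false_hits, true_hits)
```

Because the excluded cells contain no true detections, retention is 100% and the trimmed area is adopted; had the trim cut into true detections, the first area would be kept.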
61.
ADJUSTING AREAS OF INTEREST FOR MOTION DETECTION IN CAMERA SCENES
Disclosed are methods, systems, and apparatus for adjusting areas of interest for motion detection in camera scenes. A method includes obtaining a map of false motion event detections using a first area of interest; identifying an overlap area between the map of false detections and the first area of interest; determining a second area of interest that includes portions of the first area of interest and excludes at least a part of the overlap area; obtaining a map of true motion event detections using the first area of interest; determining whether true detections using the second area of interest, compared to true detections using the first area of interest, satisfy performance criteria; and in response to determining that true detections using the second area of interest, compared to true detections using the first area of interest, satisfy the performance criteria, providing the second area of interest for use in detecting events.
Methods and systems, including computer-readable media, are described for monitoring presence or absence of an object at a property using local region matching. A system generates images while monitoring an area of the property and, based on the images, detects an object in a region of interest in the area. For each of the images: the system iteratively computes interest points for the object using photometric augmentation applied to the image before each iteration of computing the interest points. A digital representation of the region and the object is generated based on interest points that repeat across the images after each application of the photometric augmentation. Based on the digital representation, a set of anchor points are determined from the interest points that repeat across images. Using the set of anchor points, the system detects an absence or a continued presence of the object in the area of the property.
G06V 10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
G06V 10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 20/20 - Scenes; Scene-specific elements in augmented reality scenes
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for retroactive event detection. The methods, systems, and apparatus include actions of obtaining an image captured by a camera at a current time, determining that the image depicts a change in a region from a previous image captured by the camera at a previous time, determining, based on determining that the image depicts the change in the region, whether the change depicted in the image is of a known object type, determining, based on the determination that the change depicted in the image is of a known object type, whether the change does not correspond to a previously detected event, and determining, based on the determination that the change does not correspond to a previously detected event, whether the images captured by the camera between the current time and the previous time depict an event.
Systems and techniques are described for using thermal imaging to configure and/or augment temperature regulation operations within a property. In some implementations, a computing device obtains a thermal image of a region of a property. The thermal image identifies at least a surface within the region. A surface temperature of the surface is determined. An ambient temperature for the region is determined based at least on the surface temperature. One or more temperature controls for the region are then adjusted based at least on the ambient temperature.
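A minimal sketch (not from the patent text) of the thermal-regulation flow described above; the weighted-average formula and the proportional gain are illustrative assumptions, since the abstract does not fix a particular estimation or control rule:

```python
def estimate_ambient_temperature(surface_temps, weights=None):
    """Estimate the region's ambient temperature from surface temperatures
    read off a thermal image, as a weighted average (weights, e.g. by
    surface area, are a hypothetical choice)."""
    if weights is None:
        weights = [1.0] * len(surface_temps)
    return sum(t * w for t, w in zip(surface_temps, weights)) / sum(weights)

def adjust_setpoint(current_setpoint, ambient, target, gain=0.5):
    """Nudge a temperature control toward the target given the ambient estimate."""
    return current_setpoint + gain * (target - ambient)
```

For instance, two surfaces at 20.0 and 22.0 degrees yield an ambient estimate of 21.0, and a control at 21.0 targeting 23.0 would be raised to 22.0.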
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reducing a likelihood that a doorbell might be used. One of the methods includes determining that a visitor arrived at a premises; determining that the visitor used a computing device at a first time; determining that a user device of a person of the premises received a message at a second time; determining that the first time and the second time both satisfy a timing criterion; and in response to determining that the first time and the second time both satisfy the timing criterion, sending, to the user device, a notification that indicates that the visitor arrived at the premises.
H04N 7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
G08B 3/10 - Audible signalling systems; Audible personal calling systems using electric transmission; Audible signalling systems; Audible personal calling systems using electromagnetic transmission
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
G10L 15/18 - Speech classification or search using natural language modelling
G08B 5/14 - Visible signalling systems, e.g. personal calling systems, remote indication of seats occupied using hydraulic transmission; Visible signalling systems, e.g. personal calling systems, remote indication of seats occupied using pneumatic transmission with indicator element moving about a pivot, e.g. hinged flap or rotating vane
Systems and methods for detecting and monitoring areas of water damage or water management problems in a property are described. Monitoring devices can be deployed at different locations of a property to obtain sensor data and image data regarding the environmental conditions in potentially problematic areas of the property. An image obtained by the monitoring devices can be processed and compared with reference images or other image data that is filtered in a different manner than the image to identify portions of the image in which water damage or water management problems exist.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an event detector. The methods, systems, and apparatus include actions of identifying a portion of a first interframe difference image that represents motion of an OI, determining that a second interframe difference image represents motion by a non-OI, combining the portion of the first interframe difference image and the second interframe difference image as a third interframe difference image labeled as motion of both an OI and a non-OI, and training an event detector with the third interframe difference image.
Disclosed are methods, systems, and apparatus for object localization in video. A method includes obtaining a reference image of an object; generating, from the reference image, homographic adapted images showing the object at various locations with various orientations; determining interest points from the homographic adapted images; determining locations of an object center in the homographic adapted images relative to the interest points; obtaining a sample image of the object; identifying matched pairs of interest points, each matched pair including an interest point from the homographic adapted images and a matching interest point in the sample image; and determining a location of the object in the sample image based on the locations of the object center in the homographic adapted images relative to the matched pairs. The method includes generating a homography matrix; and projecting the reference image of the object to the sample image using the homography matrix.
G06V 10/24 - Aligning, centring, orientation detection or correction of the image
G06T 7/55 - Depth or shape recovery from multiple images
G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
G06V 10/22 - Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
G06V 10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
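As an illustration of the object-localization entry above (not part of the patent text), a minimal Python sketch of two of its steps: projecting a point through a homography matrix, and estimating the object center by letting each matched interest point vote with its learned center offset. The averaging of votes is an assumed aggregation rule:

```python
def project_point(H, x, y):
    """Project (x, y) through a 3x3 homography matrix H (row-major
    nested lists), as used to map the reference image onto the sample image."""
    xs = H[0][0] * x + H[0][1] * y + H[0][2]
    ys = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return xs / w, ys / w

def vote_object_center(sample_points, center_offsets):
    """Estimate the object center in the sample image: each matched interest
    point casts a vote at its sample-image location plus the center offset
    learned from the homographic adapted images; votes are averaged."""
    votes = [(px + dx, py + dy)
             for (px, py), (dx, dy) in zip(sample_points, center_offsets)]
    n = len(votes)
    return (sum(v[0] for v in votes) / n, sum(v[1] for v in votes) / n)
```

With an identity homography a point maps to itself, and two interest points on opposite sides of the object vote for the same center.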
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, are described for implementing intelligent seating for wellness monitoring. A system obtains first data from a first sensor integrated in an intelligent seating apparatus at a property. The first data indicates a potential abnormal condition of a person at the property. The system determines that the person has an abnormal condition based on the first data corresponding to the person having used the seating apparatus. Based on the abnormal condition, the system provides an indication to a client device of the person to prompt the person to adjust their use of the seating apparatus. The system also obtains visual indications of the abnormal condition, determines the type of abnormal condition afflicting the person, and determines a wellness command with instructions for alleviating the abnormal condition. The wellness command is provided for display on the client device.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for implementing video controlled adjustment of light conditions at a property. A monitoring system obtains video data from a video recording device at the property and uses the video data to determine a location of a person at the property as well as a location of an electronic window treatment at the property. Using the video data, the system identifies an attribute of light entering a window at the property. The system determines an adjustment setting for the electronic window treatment at the property based on the attribute of the light and the location of the person at the property. Based on the adjustment setting, a command is provided to perform a first adjustment action that adjusts the position of the electronic window treatment located at the window.
G05B 19/4155 - Numerical control [NC], i.e. automatically operating machines, in particular machine tools, e.g. in a manufacturing environment, so as to execute positioning, movement or co-ordinated operations by means of programme data in numerical form characterised by programme execution, i.e. part programme or machine function execution, e.g. selection of a programme
G06T 7/70 - Determining position or orientation of objects or cameras
71.
MULTI-HEAD DEEP METRIC MACHINE-LEARNING ARCHITECTURE
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for implementing a multi-head deep metric machine-learning architecture. The architecture is used to perform techniques that include obtaining multiple features that are derived from data values of an input dataset and identifying, for an input image of the input dataset, global features and local features among the features. The techniques also include determining a first set of vectors from the global features and a second set of vectors from the local features; and computing, from the first and second sets of vectors, a concatenated feature set based on a proxy-based loss function and a pairwise-based loss function. A feature representation that integrates the global features and the local features is generated based on the concatenated feature set. A machine-learning model is generated and configured to output a prediction about an image based on inferences derived using the feature representation.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving multiple images from a camera, each image of the multiple images representative of a detection of an object within the image. For each image of the multiple images, the methods include: determining a set of detected objects within the image, each object defined by a respective bounding box, and determining, from the set of detected objects within the image and ground truth labels, a false detection of a first object. The methods further include determining that a target object threshold is met based on a number of false detections of the first object in the multiple images, generating, based on the number of false detections for the first object meeting the target object threshold, an adversarial mask for the first object, and providing, to the camera, the adversarial mask.
G06V 10/98 - Detection or correction of errors, e.g. by rescanning the pattern or by human interventionEvaluation of the quality of the acquired patterns
G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
G06V 10/774 - Generating sets of training patternsBootstrap methods, e.g. bagging or boosting
G06V 20/20 - Scenes; Scene-specific elements in augmented reality scenes
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
73.
REDUCING FALSE DETECTIONS FOR NIGHT VISION CAMERAS
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reducing camera false detections. One of the methods includes providing, to a neural network of an image classifier that is trained to detect objects of two or more classification types, a feature vector for a respective training image; receiving, from the neural network, an output vector that indicates, for each of the two or more classification types, a likelihood that the respective training image depicts an object of the corresponding classification type; accessing, from two or more ground truth vectors each for one of the two or more classification types, a ground truth vector for the classification type of an object depicted in the training image; adjusting one or more weights in the neural network using the output vector and the ground truth vector; and storing, in a memory, the image classifier.
Methods and systems, including computer programs encoded on a storage medium, are described for implementing item monitoring using a doorbell camera. A system generates an input video stream that has image frames corresponding to detection of activity at a property. Timing information is generated for the video stream and includes a timestamp for each image frame of the stream. Using the timing information, the system processes a pre-event image frame that precedes detection of the activity and a post-event image frame that coincides with detection of the activity. An image score is computed with respect to placement of a candidate item at the property in response to processing the pre-event and post-event image frames. The image score is used to determine that a first item was delivered to the property or that a second item was removed after being delivered to the property.
G08B 13/196 - Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems using television cameras
Methods and systems, including computer programs encoded on a storage medium, are described for implementing item monitoring using a doorbell camera. A system generates an input video stream that has image frames corresponding to detection of activity at a property. Timing information is generated for the video stream and includes a timestamp for each image frame of the stream. Using the timing information, the system processes a pre-event image frame that precedes detection of the activity and a post-event image frame that coincides with detection of the activity. An image score is computed with respect to placement of a candidate item at the property in response to processing the pre-event and post-event image frames. The image score is used to determine that a first item was delivered to the property or that a second item was removed after being delivered to the property.
G06V 20/40 - Scenes; Scene-specific elements in video content
G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
G06F 18/22 - Matching criteria, e.g. proximity measures
G06V 10/22 - Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
G06V 10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
H04N 7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, are described for implementing intelligent seating for wellness monitoring. A system obtains first data from a first sensor integrated in an intelligent seating apparatus at a property. The first data indicates a potential abnormal condition of a person at the property. The system determines that the person has an abnormal condition based on the first data corresponding to the person having used the seating apparatus. Based on the abnormal condition, the system provides an indication to a client device of the person to prompt the person to adjust their use of the seating apparatus. The system also obtains visual indications of the abnormal condition, determines the type of abnormal condition afflicting the person, and determines a wellness command with instructions for alleviating the abnormal condition. The wellness command is provided for display on the client device.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for finding lost objects. In some implementations, an unassociated object that appeared in a video from a property is identified. Known entities for the property are identified. A number of times that the known entities appeared in videos from the property is obtained. An order to display the known entities for the property based on the number of times that the known entities appeared in videos from the property is determined. An indication that the unassociated object is associated with a particular entity of the known entities is received. An image of the unassociated object from the video in association with the particular entity of the known entities is stored.
A computer-implemented method includes obtaining an image of an area of a property from an augmented reality device, identifying the area of the property based on the image obtained from the augmented reality device, determining that the area of the property corresponds to an event at the property or a configuration of a monitoring system of the property, and providing, in response to determining that the area of the property corresponds to the event or the configuration, information that represents the event or the configuration and that is configured to be displayed on the augmented reality device.
Methods and systems, including computer programs encoded on a computer storage medium, for obtaining imaging data of a parking lot that includes a set of parking spots, detecting a vehicle entering the parking lot, generating a vehicle recognition model for the vehicle, determining that the vehicle is parked in a parking spot, detecting a customer exiting the vehicle, generating a customer recognition model for the customer, determining that the customer has entered a business of one or more businesses affiliated with the parking lot, and providing information related to parking space usage for the parking spot and the business to a user device.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for monitoring of pet feeding. The methods, systems, and apparatus include actions of: obtaining a first reference image of a container corresponding to a full state of the container; obtaining a second reference image of the container corresponding to an empty state of the container; obtaining a sample image of the container; based on the first and second reference images, determining an amount of content in the container from the sample image; and based on the amount of the content being less than a reference amount, notifying a user that the amount of the content is getting low.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
G08B 25/10 - Alarm systems in which the location of the alarm condition is signalled to a central station, e.g. fire or police telegraphic systems characterised by the transmission medium using wireless transmission systems
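As an illustration of the pet-feeding entry above (not part of the patent text), a minimal Python sketch that interpolates a fill fraction between empty- and full-state reference scores; the scalar "scores" and the 25% reference amount stand in for whatever image comparison the system actually performs:

```python
def estimate_fill_fraction(sample_score, empty_score, full_score):
    """Estimate how full the container is by linear interpolation between
    the empty- and full-state reference image scores (the scores are
    stand-ins for an unspecified image statistic)."""
    span = full_score - empty_score
    if span == 0:
        return 0.0
    frac = (sample_score - empty_score) / span
    return max(0.0, min(1.0, frac))   # clamp to [0, 1]

def should_notify(sample_score, empty_score, full_score, reference_amount=0.25):
    """Notify the user when the estimated content falls below the reference amount."""
    return estimate_fill_fraction(sample_score, empty_score, full_score) < reference_amount
```

A sample score halfway between the references yields a 50% estimate and no notification; a score near the empty reference triggers the "getting low" notification.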
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for tracking moving objects depicted in multiple images. One of the methods includes determining, for an image captured by a camera, a first bounding box that represents a first moving object depicted in the image, determining that the first bounding box and a second bounding box overlap in an overlap area, determining that the first moving object represented by the first bounding box was farther from the camera that captured the image than a second moving object represented by the second bounding box, generating a mask for the first bounding box based on the overlap area, and determining, using data from the image that is associated with the mask, that the first moving object matches an appearance of another moving object depicted in another image captured by the camera.
G06V 10/56 - Extraction of image or video features relating to colour
G06V 10/74 - Image or video pattern matchingProximity measures in feature spaces
G06V 10/762 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
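A minimal sketch (not from the patent text) of the occlusion-masking idea in the tracking entry above: compute the overlap of two bounding boxes, then zero out the overlap pixels in the farther object's mask so appearance matching ignores the nearer object:

```python
def overlap_box(a, b):
    """Intersection of two boxes given as (x1, y1, x2, y2), or None if disjoint."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    if x2 <= x1 or y2 <= y1:
        return None
    return (x1, y1, x2, y2)

def mask_for_farther_box(box, overlap):
    """Per-pixel mask over `box` (1 = usable) with the overlap area zeroed,
    since those pixels likely belong to the nearer moving object."""
    x1, y1, x2, y2 = box
    mask = [[1] * (x2 - x1) for _ in range(y2 - y1)]
    if overlap:
        ox1, oy1, ox2, oy2 = overlap
        for y in range(max(oy1, y1), min(oy2, y2)):
            for x in range(max(ox1, x1), min(ox2, x2)):
                mask[y - y1][x - x1] = 0
    return mask
```

For a 4x4 box overlapped by a 2x2 corner, 12 of the 16 mask entries remain usable for appearance matching.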
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using feature descriptors to track objects depicted in images. One of the methods includes receiving hue, saturation, value data for an image and data that indicates an object detected in the image, generating a feature descriptor that includes hue data and saturation data, determining, for each of two or more tracked objects that each have a historical feature descriptor that includes historical hue data and historical saturation data, a distance between (i) the respective historical feature descriptor and (ii) the feature descriptor, associating the feature descriptor for the object with a tracked object from the two or more tracked objects, and tracking the tracked object in one or more images from a video sequence using the feature descriptor and the historical feature descriptor.
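As an illustration of the entry above (not part of the patent text), a minimal Python sketch of a hue/saturation feature descriptor and the distance used to associate a detection with a tracked object; the histogram bin count and L1 distance are assumed choices:

```python
def hs_descriptor(pixels, bins=8):
    """Build a descriptor from hue and saturation histograms of an object's
    pixels, given as (h, s, v) tuples with components in [0, 1); value is
    dropped, which gives some robustness to lighting changes."""
    hue = [0.0] * bins
    sat = [0.0] * bins
    for h, s, _v in pixels:
        hue[min(int(h * bins), bins - 1)] += 1
        sat[min(int(s * bins), bins - 1)] += 1
    n = float(len(pixels))
    return [c / n for c in hue] + [c / n for c in sat]

def descriptor_distance(d1, d2):
    """L1 distance between descriptors; the tracked object whose historical
    descriptor is nearest is associated with the new detection."""
    return sum(abs(a - b) for a, b in zip(d1, d2))
```

A detection is then matched to whichever tracked object minimizes `descriptor_distance` against its historical descriptor.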
Methods and apparatus are disclosed for enhancing an urban surface model with image data obtained from a satellite image. Three-dimensional models of an urban cityscape obtained from digital surface models may comprise surface location information but lack image information associated with the cityscape, such as the color and texture of building facades. The location of the satellite at the time of recording the satellite image of interest may be obtained from metadata associated with the satellite image. A 3D model of a cityscape corresponding to the satellite image may be subjected to a transformation operation to determine portions of the 3D model that are viewable from a location corresponding to the location of the satellite when the image was taken. Visible facades of buildings in the 3D model may be identified and mapped to portions of the satellite image, which may then be used in rendering 2D images from the 3D model. In some examples, a satellite image projection model may be adjusted to more accurately determine geolocations of portions of the satellite image by analysis of a plurality of satellite images.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cooperative video surveillance. The methods, systems, and apparatus include actions of determining that a vehicle has arrived at a particular parking spot, determining a view of a property for an onboard camera for the vehicle, providing a detection rule to the vehicle based on the view of the property for the onboard camera for the vehicle, and receiving an image captured by the onboard camera for the vehicle based on satisfaction of the detection rule.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting people for a training set. The methods, systems, and apparatus include actions of: obtaining an image of a user, obtaining an initial training set that includes the image of the user as a positive example and images of a subset of a set of other people as negative examples, training an initial classifier with the images of the initial training set, determining false positive classifications by the initial classifier, selecting people in the set of other people based on the false positive classifications, obtaining an updated training set that includes an image of the user, images of the subset of the set of other people, and images of the people that are selected, and generating an updated classifier with the updated training set.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for canine assisted home monitoring. The methods, systems, and apparatus include actions of: obtaining a reference signal from an animal at a property; determining whether an event occurred at the property corresponding to when the reference signal from the animal was received; in response to a determination that the event occurred at the property corresponding to when the reference signal from the animal was received, determining that the reference signal indicates that the event is likely occurring at the property; obtaining a sample signal from the animal at the property; determining whether the sample signal corresponds to the reference signal; and notifying a user that the event, which was determined to be indicated by the reference signal, is likely occurring again at the property.
Methods, systems, and apparatus for removing precipitation from video are disclosed. A method includes generating, from a first set of images of a scene from a camera, a segmented background image model of the scene; obtaining a second set of images from the camera; identifying, in an image of the second set of images, a plurality of edges, determining that a first edge of the plurality of edges satisfies criteria for representing precipitation based at least in part on determining that the first edge (i) does not correspond to the background image model of the scene and (ii) extends into two or more contiguous segments of the scene; in response, classifying each of the contiguous segments as a precipitation segment; generating pixel data for each of the precipitation segments; and applying the pixel data to each precipitation segment in the image.
G06F 18/2413 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
G06V 10/50 - Extraction of image or video features by performing operations within image blocks; Extraction of image or video features by using histograms, e.g. histogram of oriented gradients [HoG]; Extraction of image or video features by summing image-intensity values; Projection analysis
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
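As an illustration of the precipitation-removal entry above (not part of the patent text), a minimal Python sketch of the edge criterion: an edge counts as precipitation only if it does not match the background model and extends into two or more contiguous segments of the scene:

```python
def is_precipitation_edge(edge_pixels, background_edges, segment_of):
    """Return (is_precipitation, segments). `edge_pixels` is a list of pixel
    coordinates on the edge, `background_edges` is the set of edge pixels in
    the background model, and `segment_of` maps a pixel to its segment id
    (the segmentation scheme itself is an assumption here)."""
    # Criterion (i): the edge must not correspond to the background model.
    if any(p in background_edges for p in edge_pixels):
        return False, set()
    # Criterion (ii): the edge must extend into two or more contiguous segments.
    segments = {segment_of(p) for p in edge_pixels}
    return len(segments) >= 2, segments
```

When the criterion holds, each returned segment would be classified as a precipitation segment and replaced with generated pixel data.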
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an event detector. The methods, systems, and apparatus include actions of obtaining frames of a video, determining whether an object of interest is detected within the frames, determining whether motion is detected within the frames, determining whether the frames correspond to motion by an object of interest, generating a training set that includes labeled inter-frame differences based on whether the frames correspond to motion by an object of interest, and training an event detector using the training set.
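A minimal sketch (not from the patent text) of the labeling step in the event-detector entry above, with frames as 2D intensity lists and an assumed per-pixel motion threshold:

```python
def label_interframe_difference(prev_frame, frame, object_detected,
                                motion_threshold=10):
    """Compute an inter-frame difference and label it for the training set:
    motion by an object of interest, motion by something else, or no motion
    (the threshold is an illustrative assumption)."""
    diff = [[abs(a - b) for a, b in zip(r1, r2)]
            for r1, r2 in zip(prev_frame, frame)]
    motion = any(v > motion_threshold for row in diff for v in row)
    if motion and object_detected:
        label = "object_of_interest_motion"
    elif motion:
        label = "non_object_motion"
    else:
        label = "no_motion"
    return diff, label
```

The labeled differences then form the training set used to train the event detector.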
Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for emphasizing a portion of audio data. In some implementations, a method may include determining that a first person is wearing a hearing aid, determining, from images captured by a camera, that a second person is speaking to the first person, determining an audio stream for an environment in which the first person is located, determining whether more than one sound stream is encoded in the audio stream, based on determining that more than one sound stream is encoded in audio data, identifying a portion of captured sounds that corresponds to the second person speaking to the first person, and providing, to the hearing aid, audio data that increases a volume of the portion of captured sounds relative to other portions of the captured sounds.
G06T 7/70 - Determining position or orientation of objects or cameras
G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination
G10L 17/06 - Decision making techniques; Pattern matching strategies
G10L 25/78 - Detection of presence or absence of voice signals
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for false detection removal using adversarial masks. The method includes performing object detection on a first image that includes a first region using a detection model; determining that the detection model incorrectly classified the first region of the first image; generating an adversarial mask based on the first region of the first image and the detection model; obtaining a second image that includes the first region; generating a masked image based on the second image and the adversarial mask; and performing object detection on the masked image including the first region using the detection model.
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
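As an illustration of the adversarial-mask entry above (not part of the patent text), a minimal Python sketch of applying a mask to one region of an image; the additive perturbation with clipping is an assumed form of "generating a masked image", since the abstract does not specify one:

```python
def apply_adversarial_mask(image, mask, region):
    """Add a perturbation to one region of the image (2D lists of pixel
    intensities) so the detector stops producing the false detection there;
    results are clipped to the valid 8-bit pixel range."""
    x1, y1, x2, y2 = region
    out = [row[:] for row in image]          # leave the input image untouched
    for y in range(y1, y2):
        for x in range(x1, x2):
            out[y][x] = max(0, min(255, image[y][x] + mask[y - y1][x - x1]))
    return out
```

Object detection would then be rerun on the masked image to confirm the false detection in the first region is suppressed.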
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating distributed jobs for cameras in a monitored property. The methods, systems, and apparatus include actions of obtaining a request to process a video based on an event detected by a first camera at a monitored property, determining resources likely to be available corresponding to the other cameras at the monitored property, allocating one or more tasks corresponding to processing the video to the other cameras based on the resources likely to be available corresponding to the other cameras, and providing the one or more allocated tasks to the first camera and to the other cameras.
H04N 7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
G08B 13/196 - Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems using television cameras
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, are described for implementing a smart security fastener. The smart security fastener includes a body configured to be installed at a property; and a head that is supported by the body. The head has circuitry that includes: a micro-processor that generates control signals; and a radio device that is coupled to the micro-processor. The radio device is operable to: i) transmit data to a property monitoring system based on the control signals, where the data indicates an installation status of the smart security fastener; and ii) receive a command from the property monitoring system that indicates authorization to uninstall the smart security fastener. The circuitry also includes a power source that powers each of the micro-processor and the radio device.
Methods and systems, including computer programs encoded on a computer storage medium, for training a detection model for surveillance devices using semi-supervised learning. In one aspect, the methods include receiving imaging data collected by a camera of a scene within a field of view of the camera. Annotated training data is generated from the imaging data and one or more detection models are trained using the annotated training data. Based on a set of performance parameters, an optimized detection model is selected of the one or more detection models, and the optimized detection model is provided to the camera.
Methods, systems, and apparatus for camera detection of human activity with co-occurrence are disclosed. A method includes detecting a person in an image captured by a camera; in response to detecting the person in the image, determining optical flow in portions of a first set of images; determining that particular portions of the first set of images satisfy optical flow criteria; in response to determining that the particular portions of the first set of images satisfy optical flow criteria, classifying the particular portions of the first set of images as indicative of human activity; receiving a second set of images captured by the camera after the first set of images; and determining that the second set of images likely shows human activity based on analyzing portions of the second set of images that correspond to the particular portions of the first set of images classified as indicative of human activity.
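The co-occurrence logic above can be sketched with a simple per-cell flow proxy; the grid size, the threshold, and the use of mean absolute frame difference in place of a true optical-flow estimate are all simplifying assumptions:

```python
import numpy as np

CELL = 8           # grid cell size in pixels (assumed)
FLOW_THRESH = 10   # mean |inter-frame difference| treated as flow (assumed)


def cell_flow(prev, curr):
    """Proxy for per-cell optical-flow magnitude: mean |frame diff| per grid cell."""
    diff = np.abs(curr.astype(float) - prev.astype(float))
    h, w = diff.shape
    return diff[: h // CELL * CELL, : w // CELL * CELL] \
        .reshape(h // CELL, CELL, w // CELL, CELL).mean(axis=(1, 3))


def activity_cells(first_set):
    """Classify cells whose flow satisfies the criteria as indicative of human activity."""
    flows = [cell_flow(a, b) for a, b in zip(first_set, first_set[1:])]
    return np.mean(flows, axis=0) > FLOW_THRESH


def likely_human_activity(second_set, active_mask):
    """A later image set likely shows activity if flow recurs in the classified cells."""
    flows = np.mean([cell_flow(a, b) for a, b in zip(second_set, second_set[1:])],
                    axis=0)
    return bool((flows[active_mask] > FLOW_THRESH).any())
```

Only the cells previously classified as indicative of human activity are analyzed in the second image set, which is the co-occurrence idea the abstract describes.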
G08B 13/196 - Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems using television cameras
H04N 7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
H04N 5/232 - Devices for controlling television cameras, e.g. remote control
Methods, systems, and apparatus for ground plane filtering of video events are disclosed. A method includes obtaining a first set of images of a scene from a camera; determining a ground plane from the first set of images of the scene; obtaining a second set of images of the scene after the first set of images of the scene is obtained; determining that movement shown by a group of pixels in the second set of images of the scene satisfies motion criteria; determining that the ground plane corresponds with at least a portion of the group of pixels; and in response to determining that movement shown by the group of pixels in the second set of images of the scene satisfies motion criteria, and that the ground plane corresponds with at least a portion of the group of pixels, classifying the group of pixels as showing ground plane based motion.
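A minimal sketch of the filter follows; the ground-plane estimate (treating the bottom fraction of the static scene as the plane) and all thresholds are simplifying assumptions rather than the disclosed method:

```python
import numpy as np


def estimate_ground_mask(first_set, height_fraction=0.4):
    """Assumed ground-plane estimate: the bottom fraction of the static scene."""
    h, w = first_set[0].shape
    mask = np.zeros((h, w), dtype=bool)
    mask[int(h * (1 - height_fraction)):, :] = True
    return mask


def classify_motion(second_set, ground_mask, motion_thresh=20, overlap_thresh=0.5):
    """Label a moving pixel group as ground-plane motion when it overlaps the plane."""
    diff = np.abs(second_set[-1].astype(float) - second_set[0].astype(float))
    moving = diff > motion_thresh  # pixels satisfying the motion criteria
    if not moving.any():
        return "no motion"
    overlap = (moving & ground_mask).sum() / moving.sum()
    return "ground plane motion" if overlap >= overlap_thresh else "other motion"
```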
G06T 7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
G06V 20/40 - Scenes; Scene-specific elements in video content
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
G06V 10/62 - Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking
G06V 10/50 - Extraction of image or video features by performing operations within image blocks; Extraction of image or video features by using histograms, e.g. histogram of oriented gradients [HoG]; Extraction of image or video features by summing image-intensity values; Projection analysis
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an event detector. The methods, systems, and apparatus include actions of obtaining frames of a video, determining whether an object of interest is detected within the frames, determining whether motion is detected within the frames, determining whether the frames correspond to motion by an object of interest, generating a training set that includes labeled inter-frame differences based on whether the frames correspond to motion by an object of interest, and training an event detector using the training set.
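The training-set generation step can be sketched as follows; the labeling rule (a frame pair counts as "motion by an object of interest" only when both the object detection and the motion detection fire) follows the abstract, while the per-frame detection flags themselves are assumed inputs:

```python
import numpy as np


def build_training_set(frames, object_detected, motion_detected):
    """Return (inter_frame_difference, label) pairs for training an event detector.

    object_detected[i] / motion_detected[i] are booleans for frame i,
    assumed to come from upstream detectors.
    """
    examples = []
    for i in range(1, len(frames)):
        diff = np.abs(frames[i].astype(float) - frames[i - 1].astype(float))
        label = 1 if (object_detected[i] and motion_detected[i]) else 0
        examples.append((diff, label))
    return examples
```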
A device for detecting and alleviating flooding and blocked storm sewers includes a manhole cover coupled to a float body. The device also includes a canister having a drain hole and a valve. The device also includes multiple guides that can catch onto part of a sewer. The device is configured such that when water flows into the canister, the manhole cover, float body, and guides rise and the valve is opened.
E02D 29/14 - Covers for manholes or the like; Frames for covers
F21V 23/04 - Arrangement of electric circuit elements in or on lighting devices the elements being switches
G01F 23/00 - Indicating or measuring liquid level or level of fluent solid material, e.g. indicating in terms of volume or indicating by means of an alarm
G01F 23/62 - Indicating or measuring liquid level or level of fluent solid material, e.g. indicating in terms of volume or indicating by means of an alarm by floats using elements rigidly fixed to, and rectilinearly moving with, the floats as transmission elements using magnetically actuated indicating means
G01S 19/01 - Satellite radio beacon positioning systems transmitting time-stamped messages, e.g. GPS [Global Positioning System], GLONASS [Global Orbiting Navigation Satellite System] or GALILEO
H04L 67/12 - Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
Methods, systems, and apparatus for remote camera-assisted robot guidance are disclosed. A method includes obtaining images of objects approaching a door of a property; identifying candidate paths to the door based on the images of the objects approaching the door of the property; determining movement capabilities of the objects; storing the candidate paths to the door labeled by the movement capabilities of the objects that took the paths; determining capability information for a robot at the property that indicates movement capabilities of the robot; selecting, from the candidate paths, a path for the robot to take to the door based on the movement capabilities of the robot and the labels of the candidate paths; and providing guidance information to the robot that guides the robot to the door along the selected path.
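The capability-matched path selection described above could be sketched as below; the path labels, lengths, the tie-breaking rule (shortest feasible path), and the capability names are illustrative assumptions:

```python
# Pick a stored candidate path whose required movement capabilities
# (labels learned from objects that previously took the path) are a
# subset of the robot's capabilities; prefer the shortest feasible path.
def select_path(candidate_paths, robot_capabilities):
    feasible = [p for p in candidate_paths
                if p["required_capabilities"] <= robot_capabilities]
    if not feasible:
        return None
    return min(feasible, key=lambda p: p["length_m"])


paths = [
    {"name": "front steps", "required_capabilities": {"climb_stairs"}, "length_m": 8.0},
    {"name": "side ramp",   "required_capabilities": {"roll"},         "length_m": 12.5},
]
wheeled_robot = {"roll"}  # cannot climb stairs
chosen = select_path(paths, wheeled_robot)  # guidance target for the robot
```

A wheeled robot is steered to the longer ramp, while a robot that can also climb stairs would be given the shorter path over the front steps.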