A method comprising causing an automotive assistant in a vehicle to disregard an utterance made by an occupant of that vehicle based on a sightline of the occupant.
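The gating described in the claim above can be illustrated with a minimal sketch. The gaze-zone names, the `Utterance` record, and the decision rule are hypothetical stand-ins for illustration, not details taken from the claim:

```python
# Hypothetical sketch: gate an utterance on the speaker's sightline.
# Zone names and the gating rule are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Utterance:
    text: str
    speaker_gaze: str  # e.g. "head_unit", "side_window", "rear_seat"

# Sightlines that suggest the occupant is addressing the assistant.
ASSISTANT_GAZE_ZONES = {"head_unit", "instrument_cluster"}

def should_disregard(utt: Utterance) -> bool:
    """Disregard the utterance when the occupant's sightline is not
    directed at a zone associated with the automotive assistant."""
    return utt.speaker_gaze not in ASSISTANT_GAZE_ZONES

def handle(utt: Utterance) -> str:
    return "ignored" if should_disregard(utt) else "processed"
```

A real system would derive the gaze label from a driver-monitoring camera rather than receive it as a string.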
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Downloadable computer software and downloadable computer software platforms for use in and with mobile devices and vehicles for enabling operation, control and performance of vehicle systems; downloadable computer software and downloadable computer software platforms for use in and with mobile devices and vehicles for enabling operation and control of mobile device and vehicle functions based on user commands; downloadable artificial intelligence software for enabling user interaction with vehicles; downloadable computer software for understanding user preferences; downloadable computer software for speech recognition and natural language understanding; downloadable computer software for gaze and gesture detection in or associated with vehicles; downloadable computer software for authentication and identification of individuals; downloadable computer software for reading and translating handwriting and converting text into speech; downloadable computer software for speech signal enhancement; downloadable computer software and downloadable computer software platforms for connecting vehicles with one or more computing devices; downloadable computer software for connecting, operating, and managing networked vehicles; software for vehicle navigation; downloadable computer software for vehicle operation, control and user interaction with vehicles; downloadable computer software for use in the operation and control of autonomous-driving vehicles; Providing temporary use of non-downloadable computer software for operating voice recognition and voice-activated personal assistance programs; providing temporary use of non-downloadable computer software for enabling hands-free operation of computing devices using voice activation and voice recognition; software as a service (SaaS) services featuring software using artificial intelligence technology that enables users to use a voice activated virtual assistant; software as a service (SaaS) services featuring software using 
artificial intelligence technology, namely, a digital assistant featuring speech recognition software; software as a service (SaaS) services featuring software applications for computer understanding, recognition, and processing of natural language; software as a service (SaaS) services featuring software applications for programming and controlling communication with voice assistants, drive-assistants, and smart assistants; software as a service (SaaS) services featuring software applications for recognizing, authenticating, and verifying the identity of a speaker; software as a service (SaaS) services featuring software applications for the deployment of conversational Artificial Intelligence (AI) technology; Software as a service (SaaS) services featuring software applications for use in and with mobile devices and vehicles for enabling operation, control and performance of vehicle systems; software as a service (SaaS) services featuring software applications for use in and with mobile devices and vehicles for enabling operation and control of mobile device and vehicle functions based on user commands; software as a service (SaaS) services featuring software applications including artificial intelligence software for enabling user interaction with vehicles; software as a service (SaaS) services featuring software applications for understanding user preferences; software as a service (SaaS) services featuring software applications for speech recognition and natural language understanding; software as a service (SaaS) services featuring software applications for gaze and gesture detection in or associated with vehicles; software as a service (SaaS) services featuring software applications for authentication and identification of individuals; software as a service (SaaS) services featuring software applications for reading and translating handwriting and converting text into speech; software as a service (SaaS) services featuring software applications for speech 
signal enhancement; software as a service (SaaS) services featuring software applications for connecting vehicles with one or more computing devices; software as a service (SaaS) services featuring software applications for connecting, operating, and managing networked vehicles; software as a service (SaaS) services featuring software applications for vehicle navigation; software as a service (SaaS) services featuring software applications for vehicle operation, control and user interaction with vehicles; software as a service (SaaS) services featuring software applications for use in the operation and control of autonomous-driving vehicles; Software as a service (SaaS) services for developing, deploying, maintaining, administering, managing, training, validating, configuring, monitoring, using, querying, and auditing artificial intelligence software including machine learning applications consisting of large language model applications; providing temporary use of online non-downloadable software for developing, deploying, maintaining, administering, managing, training, validating, configuring, monitoring, using, querying, and auditing artificial intelligence software in the nature of machine learning applications consisting of large language model applications; Platform as a service (PaaS) featuring computer software for enabling secure transmission of digital information and data to and from artificial intelligence software including machine learning models consisting of large language models, for accelerating training and development of large language models, for administering large language models, for integrating with other software systems to share data, and for improving the quality of machine learning model in the nature of large language model responses; Platform as a service (PaaS) featuring computer software for developing, deploying, maintaining, managing, training, validating, configuring, monitoring, querying, and auditing large language model 
applications; Software as a service (SAAS) services featuring software for enterprise-grade, large-language model hosting and fine-tuning on the cloud
A method for providing a driver with a warning includes determining that a vehicle is being operated in a manner that fails to comply with a constraint imposed on motion of vehicles on a section of a road, determining that a driver of the vehicle is gazing in a non-neutral direction, based on having done so, selecting a level of obtrusiveness for the warning message, and outputting the warning message at that level.
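A toy sketch of the selection step described above; the speed-limit constraint, the gaze labels, and the two obtrusiveness levels are assumptions for illustration and do not come from the abstract:

```python
# Illustrative sketch of warning-level selection. The constraint is
# modeled as a speed limit and gaze as a simple label (assumptions).
def select_obtrusiveness(violates_constraint: bool, gaze: str) -> str:
    """Pick a warning level; escalate when the driver's gaze is
    non-neutral (i.e., directed away from the road ahead)."""
    if not violates_constraint:
        return "none"
    return "obtrusive" if gaze != "neutral" else "subtle"

def warn(speed_kmh: float, limit_kmh: float, gaze: str) -> str:
    """Determine non-compliance, then select the output level."""
    return select_obtrusiveness(speed_kmh > limit_kmh, gaze)
```

The same skeleton fits the sentiment-based variant further below, with a sentiment label in place of the gaze label.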
A method is provided for applying a watermark signal to a speech signal to prevent unauthorized use of speech signals. The method may include receiving an original speech signal; determining a corresponding spectrogram of the original speech signal; selecting a phase sequence of fixed frame length and uniform distribution; and generating an encoded watermark signal based on the corresponding spectrogram and phase sequence.
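The steps of that abstract can be sketched as follows: compute a magnitude spectrogram with a short-time FFT, draw a uniformly distributed random phase sequence per fixed-length frame, and combine the two into an encoded watermark signal. The frame size and the exact combination rule are illustrative assumptions:

```python
# Sketch of the watermarking idea; frame length and the magnitude/
# random-phase combination are assumed, not taken from the filing.
import numpy as np

FRAME = 256

def spectrogram(x: np.ndarray) -> np.ndarray:
    """Magnitude spectrogram from non-overlapping frames."""
    n = len(x) // FRAME
    frames = x[: n * FRAME].reshape(n, FRAME)
    return np.abs(np.fft.rfft(frames, axis=1))

def watermark_signal(x: np.ndarray, seed: int = 0) -> np.ndarray:
    """Encoded watermark: the spectrogram magnitudes paired with a
    uniformly distributed phase sequence of fixed frame length."""
    mag = spectrogram(x)
    rng = np.random.default_rng(seed)
    phase = rng.uniform(-np.pi, np.pi, size=mag.shape)
    frames = np.fft.irfft(mag * np.exp(1j * phase), n=FRAME, axis=1)
    return frames.reshape(-1)
```

Because the phase sequence is seeded, the same watermark can be regenerated for verification.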
A method for providing a driver with a warning includes determining that a vehicle is being operated in a manner that fails to comply with a constraint imposed on motion of vehicles on a section of a road, determining that a driver of the vehicle exhibits a non-neutral sentiment, based on having determined that the driver exhibits a non-neutral sentiment, selecting a level of obtrusiveness for the warning message, and outputting the warning message at that level.
B60W 50/14 - Means for informing the driver, warning the driver or prompting a driver intervention
A61B 5/18 - Devices for psychotechnics; Testing reaction times for vehicle drivers
B60W 40/08 - Estimation or calculation of driving parameters for road vehicle drive control systems not related to the control of a particular sub-unit related to drivers or passengers
G06V 20/59 - Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
G10L 25/63 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for estimating an emotional state
A method includes customizing interaction of a person with a vehicle from a fleet of vehicles by receiving, from an access controller in the vehicle, biometric data, the biometric data having been acquired from the person; using the biometric data, retrieving a profile for the person; and providing the profile to the vehicle. This enables the vehicle to transition into, or be reconfigured into, a state that enables the person to interact with the vehicle in a manner based on the profile.
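A minimal sketch of the retrieve-and-reconfigure flow above, assuming a dict-backed profile store keyed by a biometric template identifier; all names and fields are hypothetical:

```python
# Hypothetical fleet-profile flow: biometric data (reduced here to a
# template ID) retrieves a profile, which reconfigures vehicle state.
PROFILE_STORE = {
    "template-42": {"seat": "forward", "language": "en-US", "playlist": "jazz"},
}

def retrieve_profile(biometric_id: str):
    """Look up the person's profile using biometric data supplied by
    the vehicle's access controller; None if the person is unknown."""
    return PROFILE_STORE.get(biometric_id)

def configure_vehicle(vehicle_state: dict, biometric_id: str) -> dict:
    """Transition the vehicle into a state based on the profile."""
    profile = retrieve_profile(biometric_id)
    if profile is not None:
        vehicle_state = {**vehicle_state, **profile}
    return vehicle_state
```

In the method as described, the lookup would run fleet-side and the resulting profile would be provided back to the vehicle.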
42 - Scientific, technological and industrial services, research and design
Goods & Services
Providing temporary use of non-downloadable computer software for operating voice recognition and voice-activated personal assistance programs; providing temporary use of non-downloadable computer software for enabling hands-free operation of computing devices using voice activation and voice recognition; software as a service (SaaS) services featuring artificial intelligence technology that enables users to use a voice activated virtual assistant; software as a service (SaaS) services featuring artificial intelligence technology, namely, a digital assistant featuring speech recognition software; software as a service (SaaS) services featuring software applications for computer understanding, recognition, and processing of natural language; software as a service (SaaS) services featuring software applications for programming and controlling communication with voice assistants, drive-assistants, and smart assistants; software as a service (SaaS) services featuring software applications for recognizing, authenticating, and verifying the identity of a speaker; software as a service (SaaS) services featuring software applications for the deployment of conversational Artificial Intelligence (AI) technology; Software as a service (SaaS) services featuring software applications for use in and with mobile devices and vehicles for enabling operation, control and performance of vehicle systems; software as a service (SaaS) services featuring software applications for use in and with mobile devices and vehicles for enabling operation and control of mobile device and vehicle functions based on user commands; software as a service (SaaS) services featuring software applications including artificial intelligence software for enabling user interaction with vehicles; software as a service (SaaS) services featuring software applications for understanding user preferences; software as a service (SaaS) services featuring software applications for speech recognition and natural language understanding; software as a service (SaaS) services featuring software applications for gaze and gesture detection in or associated with vehicles; software as a service (SaaS) services featuring software applications for authentication and identification of individuals; software as a service (SaaS) services featuring software applications for reading and translating handwriting and converting text into speech; software as a service (SaaS) services featuring software applications for speech signal enhancement; software as a service (SaaS) services featuring software applications for connecting vehicles with one or more computing devices; software as a service (SaaS) services featuring software applications for connecting, operating, and managing networked vehicles; software as a service (SaaS) services featuring software applications for vehicle navigation; software as a service (SaaS) services featuring software applications for vehicle operation, control and user interaction with vehicles; software as a service (SaaS) services featuring software applications for use in the operation and control of autonomous-driving vehicles; software as a service (SaaS) services for developing, deploying, maintaining, administering, managing, training, validating, configuring, monitoring, using, querying, and auditing artificial intelligence software including machine learning applications consisting of large language model applications; providing temporary use of online non-downloadable software for developing, deploying, maintaining, administering, managing, training, validating, configuring, monitoring, using, querying, and auditing artificial intelligence software in the nature of machine learning applications consisting of large language model applications; platform as a service (PaaS) featuring computer software for enabling secure transmission of digital information and data to and from artificial intelligence software including machine learning models consisting of large language models, for accelerating training and development of large language models, for administering large language models, for integrating with other software systems to share data, and for improving the quality of machine learning model in the nature of large language model responses; platform as a service (PaaS) featuring computer software for developing, deploying, maintaining, managing, training, validating, configuring, monitoring, querying, and auditing large language model applications; software as a service (SAAS) services featuring software for enterprise-grade, large-language model hosting and fine-tuning on the cloud.
A method for providing communication between an intravehicular conferee who is in a vehicle and first and second extravehicular conferees who are outside the vehicle includes causing speech by the first extravehicular conferee to originate from a first zone in the vehicle and causing speech by the second extravehicular conferee to originate from a second zone in the vehicle. The first and second zones are volumes of space in a cabin of the vehicle.
A method includes providing interaction packages for consumption by an application that engages in speech interaction with a human client in an environment. The interaction packages include a speaker event and a scene event, both of which have been tagged with timing information. The method includes continuously listening to the environment to obtain a stream of audio data, partitioning it into audio segments and using those audio segments to obtain the scene events and the speaker events for the interaction packages.
A tunable zone detection approach makes use of multiple microphones in a fixed configuration in an environment. There are multiple zones in the environment and one or more predetermined positions in each zone. Predetermined transfer functions between the positions and the microphones are used to determine beamformed energies for each of the positions based on received microphone signals. These beamformed energies may be computed using normalization of correlations between microphones. The beamformed energies are processed using a tunable transformation to determine whether an acoustic source is in a particular zone, thereby enabling adjustment of the detection approach to situations including variation in acoustics of the environment.
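The pipeline above can be sketched under simplifying assumptions: the predetermined transfer functions are reduced to pure sample delays, per-microphone normalization stands in for the normalized correlations, and a softmax with adjustable temperature plays the role of the tunable transformation:

```python
# Simplified zone-detection sketch. Transfer functions are modeled as
# integer sample delays; the tunable transformation is a temperature-
# controlled softmax. Both are illustrative assumptions.
import numpy as np

def beamformed_energy(mics: np.ndarray, delays) -> float:
    """Delay-and-sum energy for one candidate position, with each
    microphone signal normalized so correlations, not levels, dominate."""
    norm = mics / (np.linalg.norm(mics, axis=1, keepdims=True) + 1e-9)
    aligned = np.stack([np.roll(norm[m], -int(d)) for m, d in enumerate(delays)])
    return float(np.sum(aligned.sum(axis=0) ** 2))

def zone_scores(mics: np.ndarray, zone_delays: dict, temperature: float = 1.0) -> dict:
    """Per-zone scores: best beamformed energy over each zone's
    predetermined positions, passed through a tunable softmax."""
    e = np.array([max(beamformed_energy(mics, d) for d in ds)
                  for ds in zone_delays.values()])
    z = np.exp(e / temperature - np.max(e / temperature))
    return dict(zip(zone_delays.keys(), z / z.sum()))
```

Raising the temperature flattens the scores, which is one way to picture tuning the detector for more reverberant acoustics.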
A method includes receiving a representation of a spoken utterance, processing the representation of the spoken utterance to identify, from a number of candidate domains, a request and a serving domain, and routing the request to a personal assistant based on the request and the serving domain. Identification of the serving domain is based on one or more of a contextual state, a behavior profile of a speaker of the utterance, and a semantic content of the utterance.
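A toy sketch of that routing step follows. The candidate domains, keyword scores, and context fields are invented for illustration; a real system would use trained semantic models over the utterance, the contextual state, and the speaker's behavior profile:

```python
# Hypothetical domain router: keyword overlap scores each candidate
# domain; contextual state and behavior profile break ties.
CANDIDATE_DOMAINS = {
    "navigation": {"route", "directions", "traffic"},
    "media": {"play", "song", "playlist"},
    "climate": {"temperature", "warmer", "cooler"},
}

def identify_serving_domain(utterance: str, context: dict, profile: dict) -> str:
    words = set(utterance.lower().split())
    scores = {d: len(words & kw) for d, kw in CANDIDATE_DOMAINS.items()}
    # Contextual state and behavior profile nudge the decision.
    for d in (context.get("active_domain"), profile.get("preferred_domain")):
        if d in scores:
            scores[d] += 0.5
    return max(scores, key=scores.get)

def route(utterance: str, context: dict, profile: dict):
    """Identify the serving domain, then route to its assistant."""
    domain = identify_serving_domain(utterance, context, profile)
    return (domain, f"assistant:{domain}")
```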
Each vehicle in a fleet has an automotive assistant and an external speech interface. A person who is authorized to interact with an automotive assistant of any vehicle from that fleet is detected as being outside a vehicle from that fleet. After having determined that the person is indeed authorized, an automotive assistant in that vehicle communicates with that person. It does so either by receiving an utterance by that person or by transmitting an utterance to that person.
A biometric authenticator for use by an application executing on a vehicle's infotainment system carries out a biometric authentication procedure that is tailored to dynamically varying context information that is obtained by vehicle sensors.
A method that includes receiving information indicative of a location of a vehicle. The vehicle has an occupied-vehicle state that includes an occupant state. This occupant state represents the state of one or more occupants within the vehicle. The method further includes receiving information indicative of this occupant state. The information indicative of the occupant state results from an observation by a detector that is in communication with an infotainment system within the vehicle. The method continues with using both the information indicative of the occupant state and the information indicative of the location to select an advertisement from a database of advertisements. This selected advertisement is one that is ultimately for presentation to the occupant.
A contextual answering system for processing a user spoken utterance and providing a response to the user spoken utterance may include a vehicle head unit configured to receive microphone signals indicative of a user utterance; and a processor programmed to receive data indicative of a vehicle state, receive the user spoken utterance, perform semantic analysis on the user spoken utterance based at least in part on a context of the user spoken utterance and the vehicle state, select a knowledge base as a source for information regarding the user spoken utterance based on the semantic analysis, and provide a response to the user spoken utterance from the selected knowledge base to the vehicle head unit.
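The knowledge-base selection step might be sketched as below; the keyword-based "semantic analysis" and the two knowledge bases are illustrative stand-ins for whatever sources such a system would actually consult:

```python
# Toy contextual answering sketch: pick a knowledge base from a simple
# phrase match, falling back on vehicle state (all assumed names).
KNOWLEDGE_BASES = {
    "owners_manual": ("wiper", "tire", "warning light"),
    "navigation_db": ("nearest", "route", "charging station"),
}

def select_knowledge_base(utterance: str, vehicle_state: dict) -> str:
    text = utterance.lower()
    for kb, phrases in KNOWLEDGE_BASES.items():
        if any(p in text for p in phrases):
            return kb
    # Vehicle state as context, e.g. an active fault favors the manual.
    return "owners_manual" if vehicle_state.get("fault") else "navigation_db"

def answer(utterance: str, vehicle_state: dict) -> str:
    kb = select_knowledge_base(utterance, vehicle_state)
    return f"[{kb}] response to: {utterance}"
```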
42 - Scientific, technological and industrial services, research and design
Goods & Services
(1) Providing temporary use of non-downloadable computer software for operating voice recognition and voice-activated personal assistance programs; providing temporary use of non-downloadable computer software for enabling hands-free operation of computing devices using voice activation and voice recognition; software as a service (SaaS) services featuring artificial intelligence technology that enables users to use a voice activated virtual assistant; software as a service (SaaS) services featuring artificial intelligence technology, namely, a digital assistant featuring speech recognition software; software as a service (SaaS) services featuring software applications for computer understanding, recognition, and processing of natural language; software as a service (SaaS) services featuring software applications for programming and controlling communication with voice assistants, drive-assistants, and smart assistants; software as a service (SaaS) services featuring software applications for recognizing, authenticating, and verifying the identity of a speaker; software as a service (SaaS) services featuring software applications for the deployment of conversational Artificial Intelligence (AI) technology; Software as a service (SaaS) services featuring software applications for use in and with mobile devices and vehicles for enabling operation, control and performance of vehicle systems; software as a service (SaaS) services featuring software applications for use in and with mobile devices and vehicles for enabling operation and control of mobile device and vehicle functions based on user commands; software as a service (SaaS) services featuring software applications including artificial intelligence software for enabling user interaction with vehicles; software as a service (SaaS) services featuring software applications for understanding user preferences; software as a service (SaaS) services featuring software applications for speech recognition and natural 
language understanding; software as a service (SaaS) services featuring software applications for gaze and gesture detection in or associated with vehicles; software as a service (SaaS) services featuring software applications for authentication and identification of individuals; software as a service (SaaS) services featuring software applications for reading and translating handwriting and converting text into speech; software as a service (SaaS) services featuring software applications for speech signal enhancement; software as a service (SaaS) services featuring software applications for connecting vehicles with one or more computing devices; software as a service (SaaS) services featuring software applications for connecting, operating, and managing networked vehicles; software as a service (SaaS) services featuring software applications for vehicle navigation; software as a service (SaaS) services featuring software applications for vehicle operation, control and user interaction with vehicles; software as a service (SaaS) services featuring software applications for use in the operation and control of autonomous-driving vehicles; software as a service (SaaS) services for developing, deploying, maintaining, administering, managing, training, validating, configuring, monitoring, using, querying, and auditing artificial intelligence software including machine learning applications consisting of large language model applications; providing temporary use of online non-downloadable software for developing, deploying, maintaining, administering, managing, training, validating, configuring, monitoring, using, querying, and auditing artificial intelligence software in the nature of machine learning applications consisting of large language model applications; platform as a service (PaaS) featuring computer software for enabling secure transmission of digital information and data to and from artificial intelligence software including machine learning models consisting 
of large language models, for accelerating training and development of large language models, for administering large language models, for integrating with other software systems to share data, and for improving the quality of machine learning model in the nature of large language model responses; platform as a service (PaaS) featuring computer software for developing, deploying, maintaining, managing, training, validating, configuring, monitoring, querying, and auditing large language model applications; software as a service (SAAS) services featuring software for enterprise-grade, large-language model hosting and fine-tuning on the cloud.
Visual Platforms for Configuring Audio Processing Operations
Disclosed are systems, methods, and other implementations, including a method for controlling a configurable audio processor, coupled via a plurality of transducers (such as the microphones 220A-C and/or the loudspeakers 224A-B of FIG. 2) to an acoustic environment, that includes determining a three-dimensional spatial variation in the acoustic environment of a processing characteristic of the audio processor based on configuration values for the audio processor, forming a three-dimensional image of the three-dimensional spatial variation of the processing characteristic, and providing the three-dimensional image for presentation to a user for controlling the configuration values.
A hybrid noise-reducer provides an output audio signal by carrying out noise reduction on an input audio signal over a desired range of frequencies. The desired range of frequencies consists of the union of a base range of frequencies and a remainder range of frequencies. The noise reducer includes first and second noise-reduction paths of different types. The first noise-reduction path relies on a dynamic neural network that has been trained using the base range of frequencies. The second noise-reduction path relies on a noise estimation module that uses a signal-to-noise ratio estimate to identify noise within the remainder range.
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
G10L 25/21 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being power information
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
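The two-path split in the hybrid noise reducer described above can be sketched as follows, with an assumed 4 kHz band edge, a toy mask standing in for the trained dynamic neural network, and a Wiener-style gain for the SNR-driven second path:

```python
# Sketch of the two-path band split. Band edge, the stand-in "neural"
# mask, and the Wiener-style gain are all illustrative assumptions.
import numpy as np

RATE, NFFT = 16000, 512
BASE_HZ = 4000  # neural path covers 0..BASE_HZ; SNR path, the remainder

def neural_mask(mag: np.ndarray) -> np.ndarray:
    # Hypothetical mask: keep bins above the frame's median magnitude.
    return np.where(mag >= np.median(mag), 1.0, 0.3)

def hybrid_reduce(spectrum: np.ndarray, snr_db: np.ndarray) -> np.ndarray:
    """Per-bin magnitudes for one frame: apply the first path below
    BASE_HZ and an SNR-based gain over the remainder range."""
    freqs = np.fft.rfftfreq(NFFT, d=1.0 / RATE)
    base = freqs <= BASE_HZ
    out = spectrum.copy()
    # First path (stand-in for the trained dynamic neural network):
    out[base] *= neural_mask(spectrum[base])
    # Second path: gain derived from the SNR estimate.
    snr = 10 ** (snr_db[~base] / 10)
    out[~base] *= snr / (snr + 1)
    return out
```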
ACOUSTIC INTERFERENCE SUPPRESSION THROUGH SPEAKER-AWARE PROCESSING
Disclosed are systems, methods, and other implementations for acoustic interference suppression, including a method that includes obtaining a multi-source sound signal sample combining multiple sound components from a plurality of sound sources in a sound environment, with the plurality of sound sources including one or more interfering sound sources produced by one or more loudspeakers in the sound environment, determining interfering sound characteristics for one or more sound signals that correspond to the one or more interfering sound sources, and suppressing at least one of the multiple sound components associated with the determined interfering sound characteristics for at least one of the one or more sound signals.
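One way to picture the suppression step is a per-bin spectral gain driven by a loudspeaker reference signal; the mixing model and the gain rule below are simplified assumptions, not the filing's method:

```python
# Illustrative spectral suppression: attenuate frequency bins of the
# mixture that are dominated by a known loudspeaker reference signal.
import numpy as np

def suppress_interference(mix: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Suppress the component of `mix` whose spectral characteristics
    match the loudspeaker `reference` (same length, time-aligned)."""
    M = np.fft.rfft(mix)
    R = np.fft.rfft(reference)
    # Per-bin gain: small where the reference dominates, near 1 elsewhere.
    gain = np.clip(1.0 - np.abs(R) / (np.abs(M) + 1e-9), 0.1, 1.0)
    return np.fft.irfft(gain * M, n=len(mix))
```

A deployed system would also estimate the room's transfer path from loudspeaker to microphone rather than assume a time-aligned reference.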
A computer-implemented Karaoke system, which may be deployed in a vehicle for use by a driver and/or one or more passengers of the vehicle, adjusts relevant settings depending on the properties of the song, for instance as automatically determined by analysis of the audio signal of a song. In some examples, the system may dynamically remix original vocals or user-provided vocals depending on whether the user is singing.
A system for interacting with an audio stream to obtain lyric information, control playback of the audio stream, and control aspects of the audio stream. In some instances, end users can request that the audio stream play with a lead vocal track or without a lead vocal track. Obtaining lyric information includes receiving, via a text-to-speech module, an audio playback of the lyric information.
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
G10L 25/54 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for retrieval
Disclosed are systems, methods, and other implementations for noise suppression, including a method that includes obtaining a sound signal sample, determining a noise reduction profile, from a plurality of noise reduction profiles, for processing the obtained sound signal sample, and processing the sound signal sample with a machine learning system to produce a noise suppressed signal. The machine learning system implements (executes) a single machine learning model trained to controllably suppress noise in input sound signals according to the plurality of noise reduction profiles. The processing of the sound signal sample is performed according to the determined noise reduction profile.
G10L 25/03 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
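The single-model, multi-profile idea in the abstract above can be illustrated with a plain function standing in for the trained network: one set of "weights" serves several noise-reduction profiles because the profile is an input that steers the model's behaviour. The profile names, vector layout, and suppression rule below are our assumptions for illustration only.

```python
import numpy as np

# One conditioning vector per profile; a real system would feed these into the network.
PROFILES = {
    "light":      np.array([0.25]),
    "balanced":   np.array([0.5]),
    "aggressive": np.array([0.9]),
}

def conditioned_suppress(frame, profile_vec):
    """Single 'model' (a stub here) whose suppression strength is set by the profile input."""
    strength = float(profile_vec[0])
    noise_floor = np.percentile(np.abs(frame), 20)
    # Attenuate low-level samples; how hard depends only on the conditioning input.
    gain = np.where(np.abs(frame) < noise_floor, 1.0 - strength, 1.0)
    return frame * gain
```

Switching profiles changes only the conditioning vector, not the model, which is the property the abstract emphasizes.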
Disclosed are systems, methods, and other implementations for noise suppression, including a method that includes obtaining a sound signal sample, determining a noise reduction profile, from a plurality of noise reduction profiles, for processing the obtained sound signal sample, and processing the sound signal sample with a machine learning system to produce a noise suppressed signal. The machine learning system implements (executes) a single machine learning model trained to controllably suppress noise in input sound signals according to the plurality of noise reduction profiles. The processing of the sound signal sample is performed according to the determined noise reduction profile.
Methods and systems for deessing of speech signals are described. A deesser of a speech processing system includes an analyzer configured to receive a full spectral envelope for each time frame of a speech signal presented to the speech processing system, and to analyze the full spectral envelope to identify frequency content for deessing. The deesser also includes a compressor configured to receive results from the analyzer and to spectrally weight the speech signal as a function of those results. The analyzer can be configured to calculate a psychoacoustic measure from the full spectral envelope, and may be further configured to detect sibilant sounds of the speech signal using the psychoacoustic measure. The psychoacoustic measure can include, for example, a measure of sharpness, and the analyzer may be further configured to calculate deesser weights based on the measure of sharpness. Example applications include in-car communication.
G10L 21/0364 - Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
H03G 9/02 - Combinations of two or more types of control, e.g. gain control and tone control in untuned amplifiers
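The analyzer/compressor split described above can be sketched in a few lines. The sharpness proxy here (high-frequency-weighted energy fraction), the 4 kHz sibilance boundary, the threshold, and the cut depth are all our illustrative assumptions; a real deesser would use a proper psychoacoustic sharpness model.

```python
import numpy as np

def sharpness(envelope, freqs):
    """Crude stand-in for a psychoacoustic sharpness measure: spectral
    energy above 4 kHz, weighted increasingly toward high frequencies."""
    w = np.where(freqs >= 4000.0, freqs / 4000.0, 0.0)
    return float(np.sum(w * envelope) / (np.sum(envelope) + 1e-12))

def deess_frame(envelope, freqs, threshold=2.0, max_cut_db=-10.0):
    """Analyzer + compressor: spectrally weight the frame only when the
    sharpness measure flags it as sibilant."""
    s = sharpness(envelope, freqs)
    if s <= threshold:
        return envelope, s
    cut = 10 ** (max_cut_db / 20.0)
    weights = np.where(freqs >= 4000.0, cut, 1.0)
    return envelope * weights, s
```

Vowel-like frames pass through untouched; only frames whose energy concentrates in the sibilance region are attenuated.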
A voice-based system is configured to process commands in a flexible format, for example, in which a wake word does not necessarily have to occur at the beginning of an utterance. As in natural speech, the system being addressed may be named within or at the end of a spoken utterance rather than at the beginning, or depending on the context, may not be named at all.
A method for applying a watermark signal to a speech signal to prevent unauthorized use of speech signals, the method may include receiving an original speech signal; determining a corresponding spectrogram of the original speech signal; selecting a phase sequence of fixed frame length and uniform distribution; and generating an encoded watermark signal based on the corresponding spectrogram and phase sequence.
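A toy embedding in the spirit of this abstract: a fixed-length, uniformly distributed phase sequence rotates each frame's spectral phases, and a correlation against the key reveals the mark. The frame length, strength parameter, symmetry trick, and the detection step are our assumptions, not the claimed method.

```python
import numpy as np

FRAME = 256
EPS = 0.05  # watermark strength in radians

def make_key(seed, n=FRAME):
    """Uniformly distributed phase sequence, made odd-symmetric so that
    phase-rotated frames of a real signal stay real-valued."""
    k = np.random.default_rng(seed).uniform(-1.0, 1.0, n)
    k[0] = k[n // 2] = 0.0
    k[n // 2 + 1:] = -k[1:n // 2][::-1]
    return k

def embed(signal, key, eps=EPS):
    """Rotate each frame's spectral phases by eps * key."""
    out = signal.astype(float).copy()
    for start in range(0, len(signal) - FRAME + 1, FRAME):
        spec = np.fft.fft(signal[start:start + FRAME])
        out[start:start + FRAME] = np.fft.ifft(spec * np.exp(1j * eps * key)).real
    return out

def detect(original, marked, key, eps=EPS):
    """Correlate the observed per-bin phase shift with a candidate key."""
    shift = np.angle(np.fft.fft(marked[:FRAME])) - np.angle(np.fft.fft(original[:FRAME]))
    shift = np.angle(np.exp(1j * shift))  # wrap to (-pi, pi]
    return float(np.corrcoef(shift, eps * key)[0, 1])
```

A matching key yields a correlation near 1; a wrong key yields a correlation near 0, which is what makes the phase sequence usable as a watermark.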
42 - Scientific, technological and industrial services, research and design
Goods & Services
Providing temporary use of non-downloadable computer software for operating voice recognition and voice-activated personal assistance programs; providing temporary use of non-downloadable computer software for enabling hands-free operation of computing devices using voice activation and voice recognition; software as a service (SaaS) services featuring software using artificial intelligence technology that enables users to use a voice activated virtual assistant; software as a service (SaaS) services featuring software using artificial intelligence technology, namely, a digital assistant featuring speech recognition software; software as a service (SaaS) services featuring software applications for computer understanding, recognition, and processing of natural language; software as a service (SaaS) services featuring software applications for programming and controlling communication with voice assistants, drive-assistants, and smart assistants; software as a service (SaaS) services featuring software applications for recognizing, authenticating, and verifying the identity of a speaker; software as a service (SaaS) services featuring software applications for the deployment of conversational Artificial Intelligence (AI) technology; Software as a service (SaaS) services featuring software applications for use in and with mobile devices and vehicles for enabling operation, control and performance of vehicle systems; software as a service (SaaS) services featuring software applications for use in and with mobile devices and vehicles for enabling operation and control of mobile device and vehicle functions based on user commands; software as a service (SaaS) services featuring software applications including artificial intelligence software for enabling user interaction with vehicles; software as a service (SaaS) services featuring software applications for understanding user preferences; software as a service (SaaS) services featuring software applications for speech 
recognition and natural language understanding; software as a service (SaaS) services featuring software applications for gaze and gesture detection in or associated with vehicles; software as a service (SaaS) services featuring software applications for authentication and identification of individuals; software as a service (SaaS) services featuring software applications for reading and translating handwriting and converting text into speech; software as a service (SaaS) services featuring software applications for speech signal enhancement; software as a service (SaaS) services featuring software applications for connecting vehicles with one or more computing devices; software as a service (SaaS) services featuring software applications for connecting, operating, and managing networked vehicles; software as a service (SaaS) services featuring software applications for vehicle navigation; software as a service (SaaS) services featuring software applications for vehicle operation, control and user interaction with vehicles; software as a service (SaaS) services featuring software applications for use in the operation and control of autonomous-driving vehicles; Software as a service (SaaS) services for developing, deploying, maintaining, administering, managing, training, validating, configuring, monitoring, using, querying, and auditing artificial intelligence software including machine learning applications consisting of large language model applications; providing temporary use of online non-downloadable software for developing, deploying, maintaining, administering, managing, training, validating, configuring, monitoring, using, querying, and auditing artificial intelligence software in the nature of machine learning applications consisting of large language model applications; Platform as a service (PaaS) featuring computer software for enabling secure transmission of digital information and data to and from artificial intelligence software including machine 
learning models consisting of large language models, for accelerating training and development of large language models, for administering large language models, for integrating with other software systems to share data, and for improving the quality of machine learning model in the nature of large language model responses; Platform as a service (PaaS) featuring computer software for developing, deploying, maintaining, managing, training, validating, configuring, monitoring, querying, and auditing large language model applications; Software as a service (SAAS) services featuring software for enterprise-grade, large-language model hosting and fine-tuning on the cloud
29.
COLLABORATION BETWEEN A RECOMMENDATION ENGINE AND A VOICE ASSISTANT
A method comprising causing a voice assistant and a recommendation engine that are executing in an infotainment system of a vehicle to cooperate in processing a vehicle occupant's acceptance of a recommendation proposed by the recommendation engine by having an interface to enable the recommendation engine to provide recommendation context to the voice assistant to enable the voice assistant to resolve an ambiguity in the occupant's acceptance of the recommendation.
B60R 16/037 - Electric or fluid circuits specially adapted for vehicles and not otherwise provided for; Arrangement of elements of electric or fluid circuits specially adapted for vehicles and not otherwise provided for electric for occupant comfort
42 - Scientific, technological and industrial services, research and design
Goods & Services
Providing temporary use of non-downloadable computer software for operating voice recognition and voice-activated personal assistance programs; providing temporary use of non-downloadable computer software for enabling hands-free operation of computing devices using voice activation and voice recognition; software as a service (SaaS) services featuring artificial intelligence technology that enables users to use a voice activated virtual assistant; software as a service (SaaS) services featuring artificial intelligence technology, namely, a digital assistant featuring speech recognition software; software as a service (SaaS) services featuring software applications for computer understanding, recognition, and processing of natural language; software as a service (SaaS) services featuring software applications for programming and controlling communication with voice assistants, drive-assistants, and smart assistants; software as a service (SaaS) services featuring software applications for recognizing, authenticating, and verifying the identity of a speaker; software as a service (SaaS) services featuring software applications for the deployment of conversational Artificial Intelligence (AI) technology; Software as a service (SaaS) services featuring software applications for use in and with mobile devices and vehicles for enabling operation, control and performance of vehicle systems; software as a service (SaaS) services featuring software applications for use in and with mobile devices and vehicles for enabling operation and control of mobile device and vehicle functions based on user commands; software as a service (SaaS) services featuring software applications including artificial intelligence software for enabling user interaction with vehicles; software as a service (SaaS) services featuring software applications for understanding user preferences; software as a service (SaaS) services featuring software applications for speech recognition and natural language understanding; software as a service (SaaS) services featuring software applications for gaze and gesture detection in or associated with vehicles; software as a service (SaaS) services featuring software applications for authentication and identification of individuals; software as a service (SaaS) services featuring software applications for reading and translating handwriting and converting text into speech; software as a service (SaaS) services featuring software applications for speech signal enhancement; software as a service (SaaS) services featuring software applications for connecting vehicles with one or more computing devices; software as a service (SaaS) services featuring software applications for connecting, operating, and managing networked vehicles; software as a service (SaaS) services featuring software applications for vehicle navigation; software as a service (SaaS) services featuring software applications for vehicle operation, control and user interaction with vehicles; software as a service (SaaS) services featuring software applications for use in the operation and control of autonomous-driving vehicles.
31.
IN-CAR ASSISTIVE AUDIO TECHNOLOGIES FOR USERS WITH HEARING LOSS
A hearing application (162) for a vehicle audio system may include at least one speaker (148) configured to play playback content, and at least one hearing application programmed to receive optimization parameters from a hearing device (124) within the vehicle (104), the optimization parameters including signal processing parameters specific to the hearing device (124), apply the optimization parameters to the playback content, and transmit the playback content for playback by one of the hearing device (124) and/or at least one speaker (148).
A hearing application for a vehicle audio system may include at least one speaker configured to play playback content, and at least one hearing application programmed to receive optimization parameters from a hearing device within the vehicle, the optimization parameters including signal processing parameters specific to the hearing device, apply the optimization parameters to the playback content, and transmit the playback content for playback by one of the hearing device and/or at least one speaker.
A vehicle includes a cabin, an internal-loudspeaker set, an external-microphone set, and a signal processor that filters a raw audio signal that has been received by the external-microphone set and broadcasts the resulting filtered audio signal into the cabin using the internal-loudspeaker set.
G10K 11/178 - Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
H04N 7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
H04N 23/62 - Control of parameters via user interfaces
H04N 23/695 - Control of camera direction for changing a field of view, e.g. pan, tilt or based on tracking of objects
A method for managing an interaction between a user and a driver interaction system in a vehicle, the method comprising presenting a first audio output to a user from an output device of the driver interaction system, and, while presenting the first audio output to the user, receiving sensed input at the driver interaction system, processing the sensed input including determining an emotional content of the driver, and controlling the interaction based at least in part on the emotional content of the sensed input.
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
G10L 25/63 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for estimating an emotional state
G10L 25/78 - Detection of presence or absence of voice signals
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
A voice assistant system for a vehicle includes a microphone configured to detect an audio signal from a user of the vehicle; a speaker configured to output a dialogue in response to the audio signal; and a processor programmed to, responsive to detecting a conversation in which the user is involved, decrease a lengthiness setting of the voice assistant system to reduce the length of the dialogue, and increase an independency setting of the voice assistant system to prevent a confirmation question from the voice assistant system.
G10L 25/78 - Detection of presence or absence of voice signals
G10L 25/63 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for estimating an emotional state
G10L 17/02 - Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
G10L 17/06 - Decision making techniques; Pattern matching strategies
B60Q 9/00 - Arrangement or adaptation of signal devices not provided for in one of main groups
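The lengthiness/independency behaviour described in this abstract reduces to a small piece of settings logic. The setting names mirror the abstract; the numeric ranges, deltas, and the confirmation threshold below are our own illustrative choices.

```python
from dataclasses import dataclass

@dataclass
class AssistantSettings:
    lengthiness: float = 0.7   # 0 = terse dialogue, 1 = verbose
    independency: float = 0.3  # 1 = never ask confirmation questions

def on_conversation_detected(s: AssistantSettings) -> AssistantSettings:
    """While the user is talking to someone else, shorten the dialogue
    and suppress confirmation questions."""
    return AssistantSettings(
        lengthiness=max(0.0, s.lengthiness - 0.4),
        independency=min(1.0, s.independency + 0.5),
    )

def should_confirm(s: AssistantSettings) -> bool:
    """The assistant only asks a confirmation question below this threshold."""
    return s.independency < 0.5
```

The interesting design point is that the two adjustments are coupled to one trigger (conversation detection) rather than being user-facing knobs.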
A voice assistant system for a vehicle includes a microphone configured to detect an audio signal from a user of the vehicle; a speaker configured to output a dialogue in response to the audio signal; and a processor programmed to responsive to detecting a conversation in which the user is involved, decrease a lengthiness setting of the voice assistant system to reduce the length of the dialogue, and increase an independency setting of the voice assistant system to prevent a confirmation question from the voice assistant system.
A method for managing an interaction between a user and a driver interaction system in a vehicle, the method comprising presenting a first audio output to a user from an output device of the driver interaction system, and, while presenting the first audio output to the user, receiving sensed input at the driver interaction system, processing the sensed input including determining an emotional content of the driver, and controlling the interaction based at least in part on the emotional content of the sensed input.
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
G10L 25/63 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for estimating an emotional state
A method for synthesizing speech from a textual input includes receiving the textual input, the textual input including native words in a native language and foreign words in a foreign language, and processing the textual input to determine a phonetic representation of the textual input. The processing includes determining a native phonetic representation of the native words, and determining a nativized phonetic representation of the foreign words. Determining the nativized phonetic representation includes forming a foreign phonetic representation of the foreign words using a foreign phoneme set, and mapping the foreign phonetic representation to the nativized phonetic representation according to a model of a native speaker's pronunciation of foreign words.
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
A method for synthesizing speech from a textual input includes receiving the textual input, the textual input including native words in a native language and foreign words in a foreign language, and processing the textual input to determine a phonetic representation of the textual input. The processing includes determining a native phonetic representation of the native words, and determining a nativized phonetic representation of the foreign words. Determining the nativized phonetic representation includes forming a foreign phonetic representation of the foreign words using a foreign phoneme set, and mapping the foreign phonetic representation to the nativized phonetic representation according to a model of a native speaker's pronunciation of foreign words.
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G10L 13/06 - Elementary speech units used in speech synthesisers; Concatenation rules
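The mapping step in the nativization abstracts above can be reduced to a table lookup in its simplest form; the patent describes a learned model of a native speaker's pronunciation, whereas the table below (German phonemes approximated by English ones) is purely illustrative.

```python
# Illustrative German -> "nativized" English phoneme map; a real system
# would learn this mapping rather than hard-code it.
FOREIGN_TO_NATIVE = {
    "ç": "sh",   # ich-laut approximated by /sh/
    "x": "k",    # ach-laut approximated by /k/
    "ʏ": "u",    # front rounded vowel flattened to /u/
    "ʁ": "r",    # uvular r replaced by the native r
}

def nativize(foreign_phonemes):
    """Map each foreign phoneme to its native approximation,
    passing phonemes shared by both inventories through unchanged."""
    return [FOREIGN_TO_NATIVE.get(p, p) for p in foreign_phonemes]
```

Feeding the nativized sequence to a native-language synthesizer yields the "foreign word as a native speaker would say it" effect the abstracts describe.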
An interface customized to a detected emotional state of a user is provided. Audio signals are received from at least one microphone, the audio signals being indicative of spoken words, phrases, or commands. A wake-up word (WuW) is detected in the audio signals. An emotion is also detected in the audio signals containing the WuW. An emotion-aware processing system is configured according to the detected emotion. A voice control session is performed using the emotion-aware processing system configured according to the detected emotion.
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
G10L 25/63 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for estimating an emotional state
An interface customized to a detected emotional state of a user is provided. Audio signals are received from at least one microphone, the audio signals being indicative of spoken words, phrases, or commands. A wake-up word (WuW) is detected in the audio signals. An emotion is also detected in the audio signals containing the WuW. An emotion-aware processing system is configured according to the detected emotion. A voice control session is performed using the emotion-aware processing system configured according to the detected emotion.
G10L 17/26 - Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
42.
INTERACTIVE MODIFICATION OF SPEAKING STYLE OF SYNTHESIZED SPEECH
Control over the speaking style of a text-to-speech (TTS) system is provided without necessarily requiring that the training of the TTS conversion process (e.g., the ANN used for the conversion) take into account the speaking styles of the training data. For example, the TTS system may allow adjustment of characteristics of speaking style such as speed, perceivable degree of "kindness", average pitch, pitch variation, and duration of pauses. In some examples, a voice designer may have a number of independent controls that vary corresponding characteristics without necessarily varying others. Once the designer has configured a desired overall speaking style based on those controllable characteristics, the TTS system can be configured to use that speaking style in deployment, for example, as the audio output of an in-vehicle voice assistant.
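The independent-controls idea can be sketched as a thin parameter layer on top of an already-trained synthesizer. The control names mirror the characteristics listed in the abstract; the mapping to synthesis parameters (and the parameters themselves) are our assumptions.

```python
from dataclasses import dataclass

@dataclass
class SpeakingStyle:
    speed: float = 1.0           # playback-rate multiplier
    kindness: float = 0.5        # 0..1, perceived warmth
    mean_pitch_hz: float = 180.0
    pitch_variation: float = 1.0
    pause_scale: float = 1.0     # multiplier on pause durations

def to_synthesis_params(style: SpeakingStyle) -> dict:
    """Each control varies one parameter group without touching the others,
    so a designer can tune characteristics independently."""
    return {
        "duration_scale": 1.0 / style.speed,
        "f0_mean": style.mean_pitch_hz * (1.0 + 0.1 * style.kindness),
        "f0_std_scale": style.pitch_variation,
        "pause_scale": style.pause_scale,
    }
```

Because the layer sits outside the model, no retraining is needed to expose these controls, which is the point the abstract makes.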
42 - Scientific, technological and industrial services, research and design
Goods & Services
(1) Providing temporary use of non-downloadable computer software for operating voice recognition and voice-activated personal assistance programs; providing temporary use of non-downloadable computer software for enabling hands-free operation of computing devices using voice activation and voice recognition; software as a service (SaaS) services featuring artificial intelligence technology that enables users to use a voice activated virtual assistant; software as a service (SaaS) services featuring artificial intelligence technology, namely, a digital assistant featuring speech recognition software; software as a service (SaaS) services featuring software applications for computer understanding, recognition, and processing of natural language; software as a service (SaaS) services featuring software applications for programming and controlling communication with voice assistants, drive-assistants, and smart assistants; software as a service (SaaS) services featuring software applications for recognizing, authenticating, and verifying the identity of a speaker; software as a service (SaaS) services featuring software applications for the deployment of conversational Artificial Intelligence (AI) technology; Software as a service (SaaS) services featuring software applications for use in and with mobile devices and vehicles for enabling operation, control and performance of vehicle systems; software as a service (SaaS) services featuring software applications for use in and with mobile devices and vehicles for enabling operation and control of mobile device and vehicle functions based on user commands; software as a service (SaaS) services featuring software applications including artificial intelligence software for enabling user interaction with vehicles; software as a service (SaaS) services featuring software applications for understanding user preferences; software as a service (SaaS) services featuring software applications for speech recognition and natural 
language understanding; software as a service (SaaS) services featuring software applications for gaze and gesture detection in or associated with vehicles; software as a service (SaaS) services featuring software applications for authentication and identification of individuals; software as a service (SaaS) services featuring software applications for reading and translating handwriting and converting text into speech; software as a service (SaaS) services featuring software applications for speech signal enhancement; software as a service (SaaS) services featuring software applications for connecting vehicles with one or more computing devices; software as a service (SaaS) services featuring software applications for connecting, operating, and managing networked vehicles; software as a service (SaaS) services featuring software applications for vehicle navigation; software as a service (SaaS) services featuring software applications for vehicle operation, control and user interaction with vehicles; software as a service (SaaS) services featuring software applications for use in the operation and control of autonomous-driving vehicles.
42 - Scientific, technological and industrial services, research and design
Goods & Services
Providing temporary use of non-downloadable computer software for operating voice recognition and voice-activated personal assistance programs; providing temporary use of non-downloadable computer software for enabling hands-free operation of computing devices using voice activation and voice recognition; software as a service (SaaS) services featuring artificial intelligence technology that enables users to use a voice activated virtual assistant; software as a service (SaaS) services featuring artificial intelligence technology, namely, a digital assistant featuring speech recognition software; software as a service (SaaS) services featuring software applications for computer understanding, recognition, and processing of natural language; software as a service (SaaS) services featuring software applications for programming and controlling communication with voice assistants, drive-assistants, and smart assistants; software as a service (SaaS) services featuring software applications for recognizing, authenticating, and verifying the identity of a speaker; software as a service (SaaS) services featuring software applications for the deployment of conversational Artificial Intelligence (AI) technology; Software as a service (SaaS) services featuring software applications for use in and with mobile devices and vehicles for enabling operation, control and performance of vehicle systems; software as a service (SaaS) services featuring software applications for use in and with mobile devices and vehicles for enabling operation and control of mobile device and vehicle functions based on user commands; software as a service (SaaS) services featuring software applications including artificial intelligence software for enabling user interaction with vehicles; software as a service (SaaS) services featuring software applications for understanding user preferences; software as a service (SaaS) services featuring software applications for speech recognition and natural 
language understanding; software as a service (SaaS) services featuring software applications for gaze and gesture detection in or associated with vehicles; software as a service (SaaS) services featuring software applications for authentication and identification of individuals; software as a service (SaaS) services featuring software applications for reading and translating handwriting and converting text into speech; software as a service (SaaS) services featuring software applications for speech signal enhancement; software as a service (SaaS) services featuring software applications for connecting vehicles with one or more computing devices; software as a service (SaaS) services featuring software applications for connecting, operating, and managing networked vehicles; software as a service (SaaS) services featuring software applications for vehicle navigation; software as a service (SaaS) services featuring software applications for vehicle operation, control and user interaction with vehicles; software as a service (SaaS) services featuring software applications for use in the operation and control of autonomous-driving vehicles
45.
METHODS AND SYSTEMS FOR INCREASING AUTONOMOUS VEHICLE SAFETY AND FLEXIBILITY USING VOICE INTERACTION
A vehicle control system executing a voice control system for facilitating voice-based dialog with a driver to enable the driver or autonomous vehicle to control certain operational aspects of an autonomous vehicle is provided. Using environmental and sensor input, the vehicle control system can select optimal routes for operating the vehicle in an autonomous mode or choose a preferred operational mode. Occupants of the autonomous vehicle can change a destination, route or driving mode by engaging with the vehicle control system in a dialog enabled by the voice control system.
A microphone signal is received from at least one microphone. Acoustic echo cancellation (AEC) produces an echo-cancelled microphone signal using first adaptive filters to estimate and cancel feedback that is a result of the environment. Acoustic feedback cancellation (AFC) produces a processed microphone signal using second adaptive filters to estimate and cancel feedback resulting from application of the reinforced voice signal within the environment. The uttered speech is reinforced in the processed microphone signal to produce the reinforced voice signal. The reinforced voice signal and the audio signal are applied to the loudspeakers. A step size of adjustment of the second adaptive filters may be increased responsive to detection of reverberation in the microphone signal. The reverberation that is used to control the step size of the second adaptive filters may be added artificially. This may provide multiple benefits, including improving adjustment of the second adaptive filters and improving the sound impression of the voice.
G10L 21/02 - Speech enhancement, e.g. noise reduction or echo cancellation
H04M 9/08 - Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
H04R 3/02 - Circuits for transducers for preventing acoustic reaction
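The variable-step idea in this abstract can be illustrated with a normalized LMS (NLMS) update whose step size is raised while reverberation is flagged. The filter length, the two step values, and the stubbed detector flag are illustrative assumptions; the patented system's detector and filter structure are not reproduced here.

```python
import numpy as np

TAPS = 64
MU_BASE, MU_REVERB = 0.05, 0.5  # assumed step sizes; larger during reverberation

def nlms_step(w, x_buf, d, reverb_detected):
    """One NLMS update of the feedback-cancelling filter.

    w: current filter taps; x_buf: most recent TAPS loudspeaker samples
    (newest first); d: current microphone sample; returns (new taps,
    feedback-cancelled sample)."""
    mu = MU_REVERB if reverb_detected else MU_BASE
    y = w @ x_buf                 # feedback estimate
    e = d - y                     # feedback-cancelled sample
    w_new = w + mu * e * x_buf / (x_buf @ x_buf + 1e-8)
    return w_new, e
```

With a larger step size the filter tracks the feedback path faster, which is the claimed benefit of boosting the step while (possibly artificial) reverberation is present.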
A microphone signal is received from at least one microphone. AEC produces an echo-cancelled microphone signal using first adaptive filters to estimate and cancel feedback that results from the environment. AFC produces a processed microphone signal using second adaptive filters to estimate and cancel feedback resulting from application of the reinforced voice signal within the environment. The uttered speech is reinforced in the processed microphone signal to produce the reinforced voice signal. The reinforced voice signal and the audio signal are applied to the loudspeakers. A step size of adjustment of the second adaptive filters may be increased responsive to detection of reverberation in the microphone signal. The reverberation used to control the step size of the second adaptive filters may be added artificially. This may provide multiple benefits, including improving adjustment of the second adaptive filters and improving the sound impression of the voice.
A system for interactive and iterative media generation may include loudspeakers configured to play back audio signals into an environment, the audio signals including karaoke content; at least one microphone configured to receive microphone signals indicative of sound in the environment; and a processor programmed to: receive a first microphone signal from the at least one microphone, the first microphone signal including a first user sound and karaoke content; instruct the loudspeakers to play back the first microphone signal; receive a second microphone signal from the at least one microphone, the second microphone signal including the first user sound of the first microphone signal and a second user sound; and transmit the second microphone signal, including the first and second microphone signals and the karaoke content, as an instance of iteratively-generated media content.
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
H04W 4/46 - Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P] for vehicle-to-vehicle communication [V2V]
An In-Car Communication (ICC) system supports the communication paths within a car by receiving the speech signals of a speaking passenger and playing them back for one or more listening passengers. Signal processing tasks are split into a microphone-related part and a loudspeaker-related part. A sound processing system suitable for use in a vehicle having multiple acoustic zones includes a plurality of microphone In-Car Communication (Mic-ICC) instances and a plurality of loudspeaker In-Car Communication (Ls-ICC) instances. The system further includes a dynamic audio routing matrix with a controller that is coupled to the Mic-ICC instances, a mixer coupled to the plurality of Mic-ICC instances, and a distributor coupled to the Ls-ICC instances.
G10L 25/84 - Detection of presence or absence of voice signals for discriminating voice from noise
H04M 9/08 - Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
H04R 1/40 - Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
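The Mic-ICC/mixer/Ls-ICC topology above can be illustrated with a toy routing step: a matrix of per-zone gains decides which source zones are mixed into each destination zone. All names are hypothetical, and the real system operates on continuous streams rather than single samples.

```python
# Sketch of the routing idea: each zone has a Mic-ICC front end; a routing
# matrix selects and weights source zones; the mixed result feeds each
# zone's Ls-ICC back end. Names and data shapes are illustrative.

def route_icc(mic_signals, routing_matrix):
    """mic_signals: {src_zone: sample value};
    routing_matrix: {dst_zone: {src_zone: gain}}.
    Returns one mixed sample per destination zone."""
    return {dst: sum(gain * mic_signals[src]
                     for src, gain in gains.items())
            for dst, gains in routing_matrix.items()}
```

For example, a front-seat talker can be routed to the rear zone and vice versa by placing unit gains on the cross-zone entries of the matrix.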
A vehicle system for classifying a spoken utterance within a vehicle cabin as one of system-directed and non-system-directed, the system may include at least one microphone configured to detect at least one audio signal from at least one occupant of a vehicle, and a processor programmed to receive the at least one audio signal including at least one acoustic utterance, determine a number of vehicle occupants based at least in part on the at least one audio signal, determine a probability that the utterance is system-directed based at least in part on the utterance and the number of vehicle occupants, determine a classification threshold based at least in part on the number of vehicle occupants, and compare the classification threshold to the probability to determine whether the at least one acoustic utterance is one of a system-directed utterance and a non-system-directed utterance.
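A minimal sketch of the occupant-aware decision rule described above: the classification threshold grows with the number of occupants, reflecting that speech in a full cabin is more likely passenger conversation than a system command. The names and the linear threshold schedule are assumptions, not taken from the patent.

```python
# Hypothetical occupant-aware classification rule: more occupants raise the
# bar for accepting an utterance as system-directed.

def system_directed(prob_system_directed, num_occupants,
                    base_threshold=0.5, per_occupant=0.1, max_threshold=0.9):
    """Return True if the utterance should be treated as system-directed."""
    # Raise the threshold a fixed amount per occupant beyond the driver,
    # capped so commands remain possible in a full cabin.
    threshold = min(base_threshold + per_occupant * (num_occupants - 1),
                    max_threshold)
    return prob_system_directed >= threshold
```

A marginal utterance (probability 0.55) is accepted with a lone driver but rejected with three occupants, since the threshold climbs from 0.5 to 0.7.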
Disclosed are systems, methods and other implementations for speech generation, including a method that includes obtaining a speech sample for a target speaker, processing, using a trained encoder, the speech sample to produce a parametric representation of the speech sample for the target speaker, receiving configuration data for a speech synthesis system that accepts as an input the parametric representation, and adapting the configuration data for the speech synthesis system according to an input comprising the parametric representation, and a time-domain representation for the speech sample, to generate adapted configuration data for the speech synthesis system. The method further includes causing configuration of the speech synthesis system according to the adapted configuration data, with the speech synthesis system being implemented to generate synthesized speech output data with estimated voice and time-domain speech characteristics approximating actual voice and time-domain speech characteristics for the target speaker.
A system for detection of at least one designated wake-up word for at least one speech-enabled application. The system comprises at least one microphone; and at least one computer hardware processor configured to perform: receiving an acoustic signal generated by the at least one microphone at least in part as a result of receiving an utterance spoken by a speaker; obtaining information indicative of the speaker's identity; interpreting the acoustic signal at least in part by determining, using the information indicative of the speaker's identity and automated speech recognition, whether the utterance spoken by the speaker includes the at least one designated wake-up word; and interacting with the speaker based, at least in part, on results of the interpreting.
A method for selecting a speech recognition result on a computing device includes receiving a first speech recognition result determined by the computing device, receiving first features, at least some of the features being determined using the first speech recognition result, determining whether to select the first speech recognition result or to wait for a second speech recognition result determined by a cloud computing service based at least in part on the first speech recognition result and the first features.
G08G 1/04 - Detecting movement of traffic to be counted or controlled using optical or ultrasonic detectors
H04R 1/40 - Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
G01S 3/801 - Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using ultrasonic, sonic, or infrasonic waves - Details
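The local-versus-cloud selection described above can be sketched as a simple policy: commit to the on-device hypothesis when its confidence is high or when the cloud result would miss the latency budget, and otherwise wait. Function and parameter names are hypothetical; the patented method may weigh many more features.

```python
# Illustrative arbitration between an on-device recognizer and a cloud
# recognizer. All thresholds and names are assumptions.

def select_or_wait(local_result, confidence, latency_budget_ms,
                   cloud_eta_ms, confidence_floor=0.85):
    """Return ('local', result) to commit now, or ('wait', None) to hold
    for the second, cloud-computed recognition result."""
    if confidence >= confidence_floor:
        return ('local', local_result)      # local hypothesis is trusted
    if cloud_eta_ms > latency_budget_ms:
        return ('local', local_result)      # cloud would arrive too late
    return ('wait', None)                   # defer to the cloud result
```

The design trade-off is responsiveness versus accuracy: the floor controls how often the (usually more accurate) cloud result is awaited.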
55.
CONTEXTUAL UTTERANCE RESOLUTION IN MULTIMODAL SYSTEMS
A system and method of responding to a vocal utterance may include capturing and converting the utterance to word(s) using a language processing method, such as natural language processing. The context of the utterance and of the system, which may include multimodal inputs, may be used to determine the meaning and intent of the words.
A vehicle system for classifying a spoken utterance within a vehicle cabin as one of system-directed and non-system-directed, the system may include at least one microphone configured to detect at least one acoustic utterance from at least one occupant of a vehicle, at least one sensor to detect user behavior data indicative of user behavior, and a processor programmed to: receive the acoustic utterance, classify the acoustic utterance as one of a system-directed utterance and a non-system-directed utterance, determine whether the acoustic utterance was properly classified based on user behavior observed via data received from the sensor after the classification, and apply a mitigating adjustment to classifications of subsequent acoustic utterances based on an improper classification.
G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
G10L 25/63 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for estimating an emotional state
57.
Techniques for language independent wake-up word detection
A method for a user device, including receiving a first acoustic input of a user speaking a wake-up word in the target language; providing a first acoustic feature derived from the first acoustic input to an acoustic model stored on the user device to obtain a first sequence of speech units corresponding to the wake-up word spoken by the user in the target language, the acoustic model trained on a corpus of training data in a source language different than the target language; receiving a second acoustic input including the wake-up word in the target language; providing a second acoustic feature derived from the second acoustic input to the acoustic model to obtain a second sequence of speech units corresponding to the wake-up word in the target language; and comparing the first and second sequences of speech units to recognize the wake-up word in the target language.
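The comparison of the enrolled and incoming speech-unit sequences described above can be sketched with an edit-distance match. The patent abstract does not specify the comparison metric, so the Levenshtein distance and tolerance below are illustrative assumptions.

```python
# Illustrative language-independent wake-up word match: compare two
# sequences of speech units (e.g. phones from a source-language acoustic
# model) by edit distance. Names and tolerance are hypothetical.

def edit_distance(a, b):
    """Classic single-row dynamic-programming Levenshtein distance."""
    dp = list(range(len(b) + 1))
    for i, ua in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, ub in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # delete from a
                                     dp[j - 1] + 1,    # insert into a
                                     prev + (ua != ub))  # substitute
    return dp[-1]

def matches_wakeup(enrolled_units, heard_units, tolerance=0.2):
    """Accept if the sequences differ by at most `tolerance` of the
    enrolled sequence length."""
    d = edit_distance(enrolled_units, heard_units)
    return d <= tolerance * max(len(enrolled_units), 1)
```

Because both sequences come from the same (source-language) acoustic model, the comparison works even when the wake-up word is in a target language the model was never trained on.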
A method for managing location-aware reminders in an automobile includes monitoring a geographic location of the automobile using a computer system installed in the vehicle. The computer system detects that the automobile has entered a geographic region associated with a location-aware reminder and issues a reminder message associated with the location-aware reminder to a driver of the automobile based on the detecting.
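The geofence test behind such location-aware reminders can be sketched with a great-circle distance check against each reminder's region. The circular-region model and all names here are assumptions for illustration.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS-84 points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def due_reminders(vehicle_pos, reminders):
    """reminders: [(message, (lat, lon), radius_m)].
    Returns the messages whose region the vehicle has entered."""
    return [msg for msg, center, radius in reminders
            if haversine_m(*vehicle_pos, *center) <= radius]
```

In practice the in-vehicle computer would run this check as the monitored GPS position updates, issuing each triggered message to the driver.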
A system and method for providing avatar device status indicators for voice assistants in multi-zone vehicles. The method comprises: receiving at least one signal from a plurality of microphones, wherein each microphone is associated with one of a plurality of spatial zones, and one of a plurality of avatar devices; wherein the at least one signal further comprises a speech signal component from a speaker; wherein the speech signal component is a voice command or question; sending zone information associated with the speaker and with one of the plurality of spatial zones to an avatar; activating one of the plurality of avatar devices in a respective one of the plurality of spatial zones associated with the speaker.
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
H04R 1/40 - Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
G10L 25/63 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for estimating an emotional state
H05B 47/12 - Controlling the light source in response to determined parameters by determining the presence or movement of objects or living beings by detecting audible sound
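Zone-to-avatar activation reduces to picking the spatial zone whose microphone carries the strongest speech signal and lighting that zone's avatar. This toy sketch assumes a per-zone level map and is not the claimed method.

```python
# Hypothetical zone selection for multi-zone avatar indicators.

def active_zone(zone_levels):
    """zone_levels: {zone: speech level on that zone's microphone}.
    Returns the zone whose microphone carries the strongest speech."""
    return max(zone_levels, key=zone_levels.get)

def activate_avatar(zone_levels, avatars):
    """avatars: {zone: avatar device id}; returns the avatar to activate."""
    return avatars[active_zone(zone_levels)]
```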
60.
SYSTEM AND METHOD FOR ACOUSTIC DETECTION OF EMERGENCY SIRENS
A method detects presence of a multi-tone siren type in an acoustic signal. The multi-tone siren type is associated with one or more siren patterns, where each siren pattern includes a number of time patterns at corresponding frequencies. The method includes processing a number of frequency components of a frequency domain representation of the acoustic signal over time to determine a corresponding plurality of values. That processing includes determining, for each frequency component, a value characterizing a presence of a time pattern associated with at least one siren pattern. The method also includes processing the values according to the siren patterns to determine a detection result indicating whether the multi-tone siren type is present in the acoustic signal.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G08G 1/0965 - Arrangements for giving variable traffic instructions having an indicator mounted inside the vehicle, e.g. giving voice messages responding to signals from another vehicle, e.g. emergency vehicle
B60W 50/14 - Means for informing the driver, warning the driver or prompting a driver intervention
B60W 60/00 - Drive control systems specially adapted for autonomous road vehicles
B60W 30/16 - Control of distance between vehicles, e.g. keeping a distance to preceding vehicle
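The two-stage scoring in the abstract above — per-frequency time-pattern values, then combination across a siren pattern — can be sketched as follows. The correlation-style score and the threshold are illustrative assumptions, not the patented detector.

```python
# Illustrative multi-tone siren detection: stage 1 scores each frequency
# bin for its expected on/off time pattern; stage 2 combines the values
# according to the siren pattern. Names are hypothetical.

def _mean(xs):
    return sum(xs) / len(xs) if xs else 0.0

def pattern_score(bin_energy_over_time, expected_on):
    """Score how well observed energy follows the expected on/off pattern."""
    on = [e for e, flag in zip(bin_energy_over_time, expected_on) if flag]
    off = [e for e, flag in zip(bin_energy_over_time, expected_on) if not flag]
    return _mean(on) - _mean(off)   # high when energy tracks the pattern

def detect_siren(spectrogram, siren_pattern, threshold=0.5):
    """spectrogram: {freq: [energy per frame]};
    siren_pattern: {freq: [True/False per frame]}, e.g. an alternating
    two-tone hi-lo siren. Returns the detection result."""
    values = [pattern_score(spectrogram[f], siren_pattern[f])
              for f in siren_pattern if f in spectrogram]
    return bool(values) and min(values) >= threshold
```

A hi-lo siren alternating between two tones scores highly on both frequency components; broadband noise scores near zero on each and is rejected.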
While current voice assistants can respond to voice requests, creating smarter assistants that leverage location, past requests, and user data to enhance responses to future requests and to provide robust data about locations is desirable. A method for enhancing a geolocation database (“database”) associates a user-initiated triggering event with a location in a database by sensing user position and orientation within the vehicle and a position and orientation of the vehicle. The triggering event is detected by sensors arranged within a vehicle with respect to the user. The method determines a point of interest (“POI”) near the location based on the user-initiated triggering event. The method, responsive to the user-initiated triggering event, updates the database based on information related to the user-initiated triggering event at an entry of the database associated with the POI. The database and voice assistants can leverage the enhanced data about the POI for future requests.
A hybrid noise-reducer provides an output audio signal by carrying out noise reduction on an input audio signal over a desired range of frequencies. The desired range of frequencies consists of the union of a base range of frequencies and a remainder range of frequencies. The noise reducer includes first and second noise-reduction paths of different types. The first noise-reduction path relies on a dynamic neural network that has been trained using the base range of frequencies. The second noise-reduction path relies on a noise estimation module that uses a signal-to-noise-ratio estimate to identify noise within the remainder range.
G10L 21/0264 - Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
G10L 25/78 - Detection of presence or absence of voice signals
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
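The two-path split can be illustrated with a toy spectral version: a stand-in for the trained network handles the base band, and a Wiener-style gain derived from the SNR estimate handles the remainder band. `nn_suppress` is a placeholder for the trained model; the band split and gain rule are assumptions.

```python
# Hypothetical hybrid noise reduction over a dict of frequency bins.

def wiener_gain(snr):
    """Classic SNR-driven suppression gain in [0, 1)."""
    return snr / (snr + 1.0)

def hybrid_denoise(spectrum, snr_estimates, base_band, nn_suppress):
    """spectrum / snr_estimates: {freq_bin: value}; base_band: the set of
    bins handled by the neural path; nn_suppress: callable standing in for
    the trained network on those bins."""
    out = {}
    for f, mag in spectrum.items():
        if f in base_band:
            out[f] = nn_suppress(f, mag)                   # neural path
        else:
            out[f] = mag * wiener_gain(snr_estimates[f])   # estimation path
    return out
```

The design rationale mirrors the abstract: the learned path covers only the frequencies it was trained on, while the cheaper estimation path extends coverage to the remainder of the desired range.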
63.
METHODS AND APPARATUS FOR DETECTING A VOICE COMMAND
According to some aspects, a method of monitoring an acoustic environment of a mobile device, at least one computer readable medium encoded with instructions that, when executed, perform such a method and/or a mobile device configured to perform such a method is provided. The method comprises receiving acoustic input from the environment of the mobile device while the mobile device is operating in the low power mode, detecting whether the acoustic input includes a voice command based on performing a plurality of processing stages on the acoustic input, wherein at least one of the plurality of processing stages is performed while the mobile device is operating in the low power mode, and using at least one contextual cue to assist in detecting whether the acoustic input includes a voice command.
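The staged detection described above can be sketched as a cheap energy gate (runnable in the low-power mode) followed by a costlier keyword check whose score is nudged by a contextual cue. All names, thresholds, and the additive cue model are hypothetical.

```python
# Illustrative multi-stage voice-command detection for a low-power mode.

def energy_gate(frame, threshold=0.01):
    """Stage 1: cheap mean-square energy test, suitable for low power."""
    return sum(s * s for s in frame) / len(frame) >= threshold

def detect_voice_command(frames, keyword_check, context_boost=0.0,
                         accept=0.5):
    """keyword_check: callable mapping the frame sequence to a score in
    [0, 1]; context_boost models a contextual cue (e.g. the device was
    just picked up) that assists detection."""
    if not any(energy_gate(f) for f in frames):   # stage 1: low-power gate
        return False
    score = keyword_check(frames) + context_boost  # stage 2: full analysis
    return score >= accept
```

Silence never reaches the expensive stage, while a borderline keyword score can be pushed over the acceptance threshold by a supporting contextual cue.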
A voice-based system is configured to process commands in a flexible format, for example, in which a wake word does not necessarily have to occur at the beginning of an utterance. As in natural speech, the system being addressed may be named within or at the end of a spoken utterance rather than at the beginning, or depending on the context, may not be named at all.
A vehicle and many dynamic features move relative to the same reference frame. An infotainment system responds to a request from an occupant of a vehicle to provide information concerning a particular dynamic feature. The occupant provides the infotainment system with information concerning a bearing to the dynamic feature and the infotainment system identifies the dynamic feature in response.
A vehicle defines an interior space and an exterior space. Within the vehicle are internal microphones that are disposed to capture an acoustic event that originated in an origination space, which is either the interior space or the exterior space. An infotainment system includes circuitry that forms a head unit having an acoustic-signal processor that is configured to receive, from the microphones, a sound vector indicative of the acoustic event and to identify the origination space based at least in part on the sound vector.
An apparatus comprising an infotainment system including a proactive automotive assistant that executes a first action and a second action, wherein the first action is that of permitting spontaneous communication to an occupant in a vehicle and the second action is that of providing information indicating that spontaneous communication with the occupant is impermissible. The automotive assistant is configured to receive information selected from the group consisting of vehicle-status information concerning operation of the vehicle and occupant-status information concerning the occupant and to base the first and second actions at least in part on the information.
H04M 1/72454 - User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions according to context-related or environment-related conditions
H04W 4/40 - Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P]
H04W 4/16 - Communication-related supplementary services, e.g. call-transfer or call-hold
H04M 3/42 - Systems providing special services or facilities to subscribers
H04M 1/72463 - User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions to restrict the functionality of the device
G08G 1/0962 - Arrangements for giving variable traffic instructions having an indicator mounted inside the vehicle, e.g. giving voice messages
68.
INTERACTIVE AUDIO ENTERTAINMENT SYSTEM FOR VEHICLES
A system for interacting with an audio stream to obtain lyric information, control playback of the audio stream, and control aspects of the audio stream. In some instances, end users can request that the audio stream play with a lead vocal track or without a lead vocal track. Obtaining lyric information includes receiving, via a text-to-speech module, an audio playback of the lyric information.
G10L 13/00 - Speech synthesis; Text to speech systems
G10L 25/54 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for retrieval
69.
PLATFORM FOR INTEGRATING DISPARATE ECOSYSTEMS WITHIN A VEHICLE
A system for integrating disparate ecosystems, including smart home and internet-of-things (IoT) ecosystems. The system includes a vehicle assistant that executes within the context of a cloud-based application and that retrieves sensor data and utterances from a vehicle and forwards the sensor data and passenger-spoken utterances to a cloud-based application. Using the sensor data and utterances, the cloud-based application selects and executes a predetermined routine that includes at least one action to be completed in the vehicle, on a mobile phone, or in a Smart Home/IoT ecosystem. The action is then completed by issuing the command to the vehicle head-unit, the specified mobile phone, or a target ecosystem selected from a group of disparate ecosystems.
H04W 4/44 - Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P] for communication between vehicles and infrastructures, e.g. vehicle-to-cloud [V2C] or vehicle-to-home [V2H]
H04L 12/28 - Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
An automotive assistant that is connected to microphones and loudspeakers that are associated with different seats in a passenger vehicle includes a dialog manager that is configured to initiate a dialog based on an utterance received at a first one of the microphones and to advance that dialog based on an utterance received from another of the microphones.
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
B60K 35/10 - Input arrangements, i.e. from user to vehicle, associated with vehicle functions or specially adapted therefor
B60K 35/60 - Instruments characterised by their location or relative disposition in or on vehicles (arrangements of lighting devices on dashboards B60Q 3/10)
G10L 15/20 - Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech
G10L 17/14 - Use of phonemic categorisation or speech recognition prior to speaker recognition or verification
A system for routing commands issued by a passenger of a vehicle to a Smart Home and/or an Internet of Things (IoT) ecosystem via a connection manager. Issued commands are obtained from utterances using speech recognition and analyzed using natural language understanding and natural language processing. Using the output of the natural language understanding analysis, the connection manager determines where to send the command by identifying a target Smart Home and/or IoT ecosystem.
G10L 15/18 - Speech classification or search using natural language modelling
H04L 67/125 - Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks involving control of end-device applications over a network
73.
PLATFORM FOR INTEGRATING DISPARATE ECOSYSTEMS WITHIN A VEHICLE
A system for integrating disparate ecosystems, including smart home and internet-of-things (IoT) ecosystems. The system includes a vehicle assistant that executes within the context of a cloud-based application and that retrieves sensor data and utterances from a vehicle and forwards the sensor data and passenger-spoken utterances to a cloud-based application. Using the sensor data and utterances, the cloud-based application selects and executes a predetermined routine that includes at least one action to be completed in the vehicle, on a mobile phone, or in a Smart Home/IoT ecosystem. The action is then completed by issuing the command to the vehicle head-unit, the specified mobile phone, or a target ecosystem selected from a group of disparate ecosystems.
H04L 67/125 - Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks involving control of end-device applications over a network
An automotive processing unit (30) includes an infotainment system (38) having a speech interface (48), an application suite (50) comprising one or more spatially-cognizant applications, and an automotive assistant (46) that is configured to execute one or more of the spatially-cognizant applications. The speech interface (48) is configured to receive a navigation announcement (49) from a navigator (32) and a touring announcement (51) from one of the spatially-cognizant applications and, in response, to cause a spoken announcement to be made audible in a vehicle's cabin (12) through a loudspeaker (22). The spoken announcement comprises content from at least one of the touring announcement (51) and the navigation announcement (49).
An automotive processing unit includes an infotainment system having a speech interface, an application suite comprising one or more spatially-cognizant applications, and an automotive assistant that is configured to execute one or more of the spatially-cognizant applications. The speech interface is configured to receive a navigation announcement from a navigator and a touring announcement from one of the spatially-cognizant applications and, in response, to cause a spoken announcement to be made audible in a vehicle's cabin through a loudspeaker. The spoken announcement comprises content from at least one of the touring announcement and the navigation announcement.
An apparatus for entertaining an entertainee in a passenger vehicle includes an infotainment system having an automotive assistant that executes a spatially-cognizant entertainment application that interacts with the entertainee. In doing so, the automotive assistant receives information about the vehicle's environment from peripheral devices connected to the infotainment system. This provides the entertainment application with spatial intelligence that it then uses while interacting with the entertainee.
A63F 13/215 - Input arrangements for video game devices characterised by their sensors, purposes or types comprising means for detecting acoustic signals, e.g. using a microphone
G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
A63F 13/213 - Input arrangements for video game devices characterised by their sensors, purposes or types comprising photodetecting means, e.g. cameras, photodiodes or infrared cells
A63F 13/216 - Input arrangements for video game devices characterised by their sensors, purposes or types using geographical information, e.g. location of the game device or player using GPS
A63F 13/285 - Generating tactile feedback signals via the game input device, e.g. force feedback
A63F 13/217 - Input arrangements for video game devices characterised by their sensors, purposes or types using environment-related information, i.e. information generated otherwise than by the player, e.g. ambient temperature or humidity
G06V 20/59 - Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
A vehicle includes a cabin, an internal-loudspeaker set, an external-microphone set, and a signal processor that filters a raw audio signal that has been received by the external-microphone set and broadcasts the resulting filtered audio signal into the cabin using the internal-loudspeaker set.
G08G 1/0962 - Arrangements for giving variable traffic instructions having an indicator mounted inside the vehicle, e.g. giving voice messages
G10K 11/178 - Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
In a parked vehicle, a head unit operating in low-power mode draws attention to a living being in the cabin. It does so by using a microphone to monitor said cabin for out-of-domain sounds, detecting a signal originating within said cabin, said signal being an acoustic signal that is representative of an out-of-domain sound, classifying said acoustic signal as being indicative of the existence of the living being within said cabin of said vehicle, and sending an alert to a first person.
B60W 50/14 - Means for informing the driver, warning the driver or prompting a driver intervention
B60R 25/10 - Fittings or systems for preventing or indicating unauthorised use or theft of vehicles actuating a signalling device
B60W 40/08 - Estimation or calculation of driving parameters for road vehicle drive control systems not related to the control of a particular sub-unit related to drivers or passengers
A haptic-communication system includes a fabric, a frame that supports the fabric, and a haptic element that is integral with a textile that forms the fabric. A transition of the haptic element between first and second states thereof provides communication with a person in contact with the haptic element.
A biometric corroborator receives an enhanced sound file, vehicular data provided by a vehicle, and telemetry data provided by the vehicle to a telematics server. It then uses these to corroborate each other before deciding whether to authenticate a proposed transaction.
G06Q 20/40 - Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check of credit lines or negative lists
B60R 25/25 - Means to switch the anti-theft system on or off using biometry
B60R 25/33 - Detection related to theft or to other events relevant to anti-theft systems of global position, e.g. by providing GPS coordinates
G01S 19/14 - Receivers specially adapted for specific applications
G06F 21/32 - User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
G07C 5/08 - Registering or indicating performance data other than driving, working, idle, or waiting time, with or without registering driving, working, idle, or waiting time
G10L 17/26 - Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
A method includes receiving a representation of a spoken utterance, processing the representation of the spoken utterance to identify, from a number of candidate domains, a request and a serving domain, and routing the request to a personal assistant based on the request and the serving domain. Identification of the serving domain is based on one or more of a contextual state, a behavior profile of a speaker of the utterance, and a semantic content of the utterance.
A biometric corroborator receives an enhanced sound file, vehicular data provided by a vehicle, and telemetry data provided by the vehicle to a telematics server. It then uses these to corroborate each other before deciding whether to authenticate a proposed transaction.
According to some aspects, a method of monitoring an acoustic environment of a mobile device, at least one computer readable medium encoded with instructions that, when executed, perform such a method and/or a mobile device configured to perform such a method is provided. The method comprises receiving acoustic input from the environment of the mobile device while the mobile device is operating in the low power mode, detecting whether the acoustic input includes a voice command based on performing a plurality of processing stages on the acoustic input, wherein at least one of the plurality of processing stages is performed while the mobile device is operating in the low power mode, and using at least one contextual cue to assist in detecting whether the acoustic input includes a voice command.
G08G 1/0965 - Arrangements for giving variable traffic instructions having an indicator mounted inside the vehicle, e.g. giving voice messages responding to signals from another vehicle, e.g. emergency vehicle
G08B 1/08 - Systems for signalling characterised solely by the form of transmission of the signal using electric transmission
G01S 3/80 - Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using ultrasonic, sonic, or infrasonic waves
G08G 1/00 - Traffic control systems for road vehicles
G08G 7/00 - Traffic control systems for simultaneous control of two or more different kinds of craft
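The staged low-power detection described above can be sketched as a cascade. This is an assumed design, not the patented method: the energy gate, the trigger phrase, and the way a contextual cue relaxes the threshold are all illustrative.

```python
# Hypothetical two-stage cascade: cheap stages run first in low-power mode.
def energy_stage(samples, threshold):
    """Stage 1 (low power): is there enough acoustic energy at all?"""
    return sum(s * s for s in samples) / len(samples) > threshold

def keyword_stage(transcript, keyword="hello car"):
    """Stage 2: does a lightweight recognizer see the trigger phrase?"""
    return keyword in transcript.lower()

def detect_voice_command(samples, transcript, contextual_cue=False):
    """Run the stages in order; a contextual cue relaxes the energy gate."""
    threshold = 0.05 if contextual_cue else 0.1
    if not energy_stage(samples, threshold):
        return False  # bail out early without waking the full recognizer
    return keyword_stage(transcript)
```

The cascade structure is what keeps the always-on path cheap: most acoustic input is rejected by the first stage before any expensive processing runs.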
A method for managing interactions between users of an interface and a number of voice assistants associated with the interface includes receiving a voice command from a user of the interface, determining a voice assistant of the number of voice assistants for servicing the command, and providing a representation of the voice command to the voice assistant for servicing.
A shared mobility vehicle hosts a moving "info kiosk" that provides information assistance to potential passengers (or other individuals) and to on-board passengers. The approach is applicable to human-operated vehicles, and is particularly applicable to autonomous vehicles where no human operator is available to provide assistance.
A method for managing location-aware reminders in an automobile includes monitoring a geographic location of the automobile using a computer system installed in the vehicle. The computer system detects that the automobile has entered a geographic region associated with a location-aware reminder and issues a reminder message associated with the location-aware reminder to a driver of the automobile based on the detecting.
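The geofence test behind such a location-aware reminder might look as follows. The region shape (a circular radius around a point), the radius, and the haversine distance are assumptions for illustration, not taken from the abstract.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    to_rad = math.radians
    dlat = to_rad(lat2 - lat1)
    dlon = to_rad(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(to_rad(lat1)) * math.cos(to_rad(lat2)) * math.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def due_reminders(vehicle_pos, reminders, radius_km=0.5):
    """Return reminder messages whose geographic region the vehicle entered."""
    lat, lon = vehicle_pos
    return [r["message"] for r in reminders
            if haversine_km(lat, lon, r["lat"], r["lon"]) <= radius_km]
```

In practice the check would run each time the monitored vehicle position updates.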
A method detects presence of a multi-tone siren type in an acoustic signal. The multi-tone siren type is associated with one or more siren patterns, where each siren pattern includes a number of time patterns at corresponding frequencies. The method includes processing a number of frequency components of a frequency domain representation of the acoustic signal over time to determine a corresponding plurality of values. That processing includes determining, for each frequency component, a value characterizing a presence of a time pattern associated with at least one siren pattern. The method also includes processing the values according to the siren patterns to determine a detection result indicating whether the multi-tone siren type is present in the acoustic signal.
G10L 25/00 - Speech or voice analysis techniques not restricted to a single one of groups
G08G 1/0965 - Arrangements for giving variable traffic instructions having an indicator mounted inside the vehicle, e.g. giving voice messages responding to signals from another vehicle, e.g. emergency vehicle
G05D 1/00 - Control of position, course, altitude, or attitude of land, water, air, or space vehicles, e.g. automatic pilot
B60W 60/00 - Drive control systems specially adapted for autonomous road vehicles
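The siren-detection method above can be sketched loosely as follows. The frequencies, the on/off time patterns, and the matching threshold are illustrative assumptions: each frequency component gets a value for how strongly its expected time pattern is present, and the per-pattern values are then combined into a detection result.

```python
# A two-tone "hi-lo" siren: each entry maps a frequency (Hz) to the on/off
# time pattern expected at that frequency across four analysis frames.
SIREN_PATTERNS = [
    {440: [1, 0, 1, 0], 580: [0, 1, 0, 1]},
]

def pattern_value(observed, expected):
    """Fraction of frames where observed activity matches the time pattern."""
    matches = sum(1 for o, e in zip(observed, expected) if (o > 0.5) == bool(e))
    return matches / len(expected)

def detect_siren(spectrogram, patterns=SIREN_PATTERNS, threshold=0.9):
    """spectrogram maps frequency -> activity per frame; detect any pattern."""
    for pattern in patterns:
        values = [pattern_value(spectrogram.get(freq, [0.0] * len(tp)), tp)
                  for freq, tp in pattern.items()]
        if min(values) >= threshold:  # every component must show its pattern
            return True
    return False
```

Requiring every frequency component to exhibit its time pattern is what distinguishes an alternating multi-tone siren from, say, a sustained horn occupying the same bands.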
Low complexity detection of voiced speech and pitch estimation
A low-complexity method and apparatus for detection of voiced speech and pitch estimation is disclosed that is capable of dealing with special constraints given by applications where low latency is required, such as in-car communication (ICC) systems. An example embodiment employs very short frames that may capture only a single excitation impulse of voiced speech in an audio signal. A distance between multiple such impulses, corresponding to a pitch period, may be determined by evaluating phase differences between low-resolution spectra of the very short frames. An example embodiment may perform pitch estimation directly in a frequency domain based on the phase differences and reduce computational complexity by obviating transformation to a time domain to perform the pitch estimation. In an event the phase differences are determined to be substantially linear, an example embodiment enhances voice quality of the voiced speech by applying speech enhancement to the audio signal.
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
G10L 25/84 - Detection of presence or absence of voice signals for discriminating voice from noise
G10L 25/90 - Pitch determination of speech signals
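A toy sketch of the phase-difference idea: if one short frame is a delayed copy of another (delay d samples, DFT size N), the per-bin phase difference is -2*pi*k*d/N, so the delay can be read off the phase slope without returning to the time domain. The DFT size, the test signals, and the simple slope average below are simplifications; the actual ICC method uses very short frames and low-resolution spectra.

```python
import cmath
import math

def dft(frame):
    """Naive DFT, sufficient for a small illustrative frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def estimate_delay(frame_a, frame_b, bins=2):
    """Estimate the delay (in samples) of frame_b relative to frame_a from
    the phase differences of the lowest non-DC bins. Only bins where the
    phase has not wrapped past pi are used."""
    n = len(frame_a)
    spec_a, spec_b = dft(frame_a), dft(frame_b)
    estimates = []
    for k in range(1, bins + 1):
        dphi = cmath.phase(spec_b[k] / spec_a[k])  # wrapped to (-pi, pi]
        estimates.append(-dphi * n / (2 * math.pi * k))
    return sum(estimates) / len(estimates)
```

Checking that the per-bin estimates agree (i.e., that the phase differences are substantially linear in k) would correspond to the voicing decision described in the abstract.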
A system for detecting the presence of a person in an automobile includes a computing device and a microelectromechanical system (MEMS) sensor integrated into the automobile and configured to generate sensor data representing movement of the person in the automobile, the MEMS sensor being in operative communication with the computing device. The computing device is configured to process the sensor data from the MEMS sensor to detect the presence of a person in the automobile when the automobile is parked.
G08B 21/22 - Status alarms responsive to presence or absence of persons
G08B 21/24 - Reminder alarms, e.g. anti-loss alarms
G08B 25/01 - Alarm systems in which the location of the alarm condition is signalled to a central station, e.g. fire or police telegraphic systems characterised by the transmission medium
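A minimal sketch of the parked-car occupant check: the variance threshold and the idea of gating on the parked state are assumptions for illustration; a real system would process richer MEMS sensor data.

```python
# Hypothetical presence test over a window of MEMS accelerometer samples.
def occupant_present(samples, parked, threshold=1e-4):
    """Flag a likely occupant when the parked car's MEMS signal still moves."""
    if not parked or not samples:
        return False  # only monitor while the automobile is parked
    mean = sum(samples) / len(samples)
    variance = sum((s - mean) ** 2 for s in samples) / len(samples)
    return variance > threshold  # residual motion suggests a person inside
```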
In-car personalization methods/systems for applying personalized settings to car functionality and for allowing multiple passengers in a car to apply personalized settings to their individual locations in the car according to stored preferences. The methods/systems provide multi-user profile selection using one or more of: (1) key-less multi-user profile selection; (2) biometric multi-user profile selection; and/or (3) a combination of multi-modal technologies for key-less and biometric multi-user profile selection. The disclosed methods/systems combine multiple available sensors to solve the complementary tasks of: (1) detecting the presence of a person; (2) performing a coarse classification of the occupants (e.g., driver vs. passenger; child vs. adolescent/adult); (3) seat-based localization of detected occupants; and (4) identification of a specific user.
B60R 16/037 - Electric or fluid circuits specially adapted for vehicles and not otherwise provided for; Arrangement of elements of electric or fluid circuits specially adapted for vehicles and not otherwise provided for electric for occupant comfort
B60N 2/00 - Seats specially adapted for vehicles; Arrangement or mounting of seats in vehicles
G06F 21/32 - User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
H04S 7/00 - Indicating arrangements; Control arrangements, e.g. balance control
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
A system and/or method receives speech input including an accent. The accent is classified with an accent classifier to yield an accent classification. Automatic speech recognition is performed based on the speech input and the accent classification to yield an automatic speech recognition output. Natural language understanding is performed on the speech recognition output and the accent classification to determine an intent of the speech recognition output. Natural language generation generates an output based on the speech recognition output, the intent, and the accent classification. An output is rendered using text to speech based on the natural language generation and the accent classification.
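Schematically, a single accent classification is threaded through every stage of the pipeline. The stage internals below are stand-ins (a real system would use acoustic and language models), intended only to show the data flow.

```python
# Hypothetical accent-conditioned pipeline; every stage receives the accent.
def classify_accent(audio):
    """Stand-in classifier over a toy audio feature."""
    return "en-IN" if audio.get("pitch_range", 0) > 1.2 else "en-US"

def recognize(audio, accent):
    """Accent-conditioned ASR (simulated: the transcript is given)."""
    return audio["transcript"]

def understand(text, accent):
    """Accent-aware NLU: toy intent detection over the recognized text."""
    return "navigate" if "go to" in text else "chat"

def respond(text, intent, accent):
    """NLG + TTS stub: the response is tagged with the accent used to voice it."""
    return {"intent": intent, "say": f"OK: {text}", "voice": accent}

def accent_pipeline(audio):
    accent = classify_accent(audio)          # classified once...
    text = recognize(audio, accent)          # ...then used by ASR,
    intent = understand(text, accent)        # NLU,
    return respond(text, intent, accent)     # and NLG/TTS alike
```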
A system and method for providing avatar device (115, 125, 135, 145) status indicators for voice assistants in multi-zone vehicles. The method comprises: receiving at least one signal from a plurality of microphones (114, 124, 134, 144), wherein each microphone (114, 124, 134, 144) is associated with one of a plurality of spatial zones (110, 120, 130, 140) and one of a plurality of avatar devices (115, 125, 135, 145); wherein the at least one signal further comprises a speech signal component from a speaker; wherein the speech signal component is a voice command or question; sending zone information associated with the speaker and with one of the plurality of spatial zones (110, 120, 130, 140) to an avatar device (115, 125, 135, 145); and activating one of the plurality of avatar devices (115, 125, 135, 145) in a respective one of the plurality of spatial zones (110, 120, 130, 140) associated with the speaker.
A system and method for an errand plotter considers user criteria to generate an optimized errand plot for an errand involving one or more tasks at one or more locations. A computing device obtains a plurality of tasks input by a user. Each task has an associated location and associated user criteria. The user criteria associated with each task are analyzed to determine which tasks of the plurality of tasks have a temperature-sensitive attribute or a user-defined urgency attribute. A task sequence is determined by arranging the tasks that have a user-defined urgency attribute first, the tasks that have a temperature-sensitive attribute last, and the remaining tasks therebetween. The computing device generates an errand plot based on the determined task sequence.
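The ordering rule described above translates directly into code. The task field names below (`urgent`, `temperature_sensitive`) are assumed for illustration; urgent tasks go first, temperature-sensitive tasks last, and everything else falls in between.

```python
# Sketch of the errand-plotter task sequencing rule.
def plan_errands(tasks):
    """Order tasks: user-urgent first, temperature-sensitive last."""
    urgent = [t for t in tasks if t.get("urgent")]
    temp_sensitive = [t for t in tasks
                      if t.get("temperature_sensitive") and not t.get("urgent")]
    middle = [t for t in tasks
              if not t.get("urgent") and not t.get("temperature_sensitive")]
    return urgent + middle + temp_sensitive
```

Putting temperature-sensitive stops last minimizes the time perishable items spend in the car, while urgent stops are reached as early as possible.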
There is provided an automated speech recognition system that applies weights to grapheme-to-phoneme models, and interpolates pronunciations from combinations of the models, to recognize utterances of foreign named entities for naive, informed, and in-between pronunciations.
While current voice assistants can respond to voice requests, creating smarter assistants that leverage location, past requests, and user data to enhance responses to future requests and to provide robust data about locations is desirable. A method for enhancing a geolocation database (“database”) associates a user-initiated triggering event with a location in a database by sensing user position and orientation within the vehicle and a position and orientation of the vehicle. The triggering event is detected by sensors arranged within a vehicle with respect to the user. The method determines a point of interest (“POI”) near the location based on the user-initiated triggering event. The method, responsive to the user-initiated triggering event, updates the database based on information related to the user-initiated triggering event at an entry of the database associated with the POI. The database and voice assistants can leverage the enhanced data about the POI for future requests.
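An assumed sketch of the database-update step: a triggering event near a known point of interest appends its information to that POI's entry. The schema, the flat-earth distance approximation, and the search radius are illustrative, not taken from the abstract.

```python
# Hypothetical geolocation-database update keyed on the nearest POI.
def nearest_poi(db, location, max_km=0.2):
    """Pick the closest known POI within max_km of the event location
    (flat-earth approximation: ~111 km per degree, fine at city scale)."""
    if not db:
        return None
    def dist_km(a, b):
        return 111.0 * ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
    d, name = min((dist_km(location, poi["location"]), name)
                  for name, poi in db.items())
    return name if d <= max_km else None

def record_event(db, location, event):
    """Attach the triggering event's information to the nearest POI entry."""
    poi = nearest_poi(db, location)
    if poi is not None:
        db[poi].setdefault("events", []).append(event)
    return poi
```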
A method for selecting a speech recognition result on a computing device includes receiving a first speech recognition result determined by the computing device, receiving first features, at least some of the features being determined using the first speech recognition result, determining whether to select the first speech recognition result or to wait for a second speech recognition result determined by a cloud computing service based at least in part on the first speech recognition result and the first features.
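The accept-or-wait decision above might be sketched like this. The feature names and the confidence threshold are invented for illustration: accept the on-device result when its features look reliable, otherwise wait for the cloud recognizer.

```python
# Hypothetical local-vs-cloud ASR result selection.
def select_result(local_result, features, confidence_threshold=0.85):
    """Return ('accept', text) or ('wait', None) for the cloud result."""
    confident = features.get("confidence", 0.0) >= confidence_threshold
    risky = features.get("contains_rare_words", False)
    if confident and not risky:
        return "accept", local_result  # respond immediately, low latency
    return "wait", None  # defer to the (slower, stronger) cloud recognizer
```

The trade-off is latency against accuracy: accepting the local result answers faster, while waiting lets the cloud service handle harder utterances.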