Soundhound AI IP Holding, LLC

United States of America

1-100 of 134 for Soundhound AI IP Holding, LLC

Date
2024	7
2023	11
2022	14
2021	32
2020	10
Before 2020	60
IPC class
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue	52
G10L 15/18 - Speech classification or search using natural language modelling	42
G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications	24
G06F 17/30 - Information retrieval; Database structures therefor	19
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit	19
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice	19
G10L 15/08 - Speech classification or search	16
G06F 40/30 - Semantic analysis	15
G10L 15/26 - Speech to text systems	15
G06Q 30/02 - Marketing; Price estimation or determination; Fundraising	14
G10L 15/00 - Speech recognition	14
G06F 3/16 - Sound input; Sound output	12
G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction	11
G06F 40/205 - Parsing	9
G06F 17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions	7
G10L 15/19 - Grammatical context, e.g. disambiguation of recognition hypotheses based on word sequence rules	7
G06F 40/253 - Grammatical analysis; Style critique	6
G10L 15/16 - Speech classification or search using artificial neural networks	6
G06F 40/40 - Processing or translation of natural language	5
G06F 40/211 - Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars	4
G10L 15/04 - Segmentation; Word boundary detection	4
G10L 15/05 - Word boundary detection	4
G10L 15/197 - Probabilistic grammars, e.g. word n-grams	4
G06F 16/2452 - Query translation	3
G06F 16/2457 - Query processing with adaptation to user needs	3
G06N 20/00 - Machine learning	3
G06N 3/04 - Architecture, e.g. interconnection topology	3
G06N 3/08 - Learning methods	3
G10H 5/00 - Instruments in which the tones are generated by means of electronic generators	3
G10L 13/00 - Speech synthesis; Text to speech systems	3
Status
Pending	23
Registered / In force	111

Results for

patents

1. SEMANTICALLY CONDITIONED VOICE ACTIVITY DETECTION

Application number	18047650
Status	Pending
Filing date	2022-10-19
First publication date	2024-07-11
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Leitman, Victor

Abstract

A method includes recognizing words comprised by a first utterance; interpreting the recognized words according to a grammar comprised by a domain; from the interpreting of the recognized words, determining a timeout period for the first utterance based on the domain of the first utterance; detecting end of voice activity in the first utterance; and executing an instruction following an amount of time after detecting end of voice activity of the first utterance in response to the amount of time exceeding the timeout period, the executed instruction based at least in part on interpreting the recognized words.

IPC classes

G10L 15/197 - Probabilistic grammars, e.g. word n-grams
G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 25/78 - Detection of presence or absence of voice signals
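
The timeout logic described in the abstract of entry 1 can be illustrated with a minimal Python sketch. The domain names, the timeout values, and the helper names `timeout_for` and `should_execute` are all assumptions made for illustration; the abstract does not specify any of them.

```python
# Hypothetical per-domain timeouts in seconds; the abstract gives no values.
DOMAIN_TIMEOUTS = {"navigation": 1.5, "messaging": 3.0}
DEFAULT_TIMEOUT = 2.0


def timeout_for(domain: str) -> float:
    """Return the end-of-utterance timeout chosen for the interpreted domain."""
    return DOMAIN_TIMEOUTS.get(domain, DEFAULT_TIMEOUT)


def should_execute(domain: str, silence_seconds: float) -> bool:
    """True once the silence after end of voice activity exceeds the timeout."""
    return silence_seconds > timeout_for(domain)
```

Under these assumed values, two seconds of silence would trigger execution for a "navigation" utterance but not for a "messaging" utterance, where the speaker is more likely to pause mid-sentence.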

2. REAL-TIME NATURAL LANGUAGE PROCESSING AND FULFILLMENT

Application number	18055821
Status	Pending
Filing date	2022-11-15
First publication date	2024-05-16
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Grossmann, Jon; Macrae, Robert; Halstvedt, Scott; Mohajer, Keyvan

Abstract

A system and method of real-time feedback confirmation to solicit a virtual assistant response from an evolving semantic state of at least a portion of an utterance. A user accesses a virtual assistant on an electronic device having the system and/or method configured to capture a command, a question, and/or a fulfillment request from audio, such as the speech emitted from the speaking user. The speech may be intercepted by a speech engine configured to transcribe the speech into text that is matched with the fragment pattern's regular expression to generate a fragment and/or the speech may be processed with a machine learning model to identify fragments. The fragments are identified by a domain handler configured to update a data structure of the current semantic state of the utterance in real-time on an interface of an electronic device.

IPC classes

G10L 15/18 - Speech classification or search using natural language modelling
G06F 3/16 - Sound input; Sound output
G06F 40/30 - Semantic analysis
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
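
As a rough illustration of the regular-expression fragment matching mentioned in the abstract of entry 2, the sketch below updates a semantic state as a transcript grows. The food-ordering patterns, the slot names, and the function `update_state` are hypothetical; the abstract does not describe the actual patterns or data structure.

```python
import re

# Hypothetical fragment patterns for a food-ordering domain.
FRAGMENT_PATTERNS = {
    "quantity": re.compile(r"\b(one|two|three)\b"),
    "item": re.compile(r"\b(burger|fries|soda)s?\b"),
}


def update_state(state: dict, partial_transcript: str) -> dict:
    """Merge fragments matched in the transcript so far into the semantic state."""
    for slot, pattern in FRAGMENT_PATTERNS.items():
        match = pattern.search(partial_transcript)
        if match:
            state[slot] = match.group(1)
    return state


# Simulate a transcript that grows while the user is still speaking.
state = {}
for partial in ["I would like", "I would like two", "I would like two burgers"]:
    update_state(state, partial)
```

Each partial transcript refines the state, so an interface could display the evolving interpretation before the utterance ends.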

3. DOMAIN SPECIFIC NEURAL SENTENCE GENERATOR FOR MULTI-DOMAIN VIRTUAL ASSISTANTS

Application number	18050182
Status	Pending
Filing date	2022-10-27
First publication date	2024-05-02
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Singh, Pranav; Zhang, Yilun; Na, Eunjee; Bettaglio, Olivia

Abstract

Techniques for automatically generating sentences that a user can say to invoke a set of defined actions performed by a virtual assistant are disclosed. A sentence is received and keywords are extracted from the sentence. Based on the keywords, additional sentences are generated. A classifier model is applied to the generated sentences to determine a sentence that satisfies a threshold. If a sentence satisfies the threshold, an intent associated with the classifier model can be invoked. If the sentences fail to satisfy the classifier model, the virtual assistant can attempt to interpret the received sentence according to the most likely intent by invoking a sentence generation model fine-tuned for a particular domain, generate additional sentences with a high probability of having the same intent, and fulfill the specific action defined by the intent.

IPC classes

G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue

4. TEXT-TO-SPEECH SYSTEM WITH VARIABLE FRAME RATE

Application number	18051507
Status	Pending
Filing date	2022-10-31
First publication date	2024-05-02
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Pearson, Steve; Grossman, Jon

Abstract

A neural TTS system is trained to generate key acoustic frames at variable rates while omitting other frames. The frame skipping depends on the acoustic features to be generated for the input text. The TTS system can interpolate frames between the key frames at a target rate for a vocoder to synthesize audio samples.

IPC classes

G10L 13/047 - Architecture of speech synthesisers
G10L 13/06 - Elementary speech units used in speech synthesisers; Concatenation rules
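
The interpolation step in the abstract of entry 4 can be pictured with a small Python sketch. It assumes linear interpolation and scalar frame values, neither of which the abstract states (real acoustic frames would be feature vectors), and the function name `interpolate_frames` is hypothetical.

```python
def interpolate_frames(key_frames):
    """Fill in every integer frame index between (index, value) key frames.

    key_frames: list of (frame_index, value) pairs sorted by frame_index,
    where frame_index counts frames at the vocoder's target rate.
    """
    frames = []
    for (i0, v0), (i1, v1) in zip(key_frames, key_frames[1:]):
        for i in range(i0, i1):
            ratio = (i - i0) / (i1 - i0)
            frames.append(v0 + ratio * (v1 - v0))
    frames.append(key_frames[-1][1])  # keep the final key frame
    return frames
```

Key frames that are far apart produce several interpolated frames, while adjacent key frames pass through unchanged, which is how a variable key-frame rate could still feed a fixed-rate vocoder.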

5. ADAPTING AN UTTERANCE CUT-OFF PERIOD WITH USER SPECIFIC PROFILE DATA

Application number	18401770
Status	Pending
Filing date	2024-01-02
First publication date	2024-04-25
Owner	SOUNDHOUND AI IP HOLDING, LLC (USA); SOUNDHOUND AI IP, LLC (USA)
Inventor(s)	Aguayo, Patricia Pozon; Zhang, Jennifer Hee Young; Probell, Jonah

Abstract

A system detects a period of non-voice activity and compares its duration to a cutoff period. The system adapts the cutoff period based on parsing previously-recognized speech of a user that is stored on a user's device or the system, which detects the voice activity, to determine according to a model, such as a machine-learned model, the probability that the speech recognized so far is a prefix to a longer complete utterance. The cutoff period is longer when a parse of previously recognized speech, which is based on the user profile, has a high probability of being a prefix of a longer utterance.

IPC classes

G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 15/05 - Word boundary detection
G10L 15/18 - Speech classification or search using natural language modelling
G10L 25/78 - Detection of presence or absence of voice signals

6. Automatic Speech Recognition with Voice Personalization and Generalization

Application number	18046137
Status	Pending
Filing date	2022-10-12
First publication date	2024-04-18
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Mohajer, Keyvan

Abstract

A voice morphing model can transform diverse voices to one or a small number of target voices. An acoustic model can be trained for high accuracy on the target voices. Speech recognition on diverse voices can be performed by morphing them to a target voice and then performing recognition on audio with the target voice. The morphing model and an acoustic model for speech recognition can be trained separately or jointly. A source of requests for speech recognition can pass audio and a voiceprint with requests. Speech recognition can run with improved accuracy by biasing an acoustic model for the voice in the audio using the voiceprint. The audio can be used to calculate a new voiceprint, which can be used to update the voiceprint included with the audio. The updated voiceprint can be sent back to the source and then used with future speech recognition requests.

IPC classes

G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice

7. Authorization of Action by Voice Identification

Application number	17818628
Status	Pending
Filing date	2022-08-09
First publication date	2024-02-15
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Hassan, Ahmadul; Hom, James

Abstract

Actions are authorized by computing a confidence score that exceeds a threshold. The confidence score is based on a match between metadata about requests and fields in corresponding database records. The confidence score weights matches by the dependability of the metadata for authentication. The confidence score is further based on the closeness of a sample of speech audio to a stored voiceprint. Additional identification may be required for authorization. The confidence score requirement may be relaxed based on identification in a buffer of recent action requests.

IPC classes

G06F 21/32 - User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
G10L 17/12 - Score normalisation
G06F 3/16 - Sound input; Sound output
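
One way to picture the weighted confidence score in the abstract of entry 7 is the sketch below. The field names, the weights, the equal split between metadata and voiceprint similarity, and the 0.7 threshold are all assumptions for illustration; the abstract does not give a formula.

```python
# Hypothetical dependability weights per metadata field (they sum to 1.0).
FIELD_WEIGHTS = {"phone_number": 0.5, "device_id": 0.3, "zip_code": 0.2}


def confidence(request: dict, record: dict, voice_similarity: float) -> float:
    """Combine weighted metadata matches with voiceprint similarity (0..1)."""
    metadata_score = sum(
        weight
        for field, weight in FIELD_WEIGHTS.items()
        if field in request and request[field] == record.get(field)
    )
    # Assumed equal weighting of the two evidence sources.
    return 0.5 * metadata_score + 0.5 * voice_similarity


def authorize(request: dict, record: dict, voice_similarity: float,
              threshold: float = 0.7) -> bool:
    """Authorize the action only when the confidence exceeds the threshold."""
    return confidence(request, record, voice_similarity) > threshold
```

A more dependable field such as an assumed phone number contributes more to the score than a weaker one, so a partial metadata match can still pass when the voice sample is close to the stored voiceprint.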

8. PRE-WAKEWORD SPEECH PROCESSING

Application number	17804544
Status	Pending
Filing date	2022-05-27
First publication date	2023-11-30
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Stahl, Karl; Mont-Reynaud, Bernard

Abstract

Methods and systems for pre-wakeword speech processing are disclosed. Speech audio, comprising command speech spoken before a wakeword, may be stored in a buffer in oldest to newest order. Upon detection of the wakeword, reverse acoustic models and language models, such as reverse automatic speech recognition (R-ASR), can be applied to the buffered audio, in newest to oldest order, starting from before the wakeword. The speech is converted into a sequence of words. Natural language grammar models, such as natural language understanding (NLU), can be applied to match the sequence of words to a complete command, the complete command being associated with invoking a computer operation.

IPC classes

G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 15/08 - Speech classification or search
G10L 25/93 - Discriminating between voiced and unvoiced parts of speech signals

9. APPARATUS, PLATFORM, METHOD AND MEDIUM FOR INTENTION IMPORTANCE INFERENCE

Application number	17820660
Status	Pending
Filing date	2022-08-18
First publication date	2023-11-30
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Wang, Chong

Abstract

The application provides an apparatus, platform, method and medium for intention importance inference. The apparatus includes an interface configured to receive user-related information; and a processor coupled to the interface and configured to: extract data related to different aspects of a user from the user-related information; generate a plurality of intention probes based on the data related to different aspects of the user, each intention probe comprising an intention and associated data items; infer an importance of each intention probe by calculating a score of each associated data item of the intention probe based on the data related to different aspects of the user; and provide information associated with an intention probe with a highest importance.

IPC classes

G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 15/16 - Speech classification or search using artificial neural networks
G06F 16/9535 - Search customisation based on user profiles and personalisation

10. SYSTEMS AND METHODS FOR GENERATING AND USING SHARED NATURAL LANGUAGE LIBRARIES

Application number	18206567
Status	Pending
Filing date	2023-06-06
First publication date	2023-10-12
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Mohajer, Keyvan

Abstract

Systems and methods for searching databases by sound data input are provided herein. A service provider may have a need to make their database(s) searchable through search technology. However, the service provider may not have the resources to implement such search technology. The search technology may allow for search queries using sound data input. The technology described herein provides a solution addressing the service provider’s need, by giving a search technology that furnishes search results in a fast, accurate manner. In further embodiments, systems and methods to monetize those search results are also described herein.

IPC classes

G06F 16/33 - Querying
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G10L 15/183 - Speech classification or search using natural language modelling using context dependencies, e.g. language models
G06F 16/174 - Redundancy elimination performed by the file system

11. Ordering from a menu using natural language

Application number	17716482
Patent number	12124804
Status	Granted - in force
Filing date	2022-04-08
First publication date	2023-09-14
Grant date	2024-10-22
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Aung, Joe Kyaw Soe; Garcia, Vincent; Ren, Junru

Abstract

A computer system ingests a catalog of a plurality of items. The catalog is specific to a particular domain and includes names for individual items of the plurality of items. One or more attributes are respectively associated with the individual items of the plurality of items. A specialist grammar specific to the particular domain of the catalog is obtained, and programming language code to interpret natural language input related to the catalog is generated using the specialist grammar, the names for the individual items of the plurality of items, and their associated one or more attributes.

IPC classes

G06F 17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
G06F 40/295 - Named entity recognition
G06F 40/40 - Processing or translation of natural language
G06N 5/022 - Knowledge engineering; Knowledge acquisition
G10L 15/26 - Speech to text systems

12. Token confidence scores for automatic speech recognition

Application number	17649810
Patent number	12223948
Status	Granted - in force
Filing date	2022-02-03
First publication date	2023-08-03
Grant date	2025-02-11
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Singh, Pranav; Mishra, Saraswati; Na, Eunjee

Abstract

Methods and systems for correction of a likely erroneous word in a speech transcription are disclosed. By evaluating token confidence scores of individual words or phrases, the automatic speech recognition system can replace a low-confidence score word with a substitute word or phrase. Among various approaches, neural network models can be used to generate individual confidence scores. Such word substitution can enable the speech recognition system to automatically detect and correct likely errors in transcription. Furthermore, the system can indicate the token confidence scores on a graphic user interface for labeling and dictionary enhancement.

IPC classes

G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
G10L 15/26 - Speech to text systems
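
The low-confidence word substitution described in the abstract of entry 12 can be sketched as follows. The 0.5 threshold, the `substitutes` lookup table, and the function name `correct_transcript` are hypothetical; the abstract mentions neural models for scoring but does not say how substitutes are chosen.

```python
def correct_transcript(tokens, substitutes, threshold=0.5):
    """Replace tokens whose confidence is below threshold when a substitute is known.

    tokens: list of (word, confidence) pairs from a recognizer.
    substitutes: hypothetical mapping from a suspect word to its replacement.
    """
    corrected = []
    for word, score in tokens:
        if score < threshold and word in substitutes:
            corrected.append(substitutes[word])
        else:
            corrected.append(word)
    return corrected
```

High-confidence tokens are never touched, so a substitute entry only takes effect when the recognizer itself was unsure of the word.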

13. VIDEO CONFERENCE CAPTIONING

Application number	18298282
Status	Pending
Filing date	2023-04-10
First publication date	2023-08-03
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Coeytaux, Ethan

Abstract

A video conferencing system, such as one implemented with a cloud server, receives audio streams from a plurality of endpoints. The system uses automatic speech recognition to transcribe speech in the audio streams. The system multiplexes the transcriptions into individual caption streams and sends them to the endpoints, but the caption stream to each endpoint omits the transcription of audio from the endpoint. Some systems allow muting of audio through an indication to the system. The system then omits sending the muted audio to other endpoints and also omits sending a transcription of the muted audio to other endpoints.

IPC classes

G10L 15/26 - Speech to text systems
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
G10L 19/005 - Correction of errors induced by the transmission channel, if related to the coding algorithm
G10L 15/19 - Grammatical context, e.g. disambiguation of recognition hypotheses based on word sequence rules
G10L 15/14 - Speech classification or search using statistical models, e.g. Hidden Markov Models [HMM]
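
The caption multiplexing in the abstract of entry 13 amounts to building, for each endpoint, a stream that leaves out that endpoint's own speech and any muted speech. The sketch below shows that filtering under the assumption that transcripts are already available as a dictionary keyed by endpoint; the function name `caption_streams` is hypothetical.

```python
def caption_streams(transcripts: dict, muted=frozenset()) -> dict:
    """Build, per endpoint, the captions of all other unmuted endpoints.

    transcripts: mapping from endpoint id to its transcribed text.
    muted: endpoint ids whose audio and captions must not reach others.
    """
    return {
        receiver: {
            speaker: text
            for speaker, text in transcripts.items()
            if speaker != receiver and speaker not in muted
        }
        for receiver in transcripts
    }
```

A muted endpoint still receives captions from everyone else; only its outgoing transcription is withheld.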

14. METHOD AND APPARATUS FOR INTELLIGENT VOICE QUERY

Application number	17654635
Status	Pending
Filing date	2022-03-14
First publication date	2023-07-27
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Wang, Chong

Abstract

A method and an apparatus for processing an intelligent voice query are disclosed. A voice query input is received from a user. Automatic speech recognition and natural language understanding generate structured query data. The structured query data is modified based on an input adaptation rule to obtain modified structured query data appropriate for a content providing server, which provides a query result output corresponding to the modified structured query data. Input adaptation rules may comprise rule sets based on behavior patterns of the user and/or business recommendations. The query result output can be used for natural language generation, which may have similar adaptation rules for output.

IPC classes

G06F 16/2452 - Query translation
G06F 16/242 - Query formulation
G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue

15. METHOD AND SYSTEM FOR ASSISTING A USER

Application number	17561548
Status	Pending
Filing date	2021-12-23
First publication date	2023-06-29
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Mohajer, Keyvan; Kam, Kaishin; Pierret, Christophe

Abstract

A method of assisting a user. The method includes obtaining a plurality of rules having condition components and action components, the action components specifying conversation schemas, detecting, by a sensor, a fact related to an environment of the user, identifying a rule, of the plurality of rules, having a condition component that is satisfied by the detected fact, initiating a conversation with the user according to a conversation schema of the action component of the rule of the plurality of rules, and performing an action in response to a positive statement by the user.

IPC classes

G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G01C 21/36 - Input/output arrangements for on-board computers

16. CONTROLLING A GRAPHICAL USER INTERFACE BY TELEPHONE

Application number	17408476
Status	Pending
Filing date	2021-08-22
First publication date	2023-02-23
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Mohajer, Kamyar; Mohajer, Keyvan; Hom, James; Jiang, Evelyn

Abstract

A method and system for controlling a GUI on a user's network-connected device, the control being provided by a telephone call between the user and a speech recognition and speech synthesis system. An example of a restaurant ordering system is provided. The user calls a phone number and is guided through a verbal ordering process that includes one or more of: adding an item, deleting an item, changing quantities, changing sizes, and changing details of an item. The user's choices are added to a display so that a current status of the order is visible to the user. The GUI is updated as changes are made to the order. The GUI can also request additional information, upsell items, and show menus. The GUI aids the user in confirming that the order is correct. The system provides the final order to a restaurant for fulfillment.

IPC classes

G06Q 30/06 - Buying, selling or leasing transactions
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 13/02 - Methods for producing synthetic speech; Speech synthesisers
G06F 3/16 - Sound input; Sound output
G06F 16/955 - Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
G06F 9/451 - Execution arrangements for user interfaces

17. Differential spatial rendering of audio sources

Application number	17655650
Patent number	11589184
Status	Granted - in force
Filing date	2022-03-21
First publication date	2023-02-21
Grant date	2023-02-21
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Mont-Reynaud, Bernard

Abstract

Methods and systems for intuitive spatial audio rendering with improved intelligibility are disclosed. By establishing a virtual association between an audio source and a location in the listener's virtual audio space, a spatial audio rendering system can generate spatial audio signals that create a natural and immersive audio field for a listener. The system can receive the virtual location of the source as a parameter and map the source audio signal to a source-specific multi-channel audio signal. In addition, the spatial audio rendering system can be interactive and dynamically modify the rendering of the spatial audio in response to a user's active control or tracked movement.

IPC classes

H04S 7/00 - Indicating arrangements; Control arrangements, e.g. balance control

18. Using a smartphone to control another device by voice

Application number	17372123
Patent number	11950300
Status	Granted - in force
Filing date	2021-07-09
First publication date	2023-01-12
Grant date	2024-04-02
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Tsuchida, Keisuke

Abstract

A method and system for implementing a speech-enabled interface of a host device via an electronic mobile device in a network are provided. The method includes establishing a communication session between the host device and the mobile device via a session service provider. According to some embodiments, a barcode can be adopted to enable the pairing of the host device and mobile device. Furthermore, the present method and system employ the voice interface in conjunction with speech recognition systems and natural language processing to interpret voice input for the host device, which can be used to perform one or more actions related to the host device.

IPC classes

H04W 76/11 - Allocation or use of connection identifiers
G10L 15/08 - Speech classification or search
H04W 4/50 - Service provisioning or reconfiguring

19. Sidebar conversations

Application number	17353639
Patent number	11539920
Status	Granted - in force
Filing date	2021-06-21
First publication date	2022-12-22
Grant date	2022-12-27
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Stonehocker, Timothy P

Abstract

A system and a method are disclosed that enable sidebar conversations between two or more attendees that are participating in a primary or main meeting. The sidebar conversation occurs in conjunction or concurrently with the primary meeting. A first attendee provides commands to indicate a desire to initiate a sidebar conversation and information about a targeted attendee. The commands are analyzed to determine if a trigger phrase is included. The commands are analyzed to determine if there is an identification of a second (targeted) attendee, who is currently participating in the main meeting. If the second attendee is available, then the sidebar conversation is initiated. Additional attendees can be added to the sidebar conversation. Additional independent and simultaneous sidebar conversations can be initiated (by attendees currently participating in the active sidebar conversation), thereby allowing one attendee to conduct multiple simultaneous sidebar conversations while being able to switch between them.

IPC classes

H04N 7/15 - Conference systems
H04L 65/403 - Arrangements for multi-party communication, e.g. for conferences
H04L 65/1069 - Session establishment or de-establishment
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 25/57 - Speech or voice analysis techniques not restricted to a single one of groups, specially adapted for particular use, for comparison or discrimination, for processing of video signals
G06F 3/16 - Sound input; Sound output
G10L 15/08 - Speech classification or search

20. Method for providing information, method for generating database, and program

Application number	17649052
Patent number	11995143
Status	Granted - in force
Filing date	2022-01-26
First publication date	2022-12-01
Grant date	2024-05-28
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Naito, Masaki; Tsuchida, Keisuke; Yoneyama, Jun; Sawada, Kaku

Abstract

As audio (1) is input to an extension of a browser, the extension transmits the audio (1) to a language processing server. A speech recognition unit obtains a text (1) corresponding to the audio (1), and transmits the text (1) to a natural language understanding unit. In the natural language understanding unit, an information processing unit identifies a URL (1) corresponding to the text (1), and transmits the URL (1) to the browser. The extension passes the URL (1) to a browsing function. The browsing function uses the URL (1) to access a web server. The web server transmits a web page (1) corresponding to the URL (1) to the browser. The browsing function shows a screen corresponding to the web page (1) on a display.

IPC classes

G06F 16/95 - Retrieval from the web
G06F 16/33 - Querying
G06F 16/955 - Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
G06F 40/40 - Processing or translation of natural language
G10L 15/26 - Speech to text systems

21. API FOR SERVICE PROVIDER FULFILLMENT OF DATA PRIVACY REQUESTS

Application number	17237705
Status	Pending
Filing date	2021-04-22
First publication date	2022-10-27
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Qiu, Kevin; Jiang, Evelyn; Eichstaedt, Matthias; Heit, Warren S.

Abstract

A system and method are disclosed for fulfilling GDPR and other privacy requests in a client device system as well as a downstream service provider with which the client device system partners. In examples, the downstream service provider may be a voice assistant service provider providing voice recognition and language understanding capabilities to an upstream client device system.

IPC classes

G06F 21/62 - Protecting access to data via a platform, e.g. using keys or access control rules

22. Adapting an utterance cut-off period based on parse prefix detection

Application number	17698623
Patent number	11862162
Status	Granted - in force
Filing date	2022-03-18
First publication date	2022-06-30
Grant date	2024-01-02
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Aguayo, Patricia Pozon; Zhang, Jennifer Hee Young; Probell, Jonah

Abstract

A processing system detects a period of non-voice activity and compares its duration to a cutoff period. The system adapts the cutoff period based on parsing previously-recognized speech to determine, according to a model, such as a machine-learned model, the probability that the speech recognized so far is a prefix to a longer complete utterance. The cutoff period is longer when a parse of previously recognized speech has a high probability of being a prefix of a longer utterance.

Classes IPC

G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G10L 15/05 - Détection des limites de mots
G10L 25/78 - Détection de la présence ou de l’absence de signaux de voix
G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel
G10L 15/08 - Classement ou recherche de la parole
G10L 15/30 - Reconnaissance distribuée, p. ex. dans les systèmes client-serveur, pour les applications en téléphonie mobile ou réseaux
G10L 15/19 - Contexte grammatical, p. ex. désambiguïsation des hypothèses de reconnaissance par application des règles de séquence de mots

23. SYSTEM AND METHOD FOR COMPUTING REGION CENTERS BY POINT CLUSTERING

Numéro d'application	17549796
Statut	En instance
Date de dépôt	2021-12-13
Date de la première publication	2022-06-16
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Pierret, Christophe

Abrégé

A system and a method are disclosed that calculate the center of a geographic region. A set of topological/geographical points is received. A set of clusters is determined. A weight for each cluster is computed. The highest weighted cluster is selected. The geographic region center is calculated using the selected cluster. The geographical points can include a key for each point and be filtered by an indicated key before calculating the center of a geographic location.
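
The steps in this abstract can be sketched as follows. The abstract does not name a clustering algorithm or a weighting scheme, so this sketch assumes grid-cell clustering, uses the point count as the cluster weight, and takes the mean of the selected cluster as the center; `region_center`, `wanted_key`, and `cell_size` are hypothetical names.

```python
from collections import defaultdict


def region_center(points, wanted_key=None, cell_size=1.0):
    """Estimate a region center from (lat, lon, key) points.

    Returns a (lat, lon) tuple, or None when no points remain after filtering.
    """
    # Optional filtering by the indicated key.
    if wanted_key is not None:
        points = [p for p in points if p[2] == wanted_key]

    # Assumed clustering: bucket points into square grid cells.
    clusters = defaultdict(list)
    for lat, lon, _ in points:
        cell = (int(lat // cell_size), int(lon // cell_size))
        clusters[cell].append((lat, lon))
    if not clusters:
        return None

    # Assumed weight: number of points in the cluster.
    heaviest = max(clusters.values(), key=len)
    n = len(heaviest)
    return (sum(p[0] for p in heaviest) / n, sum(p[1] for p in heaviest) / n)
```

Averaging raw latitude and longitude is only reasonable for small regions away from the antimeridian; a real implementation would likely need a more careful mean.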

Classes IPC

G06K 9/62 - Méthodes ou dispositions pour la reconnaissance utilisant des moyens électroniques

24. System and Method For Achieving Interoperability Through The Use of Interconnected Voice Verification System

Numéro d'application	17108724
Statut	En instance
Date de dépôt	2020-12-01
Date de la première publication	2022-06-02
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mohajer, Keyvan Heit, Warren S.

Abrégé

A system and method are disclosed for achieving interoperability and access to a personal extension knowledge/preference database (PEKD) through interconnected voice verification systems. Devices from various companies and systems can link to a voice verification system (VVS). Users can also enroll with the VVS so that the VVS can authenticate them by personal wake phrases. Thereafter, users can access their PEKD from un-owned devices by speaking their wake phrase.

Classes IPC

G10L 17/24 - Procédures interactives; Interfaces homme-machine l’utilisateur étant incité à prononcer un mot de passe ou une phrase prédéfinie
G10L 17/04 - Entraînement, enrôlement ou construction de modèle
G06F 21/32 - Authentification de l’utilisateur par données biométriques, p. ex. empreintes digitales, balayages de l’iris ou empreintes vocales
H04L 29/06 - Commande de la communication; Traitement de la communication caractérisés par un protocole
G06N 20/00 - Apprentissage automatique
G06F 16/25 - Systèmes d’intégration ou d’interfaçage impliquant les systèmes de gestion de bases de données

25. NEURAL SENTENCE GENERATOR FOR VIRTUAL ASSISTANTS

Numéro d'application	17455727
Statut	En instance
Date de dépôt	2021-11-19
Date de la première publication	2022-05-26
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Singh, Pranav Mohajer, Keyvan Zhang, Yilun

Abrégé

Methods and systems are disclosed for automatically generating sample phrases or sentences that a user can say to invoke a set of defined actions performed by a virtual assistant. Using fine-tuned general-purpose natural language models, the system can generate potential, accurate utterance sentences based on extracted keywords or an input utterance sentence. Furthermore, domain-specific datasets can be used to train the pre-trained, general-purpose natural language models via unsupervised learning. These generated sentences can improve the efficiency of configuring a virtual assistant. The system can further optimize the effectiveness of a virtual assistant in understanding the user, which can enhance the user experience of communicating with it.

Classes IPC

G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel
G10L 15/06 - Création de gabarits de référence; Entraînement des systèmes de reconnaissance de la parole, p. ex. adaptation aux caractéristiques de la voix du locuteur
G10L 15/02 - Extraction de caractéristiques pour la reconnaissance de la parole; Sélection d'unités de reconnaissance
G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine

26. RECOMMENDATION ENGINE FOR UPSELLING IN RESTAURANT ORDERS

Numéro d'application	17667535
Statut	En instance
Date de dépôt	2022-02-08
Date de la première publication	2022-05-26
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mohajer, Kamyar Macrae, Robert

Abrégé

A computer-implemented method is provided to support a food ordering system for food items from a menu of a restaurant using natural language. Expressions made for ordering are used to recommend a food item that a user has a high probability of wanting to include in an order. The recommendation engine is trained using machine learning. Expressions are collected and parsed to identify words that might indicate food items offered by the restaurant. The words are provided to a restaurant owner to identify food items on a menu, with which the words are associated.

Classes IPC

G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G06F 16/2457 - Traitement des requêtes avec adaptation aux besoins de l’utilisateur
G10L 17/00 - Techniques d'identification ou de vérification du locuteur
G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel
G10L 15/30 - Reconnaissance distribuée, p. ex. dans les systèmes client-serveur, pour les applications en téléphonie mobile ou réseaux
G06F 16/242 - Formulation des requêtes
G06F 16/22 - Indexation; Structures de données à cet effet; Structures de stockage

27. Text-to-Speech Adapted by Machine Learning

Numéro d'application	17580289
Statut	En instance
Date de dépôt	2022-01-20
Date de la première publication	2022-05-12
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mont-Reynaud, Bernard Almudafar-Depeyrot, Monika

Abrégé

Machine-learned models take in vectors representing desired behaviors and generate voice vectors that provide the parameters for text-to-speech (TTS) synthesis. Models may be trained on behavior vectors that include user profile attributes, situational attributes, or semantic attributes. Situational attributes may include age of people present, music that is playing, location, noise, and mood. Semantic attributes may include presence of proper nouns, number of modifiers, emotional charge, and domain of discourse. TTS voice parameters may apply per utterance and per word so as to enable contrastive emphasis.

Classes IPC

G10L 13/10 - Règles de prosodie dérivées du texte; Intonation ou accent tonique
G10L 13/04 - Détails des systèmes de synthèse de la parole, p. ex. structure du synthétiseur ou gestion de la mémoire
G10L 13/033 - Édition de voix, p. ex. transformation de la voix du synthétiseur

28. DRIVER INTERFACE WITH VOICE AND GESTURE CONTROL

Numéro d'application	17547917
Statut	En instance
Date de dépôt	2021-12-10
Date de la première publication	2022-05-05
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Li, Zili Vasconcelos, Cristina

Abrégé

A driver interface for use within an automobile provides responses to voice commands issued, for example, by a driver of the automobile. The interface includes a camera and microphone for capturing image data, such as gestures, and audio data from the automobile driver. The image data and audio data are processed to extract image and linguistic features, and these features are processed to interpret and infer a meaning of the voice command.

Classes IPC

G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G10L 15/02 - Extraction de caractéristiques pour la reconnaissance de la parole; Sélection d'unités de reconnaissance
G10L 15/30 - Reconnaissance distribuée, p. ex. dans les systèmes client-serveur, pour les applications en téléphonie mobile ou réseaux
G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel
G10L 15/187 - Contexte phonémique, p. ex. règles de prononciation, contraintes phonotactiques ou n-grammes de phonèmes
G10L 15/24 - Reconnaissance de la parole utilisant des caractéristiques non acoustiques
G10L 15/06 - Création de gabarits de référence; Entraînement des systèmes de reconnaissance de la parole, p. ex. adaptation aux caractéristiques de la voix du locuteur
G06K 9/62 - Méthodes ou dispositions pour la reconnaissance utilisant des moyens électroniques
G10L 15/16 - Classement ou recherche de la parole utilisant des réseaux neuronaux artificiels
G06V 10/40 - Extraction de caractéristiques d’images ou de vidéos
G06V 10/70 - Dispositions pour la reconnaissance ou la compréhension d’images ou de vidéos utilisant la reconnaissance de formes ou l’apprentissage automatique
G06V 20/40 - Scènes; Éléments spécifiques à la scène dans le contenu vidéo

29. Using phonetic variants in a local context to improve natural language understanding

Numéro d'application	16529689
Numéro de brevet	11295730
Statut	Délivré - en vigueur
Date de dépôt	2019-08-01
Date de la première publication	2022-04-05
Date d'octroi	2022-04-05
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mohajer, Keyvan Wilson, Christopher Mont-Reynaud, Bernard

Abrégé

A method is described that includes processing text and speech from an input utterance using local overrides of default dictionary pronunciations. Applying this method, a word-level grammar used to process the tokens specifies at least one local word phonetic variant that applies within a specific production rule and, within a local context of the specific production rule, the local word phonetic variant overrides one or more default dictionary phonetic versions of the word. This method can be applied to parsing utterances where the pronunciation of some words depends on their syntactic or semantic context.

Classes IPC

G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel
G10L 15/19 - Contexte grammatical, p. ex. désambiguïsation des hypothèses de reconnaissance par application des règles de séquence de mots

30. System and method for providing natural language recommendations

Numéro d'application	16447958
Numéro de brevet	11276398
Statut	Délivré - en vigueur
Date de dépôt	2019-06-20
Date de la première publication	2022-03-15
Date d'octroi	2022-03-15
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Macrae, Robert Mohajer, Kamyar

Abrégé

A system includes a stand-alone device, or a server-connected client device in communication with a server, that provides recommendations. The stand-alone device includes an input component, a storage component, a processor, and an output component. The server-connected client device includes an input component that receives the user's request, a communication component that communicates the request to the server and receives the recommendation from the server, and an output component that provides the recommendation to the user.

Classes IPC

G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G06F 16/242 - Formulation des requêtes
G06F 16/2457 - Traitement des requêtes avec adaptation aux besoins de l’utilisateur
G06F 16/22 - Indexation; Structures de données à cet effet; Structures de stockage
G10L 15/30 - Reconnaissance distribuée, p. ex. dans les systèmes client-serveur, pour les applications en téléphonie mobile ou réseaux
G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel
G10L 17/00 - Techniques d'identification ou de vérification du locuteur

31. Conditional responses to application commands in a client-server system

Numéro d'application	16791421
Numéro de brevet	11250217
Statut	Délivré - en vigueur
Date de dépôt	2020-02-14
Date de la première publication	2022-02-15
Date d'octroi	2022-02-15
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mohajer, Keyvan Wilson, Christopher S. Khov, Kheng Graves, Ian

Abrégé

A client device receives a user request (e.g., in natural language form) to execute a command of an application. The client device delegates interpretation of the request to a response-processing server. Using domain knowledge previously provided by a developer of the application, the response-processing server determines the various possible responses that client devices could make in response to the request based on circumstances such as the capabilities of the client devices and the state of the application data. The response-processing server accordingly generates a response package that describes a number of different conditional responses that client devices could have to the request and provides the response package to the client device. The client device selects the appropriate response from the response package based on the circumstances as determined by the client device, executes the command (if possible), and provides the user with some representation of the response.
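
The client-side selection step can be sketched as below. The abstract does not specify a format for the response package, so the list-of-dictionaries layout, the condition names, and `select_response` are all hypothetical; the sketch only illustrates a client picking the first conditional response that matches its own circumstances.

```python
def select_response(response_package, client_state):
    """Return the first response whose conditions all hold for this client.

    response_package: ordered list of {"conditions": {...}, "response": ...}.
    client_state: circumstances as determined by the client device.
    """
    for candidate in response_package:
        conditions = candidate.get("conditions", {})
        if all(client_state.get(name) == value
               for name, value in conditions.items()):
            return candidate["response"]
    return None


# Hypothetical package a server might return for one request.
package = [
    {"conditions": {"has_display": True, "item_found": True},
     "response": "show_item"},
    {"conditions": {"item_found": True}, "response": "speak_item"},
    {"conditions": {}, "response": "say_not_found"},
]
```

A client without a display that finds the item would select `"speak_item"`; the final entry with no conditions acts as a fallback.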

Classes IPC

G06F 40/30 - Analyse sémantique
H04L 29/08 - Procédure de commande de la transmission, p.ex. procédure de commande du niveau de la liaison

32. System and method for interpreting natural language commands with compound criteria

Numéro d'application	17081996
Numéro de brevet	11238101
Statut	Délivré - en vigueur
Date de dépôt	2020-10-27
Date de la première publication	2022-02-01
Date d'octroi	2022-02-01
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mohajer, Keyvan

Abrégé

A command-processing server receives a natural language command from a user. The command-processing server has a set of domain command interpreters corresponding to different domains in which commands can be expressed, such as the domain of entertainment, or the domain of travel. Some or all of the domain command interpreters recognize user commands having a verbal prefix, an optional pre-filter, an object, and an optional post-filter; the pre- and post-filters may be compounded expressions involving multiple atomic filters. Different developers may independently specify the domain command interpreters and the sub-structure interpreters on which they are based.

Classes IPC

G06F 16/9032 - Formulation de requêtes
G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel
G06F 16/2457 - Traitement des requêtes avec adaptation aux besoins de l’utilisateur
G06F 3/16 - Entrée acoustique; Sortie acoustique
G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
H04N 21/482 - Interface pour utilisateurs finaux pour la sélection de programmes
G10L 15/26 - Systèmes de synthèse de texte à partir de la parole

33. Support for grammar inflections within a software development framework

Numéro d'application	17474680
Numéro de brevet	11797777
Statut	Délivré - en vigueur
Date de dépôt	2021-09-14
Date de la première publication	2021-12-30
Date d'octroi	2023-10-24
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mont-Reynaud, Bernard Taron, Seth

Abrégé

A natural language understanding server includes grammars specified in a modified extended Backus-Naur form (MEBNF) that includes an agglutination metasymbol not supported by conventional EBNF grammar parsers, as well as an agglutination preprocessor. The agglutination preprocessor applies one or more sets of agglutination rewrite rules to the MEBNF grammars, transforming them to EBNF grammars that can be processed by conventional EBNF grammar parsers. Permitting grammars to be specified in MEBNF form greatly simplifies the authoring and maintenance of grammars supporting inflected forms of words in the languages described by the grammars.

Classes IPC

G06F 40/30 - Analyse sémantique
G10L 15/197 - Grammaires probabilistes, p. ex. n-grammes de mots
G06F 8/41 - Compilation
G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel
G06F 40/205 - Analyse syntaxique
G06F 40/253 - Analyse grammaticale; Corrigé du style

34. Configurable neural speech synthesis

Numéro d'application	17341082
Numéro de brevet	11741941
Statut	Délivré - en vigueur
Date de dépôt	2021-06-07
Date de la première publication	2021-12-16
Date d'octroi	2023-08-29
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Richards, Andrew

Abrégé

A discriminator trained on labeled samples of speech can compute probabilities of voice properties. A speech synthesis generative neural network that takes in text and continuous scale values of voice properties is trained to synthesize speech audio that the discriminator will infer as matching the values of the input voice properties. Voice parameters can include speaker voice parameters, accents, and attitudes, among others. Training can be done by transfer learning from an existing neural speech synthesis model or such a model can be trained with a loss function that considers speech and parameter values. A graphical user interface can allow voice designers for products to synthesize speech with a desired voice or generate a speech synthesis engine with frozen voice parameters. A vector of parameters can be used for comparison to previously registered voices in databases such as ones for trademark registration.

Classes IPC

G10L 13/047 - Architecture des synthétiseurs de parole
G10L 13/08 - Analyse de texte ou génération de paramètres pour la synthèse de la parole à partir de texte, p. ex. conversion graphème-phonème, génération de prosodie ou détermination de l'intonation ou de l'accent tonique
G10L 13/033 - Édition de voix, p. ex. transformation de la voix du synthétiseur
G10L 15/26 - Systèmes de synthèse de texte à partir de la parole
G06N 3/084 - Rétropropagation, p. ex. suivant l’algorithme du gradient
G06N 3/04 - Architecture, p. ex. topologie d'interconnexion
G06F 3/16 - Entrée acoustique; Sortie acoustique
G06F 3/04847 - Techniques d’interaction pour la commande des valeurs des paramètres, p. ex. interaction avec des règles ou des cadrans

35. Interpreting Queries According To Preferences

Numéro d'application	17389847
Statut	En instance
Date de dépôt	2021-07-30
Date de la première publication	2021-11-18
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mohajer, Keyvan Mont-Reynaud, Bernard Wilson, Christopher S.

Abrégé

The present invention extends to methods, systems, and computer program products for interpreting queries according to preferences. Multi-domain natural language understanding systems can support a variety of different types of clients. Queries can be received and interpreted across one or more domains. Preferred query interpretations can be identified and query responses provided based on any of: domain preferences, preferences indicated by an identifier, or (e.g., weighted) scores exceeding a threshold.
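
One of the selection criteria named above, weighted scores exceeding a threshold, can be sketched as follows. The abstract gives no scoring formula, so multiplying each interpretation score by a per-domain weight is an assumption, and `preferred_interpretation`, `domain_weights`, and the default threshold are hypothetical.

```python
def preferred_interpretation(interpretations, domain_weights, threshold=0.5):
    """Pick the domain whose weighted score is highest and above threshold.

    interpretations: list of (domain, score) pairs from the domain interpreters.
    domain_weights: assumed per-client preference weights (default weight 1.0).
    Returns the chosen domain name, or None if nothing exceeds the threshold.
    """
    best = None
    best_score = threshold
    for domain, score in interpretations:
        weighted = score * domain_weights.get(domain, 1.0)
        if weighted > best_score:
            best, best_score = domain, weighted
    return best
```

For example, a client that prefers the music domain could pass `{"music": 1.5}` so that a slightly lower-scoring music interpretation wins over a weather interpretation.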

Classes IPC

G06F 40/30 - Analyse sémantique
G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel
G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G10L 15/30 - Reconnaissance distribuée, p. ex. dans les systèmes client-serveur, pour les applications en téléphonie mobile ou réseaux

36. Virtual assistant domain functionality

Numéro d'application	17383097
Numéro de brevet	11836453
Statut	Délivré - en vigueur
Date de dépôt	2021-07-22
Date de la première publication	2021-11-11
Date d'octroi	2023-12-05
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mohajer, Kamyar Mohajer, Keyvan Mont-Reynaud, Bernard Singh, Pranav

Abrégé

Aspects include methods, systems, and computer-program products providing virtual assistant domain functionality. A natural language query including one or more words is received. A collection of natural language modules is accessed. The natural language modules in the collection are configured to process sets of natural language queries. A natural language module, from the collection of natural language modules, is identified to interpret the natural language query. An interpretation of the natural language query is computed using the identified natural language module. A response to the natural language query is returned using the computed interpretation.

Classes IPC

G06F 40/40 - Traitement ou traduction du langage naturel
G10L 15/30 - Reconnaissance distribuée, p. ex. dans les systèmes client-serveur, pour les applications en téléphonie mobile ou réseaux
G06Q 30/0283 - Estimation ou détermination de prix
G06Q 20/10 - Architectures de paiement spécialement adaptées aux systèmes de transfert électronique de fonds; Architectures de paiement spécialement adaptées aux systèmes de banque à domicile
G06F 40/211 - Parsage syntaxique, p. ex. basé sur une grammaire hors contexte ou sur des grammaires d’unification

37. Method and system for acoustic model conditioning on non-phoneme information features

Numéro d'application	17224967
Numéro de brevet	11741943
Statut	Délivré - en vigueur
Date de dépôt	2021-04-07
Date de la première publication	2021-10-28
Date d'octroi	2023-08-29
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Gowayyed, Zizu Mohajer, Keyvan

Abrégé

A method and system for acoustic model conditioning on non-phoneme information features for optimized automatic speech recognition are provided. The method includes using an encoder model to encode a sound embedding from a known key phrase of speech and conditioning an acoustic model with the sound embedding to optimize its performance in inferring the probabilities of phonemes in the speech. The sound embedding can comprise non-phoneme information related to the key phrase and the following utterance. Further, the encoder model and the acoustic model can be neural networks that are jointly trained with audio data.

Classes IPC

G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G10L 15/02 - Extraction de caractéristiques pour la reconnaissance de la parole; Sélection d'unités de reconnaissance
G10L 15/04 - Segmentation; Détection des limites de mots

38. Loudspeaker with transmitter

Numéro d'application	17301308
Numéro de brevet	11627405
Statut	Délivré - en vigueur
Date de dépôt	2021-03-31
Date de la première publication	2021-10-07
Date d'octroi	2023-04-11
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Stahl, Karl

Abrégé

A speaker device includes an electroacoustic transducer configured to convert an audio signal into a set of sound waves and a transmitter configured to transmit an electromagnetic signal that carries the audio signal for receipt at distances limited to an audibility range of the set of sound waves. The audibility range of the set of sound waves corresponds to a distance at which the set of sound waves is estimated to be below a predetermined sound level.
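
The audibility range described above can be estimated with a short sketch. The abstract does not say how the distance is estimated, so this example assumes free-field, inverse-square attenuation from a point source (about 6 dB per doubling of distance); the function names and the 1 m reference distance are hypothetical.

```python
def audibility_range(level_at_ref_db, threshold_db, ref_distance_m=1.0):
    """Distance (m) at which the sound level is estimated to reach threshold_db.

    Assumes L(d) = L_ref - 20 * log10(d / d_ref), i.e. free-field spreading.
    """
    return ref_distance_m * 10 ** ((level_at_ref_db - threshold_db) / 20.0)


def within_transmit_range(distance_m, level_at_ref_db, threshold_db):
    """True if a receiver at distance_m is inside the estimated audible range."""
    return distance_m <= audibility_range(level_at_ref_db, threshold_db)
```

Under these assumptions, a speaker producing 80 dB at 1 m with a 40 dB threshold gives an estimated range of about 100 m; real rooms with walls and reflections would behave differently.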

Classes IPC

H04R 25/00 - Appareils pour sourds
H04R 1/10 - Écouteurs; Leurs fixations
G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G10L 21/0316 - Amélioration de l'intelligibilité de la parole, p. ex. réduction de bruit ou annulation d'écho en changeant l’amplitude
G10L 25/06 - Techniques d'analyse de la parole ou de la voix qui ne se limitent pas à un seul des groupes caractérisées par le type de paramètres extraits les paramètres extraits étant des coefficients de corrélation
G10L 25/51 - Techniques d'analyse de la parole ou de la voix qui ne se limitent pas à un seul des groupes spécialement adaptées pour un usage particulier pour comparaison ou différentiation
H04R 1/08 - Embouchures; Leurs fixations
H04R 5/033 - Casques pour communication stéréophonique

39. Framework for identifying distinct questions in a composite natural language query

Numéro d'application	16292190
Numéro de brevet	11138205
Statut	Délivré - en vigueur
Date de dépôt	2019-03-04
Date de la première publication	2021-10-05
Date d'octroi	2021-10-05
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mohajer, Keyvan Mont-Reynaud, Bernard Hubert, Philipp

Abrégé

A query-processing server provides natural language services to applications. More specifically, the query-processing server receives and stores domain knowledge information from application developers, the domain knowledge information comprising a linguistic description of the natural language user queries that application developers wish their applications to support. A first portion of the domain knowledge information is applied to transform a natural language query received from an application to an ordered sequence of question elements. A second portion of the domain knowledge information is applied to group the ordered sequence of question elements into a plurality of distinct structured questions posed by the natural language query. The distinct structured questions may then be provided to the application, which may then execute them and obtain the corresponding data referenced by the questions.

Classes IPC

G06F 16/00 - Recherche d’informations; Structures de bases de données à cet effet; Structures de systèmes de fichiers à cet effet
G06F 16/2457 - Traitement des requêtes avec adaptation aux besoins de l’utilisateur
G06F 16/2455 - Exécution des requêtes
G06F 40/40 - Traitement ou traduction du langage naturel

40. Framework for understanding complex natural language queries in a dialog context

Numéro d'application	16363929
Numéro de brevet	11132504
Statut	Délivré - en vigueur
Date de dépôt	2019-03-25
Date de la première publication	2021-09-28
Date d'octroi	2021-09-28
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mont-Reynaud, Bernard Wilson, Christopher S Mohajer, Keyvan

Abrégé

A domain-independent framework parses and interprets compound natural language queries in the context of a conversation between a human and an agent. Generic grammar rules and corresponding semantics support the understanding of compound queries in the conversation context. The sub-queries themselves are from one or more domains, and they are parsed and interpreted by a pre-existing grammar, covering one or more pre-existing domains. The pre-existing grammar, extended by the generic rules, recognizes all compound queries based on any queries recognized by the pre-existing grammar. Use of the disclosed framework requires little or no change in the domain-specific NLU handling code. The framework defines a generic approach to propagating context data between sub-queries of a compound query. The framework can be further extended to propagate intra-query context data in, out and across query components. Complex query results, and other data such as accounting data, can also be propagated simultaneously with dialog context data in a consolidated intra-query context data structure.

Classes IPC

G06F 40/205 - Analyse syntaxique
G06F 40/30 - Analyse sémantique

41. Deriving acoustic features and linguistic features from received speech audio

Numéro d'application	17325114
Numéro de brevet	12175964
Statut	Délivré - en vigueur
Date de dépôt	2021-05-19
Date de la première publication	2021-09-02
Date d'octroi	2024-12-24
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Lokeswarappa, Kiran Garaga Gedalius, Joel Mont-Reynaud, Bernard Huang, Jun

Abrégé

A computer-implemented method is provided. The method includes receiving speech audio of dictation associated with a user ID, deriving acoustic features from the speech audio, storing the derived acoustic features in a user profile associated with the user ID, receiving a request for acoustic features through an application programming interface (API), the request including the user ID, and sending the derived acoustic features through the API.

Classes IPC

G10L 15/00 - Reconnaissance de la parole
G06F 40/205 - Analyse syntaxique
G06F 40/211 - Parsage syntaxique, p. ex. basé sur une grammaire hors contexte ou sur des grammaires d’unification
G06F 40/253 - Analyse grammaticale; Corrigé du style
G06N 20/00 - Apprentissage automatique
G06Q 30/0241 - Publicités
G06Q 30/0251 - Publicités ciblées
G10L 15/02 - Extraction de caractéristiques pour la reconnaissance de la parole; Sélection d'unités de reconnaissance
G10L 15/06 - Création de gabarits de référence; Entraînement des systèmes de reconnaissance de la parole, p. ex. adaptation aux caractéristiques de la voix du locuteur
G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel
G10L 25/90 - Détermination de la hauteur tonale des signaux de parole
H04L 67/306 - Profils des utilisateurs
G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G10L 15/26 - Systèmes de synthèse de texte à partir de la parole
G10L 25/51 - Techniques d'analyse de la parole ou de la voix qui ne se limitent pas à un seul des groupes spécialement adaptées pour un usage particulier pour comparaison ou différentiation

42. Semantic grammar extensibility within a software development framework

Numéro d'application	16505185
Numéro de brevet	11100291
Statut	Délivré - en vigueur
Date de dépôt	2019-07-08
Date de la première publication	2021-08-24
Date d'octroi	2021-08-24
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mohajer, Keyvan Wilson, Christopher S. Mont-Reynaud, Bernard

Abrégé

A query-processing server that interprets natural language expressions supports the extension of a first semantic grammar (for a particular type of expression), which is declared extensible, by a second semantic grammar (for another type of expression). When an extension is requested, the query-processing server checks that the two semantic grammars have compatible semantic types. The developers need not have any knowledge of each other, or about their respective grammars. Performing an extension may be done by yet another party, such as the query-processing server, or another server, independently of all previous parties. The use of semantic grammar extensions provides a way to expand the coverage and functionality of natural language interpretation in a simple and flexible manner, so that new forms of expression may be supported, and seamlessly combined with pre-existing interpretations. Finally, in some implementations, this is done without loss of efficiency.

Classes IPC

G06F 40/30 - Analyse sémantique
G06F 8/20 - Conception de logiciels

43. Factored neural networks for language modeling

Numéro d'application	16228278
Numéro de brevet	11100288
Statut	Délivré - en vigueur
Date de dépôt	2018-12-20
Date de la première publication	2021-08-24
Date d'octroi	2021-08-24
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Gowayyed, Zizu Mont-Reynaud, Bernard

Abrégé

A factored neural network estimates a conditional distribution of token probabilities using two smaller models, a class model and an index model. Every token has a unique class, and a unique index in the class. The two smaller models are trained independently but cooperate at inference time. Factoring with more than two models is possible. Networks can be recurrent. Factored neural networks for statistical language modelling treat words as tokens. In that context, classes capture linguistic regularities. Partitioning of words into classes keeps the number of classes and the maximum size of a class both low. Optimization of partitioning is by iteratively splitting and assembling classes.
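
The factoring described above amounts to P(token | context) = P(class | context) × P(index | class, context). The sketch below shows only that inference-time combination; it assumes the class model and the index model each emit raw logits, with `index_logits` already being the index model's output for the token's class. The function names are hypothetical and no neural network is trained here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_probability(class_logits, index_logits, token_class, token_index):
    """Combine the two smaller models' outputs into one token probability."""
    p_class = softmax(class_logits)[token_class]
    p_index = softmax(index_logits)[token_index]
    return p_class * p_index
```

The appeal of this design is output-layer size: with, say, 100 classes of at most 100 words each, the two softmaxes cover 10,000 words using two 100-way outputs instead of one 10,000-way output.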

Classes IPC ?

G06F 40/30 - Analyse sémantique
G06N 3/04 - Architecture, p. ex. topologie d'interconnexion
G06N 3/08 - Méthodes d'apprentissage
G06F 40/284 - Analyse lexicale, p. ex. segmentation en unités ou cooccurrence

44. Neural acoustic model

Numéro d'application	16790643
Numéro de brevet	11392833
Statut	Délivré - en vigueur
Date de dépôt	2020-02-13
Date de la première publication	2021-08-19
Date d'octroi	2022-07-19
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Wieman, Maisy Spencer, Andrew Carl Lǐ, Zìlì Vasconcelos, Cristina

Abrégé

An audio processing system is described. The audio processing system uses a convolutional neural network architecture to process audio data, a recurrent neural network architecture to process at least data derived from an output of the convolutional neural network architecture, and a feed-forward neural network architecture to process at least data derived from an output of the recurrent neural network architecture. The feed-forward neural network architecture is configured to output classification scores for a plurality of sound units associated with speech. The classification scores indicate a presence of one or more sound units in the audio data. The convolutional neural network architecture has a plurality of convolutional groups arranged in series, where a convolutional group includes a combination of two data mappings arranged in parallel.

Classes IPC ?

G06N 3/08 - Méthodes d'apprentissage
G10L 15/16 - Classement ou recherche de la parole utilisant des réseaux neuronaux artificiels
G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G06N 3/04 - Architecture, p. ex. topologie d'interconnexion

45. Wake suppression for audio playing and listening devices

Numéro d'application	16781214
Numéro de brevet	11328721
Statut	Délivré - en vigueur
Date de dépôt	2020-02-04
Date de la première publication	2021-08-05
Date d'octroi	2022-05-10
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Yang, Hsuan Zhang, Qìndí Heit, Warren S.

Abrégé

A system and method are disclosed for ignoring a wakeword received at a speech-enabled listening device when it is determined that the wakeword is reproduced audio from an audio-playing device. Determination can be by detecting audio distortions, by an ignore flag sent locally between an audio-playing device and a speech-enabled device, by an ignore flag sent from a server, by comparison of received audio or played audio to a wakeword within an audio-playing device or a speech-enabled device, and by other means.

Classes IPC ?

G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G10L 15/30 - Reconnaissance distribuée, p. ex. dans les systèmes client-serveur, pour les applications en téléphonie mobile ou réseaux
G10L 15/08 - Classement ou recherche de la parole

46. Providing a platform for configuring device-specific speech recognition and using a platform for configuring device-specific speech recognition

Numéro d'application	17237003
Numéro de brevet	11367448
Statut	Délivré - en vigueur
Date de dépôt	2021-04-21
Date de la première publication	2021-08-05
Date d'octroi	2022-06-21
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mohajer, Keyvan Patel, Mehul

Abrégé

A method of providing a platform for configuring device-specific speech recognition is provided. The method includes providing a user interface for developers to select a set of at least two acoustic models appropriate for a specific type of a device, receiving, from a developer, a selection of the set of the at least two acoustic models, and configuring a speech recognition system to perform device-specific speech recognition by using one acoustic model selected from the at least two acoustic models of the set.

Classes IPC ?

G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G06F 3/16 - Entrée acoustique; Sortie acoustique
G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel

47. Building a natural language understanding application using a received electronic record containing programming code including an interpret-block, an interpret-statement, a pattern expression and an action statement

Numéro d'application	17225997
Numéro de brevet	11776533
Statut	Délivré - en vigueur
Date de dépôt	2021-04-08
Date de la première publication	2021-07-22
Date d'octroi	2023-10-03
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mont-Reynaud, Bernard Emami, Seyed M. Wilson, Chris Mohajer, Keyvan

Abrégé

A method of building a natural language understanding application is provided. The method includes receiving at least one electronic record containing programming code and creating executable code from the programming code. The executable code, when executed by a processor, causes the processor to create a parse and an interpretation of a sequence of input tokens. The programming code includes an interpret-block, and the interpret-block includes an interpret-statement. The interpret-statement includes both a pattern expression and an action statement.

Classes IPC ?

G10L 15/00 - Reconnaissance de la parole
G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel
G06F 40/205 - Analyse syntaxique
G06F 8/30 - Création ou génération de code source
G10L 15/06 - Création de gabarits de référence; Entraînement des systèmes de reconnaissance de la parole, p. ex. adaptation aux caractéristiques de la voix du locuteur
G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
H04M 3/493 - Services d'information interactifs, p. ex. renseignements sur l'annuaire téléphonique

48. Voice morphing apparatus having adjustable parameters

Numéro d'application	16740440
Numéro de brevet	11600284
Statut	Délivré - en vigueur
Date de dépôt	2020-01-11
Date de la première publication	2021-07-15
Date d'octroi	2023-03-07
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Pearson, Steve

Abrégé

A voice morphing apparatus having adjustable parameters is described. The disclosed system and method include a voice morphing apparatus that morphs input audio to mask a speaker's identity. Parameter adjustment uses evaluation of an objective function that is based on the input audio and output of the voice morphing apparatus. The voice morphing apparatus includes objectives that are based adversarially on speaker identification and positively on audio fidelity. Thus, the voice morphing apparatus is adjusted to reduce identifiability of speakers while maintaining fidelity of the morphed audio. The voice morphing apparatus may be used as part of an automatic speech recognition system.

Classes IPC ?

G10L 21/013 - Adaptation à la hauteur tonale ciblée
G10L 21/0208 - Filtration du bruit
G06N 20/00 - Apprentissage automatique
G06N 3/08 - Méthodes d'apprentissage

49. Training a voice morphing apparatus

Numéro d'application	16740378
Numéro de brevet	11100940
Statut	Délivré - en vigueur
Date de dépôt	2020-01-10
Date de la première publication	2021-06-24
Date d'octroi	2021-08-24
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Pearson, Steve

Abrégé

Systems and methods for training a voice morphing apparatus are described. The voice morphing apparatus is trained to morph input audio data to mask an identity of a speaker. Training is performed by evaluating an objective function that is a function of the input audio data and an output of the voice morphing apparatus. The objective function may have a first term that is based on speaker identification and a second term that is based on audio fidelity. By optimizing the objective function, parameters of the voice morphing apparatus may be adjusted so as to reduce a confidence of speaker identification and maintain an audio fidelity of the morphed audio data. The voice morphing apparatus, once trained, may be used as part of an automatic speech recognition system.
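
One way to picture the two-term objective is the following sketch. The additive form, the weights, and the symbol names are assumptions made for illustration; the abstract only states that one term is based on speaker identification and the other on audio fidelity.

```latex
\mathcal{L}(\theta) \;=\;
  \lambda_{\mathrm{id}} \, c_{\mathrm{id}}\!\big(M_\theta(x)\big)
  \;+\;
  \lambda_{\mathrm{fid}} \, d\big(x,\, M_\theta(x)\big)
```

Here $M_\theta$ denotes the voice morphing apparatus with adjustable parameters $\theta$, $x$ is the input audio data, $c_{\mathrm{id}}$ is a (hypothetical) speaker-identification confidence measured on the morphed output, and $d$ is a (hypothetical) distortion measure between input and output. Minimizing $\mathcal{L}$ would then lower identification confidence while keeping the morphed audio close to the original.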

Classes IPC ?

G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G10L 17/04 - Entraînement, enrôlement ou construction de modèle
G10L 17/00 - Techniques d'identification ou de vérification du locuteur
G10L 15/26 - Systèmes de synthèse de texte à partir de la parole
G10L 21/013 - Adaptation à la hauteur tonale ciblée
G10L 25/18 - Techniques d'analyse de la parole ou de la voix qui ne se limitent pas à un seul des groupes caractérisées par le type de paramètres extraits les paramètres extraits étant l’information spectrale de chaque sous-bande
G10L 25/30 - Techniques d'analyse de la parole ou de la voix qui ne se limitent pas à un seul des groupes caractérisées par la technique d’analyse utilisant des réseaux neuronaux
G10L 25/51 - Techniques d'analyse de la parole ou de la voix qui ne se limitent pas à un seul des groupes spécialement adaptées pour un usage particulier pour comparaison ou différentiation
G10L 21/0364 - Amélioration de l'intelligibilité de la parole, p. ex. réduction de bruit ou annulation d'écho en changeant l’amplitude pour améliorer l'intelligibilité

50. Neural network training from private data

Numéro d'application	16716497
Numéro de brevet	11551083
Statut	Délivré - en vigueur
Date de dépôt	2019-12-17
Date de la première publication	2021-06-17
Date d'octroi	2023-01-10
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Li, Zili Amirguliyev, Asif Probell, Jonah

Abrégé

Training and enhancement of neural network models, such as from private data, are described. A slave device receives a version of a neural network model from a master. The slave accesses a local and/or private data source and uses the data to optimize the neural network model, for example by computing gradients or by performing knowledge distillation to locally train an enhanced second version of the model. The slave sends the gradients or the enhanced neural network model to the master. The master may use the gradients or the second version of the model to improve a master model.
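
The gradient-based variant can be sketched as follows. This is a toy illustration under stated assumptions: a two-weight linear model, a squared-error loss, plain averaging of gradients, and the function names are all hypothetical and not taken from the patent. The point it shows is that only gradients leave each device, never the private examples.

```python
def local_gradient(weights, private_examples):
    """Squared-error gradient of a linear model y = w . x, computed on-device."""
    grad = [0.0] * len(weights)
    for features, target in private_examples:
        error = sum(w * x for w, x in zip(weights, features)) - target
        for i, x in enumerate(features):
            grad[i] += 2.0 * error * x / len(private_examples)
    return grad

def apply_update(weights, gradients_from_devices, learning_rate=0.1):
    """Central step: average the received gradients and take one descent step."""
    count = len(gradients_from_devices)
    averaged = [
        sum(g[i] for g in gradients_from_devices) / count
        for i in range(len(weights))
    ]
    return [w - learning_rate * a for w, a in zip(weights, averaged)]

# Each device holds its own private (features, target) examples.
device_a = [([1.0, 0.0], 2.0)]
device_b = [([0.0, 1.0], -1.0)]

central_weights = [0.0, 0.0]
for _ in range(50):
    grads = [local_gradient(central_weights, data) for data in (device_a, device_b)]
    central_weights = apply_update(central_weights, grads)
```

After the loop the central weights approach the values that fit both private datasets, even though the central side never saw either dataset.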

Classes IPC ?

G06N 3/08 - Méthodes d'apprentissage
H04L 67/10 - Protocoles dans lesquels une application est distribuée parmi les nœuds du réseau
H04L 41/082 - Réglages de configuration caractérisés par les conditions déclenchant un changement de paramètres la condition étant des mises à jour ou des mises à niveau des fonctionnalités réseau
G06N 3/04 - Architecture, p. ex. topologie d'interconnexion

51. Synthesizing speech recognition training data

Numéro d'application	16704216
Numéro de brevet	11308938
Statut	Délivré - en vigueur
Date de dépôt	2019-12-05
Date de la première publication	2021-06-10
Date d'octroi	2022-04-19
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Wieman, Maisy Probell, Jonah Krishnaswamy, Sudharsan

Abrégé

To train a speech recognizer, such as for recognizing variables in a neural speech-to-meaning system, compute, within an embedding space, a range of vectors of features of natural speech. Generate parameter sets for speech synthesis and synthesize speech according to the parameters. Analyze the synthesized speech to compute vectors in the embedding space. Using a cost function that favors an even spread (minimal clustering), generate a multiplicity of speech synthesis parameter sets. Using the multiplicity of parameter sets, generate a multiplicity of speech of known words that can be used as training data for speech recognition.

Classes IPC ?

G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G10L 15/06 - Création de gabarits de référence; Entraînement des systèmes de reconnaissance de la parole, p. ex. adaptation aux caractéristiques de la voix du locuteur
G10L 15/16 - Classement ou recherche de la parole utilisant des réseaux neuronaux artificiels
G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel
G10L 13/02 - Procédés d'élaboration de parole synthétique; Synthétiseurs de parole
G10L 15/197 - Grammaires probabilistes, p. ex. n-grammes de mots
G10L 15/187 - Contexte phonémique, p. ex. règles de prononciation, contraintes phonotactiques ou n-grammes de phonèmes

52. Dynamic wakewords for speech-enabled devices

Numéro d'application	16704944
Numéro de brevet	11295741
Statut	Délivré - en vigueur
Date de dépôt	2019-12-05
Date de la première publication	2021-06-10
Date d'octroi	2022-04-05
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mont-Reynaud, Bernard

Abrégé

A system and method are disclosed capable of parsing a spoken utterance into a natural language request and a speech audio segment, where the natural language request directs the system to use the speech audio segment as a new wakeword. In response to this wakeword assignment directive, the system and method are further capable of immediately building a new wakeword spotter to activate the device upon matching the new wakeword in the input audio. Different approaches to promptly building a new wakeword spotter are described. Variations of wakeword assignment directives can make the new wakeword public or private. They can also add the new wakeword to earlier wakewords, or replace earlier wakewords.

Classes IPC ?

G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G06F 3/16 - Entrée acoustique; Sortie acoustique
G10L 15/06 - Création de gabarits de référence; Entraînement des systèmes de reconnaissance de la parole, p. ex. adaptation aux caractéristiques de la voix du locuteur
G10L 15/08 - Classement ou recherche de la parole
G10L 17/04 - Entraînement, enrôlement ou construction de modèle

53. Systems and methods for granularizing compound natural language queries

Numéro d'application	16226372
Numéro de brevet	11023509
Statut	Délivré - en vigueur
Date de dépôt	2018-12-19
Date de la première publication	2021-06-01
Date d'octroi	2021-06-01
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Weinstein, Jason Mohajer, Keyvan

Abrégé

A method for processing a natural language query is described. The method includes receiving a text query that refers to a plurality of objects, attributes, qualifiers and other arguments, and parsing the query to produce an argument tree representing the substance and structure of the query. The method also includes the capability to define qualifiers as possibly projectable onto other arguments and to indicate their direction of projectability, and the capability to denote nodes of the argument tree as foldable, as splittable, or as containing sequences of qualifier arguments. The method additionally includes defining validity rules for a domain of knowledge, used to determine whether a list of arguments forms a valid granular query component, and processing the argument tree in view of the above to derive a corresponding plurality of granular query components that collectively request the plurality of pieces of information representing the intent of the query.

Classes IPC ?

G06F 16/33 - Requêtes
G06F 16/338 - Présentation des résultats des requêtes
G06N 5/04 - Modèles d’inférence ou de raisonnement

54. Identification of code for parsing given expressions

Numéro d'application	16786991
Numéro de brevet	11003426
Statut	Délivré - en vigueur
Date de dépôt	2020-02-10
Date de la première publication	2021-05-11
Date d'octroi	2021-05-11
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Wilson, Christopher S. Mohajer, Keyvan

Abrégé

A command-processing server provides natural language processing services to applications. The command-processing server stores a set of code blocks, each code block being able to interpret a set of corresponding natural language expressions. The command-processing server accepts natural language expressions and identifies the code blocks that are capable of interpreting those expressions by attempting to parse the natural language expressions using the code blocks. The command-processing server then provides a list of the identified code blocks to the developers, who can then incorporate the code blocks into their applications.

Classes IPC ?

G06F 8/41 - Compilation
G06F 40/211 - Parsage syntaxique, p. ex. basé sur une grammaire hors contexte ou sur des grammaires d’unification

55. Integrated programming framework for speech and text understanding with block and statement structure

Numéro d'application	16209854
Numéro de brevet	10996931
Statut	Délivré - en vigueur
Date de dépôt	2018-12-04
Date de la première publication	2021-05-04
Date d'octroi	2021-05-04
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mohajer, Keyvan Emami, Seyed M. Wilson, Chris Mont-Reynaud, Bernard

Abrégé

The technology disclosed relates to authoring of vertical applications of natural language understanding (NLU), which analyze text or utterances and construct their meaning. In particular, it relates to new programming constructs and tools and data structures implementing those new applications.

Classes IPC ?

G06F 17/00 - Équipement ou méthodes de traitement de données ou de calcul numérique, spécialement adaptés à des fonctions spécifiques
G06F 8/30 - Création ou génération de code source
G10L 15/06 - Création de gabarits de référence; Entraînement des systèmes de reconnaissance de la parole, p. ex. adaptation aux caractéristiques de la voix du locuteur
H04M 3/493 - Services d'information interactifs, p. ex. renseignements sur l'annuaire téléphonique
G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine

56. System and method for voice morphing

Numéro d'application	16578386
Numéro de brevet	11205056
Statut	Délivré - en vigueur
Date de dépôt	2019-09-22
Date de la première publication	2021-03-25
Date d'octroi	2021-12-21
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Ross, Dylan H

Abrégé

A system and method are disclosed for masking an identity of a speaker of natural language speech, such as speech clips to be labeled by humans in a system generating voice transcriptions for training an automatic speech recognition model. The natural language speech is morphed prior to being presented to the human for labeling. In one embodiment, morphing comprises pitch shifting the speech randomly either up or down, then frequency shifting the speech, then pitch shifting the speech in a direction opposite the first pitch shift.

Classes IPC ?

G06F 40/56 - Génération de langage naturel
G10L 19/26 - Pré-filtrage ou post-filtrage
G10L 19/125 - Excitation de la hauteur tonale, p. ex. prédiction linéaire à excitation de code avec innovation synchrone de la hauteur tonale [PSI-CELP]
G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel
G06F 40/58 - Utilisation de traduction automatisée, p. ex. pour recherches multilingues, pour fournir aux dispositifs clients une traduction effectuée par le serveur ou pour la traduction en temps réel

57. Integrated programming framework for speech and text understanding with meaning parsing

Numéro d'application	13842735
Numéro de brevet	10957310
Statut	Délivré - en vigueur
Date de dépôt	2013-03-15
Date de la première publication	2021-03-23
Date d'octroi	2021-03-23
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mohajer, Keyvan Emami, Seyed Majid Wilson, Chris Mont-Reynaud, Bernard

Abrégé

Classes IPC ?

G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel
G06F 40/205 - Analyse syntaxique

58. System and method for detection and correction of a query

Numéro d'application	16561020
Numéro de brevet	11263198
Statut	Délivré - en vigueur
Date de dépôt	2019-09-05
Date de la première publication	2021-03-11
Date d'octroi	2022-03-01
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Bettaglio, Olivia Singh, Pranav

Abrégé

Systems and methods are provided for systematically finding and fixing automatic speech recognition (ASR) mistranscriptions and natural language understanding (NLU) misinterpretations and labeling data for machine learning. High similarity of non-identical consecutive queries indicates ASR mistranscriptions. Consecutive queries with close vectors in a semantic embedding space indicate NLU misinterpretations. Key phrases and barge-in also indicate errors. Only queries within a short amount of time are considered.
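
The ASR heuristic (similar but non-identical consecutive queries within a short time) might be sketched as below. The character-level similarity from Python's standard `difflib`, the 10-second window, and the 0.7 threshold are illustrative assumptions; the abstract does not specify a similarity measure or any numeric values.

```python
from difflib import SequenceMatcher

def likely_mistranscription(first, second, max_gap_seconds=10.0, min_similarity=0.7):
    """Flag a pair of consecutive queries as a probable ASR retry.

    first and second are (timestamp_seconds, transcription) pairs in arrival order.
    """
    (t1, text1), (t2, text2) = first, second
    if t2 - t1 > max_gap_seconds:
        return False  # too far apart in time to count as a retry
    if text1 == text2:
        return False  # identical repeats are not flagged by this heuristic
    similarity = SequenceMatcher(None, text1, text2).ratio()
    return similarity >= min_similarity
```

For instance, "play some jazz" followed four seconds later by "play some jams" would be flagged, while the same pair a minute apart would not.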

Classes IPC ?

G06F 16/23 - Mise à jour
G06F 16/2452 - Traduction des requêtes
G06N 7/00 - Agencements informatiques fondés sur des modèles mathématiques spécifiques

59. Support for grammar inflections within a software development framework

Numéro d'application	16563783
Numéro de brevet	11151329
Statut	Délivré - en vigueur
Date de dépôt	2019-09-06
Date de la première publication	2021-03-11
Date d'octroi	2021-10-19
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mont-Reynaud, Bernard Taron, Seth

Abrégé

Classes IPC ?

G06F 40/30 - Analyse sémantique
G10L 15/197 - Grammaires probabilistes, p. ex. n-grammes de mots
G06F 8/41 - Compilation
G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel
G06F 40/205 - Analyse syntaxique
G06F 40/253 - Analyse grammaticale; Corrigé du style

60. Natural language grammar improvement

Numéro d'application	16546177
Numéro de brevet	11636853
Statut	Délivré - en vigueur
Date de dépôt	2019-08-20
Date de la première publication	2021-02-25
Date d'octroi	2023-04-25
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Howard, Angela Rose

Abrégé

A method for configuring natural language grammars is provided to include identifying a first transcription having a first automatic speech recognition (ASR) score and a first natural language understanding (NLU) score and identifying a second transcription having a second ASR score and a second NLU score. The method includes detecting that a difference between the first and second ASR scores has a signed value with an opposite sign than a sign of a signed value of a difference between the first and second NLU scores, and responsive to detecting the opposite sign providing, to an evaluator, the audio query and the first and second transcriptions, receiving, from the evaluator, an indication of which of the first and second transcriptions is a correct transcription, and adjusting a value implemented to calculate the first NLU score or a value implemented to calculate the second NLU score.
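
The opposite-sign test at the heart of this method can be sketched in a few lines. The function name and the tuple layout are assumptions for illustration; treating a zero difference as "no disagreement" is also a choice made here, not something the abstract specifies.

```python
def scores_disagree(first, second):
    """Return True when ASR and NLU rank two transcriptions in opposite orders.

    Each argument is an (asr_score, nlu_score) pair for one transcription.
    The differences have opposite signs exactly when their product is negative.
    """
    asr_difference = first[0] - second[0]
    nlu_difference = first[1] - second[1]
    return asr_difference * nlu_difference < 0
```

When this returns True, the method as described would send the audio and both transcriptions to an evaluator and then adjust an NLU scoring value based on the evaluator's answer.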

Classes IPC ?

G10L 15/00 - Reconnaissance de la parole
G10L 15/19 - Contexte grammatical, p. ex. désambiguïsation des hypothèses de reconnaissance par application des règles de séquence de mots
G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G06F 40/253 - Analyse grammaticale; Corrigé du style
G06F 40/279 - Reconnaissance d’entités textuelles
G10L 15/26 - Systèmes de synthèse de texte à partir de la parole
G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel

61. Method and system using phoneme embedding

Numéro d'application	16543483
Numéro de brevet	11410642
Statut	Délivré - en vigueur
Date de dépôt	2019-08-16
Date de la première publication	2021-02-18
Date d'octroi	2022-08-09
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Scuderi, Serena Caterina Zoli, Gioia Hotung, Sarah Beth

Abrégé

A system and method are disclosed for creating an embedded phoneme map from a corpus of speech in accordance with a multiplicity of acoustic features of the speech. The embedded phoneme map is used to determine how to pronounce borrowed words from a lending language in the borrowing language, using the phonemes of the borrowing language that are closest to the phonemes of the lending language. The embedded phoneme map is also used to help linguists visualize the phonemes being pronounced by a speaker in real time and to help non-native speakers practice pronunciation by displaying the differences between proper pronunciation and actual pronunciation for open-ended speech by the speaker.
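
Choosing the closest borrowing-language phoneme can be sketched as a nearest-neighbor lookup in the embedding space. The two-dimensional coordinates and phoneme labels below are made up for illustration, and Euclidean distance is an assumed metric; a real map would be derived from acoustic features of a speech corpus.

```python
import math

def closest_phoneme(lending_vector, borrowing_phonemes):
    """Return the borrowing-language phoneme whose embedding is nearest.

    borrowing_phonemes maps a phoneme label to its embedding vector.
    """
    return min(
        borrowing_phonemes,
        key=lambda label: math.dist(lending_vector, borrowing_phonemes[label]),
    )

# Hypothetical embeddings for three phonemes of the borrowing language.
borrowing = {"s": (0.0, 0.0), "t": (1.0, 0.0), "f": (0.0, 1.0)}

# Hypothetical embedding of a lending-language phoneme with no exact counterpart.
substitute = closest_phoneme((0.25, 0.75), borrowing)
```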

Classes IPC ?

G10L 15/06 - Création de gabarits de référence; Entraînement des systèmes de reconnaissance de la parole, p. ex. adaptation aux caractéristiques de la voix du locuteur
G10L 15/02 - Extraction de caractéristiques pour la reconnaissance de la parole; Sélection d'unités de reconnaissance
G10L 15/00 - Reconnaissance de la parole
G10L 21/10 - Transformation en information visible

62. Dynamic interpolation for hybrid language models

Numéro d'application	16529730
Numéro de brevet	11295732
Statut	Délivré - en vigueur
Date de dépôt	2019-08-01
Date de la première publication	2021-02-04
Date d'octroi	2022-04-05
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Holm, Steffen Kong, Terry Lokeswarappa, Kiran Garaga

Abrégé

To improve the accuracy of ASR, an utterance is transcribed using a plurality of language models, for example an N-gram language model and a neural language model. The language models are trained separately. They each output a probability score or other figure of merit for a partial transcription hypothesis. Model scores are interpolated to determine a hybrid score. While recognizing an utterance, interpolation weights are chosen or updated dynamically, in the specific context of processing. The weights are based on dynamic variables associated with the utterance, the partial transcription hypothesis, or other aspects of context.
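
A minimal sketch of interpolating two model scores with a context-dependent weight follows. Linear interpolation of probabilities is one common choice and is assumed here; the weighting rule (trust the neural model more as the partial hypothesis grows) and its constants are hypothetical and only illustrate what "dynamic" could mean.

```python
import math

def hybrid_log_prob(ngram_prob, neural_prob, neural_weight):
    """Linearly interpolate two model probabilities and return the log of the mix."""
    mixed = (1.0 - neural_weight) * ngram_prob + neural_weight * neural_prob
    return math.log(mixed)

def neural_weight_for(context_length, max_weight=0.8, ramp=4):
    """Hypothetical dynamic rule: the weight grows with the hypothesis length, then saturates."""
    return max_weight * min(context_length, ramp) / ramp

# Score the next word of a partial hypothesis that already contains two words.
weight = neural_weight_for(context_length=2)
score = hybrid_log_prob(ngram_prob=0.2, neural_prob=0.6, neural_weight=weight)
```

With an empty hypothesis the weight is zero and the hybrid score reduces to the N-gram score; other dynamic variables could replace the length in the same structure.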

Classes IPC ?

G10L 15/197 - Grammaires probabilistes, p. ex. n-grammes de mots
G10L 15/02 - Extraction de caractéristiques pour la reconnaissance de la parole; Sélection d'unités de reconnaissance
G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G10L 15/16 - Classement ou recherche de la parole utilisant des réseaux neuronaux artificiels
G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel

63. User-defined extensions of the command input recognized by a virtual assistant

Numéro d'application	16206963
Numéro de brevet	10896671
Statut	Délivré - en vigueur
Date de dépôt	2018-11-30
Date de la première publication	2021-01-19
Date d'octroi	2021-01-19
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mohajer, Keyvan Wilson, Christopher S. Mont-Reynaud, Bernard Macrae, Robert

Abrégé

A command-processing server provides natural language services to applications. More specifically, the command-processing server receives natural language inputs from users for use in applications such as virtual assistants. Some user inputs create user-defined rules that consist of trigger conditions and of corresponding actions that are executed when the triggers fire. The command-processing server stores the rules received from a user in association with the specific user. The command-processing server also identifies rules that can be generalized across users and promoted into generic rules applicable to many or all users. The generic rules may or may not have an associated context constraining their application.

Classes IPC ?

G06F 17/27 - Analyse automatique, p.ex. analyse grammaticale, correction orthographique
G10L 15/07 - Adaptation au locuteur
G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel
G10L 15/30 - Reconnaissance distribuée, p. ex. dans les systèmes client-serveur, pour les applications en téléphonie mobile ou réseaux
H04L 12/28 - Réseaux de données à commutation caractérisés par la configuration des liaisons, p. ex. réseaux locaux [LAN Local Area Networks] ou réseaux étendus [WAN Wide Area Networks]
H04L 29/08 - Procédure de commande de la transmission, p.ex. procédure de commande du niveau de la liaison
G10L 17/22 - Procédures interactives; Interfaces homme-machine

64. Vision-assisted speech processing

Numéro d'application	16509029
Numéro de brevet	11257493
Statut	Délivré - en vigueur
Date de dépôt	2019-07-11
Date de la première publication	2021-01-14
Date d'octroi	2022-02-22
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Vasconcelos, Cristina Li, Zili

Abrégé

Systems and methods for processing speech are described. In certain examples, image data is used to generate visual feature tensors and audio data is used to generate audio feature tensors. The visual feature tensors and the audio feature tensors are used by a linguistic model to determine linguistic features that are usable to parse an utterance of a user. The generation of the feature tensors may be jointly configured with the linguistic model. Systems may be provided in a client-server architecture.

Classes IPC ?

G10L 15/00 - Reconnaissance de la parole
G10L 15/22 - Procédures utilisées pendant le processus de reconnaissance de la parole, p. ex. dialogue homme-machine
G10L 15/02 - Extraction de caractéristiques pour la reconnaissance de la parole; Sélection d'unités de reconnaissance
G10L 15/30 - Reconnaissance distribuée, p. ex. dans les systèmes client-serveur, pour les applications en téléphonie mobile ou réseaux
G10L 15/18 - Classement ou recherche de la parole utilisant une modélisation du langage naturel
G10L 15/187 - Contexte phonémique, p. ex. règles de prononciation, contraintes phonotactiques ou n-grammes de phonèmes
G10L 15/24 - Reconnaissance de la parole utilisant des caractéristiques non acoustiques
G10L 15/06 - Création de gabarits de référence; Entraînement des systèmes de reconnaissance de la parole, p. ex. adaptation aux caractéristiques de la voix du locuteur
G06K 9/46 - Extraction d'éléments ou de caractéristiques de l'image
G06K 9/62 - Méthodes ou dispositions pour la reconnaissance utilisant des moyens électroniques
G06K 9/72 - Méthodes ou dispositions pour la reconnaissance utilisant des moyens électroniques utilisant une analyse de contexte basée sur l'identité provisoire attribuée à une série de formes successives, p.ex. d'un mot
G06K 9/00 - Méthodes ou dispositions pour la lecture ou la reconnaissance de caractères imprimés ou écrits ou pour la reconnaissance de formes, p.ex. d'empreintes digitales
G10L 15/16 - Classement ou recherche de la parole utilisant des réseaux neuronaux artificiels
G10L 25/30 - Techniques d'analyse de la parole ou de la voix qui ne se limitent pas à un seul des groupes caractérisées par la technique d’analyse utilisant des réseaux neuronaux

65. Parsing to determine interruptible state in an utterance by detecting pause duration and complete sentences

Numéro d'application	16243920
Numéro de brevet	10832005
Statut	Délivré - en vigueur
Date de dépôt	2019-01-09
Date de la première publication	2020-11-10
Date d'octroi	2020-11-10
Propriétaire	SOUNDHOUND AI IP, LLC (USA) SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventeur(s)	Mohajer, Keyvan Mont-Reynaud, Bernard

Abrégé

The technology disclosed relates to computer-implemented conversational agents and particularly to detecting a point in the dialog (end of turn, or end of utterance) at which the agent can start responding to the user. The technology disclosed provides a method of incrementally parsing an input utterance with multiple parses operating in parallel. The technology disclosed includes detecting an interjection point in the input utterance when a pause exceeds a high threshold, or detecting an interjection point in the input utterance when a pause exceeds a low threshold and at least one of the parallel parses is determined to be interruptible by matching a complete sentence according to the grammar. The conversational agents start responding to the user at a detected interjection point.
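
The two-threshold rule can be sketched as a small decision function. The threshold values and the function name are illustrative assumptions; the boolean argument stands in for the result of the parallel incremental parses, which are not modeled here.

```python
def is_interjection_point(pause_seconds, any_parse_complete,
                          low_threshold=0.3, high_threshold=1.0):
    """Decide whether the agent may start responding at the current pause."""
    if pause_seconds > high_threshold:
        return True  # long pause: respond regardless of parse state
    if pause_seconds > low_threshold and any_parse_complete:
        return True  # shorter pause, but some parse matched a complete sentence
    return False
```

The effect is that a grammatically complete utterance lets the agent respond after a short pause, while an incomplete one makes it wait for the longer pause.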

IPC classes

G06F 40/205 - Parsing
G06F 40/30 - Semantic analysis

66. System and method for controlling an application using natural language communication

Application number	16388867
Patent number	11393463
Status	Granted - in force
Filing date	2019-04-19
First publication date	2020-10-22
Grant date	2022-07-19
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Stonehocker, Timothy P.; Mcmahon, Kathleen Worthington

Abstract

A system and method are disclosed for setting up a communication link between a device or application and a system with a controller. The controller can collect and send information to the application. A user interfaces with the controller to access the functionality of the application through providing commands to the controller. The system allows the user to interface with multiple applications.

IPC classes

G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G10L 15/193 - Formal grammars, e.g. finite state automata, context free grammars or word networks
G06F 3/16 - Sound input; Sound output

67. Microphone mask

Application number	16838219
Patent number	11266184
Status	Granted - in force
Filing date	2020-04-02
First publication date	2020-10-08
Grant date	2022-03-08
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Mcmahon, Kathleen Worthington

Abstract

A mask is worn to cover a mouth of a wearer, and includes a mask main body made of a cloth-like material, a microphone arranged on the mask main body, and configured to collect voice of the wearer, a cord connected to the microphone, and a support portion that supports the microphone. The support portion is joined to a peripheral portion of the mask main body and higher in rigidity than the mask main body.

IPC classes

A41D 13/11 - Protective face masks, e.g. for surgical use, or for use in foul atmospheres
A41D 1/00 - Garments
A42B 3/30 - Mounting radio sets or communication systems

68. Using a virtual assistant to store a personal voice memo and to obtain a response based on a stored personal voice memo that is retrieved according to a received query

Application number	16255674
Patent number	11211064
Status	Granted - in force
Filing date	2019-01-23
First publication date	2020-07-23
Grant date	2021-12-28
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Selvaggi, Mara; Spiridonova, Irina A; Stahl, Karl

Abstract

The technology disclosed relates to retrieving a personal memo from a database. The method includes receiving, by a virtual assistant, a natural language utterance that expresses a request, interpreting the natural language utterance according to a natural language grammar rule for retrieving memo data from the natural language utterance, the natural language grammar rule recognizing query information, responsive to interpreting the natural language utterance, using the query information to query the database for a memo related to the query information, and providing, to a user, a response generated in dependence upon the memo related to the query information.

IPC classes

G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
G10L 15/19 - Grammatical context, e.g. disambiguation of recognition hypotheses based on word sequence rules
G10L 15/08 - Speech classification or search

69. Adapting an utterance cut-off period based on parse prefix detection

Application number	16824308
Patent number	11308960
Status	Granted - in force
Filing date	2020-03-19
First publication date	2020-07-09
Grant date	2022-04-19
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Aguayo, Patricia Pozon; Zhang, Jennifer Hee Young; Probell, Jonah

Abstract

IPC classes

G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 15/05 - Word boundary detection
G10L 25/78 - Detection of presence or absence of voice signals
G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/08 - Speech classification or search
G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
G10L 15/19 - Grammatical context, e.g. disambiguation of recognition hypotheses based on word sequence rules

70. Unified embeddings for translation

Application number	16232984
Patent number	10796107
Status	Granted - in force
Filing date	2018-12-26
First publication date	2020-07-02
Grant date	2020-10-06
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Kong, Terry

Abstract

A method of training word embeddings is provided. The method includes determining anchors, each comprising a first word in a first domain and a second word in a second domain, training word embeddings for the first and second domains, and training a transform for transforming word embedding vectors in the first domain to word embedding vectors in the second domain, wherein the training minimizes a loss function that includes an anchor loss for each anchor, such that for each anchor, the anchor loss is based on a distance between the anchor's second word's embedding vector and the transform of the anchor's first word's embedding vector, and for each anchor, the anchor loss for the respective anchor is zero when the distance between the respective anchor's second word's embedding vector and the transform of the respective anchor's first word's embedding vector is less than a specific tolerance.
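
As a rough illustration of the anchor loss described above, the sketch below applies a linear transform to the first word's vector and returns zero loss whenever the result lies within the tolerance of the second word's vector. The hinge form `max(0, distance - tolerance)`, the Euclidean distance, the linear transform, and all names are assumptions; the abstract only states that the loss is based on a distance and vanishes inside the tolerance.

```python
import math


def apply_transform(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (hypothetical linear transform)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def anchor_loss(matrix, first_vec, second_vec, tolerance):
    """Loss for one anchor: zero inside the tolerance, growing with distance outside it."""
    mapped = apply_transform(matrix, first_vec)
    distance = math.dist(mapped, second_vec)  # Euclidean distance (assumed metric)
    return max(0.0, distance - tolerance)


identity = [[1.0, 0.0], [0.0, 1.0]]
print(anchor_loss(identity, [1.0, 0.0], [1.0, 0.05], tolerance=0.1))  # within tolerance
print(anchor_loss(identity, [1.0, 0.0], [1.0, 1.0], tolerance=0.1))   # outside tolerance
```

In training, this term would be summed over all anchors and added to the usual embedding loss for each domain.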

IPC classes

G06F 40/216 - Parsing using statistical methods
G06F 40/58 - Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06F 40/295 - Named entity recognition

71. System and method for detection and correction of incorrectly pronounced words

Application number	16212695
Patent number	11043213
Status	Granted - in force
Filing date	2018-12-07
First publication date	2020-06-11
Grant date	2021-06-22
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Norouzi, Katayoun; Stahl, Karl

Abstract

A system and method are disclosed for capturing a segment of speech audio, performing phoneme recognition on the segment of speech audio to produce a segmented phoneme sequence, comparing the segmented phoneme sequence to stored phoneme sequences that represent incorrect pronunciations of words to determine if there is a match, and identifying an incorrect pronunciation for a word in the segment of speech audio. The system builds a library based on the data collected for the incorrect pronunciations.
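
The comparison step in this abstract can be sketched as a lookup of known mispronunciation patterns inside a recognized phoneme sequence. The library contents, the ARPAbet-style symbols, and the function name are illustrative assumptions; a real system would obtain the phoneme sequence from a phoneme recognizer.

```python
# Hypothetical library: phoneme sequences of known mispronunciations -> intended word.
MISPRONUNCIATIONS = {
    ("N", "UW", "K", "Y", "AH", "L", "ER"): "nuclear",
    ("EH", "K", "S", "P", "R", "EH", "S", "OW"): "espresso",
}


def find_mispronunciations(phonemes, library=MISPRONUNCIATIONS):
    """Return (start_index, word) for every stored pattern found in the sequence."""
    hits = []
    for pattern, word in library.items():
        n = len(pattern)
        # Slide a window of the pattern's length over the phoneme sequence.
        for start in range(len(phonemes) - n + 1):
            if tuple(phonemes[start:start + n]) == pattern:
                hits.append((start, word))
    return sorted(hits)


spoken = ["AY", "L", "AY", "K", "EH", "K", "S", "P", "R", "EH", "S", "OW"]
print(find_mispronunciations(spoken))
```

Exact matching is the simplest choice; a fuzzy match (for example, edit distance over phonemes) would tolerate recognizer errors at the cost of more false positives.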

IPC classes

G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 15/187 - Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
G10L 15/04 - Segmentation; Word boundary detection
G09B 19/04 - Speaking
G10L 13/00 - Speech synthesis; Text to speech systems
G06F 3/16 - Sound input; Sound output
G10L 15/08 - Speech classification or search

72. Virtual Assistant Domain Selection Analysis

Application number	16213020
Status	Pending
Filing date	2018-12-07
First publication date	2020-06-11
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Mont-Reynaud, Bernard; Probell, Jonah

Abstract

A virtual assistant platform provides a user interface for app developers to configure the enablement of domains for virtual assistants. Sets of test queries can be uploaded and statistical analyses displayed for the numbers of test queries served by each selected domain and costs for usage of each domain. Costs can vary according to complex pricing models. The user interface provides display views of tables, cost stack charts, and histograms to inform decisions that trade off costs against benefits to the virtual assistant user experience. The platform interface shows, for individual queries, responses possible from different domains. Platform providers promote certain chosen domains.

IPC classes

G06F 11/36 - Preventing errors by testing or debugging of software
G06F 9/451 - Execution arrangements for user interfaces
G06Q 30/02 - Marketing; Price estimation or determination; Fundraising

73. Text-to-speech adapted by machine learning

Application number	16742006
Patent number	11531819
Status	Granted - in force
Filing date	2020-01-14
First publication date	2020-05-14
Grant date	2022-12-20
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Mont-Reynaud, Bernard; Almudafar-Depeyrot, Monika

Abstract

IPC classes

G06F 40/30 - Semantic analysis
G10L 13/00 - Speech synthesis; Text to speech systems
G10L 13/033 - Voice editing, e.g. manipulating the voice of the synthesiser
G10L 13/04 - Details of speech synthesis systems, e.g. synthesiser structure or memory management
G10L 13/10 - Prosody rules derived from text; Stress or intonation

74. Concept-based augmentation of queries for applying a buyer-defined function

Application number	16572179
Patent number	11461812
Status	Granted - in force
Filing date	2019-09-16
First publication date	2020-01-09
Grant date	2022-10-04
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Mohajer, Keyvan; Halstvedt, Scott

Abstract

Original concepts obtained from a query may be augmented with additional concepts connected to the original concepts in a concept graph in response to determining that the original concepts did not match a sufficient number of bid functions. The augmented set of concepts may then be evaluated with respect to the bid functions to identify matching ad functions. This process may be repeated until a sufficient number of matching ad functions are found. A bid amount of the matching bid functions may be calculated, such as based on semantic information obtained as a result of the query. The bid amounts may further be based on environmental information. A bid function is selected based on the bid amounts and the content associated with the bid function is provided to the source of the query. The content may be selected based on the semantic information.

IPC classes

G06Q 30/00 - Commerce
G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
G06F 40/289 - Phrasal analysis, e.g. finite state techniques or chunking
G05B 19/418 - Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS], computer integrated manufacturing [CIM]
G06F 40/30 - Semantic analysis

75. Custom acoustic models

Application number	15996393
Patent number	11011162
Status	Granted - in force
Filing date	2018-06-01
First publication date	2019-12-05
Grant date	2021-05-18
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Patel, Mehul; Mohajer, Keyvan

Abstract

The technology disclosed relates to performing speech recognition for a plurality of different devices or devices in a plurality of conditions. This includes storing a plurality of acoustic models associated with different devices or device conditions, receiving speech audio including natural language utterances, receiving metadata indicative of a device type or device condition, selecting an acoustic model from the plurality in dependence upon the received metadata, and employing the selected acoustic model to recognize speech from the natural language utterances included in the received speech audio. Each of speech recognition and the storage of acoustic models can be performed locally by devices or on a network-connected server. Also provided is a platform and interface, used by device developers to select, configure, and/or train acoustic models for particular devices and/or conditions.
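
Selecting an acoustic model from request metadata, as described above, might look like the following sketch. The metadata keys, model names, and the fallback order (exact device and condition, then device only, then a generic default) are assumptions made for illustration.

```python
# Hypothetical registry: (device_type, device_condition) -> acoustic model name.
MODELS = {
    ("car", "windows_open"): "car_noisy_v1",
    ("car", None): "car_v1",
    ("smart_speaker", None): "far_field_v1",
}


def select_acoustic_model(models, metadata, default="generic"):
    """Pick a model name from request metadata, most specific match first."""
    device = metadata.get("device_type")
    condition = metadata.get("device_condition")
    # Try the exact device/condition pair, then a device-wide model.
    for key in ((device, condition), (device, None)):
        if key in models:
            return models[key]
    return default


print(select_acoustic_model(MODELS, {"device_type": "car", "device_condition": "windows_open"}))
print(select_acoustic_model(MODELS, {"device_type": "phone"}))
```

The same lookup could run on the device or on a server, matching the abstract's note that storage and recognition may happen in either place.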

IPC classes

G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G06F 3/16 - Sound input; Sound output
G10L 15/18 - Speech classification or search using natural language modelling

76. Interpreting expressions having potentially ambiguous meanings in different domains

Application number	15942875
Patent number	11113473
Status	Granted - in force
Filing date	2018-04-02
First publication date	2019-10-03
Grant date	2021-09-07
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Wilson, Christopher S.; Mohajer, Keyvan; Mont-Reynaud, Bernard

Abstract

The present invention extends to methods, systems, and computer program products for interpreting expressions having potentially ambiguous meanings in different domains. Multi-domain natural language understanding systems can support a variety of different types of clients. Expressions can be interpreted across multiple domains. Weights can be assigned to domains. Weights can be client specific or expression specific so that a chosen interpretation is more likely correct for the type of client or for its context. Stored weight sets can be chosen according to identifying information carried as metadata with expressions, or weight sets can be carried directly as metadata. Domains can additionally or alternatively be ranked in ordered lists or comparative domain pairs to favor some domains over others as appropriate for client type or client context.

IPC classes

G06F 40/00 - Handling natural language data
G06F 40/30 - Semantic analysis
G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications

77. System and methods for a virtual assistant to manage and use context in a natural language dialog

Application number	15163485
Patent number	10418032
Status	Granted - in force
Filing date	2016-05-24
First publication date	2019-09-17
Grant date	2019-09-17
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Mohajer, Keyvan; Wilson, Christopher; Mont-Reynaud, Bernard; Collecchia, Regina

Abstract

A dialog with a conversational virtual assistant includes a sequence of user queries and system responses. Queries are received and interpreted by a natural language understanding system. Dialog context information gathered from user queries and system responses is stored in a layered context data structure. Incomplete queries, which do not have sufficient information to result in an actionable interpretation, become actionable with use of context data. The system recognizes the need to access context data, and retrieves from context layers the information required to transform the query into an executable one. The system may then act on the query and provide an appropriate response to the user. Context data buffers forget information, perhaps selectively, with the passage of time, and after a sufficient number and type of intervening queries.

IPC classes

G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G06F 17/30 - Information retrieval; Database structures therefor
G10L 15/19 - Grammatical context, e.g. disambiguation of recognition hypotheses based on word sequence rules
G06F 16/25 - Integrating or interfacing systems involving database management systems
G06F 16/2452 - Query translation

78. Techniques for concurrent processing of user speech

Application number	16388526
Patent number	10699713
Status	Granted - in force
Filing date	2019-04-18
First publication date	2019-08-08
Grant date	2020-06-30
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Halstvedt, Scott; Mont-Reynaud, Bernard; Wadud, Kazi Asif

Abstract

A server receives a user audio stream, the stream comprising multiple utterances. A query-processing module of the server continuously listens to and processes the utterances. The processing includes parsing successive utterances and recognizing corresponding queries, taking appropriate actions while the utterances are being received. In some embodiments, a query may be parsed and executed before the previous query's execution is complete.

IPC classes

G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G06F 40/35 - Discourse or dialogue representation

79. Advertisement selection by linguistic classification

Application number	16388753
Patent number	11030993
Status	Granted - in force
Filing date	2019-04-18
First publication date	2019-08-08
Grant date	2021-06-08
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Huang, Jun; Lokeswarappa, Kiran Garaga; Gedalius, Joel; Mont-Reynaud, Bernard

Abstract

A method is provided for advertisement selection. The method includes recognizing words from user speech over a large number of interactions, computing a number of unique words uttered during the interactions, classifying the user by the number of unique words uttered during the interactions, and selecting an advertisement targeted to the classified users.

IPC classes

G10L 15/00 - Speech recognition
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
H04L 29/08 - Transmission control procedure, e.g. data link level control procedure
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
G06F 40/205 - Parsing
G06F 40/211 - Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
G06F 40/253 - Grammatical analysis; Style critique
G06N 20/00 - Machine learning
G10L 15/18 - Speech classification or search using natural language modelling
G10L 25/90 - Pitch determination of speech signals
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination
G10L 15/26 - Speech to text systems

80. Parse prefix-detection in a human-machine interface

Application number	15855908
Patent number	10636421
Status	Granted - in force
Filing date	2017-12-27
First publication date	2019-06-27
Grant date	2020-04-28
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Zhang, Jennifer Hee Young; Aguayo, Patricia Pozon; Probell, Jonah

Abstract

A speech-based human-machine interface parses spoken words to detect a complete parse and, responsive to so detecting, computes a hypothesis as to whether the words are a prefix to another complete parse. The duration of the no-voice-activity period used to determine the end of a sentence depends on the prefix hypothesis. The user's typical speech speed profile and a short-term measure of speech speed also scale the period. Speech speed is measured by the time between words, and the period scaling uses a continuously adaptive algorithm. The system uses a longer cut-off period after a system wake-up event but before it detects any voice activity.
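
One way to picture how the cut-off period could combine these factors is the sketch below. The multiplicative formula, the parameter names, and the default values are assumptions for illustration only; the abstract does not specify how the factors are combined.

```python
def cutoff_period(base_s, prefix_probability, speed_scale=1.0,
                  awaiting_first_speech=False, wake_up_extra_s=2.0):
    """Return the silence duration (seconds) after which the utterance is considered ended.

    base_s: hypothetical baseline cut-off in seconds.
    prefix_probability: estimated chance (0..1) that the words so far are a
        prefix of a longer complete parse; a higher value means waiting longer.
    speed_scale: factor above 1.0 for slow speakers (long gaps between words),
        below 1.0 for fast speakers.
    awaiting_first_speech: True after wake-up but before any voice activity.
    """
    period = base_s * (1.0 + prefix_probability) * speed_scale
    if awaiting_first_speech:
        # Give the user extra time to start speaking after the wake-up event.
        period += wake_up_extra_s
    return period


print(cutoff_period(0.5, prefix_probability=0.0))
print(cutoff_period(0.5, prefix_probability=1.0))
print(cutoff_period(0.5, prefix_probability=0.0, awaiting_first_speech=True))
```

In an adaptive system, `speed_scale` would be updated continuously from the measured time between words rather than passed as a constant.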

IPC classes

G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 15/05 - Word boundary detection
G10L 25/78 - Detection of presence or absence of voice signals
G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/08 - Speech classification or search
G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
G10L 15/19 - Grammatical context, e.g. disambiguation of recognition hypotheses based on word sequence rules

81. Method and system for building an integrated user profile

Application number	15385493
Patent number	10311858
Status	Granted - in force
Filing date	2016-12-20
First publication date	2019-06-04
Grant date	2019-06-04
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Mont-Reynaud, Bernard; Huang, Jun; Lokeswarappa, Kiran Garaga; Gedalius, Joel

Abstract

A system and method are provided for adding user characterization information to a user profile by analyzing the user's speech. User properties such as age, gender, accent, and English proficiency may be inferred by extracting and deriving features from user speech, without the user having to configure such information manually. A feature extraction module that receives audio signals as input extracts acoustic, phonetic, textual, linguistic, and semantic features. The module may be a system component independent of any particular vertical application or may be embedded in an application that accepts voice input and performs natural language understanding. A profile generation module receives the features extracted by the feature extraction module, uses classifiers to determine user property values based on the extracted and derived features, and stores these values in a user profile. The resulting profile variables may be globally available to other applications.

IPC classes

G10L 15/00 - Speech recognition
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
G10L 15/18 - Speech classification or search using natural language modelling
G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G10L 25/90 - Pitch determination of speech signals
H04L 29/08 - Transmission control procedure, e.g. data link level control procedure
G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
G06N 20/00 - Machine learning

82. Integration of third party virtual assistants

Application number	16246543
Patent number	10783872
Status	Granted - in force
Filing date	2019-01-13
First publication date	2019-05-16
Grant date	2020-09-22
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Almudafar-Depeyrot, Monika; Mohajer, Keyvan; Stevans, Mark

Abstract

A speech-enabled dialog system responds to a plurality of wake-up phrases. Based on which wake-up phrase is detected, the system's configuration is modified accordingly. Various configurable aspects of the system include selection and morphing of a text-to-speech voice; configuration of acoustic model, language model, vocabulary, and grammar; configuration of a graphic animation; configuration of virtual assistant personality parameters; invocation of a particular user profile; invocation of an authentication function; and configuration of an open sound. Configuration depends on a target market segment. Configuration also depends on the state of the dialog system, such as whether a previous utterance was an information query.

IPC classes

G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 13/04 - Details of speech synthesis systems, e.g. synthesiser structure or memory management
G10L 15/26 - Speech to text systems
G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
G10L 15/08 - Speech classification or search
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit

83. Geographical mapping of interpretations of natural language expressions

Application number	16238445
Patent number	11205051
Status	Granted - in force
Filing date	2019-01-02
First publication date	2019-05-09
Grant date	2021-12-21
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Khov, Kheng; Singh, Pranav; Mont-Reynaud, Bernard; Probell, Jonah

Abstract

A method of predicting a person's interests is provided. The method includes receiving geolocation information about a user location, reading, from a database of interpretations, at least one interpretation of an expression made in close proximity to the location, reading, from a database of ad bids, a plurality of ad bids comprising interpretations, comparing the interpretation from the database to the interpretations of the ad bids to select a most valuable ad bid having an interpretation that matches the interpretation of an expression made in close proximity to the location, and presenting an ad associated with the most valuable ad bid, wherein the interpretation is from a natural language expression.

IPC classes

G06F 40/30 - Semantic analysis
G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
G06F 16/29 - Geographical information databases
G06F 40/289 - Phrasal analysis, e.g. finite state techniques or chunking
G06F 16/9537 - Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

84. Bidirectional probabilistic natural language rewriting and selection

Application number	15726394
Patent number	10599645
Status	Granted - in force
Filing date	2017-10-06
First publication date	2019-04-11
Grant date	2020-03-24
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Lefebure, Luke; Singh, Pranav

Abstract

A speech recognition and natural language understanding system performs insertion, deletion, and replacement edits of tokens at positions with low probabilities according to both a forward and a backward statistical language model (SLM) to produce rewritten token sequences. Multiple rewrites can be produced with scores depending on the probabilities of tokens according to the SLMs. The rewritten token sequences can be parsed according to natural language grammars to produce further weighted scores. Token sequences can be rewritten iteratively using a graph-based search algorithm to find the best rewrite. Mappings of input token sequences to rewritten token sequences can be stored in a cache, and searching for a best rewrite can be bypassed by using cached rewrites when present. Analysis of various initial token sequences that produce the same new rewritten token sequence can be useful to improve natural language grammars.
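
The idea of locating edit positions where both a forward and a backward language model assign low probability can be sketched with a toy bigram model. The bigram counts, the threshold of 0.1, and the function names are assumptions for illustration; the abstract does not say what kind of SLM is used.

```python
from collections import Counter


def train_bigrams(corpus):
    """Count unigrams and bigrams over tokenized sentences with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def suspicious_positions(tokens, unigrams, bigrams, threshold=0.1):
    """Return indices whose token is improbable under both directions."""
    padded = ["<s>"] + tokens + ["</s>"]
    flagged = []
    for i in range(1, len(padded) - 1):
        prev, cur, nxt = padded[i - 1], padded[i], padded[i + 1]
        # Forward model: P(cur | prev); backward model: P(cur | next).
        forward = bigrams[(prev, cur)] / unigrams[prev] if unigrams[prev] else 0.0
        backward = bigrams[(cur, nxt)] / unigrams[nxt] if unigrams[nxt] else 0.0
        if forward < threshold and backward < threshold:
            flagged.append(i - 1)
    return flagged


corpus = [["play", "some", "jazz"], ["play", "some", "rock"], ["find", "some", "jazz"]]
unigrams, bigrams = train_bigrams(corpus)
print(suspicious_positions(["play", "sum", "jazz"], unigrams, bigrams))
```

Flagged positions are candidates for the insertion, deletion, or replacement edits the abstract mentions; scoring and searching over the resulting rewrites is left out of this sketch.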

IPC classes

G06F 17/27 - Automatic analysis, e.g. parsing, orthograph correction
G06F 16/2453 - Query optimisation
G06N 7/00 - Computing arrangements based on specific mathematical models
G10L 15/183 - Speech classification or search using natural language modelling using context dependencies, e.g. language models

85. Classification by natural language grammar slots across domains

Application number	16121967
Patent number	11935029
Status	Granted - in force
Filing date	2018-09-05
First publication date	2019-03-07
Grant date	2024-03-19
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Aung, Joe; Probell, Jonah

Abstract

A virtual assistant processes natural language expressions according to grammar rules created by domain providers. The virtual assistant uniquely identifies each of a multiplicity of users and stores values of grammar slots filled by natural language expressions from each user. The virtual assistant stores histories of slot values and computes statistics from the history. The virtual assistant provider, or a classification client, provides values of attributes of users as labels for a machine learning classification algorithm. The algorithm processes the grammar slot values and labels to compute probability distributions for unknown attribute values of users. A network effect of users and domain grammars makes the virtual assistant useful and provides increasing amounts of data that improve classification accuracy and usefulness.

IPC classes

G06Q 20/24 - Credit schemes, i.e. "pay after"
G06F 9/54 - Interprogram communication
G06F 16/28 - Databases characterised by their database models, e.g. relational or object models
G06F 40/205 - Parsing
G06F 40/211 - Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
G06F 40/253 - Grammatical analysis; Style critique
G06F 40/30 - Semantic analysis

86. Natural language recommendation feedback

Application number	15670975
Patent number	10373618
Status	Granted - in force
Filing date	2017-08-07
First publication date	2019-02-07
Grant date	2019-08-06
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Mohajer, Kamyar; Macrae, Robert

Abstract

Systems parse natural language expressions to extract items and values of their attributes and store them in a database. Systems also parse natural language expressions to extract values of attributes of user preferences and store them in a database. Recommendation engines use the databases to make recommendations. Parsing is of speech or text and uses conversation state, discussion context, synonym recognition, and speaker profile. Database pointers represent relative attribute values. Recommendations use machine learning to crowdsource from databases of many user preferences and to overcome the cold start problem. Parsing and recommendations use current or stored values of environmental parameters. Databases store different values of the same user preference attributes for different activities. Systems add unrecognized attributes and legal values when encountered in natural language expressions.

IPC classes

G06F 16/22 - Indexing; Data structures therefor; Storage structures
G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
G10L 17/00 - Speaker identification or verification techniques
G06F 16/242 - Query formulation
G06F 16/2457 - Query processing with adaptation to user needs

87. Promotional content targeting based on recognized audio

Application number	16134890
Patent number	10832287
Status	Granted - in force
Filing date	2018-09-18
First publication date	2019-01-17
Grant date	2020-11-10
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Master, Aaron; Mohajer, Keyvan

Abstract

An audio recognition system provides for delivery of promotional content to its user. A user interface device, locally or with the assistance of a network-connected server, performs recognition of audio in response to queries. Recognition can be through a method such as processing features extracted from the audio. Audio can comprise recorded music, singing or humming, instrumental music, vocal music, spoken voice, or other recognizable types of audio. Campaign managers provide promotional content for delivery in response to audio recognized in queries.

IPC classes

G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
G06F 16/60 - Information retrieval; Database structures therefor; File system structures therefor of audio data

88. Modular virtual assistant platform

Application number	16128227
Patent number	11144731
Status	Granted - in force
Filing date	2018-09-11
First publication date	2019-01-10
Grant date	2021-10-12
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Singh, Pranav; Mohajer, Keyvan; Mohajer, Kamyar; Mont-Reynaud, Bernard

Abstract

A platform enables developers of applications with natural language interfaces, such as devices, to configure the availability of vertical domain modules in their applications. Modules can include grammars for parsing natural language expressions and interfaces to data sources. Third-party developers can create modules with pricing models for their usage or for access to their data. Device developers can browse or search available modules and test their performance for specific queries. The platform lets device users access the chosen modules as configured by device developers and handles charging and payment between users, application developers, and module developers.

IPC classes

G06F 40/40 - Processing or translation of natural language
G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
G06Q 20/10 - Payment architectures specially adapted for electronic funds transfer systems; Payment architectures specially adapted for home banking systems
G06F 40/211 - Syntactic parsing, e.g. based on context-free grammar or unification grammars

89. Dual mode speech recognition

Application number	15619304
Patent number	10410635
Status	Granted - in force
Filing date	2017-06-09
First publication date	2018-12-13
Grant date	2019-09-10
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Mont-Reynaud, Bernard

Abstract

A dual mode speech recognition system sends speech to two or more speech recognizers. If a first recognition result is received, whose recognition score exceeds a high threshold, the first result is selected without waiting for another result. If the score is below a low threshold, the first result is ignored. At intermediate values of recognition scores, a timeout duration is dynamically determined as a function of the recognition score. The timeout duration determines how long the system will wait for another result. Many functions of the recognition score are possible, but timeout durations generally decrease as scores increase. When receiving a second recognition score before the timeout occurs, a comparison based on recognition scores determines whether the first result or the second result is the basis for creating a response.
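The score-dependent timeout can be sketched as follows. The abstract says many functions of the score are possible; this sketch assumes a linear one, and the threshold values, the maximum timeout, and the function names are illustrative assumptions rather than values from the patent.

```python
from typing import Optional, Tuple

HIGH_THRESHOLD = 0.9  # assumed: accept the first result immediately above this
LOW_THRESHOLD = 0.3   # assumed: ignore the first result below this
MAX_TIMEOUT_S = 2.0   # assumed: longest wait for a second result


def timeout_for_score(score: float) -> Optional[float]:
    """Return how long to wait for a second result, in seconds.

    0.0 means the first result is selected without waiting.
    None means the first result is ignored.
    """
    if score > HIGH_THRESHOLD:
        return 0.0
    if score < LOW_THRESHOLD:
        return None
    # Linear interpolation: the timeout shrinks as the score grows
    fraction = (HIGH_THRESHOLD - score) / (HIGH_THRESHOLD - LOW_THRESHOLD)
    return MAX_TIMEOUT_S * fraction


def choose_result(first: Tuple[str, float],
                  second: Optional[Tuple[str, float]]) -> Tuple[str, float]:
    """Pick between (text, score) results; second is None if the timeout expired."""
    if second is None or first[1] >= second[1]:
        return first
    return second
```

With these assumed constants, a first score of 0.6 yields a wait of about one second, while a score of 0.8 yields a shorter wait.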

IPC classes

G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
G10L 15/00 - Speech recognition
G10L 15/32 - Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
G10L 15/18 - Speech classification or search using natural language modelling

90. Systems and methods for providing identification information in response to an audio segment

Application number	16044331
Patent number	10657174
Status	Granted - in force
Filing date	2018-07-24
First publication date	2018-11-15
Grant date	2020-05-19
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Master, Aaron; Mont-Reynaud, Bernard; Mohajer, Keyvan; Stonehocker, Timothy

Abstract

The present invention relates to providing identification information in response to an audio segment using two modes of operation. The first mode includes receiving an audio segment, sending the audio segment to a remote server, and receiving, from the remote server, identification information relating to the audio segment. The second mode includes receiving an audio segment and using stored information to obtain identification information relating to the received audio segment, without sending the audio segment to the remote server. The present invention further includes using identification information from the remote server and local identification information, selecting one or the other based on selection criteria, and generating an output based on the selected identification information.

IPC classes

G06F 16/683 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using metadata automatically derived from the content
G06F 16/68 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
G06F 16/432 - Query formulation
G06F 16/638 - Presentation of query results

91. System and method for targeting content based on identified audio and multimedia

Application number	15455083
Patent number	10121165
Status	Granted - in force
Filing date	2017-03-09
First publication date	2018-11-06
Grant date	2018-11-06
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Mohajer, Keyvan; Master, Aaron

Abstract

The present disclosure relates to systems and methods that recognize audio queries and select related information to return in response to recognition of the audio queries. The technology disclosed facilitates easy designation of aggregate user experience categories and custom audio references to be recognized. It facilitates linking and returning of selected information in response to recognition of audio queries that match the designated aggregate user experience categories or custom audio references to be recognized.

IPC classes

G10H 5/00 - Instruments in which the tones are generated by means of electronic generators
G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
G06F 17/30 - Information retrieval; Database structures therefor

92. Managing agent engagement in a man-machine dialog

Application number	15881553
Patent number	11250844
Status	Granted - in force
Filing date	2018-01-26
First publication date	2018-10-18
Grant date	2022-02-15
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Mont-Reynaud, Bernard; Halstvedt, Scott; Mohajer, Keyvan

Abstract

Agents engage and disengage with users intelligently. Users can tell agents to remain engaged without requiring a wakeword. Engaged states can support modal dialogs and barge-in. Users can cause disengagement explicitly. Disengagement can be conditional based on timeout, change of user, or environmental conditions. Engagement can be one-time or recurrent. Recurrent states can be attentive or locked. Locked states can be unconditional or conditional, including being reserved to support user continuity. User continuity can be tested by matching parameters or tracking user by many modalities including microphone arrays, cameras, and other sensors.

IPC classes

G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 17/22 - Interactive procedures; Man-machine interfaces
G10L 17/04 - Training, enrolment or model building
G10L 15/08 - Speech classification or search
G10L 17/06 - Decision making techniques; Pattern matching strategies
G06F 3/16 - Sound input; Sound output
G06F 21/32 - User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G10L 17/00 - Speaker identification or verification techniques

93. Speech-enabled system with domain disambiguation

Application number	15456354
Patent number	10229683
Status	Granted - in force
Filing date	2017-03-10
First publication date	2018-09-13
Grant date	2019-03-12
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Leeb, Rainer

Abstract

Systems perform methods of interpreting spoken utterances from a user and responding to the utterances by providing requested information or performing a requested action. The utterances are interpreted in the context of multiple domains. Each interpretation is assigned a relevancy score based on how well the interpretation represents what the speaker intended. Interpretations having a relevancy score below a threshold for their associated domain are discarded. A remaining interpretation is chosen by selecting the most relevant domain for the utterance. The user may be prompted to provide disambiguation information that can be used to choose the best domain. Storing past associations of utterance representation and domain choice allows for measuring the strength of correlation between uttered words and phrases and relevant domains. This correlation strength information may allow the system to automatically disambiguate alternate interpretations without requiring user input.
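A minimal sketch of the filtering and selection steps follows. The domain names, threshold values, and the way the stored correlation strength is combined with the relevancy score (a weighted sum) are assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Interpretation:
    domain: str
    meaning: str
    relevancy: float


# Hypothetical per-domain thresholds
DOMAIN_THRESHOLDS = {"weather": 0.5, "music": 0.6, "navigation": 0.4}


def choose_interpretation(
    interpretations: List[Interpretation],
    thresholds: Dict[str, float],
    correlation: Dict[str, float],
    weight: float = 0.2,
) -> Optional[Interpretation]:
    """Discard low-relevancy interpretations, then pick the best remaining one.

    correlation maps a domain to the stored strength of association between
    the uttered words and that domain (0.0 when nothing has been stored).
    """
    remaining = [
        i for i in interpretations
        if i.relevancy >= thresholds.get(i.domain, 0.5)
    ]
    if not remaining:
        return None  # a real system might prompt the user here
    return max(
        remaining,
        key=lambda i: i.relevancy + weight * correlation.get(i.domain, 0.0),
    )


candidates = [
    Interpretation("weather", "forecast for Phoenix", 0.7),
    Interpretation("music", "play the band Phoenix", 0.55),
    Interpretation("navigation", "directions to Phoenix", 0.65),
]
no_history = choose_interpretation(candidates, DOMAIN_THRESHOLDS, {})
with_history = choose_interpretation(
    candidates, DOMAIN_THRESHOLDS, {"navigation": 1.0}
)
```

In this example the music interpretation falls below its domain threshold and is discarded. Without stored history the weather interpretation wins; with a strong stored association to navigation, the navigation interpretation wins without prompting the user.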

IPC classes

G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
G10L 15/18 - Speech classification or search using natural language modelling

94. Pronunciation guided by automatic speech recognition

Application number	15439883
Patent number	10319250
Status	Granted - in force
Filing date	2017-02-22
First publication date	2018-07-05
Grant date	2019-06-11
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Lokeswarappa, Kiran Garaga; Probell, Jonah

Abstract

Speech synthesis chooses pronunciations of words with multiple acceptable pronunciations based on an indication of a personal, class-based, or global preference or an intended non-preferred pronunciation. A speaker's words can be parroted back on personal devices using preferred pronunciations for accent training. Degrees of pronunciation error are computed and indicated to the user in a visual transcription or audibly as word emphasis in parroted speech. Systems can use sets of phonemes extended beyond those generally recognized for a language. Speakers are classified in order to choose specific phonetic dictionaries or adapt global ones. User profiles maintain lists of which pronunciations are preferred among ones acceptable for words with multiple recognized pronunciations. Systems use multiple correlations of word preferences across users to predict use preferences of unlisted words. Speaker-preferred pronunciations are used to weight the scores of transcription hypotheses based on phoneme sequence hypotheses in speech engines.

IPC classes

G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G09B 5/04 - Electrically-operated educational appliances with audible presentation of the material to be studied
G09B 19/06 - Foreign languages
G10L 13/00 - Speech synthesis; Text to speech systems
G10L 15/26 - Speech to text systems
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue

95. Natural language grammar enablement by speech characterization

Application number	15411567
Patent number	10347245
Status	Granted - in force
Filing date	2017-01-20
First publication date	2018-06-28
Grant date	2019-07-09
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Stahl, Karl

Abstract

Either or both of voice speaker identification or utterance classification such as by age, gender, accent, mood, and prosody characterize speech utterances in a system that performs automatic speech recognition (ASR) and natural language processing (NLP). The characterization conditions NLP, either through application to interpretation hypotheses or to specific grammar rules. The characterization also conditions language models of ASR. Conditioning may comprise enablement and may comprise reweighting of hypotheses.
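The two kinds of conditioning named in the abstract, enablement and reweighting, can be sketched on a list of interpretation hypotheses. The rule names, trait labels, and the `requires`/`boost` fields are hypothetical; the abstract does not describe a data layout.

```python
from typing import Dict, List, Set, Tuple


def condition_hypotheses(
    hypotheses: List[Tuple[str, float]],
    traits: Set[str],
    rules: Dict[str, dict],
) -> List[Tuple[str, float]]:
    """Drop hypotheses whose grammar rule is not enabled for the speaker,
    reweight the rest, and return them best first.

    hypotheses: list of (rule_name, score)
    traits: labels from speech characterization, e.g. {"child"}
    rules: rule_name -> {"requires": set of traits, "boost": {trait: factor}}
    """
    result = []
    for rule_name, score in hypotheses:
        rule = rules.get(rule_name, {})
        required = rule.get("requires", set())
        if not required <= traits:
            continue  # enablement: the rule is disabled for this speaker
        for trait, factor in rule.get("boost", {}).items():
            if trait in traits:
                score *= factor  # reweighting
        result.append((rule_name, score))
    return sorted(result, key=lambda h: h[1], reverse=True)


# Hypothetical grammar rule conditions
RULES = {
    "buy_alcohol": {"requires": {"adult"}},
    "play_cartoon": {"boost": {"child": 2.0}},
    "play_news": {},
}
HYPOTHESES = [("buy_alcohol", 0.9), ("play_news", 0.5), ("play_cartoon", 0.3)]
for_child = condition_hypotheses(HYPOTHESES, {"child"}, RULES)
for_adult = condition_hypotheses(HYPOTHESES, {"adult"}, RULES)
```

For a speaker characterized as a child, the adult-only rule is disabled and the cartoon hypothesis is boosted to the top; for an adult, the original ranking is kept.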

IPC classes

G10L 15/19 - Grammatical context, e.g. disambiguation of recognition hypotheses based on word sequence rules
G10L 15/197 - Probabilistic grammars, e.g. word n-grams
G10L 15/18 - Speech classification or search using natural language modelling
G10L 25/63 - Speech or voice analysis techniques not restricted to a single one of the groups, specially adapted for particular use, for comparison or discrimination, for estimating an emotional state
G10L 17/02 - Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit

96. Predicting human behavior by machine learning of natural language interpretations

Application number	15425099
Patent number	10296586
Status	Granted - in force
Filing date	2017-02-06
First publication date	2018-06-28
Grant date	2019-05-21
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Singh, Pranav; Mont-Reynaud, Bernard; Khov, Kheng; Probell, Jonah

Abstract

An accurate thought map is created by recording people's many utterances of natural language expressions together with the location at which each expression was made. The expressions are input into a Natural Language Understanding system including a semantic parser, and the resulting interpretations stored in a database with the geolocation of the speaker. Emotions, concepts, time, user identification, and other interesting information may also be detected and stored. Interpretations of related expressions may be linked in the database. The database may be indexed and filtered according to multiple aspects of interpretations such as geolocation ranges, time ranges or other criteria, and analyzed according to multiple algorithms. The analyzed results may be used to render map displays, determine effective locations for advertisements, preemptively fetch information for users of mobile devices, and predict the behavior of individuals and groups of people.

IPC classes

G10L 15/00 - Speech recognition
G10L 15/18 - Speech classification or search using natural language modelling
G10L 25/63 - Speech or voice analysis techniques not restricted to a single one of the groups, specially adapted for particular use, for comparison or discrimination, for estimating an emotional state
G06F 17/27 - Automatic analysis, e.g. parsing, spelling correction
G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
G06F 16/29 - Geographical information databases
G06F 16/9537 - Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

97. Full-duplex utterance processing in a natural language virtual assistant

Application number	15389122
Patent number	10311875
Status	Granted - in force
Filing date	2016-12-22
First publication date	2018-06-28
Grant date	2019-06-04
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Halstvedt, Scott; Mont-Reynaud, Bernard; Wadud, Kazi Asif

Abstract

A query-processing system processes an input audio stream that represents a succession of queries spoken by a user. The query-processing system listens continuously to the input audio stream, parses queries and takes appropriate actions in mid-stream. In some embodiments, the system processes queries in parallel, limited by serial constraints. In some embodiments, the system parses and executes queries while a previous query's execution is still in progress. To accommodate users who tend to speak slowly and express a thought in separate parts, the query-processing system halts the outputting of results corresponding to a previous query if it detects that a new speech utterance modifies the meaning of the previous query.

IPC classes

G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G06F 17/27 - Automatic analysis, e.g. parsing, spelling correction

98. Parametric adaptation of voice synthesis

Application number	15406213
Patent number	10586079
Status	Granted - in force
Filing date	2017-01-13
First publication date	2018-06-28
Grant date	2020-03-10
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Almudafar-Depeyrot, Monika; Mont-Reynaud, Bernard

Abstract

Software-based systems perform parametric speech synthesis. TTS voice parameters determine the generated speech audio. Voice parameters include gender, age, dialect, donor, arousal, authoritativeness, pitch, range, speech rate, volume, flutter, roughness, breath, frequencies, bandwidths, and relative amplitudes of formants and nasal sounds. The system chooses TTS parameters based on one or more of: user profile attributes including gender, age, and dialect; situational attributes such as location, noise level, and mood; natural language semantic attributes such as domain of conversation, expression type, dimensions of affect, word emphasis and sentence structure; and analysis of target speaker voices. The system chooses TTS parameters to improve listener satisfaction or other desired listener behavior. Choices may be made by specified algorithms defined by code developers, or by machine learning algorithms trained on labeled samples of system performance.

IPC classes

G10L 13/033 - Voice editing, e.g. manipulating the voice of the synthesiser
G10L 13/10 - Prosody rules derived from text; Stress or intonation
G06F 40/30 - Semantic analysis
G10L 13/00 - Speech synthesis; Text to speech systems

99. Dynamic choice of data sources in natural language query processing

Application number	15342970
Patent number	10585891
Status	Granted - in force
Filing date	2016-11-03
First publication date	2018-05-03
Grant date	2020-03-10
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Halstvedt, Scott

Abstract

A virtual assistant receives natural language interpretation hypotheses for user queries, determines entities and attributes from the interpretations, and requests data from appropriate data sources. A cost function estimates the cost of each data source request. Cost functions include factors such as contract pricing, access latency, and data quality. Based on the estimated cost, the virtual assistant sends requests to a plurality of data sources, each of which might be able to provide data necessary to answer the user query. By including user credits in the cost function, the virtual assistant provides better quality of results and answer latency for paying users. The virtual assistant minimizes latency by answering using data from the first responding data source or provides a latency guarantee by answering with the most accurate data received by a deadline. The virtual assistant measures data source response latency and caches responses for expensive requests.
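One way to picture a cost function of this kind is sketched below. The weighted-sum form, the way user credits discount the price term, the budget cutoff, and all names and numbers are assumptions for illustration; the abstract only lists the factors involved.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataSource:
    name: str
    price_per_request: float   # contract pricing
    expected_latency_s: float  # measured access latency
    quality: float             # data quality in [0, 1], higher is better


def estimated_cost(src: DataSource, user_credits: float,
                   w_price: float = 1.0, w_latency: float = 1.0,
                   w_quality: float = 1.0) -> float:
    """Estimate the cost of one request to a data source."""
    # Assumption: more user credits discount the price term, so paying
    # users reach sources that would otherwise be too expensive.
    price_term = w_price * src.price_per_request / (1.0 + user_credits)
    latency_term = w_latency * src.expected_latency_s
    quality_term = w_quality * (1.0 - src.quality)
    return price_term + latency_term + quality_term


def select_sources(sources: List[DataSource], user_credits: float,
                   budget: float) -> List[str]:
    """Return names of sources within budget, cheapest first."""
    ranked = sorted(sources, key=lambda s: estimated_cost(s, user_credits))
    return [s.name for s in ranked
            if estimated_cost(s, user_credits) <= budget]


SOURCES = [
    DataSource("free", 0.0, 0.8, 0.6),
    DataSource("premium", 2.0, 0.2, 0.95),
]
for_free_user = select_sources(SOURCES, user_credits=0.0, budget=1.5)
for_paying_user = select_sources(SOURCES, user_credits=9.0, budget=1.5)
```

Under these assumed numbers, a user without credits is served only by the free source, while a paying user's requests also go to the faster, higher-quality premium source.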

IPC classes

G06F 16/2453 - Query optimisation
G06F 16/33 - Querying

100. Virtual assistant configured by selection of wake-up phrase

Application number	15294234
Patent number	10217453
Status	Granted - in force
Filing date	2016-10-14
First publication date	2018-04-19
Grant date	2019-02-26
Owner	SOUNDHOUND AI IP, LLC (USA); SOUNDHOUND AI IP HOLDING, LLC (USA)
Inventor(s)	Stevans, Mark; Almudafar-Depeyrot, Monika; Mohajer, Keyvan

Abstract

A speech-enabled dialog system responds to a plurality of wake-up phrases. Based on which wake-up phrase is detected, the system's configuration is modified accordingly. Various configurable aspects of the system include selection and morphing of a text-to-speech voice; configuration of acoustic model, language model, vocabulary, and grammar; configuration of a graphic animation; configuration of virtual assistant personality parameters; invocation of a particular user profile; invocation of an authentication function; and configuration of an open sound. Configuration depends on a target market segment. Configuration also depends on the state of the dialog system, such as whether a previous utterance was an information query.
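The idea of selecting a configuration from the detected wake-up phrase can be sketched as a lookup of overrides applied to a base configuration. The phrases, field names, and values below are hypothetical and cover only a few of the configurable aspects the abstract lists.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AssistantConfig:
    tts_voice: str = "neutral"
    language_model: str = "general"
    personality: str = "default"
    requires_authentication: bool = False


# Hypothetical wake-up phrases and the settings each one overrides
WAKE_PHRASE_OVERRIDES = {
    "hey butler": {"tts_voice": "formal", "personality": "polite"},
    "ok kiddo": {"tts_voice": "playful", "language_model": "children"},
    "open vault": {"requires_authentication": True},
}


def configure(base: AssistantConfig, wake_phrase: str) -> AssistantConfig:
    """Return a copy of base with the overrides for the detected phrase."""
    overrides = WAKE_PHRASE_OVERRIDES.get(wake_phrase.strip().lower(), {})
    return replace(base, **overrides)


butler = configure(AssistantConfig(), "Hey Butler")
vault = configure(AssistantConfig(), "open vault")
unknown = configure(AssistantConfig(), "hello there")
```

Keeping the base configuration immutable and returning a modified copy means one detected phrase cannot leak settings into the session started by another.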

IPC classes

G10L 13/04 - Details of speech synthesis systems, e.g. synthesiser structure or memory management
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
G10L 15/08 - Speech classification or search
G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
