Modulate, Inc.

United States of America

1-23 of 23 for Modulate, Inc.
Aggregations
IP Type
        Patent 18
        Trademark 5
Jurisdiction
        United States 17
        World 5
        Canada 1
Date
2025 2
2024 2
2023 5
2022 2
2021 7
IPC Class
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit 11
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice 10
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog 9
G10L 19/018 - Audio watermarking, i.e. embedding inaudible data in the audio signal 9
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks 9
NICE Class
09 - Scientific and electric apparatus and instruments 4
42 - Scientific, technological and industrial services, research and design 1
Status
Pending 5
Registered / In Force 18

1.

SYSTEM AND METHOD FOR CREATING TIMBRES

      
Application Number 19301342
Status Pending
Filing Date 2025-08-15
First Publication Date 2025-12-11
Owner Modulate, Inc. (USA)
Inventor
  • Huffman, William Carter
  • Pappas, Michael

Abstract

A method of building a new voice having a new timbre using a timbre vector space includes receiving timbre data filtered using a temporal receptive field. The timbre data is mapped in the timbre vector space. The timbre data is related to a plurality of different voices. Each of the plurality of different voices has respective timbre data in the timbre vector space. The method builds the new timbre using the timbre data of the plurality of different voices using a machine learning system.
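
As a rough picture of "building a new timbre from the timbre data of several voices", the new voice can be thought of as a point placed among existing voices in the vector space. The sketch below is a hypothetical illustration only: the abstract assigns this step to a machine learning system, whereas the code swaps in a plain weighted average. The function name `blend_timbres`, the three-dimensional vectors, and the weights are all assumptions.

```python
from typing import List, Sequence


def blend_timbres(
    timbres: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Return a weighted average of timbre vectors (a stand-in for a learned model)."""
    if not timbres or len(timbres) != len(weights):
        raise ValueError("need one weight per timbre vector")
    dim = len(timbres[0])
    if any(len(t) != dim for t in timbres):
        raise ValueError("all timbre vectors must share one dimension")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative and not all zero")
    # Normalise the weights so the result stays between the input voices.
    total = sum(weights)
    norm = [w / total for w in weights]
    return [sum(w * t[i] for w, t in zip(norm, timbres)) for i in range(dim)]


# Hypothetical 3-dimensional timbre vectors for two existing voices.
voice_a = [0.0, 1.0, 2.0]
voice_b = [2.0, 1.0, 0.0]
new_timbre = blend_timbres([voice_a, voice_b], [1.0, 1.0])
```

With equal weights the new timbre sits midway between the two voices; unequal weights would pull it toward one of them.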

IPC Classes

  • G10L 21/013 - Adapting to target pitch
  • G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
  • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
  • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
  • G10L 19/018 - Audio watermarking, i.e. embedding inaudible data in the audio signal
  • G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks

2.

USER INTERFACE FOR CONTENT MODERATION FOR VOICE CHAT

      
Application Number 19231045
Status Pending
Filing Date 2025-06-06
First Publication Date 2025-09-25
Owner Modulate, Inc. (USA)
Inventor
  • Huffman, William Carter
  • Pappas, Michael
  • Morino, Ken
  • Pickart, David

Abstract

A content moderation system analyzes speech, or characteristics thereof, and determines a toxicity score representing the likelihood that a given clip of speech is toxic. A user interface displays a timeline with various instances of toxicity by one or more users for a given session. The user interface is optimized for moderation interaction, and shows how the conversation containing toxicity evolves over the time domain of a conversation.
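
The timeline described in the abstract above implies some data structure that orders scored clips by time and groups them by user. The following is a minimal sketch of one such structure, not the interface claimed in the application; `ToxicityEvent`, `build_timeline`, the `min_score` cut-off, and the sample session are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ToxicityEvent:
    speaker: str    # hypothetical user identifier
    start_s: float  # offset from the start of the session, in seconds
    score: float    # likelihood in [0, 1] that the clip is toxic


def build_timeline(
    events: List[ToxicityEvent], min_score: float = 0.5
) -> Dict[str, List[ToxicityEvent]]:
    """Group events scoring at least min_score by speaker, ordered by time."""
    lanes: Dict[str, List[ToxicityEvent]] = defaultdict(list)
    for event in sorted(events, key=lambda e: e.start_s):
        if event.score >= min_score:
            lanes[event.speaker].append(event)
    return dict(lanes)


session = [
    ToxicityEvent("user_b", 42.0, 0.91),
    ToxicityEvent("user_a", 12.5, 0.72),
    ToxicityEvent("user_a", 3.0, 0.10),
    ToxicityEvent("user_a", 60.0, 0.88),
]
timeline = build_timeline(session)
```

Each dictionary entry corresponds to one lane of the timeline, so a renderer could draw one row per user and place markers at each `start_s`.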

IPC Classes

  • H04L 12/18 - Arrangements for providing special services to substations for broadcast or conference
  • G06F 3/0482 - Interaction with lists of selectable items, e.g. menus
  • G06F 3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
  • G10L 15/08 - Speech classification or search
  • G10L 25/27 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique
  • G10L 25/63 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for estimating an emotional state
  • H04L 65/403 - Arrangements for multi-party communication, e.g. for conferences

3.

MULTI-STAGE ADAPTIVE SYSTEM FOR CONTENT MODERATION

      
Application Number 18660835
Status Pending
Filing Date 2024-05-10
First Publication Date 2024-09-05
Owner Modulate, Inc. (USA)
Inventor
  • Huffman, William Carter
  • Pappas, Michael
  • Howie, Henry

Abstract

A toxicity moderation system has an input configured to receive speech from a speaker. The system includes a multi-stage toxicity machine learning system having a first stage and a second stage. The first stage is trained to analyze the received speech to determine whether a toxicity level of the speech meets a toxicity threshold. The first stage is also configured to filter-through, to the second stage, speech that meets the toxicity threshold, and is further configured to filter-out speech that does not meet the toxicity threshold.
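
The filter-through and filter-out behaviour of the first stage can be sketched as a threshold test in front of a second stage. This is an illustrative simplification: the abstract describes a trained stage operating on speech, while the code below uses text transcripts and a toy keyword scorer. `first_stage`, `keyword_scorer`, the blocklist, and the threshold value are assumptions.

```python
from typing import Callable, Iterable, List

Scorer = Callable[[str], float]


def first_stage(clips: Iterable[str], scorer: Scorer, threshold: float) -> List[str]:
    """Filter through clips whose score meets the threshold; filter out the rest."""
    return [clip for clip in clips if scorer(clip) >= threshold]


def keyword_scorer(clip: str) -> float:
    """Toy scorer: fraction of words found in a hypothetical blocklist."""
    blocklist = {"idiot", "loser"}
    words = clip.lower().split()
    return sum(w in blocklist for w in words) / len(words) if words else 0.0


# Hypothetical transcripts standing in for received speech.
clips = ["nice shot", "you idiot", "gg everyone"]
for_second_stage = first_stage(clips, keyword_scorer, threshold=0.4)
```

Only the clips in `for_second_stage` would reach the more expensive second stage, which is the cost-saving idea behind a multi-stage design.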

IPC Classes

  • G10L 25/63 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for estimating an emotional state
  • G06N 5/022 - Knowledge engineering; Knowledge acquisition
  • G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
  • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice

4.

System and method for creating timbres

      
Application Number 18528244
Grant Number 12412588
Status In Force
Filing Date 2023-12-04
First Publication Date 2024-04-11
Grant Date 2025-09-09
Owner Modulate, Inc. (USA)
Inventor
  • Huffman, William Carter
  • Pappas, Michael

Abstract

A method of building a new voice having a new timbre using a timbre vector space includes receiving timbre data filtered using a temporal receptive field. The timbre data is mapped in the timbre vector space. The timbre data is related to a plurality of different voices. Each of the plurality of different voices has respective timbre data in the timbre vector space. The method builds the new timbre using the timbre data of the plurality of different voices using a machine learning system.

IPC Classes

  • G10L 21/013 - Adapting to target pitch
  • G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
  • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
  • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
  • G10L 19/018 - Audio watermarking, i.e. embedding inaudible data in the audio signal
  • G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks

5.

SCORING SYSTEM FOR CONTENT MODERATION

      
Application Number 18204869
Status Pending
Filing Date 2023-06-01
First Publication Date 2023-12-07
Owner MODULATE, INC. (USA)
Inventor
  • Huffman, William Carter
  • Pappas, Michael
  • Morino, Ken
  • Pickart, David

Abstract

A method for online voice content moderation provides a multi-stage voice content analysis system. The system includes a pre-moderator stage having a toxicity scorer configured to provide a toxicity score for a given toxic speech content from a user. The toxicity score is a function of a platform content policy. The method generates a toxicity score for the given toxic speech content. The toxic speech content is provided to a moderator as a function of the toxicity score.
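
A toxicity score that is "a function of a platform content policy" could, under one reading, weight per-category likelihoods by how much the platform cares about each category. The sketch below assumes that reading and is not drawn from the application itself; `toxicity_score`, `moderator_queue`, the category names, and the weights are hypothetical.

```python
from typing import Dict, List, Tuple


def toxicity_score(
    category_probs: Dict[str, float], policy_weights: Dict[str, float]
) -> float:
    """Weight each category likelihood by the platform policy and keep the maximum."""
    return max(
        (p * policy_weights.get(cat, 0.0) for cat, p in category_probs.items()),
        default=0.0,
    )


def moderator_queue(
    clips: Dict[str, Dict[str, float]], policy_weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (clip_id, score) pairs with a positive score, highest score first."""
    scored = [(cid, toxicity_score(probs, policy_weights)) for cid, probs in clips.items()]
    return sorted((s for s in scored if s[1] > 0), key=lambda s: s[1], reverse=True)


# Hypothetical policy: harassment matters fully, profanity only a little.
policy = {"harassment": 1.0, "profanity": 0.25}
clips = {
    "clip-1": {"profanity": 0.8},
    "clip-2": {"harassment": 0.6, "profanity": 0.4},
    "clip-3": {"spam": 0.9},
}
queue = moderator_queue(clips, policy)
```

Here `clip-2` is presented to the moderator before `clip-1`, and `clip-3` is not presented at all because the assumed policy gives its category no weight.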

IPC Classes

  • G10L 15/08 - Speech classification or search
  • G10L 25/27 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique

6.

User interface for content moderation of voice chat

      
Application Number 18204873
Grant Number 12341619
Status In Force
Filing Date 2023-06-01
First Publication Date 2023-12-07
Grant Date 2025-06-24
Owner Modulate, Inc. (USA)
Inventor
  • Huffman, William Carter
  • Pappas, Michael
  • Morino, Ken
  • Pickart, David

Abstract

A content moderation system analyzes speech, or characteristics thereof, and determines a toxicity score representing the likelihood that a given clip of speech is toxic. A user interface displays a timeline with various instances of toxicity by one or more users for a given session. The user interface is optimized for moderation interaction, and shows how the conversation containing toxicity evolves over the time domain of a conversation.

IPC Classes

  • H04L 12/18 - Arrangements for providing special services to substations for broadcast or conference
  • G06F 3/0482 - Interaction with lists of selectable items, e.g. menus
  • G06F 3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
  • G10L 15/08 - Speech classification or search
  • G10L 25/27 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique
  • G10L 25/63 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for estimating an emotional state
  • H04L 65/403 - Arrangements for multi-party communication, e.g. for conferences

7.

SCORING SYSTEM FOR CONTENT MODERATION

      
Application Number US2023024193
Publication Number 2023/235517
Status In Force
Filing Date 2023-06-01
Publication Date 2023-12-07
Owner MODULATE, INC. (USA)
Inventor
  • Huffman, William Carter
  • Pappas, Michael
  • Morino, Ken
  • Pickart, David

Abstract

A method for online voice content moderation provides a multi-stage voice content analysis system. The system includes a pre-moderator stage having a toxicity scorer configured to provide a toxicity score for a given toxic speech content from a user. The toxicity score is a function of a platform content policy. The method generates a toxicity score for the given toxic speech content. The toxic speech content is provided to a moderator as a function of the toxicity score.

IPC Classes

  • G06F 40/40 - Processing or translation of natural language
  • G06N 5/04 - Inference or reasoning models
  • H04L 51/212 - Monitoring or handling of messages using filtering or selective blocking
  • G06N 20/00 - Machine learning

8.

PREDICTIVE AUDIO REDACTION FOR REALTIME COMMUNICATION

      
Application Number 18132251
Status Pending
Filing Date 2023-04-07
First Publication Date 2023-10-12
Owner MODULATE, INC. (USA)
Inventor
  • Huffman, William Carter
  • Fishman, Joshua D.
  • Nevue, Zachary

Abstract

Illustrative embodiments employ trained artificial intelligence to provide real-time (e.g., zero introduced latency), or near-real-time (e.g., less than 500 ms of introduced latency), moderation of a verbal communication, without the need for human moderators. By using predictive technology with pre-defined knowledge of undesirable content (e.g., speech to be redacted from a verbal communication), undesirable content of a verbal communication (e.g., human speech or text-to-speech communication) may be censored, as the verbal communication is created. Prediction of undesirable content may be based on context of the initial audio communication (e.g., words preceding the offensive language) and/or the phonetic content of the verbal communication preceding the undesirable content, and/or the phonetic content of the undesirable content itself (e.g., the first sounds of offensive language).
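
One of the cues named in the abstract above, "the first sounds of offensive language", can be sketched as prefix matching on a stream of sound units: units already heard are passed through, and the remainder of a word is muted once its opening matches a known prefix. This toy covers only that single cue and ignores the contextual prediction the abstract also mentions. `redact_stream`, the `MUTED` marker, and the sample units are hypothetical.

```python
from typing import Iterable, Iterator, List, Sequence, Set, Tuple

MUTED = "<mute>"


def redact_stream(
    words: Iterable[Sequence[str]], blocked_prefixes: Set[Tuple[str, ...]]
) -> Iterator[List[str]]:
    """Yield each word's sound units, muting units that follow a blocked prefix."""
    for units in words:
        out: List[str] = []
        heard: Tuple[str, ...] = ()
        triggered = False
        for unit in units:
            if triggered:
                out.append(MUTED)  # predicted continuation, suppressed as it arrives
                continue
            out.append(unit)       # passed through immediately, so no latency is added
            heard += (unit,)
            if heard in blocked_prefixes:
                triggered = True
        yield out


# Hypothetical sound units; ("dar",) stands in for the opening of a blocked word.
stream = [("hel", "lo"), ("dar", "nit")]
redacted = list(redact_stream(stream, {("dar",)}))
```

Because the decision is made from what has already been heard, nothing is buffered; the trade-off is that the opening unit of a blocked word still gets through.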

IPC Classes

  • A63F 13/67 - Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor adaptively or by learning from player actions, e.g. skill level adjustment or by storing successful combat sequences for re-use
  • G10L 15/187 - Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
  • G10L 15/197 - Probabilistic grammars, e.g. word n-grams
  • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
  • G06F 21/60 - Protecting data

9.

PREDICTIVE AUDIO REDACTION FOR REALTIME COMMUNICATION

      
Application Number US2023017930
Publication Number 2023/196624
Status In Force
Filing Date 2023-04-07
Publication Date 2023-10-12
Owner MODULATE, INC. (USA)
Inventor
  • Huffman, William Carter
  • Fishman, Joshua D.
  • Nevue, Zachary

Abstract

Illustrative embodiments employ trained artificial intelligence to provide real-time (e.g., zero introduced latency), or near-real-time (e.g., less than 500 ms of introduced latency), moderation of a verbal communication, without the need for human moderators. By using predictive technology with pre-defined knowledge of undesirable content (e.g., speech to be redacted from a verbal communication), undesirable content of a verbal communication (e.g., human speech or text-to-speech communication) may be censored, as the verbal communication is created. Prediction of undesirable content may be based on context of the initial audio communication (e.g., words preceding the offensive language) and/or the phonetic content of the verbal communication preceding the undesirable content, and/or the phonetic content of the undesirable content itself (e.g., the first sounds of offensive language).

IPC Classes

  • G10L 15/16 - Speech classification or search using artificial neural networks
  • G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
  • G06N 20/00 - Machine learning

10.

MULTI-STAGE ADAPTIVE SYSTEM FOR CONTENT MODERATION

      
Application Number US2021054319
Publication Number 2022/076923
Status In Force
Filing Date 2021-10-08
Publication Date 2022-04-14
Owner MODULATE, INC. (USA)
Inventor
  • Huffman, William Carter
  • Pappas, Michael
  • Howie, Henry

Abstract

A toxicity moderation system has an input configured to receive speech from a speaker. The system includes a multi-stage toxicity machine learning system having a first stage and a second stage. The first stage is trained to analyze the received speech to determine whether a toxicity level of the speech meets a toxicity threshold. The first stage is also configured to filter-through, to the second stage, speech that meets the toxicity threshold, and is further configured to filter-out speech that does not meet the toxicity threshold.

IPC Classes

  • G10L 15/00 - Speech recognition
  • G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
  • G10L 15/08 - Speech classification or search
  • G10L 17/00 - Speaker identification or verification techniques
  • G10L 15/04 - Segmentation; Word boundary detection

11.

Multi-stage adaptive system for content moderation

      
Application Number 17497862
Grant Number 11996117
Status In Force
Filing Date 2021-10-08
First Publication Date 2022-04-14
Grant Date 2024-05-28
Owner Modulate, Inc. (USA)
Inventor
  • Huffman, William Carter
  • Pappas, Michael
  • Howie, Henry

Abstract

A toxicity moderation system has an input configured to receive speech from a speaker. The system includes a multi-stage toxicity machine learning system having a first stage and a second stage. The first stage is trained to analyze the received speech to determine whether a toxicity level of the speech meets a toxicity threshold. The first stage is also configured to filter-through, to the second stage, speech that meets the toxicity threshold, and is further configured to filter-out speech that does not meet the toxicity threshold.

IPC Classes

  • G10L 25/63 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for estimating an emotional state
  • G06N 5/022 - Knowledge engineering; Knowledge acquisition
  • G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
  • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice

12.

M MODULATE

      
Application Number 1615454
Status Registered
Filing Date 2021-07-26
Registration Date 2021-07-26
Owner Modulate, Inc. (USA)
NICE Classes 09 - Scientific and electric apparatus and instruments

Goods & Services

Downloadable computer programs for editing and altering sound; downloadable application software that alters and modifies the properties of a sound recording; downloadable application software for adding sound effects to sound recordings; downloadable software applications for enhancing audio recordings; downloadable computer software for use in sound database management, system administration, for generating and processing sound signals, and for converting analog and digital sound signals.

13.

System and method for creating timbres

      
Application Number 17307397
Grant Number 11854563
Status In Force
Filing Date 2021-05-04
First Publication Date 2021-08-19
Grant Date 2023-12-26
Owner Modulate, Inc. (USA)
Inventor
  • Huffman, William Carter
  • Pappas, Michael

Abstract

A method of building a new voice having a new timbre using a timbre vector space includes receiving timbre data filtered using a temporal receptive field. The timbre data is mapped in the timbre vector space. The timbre data is related to a plurality of different voices. Each of the plurality of different voices has respective timbre data in the timbre vector space. The method builds the new timbre using the timbre data of the plurality of different voices using a machine learning system.

IPC Classes

  • G10L 21/013 - Adapting to target pitch
  • G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
  • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
  • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
  • G10L 19/018 - Audio watermarking, i.e. embedding inaudible data in the audio signal
  • G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks

14.

M MODULATE

      
Application Number 213689000
Status Registered
Filing Date 2021-07-26
Registration Date 2023-05-10
Owner Modulate, Inc. (USA)
NICE Classes 09 - Scientific and electric apparatus and instruments

Goods & Services

(1) Downloadable computer programs for editing and altering sound; downloadable application software that alters and modifies the properties of a sound recording; downloadable application software for adding sound effects to sound recordings; downloadable software applications for enhancing audio recordings; downloadable computer software for use in sound database management, system administration, for generating and processing sound signals, and for converting analog and digital sound signals.

15.

Generation and detection of watermark for real-time voice conversion

      
Application Number 16994432
Grant Number 11538485
Status In Force
Filing Date 2020-08-14
First Publication Date 2021-02-18
Grant Date 2022-12-27
Owner Modulate, Inc. (USA)
Inventor
  • Huffman, William Carter
  • Kelly, Brendan

Abstract

A method watermarks speech data by using a generator to generate speech data including a watermark. The generator is trained to generate the speech data including the watermark. The training process generates first speech data from the generator. The first speech data is configured to represent speech. The first speech data includes a candidate watermark. The training also produces an inconsistency message as a function of at least one difference between the first speech data and at least authentic speech data. The training further includes transforming the first speech data, including the candidate watermark, using a watermark robustness module to produce transformed speech data including a transformed candidate watermark. The training further produces a watermark-detectability message, using a watermark detection machine learning system, relating to one or more desirable watermark features of the transformed candidate watermark.

IPC Classes

  • G10L 19/018 - Audio watermarking, i.e. embedding inaudible data in the audio signal
  • G06N 3/04 - Architecture, e.g. interconnection topology
  • G06N 3/08 - Learning methods
  • G10L 21/007 - Changing voice quality, e.g. pitch or formants characterised by the process used
  • G10L 21/013 - Adapting to target pitch
  • G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks

16.

GENERATION AND DETECTION OF WATERMARK FOR REAL-TIME VOICE CONVERSION

      
Application Number US2020046534
Publication Number 2021/030759
Status In Force
Filing Date 2020-08-14
Publication Date 2021-02-18
Owner MODULATE, INC. (USA)
Inventor
  • Huffman, William Carter
  • Kelly, Brendan

Abstract

A method watermarks speech data by using a generator to generate speech data including a watermark. The generator is trained to generate the speech data including the watermark. The training process generates first speech data from the generator. The first speech data is configured to represent speech. The first speech data includes a candidate watermark. The training also produces an inconsistency message as a function of at least one difference between the first speech data and at least authentic speech data. The training further includes transforming the first speech data, including the candidate watermark, using a watermark robustness module to produce transformed speech data including a transformed candidate watermark. The training further produces a watermark-detectability message, using a watermark detection machine learning system, relating to one or more desirable watermark features of the transformed candidate watermark.

IPC Classes

  • G10L 19/018 - Audio watermarking, i.e. embedding inaudible data in the audio signal
  • G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
  • G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
  • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
  • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog

17.

TOXMOD

      
Serial Number 90485675
Status Registered
Filing Date 2021-01-25
Registration Date 2022-12-13
Owner Modulate, Inc. (USA)
NICE Classes 42 - Scientific, technological and industrial services, research and design

Goods & Services

Providing temporary use of on-line non-downloadable software for monitoring, analyzing and managing online platform user communications and interactions and policing online platform behavior in view of user community behavior standards

18.

VOICEWEAR

      
Serial Number 90485671
Status Registered
Filing Date 2021-01-25
Registration Date 2022-03-22
Owner Modulate, Inc. (USA)
NICE Classes 09 - Scientific and electric apparatus and instruments

Goods & Services

Downloadable software featuring computer programs for editing and altering sound; downloadable software for creating, enhancing and supplementing audio effects in online games and entertainment platforms

19.

System and method for creating timbres

      
Application Number 16846460
Grant Number 11017788
Status In Force
Filing Date 2020-04-13
First Publication Date 2020-07-30
Grant Date 2021-05-25
Owner Modulate, Inc. (USA)
Inventor
  • Huffman, William Carter
  • Pappas, Michael

Abstract

A method of building a new voice having a new timbre using a timbre vector space includes receiving timbre data filtered using a temporal receptive field. The timbre data is mapped in the timbre vector space. The timbre data is related to a plurality of different voices. Each of the plurality of different voices has respective timbre data in the timbre vector space. The method builds the new timbre using the timbre data of the plurality of different voices using a machine learning system.

IPC Classes

  • G10L 21/013 - Adapting to target pitch
  • G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
  • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
  • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
  • G10L 19/018 - Audio watermarking, i.e. embedding inaudible data in the audio signal
  • G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks

20.

M MODULATE

      
Serial Number 88462668
Status Registered
Filing Date 2019-06-06
Registration Date 2022-07-05
Owner Modulate, Inc. (USA)
NICE Classes 09 - Scientific and electric apparatus and instruments

Goods & Services

Downloadable computer programs for editing and altering sound; downloadable application software that alters and modifies the properties of a sound recording; downloadable application software for adding sound effects to sound recordings; downloadable software applications for enhancing audio recordings; downloadable computer software for use in sound database management, system administration, for generating and processing sound signals, and for converting analog and digital sound signals

21.

System and method for creating timbres

      
Application Number 15989072
Grant Number 10622002
Status In Force
Filing Date 2018-05-24
First Publication Date 2018-11-29
Grant Date 2020-04-14
Owner Modulate, Inc. (USA)
Inventor
  • Huffman, William Carter
  • Pappas, Michael

Abstract

A method of building a new voice having a new timbre using a timbre vector space includes receiving timbre data filtered using a temporal receptive field. The timbre data is mapped in the timbre vector space. The timbre data is related to a plurality of different voices. Each of the plurality of different voices has respective timbre data in the timbre vector space. The method builds the new timbre using the timbre data of the plurality of different voices using a machine learning system.

IPC Classes

  • G10L 21/013 - Adapting to target pitch
  • G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
  • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
  • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
  • G10L 19/018 - Audio watermarking, i.e. embedding inaudible data in the audio signal
  • G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks

22.

System and method for voice-to-voice conversion

      
Application Number 15989062
Grant Number 10614826
Status In Force
Filing Date 2018-05-24
First Publication Date 2018-11-29
Grant Date 2020-04-07
Owner Modulate, Inc. (USA)
Inventor
  • Huffman, William Carter
  • Pappas, Michael

Abstract

A method of building a speech conversion system uses target information from a target voice and source speech data. The method receives the source speech data and the target timbre data, which is within a timbre space. A generator produces first candidate data as a function of the source speech data and the target timbre data. A discriminator compares the first candidate data to the target timbre data with reference to timbre data of a plurality of different voices. The discriminator determines inconsistencies between the first candidate data and the target timbre data. The discriminator produces an inconsistency message containing information relating to the inconsistencies. The inconsistency message is fed back to the generator, and the generator produces second candidate data. The target timbre data in the timbre space is refined using information produced by the generator and/or discriminator as a result of the feeding back.
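
The feedback loop in the abstract above (candidate, inconsistency message, revised candidate) can be illustrated with a deliberately simple numeric toy. It is not an adversarial network and does not reflect how the patented generator or discriminator are built: here the "discriminator" just reports the per-dimension difference from the target, and the "generator" moves part of the way along it. The names `discriminator`, `generator_step`, the `rate` parameter, and the vectors are assumptions.

```python
from typing import List, Sequence


def discriminator(candidate: Sequence[float], target: Sequence[float]) -> List[float]:
    """Return an 'inconsistency message': the per-dimension gap to the target."""
    return [t - c for c, t in zip(candidate, target)]


def generator_step(
    candidate: Sequence[float], message: Sequence[float], rate: float = 0.5
) -> List[float]:
    """Produce the next candidate by moving a fraction of the way along the message."""
    return [c + rate * m for c, m in zip(candidate, message)]


# Hypothetical 2-dimensional target timbre and an initial candidate.
target = [1.0, -2.0]
candidate = [0.0, 0.0]
for _ in range(10):
    message = discriminator(candidate, target)
    candidate = generator_step(candidate, message)
```

With `rate=0.5` the remaining gap halves on every pass, so after ten passes the candidate is within about 0.002 of the target in each dimension.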

IPC Classes

  • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
  • G10L 21/013 - Adapting to target pitch
  • G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
  • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
  • G10L 19/018 - Audio watermarking, i.e. embedding inaudible data in the audio signal
  • G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks

23.

System and method for building a voice database

      
Application Number 15989065
Grant Number 10861476
Status In Force
Filing Date 2018-05-24
First Publication Date 2018-11-29
Grant Date 2020-12-08
Owner Modulate, Inc. (USA)
Inventor
  • Huffman, William Carter
  • Pappas, Michael

Abstract

A timbre vector space construction system for building a timbre vector space has an input. The input is configured to receive a first speech segment in a first voice and a second speech segment in a second voice. The system also includes a temporal receptive field to transform the first speech segment into a first plurality of analytical segments, and the second speech segment into a second plurality of analytical segments. Each of the first plurality of analytical segments, and each of the second plurality of analytical segments, has a frequency distribution that represents a different portion of the timbre data of the respective voices. The system also includes a machine learning system configured to map the first voice relative to the second voice in the timbre vector space as a function of the frequency distribution of the first plurality of analytical segments and the second plurality of analytical segments.
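
The step of turning a speech segment into analytical segments, each with its own frequency distribution, resembles short-time spectral analysis. The sketch below assumes fixed-width, non-overlapping windows and a naive discrete Fourier transform; the abstract does not specify either choice. `analytical_segments`, `magnitude_spectrum`, the window width, and the test tone are hypothetical.

```python
import cmath
import math
from typing import List, Sequence


def analytical_segments(samples: Sequence[float], width: int) -> List[Sequence[float]]:
    """Split a speech segment into consecutive, non-overlapping windows of `width` samples."""
    return [samples[i:i + width] for i in range(0, len(samples) - width + 1, width)]


def magnitude_spectrum(window: Sequence[float]) -> List[float]:
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2), adequate for a sketch)."""
    n = len(window)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(window)))
        for k in range(n // 2 + 1)
    ]


# Hypothetical input: 16 samples of a pure tone with 2 cycles per 8-sample window.
tone = [math.sin(2 * math.pi * 2 * t / 8) for t in range(16)]
spectra = [magnitude_spectrum(w) for w in analytical_segments(tone, width=8)]
```

Each entry of `spectra` is the frequency distribution of one analytical segment; for this tone the energy concentrates in bin 2 of every window. A learned model would then consume such distributions to place voices in the vector space.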

IPC Classes

  • G10L 21/00 - Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
  • G10L 21/013 - Adapting to target pitch
  • G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
  • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
  • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
  • G10L 19/018 - Audio watermarking, i.e. embedding inaudible data in the audio signal
  • G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks