Disclosed are an artificial intelligence-based voice detection server and method. The voice detection server according to an embodiment of the present invention comprises: a detection model training unit for generating an artificial intelligence-based voice detection model by performing pre-training using a training data set composed of one or more different formats, including original voice data and modulated data of the original voice data, wherein the voice detection model, which extracts speaker-unique information from an input value and determines authenticity, is generated by performing pre-training through first model training and second model training using the pre-processed training data set; and a detection processing unit for determining whether input voice data is modulated, on the basis of a modulation probability of the voice data, by using the voice detection model.
G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 specially adapted for particular use for comparison or discrimination
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 characterised by the analysis technique using neural networks
G10L 13/02 - Methods for producing synthetic speech; Speech synthesisers
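As a rough illustration of the detection processing unit's decision rule, the sketch below scores an utterance with a binary classifier over mel features and thresholds the resulting modulation probability. The network shape, feature settings, and the 0.5 threshold are illustrative assumptions, not the disclosed architecture.

    import torch
    import torch.nn as nn

    class SpoofDetector(nn.Module):
        """Toy stand-in for the trained voice detection model."""
        def __init__(self, n_mels=80, hidden=128):
            super().__init__()
            self.encoder = nn.GRU(n_mels, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forward(self, mel):                      # mel: (batch, frames, n_mels)
            _, h = self.encoder(mel)                 # final hidden state
            return torch.sigmoid(self.head(h[-1]))  # modulation probability

    detector = SpoofDetector()
    mel = torch.randn(1, 200, 80)                    # stand-in pre-processed audio
    is_modulated = detector(mel).item() > 0.5        # assumed decision threshold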
A video recording system for compositing includes a first monitor which is positioned in a gaze area of a user and outputs a live video of the user together with a basic posture still image superimposed on the live video, a recording apparatus for recording the user, and an image controller which transmits the basic posture still image and the live video of the user to the first monitor on the basis of a user video transmitted from the recording apparatus, and which changes the basic posture still image transmitted to the first monitor when an image conversion condition is met while the live video of the user is being recorded.
In a method for providing a speech video performed by a computing device, a standby state video in a video file format, in which a person in the video is in a standby state, is reproduced; during the reproduction of the standby state video, a plurality of speech state images in which the person in the video is in a speech state, together with a speech voice, are generated based on a source of speech contents; the reproduction of the standby state video is stopped and a back motion video in a video file format, for returning to a reference frame of the standby state video, is reproduced; and a synthesized speech video is generated by synthesizing the plurality of speech state images and the speech voice with the standby state video from the reference frame.
In a method for providing a speech video performed by a computing device according to one embodiment, first sections of a plurality of standby state videos are sequentially played back, wherein each standby state video includes a first section in which a person in the video is in a standby state and a second section for image interpolation between the last frame of the first section and a reference frame; a plurality of speech state images in which the person in the video is in a speech state, together with a speech voice, are generated based on a source of speech contents; when the generating of the plurality of speech state images and the speech voice is completed, the second section of the standby state video being played back at the time of completion is played back; and a synthesized speech video is generated by synthesizing the plurality of speech state images and the speech voice with at least some of the plurality of standby state videos.
Disclosed are an apparatus and a method for text analysis and speech synthesis. The apparatus for text analysis and speech synthesis according to an embodiment includes: a text analysis module for generating a plurality of text chunks by separating input text into utterance units; a text encoding module for generating a plurality of text feature chunks by encoding the plurality of generated text chunks; and a speech synthesis module for generating speech signals on the basis of the plurality of text feature chunks.
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 characterised by the analysis technique using neural networks
G10L 13/02 - Methods for producing synthetic speech; Speech synthesisers
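A minimal sketch of the chunked pipeline described above, with stand-in components: the sentence-level splitting rule, the codepoint "encoder", and the placeholder synthesizer are all assumptions, used only to show how text chunks flow through encoding into per-chunk speech signals.

    import re

    def split_into_chunks(text):
        """Text analysis module: separate input text into utterance units."""
        return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    def encode_chunk(chunk):
        """Text encoding module (stand-in): one feature chunk per text chunk."""
        return [ord(c) for c in chunk]

    def synthesize(feature_chunks):
        """Speech synthesis module (stand-in): one signal per feature chunk."""
        return [f"<speech for {len(f)} symbols>" for f in feature_chunks]

    chunks = split_into_chunks("Hello there. How are you today?")
    speech_signals = synthesize([encode_chunk(c) for c in chunks])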
Disclosed are a system and a method for providing a real-time interpretation service using AI humans. The system for providing a real-time interpretation service using AI humans, according to one aspect, may comprise: a first terminal which is arranged on a first user side and receives first voice data in a first language uttered by the first user; a service server which recognizes the first voice data and thus generates 1-1 text data in the first language, translates the 1-1 text data into a second language so as to generate 1-2 text data in the second language, and generates a first utterance video in which the 1-2 text data is uttered in the second language by an AI human; and a second terminal which is arranged on a second user side and plays the first utterance video.
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G06T 13/40 - 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
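The abstract above amounts to a recognize-translate-render pipeline per speaking turn. The sketch below wires that flow with stand-in callables; the function names and the trivial lambdas are assumptions, not the service server's actual interfaces.

    def interpret_turn(first_voice_data, recognize, translate, render_ai_human):
        """One turn: first-terminal audio in, utterance video for the second terminal out."""
        text_1_1 = recognize(first_voice_data)    # 1-1 text data, first language
        text_1_2 = translate(text_1_1)            # 1-2 text data, second language
        return render_ai_human(text_1_2)          # first utterance video

    video = interpret_turn(
        b"...pcm samples...",
        recognize=lambda audio: "hello",
        translate=lambda text: "bonjour",
        render_ai_human=lambda text: f"<video: AI human utters '{text}'>",
    )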
A neural network-based key point training apparatus according to an embodiment includes a key point model trained to extract key points from an input image, and an image reconstruction model trained to reconstruct the input image with the key points output by the key point model as the input. The optimized parameters of the key point model and the image reconstruction model can be calculated.
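One way to read this arrangement is as a bottleneck: the reconstruction loss flows back through both models, so the key point model is pushed to keep whatever the image reconstruction model needs. A minimal sketch under assumed shapes (64x64 grayscale, 10 key points, toy linear layers):

    import torch
    import torch.nn as nn

    K = 10                                            # assumed number of key points
    keypoint_model = nn.Sequential(                   # image -> K (x, y) pairs
        nn.Flatten(), nn.Linear(64 * 64, 2 * K))
    reconstruction_model = nn.Linear(2 * K, 64 * 64)  # key points -> image

    opt = torch.optim.Adam(
        list(keypoint_model.parameters()) + list(reconstruction_model.parameters()))

    image = torch.rand(8, 1, 64, 64)
    keypoints = keypoint_model(image)                 # (8, 2K)
    recon = reconstruction_model(keypoints).view(-1, 1, 64, 64)
    loss = nn.functional.mse_loss(recon, image)       # one loss, both models optimized
    loss.backward()
    opt.step()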
An apparatus for generating a speech synthesis image according to a disclosed embodiment is an apparatus for generating a speech synthesis image based on machine learning, the apparatus including a first global geometric transformation predictor configured to be trained to receive each of a source image and a target image including the same person, and predict a global geometric transformation for a global motion of the person between the source image and the target image based on the source image and the target image, a local feature tensor predictor configured to be trained to predict a feature tensor for a local motion of the person based on preset input data, and an image generator configured to be trained to reconstruct the target image based on the global geometric transformation, the source image, and the feature tensor for the local motion.
An apparatus for synthesizing speech according to an embodiment is a computing apparatus that includes one or more processors and a memory storing one or more programs executed by the one or more processors. The apparatus for synthesizing speech includes a pre-processing module that marks a preset classification symbol on each of unit texts input; and a speech synthesis module that receives each unit text marked with the classification symbol and synthesizes speech uttering the unit text based on the input unit text.
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G06F 40/12 - Use of codes for handling textual entities
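A sketch of the pre-processing step only; the symbol string and the stand-in synthesis callable are assumptions, shown to make concrete what "marking a preset classification symbol on each unit text" could look like.

    CLASSIFICATION_SYMBOL = "|"                       # assumed preset symbol

    def preprocess(unit_texts):
        """Pre-processing module: mark each input unit text."""
        return [f"{CLASSIFICATION_SYMBOL}{t}{CLASSIFICATION_SYMBOL}" for t in unit_texts]

    def synthesize_all(marked_units, tts):
        """Speech synthesis module (stand-in): one utterance per marked unit."""
        return [tts(u) for u in marked_units]

    audio = synthesize_all(preprocess(["Hello.", "Nice to meet you."]),
                           tts=lambda u: f"<speech uttering {u!r}>")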
12.
APPARATUS AND METHOD FOR GENERATING SPEECH SYNTHESIS IMAGE
An apparatus for generating a speech synthesis image according to a disclosed embodiment is an apparatus for generating a speech synthesis image based on machine learning, the apparatus including a first global geometric transformation predictor configured to be trained to receive each of a source image and a target image including the same person, and predict a global geometric transformation for a global motion of the person between the source image and the target image based on the source image and the target image, a local feature tensor predictor configured to be trained to predict a feature tensor for a local motion of the person based on input target image-related information, and an image generator configured to be trained to reconstruct the target image based on the global geometric transformation, the source image, and the feature tensor for the local motion.
Disclosed are a golf round assistance device and method using an AI caddie. The golf round assistance device, according to an embodiment, comprises: an AI caddie selection unit that selects one AI caddie model from among a plurality of AI caddie models; a golf course information collection unit that collects, on the basis of location information about a user, information about a golf course where the user is located; a play record collection unit that collects play record information related to the golf course where the user is located; a strategy information generation unit that generates, on the basis of location information, the information about the golf course, and the play record information, strategy information for a course where the user plays a round; and an information provision unit that provides at least one of the information about the golf course and the strategy information to the user through the selected AI caddie model.
In a method of providing a speech video according to an embodiment, a standby state video in which a person in the video is in a standby state is reproduced; a speech state video in which the person in the video is in a speech state is generated based on a source of speech content; the standby state video being reproduced is returned to a reference frame of the standby state video based on a back motion image; and a synthesized speech video is generated by synthesizing the returned reference frame and the speech state video.
Provided are an apparatus and method for providing a speech video. A method, performed by a computing apparatus, for providing a speech video according to an embodiment comprises the steps of: sequentially playing back a first section of a plurality of standby videos each including the first section and a second section, wherein the first section is a section in which a person in the videos is standing by, and the second section is for interpolating images between the final frame of the first section and a reference frame; generating a plurality of speaking images, in which the person in the videos is speaking, and a speaking voice on the basis of the source of speech content; when the generation of the plurality of speaking images and the speaking voice is completed, playing back the second section of the standby video that was being played back at the time of completion; and generating a synthesized speech video in at least a portion of the plurality of standby videos by synthesizing the plurality of speaking images and the speaking voice.
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronizing decoder's clock; Client middleware
G10L 21/10 - Transforming into visible information
G10L 25/57 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 specially adapted for particular use for comparison or discrimination for processing of video signals
An apparatus and a method for providing a speech video are disclosed. A speech video providing method performed by a computing device according to an embodiment comprises the steps of: reproducing a standby state video having a video file format, in which a person in a video is in a standby state; during the reproduction of the standby state video, generating, on the basis of a source of speech contents, a spoken voice and multiple speaking state images in which the person in the video is in a speaking state; stopping the reproduction of the standby state video, and reproducing a back motion video having a video file format, which is for a return to a reference frame of the standby state video; and synthesizing the multiple speaking state images and the spoken voice with the standby state video from the reference frame, so as to generate a synthesized speech video.
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronizing decoder's clock; Client middleware
G10L 21/10 - Transforming into visible information
G10L 25/57 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 specially adapted for particular use for comparison or discrimination for processing of video signals
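Across the two abstracts above, the control flow is: loop the standby first sections while speech generation runs, then play the second (interpolation) section of whichever video is up, and splice the synthesized speech from the reference frame. A sketch of that flow, with every component a stand-in interface assumed for illustration:

    from itertools import cycle

    def provide_speech_video(standby_videos, generation, splice):
        """standby_videos: list of (first_section, second_section) frame lists;
        generation: object with done() and result(); splice: frames from the
        reference frame onward. All three are assumed interfaces."""
        for first_section, second_section in cycle(standby_videos):
            yield from first_section               # person standing by
            if generation.done():                  # speech images + voice ready
                yield from second_section          # interpolate to reference frame
                break
        yield from splice(generation.result())     # synthesized speech video

    class _Gen:                                    # trivial stand-in generation task
        def __init__(self): self.polls = 0
        def done(self): self.polls += 1; return self.polls > 2
        def result(self): return ["speech frames"]

    frames = list(provide_speech_video(
        standby_videos=[(["s1a", "s1b"], ["back1"]), (["s2a"], ["back2"])],
        generation=_Gen(),
        splice=lambda speech: speech))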
17.
Learning method for generating lip sync image based on machine learning and lip sync image generation device for performing same
A lip sync image generation device based on machine learning according to a disclosed embodiment includes an image synthesis model, which is an artificial neural network model, and which uses a person background image and an utterance audio signal as an input to generate a lip sync image, and a lip sync discrimination model, which is an artificial neural network model, and which discriminates the degree of match between the lip sync image generated by the image synthesis model and the utterance audio signal input to the image synthesis model.
G10L 21/10 - Transforming into visible information
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 characterised by the analysis technique using neural networks
G10L 25/57 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 specially adapted for particular use for comparison or discrimination for processing of video signals
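A toy rendering of the two-model setup: the synthesis model produces a lip sync image from a person background input and an audio input, and the discrimination model scores how well image and audio match, giving the generator a sync loss to descend. Every dimension here is an assumption; real models would be convolutional.

    import torch
    import torch.nn as nn

    class ImageSynthesisModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.img, self.aud = nn.Linear(64, 32), nn.Linear(16, 32)
            self.out = nn.Linear(64, 64)
        def forward(self, background, audio):
            z = torch.cat([self.img(background), self.aud(audio)], dim=-1)
            return self.out(z)                        # generated lip sync image

    class LipSyncDiscriminationModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.score = nn.Linear(64 + 16, 1)
        def forward(self, image, audio):              # degree of match in [0, 1]
            return torch.sigmoid(self.score(torch.cat([image, audio], dim=-1)))

    gen, disc = ImageSynthesisModel(), LipSyncDiscriminationModel()
    bg, aud = torch.randn(8, 64), torch.randn(8, 16)
    sync_loss = nn.functional.binary_cross_entropy(
        disc(gen(bg, aud), aud), torch.ones(8, 1))    # push generator toward sync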
18.
METHOD FOR GENERATING DATA USING MACHINE LEARNING AND COMPUTING DEVICE FOR EXECUTING THE SAME
A computing device according to a disclosed embodiment is provided with one or more processors and a memory storing one or more programs executed by the one or more processors. The computing device includes a machine learning model, which is trained to perform, as a main task, a task of receiving data in which a part of original data is damaged or removed and restoring and outputting the damaged or removed data part, and is trained to perform, as an auxiliary task, a task of receiving original data and reconstructing and outputting the received original data.
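In loss terms, the arrangement reads as a denoising objective plus an identity-reconstruction objective on the same model. A minimal sketch, where the corruption scheme, the toy model, and the 0.5 auxiliary weight are assumptions:

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 128))
    opt = torch.optim.Adam(model.parameters())

    original = torch.rand(16, 128)
    keep = (torch.rand_like(original) > 0.3).float()
    damaged = original * keep                          # part removed (zeroed)

    main_loss = nn.functional.mse_loss(model(damaged), original)   # restore
    aux_loss = nn.functional.mse_loss(model(original), original)   # reconstruct
    (main_loss + 0.5 * aux_loss).backward()            # assumed task weighting
    opt.step()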
A computing device according to an embodiment disclosed includes one or more processors and a memory storing one or more programs executed by the one or more processors, and a standby state image generating module configured to generate a standby state image in which a person is in a standby state, an interpolation image generating module configured to generate an interpolation image set for interpolation between the standby state image and a pre-stored speech preparation image, and an image playback module configured to generate a connection image for connecting the standby state image and a speech state image based on the interpolation image set when the speech state image is generated.
A speech image providing method according to an embodiment includes generating a standby state image in which a person is in a standby state; generating, from the standby state image, a plurality of back-motion images at a preset frame interval for image interpolation with a preset reference frame of the standby state image; generating a speech state image in which the person is in a speech state based on a source of speech content; returning the standby state image being played to the reference frame, based on the plurality of back-motion images, at the point of time when the generating of the speech state image is completed; and generating a synthetic speech image by combining frames of the speech state image from the reference frame.
Disclosed are an apparatus and method for converting a grapheme to a phoneme. An apparatus for converting a grapheme to a phoneme according to one embodiment comprises: a tokenization unit for dividing an input string into tokens; and a phoneme determination unit for determining the phoneme of each token on the basis of the token and tokens directly adjacent to the left and right thereof.
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G10L 15/18 - Speech classification or search using natural language modelling
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates
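The determination rule uses a three-token window: the token itself plus its immediate left and right neighbours. A toy sketch with character tokens and a hand-written rule table (the rule format and the fallback are assumptions):

    def g2p(text, rules):
        """Phoneme per token, decided from (left, token, right)."""
        tokens = [None] + list(text) + [None]         # pad the context window
        return [rules.get((tokens[i - 1], tokens[i], tokens[i + 1]),
                          tokens[i])                  # fall back to the token itself
                for i in range(1, len(tokens) - 1)]

    rules = {(None, "c", "e"): "s",                   # 'c' before 'e' -> /s/
             (None, "c", "a"): "k"}                   # 'c' before 'a' -> /k/
    assert g2p("ce", rules) == ["s", "e"]
    assert g2p("ca", rules) == ["k", "a"]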
22.
APPARATUS AND METHOD FOR GENERATING 3D LIP SYNC VIDEO
An apparatus and a method for generating a 3D lip sync video are disclosed. The apparatus for generating a 3D lip sync video, according to one embodiment, comprises: a voice conversion unit for generating speech audio on the basis of input text; and a 3D lip sync video generation model for generating a 3D lip sync video in which a 3D model of a person speaks, on the basis of the generated speech audio, a 2D video obtained by capturing an image of a speaking person, and 3D data acquired from the image of the speaking person.
G10L 21/10 - Transforming into visible information
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
Disclosed are a video recording system and method for compositing. The video recording system for compositing according to one embodiment of the present invention comprises: a first monitor, positioned in a user's gaze area, for outputting the user's live video and a basic posture still image superimposed on the user's live video; a recording apparatus for recording the user; and an image control unit for transmitting the basic posture still image and the user's live video to the first monitor on the basis of a user video transmitted from the recording apparatus, and changing the basic posture still image transmitted to the first monitor when an image conversion condition is met while recording the user's live video.
A computing device according to an embodiment is provided with one or more processors and a memory storing one or more programs executed by the one or more processors. The computing device includes a standby state video generating module that generates a standby state video in which a person in a video is in a standby state, a speech state video generating module that generates a speech state video in which the person in the video is in a speech state based on a source of speech content, and a video reproducing module that reproduces the standby state video and generates a synthesized speech video by synthesizing the standby state video being reproduced and the speech state video.
Disclosed are an apparatus and method for generating a synthesized speech image. The apparatus for generating a synthesized speech image, according to an embodiment, is a machine learning-based apparatus for generating a synthesized speech image, comprising: a first global geometric transformation prediction unit that receives an input of each of a source image and a target image, which include the same person, and is trained to predict global geometric transformation for global movement of the person between the source image and the target image on the basis of the source image and the target image; a local feature tensor prediction unit that is trained to predict a feature tensor for local movement of the person on the basis of a preconfigured input image; and an image generation unit that is trained to reconstruct the target image on the basis of the global geometric transformation, the source image, and the feature tensor for the local movement.
An apparatus and a method for generating a speech synthesis image are disclosed. An apparatus for generating a speech synthesis image according to an embodiment relates to an apparatus for generating a speech synthesis image on the basis of machine learning, and comprises: a first global geometric transformation prediction unit for receiving an input of each of a source image and a target image, in which the same person is included, and trained to predict, on the basis of the source image and the target image, a global geometric transformation for global movement of the person between the source image and the target image; a local feature tensor prediction unit trained to predict a feature tensor for a local movement of the person, on the basis of information relating to the input target image; and an image generation unit trained to reconstruct the target image on the basis of the global geometric transformation, the source image, and the feature tensor for the local movement.
Disclosed are an apparatus and method for generating a synthesized speech image. The apparatus for generating a synthesized speech image, according to an embodiment, is a machine learning-based apparatus for generating a synthesized speech image, comprising: a first global geometric transformation prediction unit that receives an input of each of a source image and a target image, which include the same person, and is trained to predict global geometric transformation for global movement of the person between the source image and the target image on the basis of the source image and the target image; a local geometric transformation prediction unit that is trained to predict local geometric transformation for local movement of the person between the source image and the target image on the basis of preconfigured input data; a geometric transformation combination unit that combines the global geometric transformation and the local geometric transformation so as to calculate overall movement geometric transformation for overall movement of the person; and an image generation unit that is trained to reconstruct the target image on the basis of the source image and the overall movement geometric transformation.
A device and a method for generating a synthesized speech image are disclosed. The device for generating a synthesized speech image according to an embodiment is a device for generating a synthesized speech image on the basis of machine learning, the device comprising: a first global geometric transformation prediction unit which receives each of a source image and a target image including the same person, and is trained to predict a global geometric transformation for global movement of the person between the source image and the target image on the basis of the source image and the target image; a local geometric transformation prediction unit which is trained to predict a local geometric transformation for local movement of the person between the source image and the target image on the basis of preconfigured input data; a geometric transformation combination unit which combines the global geometric transformation and the local geometric transformation so as to calculate an overall movement geometric transformation for overall movement of the person; an optical flow prediction unit which is trained to calculate an optical flow between the source image and the target image on the basis of the source image and the overall movement geometric transformation; and an image generation unit which is trained to reconstruct the target image on the basis of the source image and the optical flow.
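Concretely, combining a global geometric transformation with a local one and reconstructing the target can be expressed as warping the source through a sampling grid: the global affine fixes the coarse pose, and a per-pixel flow adds the local movement. A sketch using torch grid sampling; the shapes, the identity affine, and the zero flow are placeholders, not the trained predictors.

    import torch
    import torch.nn.functional as F

    def warp_source(source, global_affine, local_flow):
        """source: (B, C, H, W); global_affine: (B, 2, 3) affine matrices;
        local_flow: (B, H, W, 2) per-pixel offsets in grid coordinates."""
        grid = F.affine_grid(global_affine, source.shape, align_corners=False)
        grid = grid + local_flow               # overall movement = global + local
        return F.grid_sample(source, grid, align_corners=False)

    src = torch.rand(1, 3, 64, 64)
    identity = torch.tensor([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]])
    target_estimate = warp_source(src, identity, torch.zeros(1, 64, 64, 2))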
An image synthesis device according to a disclosed embodiment has one or more processors and a memory which stores one or more programs executed by the one or more processors. The image synthesis device includes a first artificial neural network model provided to learn each of a first task of using a damaged image as an input to output a restored image and a second task of using an original image as an input to output a reconstructed image, and a second artificial neural network model trained to use the reconstructed image output from the first artificial neural network model as an input and improve the image quality of the reconstructed image.
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
G10L 15/16 - Speech classification or search using artificial neural networks
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
G10L 25/57 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 specially adapted for particular use for comparison or discrimination for processing of video signals
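At inference the two models form a simple cascade: the first restores or reconstructs, the second sharpens its output. A sketch with assumed toy layers standing in for the two trained networks:

    import torch
    import torch.nn as nn

    first_model = nn.Sequential(                  # learned restore/reconstruct tasks
        nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 128))
    second_model = nn.Linear(128, 128)            # learned quality refinement

    damaged = torch.rand(4, 128)
    restored = first_model(damaged)               # first model's output
    enhanced = second_model(restored)             # improved-quality result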
30.
Apparatus and method for generating lip sync image
An apparatus for generating a lip sync image according to a disclosed embodiment has one or more processors and a memory which stores one or more programs executed by the one or more processors. The apparatus includes a first artificial neural network model configured to generate an utterance synthesis image by using a person background image and an utterance audio signal corresponding to the person background image as an input, and generate a silence synthesis image by using only the person background image as an input, and a second artificial neural network model configured to output classification values for a preset utterance maintenance image and the silence synthesis image by using, as inputs, the preset utterance maintenance image and the silence synthesis image from the first artificial neural network model.
An image synthesis device according to a disclosed embodiment has one or more processors and a memory which stores one or more programs executed by the one or more processors. The image synthesis device includes a first artificial neural network provided to learn each of a first task of using a damaged image as an input to output a restored image and a second task of using an original image as an input to output a reconstructed image, and a second artificial neural network connected to an output layer of the first artificial neural network, and trained to use the reconstructed image output from the first artificial neural network as an input and improve the image quality of the reconstructed image.
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G10L 15/16 - Speech classification or search using artificial neural networks
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
G10L 25/57 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 specially adapted for particular use for comparison or discrimination for processing of video signals
32.
Apparatus and method for generating lip sync image
An apparatus for generating a lip sync image according to a disclosed embodiment has one or more processors and a memory which stores one or more programs executed by the one or more processors. The apparatus includes a first artificial neural network model configured to generate an utterance match synthesis image by using a person background image and an utterance match audio signal corresponding to the person background image as an input, and generate an utterance mismatch synthesis image by using the person background image and an utterance mismatch audio signal not corresponding to the person background image as an input, and a second artificial neural network model configured to output classification values for an input pair in which an image and a voice match and an input pair in which an image and a voice do not match, by using the input pairs as an input.
A computing device according to an embodiment includes one or more processors, a memory storing one or more programs executed by the one or more processors, a standby state image generating module configured to generate a standby state image in which a person is in a standby state, and to generate a back-motion image set including a plurality of back-motion images at a preset frame interval from the standby state image for image interpolation with a preset reference frame of the standby state image, a speech state image generating module configured to generate a speech state image in which the person is in a speech state based on a source of speech content, and an image playback module configured to generate a synthetic speech image by combining the standby state image and the speech state image while playing the standby state image.
Disclosed are a method for providing a speech video, and a computing device for executing same. A computing device according to an embodiment disclosed herein comprises: one or more processors; and memory in which one or more programs executed by the one or more processors are stored. The computing device includes: a standby state video generation module for generating a standby state video in which a person in the video is in a standby state, and generating a back motion image set including a plurality of back motion images at preset frame intervals of a standby state video in order to perform image interpolation between preset reference frames of the standby state video; a speaking state video generation module for generating, on the basis of a source of speech content, a speaking state video in which the person in the video is in a speaking state; and a video playback module for generating a synthesized speech video by synthesizing the standby state video and the speaking state video while playing back the standby state video.
G10L 21/10 - Transforming into visible information
G10L 13/02 - Methods for producing synthetic speech; Speech synthesisers
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronizing decoder's clock; Client middleware
35.
METHOD FOR PROVIDING UTTERANCE IMAGE AND COMPUTING DEVICE FOR PERFORMING SAME
Disclosed are a method for providing an utterance image and a computing device for performing same. The computing device according to a disclosed embodiment relates to a computing device comprising one or more processors and a memory for storing one or more programs executed by the one or more processors, and the computing device comprises: a standby state image generation module for generating a standby state image in which a person in an image is in a standby state; an interpolation image generation module for generating an interpolation image set for interpolation between the standby state image and a pre-stored utterance preparation image; and an image playback module that, when an utterance state image is generated, generates a connection image for connecting the standby state image and the utterance state image on the basis of the interpolation image set.
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronizing decoder's clock; Client middleware
Disclosed are a method for providing a speech video and a computing device for executing the method. The method for providing a speech video according to one embodiment is a method performed in a computing device having one or more processors and a memory for storing one or more programs executed by means of the one or more processors, the method comprising the steps of: generating a standby state video in which a character in the video is in a standby state; generating a plurality of back motion images at a predetermined frame interval in the standby state video, for image interpolation with a predetermined reference frame of the standby state video; generating a speaking state video in which the character in the video is in a speaking state, on the basis of a source of speech contents; returning the standby state video being played to the reference frame on the basis of the plurality of back motion images of the standby state video, at the point in time at which the speaking state video has been generated; and generating a synthesized speaking video by synthesizing the reference frame with the frames of the speaking state video.
A device which generates a speech moving image includes a first encoder, a second encoder, a combination unit, and an image reconstruction unit. The first encoder receives a person background image, which is a video part of the speech moving image of a person and in which a portion related to speech of the person is covered with a mask, extracts an image feature vector from the person background image, and compresses the extracted image feature vector. The second encoder receives a speech audio signal that is an audio part of the speech moving image, extracts a voice feature vector from the speech audio signal, and compresses the extracted voice feature vector. The combination unit generates a combination vector of the compressed image feature vector and the compressed voice feature vector. The image reconstruction unit reconstructs the speech moving image of the person with the combination vector as an input.
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
G10L 21/055 - Time compression or expansion for synchronising with other signals, e.g. video signals
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 characterised by the analysis technique using neural networks
38.
Method and device for generating speech video using audio signal
A device according to an embodiment has one or more processors and a memory storing one or more programs executable by the one or more processors. The device includes a first encoder configured to receive a person background image corresponding to a video part of a speech video of a person and extract an image feature vector from the person background image, a second encoder configured to receive a speech audio signal corresponding to an audio part of the speech video and extract a voice feature vector from the speech audio signal, a combiner configured to generate a combined vector by combining the image feature vector output from the first encoder and the voice feature vector output from the second encoder, and a decoder configured to reconstruct the speech video of the person using the combined vector as an input.
G10L 17/02 - Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
G10L 21/10 - Transforming into visible information
H04N 21/2368 - Multiplexing of audio and video streams
H04N 21/439 - Processing of audio elementary streams
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 characterised by the analysis technique using neural networks
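The device in this family is essentially two encoders feeding one decoder. A compact sketch of that layout; all layer sizes, the flattened image handling, and the masking policy (speech-related region zeroed in the background image) are assumptions:

    import torch
    import torch.nn as nn

    class SpeechVideoGenerator(nn.Module):
        def __init__(self):
            super().__init__()
            self.image_encoder = nn.Sequential(           # person background image
                nn.Flatten(), nn.Linear(3 * 64 * 64, 256))
            self.audio_encoder = nn.Linear(80, 256)       # speech audio features
            self.decoder = nn.Sequential(
                nn.Linear(512, 3 * 64 * 64), nn.Sigmoid())

        def forward(self, person_background, speech_audio):
            img = self.image_encoder(person_background)   # image feature vector
            aud = self.audio_encoder(speech_audio)        # voice feature vector
            combined = torch.cat([img, aud], dim=-1)      # combiner
            return self.decoder(combined).view(-1, 3, 64, 64)

    gen = SpeechVideoGenerator()
    frame = gen(torch.rand(2, 3, 64, 64),                 # speech part masked out
                torch.rand(2, 80))                        # per-frame audio feature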
39.
LEARNING METHOD FOR GENERATING LIP-SYNC VIDEO ON BASIS OF MACHINE LEARNING AND LIP-SYNC VIDEO GENERATING DEVICE FOR EXECUTING SAME
Disclosed are a learning method for generating a lip-sync video on the basis of machine learning, and a lip-sync video generating device for executing the method. A lip-sync video generating device based on machine learning according to a disclosed embodiment comprises: a video synthesis model which is an artificial neural network model, and generates a lip-sync video by using a person background video and an utterance audio signal as an input; and a lip-sync determination model which is an artificial neural network model, and determines a degree of accordance between the lip-sync video generated by the video synthesis model and the utterance audio signal input to the video synthesis model.
G10L 21/10 - Transforming into visible information
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 characterised by the analysis technique using neural networks
G10L 25/57 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 specially adapted for particular use for comparison or discrimination for processing of video signals
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
A speech video generation device according to an embodiment includes a first encoder, which receives an input of a person background image that is a video part in a speech video of a predetermined person, and extracts an image feature vector from the person background image, a second encoder, which receives an input of a speech audio signal that is an audio part in the speech video, and extracts a voice feature vector from the speech audio signal, a combining unit, which generates a combined vector by combining the image feature vector output from the first encoder and the voice feature vector output from the second encoder, a first decoder, which reconstructs the speech video of the person using the combined vector as an input, and a second decoder, which predicts a landmark of the speech video using the combined vector as an input.
G06V 20/40 - Scenes; Scene-specific elements in video content
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 characterised by the analysis technique using neural networks
G10L 25/57 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 specially adapted for particular use for comparison or discrimination for processing of video signals
A speech video generation device according to an embodiment includes a first encoder that receives an input of a first person background image of a predetermined person partially hidden by a first mask, and extracts a first image feature vector from the first person background image, a second encoder, which receives an input of a second person background image of the person partially hidden by a second mask, and extracts a second image feature vector from the second person background image, a third encoder, which receives an input of a speech audio signal of the person, and extracts a voice feature vector from the speech audio signal, a combining unit, which generates a combined vector of the first image feature vector, the second image feature vector, and the voice feature vector, and a decoder, which reconstructs a speech video of the person using the combined vector as an input.
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
An apparatus for synthesizing speech according to an embodiment is a computing apparatus that includes one or more processors and a memory storing one or more programs executed by the one or more processors. The apparatus for synthesizing speech includes a pre-processing module that marks a preset classification symbol on each of unit texts input; and a speech synthesis module that receives each unit text marked with the classification symbol and synthesizes speech uttering the unit text based on the input unit text.
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
43.
METHOD AND DEVICE FOR GENERATING SPEECH VIDEO ON BASIS OF MACHINE LEARNING
A device for generating a speech video may include a first encoder to receive a person background image corresponding to a video part of a speech video of a person and extract an image feature vector from the person background image, a second encoder to receive a speech audio signal corresponding to an audio part of the speech video and extract a voice feature vector from the speech audio signal, a combiner to generate a combined vector by combining the image feature vector output from the first encoder and the voice feature vector output from the second encoder, and a decoder to reconstruct the speech video of the person using the combined vector as an input. The person background image input to the first encoder includes a face and an upper body of the person, with a portion related to speech of the person covered with a mask.
A device for generating a speech video according to an embodiment has one or more processors and a memory storing one or more programs executable by the one or more processors, and the device includes a video part generator configured to receive a person background image of a person and generate a video part of a speech video of the person; and an audio part generator configured to receive text, generate an audio part of the speech video of the person, and provide speech-related information occurring during the generation of the audio part to the video part generator.
Disclosed are a method for generating data using machine learning and a computing device for executing same. A computing device including a machine learning model, according to one embodiment disclosed herein comprises: one or more processors; and a memory for storing one or more programs executed by the one or more processors, wherein the machine learning model is trained so as to receive original data that is partially damaged or removed, to perform, as a main task, an operation of restoring and outputting the damaged or removed part of the original data, and to perform, as an auxiliary task, an operation of reconstructing and outputting the received original data.
A learning device for generating an image according to an embodiment disclosed is a computing device including one or more processors and a memory storing one or more programs executed by the one or more processors. The learning device includes a first machine learning model that generates a mask for masking a portion related to speech in a person basic image with the person basic image as an input, and generates a person background image by synthesizing the person basic image and the mask.
An apparatus for preprocessing text according to a disclosed embodiment includes an acquisition unit that acquires text data including a plurality of graphemes, a conversion unit that converts the plurality of graphemes into a plurality of phonemes on the basis of previously set conversion rules, and a generation unit that generates one or more tokens by grouping the plurality of phonemes, by previously set number units, on the basis of the order in which the plurality of graphemes are depicted.
G06F 40/40 - Processing or translation of natural language
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
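A toy sketch of the two steps after acquisition: rule-based grapheme-to-phoneme conversion, then grouping the phonemes into fixed-size tokens in written order. The rule table and the group size are assumptions:

    def to_phonemes(graphemes, rules):
        """Conversion unit: apply preset grapheme-to-phoneme rules."""
        return [rules.get(g, g) for g in graphemes]

    def group_tokens(phonemes, n):
        """Generation unit: tokens of n phonemes, in the order written."""
        return [tuple(phonemes[i:i + n]) for i in range(0, len(phonemes), n)]

    tokens = group_tokens(to_phonemes(list("cocoa"), rules={"c": "k"}), n=2)
    # [('k', 'o'), ('k', 'o'), ('a',)]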
48.
Neural network-based key point training apparatus and method
A neural network-based key point training apparatus according to an embodiment disclosed includes a key point model trained to extract key points from an input image and an image reconstruction model trained to reconstruct the input image with the key points output by the key point model as the input.
A device for generating a speech image according to an embodiment disclosed herein is a speech image generation device including one or more processors and a memory storing one or more programs executed by the one or more processors. The device includes a first machine learning model that extracts an image feature with a speech image of a person as an input to reconstruct the speech image from the extracted image feature and a second machine learning model that predicts the image feature with a speech audio signal of the person as an input.
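The training signal here is feature matching: the second model learns to land on the feature the first model extracts from the real speech image, so that at inference audio alone can drive the first model's reconstruction path. A sketch with assumed toy shapes:

    import torch
    import torch.nn as nn

    extract = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))  # model 1
    reconstruct = nn.Linear(128, 3 * 64 * 64)                           # model 1
    audio_to_feature = nn.Linear(80, 128)                               # model 2

    speech_image = torch.rand(2, 3, 64, 64)
    speech_audio = torch.rand(2, 80)

    target = extract(speech_image).detach()           # image feature to imitate
    loss = nn.functional.mse_loss(audio_to_feature(speech_audio), target)
    # Inference: reconstruct(audio_to_feature(audio)) yields the speech image.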
Disclosed are an image combining apparatus and method capable of improving image quality. The image combining apparatus according to a disclosed embodiment is an image combining apparatus comprising one or more processors and a memory for storing one or more programs executed by the one or more processors, and comprises: a first artificial neural network model provided to learn each of a first task for outputting a restored image by using a damaged image as an input and a second task for outputting a reconstructed image by using an original image as an input; and a second artificial neural network model trained to improve image quality of the reconstructed image by using the reconstructed image output from the first artificial neural network model as an input.
Disclosed are an image synthesis apparatus and method capable of improving image quality. The image synthesis apparatus according to a disclosed embodiment is an image synthesis apparatus comprising one or more processors and a memory for storing one or more programs executed by the one or more processors, and comprises: a first artificial neural network unit provided to learn each of a first task for outputting a restored image by using a damaged image as an input and a second task for outputting a reconstructed image by using an original image as an input; and a second artificial neural network unit connected to an output layer of the first artificial neural network unit and trained to improve image quality of the reconstructed image by using, as an input, the reconstructed image output from the first artificial neural network unit.
Disclosed are a method and an apparatus for generating a lip-sync video. An apparatus for generating a lip-sync video according to a disclosed embodiment is a lip-sync video generation apparatus comprising one or more processors and a memory for storing one or more programs executed by the one or more processors, and comprises: a first artificial neural network model for generating a synthesized utterance matching video by using, as an input, a person background video and an utterance matching audio signal corresponding to the person background video, and generating a synthesized utterance mismatch video by using, as an input, a person background video and an utterance mismatch audio signal which does not correspond to the person background video; and a second artificial neural network model which uses, as an input, an input pair in which a video and a voice match each other and an input pair in which a video and a voice do not match each other, so as to output a classification value relating thereto.
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronizing decoder's clock; Client middleware
Disclosed are a lip sync video generation apparatus and method. The lip sync video generation apparatus, according to a disclosed embodiment, is a lip sync video generation apparatus comprising one or more processors and memory storing one or more programs executed by the one or more processors, and comprises: a first artificial neural network model which generates a synthesized speech video by using, as an input, a background video of a person and a speech audio signal corresponding to the background video of the person, and generates a synthesized silence video by using, as an input, only the background video of the person; and a second artificial neural network model which outputs classification values for a speech maintenance video and the synthesized silence video by using, as an input, a preset speech maintenance video and the synthesized silence video from the first artificial neural network model.
G10L 21/10 - Transforming into visible information
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 characterised by the analysis technique using neural networks
54.
SPEECH IMAGE PROVISION METHOD, AND COMPUTING DEVICE FOR PERFORMING SAME
A speech image provision method, and a computing device for performing same are disclosed. A computing device according to one embodiment of the disclosure relates to a computing device having one or more processors and a memory for storing one or more programs executed by the one or more processors, and comprises: a standby state image generation module for generating a standby state image in which a person in the image is in a standby state; a speech state image generation module for generating, on the basis of the source of speech content, a speech state image in which the person in the image is in a speech state; and an image playback module which plays back the standby state image and generates a synthesized speech image by synthesizing the standby state image being played back with the speech state image.
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronizing decoder's clock; Client middleware
Disclosed are a text-based voice synthesis method and device. A voice synthesis device according to a disclosed embodiment is a computing device comprising one or more processors and a memory for storing one or more programs executed by the one or more processors, and comprises a preprocessing module which marks a predetermined classification sign on each input unit text, and a voice synthesis module which receives each unit text marked with the classification sign, and synthesizes, on the basis of the input unit text, a voice uttering the unit text.
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G10L 13/02 - Methods for producing synthetic speech; Speech synthesisers
Disclosed are an apparatus and method for generating a speech video, wherein the apparatus and method also create landmarks along with the speech video. The disclosed apparatus for generating a speech video according to an embodiment is a computing apparatus comprising one or more processors and memory for storing one or more programs executed by the one or more processors, the apparatus comprising: a first encoder which receives an input of a person background image, which is a video part of a speech video of a prescribed person, and extracts an image feature vector from the person background image; a second encoder which receives an input of a speech audio signal, which is an audio part of the speech video, and extracts a voice feature vector from the speech audio signal; a combination unit which combines the image feature vector output from the first encoder and the voice feature vector output from the second encoder, thereby generating a combination vector; a first decoder which receives the combination vector as an input to reconstruct the speech video of the person; and a second decoder which receives the combination vector as an input to predict landmarks of the speech video.
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
G10L 19/06 - Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
G10L 21/0356 - Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for synchronising with other signals, e.g. video signals
G10L 21/055 - Time compression or expansion for synchronising with other signals, e.g. video signals
A method and an apparatus for generating a speech video are disclosed. The disclosed apparatus for generating a speech video according to an embodiment corresponds to a computing apparatus having one or more processors and a memory for storing one or more programs executed by the one or more processors, and comprises: a first encoder for receiving a first person background image of a predetermined person partially covered by a first mask and extracting a first image feature vector from the first person background image; a second encoder for receiving a second person background image of the person partially covered by a second mask and extracting a second image feature vector from the second person background image; a third encoder for receiving a speech audio signal of the person and extracting a voice feature vector from the speech audio signal; a combining unit for generating a combined vector by combining the first image feature vector output from the first encoder, the second image feature vector output from the second encoder, and the voice feature vector output from the third encoder; and a decoder for reconstructing a speech video of the person by using the combined vector as an input.
Disclosed are an apparatus and a method for preprocessing text. The apparatus for preprocessing text according to one embodiment comprises: an acquisition unit for acquiring text data comprising a plurality of graphemes; a conversion unit for converting the plurality of graphemes to a plurality of phonemes on the basis of previously set conversion rules; and a generation unit for generating one or more tokens by grouping, by previously set number units, the plurality of phonemes on the basis of the order in which the plurality of graphemes are depicted.
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
Disclosed are a method and an apparatus for generating a speech video. The disclosed speech video generating apparatus according to an embodiment corresponds to a speech video generating apparatus having at least one processor and a memory for storing at least one program executed by the at least one processor, and comprises: a first machine learning model which receives an input of a speech video of a person, extracts a video feature therefrom, and reconstructs the speech video from the extracted video feature; and a second machine learning model which receives an input of a speech audio signal of a person and predicts a video feature therefrom.
G10L 21/10 - Transforming into visible information
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 characterised by the analysis technique using neural networks
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronizing decoder's clock; Client middleware
G06N 3/04 - Architecture, e.g. interconnection topology
Disclosed are an utterance moving image generation method and apparatus. An utterance moving image generation apparatus according to a disclosed embodiment is a computing device comprising one or more processors and memory for storing one or more programs executed by the one or more processors, and comprises: a first encoder for receiving an input of a person background image which is the video portion of an utterance moving image of a predetermined person and in which a portion of the person related to utterance is covered by a mask, extracting an image feature vector from the person background image, and compressing the extracted image feature vector; a second encoder for receiving an input of an utterance audio signal that is the audio portion of the utterance moving image, extracting a voice feature vector from the utterance audio signal, and compressing the extracted voice feature vector; a combining unit for generating a combined vector by combining the compressed image feature vector output from the first encoder and the compressed voice feature vector output from the second encoder; and an image reconstruction unit for reconstructing the utterance moving image of the person by using the combined vector as an input.
H04N 21/439 - Processing of audio elementary streams
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronizing decoder's clock; Client middleware
Disclosed are neural network-based key point training apparatus and method. The key point training apparatus according to one embodiment disclosed comprises: a key point model trained to extract key points from an input image; and an image reconstruction model trained to reconstruct the input image with the key points outputted by the key point model as the input.
A learning device and method for image generation are disclosed. A learning device for image generation, according to one disclosed embodiment, is a computing device including one or more processors and a memory for storing one or more programs executed by the one or more processors, and comprises a first machine learning model which uses a basic image of a person as an input to generate a mask for masking a portion of the basic image related to speech, and which combines the basic image of the person and the mask to generate a background image of the person.
A method for providing a natural language conversation, which is implemented by an interactive agent system, may include receiving a natural language input, determining a user intent based on the natural language input, and providing a natural language response corresponding to the natural language input, based on the natural language input and/or the determined user intent, which is associated with execution of a specific task, provision of specific information, and/or a simple statement. The provision of the natural language response includes determining whether a first condition is satisfied, based on whether all of the information required can be obtained from the natural language input without requesting additional information, and, when the first condition is satisfied, determining whether a second condition is satisfied and providing a natural language response belonging to a category of substantial replies when the second condition is satisfied.
H04L 51/02 - User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail using automatic reactions or user delegation, e.g. automatic replies or chatbot-generated messages
64.
Method and computer device for providing natural language conversation by providing interjection response in timely manner, and computer-readable recording medium
A method for providing natural language conversation is implemented by an interactive agent system. The method for providing natural language conversation according to an embodiment of the present invention includes receiving a natural language input, determining a user intent by processing the natural language input, and providing a natural language response corresponding to the natural language input based on at least one of the natural language input and the determined user intent. The natural language response may be provided by determining whether a predetermined first condition is satisfied, providing a natural language response belonging to a category of substantial replies when the first condition is satisfied, determining whether a predetermined second condition is satisfied when the first condition is not satisfied, and providing a natural language response belonging to a category of interjections when the second condition is satisfied.
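Read as control flow, the embodiment above is a two-gate reply selector: a substantial reply when the first condition holds, otherwise an interjection when the second condition holds. A sketch with stand-in predicates and reply texts (both are assumptions, not the system's actual logic):

    def respond(natural_language_input, first_condition, second_condition):
        """Select the response category for one turn (predicates are stand-ins)."""
        if first_condition(natural_language_input):
            return ("substantial reply", "Here is the answer you asked for.")
        if second_condition(natural_language_input):
            return ("interjection", "I see, go on...")   # timely interjection
        return (None, "")                                 # no response this turn

    category, text = respond(
        "Book me a flight to Seoul tomorrow",
        first_condition=lambda s: "flight" in s and "tomorrow" in s,
        second_condition=lambda s: len(s) > 0,
    )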