Disclosed herein are system, method, and computer program product aspects. According to some aspects, a computing device (e.g., a server, a cloud-based device, an application-service device, etc.) may identify a characteristic of content received via a recording application on a user device (e.g., a mobile device, a smart device, a computing device, etc.). A type of the user device may be determined based on an identifier received with the content. Based on the type of the user device, an instruction may be sent to the user device that causes a change in an operational state of a component of the user device that is utilized by the recording application. Remediation instructions that remediate the characteristic of the content may be sent to the user device based on an indication of the change in the operational state of the audio component.
Techniques have been developed to facilitate (1) the capture and pitch correction of vocal performances on handheld or other portable computing devices and (2) the mixing of such pitch-corrected vocal performances with backing tracks for audible rendering on targets that include such portable computing devices as well as desktops, workstations, gaming stations, and even telephony targets. Implementations of the described techniques employ signal processing techniques and allocations of system functionality that are suitable given the generally limited capabilities of such handheld or portable computing devices and that facilitate efficient encoding and communication of the pitch-corrected vocal performances (or precursors or derivatives thereof) via wireless and/or wired bandwidth-limited networks for rendering on portable computing devices or other targets.
G10L 21/0356 - Speech intelligibility enhancement, e.g. noise reduction or echo cancellation, by changing the amplitude for synchronization with other signals, e.g. video signals
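As a rough illustration of the pitch correction mentioned in the abstract above, the following minimal sketch snaps a detected vocal pitch to the nearest note of a score. The function names, the A4 = 440 Hz tuning reference, and the representation of the score as a list of MIDI note numbers are all assumptions made for this example; the source does not specify them.

```python
import math

A4_HZ = 440.0   # tuning reference (assumption, not from the source)
A4_MIDI = 69    # MIDI note number of A4


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a fractional MIDI note number."""
    return A4_MIDI + 12.0 * math.log2(freq_hz / A4_HZ)


def midi_to_hz(midi_note):
    """Convert a MIDI note number back to a frequency in Hz."""
    return A4_HZ * 2.0 ** ((midi_note - A4_MIDI) / 12.0)


def snap_to_score(freq_hz, score_notes):
    """Return the frequency of the score note nearest to the detected pitch."""
    detected = hz_to_midi(freq_hz)
    target = min(score_notes, key=lambda note: abs(note - detected))
    return midi_to_hz(target)


def correction_ratio(freq_hz, score_notes):
    """Ratio by which a (hypothetical) pitch shifter would scale the input."""
    return snap_to_score(freq_hz, score_notes) / freq_hz
```

For example, `snap_to_score(450.0, [60, 69, 72])` selects MIDI note 69, so a slightly sharp A4 is pulled back to 440 Hz. A real implementation would also need pitch detection and a resynthesis step, which are out of scope here.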
3.
Audiovisual collaboration method with latency management for wide-area broadcast
Techniques have been developed to facilitate the livestreaming of group audiovisual performances. Audiovisual performances including vocal music are captured and coordinated with performances of other users in ways that can create compelling user and listener experiences. For example, in some cases or embodiments, duets with a host performer may be supported in a sing-with-the-artist style audiovisual livestream in which aspiring vocalists request or queue particular songs for a live radio show entertainment format. The developed techniques provide a communications latency-tolerant mechanism for synchronizing vocal performances captured at geographically-separated devices (e.g., at globally-distributed, but network-connected mobile phones or tablets or at audiovisual capture devices geographically separated from a live studio).
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronizing decoder's clock; Client middleware
H04L 65/611 - Streaming of multimedia packets to support one-way streaming services, e.g. Internet radio, for multicast or broadcast
H04L 65/612 - Streaming of multimedia packets to support one-way streaming services, e.g. Internet radio, for unicast
H04L 65/75 - Media network packet handling
H04N 21/242 - Synchronization processes, e.g. processing of program clock references [PCR]
H04N 21/462 - Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a head-end, or controlling the complexity of a video stream by scaling the resolution o
H04N 21/4788 - Supplemental services, e.g. displaying phone caller identification or shopping application, communicating with other users, e.g. chatting
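One generic way to make mixing tolerant of network delay, sketched here under assumptions that are not taken from the source, is to tag each captured audio chunk with its position in the shared backing track and to mix by that position instead of by arrival time. The chunk representation (a dict from backing-track position in milliseconds to a list of samples) and the function name are hypothetical, and this is not necessarily the mechanism the abstract above describes.

```python
def mix_by_backing_position(host_chunks, guest_chunks):
    """Mix two performances chunk by chunk, keyed on backing-track position.

    Each argument maps a backing-track position in milliseconds to a list of
    samples captured at that position. Arrival order does not matter, because
    alignment uses only the position keys.
    """
    mixed = {}
    # Only positions present in both performances can be mixed.
    for position in sorted(host_chunks.keys() & guest_chunks.keys()):
        host = host_chunks[position]
        guest = guest_chunks[position]
        # Average the two signals sample by sample to avoid clipping.
        mixed[position] = [(h + g) / 2.0 for h, g in zip(host, guest)]
    return mixed
```

Because the guest's chunks carry their own positions, a guest chunk that arrives several hundred milliseconds late still lands against the correct part of the host's performance.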
4.
Audio-visual effects system for augmentation of captured performance based on content thereof
Visual effects schedules are applied to audiovisual performances with differing visual effects applied in correspondence with differing elements of musical structure. Segmentation techniques applied to one or more audio tracks (e.g., vocal or backing tracks) are used to compute some of the components of the musical structure. In some cases, applied visual effects schedules are mood-denominated and may be selected by a performer as a component of his or her visual expression or determined from an audiovisual performance using machine learning techniques.
G11B 27/02 - Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers or information carriers
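The idea of a mood-denominated visual effects schedule applied per element of musical structure can be pictured as a simple lookup. In this minimal sketch the mood names, segment labels, and effect names are invented for illustration, and the segments are assumed to have been computed already by some segmentation step.

```python
# Hypothetical schedules: mood -> {musical segment label -> visual effect}.
SCHEDULES = {
    "happy": {"verse": "warm_tint", "chorus": "confetti"},
    "moody": {"verse": "desaturate", "chorus": "slow_zoom"},
}


def effects_timeline(segments, mood, default="none"):
    """Map (start_sec, end_sec, label) segments to (start, end, effect) entries.

    Segment labels without an entry in the chosen schedule get the default.
    """
    schedule = SCHEDULES[mood]
    return [
        (start, end, schedule.get(label, default))
        for start, end, label in segments
    ]
```

Calling `effects_timeline([(0.0, 15.0, "verse"), (15.0, 30.0, "chorus")], "happy")` yields a timeline in which the verse gets `warm_tint` and the chorus gets `confetti`.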
Audiovisual performances, including vocal music, are captured and coordinated with those of other users in ways that create compelling user experiences. In some cases, the vocal performances of individual users are captured (together with performance synchronized video) on mobile devices, television-type display and/or set-top box equipment in the context of karaoke-style presentations of lyrics in correspondence with audible renderings of a backing track. Contributions of multiple vocalists are coordinated and mixed in a manner that selects for visually prominent presentation performance synchronized video of one or more of the contributors. Prominence of particular performance synchronized video may be based, at least in part, on computationally-defined audio features extracted from (or computed over) captured vocal audio. Over the course of a coordinated audiovisual performance timeline, these computationally-defined audio features are selective for performance synchronized video of one or more of the contributing vocalists.
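To make the notion of audio-feature-driven visual prominence concrete, the sketch below picks, for each time window, the vocalist whose captured audio has the highest RMS energy. RMS energy is only one possible computationally-defined audio feature and is chosen here as an assumption; the function names and the window-based layout are likewise hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square energy of a list of samples (0.0 for an empty list)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def prominent_vocalist_per_window(tracks, window):
    """Return, per window, the name of the vocalist with the highest RMS.

    tracks maps a vocalist name to a list of audio samples; window is the
    window length in samples. Tracks are truncated to the shortest one.
    """
    length = min(len(samples) for samples in tracks.values())
    choices = []
    for start in range(0, length, window):
        energies = {
            name: rms(samples[start:start + window])
            for name, samples in tracks.items()
        }
        choices.append(max(energies, key=energies.get))
    return choices
```

The resulting list can then drive which contributor's performance-synchronized video is shown most prominently at each point of the timeline.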
Disclosed herein are computer-implemented method, system, and computer-readable storage-medium embodiments for implementing template-based excerpting and rendering of multimedia performances technologies. An embodiment includes at least one computer processor configured to retrieve a first content instance and apply a template that results in transforming the first content instance. The first content instance may include a plurality of structural elements. The first content instance may be transformed by a rendering engine running on the at least one computer processor and/or transmitted to a content-playback device. An embodiment of transforming the first content instance includes trimming the content instance based on requirements provided by social media platforms.
Disclosed herein are computer-implemented method, system, and computer-readable storage-medium embodiments for implementing template-based excerpting and rendering of multimedia performances technologies. An embodiment includes at least one computer processor configured to retrieve a first content instance and corresponding first metadata. The first content instance may include a first plurality of structural elements, with at least one structural element corresponding to at least part of the first metadata. The first content instance may be transformed by a rendering engine running on the at least one computer processor and/or transmitted to a content-playback device.
User interface techniques provide user vocalists with mechanisms for seeding subsequent performances by other users (e.g., joiners). A seed may be a full-length seed spanning much or all of a pre-existing audio (or audiovisual) work and mixing, to seed further contributions of one or more joiners, a user's captured media content for at least some portions of the audio (or audiovisual) work. A short seed may span less than all (and in some cases, much less than all) of the audio (or audiovisual) work. For example, a verse, chorus, refrain, hook or other limited “chunk” of an audio (or audiovisual) work may constitute a seed. A seeding user's call invites other users to join the full-length or short-form seed by singing along, singing a particular vocal part or musical section, singing harmony or other duet part, rapping, talking, clapping, recording video, adding a video clip from camera roll, etc. The resulting group performance, whether full-length or just a chunk, may be posted, livestreamed, or otherwise disseminated in a social network.
Techniques have been developed to facilitate the livestreaming of group audiovisual performances. Audiovisual performances including vocal music are captured and coordinated with performances of other users in ways that can create compelling user and listener experiences. For example, in some cases or embodiments, duets with a host performer may be supported in a sing-with-the-artist style audiovisual livestream in which aspiring vocalists request or queue particular songs for a live radio show entertainment format. The developed techniques provide a communications latency-tolerant mechanism for synchronizing vocal performances captured at geographically-separated devices (e.g., at globally-distributed, but network-connected mobile phones or tablets or at audiovisual capture devices geographically separated from a live studio).
H04N 21/233 - Processing of audio elementary streams
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronizing decoder's clock; Client middleware
H04L 65/75 - Media network packet handling
10.
Crowd-sourced technique for pitch track generation
Digital signal processing and machine learning techniques can be employed in a vocal capture and performance social network to computationally generate vocal pitch tracks from a collection of vocal performances captured against a common temporal baseline such as a backing track or an original performance by a popularizing artist. In this way, crowd-sourced pitch tracks may be generated and distributed for use in subsequent karaoke-style vocal audio captures or other applications. Large numbers of performances of a song can be used to generate a pitch track. Computationally determined pitch trackings from individual audio signal encodings of the crowd-sourced vocal performance set are aggregated and processed as an observation sequence of a trained Hidden Markov Model (HMM) or other statistical model to produce an output pitch track.
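The aggregation step can be illustrated with a small hand-set HMM decoded by the Viterbi algorithm. This is a sketch under stated assumptions, not the trained model the abstract refers to: hidden states are the MIDI notes seen anywhere in the crowd's pitch trackings, each singer is assumed to hit the true note independently with probability `p_correct`, the true note is assumed to persist from frame to frame with probability `p_stay`, and all performances are assumed to be non-empty, quantized to integer notes, and of equal length.

```python
import math


def viterbi_pitch_track(performances, p_correct=0.7, p_stay=0.9):
    """Decode the most likely note sequence from several sung pitch tracks.

    performances is a list of equal-length lists of integer MIDI notes,
    one list per singer, aligned to a common temporal baseline.
    """
    n_frames = len(performances[0])
    states = sorted({note for perf in performances for note in perf})
    k = len(states)
    if k == 1:
        return [states[0]] * n_frames

    log_hit = math.log(p_correct)
    log_miss = math.log((1.0 - p_correct) / (k - 1))
    log_stay = math.log(p_stay)
    log_move = math.log((1.0 - p_stay) / (k - 1))

    def emission(frame, state):
        # Log-likelihood of all singers' notes at this frame given the state.
        votes = sum(1 for perf in performances if perf[frame] == state)
        return votes * log_hit + (len(performances) - votes) * log_miss

    def transition(prev, cur):
        return log_stay if prev == cur else log_move

    # Uniform prior over states at the first frame.
    score = {s: emission(0, s) for s in states}
    backpointers = []
    for frame in range(1, n_frames):
        new_score, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: score[p] + transition(p, s))
            pointers[s] = best_prev
            new_score[s] = (
                score[best_prev] + transition(best_prev, s) + emission(frame, s)
            )
        score = new_score
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    last = max(states, key=score.get)
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path
```

With three singers where one drifts off pitch in two frames, for instance `[[60, 60, 62, 62], [60, 60, 62, 62], [60, 61, 62, 64]]`, the decoded track follows the majority and ignores the isolated errors.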
Disclosed herein are computer-implemented method, system, and computer-readable storage-medium embodiments for implementing user-generated templates for segmented multimedia performances. An embodiment includes at least one computer processor configured to transmit a first version of a content instance and corresponding metadata. The first version of the content instance may include a plurality of structural elements, with at least one structural element corresponding to at least part of the metadata. The first content instance may be transformed by a rendering engine triggered by the at least one computer processor.
Disclosed herein are computer-implemented method, system, and computer-readable storage-medium embodiments for implementing densification in music search. An embodiment includes processor(s) configured to obtain a first feature set extracted from a first audio recording, and a first fingerprint of the first audio recording; and evaluate, using at least one first machine-learning algorithm, a similarity index corresponding to the first audio recording with respect to at least one second audio recording, considering: the first feature set extracted from the first audio recording, and a second feature set extracted from the at least one second audio recording; or the first fingerprint of the first audio recording, and at least one second fingerprint of the at least one second audio recording. Further embodiments include defining arrangement group(s) including the first audio recording and the at least one second audio recording with similarity index within a predetermined range, and outputting densified response(s) to a search query.
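As a minimal sketch of grouping recordings by a similarity index, the example below uses cosine similarity between feature vectors as a stand-in for the machine-learning similarity evaluation, which the source does not detail. The function names, the feature vectors, and the default range of 0.9 to 1.0 are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors, capped at 1.0."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 0.0
    # Cap to guard against floating-point results slightly above 1.0.
    return min(1.0, dot / norm)


def arrangement_group(query_id, features, low=0.9, high=1.0):
    """Group the query recording with recordings whose similarity is in range.

    features maps a recording id to its feature vector.
    """
    query = features[query_id]
    group = [query_id]
    for rec_id, vector in features.items():
        if rec_id == query_id:
            continue
        if low <= cosine_similarity(query, vector) <= high:
            group.append(rec_id)
    return group
```

A search front end could then return one representative per group instead of many near-identical arrangements, which is one plausible reading of a densified response.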
Vocal audio of a user together with performance synchronized video is captured and coordinated with audiovisual contributions of other users to form composite duet-style or glee club-style or window-paned music video-style audiovisual performances. In some cases, the vocal performances of individual users are captured (together with performance synchronized video) on mobile devices, television-type display and/or set-top box equipment in the context of karaoke-style presentations of lyrics in correspondence with audible renderings of a backing track. Contributions of multiple vocalists are coordinated and mixed in a manner that selects for presentation, at any given time along a given performance timeline, performance synchronized video of one or more of the contributors. Selections are in accord with a visual progression that codes a sequence of visual layouts in correspondence with other coded aspects of a performance score such as pitch tracks, backing audio, lyrics, sections and/or vocal parts.
G11B 27/02 - Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers or information carriers
G11B 27/031 - Electronic editing of digitized analogue information signals, e.g. audio or video signals
G11B 27/10 - Indexing; Addressing; Timing or synchronizing; Measuring tape travel
G11B 27/28 - Indexing; Addressing; Timing or synchronizing; Measuring tape travel by using information detectable on the record carrier, by using information signals recorded by the same method as the main recording
Captured vocals may be automatically transformed using advanced digital signal processing techniques that provide captivating applications, and even purpose-built devices, in which mere novice user-musicians may generate, audibly render and share musical performances. In some cases, the automated transformations allow spoken vocals to be segmented, arranged, temporally aligned with a target rhythm, meter or accompanying backing tracks and pitch corrected in accord with a score or note sequence. Speech-to-song music applications are one such example. In some cases, spoken vocals may be transformed in accord with musical genres such as rap using automated segmentation and temporal alignment techniques, often without pitch correction. Such applications, which may employ different signal processing and different automated transformations, may nonetheless be understood as speech-to-rap variations on the theme.
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source-filter models or psychoacoustic analysis
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source-filter models or psychoacoustic analysis, using spectral analysis, e.g. transform vocoders or subband vocoders
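The temporal alignment of segmented speech with a target rhythm, as described in the abstract above, can be sketched as quantizing segment onsets to a beat grid. This is a simplified illustration: the tempo, the number of grid subdivisions per beat, and the function name are assumptions, and the segmentation that produces the onsets is taken as given.

```python
def snap_onsets_to_grid(onsets_sec, bpm, subdivisions=2):
    """Move each segment onset (in seconds) to the nearest rhythmic grid point.

    The grid step is one beat divided by the number of subdivisions, so
    subdivisions=2 at 120 BPM gives a grid point every 0.25 seconds.
    """
    step = 60.0 / bpm / subdivisions
    return [round(onset / step) * step for onset in onsets_sec]
```

At 120 BPM with two subdivisions per beat, onsets at 0.1, 0.62 and 1.13 seconds move to 0.0, 0.5 and 1.25 seconds. Two nearby onsets can land on the same grid point, so a fuller version would need a rule for resolving such collisions.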
Visual effects, including augmented reality-type visual effects, are applied to audiovisual performances with differing visual effects and/or parameterizations thereof applied in correspondence with computationally determined audio features or elements of musical structure coded in temporally-synchronized tracks or computationally determined therefrom. Segmentation techniques applied to one or more audio tracks (e.g., vocal or backing tracks) are used to compute some of the components of the musical structure. In some cases, applied visual effects are based on an audio feature computationally extracted from a captured audiovisual performance or from an audio track temporally-synchronized therewith.
G10H 1/00 - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE Details of electrophonic musical instruments
16.
Wireless handheld audio capture device and multi-vocalist method for audiovisual media application
Embodiments described herein relate generally to systems comprising a display device, a display device-coupled computing platform, a mobile device in communication with the computing platform, and a content server. Methods and techniques for capture and/or processing of audiovisual performances are described, in particular techniques suitable for use with display device-connected computing platforms for rendering vocal performances captured by a handheld computing device.
A63F 13/215 - Input arrangements for video game devices characterized by their sensors, purposes or types, comprising means for detecting acoustic signals, e.g. using a microphone
A63F 13/22 - Setup operations, e.g. calibration, key configuration or button assignment
A63F 13/537 - Controlling the output signals based on the game progress, involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game, using indicators, e.g. showing the condition of a game character on screen
A63F 13/655 - Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or a game-integrated level editor, automatically by game devices or servers from real world data, e.g. live measurements in real racing competitions, by importing photos, e.g. of the player
A63F 13/814 - Musical performances, e.g. by evaluating the player's ability to follow a notation
A63F 13/45 - Controlling the progress of the video game
G10L 21/013 - Adapting to target pitch
G10L 25/57 - Speech or voice analysis techniques not restricted to a single one of groups, specially adapted for particular use, for comparison or discrimination, for processing of video signals
17.
Crowd-sourced device latency estimation for synchronization of recordings in vocal capture applications
Latency on different devices (e.g., devices of differing brand, model, vintage, etc.) can vary significantly and tens of milliseconds can affect human perception of lagging and leading components of a performance. As a result, use of a uniform latency estimate across a wide variety of devices is unlikely to provide good results, and hand-estimating round-trip latency across a wide variety of devices is costly and would constantly need to be updated for new devices. Instead, a system has been developed for crowdsourcing latency estimates.
G10L 25/60 - Speech or voice analysis techniques not restricted to a single one of groups, specially adapted for particular use, for comparison or discrimination, for measuring the quality of voice signals
H04R 29/00 - Monitoring arrangements; Testing arrangements
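A crowdsourced latency estimate of the kind described in the abstract above can be sketched as a per-device-model aggregate of many individual reports. The sketch below uses the median, which is an assumption chosen because it resists outliers; the report format, the function name, and the minimum number of reports are also hypothetical.

```python
from collections import defaultdict
from statistics import median


def estimate_latency_by_model(reports, min_reports=3):
    """Aggregate (device_model, latency_ms) reports into per-model estimates.

    Models with fewer than min_reports reports are omitted, so a caller can
    fall back to some default estimate for them.
    """
    by_model = defaultdict(list)
    for model, latency_ms in reports:
        by_model[model].append(latency_ms)
    return {
        model: median(values)
        for model, values in by_model.items()
        if len(values) >= min_reports
    }
```

Given reports of 120, 130 and 400 ms for one model, the estimate is 130 ms, so a single bad measurement does not skew the value used to align that model's recordings.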
18.
Coordinating and mixing vocals captured from geographically distributed performers
Despite many practical limitations imposed by mobile device platforms and application execution environments, vocal musical performances may be captured and continuously pitch-corrected for mixing and rendering with backing tracks in ways that create compelling user experiences. Based on the techniques described herein, even mere amateurs are encouraged to share with friends and family or to collaborate and contribute vocal performances as part of virtual “glee clubs.” In some implementations, these interactions are facilitated through social network- and/or eMail-mediated sharing of performances and invitations to join in a group performance. Using uploaded vocals captured at clients such as a mobile device, a content server (or service) can mediate such virtual glee clubs by manipulating and mixing the uploaded vocal performances of multiple contributing vocalists.
Visual effects, including augmented reality-type visual effects, are applied to audiovisual performances with differing visual effects and/or parameterizations thereof applied in correspondence with computationally determined audio features or elements of musical structure coded in temporally-synchronized tracks or computationally determined therefrom. Segmentation techniques applied to one or more audio tracks (e.g., vocal or backing tracks) are used to compute some of the components of the musical structure. In some cases, applied visual effects are based on an audio feature computationally extracted from a captured audiovisual performance or from an audio track temporally-synchronized therewith.
User interface techniques provide user vocalists with mechanisms for forward and backward traversal of audiovisual content, including pitch cues, waveform- or envelope-type performance timelines, lyrics and/or other temporally-synchronized content at record-time, during edits, and/or in playback. Recapture of selected performance portions, coordination of group parts, and overdubbing may all be facilitated. Direct scrolling to arbitrary points in the performance timeline, lyrics, pitch cues and other temporally-synchronized content allows a user to conveniently move through a capture or audiovisual edit session. In some cases, a user vocalist may be guided through the performance timeline, lyrics, pitch cues and other temporally-synchronized content in correspondence with group part information such as in a guided short-form capture for a duet. A scrubber allows user vocalists to conveniently move forward and backward through the temporally-synchronized content.
G10H 1/00 - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE Details of electrophonic musical instruments
21.
Audiovisual collaboration method with latency management for wide-area broadcast
Techniques have been developed to facilitate the livestreaming of group audiovisual performances. Audiovisual performances including vocal music are captured and coordinated with performances of other users in ways that can create compelling user and listener experiences. For example, in some cases or embodiments, duets with a host performer may be supported in a sing-with-the-artist style audiovisual livestream in which aspiring vocalists request or queue particular songs for a live radio show entertainment format. The developed techniques provide a communications latency-tolerant mechanism for synchronizing vocal performances captured at geographically-separated devices (e.g., at globally-distributed, but network-connected mobile phones or tablets or at audiovisual capture devices geographically separated from a live studio).
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronizing decoder's clock; Client middleware
H04N 21/4788 - Supplemental services, e.g. displaying phone caller identification or shopping application, communicating with other users, e.g. chatting
H04N 21/242 - Synchronization processes, e.g. processing of program clock references [PCR]
H04N 21/462 - Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a head-end, or controlling the complexity of a video stream by scaling the resolution o
H04L 65/611 - Streaming of multimedia packets to support one-way streaming services, e.g. Internet radio, for multicast or broadcast
H04L 65/612 - Streaming of multimedia packets to support one-way streaming services, e.g. Internet radio, for unicast
H04L 65/75 - Media network packet handling
22.
User-generated templates for segmented multimedia performance
Disclosed herein are computer-implemented method, system, and computer-readable storage-medium embodiments for implementing user-generated templates for segmented multimedia performances. An embodiment includes at least one computer processor configured to transmit a first version of a content instance and corresponding metadata. The first version of the content instance may include a plurality of structural elements, with at least one structural element corresponding to at least part of the metadata. The first content instance may be transformed by a rendering engine triggered by the at least one computer processor.
Social music system and method with continuous, real-time pitch correction of vocal performance and dry vocal capture for subsequent re-rendering based on selectively applicable vocal effect(s) schedule(s)
Embodiments described provide a method for mixing vocal performances from different vocalists. A vocal score temporally synchronized with a corresponding backing track and lyrics is retrieved via a communications interface of a portable computing device. A first vocal performance of a user is captured, via a microphone interface of the portable computing device, and in correspondence with the backing track. An open call indication for soliciting, from a second vocalist, a second vocal performance to be mixed for audible rendering with the first vocal performance is transmitted. A mix to one of the user and the second vocalist is provided by selecting, based on to whom the mix is provided, the mix from alternative mixes each having a different prominent vocal performance.
G10H 1/10 - Circuits for establishing the harmonic content of tones by combining tones, for obtaining chorus, celeste or ensemble effects
G10L 21/013 - Adapting to target pitch
24.
Template-based excerpting and rendering of multimedia performance
Disclosed herein are computer-implemented method, system, and computer-readable storage-medium embodiments for implementing template-based excerpting and rendering of multimedia performances technologies. An embodiment includes at least one computer processor configured to retrieve a first content instance and corresponding first metadata. The first content instance may include a first plurality of structural elements, with at least one structural element corresponding to at least part of the first metadata. The first content instance may be transformed by a rendering engine running on the at least one computer processor and/or transmitted to a content-playback device.
H04N 9/80 - Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback
Techniques have been developed to facilitate (1) the capture and pitch correction of vocal performances on handheld or other portable computing devices and (2) the mixing of such pitch-corrected vocal performances with backing tracks for audible rendering on targets that include such portable computing devices as well as desktops, workstations, gaming stations, and even telephony targets. Implementations of the described techniques employ signal processing techniques and allocations of system functionality that are suitable given the generally limited capabilities of such handheld or portable computing devices and that facilitate efficient encoding and communication of the pitch-corrected vocal performances (or precursors or derivatives thereof) via wireless and/or wired bandwidth-limited networks for rendering on portable computing devices or other targets.
G10H 1/02 - Means for controlling the tone frequencies, e.g. attack or decay; Means for producing special musical effects, e.g. vibratos or glissandos
G10L 21/013 - Adapting to target pitch
G10L 21/0356 - Speech intelligibility enhancement, e.g. noise reduction or echo cancellation, by changing the amplitude for synchronization with other signals, e.g. video signals
26.
Short segment generation for user engagement in vocal capture applications
User interface techniques provide user vocalists with mechanisms for solo audiovisual capture and for seeding subsequent performances by other users (e.g., joiners). Audiovisual capture may be against a full-length work or seed spanning much or all of a pre-existing audio (or audiovisual) work and in some cases may mix, to seed further contributions of one or more joiners, a user's captured media content for at least some portions of the audio (or audiovisual) work. A short seed or short segment may span less than all (and in some cases, much less than all) of the audio (or audiovisual) work. For example, a verse, chorus, refrain, hook or other limited chunk of an audio (or audiovisual) work may constitute a short seed or short segment. Computational techniques are described that allow a system to automatically identify suitable short seeds or short segments. After audiovisual capture against the short seed or short segment, a resulting, solo or group, full-length or short-form performance may be posted, livestreamed, or otherwise disseminated in a social network.
H04N 21/4402 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs, involving reformatting operations of video signals for household redistribution, storage or real-time display
H04N 21/439 - Processing of audio elementary streams
Audiovisual performances, including vocal music, are captured and coordinated with those of other users in ways that create compelling user experiences. In some cases, the vocal performances of individual users are captured (together with performance synchronized video) on mobile devices, television-type display and/or set-top box equipment in the context of karaoke-style presentations of lyrics in correspondence with audible renderings of a backing track. Contributions of multiple vocalists are coordinated and mixed in a manner that selects for visually prominent presentation performance synchronized video of one or more of the contributors. Prominence of particular performance synchronized video may be based, at least in part, on computationally-defined audio features extracted from (or computed over) captured vocal audio. Over the course of a coordinated audiovisual performance timeline, these computationally-defined audio features are selective for performance synchronized video of one or more of the contributing vocalists.
Disclosed herein are computer-implemented method, system, and computer-readable storage-medium embodiments for implementing template-based excerpting and rendering of multimedia performances technologies. An embodiment includes at least one computer processor configured to retrieve a first content instance and corresponding first metadata. The first content instance may include a first plurality of structural elements, with at least one structural element corresponding to at least part of the first metadata. An embodiment may further include selecting a first template comprising a first set of parameters. A parameter of the first set of parameters may be applicable to the at least one structural element. Applicable parameter(s) of the first template may be actively associated with the at least part of the first metadata corresponding to the at least one structural element. The first content instance may be transformed by a rendering engine running on the at least one computer processor.
G10H 1/00 - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE; Details of electrophonic musical instruments
G06F 16/58 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
G06F 16/907 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
29.
Crowd-sourced technique for pitch track generation
Digital signal processing and machine learning techniques can be employed in a vocal capture and performance social network to computationally generate vocal pitch tracks from a collection of vocal performances captured against a common temporal baseline such as a backing track or an original performance by a popularizing artist. In this way, crowd-sourced pitch tracks may be generated and distributed for use in subsequent karaoke-style vocal audio captures or other applications. Large numbers of performances of a song can be used to generate a pitch track. Computationally determined pitch trackings from individual audio signal encodings of the crowd-sourced vocal performance set are aggregated and processed as an observation sequence of a trained Hidden Markov Model (HMM) or other statistical model to produce an output pitch track.
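The aggregation step can be sketched as Viterbi decoding over per-frame note votes. This is a simplified illustration under stated assumptions, not the patented method: the transition and emission parameters (`stay_prob`, `smoothing`) are hand-picked rather than trained, pitch is quantized to MIDI note numbers, and the function name `aggregate_pitch_tracks` is hypothetical.

```python
import math
from collections import Counter


def aggregate_pitch_tracks(tracks, stay_prob=0.9, smoothing=0.5):
    """Decode one pitch track from many per-frame note estimates.

    tracks: list of equal-length lists of MIDI note numbers, one list per
            crowd-sourced performance (None marks an unvoiced frame).
    stay_prob, smoothing: assumed, hand-set HMM parameters (not trained).
    """
    n_frames = len(tracks[0])
    states = sorted({n for t in tracks for n in t if n is not None})
    if not states:
        return [None] * n_frames
    n_states = len(states)
    # Transition model: stay on a note with stay_prob, else move uniformly.
    if n_states > 1:
        log_stay = math.log(stay_prob)
        log_move = math.log((1.0 - stay_prob) / (n_states - 1))
    else:
        log_stay, log_move = 0.0, float("-inf")

    def log_emission(frame):
        # Emission model: smoothed histogram of the crowd's votes per note.
        votes = Counter(t[frame] for t in tracks if t[frame] is not None)
        total = sum(votes.values()) + smoothing * n_states
        return [math.log((votes.get(s, 0) + smoothing) / total) for s in states]

    score = log_emission(0)
    back = []
    for frame in range(1, n_frames):
        emit = log_emission(frame)
        new_score, pointers = [], []
        for j in range(n_states):
            candidates = [
                score[i] + (log_stay if i == j else log_move)
                for i in range(n_states)
            ]
            best_i = max(range(n_states), key=candidates.__getitem__)
            new_score.append(candidates[best_i] + emit[j])
            pointers.append(best_i)
        score = new_score
        back.append(pointers)

    # Trace the most likely state sequence back from the final frame.
    j = max(range(n_states), key=score.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [states[j] for j in path]


crowd = [
    [60, 60, 62, 62],
    [60, 60, 62, 62],
    [60, 61, 62, 62],  # one singer drifts sharp in frame 1
    [60, 60, 60, 62],  # one singer is late on the note change
]
pitch_track = aggregate_pitch_tracks(crowd)
```

The transition penalty suppresses isolated outlier votes, and the vote histogram lets the majority determine where each note change falls.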
Coordinated audio and video filter pairs are applied to enhance artistic and emotional content of audiovisual performances. Such filter pairs, when applied in audio and video processing pipelines of an audiovisual application hosted on a portable computing device (such as a mobile phone or media player, a computing pad or tablet, a game controller or a personal digital assistant or book reader) can allow user selection of effects that enhance both audio and video coordinated therewith. Coordinated audio and video are captured, filtered and rendered at the portable computing device using camera and microphone interfaces, using digital signal processing software executable on a processor and using storage, speaker and display devices of, or interoperable with, the device. By providing audiovisual capture and personalization on an intimate handheld device, social interactions and postings of a type made popular by modern social networking platforms can now be extended to audiovisual content.
G10L 21/055 - Time compression or expansion for synchronising with other signals, e.g. video signals
G11B 27/031 - Electronic editing of digitised analogue information signals, e.g. audio or video signals
G06F 3/0482 - Interaction with lists of selectable items, e.g. menus
G06F 3/04842 - Selection of displayed objects or displayed text elements
G10L 21/003 - Changing voice quality, e.g. pitch or formants
G10H 1/00 - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE; Details of electrophonic musical instruments
Vocal audio of a user together with performance synchronized video is captured and coordinated with audiovisual contributions of other users to form composite duet-style or glee club-style or window-paned music video-style audiovisual performances. In some cases, the vocal performances of individual users are captured (together with performance synchronized video) on mobile devices, television-type display and/or set-top box equipment in the context of karaoke-style presentations of lyrics in correspondence with audible renderings of a backing track. Contributions of multiple vocalists are coordinated and mixed in a manner that selects for presentation, at any given time along a given performance timeline, performance synchronized video of one or more of the contributors. Selections are in accord with a visual progression that codes a sequence of visual layouts in correspondence with other coded aspects of a performance score such as pitch tracks, backing audio, lyrics, sections and/or vocal parts.
G11B 27/02 - Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers or information carriers
G11B 27/031 - Electronic editing of digitised analogue information signals, e.g. audio or video signals
G11B 27/10 - Indexing; Addressing; Timing or synchronising; Measuring tape travel
G11B 27/28 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
Disclosed herein are computer-implemented method, system, and computer-readable storage-medium embodiments for implementing template-based excerpting and rendering of multimedia performances technologies. An embodiment includes at least one computer processor configured to retrieve a first content instance and corresponding first metadata. The first content instance may include a first plurality of structural elements, with at least one structural element corresponding to at least part of the first metadata. An embodiment may further include selecting a first template comprising a first set of parameters. A parameter of the first set of parameters may be applicable to the at least one structural element. Applicable parameter(s) of the first template may be actively associated with the at least part of the first metadata corresponding to the at least one structural element. The first content instance may be transformed by a rendering engine running on the at least one computer processor.
H04N 9/80 - Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback
Visual effects, including augmented reality-type visual effects, are applied to audiovisual performances with differing visual effects and/or parameterizations thereof applied in correspondence with computationally determined audio features or elements of musical structure coded in temporally-synchronized tracks or computationally determined therefrom. Segmentation techniques applied to one or more audio tracks (e.g., vocal or backing tracks) are used to compute some of the components of the musical structure. In some cases, applied visual effects are based on an audio feature computationally extracted from a captured audiovisual performance or from an audio track temporally-synchronized therewith.
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
H04N 21/431 - Generation of visual interfaces; Content or additional data rendering
H04N 21/434 - Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of a packetised elementary stream
H04N 21/236 - Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a uniform resource locator [URL] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit rate; Assembling of a packetised elementary stream
H04N 21/2368 - Multiplexing of audio and video streams
Captured vocals may be automatically transformed using advanced digital signal processing techniques that provide captivating applications, and even purpose-built devices, in which mere novice user-musicians may generate, audibly render and share musical performances. In some cases, the automated transformations allow spoken vocals to be segmented, arranged, temporally aligned with a target rhythm, meter or accompanying backing tracks and pitch corrected in accord with a score or note sequence. Speech-to-song music applications are one such example. In some cases, spoken vocals may be transformed in accord with musical genres such as rap using automated segmentation and temporal alignment techniques, often without pitch correction. Such applications, which may employ different signal processing and different automated transformations, may nonetheless be understood as speech-to-rap variations on the theme.
G10L 21/055 - Time compression or expansion for synchronising with other signals, e.g. video signals
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
35.
Coordinating and mixing vocals captured from geographically distributed performers
Despite many practical limitations imposed by mobile device platforms and application execution environments, vocal musical performances may be captured and continuously pitch-corrected for mixing and rendering with backing tracks in ways that create compelling user experiences. Based on the techniques described herein, even mere amateurs are encouraged to share with friends and family or to collaborate and contribute vocal performances as part of virtual “glee clubs.” In some implementations, these interactions are facilitated through social network- and/or eMail-mediated sharing of performances and invitations to join in a group performance. Using uploaded vocals captured at clients such as a mobile device, a content server (or service) can mediate such virtual glee clubs by manipulating and mixing the uploaded vocal performances of multiple contributing vocalists.
In an application that manipulates audio (or audiovisual) content, automated music creation technologies may be employed to generate new musical content using digital signal processing software hosted on handheld and/or server (or cloud-based) compute platforms to intelligently process and combine a set of audio content captured and submitted by users of modern mobile phones or other handheld compute platforms. The user-submitted recordings may contain speech, singing, musical instruments, or a wide variety of other sound sources, and the recordings may optionally be preprocessed by the handheld devices prior to submission.
G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
37.
AUDIOVISUAL COLLABORATION SYSTEM AND METHOD WITH SEED/JOIN MECHANIC
User interface techniques provide user vocalists with mechanisms for seeding subsequent performances by other users (e.g., joiners). A seed may be a full-length seed spanning much or all of a pre-existing audio (or audiovisual) work and mixing, to seed further contributions of one or more joiners, a user's captured media content for at least some portions of the audio (or audiovisual) work. A short seed may span less than all (and in some cases, much less than all) of the audio (or audiovisual) work. For example, a verse, chorus, refrain, hook or other limited chunk of an audio (or audiovisual) work may constitute a seed. A seeding user's call invites other users to join the full-length or short-form seed by singing along, singing a particular vocal part or musical section, singing harmony or other duet part, rapping, talking, clapping, recording video, adding a video clip from camera roll, etc. The resulting group performance, whether full-length or just a chunk, may be posted, livestreamed, or otherwise disseminated in a social network.
Techniques have been developed to facilitate the livestreaming of group audiovisual performances. Audiovisual performances including vocal music are captured and coordinated with performances of other users in ways that can create compelling user and listener experiences. For example, in some cases or embodiments, duets with a host performer may be supported in a sing-with-the-artist style audiovisual livestream in which aspiring vocalists request or queue particular songs for a live radio show entertainment format. The developed techniques provide a communications latency-tolerant mechanism for synchronizing vocal performances captured at geographically-separated devices (e.g., at globally-distributed, but network-connected mobile phones or tablets or at audiovisual capture devices geographically separated from a live studio).
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
H04N 21/434 - Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of a packetised elementary stream
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
H04N 21/485 - End-user interface for client configuration
User interface techniques provide user vocalists with mechanisms for forward and backward traversal of audiovisual content, including pitch cues, waveform- or envelope-type performance timelines, lyrics and/or other temporally-synchronized content at record-time, during edits, and/or in playback. Recapture of selected performance portions, coordination of group parts, and overdubbing may all be facilitated. Direct scrolling to arbitrary points in the performance timeline, lyrics, pitch cues and other temporally-synchronized content allows user to conveniently move through a capture or audiovisual edit session. In some cases, a user vocalist may be guided through the performance timeline, lyrics, pitch cues and other temporally-synchronized content in correspondence with group part information such as in a guided short-form capture for a duet. A scrubber allows user vocalists to conveniently move forward and backward through the temporally-synchronized content.
User interface techniques provide user vocalists with mechanisms for forward and backward traversal of audiovisual content, including pitch cues, waveform- or envelope-type performance timelines, lyrics and/or other temporally-synchronized content at record-time, during edits, and/or in playback. Recapture of selected performance portions, coordination of group parts, and overdubbing may all be facilitated. Direct scrolling to arbitrary points in the performance timeline, lyrics, pitch cues and other temporally-synchronized content allows user to conveniently move through a capture or audiovisual edit session. In some cases, a user vocalist may be guided through the performance timeline, lyrics, pitch cues and other temporally-synchronized content in correspondence with group part information such as in a guided short-form capture for a duet. A scrubber allows user vocalists to conveniently move forward and backward through the temporally-synchronized content.
H04N 21/439 - Processing of audio elementary streams
H04N 21/4402 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
H04N 21/466 - Learning process for intelligent management, e.g. learning user preferences for recommending movies
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
41.
Audiovisual collaboration system and method with seed/join mechanic
User interface techniques provide user vocalists with mechanisms for seeding subsequent performances by other users (e.g., joiners). A seed may be a full-length seed spanning much or all of a pre-existing audio (or audiovisual) work and mixing, to seed further contributions of one or more joiners, a user's captured media content for at least some portions of the audio (or audiovisual) work. A short seed may span less than all (and in some cases, much less than all) of the audio (or audiovisual) work. For example, a verse, chorus, refrain, hook or other limited “chunk” of an audio (or audiovisual) work may constitute a seed. A seeding user's call invites other users to join the full-length or short-form seed by singing along, singing a particular vocal part or musical section, singing harmony or other duet part, rapping, talking, clapping, recording video, adding a video clip from camera roll, etc. The resulting group performance, whether full-length or just a chunk, may be posted, livestreamed, or otherwise disseminated in a social network.
Latency on different devices (e.g., devices of differing brand, model, vintage, etc.) can vary significantly and tens of milliseconds can affect human perception of lagging and leading components of a performance. As a result, use of a uniform latency estimate across a wide variety of devices is unlikely to provide good results, and hand-estimating round-trip latency across a wide variety of devices is costly and would constantly need to be updated for new devices. Instead, a system has been developed for crowdsourcing latency estimates.
G06F 17/00 - ELECTRIC DIGITAL DATA PROCESSING; Digital computing or data processing equipment or methods, specially adapted for specific functions
H04R 29/00 - Monitoring arrangements; Testing arrangements
G10L 25/60 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals
43.
Social music system and method with continuous, real-time pitch correction of vocal performance and dry vocal capture for subsequent re-rendering based on selectively applicable vocal effect(s) schedule(s)
Vocal musical performances may be captured and, in some cases or embodiments, pitch-corrected and/or processed in accord with a user selectable vocal effects schedule for mixing and rendering with backing tracks in ways that create compelling user experiences. In some cases, the vocal performances of individual users are captured on mobile devices in the context of a karaoke-style presentation of lyrics in correspondence with audible renderings of a backing track. Such performances can be pitch-corrected in real-time at the mobile device in accord with pitch correction settings. Vocal effects schedules may also be selectively applied to such performances. In these ways, even amateur user/performers with imperfect pitch are encouraged to take a shot at “stardom” and/or take part in a game play, social network or vocal achievement application architecture that facilitates musical collaboration on a global scale and/or, in some cases or embodiments, to initiate revenue generating in-application transactions.
G10H 1/10 - Circuits for establishing the harmonic content of tones by combining tones for obtaining chorus, celeste or ensemble effects
G10L 21/013 - Adapting to target pitch
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods and services
Game software; downloadable mobile applications for online
social networking; social networking software; computer
software that provides access to an application programming
interface (api) for computer software which facilitates
online services for social networking, building social
networking applications and for allowing data retrieval,
upload, download, access and management; downloadable
computer software for modifying the appearance and enabling
transmission of images, audio-visual and video content;
computer software to enable uploading, downloading,
accessing, posting, displaying, tagging, blogging,
streaming, linking, sharing or otherwise providing
electronic media or information via computer and
communication networks; downloadable computer software for
mobile devices for personalizing video performances on a
mobile device based on interaction of user; computer
software programs for the integration of text, audio,
graphics, still images and moving pictures into an
interactive delivery for multimedia applications;
downloadable computer software for mobile devices that
enables users to transform and enhance the video and audio
capabilities of their mobile devices; downloadable computer
software executable to communicate amongst mobile devices
using audio encodings of information; downloadable computer
software executable to provide mobile devices with signal
processing capabilities; communications software for
connecting mobile users; computer software for manipulating
digital audio information for use in audio media
applications; downloadable computer software for mobile
devices that enables users to send music, text, audio,
video, and graphics information to other users of mobile
devices; downloadable computer software that creates a voice
changer feature on a mobile device; software for
manipulating and editing images, sound and video;
video-editing software; software for manipulating digital
video files; software for video and audio processing, audio
and video editing, and audio and video encoding; computer
software for manipulating digital audio information for use
in videos; downloadable computer software for mobile devices
for personalizing video files on a mobile device;
Downloadable software that enables users to upload
arrangements for songs; downloadable software used for
uploading composition arrangements.
Computer services, namely, hosting an online community
website for registered users to participate in discussions,
get feedback from their peers, form virtual communities, and
engage in social networking; providing technology via a web
site that enables internet users to share documents, images
and videos; providing temporary use of non-downloadable
software via a web site allowing web site users to upload
on-line videos for sharing with others for entertainment
purposes; providing temporary use of non-downloadable
software via a website allowing web site users to upload
arrangements for songs for sharing with others for
entertainment purposes; providing temporary use of
non-downloadable software via a website allowing web site
users to upload composition arrangements; application
service provider (asp) services featuring software to enable
uploading, posting, showing, displaying, tagging, blogging,
sharing or otherwise providing electronic media or
information over the internet or other communications
network; providing software as a service featuring software
for use in connection with transmitting, streaming, and
downloading audiovisual content.
45.
Crowd-sourced device latency estimation for synchronization of recordings in vocal capture applications
Latency on different devices (e.g., devices of differing brand, model, vintage, etc.) can vary significantly and tens of milliseconds can affect human perception of lagging and leading components of a performance. As a result, use of a uniform latency estimate across a wide variety of devices is unlikely to provide good results, and hand-estimating round-trip latency across a wide variety of devices is costly and would constantly need to be updated for new devices. Instead, a system has been developed for crowdsourcing latency estimates.
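A crowd-sourced estimate of this kind can be pictured as grouping latency reports by device model and taking a robust per-model statistic. The sketch below makes several assumptions the abstract does not state: the median as the aggregator, the `min_reports` threshold, the fallback default, and the names `estimate_latency_by_model` and `latency_for` are all illustrative.

```python
from collections import defaultdict
from statistics import median


def estimate_latency_by_model(reports, min_reports=3):
    """Aggregate crowd-sourced round-trip latency reports per device model.

    reports: iterable of (device_model, measured_latency_ms) pairs.
    min_reports: assumed minimum sample count before trusting an estimate.
    Returns a dict mapping device model to its median latency in ms.
    """
    by_model = defaultdict(list)
    for model, latency_ms in reports:
        by_model[model].append(latency_ms)
    # The median tolerates occasional bad measurements better than the mean.
    return {
        model: median(values)
        for model, values in by_model.items()
        if len(values) >= min_reports
    }


def latency_for(model, estimates, default_ms=100.0):
    """Look up a model's estimate, falling back to an assumed default."""
    return estimates.get(model, default_ms)


reports = [
    ("phone-a", 120),
    ("phone-a", 118),
    ("phone-a", 300),  # outlier, e.g. a noisy measurement
    ("phone-b", 80),   # too few reports to trust yet
]
estimates = estimate_latency_by_model(reports)
```

Because new device models accumulate reports as users record with them, the table updates itself without manual per-device measurement.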
G06F 17/00 - ELECTRIC DIGITAL DATA PROCESSING; Digital computing or data processing equipment or methods, specially adapted for specific functions
H04R 29/00 - Monitoring arrangements; Testing arrangements
G10L 25/60 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals
Visual effects schedules are applied to audiovisual performances with differing visual effects applied in correspondence with differing elements of musical structure. Segmentation techniques applied to one or more audio tracks (e.g., vocal or backing tracks) are used to compute some of the components of the musical structure. In some cases, applied visual effects schedules are mood-denominated and may be selected by a performer as a component of his or her visual expression or determined from an audiovisual performance using machine learning techniques.
H04N 21/439 - Processing of audio elementary streams
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
H04N 21/4402 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
H04N 5/272 - Means for inserting a foreground image in a background image, i.e. inlay, outlay
G06N 99/00 - Subject matter not provided for in other groups of this subclass
09 - Scientific and electric apparatus and instruments
Goods and services
Game software; downloadable mobile applications for online
social networking; social networking software; downloadable
computer software for modifying the appearance and enabling
transmission of images, audio-visual and video content;
computer software to enable uploading, downloading,
accessing, posting, displaying, tagging, blogging,
streaming, linking, sharing or otherwise providing
electronic media or information via computer and
communication networks; downloadable computer software for
mobile devices for personalizing video performances on a
mobile device based on interaction of user; computer
software programs for the integration of text, audio,
graphics, still images and moving pictures into an
interactive delivery for multimedia applications;
downloadable computer software for mobile devices that
enables users to transform and enhance the video and audio
capabilities of their mobile devices; downloadable computer
software executable to communicate amongst mobile devices
using audio encodings of information; downloadable computer
software executable to provide mobile devices with signal
processing capabilities; communications software for
connecting mobile users; computer software for manipulating
digital audio information for use in audio media
applications; downloadable computer software for mobile
devices that enables users to send music, text, audio,
video, and graphics information to other users of mobile
devices; downloadable computer software that creates a voice
changer feature on a mobile device; software for
manipulating digital video files; software for video and
audio processing, audio and video editing, and audio and
video encoding; computer software for manipulating digital
audio information for use in videos; downloadable software
that enables users to upload arrangements for songs;
downloadable software used for uploading composition
arrangements.
56.
Display screen or portion thereof with animated graphical user interface
Visual effects schedules are applied to audiovisual performances with differing visual effects applied in correspondence with differing elements of musical structure. Segmentation techniques applied to one or more audio tracks (e.g., vocal or backing tracks) are used to compute some of the components of the musical structure. In some cases, applied visual effects schedules are mood-denominated and may be selected by a performer as a component of his or her visual expression or determined from an audiovisual performance using machine learning techniques.
G11B 27/02 - Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers or information carriers
Vocal audio of a user together with performance synchronized video is captured and coordinated with audiovisual contributions of other users to form composite duet-style or glee club-style or window-paned music video-style audiovisual performances. In some cases, the vocal performances of individual users are captured (together with performance synchronized video) on mobile devices, television-type display and/or set-top box equipment in the context of karaoke-style presentations of lyrics in correspondence with audible renderings of a backing track. Contributions of multiple vocalists are coordinated and mixed in a manner that selects for presentation, at any given time along a given performance timeline, performance synchronized video of one or more of the contributors. Selections are in accord with a visual progression that codes a sequence of visual layouts in correspondence with other coded aspects of a performance score such as pitch tracks, backing audio, lyrics, sections and/or vocal parts.
G11B 27/02 - Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers or information carriers
60.
AUDIOVISUAL COLLABORATION METHOD WITH LATENCY MANAGEMENT FOR WIDE-AREA BROADCAST
Techniques have been developed to facilitate the livestreaming of group audiovisual performances. Audiovisual performances including vocal music are captured and coordinated with performances of other users in ways that can create compelling user and listener experiences. For example, in some cases or embodiments, duets with a host performer may be supported in a sing-with-the-artist style audiovisual livestream in which aspiring vocalists request or queue particular songs for a live radio show entertainment format. The developed techniques provide a communications latency-tolerant mechanism for synchronizing vocal performances captured at geographically-separated devices (e.g., at globally-distributed, but network-connected mobile phones or tablets or at audiovisual capture devices geographically separated from a live studio).
H04N 21/434 - Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of a packetised elementary stream
H04N 21/436 - Interfacing a local distribution network, e.g. communicating with another STB or inside the home
09 - Scientific and electric apparatus and instruments
41 - Education, entertainment, sporting and cultural activities
42 - Scientific, technological and industrial services, research and design
45 - Legal services; security services; personal services for individuals
Goods and services
Game software; Downloadable mobile applications for online social networking; social networking software; computer software that provides access to an application programming interface (api) for computer software which facilitates online services for social networking, building social networking applications and for allowing data retrieval, upload, download, access and management; Downloadable computer software for modifying the appearance and enabling transmission of images, audio-visual and video content; computer software to enable uploading, downloading, accessing, posting, displaying, tagging, blogging, streaming, linking, sharing or otherwise providing electronic media or information via computer and communication networks; downloadable computer software for mobile devices for personalizing video performances on a mobile device based on interaction of user; computer software programs for the integration of text, audio, graphics, still images and moving pictures into an interactive delivery for multimedia applications; downloadable computer software for mobile devices that enables users to transform and enhance the video and audio capabilities of their mobile devices; downloadable computer software executable to communicate amongst mobile devices using audio encodings of information; downloadable computer software executable to provide mobile devices with signal processing capabilities; communications software for connecting mobile users; computer software for manipulating digital audio information for use in audio media applications; downloadable computer software for mobile devices that enables users to send music, text, audio, video, and graphics information to other users of mobile devices; downloadable computer software that creates a voice changer feature on a mobile device; software for manipulating and editing images, sound and video; video-editing software; software for manipulating digital video files; software for video and audio processing, audio and video editing, and audio and video encoding; computer software for manipulating digital audio information for use in videos; downloadable computer software for mobile devices for personalizing video files on a mobile device; Downloadable software that enables users to upload arrangements for songs; downloadable software used for uploading composition arrangements
Providing online non-downloadable game software; online journals, namely, blogs in the field of entertainment; entertainment services, namely, providing a website featuring non-downloadable audio clips, video clips, photographs, and other multimedia materials in the nature of music videos, all in the fields of music and entertainment; entertainment services, namely, conducting contests
Computer services, namely, creating an on-line community for registered users to participate in discussions, get feedback from their peers, form virtual communities, and engage in social networking; Providing a web site featuring technology that enables internet users to share documents, images and videos; Providing a web site featuring temporary use of non-downloadable software allowing web site users to upload on-line videos for sharing with others for entertainment purposes; providing a website featuring temporary use of non-downloadable software allowing web site users to upload arrangements for songs for sharing with others for entertainment purposes; providing a website featuring temporary use of non-downloadable software allowing web site users to upload composition arrangements; Application service provider (asp) featuring software to enable uploading, posting, showing, displaying, tagging, blogging, sharing or otherwise providing electronic media or information over the internet or other communications network; Providing software as a service featuring software for use in connection with transmitting, streaming, and downloading audiovisual content
Online social networking services; Providing a social networking website for entertainment purposes
62.
Audiovisual collaboration method with latency management for wide-area broadcast
Techniques have been developed to facilitate the livestreaming of group audiovisual performances. Audiovisual performances including vocal music are captured and coordinated with performances of other users in ways that can create compelling user and listener experiences. For example, in some cases or embodiments, duets with a host performer may be supported in a sing-with-the-artist style audiovisual livestream in which aspiring vocalists request or queue particular songs for a live radio show entertainment format. The developed techniques provide a communications latency-tolerant mechanism for synchronizing vocal performances captured at geographically-separated devices (e.g., at globally-distributed, but network-connected mobile phones or tablets or at audiovisual capture devices geographically separated from a live studio).
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
H04N 21/4788 - Supplemental services, e.g. displaying phone caller identification or shopping application communicating with other users, e.g. chatting
H04L 29/08 - Transmission control procedure, e.g. data link level control procedure
H04L 29/06 - Communication control; Communication processing characterised by a protocol
H04N 21/242 - Synchronization processes, e.g. processing of PCR [Program Clock References]
H04N 21/462 - Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a head-end or controlling the complexity of a video stream by scaling the resolution o
Audiovisual performances, including vocal music, are captured and coordinated with those of other users in ways that create compelling user experiences. In some cases, the vocal performances of individual users are captured (together with performance synchronized video) on mobile devices, television-type display and/or set-top box equipment in the context of karaoke-style presentations of lyrics in correspondence with audible renderings of a backing track. Contributions of multiple vocalists are coordinated and mixed in a manner that selects for visually prominent presentation performance synchronized video of one or more of the contributors. Prominence of particular performance synchronized video may be based, at least in part, on computationally-defined audio features extracted from (or computed over) captured vocal audio. Over the course of a coordinated audiovisual performance timeline, these computationally-defined audio features are selective for performance synchronized video of one or more of the contributing vocalists.
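As a rough illustration of audio features selecting which performer's video is shown prominently, the sketch below uses per-window RMS level as the feature. RMS is an assumption chosen for simplicity; the abstract only says "computationally-defined audio features", and the names `rms` and `prominent_performer` are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a window (0.0 for an empty window)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def prominent_performer(vocal_tracks, window):
    """For each window of `window` samples, name the loudest vocalist.

    `vocal_tracks` maps a performer name to a list of audio samples that
    share a common timeline. Ties go to the first name in dict order.
    """
    length = max(len(track) for track in vocal_tracks.values())
    choices = []
    for start in range(0, length, window):
        levels = {
            name: rms(track[start:start + window])
            for name, track in vocal_tracks.items()
        }
        choices.append(max(levels, key=levels.get))
    return choices


tracks = {
    "alice": [0.5, 0.5, 0.0, 0.0],
    "bob": [0.1, 0.1, 0.4, 0.4],
}
print(prominent_performer(tracks, window=2))
```

A production system would likely smooth these decisions over time to avoid rapid switching between panes; this sketch makes an independent choice per window.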
In some examples, a system includes a first portable computing device that audibly renders a backing track, captures and pitch corrects a vocal performance of a first user, and transmits the first user's pitch corrected vocal performance. The system may also include a second portable computing device including a data communications interface that receives the first user's pitch corrected vocal performance, an audio transducer that audibly renders a mix of the backing track and the first user's pitch corrected vocal performance, a display for concurrent presentation of lyrics temporally synchronized with a vocal score and the backing track, a microphone interface that captures a vocal performance of a second user, and pitch correction code executable on the second portable computing device to pitch correct the second user's vocal performance in accord with the vocal score to produce a composite multi-vocal performance.
09 - Scientific and electric apparatus and instruments
Goods and services
Game software; Downloadable mobile applications for online social networking; social networking software; Downloadable computer software for modifying the appearance and enabling transmission of images, audio-visual and video content; computer software to enable uploading, downloading, accessing, posting, displaying, tagging, blogging, streaming, linking, sharing or otherwise providing electronic media or information via computer and communication networks; downloadable computer software for mobile devices for personalizing video performances on a mobile device based on interaction of user; computer software programs for the integration of text, audio, graphics, still images and moving pictures into an interactive delivery for multimedia applications; downloadable computer software for mobile devices that enables users to transform and enhance the video and audio capabilities of their mobile devices; downloadable computer software executable to communicate amongst mobile devices using audio encodings of information; downloadable computer software executable to provide mobile devices with signal processing capabilities; communications software for connecting mobile users; computer software for manipulating digital audio information for use in audio media applications; downloadable computer software for mobile devices that enables users to send music, text, audio, video, and graphics information to other users of mobile devices; downloadable computer software that creates a voice changer feature on a mobile device; software for manipulating digital video files; software for video and audio processing, audio and video editing, and audio and video encoding; computer software for manipulating digital audio information for use in videos; Downloadable software that enables users to upload arrangements for songs; downloadable software used for uploading composition arrangements
66.
Continuous pitch-corrected vocal capture device cooperative with content server for backing track mix
Techniques have been developed to facilitate (1) the capture and pitch correction of vocal performances on handheld or other portable computing devices and (2) the mixing of such pitch-corrected vocal performances with backing tracks for audible rendering on targets that include such portable computing devices as well as desktops, workstations, gaming stations, and even telephony targets. Implementations of the described techniques employ signal processing techniques and allocations of system functionality that are suitable given the generally limited capabilities of such handheld or portable computing devices and that facilitate efficient encoding and communication of the pitch-corrected vocal performances (or precursors or derivatives thereof) via wireless and/or wired bandwidth-limited networks for rendering on portable computing devices or other targets.
G10H 1/02 - Means for controlling the tone frequencies, e.g. attack or decay; Means for producing special musical effects, e.g. vibratos or glissandos
G10L 21/0356 - Speech enhancement, e.g. noise reduction or echo cancellation, by changing the amplitude for synchronising with other signals, e.g. video signals
67.
Coordinating and mixing vocals captured from geographically distributed performers
Despite many practical limitations imposed by mobile device platforms and application execution environments, vocal musical performances may be captured and continuously pitch-corrected for mixing and rendering with backing tracks in ways that create compelling user experiences. Based on the techniques described herein, even mere amateurs are encouraged to share with friends and family or to collaborate and contribute vocal performances as part of virtual “glee clubs.” In some implementations, these interactions are facilitated through social network- and/or eMail-mediated sharing of performances and invitations to join in a group performance. Using uploaded vocals captured at clients such as a mobile device, a content server (or service) can mediate such virtual glee clubs by manipulating and mixing the uploaded vocal performances of multiple contributing vocalists.
G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
G10L 21/013 - Adapting to target pitch
Advanced, but user-friendly composition and editing environments for musical scores may be provided using the types, and in some cases the instances, of computing devices that will in turn consume musical score content so generated. Indeed, by integrating musical composition facilities within synthetic musical instruments that can be widely deployed on hand-held or portable computing devices, a social music network that includes such synthetic musical instruments gains access to a large, and potentially prolific, population of authors, editors and reviewers, as well as to the community-sourced musical scores that they can generate. By curating such content and/or by applying crowd-sourcing or other computational techniques to maintain quality, a social music network may rapidly deploy the new and ever evolving content that its user community desires.
G10H 1/00 - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE Details of electrophonic musical instruments
G10H 1/32 - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE Details of electrophonic musical instruments - Constructional details
Vocal musical performances may be captured and continuously pitch-corrected at a mobile device for mixing and rendering with backing tracks in ways that create compelling user experiences. In some cases, the vocal performances of individual users are captured in the context of a karaoke-style presentation of lyrics in correspondence with audible renderings of a backing track. Such performances can be pitch-corrected in real-time at the mobile device in accord with pitch correction settings. In some cases, such pitch correction settings code a particular key or scale for the vocal performance or for portions thereof. In some cases, pitch correction settings include a score-coded melody sequence of note targets supplied with, or for association with, the lyrics and/or backing track. In some cases, pitch correction settings are dynamically variable based on gestures captured at a user interface.
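To make the notion of pitch correction settings that code a key or scale concrete, the sketch below snaps a detected vocal frequency to the nearest note of a given scale. It assumes equal temperament with A4 = 440 Hz and a major scale by default; the function name and parameters are hypothetical, and a real corrector would also resynthesize the audio at the target pitch, which is not shown.

```python
import math

A4_HZ = 440.0
A4_MIDI = 69
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)  # semitone offsets from the key note


def snap_to_scale(freq_hz, key_pitch_class=0, scale=MAJOR_SCALE):
    """Return the frequency of the in-scale note nearest to freq_hz.

    key_pitch_class is 0 for C, 1 for C#, and so on up to 11 for B.
    """
    # Convert the detected frequency to a fractional MIDI note number.
    midi = A4_MIDI + 12 * math.log2(freq_hz / A4_HZ)
    # Enumerate every MIDI note that belongs to the configured scale.
    allowed = [n for n in range(128) if (n - key_pitch_class) % 12 in scale]
    target = min(allowed, key=lambda n: abs(n - midi))
    return A4_HZ * 2 ** ((target - A4_MIDI) / 12)


print(snap_to_scale(450.0))  # slightly sharp A4, corrected in C major
print(snap_to_scale(470.0))  # between A#4 and B4, corrected in C major
```

A score-coded melody would replace the `allowed` list with the single note target active at the current lyric position, so the same snapping step applies with a narrower set of candidates.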
G10L 21/0356 - Speech enhancement, e.g. noise reduction or echo cancellation, by changing the amplitude for synchronising with other signals, e.g. video signals
41 - Education, entertainment, sporting and cultural services
42 - Scientific, technological and industrial services, research and design
45 - Legal services; security services; personal services for individuals
Goods and services
Online journals, namely, blogs in the field of entertainment; entertainment services, namely, providing a website featuring audio clips, video clips, photographs, other multimedia materials in the fields of music and entertainment; entertainment services, namely, conducting contests
Computer services, namely, creating an on-line community for registered users to participate in discussions, get feedback from their peers, form virtual communities, and engage in social networking; Providing a web site featuring technology that enables internet users to share documents, images and videos; Providing a web site featuring temporary use of non-downloadable software allowing web site users to upload on-line videos for sharing with others for entertainment purposes; providing a website featuring temporary use of non-downloadable software allowing web site users to upload arrangements for songs for sharing with others for entertainment purposes; providing a website featuring temporary use of non-downloadable software allowing web site users to upload composition arrangements; Application service provider (asp) featuring software to enable uploading, posting, showing, displaying, tagging, blogging, sharing or otherwise providing electronic media or information over the internet or other communications network; Providing software as a service featuring software for use in connection with transmitting, streaming, and downloading audiovisual content
Online social networking services; Providing a social networking website for entertainment purposes
09 - Scientific and electric apparatus and instruments
Goods and services
Software for processing digital music files; computer software for manipulating digital audio information for use in electronic games; computer software featuring musical sound recordings and musical video recordings; downloadable computer software for mobile devices for personalizing video performances on a mobile device based on interaction of user; downloadable computer software for mobile devices for personalizing audio performances on a mobile device based on interaction of user; game software; computer software programs for the integration of text, audio, graphics, still images and moving pictures into an interactive delivery for multimedia applications; computer software platforms for development of interactive audio applications for use with mobile audio platforms; downloadable computer software for mobile devices that enables users to transform and enhance the video and audio capabilities of their mobile devices; downloadable computer software executable to communicate amongst mobile devices using audio encodings of information; downloadable computer software executable to provide mobile devices with signal processing capabilities; downloadable computer software for mobile devices that enables users to interactively perform individualized audio and video applications; children's educational software; communications software for connecting mobile users; downloadable computer software for mobile devices for enabling users to download and share interactive audio and music using wireless devices; computer software for manipulating digital audio information for use in audio media applications; downloadable computer software for mobile devices that enables users to send music, text, audio, video, and graphics information to other users of mobile devices; downloadable computer software that creates a voice changer feature on a mobile device; computer software executable to cause mobile devices to function as musical instruments; downloadable computer software for mobile devices for enhancing audio and video capabilities for entertainment purposes; software for manipulating and editing images, sound and video; software that allows users to incorporate music into video recordings and share the videos through social media; video-editing software; software for manipulating digital video files; software for video and audio processing, audio and video editing, and audio and video encoding; computer software for manipulating digital audio information for use in videos; downloadable computer software for mobile devices for personalizing video files on a mobile device.
42 - Scientific, technological and industrial services, research and design
Goods and services
Computer programming services, namely, software-as-a-service (SaaS) services, application services in the nature of application service provider featuring application programming interface (API) software for publishing, editing, managing, sharing and utilizing snippets and groups of snippets
42 - Scientific, technological and industrial services, research and design
Goods and services
Computer programming services, namely, software-as-a-service (SaaS) services, application services in the nature of application service provider featuring application software, and cloud computing services, all featuring software for publishing, editing, managing, sharing, and utilizing discrete data items containing frequently-used phrases and images ("snippets") using customized short abbreviations which save the user keystrokes and improve productivity, enabling development and transfer of individual and team knowledge and for accessing remotely-stored data for such applications; software-as-a-service (SaaS) services, application services in the nature of application service provider featuring application software, and cloud computing services, all featuring software allowing use of snippets in fill-in-the-blank fields, in an organization within groups, and for the import and export of snippets via external sources, for insertion of date and time in user-preferred formats, for insertion of editor-independent text and code templates, and invocation of scripts of snippets into documents; providing temporary use of online non-downloadable software and applications software which enables the compilation and processing of snippets for a central database through local and global networks and for accessing remotely-stored data for such applications; providing temporary use of online non-downloadable software for online access to, editing of, grouping of, management of, and sharing of a database of snippets, including a directory of publicly published snippet groups
74.
Display screen or portion thereof with graphical user interface
09 - Scientific and electric apparatus and instruments
41 - Education, entertainment, sporting and cultural services
42 - Scientific, technological and industrial services, research and design
45 - Legal services; security services; personal services for individuals
Goods and services
Downloadable mobile applications for online social networking; social networking software; computer software that provides access to an application programming interface (api) for computer software which facilitates online services for social networking, building social networking applications and for allowing data retrieval, upload, download, access and management; Downloadable computer software for modifying the appearance and enabling transmission of images, audio-visual and video content; computer software to enable uploading, downloading, accessing, posting, displaying, tagging, blogging, streaming, linking, sharing or otherwise providing electronic media or information via computer and communication networks; downloadable computer software for mobile devices for personalizing video performances on a mobile device based on interaction of user; computer software programs for the integration of text, audio, graphics, still images and moving pictures into an interactive delivery for multimedia applications; downloadable computer software for mobile devices that enables users to transform and enhance the video and audio capabilities of their mobile devices; downloadable computer software executable to communicate amongst mobile devices using audio encodings of information; downloadable computer software executable to provide mobile devices with signal processing capabilities; communications software for connecting mobile users; computer software for manipulating digital audio information for use in audio media applications; downloadable computer software for mobile devices that enables users to send music, text, audio, video, and graphics information to other users of mobile devices; downloadable computer software that creates a voice changer feature on a mobile device; software for manipulating and editing images, sound and video; video-editing software; software for manipulating digital video files; software for video and audio processing, audio and video editing, and audio and video encoding; computer software for manipulating digital audio information for use in videos; downloadable computer software for mobile devices for personalizing video files on a mobile device; Downloadable software that enables users to upload arrangements for songs; downloadable software used for uploading composition arrangements
Online journals, namely, blogs in the field of entertainment; entertainment services, namely, providing a website featuring non-downloadable audio clips, video clips, photographs, and other multimedia materials in the nature of music videos, all in the fields of music and entertainment; entertainment services, namely, conducting contests
Computer services, namely, creating an on-line community for registered users to participate in discussions, get feedback from their peers, form virtual communities, and engage in social networking; Providing a web site featuring technology that enables internet users to share documents, images and videos; Providing a web site featuring temporary use of non-downloadable software allowing web site users to upload on-line videos for sharing with others for entertainment purposes; providing a website featuring temporary use of non-downloadable software allowing web site users to upload arrangements for songs for sharing with others for entertainment purposes; providing a website featuring temporary use of non-downloadable software allowing web site users to upload composition arrangements; Application service provider (asp) featuring software to enable uploading, posting, showing, displaying, tagging, blogging, sharing or otherwise providing electronic media or information over the internet or other communications network; Providing software as a service featuring software for use in connection with transmitting, streaming, and downloading audiovisual content
Online social networking services; Providing a social networking website for entertainment purposes
77.
CROWD-SOURCED TECHNIQUE FOR PITCH TRACK GENERATION
Digital signal processing and machine learning techniques can be employed in a vocal capture and performance social network to computationally generate vocal pitch tracks from a collection of vocal performances captured against a common temporal baseline such as a backing track or an original performance by a popularizing artist. In this way, crowd-sourced pitch tracks may be generated and distributed for use in subsequent karaoke-style vocal audio captures or other applications. Large numbers of performances of a song can be used to generate a pitch track. Computationally determined pitch trackings from individual audio signal encodings of the crowd-sourced vocal performance set are aggregated and processed as an observation sequence of a trained Hidden Markov Model (HMM) or other statistical model to produce an output pitch track.
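The aggregation step can be sketched as Viterbi decoding over a small hidden Markov model whose hidden states are candidate notes. This is a simplified illustration, not the trained model the abstract refers to: the transition probabilities are hand-set rather than learned, emissions are smoothed vote fractions across performances, per-frame pitch trackings are assumed to be already quantized to MIDI note numbers, and `crowd_pitch_track` is a hypothetical name.

```python
import math
from collections import Counter


def crowd_pitch_track(performances, stay_prob=0.9, smoothing=0.1):
    """Decode one pitch track from many time-aligned performances.

    `performances` is a list of equal-length lists of MIDI note numbers,
    one note per frame. stay_prob must be strictly between 0 and 1.
    """
    n_frames = len(performances[0])
    states = sorted({note for perf in performances for note in perf})
    switch_prob = (1.0 - stay_prob) / max(len(states) - 1, 1)
    votes = [Counter(perf[t] for perf in performances) for t in range(n_frames)]
    denom = len(performances) + smoothing * len(states)

    def log_emit(t, s):
        # Smoothed fraction of performers singing note s at frame t.
        return math.log((votes[t][s] + smoothing) / denom)

    def log_trans(prev, cur):
        return math.log(stay_prob if prev == cur else switch_prob)

    score = {s: log_emit(0, s) for s in states}
    backpointers = []
    for t in range(1, n_frames):
        new_score, pointers = {}, {}
        for s in states:
            best = max(states, key=lambda p: score[p] + log_trans(p, s))
            pointers[s] = best
            new_score[s] = score[best] + log_trans(best, s) + log_emit(t, s)
        score = new_score
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(states, key=score.get)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


crowd = [
    [60, 60, 62, 62],
    [60, 60, 62, 62],
    [60, 61, 62, 64],  # one singer drifts off the melody
]
print(crowd_pitch_track(crowd))
```

The transition term penalizes note changes, so an isolated outlier from one performer is less likely to appear in the output than it would be with frame-by-frame voting over a noisy crowd.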
G10L 25/75 - Speech or voice analysis techniques not restricted to a single one of the groups, for modelling vocal tract parameters
G10L 25/90 - Pitch determination of speech signals
G10L 15/14 - Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
Synthetic multi-string musical instruments have been developed for capturing and rendering musical performances on handheld or other portable devices in which a multi-touch sensitive display provides one of the input vectors for an expressive performance by a user or musician. Visual cues may be provided on the multi-touch sensitive display to guide the user in a performance based on a musical score. Alternatively, or in addition, uncued freestyle modes of operation may be provided. In either case, it is not the musical score that drives digital synthesis and audible rendering of the synthetic multi-string musical instrument. Rather, it is the stream of user gestures captured at least in part using the multi-touch sensitive display that drives the digital synthesis and audible rendering.
G10H 1/06 - Circuits for establishing the harmonic content of tones
G10H 1/00 - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE Details of electrophonic musical instruments
G06F 3/041 - Digitisers, e.g. for touch screens or touch pads, characterised by the transducing means
81.
Display screen or portion thereof with animated graphical user interface
Captured vocals may be automatically transformed using advanced digital signal processing techniques that provide captivating applications, and even purpose-built devices, in which mere novice user-musicians may generate, audibly render and share musical performances. In some cases, the automated transformations allow spoken vocals to be segmented, arranged, temporally aligned with a target rhythm, meter or accompanying backing tracks and pitch corrected in accord with a score or note sequence. Speech-to-song music applications are one such example. In some cases, spoken vocals may be transformed in accord with musical genres such as rap using automated segmentation and temporal alignment techniques, often without pitch correction. Such applications, which may employ different signal processing and different automated transformations, may nonetheless be understood as speech-to-rap variations on the theme.
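Temporal alignment of speech segments with a target rhythm can be pictured as quantizing segment onsets to a beat grid. The sketch below is a deliberately simple stand-in: it assumes segmentation has already produced onset times in seconds and that the target rhythm is a uniform grid at a given tempo, neither of which is stated in the abstract. The name `align_onsets_to_beats` is hypothetical.

```python
def align_onsets_to_beats(onsets_s, bpm, subdivisions=1):
    """Move each segment onset (seconds) to the nearest grid position.

    The grid spacing is one beat at `bpm`, divided into `subdivisions`
    equal steps (for example 2 for eighth notes in 4/4).
    """
    step = 60.0 / bpm / subdivisions
    return [round(t / step) * step for t in onsets_s]


# Three spoken segments aligned to a 120 BPM beat (0.5 s per beat).
print(align_onsets_to_beats([0.1, 0.48, 1.26], bpm=120))
```

Two onsets can land on the same grid position with this approach, so a fuller implementation would need a rule for resolving collisions and would time-stretch each segment to fill the gap to the next onset.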
G10L 21/055 - Time compression or expansion for synchronising with other signals, e.g. video signals
G01L 21/04 - Vacuum gauges having a compression chamber in which gas, whose pressure is to be measured, is compressed, wherein the chamber is closed by liquid; Vacuum gauges of the McLeod type
G10L 19/02 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis, using spectral analysis, e.g. transform vocoders or subband vocoders
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
Social music system and method with continuous, real-time pitch correction of vocal performance and dry vocal capture for subsequent re-rendering based on selectively applicable vocal effect(s) schedule(s)
Vocal musical performances may be captured and, in some cases or embodiments, pitch-corrected and/or processed in accord with a user-selectable vocal effects schedule for mixing and rendering with backing tracks in ways that create compelling user experiences. In some cases, the vocal performances of individual users are captured on mobile devices in the context of a karaoke-style presentation of lyrics in correspondence with audible renderings of a backing track. Such performances can be pitch-corrected in real time at the mobile device in accord with pitch correction settings. Vocal effects schedules may also be selectively applied to such performances. In these ways, even amateur user/performers with imperfect pitch are encouraged to take a shot at “stardom” and/or take part in a game play, social network or vocal achievement application architecture that facilitates musical collaboration on a global scale and/or, in some cases or embodiments, to initiate revenue-generating in-application transactions.
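As a rough illustration of pitch correction "in accord with pitch correction settings", the sketch below picks the nearest permitted note for a detected vocal frequency and reports the frequency ratio a pitch shifter would need to apply. It assumes equal temperament with A4 = 440 Hz and models the settings as a plain list of permitted MIDI note numbers. The names `hz_to_midi`, `midi_to_hz` and `correction_ratio` are hypothetical, and pitch detection and the shifting itself are out of scope.

```python
import math

A4_HZ = 440.0   # assumed tuning reference
A4_MIDI = 69    # MIDI note number of A4


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a (fractional) MIDI note number."""
    return A4_MIDI + 12.0 * math.log2(freq_hz / A4_HZ)


def midi_to_hz(midi_note):
    """Convert a MIDI note number to a frequency in Hz."""
    return A4_HZ * 2.0 ** ((midi_note - A4_MIDI) / 12.0)


def correction_ratio(detected_hz, allowed_notes):
    """Return (target_note, ratio) for the nearest permitted note.

    `allowed_notes` stands in for the pitch correction settings, e.g.
    the notes of the current score passage. `ratio` is the factor by
    which the detected frequency would be scaled to reach the target.
    """
    if detected_hz <= 0 or not allowed_notes:
        raise ValueError("need a positive frequency and at least one note")
    detected = hz_to_midi(detected_hz)
    target = min(allowed_notes, key=lambda note: abs(note - detected))
    return target, midi_to_hz(target) / detected_hz


# A slightly flat A4 (430 Hz) against the notes C4, E4, G4 and A4.
target, ratio = correction_ratio(430.0, [60, 64, 67, 69])
```

In a real-time setting this decision would be made per analysis frame, typically with smoothing so the correction does not jump abruptly between notes.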
G10L 21/00 - Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
Techniques have been developed for transmitting and receiving information conveyed through the air from one portable device to another as a generally unperceivable coding within an otherwise recognizable acoustic signal. For example, in some embodiments in accordance with the present invention(s), information is acoustically communicated from a first handheld device toward a second by encoding the information in a signal that, when converted into acoustic energy at an acoustic transducer of the first handheld device, is characterized in that the acoustic energy is discernible to a human ear yet the encoding of the information therein is generally not perceivable by the human. The acoustic energy is transmitted from the acoustic transducer of the first handheld device toward the second handheld device across an air gap that constitutes substantially the entirety of the distance between the devices. Acoustic energy received at the second handheld device may then be processed using signal processing techniques tailored to detection of the particular information encodings employed.
Embodiments described herein relate generally to systems comprising a display device, a computing platform coupled to the display device, a mobile device in communication with the computing platform, and a content server. Methods and techniques for capturing and/or processing audiovisual performances are described, in particular techniques suitable for use with display device-connected computing platforms for rendering a vocal performance captured by a handheld computing device.
H04N 5/775 - Interface circuits between an apparatus for recording and another apparatus between a recording apparatus and a television receiver
G06F 3/0488 - Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. gestures based on the pressure … using a touch-screen or digitiser, e.g. input of commands through traced gestures
G06F 3/0346 - Pointing devices displaced or positioned by the user; Accessories therefor with detection of the device orientation or free movement in a three-dimensional [3D] space, e.g. 3D mice, six-degrees-of-freedom [6-DOF] pointing devices using gyroscopic sensors, accelerometers or tilt…
G06F 3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
Embodiments described herein relate generally to systems comprising a display device, a computing platform coupled to the display device, a mobile device in communication with the computing platform, and a content server. Methods and techniques for capturing and/or processing audiovisual performances are described, in particular techniques suitable for use with display device-connected computing platforms for rendering a vocal performance captured by a handheld computing device.
G06F 1/16 - ELECTRIC DIGITAL DATA PROCESSING - Details not covered by groups and - Constructional details or arrangements
H04M 1/72412 - User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality by interfacing with external accessories using two-way short-range wireless interfaces
G10L 25/57 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for processing of video signals
G10L 21/013 - Adapting to target pitch
H04M 1/72442 - User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality for playing music files
98.
Coordinated audio and video capture and sharing framework
Coordinated audio and video filter pairs are applied to enhance artistic and emotional content of audiovisual performances. Such filter pairs, when applied in audio and video processing pipelines of an audiovisual application hosted on a portable computing device (such as a mobile phone or media player, a computing pad or tablet, a game controller or a personal digital assistant or book reader) can allow user selection of effects that enhance both audio and video coordinated therewith. Coordinated audio and video are captured, filtered and rendered at the portable computing device using camera and microphone interfaces, using digital signal processing software executable on a processor and using storage, speaker and display devices of, or interoperable with, the device. By providing audiovisual capture and personalization on an intimate handheld device, social interactions and postings of a type made popular by modern social networking platforms can now be extended to audiovisual content.
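One way to picture a coordinated filter pair is as a single named effect that bundles an audio filter with a video filter, so that one user selection drives both processing pipelines. The sketch below is illustrative only: `FilterPair`, `FILTER_PAIRS`, `apply_pair` and the toy "vintage" filters are hypothetical names and do not come from the patent, and real pipelines would operate on audio buffers and video frames rather than short Python lists.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Samples = List[float]  # mono audio samples in the range [-1.0, 1.0]
Pixels = List[int]     # 8-bit grayscale pixel values


@dataclass(frozen=True)
class FilterPair:
    """A named effect that couples one audio filter with one video filter."""
    name: str
    audio: Callable[[Samples], Samples]
    video: Callable[[Pixels], Pixels]


def soften_audio(samples: Samples) -> Samples:
    """Toy audio filter: halve the amplitude."""
    return [0.5 * s for s in samples]


def darken_video(pixels: Pixels) -> Pixels:
    """Toy video filter: halve the brightness."""
    return [p // 2 for p in pixels]


# Registry of effects the user can choose from (a single toy entry here).
FILTER_PAIRS: Dict[str, FilterPair] = {
    "vintage": FilterPair("vintage", soften_audio, darken_video),
}


def apply_pair(name: str, samples: Samples, pixels: Pixels) -> Tuple[Samples, Pixels]:
    """Apply both halves of the selected pair to the captured audio and video."""
    pair = FILTER_PAIRS[name]
    return pair.audio(samples), pair.video(pixels)


audio_out, video_out = apply_pair("vintage", [1.0, -0.5], [200, 101])
```

Keeping the two filters in one record means a selection cannot update the audio effect without also updating its video counterpart.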
G11B 27/031 - Electronic editing of digitised analogue information signals, e.g. audio or video signals
G06F 3/0482 - Interaction with lists of selectable items, e.g. menus
G06F 3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
G10L 21/003 - Changing voice quality, e.g. pitch or formants
G10H 1/00 - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE Details of electrophonic musical instruments
Notwithstanding practical limitations imposed by mobile device platforms and applications, truly captivating musical instruments may be synthesized in ways that allow musically expressive performances to be captured and rendered in real-time. In some cases, synthetic musical instruments can provide a game, grading or instructional mode in which one or more qualities of a user's performance are assessed relative to a musical score. By constantly adapting such modes to actual performance characteristics and, in some cases, to the level of a given user musician's skill, user interactions with synthetic musical instruments can be made more engaging and may capture user interest and economic opportunities (e.g., for in-app purchase and/or social networking) over generally longer periods of time.
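Assessing a performance relative to a musical score can be sketched, under simplifying assumptions, as counting how many score notes the user played at roughly the right time. Here both the score and the performance are modeled as lists of `(time_sec, midi_note)` pairs. The function name `grade_performance` and the 0.1 second default tolerance are hypothetical choices for illustration, not values from the patent.

```python
def grade_performance(score, played, tolerance_sec=0.1):
    """Return the fraction of score notes matched by the performance.

    A score note counts as hit when an unused played note has the same
    MIDI pitch and falls within `tolerance_sec` of the scored time.
    Each played note can satisfy at most one score note.
    """
    if not score:
        raise ValueError("score must contain at least one note")
    remaining = list(played)
    hits = 0
    for score_time, score_note in score:
        match = next(
            (p for p in remaining
             if p[1] == score_note and abs(p[0] - score_time) <= tolerance_sec),
            None,
        )
        if match is not None:
            hits += 1
            remaining.remove(match)
    return hits / len(score)


score = [(0.0, 60), (0.5, 62), (1.0, 64)]
played = [(0.05, 60), (0.7, 62), (1.02, 64)]  # the second note is 0.2 s late
grade = grade_performance(score, played)
```

An adaptive mode could, for instance, widen `tolerance_sec` when recent grades are low and narrow it as they improve, which is one simple reading of adapting to the user's skill level.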
Notwithstanding practical limitations imposed by mobile device platforms and applications, truly captivating musical instruments may be synthesized in ways that allow musically expressive performances to be captured and rendered in real-time. Synthetic musical instruments that provide a game, grading or instructional mode are described in which one or more qualities of a user's performance are assessed relative to a musical score. By providing a range of modes (from score-assisted to fully user-expressive), user interactions with synthetic musical instruments are made more engaging and tend to capture user interest over generally longer periods of time. Synthetic musical instruments are described in which force dynamics of user gestures (such as finger contact forces applied to a multi-touch sensitive display or surface and/or the temporal extent and applied pressure of sustained contact thereon) are captured and drive the digital synthesis in ways that enhance expressiveness of user performances.
G10H 1/00 - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE Details of electrophonic musical instruments
G10H 7/00 - Instruments in which the tones are synthesised from a data store, e.g. computer organs