Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
SincNet is a neural architecture for efficiently processing raw audio samples.
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain, users can easily create speech processing systems for speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many other tasks.
Simple d-vector based Speaker Recognition (verification and identification) using PyTorch
Speaker Identification System (up to 100% accuracy); built using Python 2.7 and python_speech_features library
Identifying people from small audio fragments
Deep Learning - one-shot learning for speaker recognition using filter banks
A lightweight neural speaker embedding extractor based on Kaldi and PyTorch.
This repo contains my attempt to create a Speaker Recognition and Verification system using SideKit-1.3.1
Official Implementation of the work "Audio Mamba: Bidirectional State Space Model for Audio Representation Learning"
[SLT'24] The official implementation of SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
Source code for the paper "Who is real Bob? Adversarial Attacks on Speaker Recognition Systems" (IEEE S&P 2021)
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification"
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
PyTorch implementation of Generalized End-to-End Loss for speaker verification
A tool for summarizing dialogues from videos or audio
Mirror of the VoxCeleb dataset, a large-scale speaker identification dataset
Keras Implementation of DeepMind's WaveNet for Supervised Learning Tasks