CV - Saurabh Kumar
Saurabh Kumar
Google Scholar | GitHub | LinkedIn
Email: saurabhk0317@gmail.com
Education
B.Tech. in Electronics and Communication Engineering
National Institute of Technology (NIT), Patna
2016–2020
- GPA: 7.42/10.00
- Capstone Project: Simulation of Hybrid Plasmonic Waveguides-Based Devices and Its Applications
  - Conducted simulations and analyzed characteristics of hybrid plasmonic waveguide devices using COMSOL Multiphysics.
  - Advisor: Dr. R. Ranjan
Experience
Research Associate
SPIRE Lab, IISc Bangalore, India
July 2021 – Present
- Developing automatic speech recognition (ASR) models and resources for low-resource Indian languages under the RESPIN project.
- Exploring auxiliary tasks such as dialect and domain identification to improve ASR performance.
- Conducted a comprehensive study on the impact of statistical language models for dialect and domain-specific ASR.
- Contributed to the SYSPIN project, building TTS systems in nine Indian languages.
Machine Learning/Signal Processing Consulting Engineer
ARTPARK, IISc Bangalore, India
September 2022 – Present
- Leading the development and curation of speech and text corpora for the VAANI project, covering 773 districts in India.
- Designed data pipelines to ensure the quality and accuracy of curated datasets.
Undergraduate Student Researcher
Department of ECE, NIT Patna
July 2019 – June 2020, December 2020 – June 2021
- Explored acoustic feature enhancement techniques to improve children’s speech recognition in noisy conditions.
- Developed speech recognition systems and analyzed data augmentation techniques for children’s speaker verification.
Summer Intern
EICT Academy, IIT Guwahati
May 2019 – June 2019
- Implemented a cascade filter bank structure for the discrete wavelet transform using Verilog for FPGA simulation.
Conferences and Workshops
- ICASSP 2024: Co-hosted LIMMITS’24 Challenge on multi-speaker, multi-lingual Indic TTS with voice cloning.
- ASRU 2023 Workshop: Organized the MADASR Challenge on model adaptation for ASR in low-resource Indian languages.
- SLT 2022 Hackathon: Won Best Hackathon Project Award for dialectal speech recognition in Bengali and Bhojpuri.
- Interspeech 2022: Organized the Gram Vaani ASR Challenge, contributing to an open-source Hindi ASR corpus.
Skills
- Machine Learning/Deep Learning Toolkits: PyTorch, Scikit-learn
- ASR Frameworks: ESPnet, Kaldi, Fairseq
- Programming Languages: Python, Bash, PHP
- Other Tools: MATLAB, Audacity, Praat, Git
Languages
- Programming: Python, Bash, PHP
- Natural: Bhojpuri, English, Hindi, Maithili
- TOEFL: 100/120
Extracurricular Activities
- Student Volunteer at ICASSP 2024.
- Web Coordinator of the Yoga and Meditation Club, NIT Patna (2018–2019).
- Organized the annual cultural fest of NIT Patna (2018).
- Co-organized a fundraising project for flood-affected people in Bihar (2017).
Publications
Journal Articles
S. Shahnawazuddin, A. Kumar, V. Kumar, Saurabh Kumar, and W. Ahmad,
“Robust children’s speech recognition in zero resource condition”,
Applied Acoustics, vol. 185, p. 108382, 2022.
DOI: Link
S. Shahnawazuddin, A. Kumar, Saurabh Kumar, and W. Ahmad,
“Enhancing robustness of zero resource children’s speech recognition system through bispectrum-based front-end acoustic features”,
Digital Signal Processing, vol. 118, p. 103226, 2021.
DOI: Link
Conference Proceedings
A. Singh, A. Jayakumar, Deekshitha G., H. Tiwari, J. Bandekar, S. Badiger, S. Udupa, Saurabh Kumar, and P. K. Ghosh,
“An End-to-End TTS Model in Chhattisgarhi, a Low-Resource Indian Language”,
Speech and Computer, Cham: Springer Nature Switzerland, 2023, pp. 164–172.
ISBN: 978-3-031-48312-7.
A. Singh, A. S. Mehta, K. S. Ashish Khuraishi, G. Deekshitha, G. Date, J. Nanavati, J. Bandekar, K. Basumatary, P. Karthika, S. Badiger, S. Udupa, Saurabh Kumar, and P. K. Ghosh,
“An ASR Corpus in Chhattisgarhi, a Low-Resource Indian Language”,
Speech and Computer, Cham: Springer Nature Switzerland, 2023, pp. 173–181.
ISBN: 978-3-031-48312-7.
S. Udupa, J. Bandekar, G. Deekshitha, Saurabh Kumar, P. K. Ghosh, S. Badiger, A. Singh, S. Murthy, P. Pai, S. Raghavan, and R. Nanavati,
“Gated Multi Encoders and Multitask Objectives for Dialectal Speech Recognition in Indian Languages”,
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2023, pp. 1–8.
A. Bhanushali, G. Bridgman, D. G., P. Ghosh, P. Kumar, Saurabh Kumar, A. Raj Kolladath, N. Ravi, A. Seth, A. Singh, V. Sukhadia, U. S., S. Udupa, and L. V. S. V. D. Prasad,
“Gram Vaani ASR Challenge on spontaneous telephone speech recordings in regional variations of Hindi”,
Proc. Interspeech 2022, 2022, pp. 3548–3552.
