Kazuhito Koishida
Principal Research Manager
I am a Principal Lead Scientist in the Applied Sciences Group in the Experiences + Devices organization. I have been with Microsoft since 2000. My areas of interest are signal processing and machine learning for audio, speech, computer vision, and other sensor data.
Past projects
- Audio and voice compression: Bitrate/bandwidth scalable codec, MELP codec at 1.2 kbps, and Windows Media Audio and Voice codec
- Audio matching: Voice note application and music recognition service
- Microphone array processing: Beamforming and sound source localization
- Audio/voice detection and recognition: Keyword spotting and speaker identification
- Speech enhancement: Audio/visual fusion and bandwidth expansion
Education
- B.S. degree in Electrical Engineering from the Tokyo Institute of Technology, Japan, in 1994
- M.S. degree in Electrical Engineering from the Tokyo Institute of Technology, Japan, in 1995
- Ph.D. degree in Electrical Engineering from the Tokyo Institute of Technology, Japan, in 1998. Dissertation title: Speech Coding Based on Mel-Generalized Cepstral Analysis
- Postdoctoral researcher at the Signal Compression Lab at the University of California, Santa Barbara, 1998-2000
Publications
- CorrGAN: Simultaneous Learning of Speech Enhancement and Perceptual Quality Loss Functions. To appear in ICASSP 2025 2025
- NeurIPS Workshop 2024 December, 2024
- NeurIPS Workshop 2024 December, 2024
- Proceedings, 2024 IEEE International Conference on Image Processing (ICIP) October, 2024
- IEEE/ACM Transactions on Audio, Speech, and Language Processing October, 2024 Vol. 32 Pages 4727-4740
- Proceedings, Interspeech 2024 September, 2024
- LiveSpeech: Low-Latency Zero-Shot Text-to-Speech via Autoregressive Modeling of Audio Discrete Codes. Proceedings, Interspeech 2024 September, 2024
- DMLR Workshop in ICML 2024 July, 2024
- Proceedings of the Twelfth International Conference on Learning Representations (ICLR) May, 2024 Vol. abs/2404.01740
- Proceedings, ICASSP 2024 April, 2024 Pages 5435-5439
- Proceedings, International Conference on Complex Networks and Their Applications November, 2023 Pages 363-373 ISBN: 978-3-031-53468-3
- Proceedings, Interspeech 2023 August, 2023 Pages 2463-2467
- Workshop on Efficient Systems for Foundation Models @ ICML2023 July, 2023
- Proceedings, ICASSP 2023 June, 2023 Pages 1-5
- Proceedings, ICASSP 2022 May, 2022 Pages 6557-6561
- Proceedings, ICASSP 2022 May, 2022 Pages 6962-6966
- Proceedings, Interspeech 2021 August, 2021 Pages 2696-2700
- Proceedings, Interspeech 2021 August, 2021 Pages 2796-2800
- Proceedings, ICASSP 2021 June, 2021 Pages 7153-7157
- Proceedings, Interspeech 2020 October, 2020 Pages 61-65
- Proceedings, Interspeech 2020 October, 2020 Pages 2442-2446
- Proceedings, Interspeech 2020 October, 2020 Pages 175-179
- Proceedings, Interspeech 2020 October, 2020 Pages 2447-2451
- Proceedings of the 37th International Conference on Machine Learning (ICML), Vienna, Austria July, 2020
- The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) June, 2020
- IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) June, 2020 Pages 4084-4090
- Proceedings, ICASSP 2020 May, 2020 Pages 6214-6218
- Proceedings, ICASSP 2020 May, 2020 Pages 846-850
- Proceedings, ICASSP 2020 May, 2020 Pages 7539-7543
- Proceedings, Interspeech 2019 September, 2019 Pages 3629-3633
- IEEE Journal of Selected Topics in Signal Processing May, 2019 Vol. 13, No. 2 Pages 347-358
- Proceedings, 2019 International Conference on Acoustics, Speech and Signal Processing (ICASSP) May, 2019 Pages 3717-3721
- IEEE/ACM Transactions on Audio, Speech, and Language Processing September, 2018 Vol. 26, No. 9 Pages 1633-1644
- Proceedings, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) December, 2017 Pages 584-590
- Proceedings, Interspeech 2017 2017 Pages 1487-1491
- Proceedings, 2008 IEEE 10th Workshop on Multimedia Signal Processing October, 2008 Pages 927-932
- Speech Coding, 2002, IEEE Workshop Proceedings October, 2002 Pages 90-92
- IEICE Transactions on Information and Systems October, 2001 Vol. E84-D, No. 10 Pages 1427-1434
- 2000 IEEE Workshop on Speech Coding. Proceedings. Meeting the Challenges of the New Millennium (Cat. No.00EX421) September, 2000 Pages 90-92
- Proceedings, 2000 International Conference on Acoustics, Speech and Signal Processing (ICASSP) June, 2000 Pages 1149-1152
- 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100) June, 2000 Vol. 3 Pages 1375-1378
- IEICE Transactions on Information and Systems April, 2000 Vol. 83 Pages 876-883
- Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181) May, 1998 Vol. 1 Pages 161-164
- CELP Speech Coding Based on Mel-Generalized Cepstral Analysis. IEICE Transactions on Information and Systems February, 1998 Vol. J81-A, No. 2 Pages 252-260
- Proceedings, 5th International Conference on Spoken Language Processing (ICSLP '98) 1998 Vol. 6 Pages 2583-2586
- Low Bit Rate Speech Coding Based on Mel-Generalized Cepstral Analysis. Tokyo Institute of Technology 1998
- Spectral Representation of Speech Based on Mel-Generalized Cepstral Coefficients and Its Properties. IEICE Transactions on Information and Systems November, 1997 Vol. J80-A, No. 11 Pages 1999-2006
- 1997 IEEE Workshop on Speech Coding for Telecommunications Proceedings. Back to Basics: Attacking Fundamental Problems in Speech Coding September, 1997 Pages 19-20
- Proceedings, 1997 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) April, 1997 Vol. 2 Pages 1355-1358
- Proceedings, 4th International Conference on Spoken Language Processing (ICSLP '96) October, 1996 Vol. 1 Pages 318-321
- Proceedings, 1995 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) May, 1995 Vol. 1 Pages 33-36