Taehan Kim

AI, Single/Multi-channel Signal Processing, Audio/Speech Enhancement

About Me

Welcome to my page!

I’m Taehan Kim, a passionate researcher with a focus on audio preprocessing and its many applications, especially in the realm of speech.

Over the years, I’ve had the privilege of working on a variety of exciting projects, including TTS (Text-to-Speech), Speaker Diarization, Speech Enhancement/Separation, and embedding speech processing modules into robots.
Currently, my research is centered around Speech Enhancement (Universal model, Online/Lightweight model for embedded systems), Speech Separation (achieving SOTA results, generalizing for more than two speakers, online models), and personalized Speech Processing.
But beyond these, I’m always eager to explore new challenges and push the boundaries of what’s possible.

I previously majored in Electronic Engineering at Sogang University in Seoul, graduating as the top student of the School of Engineering (GPA: 4.21/4.3, 134 credits in 7 semesters).
I also interned at the Intelligent Information Processing Lab (Prof. Hyung-min Park) within the Department of Electronic Engineering. Now, I’m continuing my journey as a master’s student in the same lab.

I love to connect with others who share my interests or simply want to chat about the latest in tech. Feel free to reach out!

Interests

Speech Enhancement, Speech Separation, Speaker Verification, Interesting Problems with Deep Learning

Work Experience

Military Service

Republic of Korea Army (Feb 2019 - Sep 2021)

Research Student

IIP Lab, Sogang University (Jul 2021 - Feb 2023)
- System Developer / Machine Learning Engineer

Researcher

IIP Lab, Sogang University (Mar 2023 - Present)
- System Developer / Machine Learning Engineer / Embedded System Developer

Visiting Research Fellow

Carnegie Mellon University (Aug 2024 - Feb 2025)

Projects

Text-to-Speech (TTS) Service Website (Mar 2022 - Jun 2022, IIP Lab, Sogang Univ.)

- Designed a TTS model using Tacotron 2 and a HiFi-GAN vocoder.
- Established a pipeline system to connect AI servers with websites.
Real-time Meeting Minutes Transcription System [Demo] (Jul 2022 - Nov 2022, IIP Lab, Sogang Univ.)
- Developed a rule-based Speaker Diarization system that compares newly generated speaker embeddings against a stored speaker table using similarity criteria.
- Designed a system integrating Source Separation, Speaker Diarization, and ASR.
Source Localization and Speech Enhancement in Robot Vacuum Cleaner (May 2023 - Nov 2023, IIP Lab, Sogang Univ.)
- Built a lightweight Speech Enhancement model tailored for embedded systems.
- Devised a mask based on the output of deep learning models and formulated a source localization algorithm that uses it.
Multi-Channel Audio Preprocessing (Sep 2023 - Dec 2023, IIP Lab, Sogang Univ.)
- Constructed an integrated system for signal processing and speech preprocessing algorithms.
Speech Signal Improvement Challenge - ICASSP 2024 (Jan 2024 - Feb 2024, IIP Lab, Sogang Univ.)
- Participated in the Real-time Track, placing 5th out of 13 teams.
- Designed several model compression techniques, along with loss functions and training methods that maintain performance after compression.
Speech Separation [Demo] (Feb 2024 - May 2024, IIP Lab, Sogang Univ.)
- Designed a novel architecture for Speech Separation and conducted its experiments.
Multi-Channel Speech Enhancement for Speech Recognition in Robot Vacuum Cleaner (Mar 2024 - Sep 2024, IIP Lab, Sogang Univ.)
- Built a lightweight Speech Enhancement model tailored for embedded systems.
- Researched Speech Enhancement training methods that improve speech recognition performance in low-SNR conditions.
Text-to-Music (TTM) Unlearning (Sep 2024 - Nov 2024, Carnegie Mellon Univ.)
- Applied unlearning to a TTM model (MusicGen) to remove copyrighted content or harmful music.
- Researched effective methods for unlearning music information in TTM models.
Hallucination Detection for Audio Language Models (Sep 2024 - Nov 2024, Carnegie Mellon Univ.)
- Proposed a metric to evaluate the reliability of model predictions and quantify the risk of hallucinations.
- Combined two key features: decisiveness and uncertainty.

Publication / Preprints / Experience

[W]: Workshop, [J]: Journal, [C]: Conference, [Pt]: Patent

[W1] Yeon-Jin Kim, Taehan Kim, Hyung-Min Park. Real-time meeting minutes using speaker diarization.
Brain Engineering Society of Korea Workshop (2023)

[J1, Pt1] J.-H. Kim, Taehan Kim, S.-H. Kim, J.-M. Song, Y.-J. Park, Hyung-Min Park. A Real-Time Sound Source Localization System for Robotic Vacuum Cleaner with a Microphone Array.
IEEE Sensors Journal (2024) / Applying for KR, US Patent (2024W) [Link]

[C1] Taehan Kim, Hyung-Min Park. Comparison and Analysis of Output Methods for Real-Time Multichannel Speech Enhancement Models.
Korean Society of Speech Sciences Conference (2024)

[C2] Ui-Hyeop Shin, Sangyoun Lee, Taehan Kim, Hyung-Min Park. Separate and Reconstruct: Asymmetric Encoder-Decoder for Speech Separation.
NeurIPS 2024. [Link][Demo]

[W2] Jinju Kim, Taehan Kim, Abdul Waheed, Jong Hwan Ko, Rita Singh. No Encore: Unlearning as Opt-Out in Music Generation.
Accepted by AI4Music at NeurIPS 2025. [Link]

Education

Sogang University, Seoul, Korea

B.S.E. in Electronic Engr. (Mar 2018 - Feb 2023)

Sogang University, Seoul, Korea

M.S.E. in Electronic Engr. (Mar 2023 - Present)

Carnegie Mellon University, PA, US

Visiting Research Fellow in S3D for the IITP program (Jul 2024 - Feb 2025)

Awards

Sogang Convergence Technology Contest (3rd place) Dec. 2022

Outstanding Graduate Award Feb. 2023