Career Profile
I completed my Ph.D. at Yonsei University, Korea (Advisor: Professor Won Woo Ro), with a dissertation titled “GPU Architecture Design for Effective Computing Resource Usage.” My expertise lies in designing computer architectures, from general-purpose to domain-specific processors, with an emphasis on optimizing the utilization of computing and memory resources. My recent research focuses on improving GPUs’ computing resource usage and democratizing domain-specific accelerators. To broaden my expertise from microarchitecture to system-level design, I currently work as a system architect at MangoBoost Inc., exploring SmartNICs to enhance MangoBoost DPUs.
- Fast learner
- Resilient personality
- Independent and highly motivated
- Highly reliable team player or leader
- Professional communication, writing, and presentation skills
Professional Experiences
Education
Dissertation: GPU Architecture Design for Effective Computing Resource Usage
Publications
Avant-Garde: Empowering GPUs with Scaled Numeric Formats
The International Symposium on Computer Architecture (ISCA 2025)
Effective Interplay between Sparsity and Quantization: From Theory to Practice
The International Conference on Learning Representations (ICLR 2025, Spotlight)
M3XU: Achieving High-Precision and Complex Matrix Multiplication with Low-Precision MXUs
The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC 2024)
Generalizing Ray Tracing Accelerators for Tree Traversals on GPUs
The International Symposium on Microarchitecture (MICRO 2024)
Recompiling QAOA Circuits on Various Rotational Directions
The International Conference on Parallel Architectures and Compilation Techniques (PACT 2024)
MAD MAcce: Supporting Multiply-Add Operations for Democratizing Matrix-Multiplication Accelerators
The International Symposium on Microarchitecture (MICRO 2023)
TensorCV: Accelerating Inference-Adjacent Computation Using Tensor Processors
The International Symposium on Low Power Electronics and Design (ISLPED 2023)
R2D2: Removing ReDunDancy Utilizing Linearity of Address Generation in GPUs
The International Symposium on Computer Architecture (ISCA 2023)
Investigation on NVIDIA Ampere GPU Architecture With Reverse Engineering
The International Conference on Electronics, Information, and Communication (ICEIC 2023)
Detecting Pattern of Warp Register Value Differences in CTA using GPU Compiler
The International Conference on Electronics, Information, and Communication (ICEIC 2020)
Hardware Accelerator Systems for Artificial Intelligence and Machine Learning
Advances in Computers, Elsevier, vol. 122: Academic Press; 2020, Chapter 6
Trends of High-End Graphic Processing Unit Development
Korean Information Science Society (2019)
----- On-Going Projects (only project names) -----
DMA between GPUs
Effective LLM Inferencing
Sparse/Dense NPUs
The next step for MAD MAcce
Bit-wise Matrix Multiplication
Continuous Learning
Remarks
Industry Project
- Study on Memory Sub-System Architecture for Hyper-Scale AI Training, SK Hynix, 2024
- Development of PIM Software Architecture based on Data-Flow Computing, Korea Government (Institute of Information & communications Technology Planning & Evaluation), 2024
- Analysis and Development of GPU Architecture for HPC Workloads [Patent], Samsung (SAIT), 2021-2022
- Development of Data Center Many-core NPU Architecture and Memory Interface, Samsung, 2019-2020
- Development of CPU-GPU Heterogeneous Computing Simulation Environment, SK Hynix, 2019-2020
- Development of the Identification Data Processing Technology for On-site Police Officers, Korea Government (Korea National Police Agency), 2018-2023
- Development of Multi-GPU Based High Speed Ray-Tracing Engine, Samsung, 2017-2018
Program Committee
- Workshop on General Purpose Processing using GPUs (GPGPU), 2025
Reviewer
- IEEE Computer Architecture Letters (CAL) ×6 (2022-2025)
- IEEE Transactions on Emerging Topics in Computing (TETC) ×1 (2022-2023)
- ACM Transactions on Architecture and Code Optimization (TACO) ×2 (2022-2023)
Teaching Assistant
- EEE3530: Computer Architecture, Yonsei University, Seoul, Korea, 2021 Spring
- EEE4473: Embedded System Lab., Yonsei University, Seoul, Korea, 2020 Spring
Funding
- BK21 FOUR Project Scholarship: 23,800,000 KRW
- Teaching Assistant Scholarship: 3,716,000 KRW
- Research Assistance Scholarship: 3,000,000 KRW
- External Scholarship: 11,400,000 KRW