Career Profile
I completed my Ph.D. at Yonsei University, Korea, with a dissertation titled “GPU Architecture Design for Effective Computing Resource Usage” under the guidance of Professor Won Woo Ro. My expertise lies in designing computer architectures, from general-purpose processing units to domain-specific accelerators, with a focus on optimizing computing resource utilization. My recent work centers on improving GPUs’ computing resource usage and democratizing various domain-specific accelerators. I am currently a system architect at MangoBoost Inc., exploring the RDMA engine architecture to further improve MangoBoost DPUs.
- Fast learner
- Resilient personality
- Independent and highly motivated
- Highly reliable team player or leader
- Professional communication, writing, and presentation skills
Professional Experiences
Education
Dissertation: GPU Architecture Design for Effective Computing Resource Usage
Publications
M3XU: Achieving High-Precision and Complex Matrix Multiplication with Low-Precision MXUs
The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC 2024)
Generalizing Ray Tracing Accelerators for Tree Traversals on GPUs
The 57th International Symposium on Microarchitecture (MICRO 2024)
Recompiling QAOA Circuits on Various Rotational Directions
The International Conference on Parallel Architectures and Compilation Techniques (PACT 2024)
MAD MAcce: Supporting Multiply-Add Operations for Democratizing Matrix-Multiplication Accelerators
The 56th International Symposium on Microarchitecture (MICRO 2023)
TensorCV: Accelerating Inference-Adjacent Computation Using Tensor Processors
The 2023 International Symposium on Low Power Electronics and Design (ISLPED’23)
R2D2: Removing ReDunDancy Utilizing Linearity of Address Generation in GPUs
The 50th International Symposium on Computer Architecture (ISCA 2023)
Investigation on NVIDIA Ampere GPU Architecture With Reverse Engineering
The 22nd International Conference on Electronics, Information, and Communication (ICEIC 2023)
Detecting Pattern of Warp Register Value Differences in CTA using GPU Compiler
The 19th International Conference on Electronics, Information, and Communication (ICEIC 2020)
Hardware Accelerator Systems for Artificial Intelligence and Machine Learning
Advances in Computers, Elsevier, vol. 122: Academic Press; 2020, Chapter 6
Trends of High-End Graphic Processing Unit Development
Korean Information Science Society (2019)
----- On-Going Projects (only project names) -----
DMA between GPUs
Effective LLM Inferencing
Sparse/Dense NPUs
The next step for MAD MAcce
General Quantum Program Optimization
Bit-wise Matrix Multiplication
Continuous Learning
Projects
Industry Project
- Study on Memory Sub-System Architecture for Hyper-Scale AI Training, SK Hynix, 2024
- Development of PIM Software Architecture based on Data-Flow Computing, Korea Government (Institute of Information & Communications Technology Planning & Evaluation), 2024
- Analysis and Development of GPU Architecture for HPC Workloads [Patent], Samsung (SAIT), 2021-2022
- Development of Data Center Many-core NPU Architecture and Memory Interface, Samsung, 2019-2020
- Development of CPU-GPU Heterogeneous Computing Simulation Environment, SK Hynix, 2019-2020
- Development of the Identification Data Processing Technology for On-site Police Officers, Korea Government (Korea National Police Agency), 2018-2023
- Development of Multi-GPU Based High Speed Ray-Tracing Engine, Samsung, 2017-2018
Paper Peer Review
- IEEE Computer Architecture Letters (CAL) ×5
- IEEE Transactions on Emerging Topics in Computing (TETC) ×1
- ACM Transactions on Architecture and Code Optimization (TACO) ×2
Teaching Assistant
- EEE3530: Computer Architecture, Yonsei University, Seoul, Korea, 2021 Spring
- EEE4473: Embedded System Lab., Yonsei University, Seoul, Korea, 2020 Spring