Stanford University
KernelBench Framework for Evaluating GPU Kernels
Pages
37
Time to read
75 mins
Publication
Language
English
This technical report introduces KernelBench, an open-source framework for evaluating how well language models (LMs) can generate efficient GPU kernels for machine learning workloads. It explains why efficient GPU kernels matter for machine learning architectures and why they are difficult to write. KernelBench comprises a suite of 250 tasks that reflect real-world AI engineering settings, so that progress on the benchmark translates into automating kernel generation in practice. The framework introduces a new evaluation metric, fast_p, which measures the share of generated kernels that are functionally correct and exceed a speedup threshold p over a baseline. The findings indicate that frontier reasoning models show potential but still fall short overall, matching the PyTorch baseline in fewer than 20% of cases. The report also discusses how compiler feedback and profiling metrics can improve the quality of generated kernels, and it identifies key challenges for future work in this area.
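A metric that combines functional correctness with performance can be read as the fraction of tasks whose generated kernel is both correct and faster than a baseline by more than a threshold p. The Python sketch below assumes that reading. The names `KernelResult` and `fast_p`, the field layout, and the sample timings are hypothetical illustrations, not taken from the report or from the KernelBench code.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class KernelResult:
    """Outcome of one task (hypothetical record layout)."""
    correct: bool        # output matched the reference implementation
    baseline_ms: float   # runtime of the reference implementation
    generated_ms: float  # runtime of the generated kernel


def fast_p(results: Sequence[KernelResult], p: float) -> float:
    """Fraction of tasks that are correct AND have speedup strictly above p.

    Speedup is assumed to be baseline time divided by generated-kernel time.
    """
    if not results:
        raise ValueError("results must not be empty")
    hits = sum(
        1
        for r in results
        if r.correct
        and r.generated_ms > 0
        and r.baseline_ms / r.generated_ms > p
    )
    return hits / len(results)


# Made-up sample data for illustration only.
results = [
    KernelResult(correct=True, baseline_ms=2.0, generated_ms=1.0),   # speedup 2.0
    KernelResult(correct=True, baseline_ms=1.0, generated_ms=2.0),   # speedup 0.5
    KernelResult(correct=False, baseline_ms=2.0, generated_ms=0.5),  # fast but wrong
    KernelResult(correct=True, baseline_ms=3.0, generated_ms=2.0),   # speedup 1.5
]

print(fast_p(results, 0.0))  # correctness rate: 3 of 4 tasks
print(fast_p(results, 1.0))  # correct and faster than baseline: 2 of 4 tasks
```

Under this reading, p = 0 reduces the metric to a plain correctness rate, p = 1 counts only kernels that beat the baseline, and larger values of p demand progressively larger speedups. A kernel that is fast but incorrect never counts.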