Stanford University

KernelBench Framework for Evaluating GPU Kernels

Time to read

75 mins

Publication

02/19/25

Language

English

Summary

This technical report introduces KernelBench, an open-source framework for evaluating how well language models (LMs) can generate efficient GPU kernels for machine learning workloads. It explains why efficient GPU kernels matter for machine learning architectures and why they are hard to write by hand. KernelBench comprises a suite of 250 tasks that reflect real-world AI engineering settings, in which an LM is asked to generate the kernel automatically. The framework introduces an evaluation metric, fast_p, which measures the share of generated kernels that are both functionally correct and faster than a baseline by more than an adjustable threshold p. The findings indicate that frontier reasoning models show the most potential but still fall short overall, matching the PyTorch baseline in less than 20% of cases. The report also discusses how compiler feedback and profiling metrics can improve the quality of generated kernels, and it identifies key challenges for future work in this area.
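The metric named in the summary can be illustrated with a small sketch. This is not the report's own implementation: it assumes the metric is the fraction of tasks whose generated kernel is functionally correct and whose speedup (baseline time divided by generated-kernel time) exceeds a threshold p. The `KernelResult` record, its field names, and the sample timings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class KernelResult:
    """Hypothetical per-task record; field names are illustrative only."""
    correct: bool        # did the generated kernel match the reference output?
    baseline_ms: float   # runtime of the baseline implementation
    generated_ms: float  # runtime of the generated kernel


def fast_p(results, p):
    """Fraction of tasks that are correct AND more than p times faster than baseline.

    Assumed definition: speedup = baseline_ms / generated_ms.
    """
    if not results:
        raise ValueError("results must not be empty")
    hits = sum(
        1
        for r in results
        if r.correct and (r.baseline_ms / r.generated_ms) > p
    )
    return hits / len(results)


# Made-up sample data, for illustration only.
results = [
    KernelResult(correct=True, baseline_ms=10.0, generated_ms=5.0),   # 2x faster
    KernelResult(correct=True, baseline_ms=10.0, generated_ms=20.0),  # 2x slower
    KernelResult(correct=False, baseline_ms=10.0, generated_ms=1.0),  # fast but wrong
    KernelResult(correct=True, baseline_ms=10.0, generated_ms=8.0),   # 1.25x faster
]

print(fast_p(results, 0))  # correctness only
print(fast_p(results, 1))  # correct and faster than baseline
```

Under this assumed definition, setting p to 0 reduces the metric to a plain correctness rate, while raising p counts only kernels that beat the baseline by a growing margin. An incorrect kernel never counts, however fast it runs.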
