NPU GenAI
The Developer Hub for On-Device AI
Resources, guides, and community for developers building with NPU-accelerated generative AI — across AMD, Intel, Qualcomm, and beyond.
Featured Topics
Start here — popular guides and resources from the community
Getting Started with NPU Development
From zero to your first NPU-accelerated inference. Set up your environment, install the right SDK, and run a model on-device in under 30 minutes.
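If you want a preview before the full walkthrough, the sketch below checks which ONNX Runtime execution providers your machine exposes and runs one inference. The model file name and input shape are placeholders; substitute your own exported ONNX model.

```python
# Minimal sketch: list the execution providers ONNX Runtime can see, then
# run a single inference. "model.onnx" and the input shape are placeholders.
import numpy as np
import onnxruntime as ort

print(ort.get_available_providers())
# e.g. ['DmlExecutionProvider', 'CPUExecutionProvider'] with onnxruntime-directml

session = ort.InferenceSession(
    "model.onnx",
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],  # CPU as fallback
)

input_name = session.get_inputs()[0].name
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder shape
outputs = session.run(None, {input_name: dummy})
print("Output shape:", outputs[0].shape)
```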
Running LLaMA 3.2 on NPU
Deploy Meta's LLaMA 3.2 with INT4 quantization on your NPU using ONNX Runtime and Olive. Includes benchmark results across AMD, Intel, and Qualcomm hardware.
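As a rough sketch of the deployment side, the loop below generates text from an INT4 ONNX model with the onnxruntime-genai package. The model folder name and prompt are placeholders, and the generation API shown matches recent onnxruntime-genai releases; check it against the version you install.

```python
# Sketch: token-by-token generation from a quantized ONNX LLM using
# onnxruntime-genai. "llama-3.2-1b-int4" is a placeholder for the folder
# produced by the Olive export/quantization pipeline the guide describes.
import onnxruntime_genai as og

model = og.Model("llama-3.2-1b-int4")
tokenizer = og.Tokenizer(model)

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode("Explain what an NPU is in one sentence."))

while not generator.is_done():
    generator.generate_next_token()

print(tokenizer.decode(generator.get_sequence(0)))
```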
ONNX Runtime NPU Execution Provider
How to configure ONNX Runtime to target your NPU with DirectML and platform-specific execution providers. Covers model export, EP selection, and optimization flags.
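The core idea is a provider priority list: ONNX Runtime assigns each graph node to the first listed provider that supports it and falls back down the list. A minimal sketch, assuming the platform-specific EP packages are installed (each ships in its own wheel, such as onnxruntime-qnn or onnxruntime-directml):

```python
import onnxruntime as ort

# Platform-specific NPU execution providers (each requires its own
# ONNX Runtime build/wheel):
#   QNNExecutionProvider      - Qualcomm Hexagon
#   OpenVINOExecutionProvider - Intel NPU
#   VitisAIExecutionProvider  - AMD Ryzen AI
#   DmlExecutionProvider      - DirectML (Windows, vendor-agnostic)
preferred = [
    "QNNExecutionProvider",
    "OpenVINOExecutionProvider",
    "VitisAIExecutionProvider",
    "DmlExecutionProvider",
    "CPUExecutionProvider",  # always keep a CPU fallback last
]

available = set(ort.get_available_providers())
providers = [p for p in preferred if p in available]

session = ort.InferenceSession("model.onnx", providers=providers)
print("Session is using:", session.get_providers())
```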
Model Quantization with Olive
Microsoft Olive makes it easy to quantize and optimize models for NPU targets. This guide walks through INT4/INT8 quantization pipelines for common LLMs and vision models.
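Under the hood, Olive chains optimization passes such as ONNX Runtime's own quantizers. As a standalone illustration of the INT8 step (not Olive's API itself), here is ONNX Runtime's dynamic quantizer, which Olive can invoke as a pass; in a real Olive workflow you would declare this in the workflow config rather than calling it directly.

```python
# Illustration of the INT8 weight quantization step that an Olive pipeline
# orchestrates, using ONNX Runtime's built-in dynamic quantizer directly.
# "model.onnx" is a placeholder for your exported model.
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    model_input="model.onnx",
    model_output="model.int8.onnx",
    weight_type=QuantType.QInt8,  # INT4 requires the newer block-wise quantizers
)
```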
Whisper Speech Recognition on NPU
Real-time transcription with OpenAI Whisper running entirely on your NPU. Latency comparisons between NPU, CPU, and GPU across multiple device categories.
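If you want to reproduce the comparison on your own machine, a minimal timing harness looks like the sketch below. The encoder file name and the 80 x 3000 log-mel input shape are placeholders for an exported Whisper encoder.

```python
import time
import numpy as np
import onnxruntime as ort

# Sketch: average per-inference latency of the same exported model under
# different execution providers. File name and input shape are placeholders.
def bench(providers: list[str], runs: int = 20) -> float:
    sess = ort.InferenceSession("whisper_encoder.onnx", providers=providers)
    name = sess.get_inputs()[0].name
    mel = np.zeros((1, 80, 3000), dtype=np.float32)  # placeholder log-mel input
    sess.run(None, {name: mel})  # warm-up run, excluded from timing
    start = time.perf_counter()
    for _ in range(runs):
        sess.run(None, {name: mel})
    return (time.perf_counter() - start) / runs * 1000.0

for providers in (["DmlExecutionProvider"], ["CPUExecutionProvider"]):
    print(f"{providers[0]}: {bench(providers):.1f} ms/inference")
```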
OpenVINO on Intel NPU Deep Dive
Intel's OpenVINO toolkit for deploying models on Intel NPUs, from model conversion to async inference pipelines, with coverage of the Core Ultra 100H and 200V series.
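A condensed sketch of the flow that deep dive walks through, assuming a recent OpenVINO Python install and a machine that exposes an NPU device; the model path and input shape are placeholders.

```python
import numpy as np
import openvino as ov

core = ov.Core()
print(core.available_devices)  # expect "NPU" listed on Core Ultra machines

# compile_model accepts OpenVINO IR (.xml) or ONNX paths directly.
compiled = core.compile_model("model.xml", device_name="NPU")
request = compiled.create_infer_request()

dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder shape
request.start_async({0: dummy})  # queue work without blocking the caller
request.wait()                   # block until results are ready

print("Output shape:", request.get_output_tensor(0).data.shape)
```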
Latest Updates
What's new in the NPU + GenAI ecosystem
ONNX Runtime 1.20 Adds AMD XDNA 3 Support
The latest ONNX Runtime release brings native execution provider support for AMD Ryzen AI Max series chips, with up to 40% throughput improvement over the previous NPU EP implementation.
Qualcomm AI Hub Expands Model Library to 200+ Models
Qualcomm's AI Hub now offers over 200 pre-optimized models for Snapdragon X NPU, including Phi-3.5 Mini, Stable Diffusion XL, and Whisper Large v3 — all ready for Hexagon deployment.
Intel OpenVINO 2025.1 Improves NPU Throughput by 35%
OpenVINO 2025.1 ships with a revamped NPU compiler backend delivering significant latency and throughput gains on Core Ultra 200V series, particularly for transformer-based models.
Join the Community
Connect with NPU developers around the world
Discord Server
Real-time discussions, help channels for each platform, and weekly Q&A sessions with NPU engineers.
Join Discord →
GitHub Organization
Open-source NPU tools, model optimization scripts, and community benchmark contributions. PRs welcome.
View on GitHub →
Weekly Newsletter
SDK release notes, curated tutorials, and NPU hardware news — delivered every Friday.
Subscribe →
Developer Forum
Searchable Q&A, code snippets, and long-form technical discussions indexed for future reference.
Browse Forum →