YantraVision’s AMD NPU
Services
AMD NPU | Deep Learning | Image and Signal Processing
AMD Ryzen AI processors integrate an on-die NPU based on the XDNA architecture to execute AI workloads locally, independent of CPU and GPU resources.
This enables low-latency, power-efficient inference for vision, media, and assistant workloads on Ryzen AI platforms, with peak NPU throughput of roughly 50–60 TOPS on recent generations.
Why work with YantraVision?
YantraVision provides end-to-end engineering for AMD Ryzen AI NPUs, covering system design, model optimization, and deployment using the Ryzen AI software and AI Engine stack.
We support teams from prototype to production, so they can ship optimized, production-ready on-device AI features on Ryzen AI platforms without building in-house NPU expertise.
Our Core Offerings
NPU architecture and feasibility:
- Analyze workloads across CPU, integrated GPU, and NPU.
- Partition execution using Ryzen AI software and ONNX Runtime.
- Define model selection, quantization approach, and performance targets per Ryzen AI generation.
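The partitioning step above could be sketched as follows, assuming ONNX Runtime's Python API. `VitisAIExecutionProvider`, `DmlExecutionProvider`, and `CPUExecutionProvider` are ONNX Runtime execution provider names; the helper `pick_providers` and the preference order (NPU, then integrated GPU, then CPU) are illustrative assumptions, not a description of YantraVision's tooling.

```python
from typing import Iterable, List

# Assumed preference order for illustration: NPU first, then the
# integrated GPU via DirectML, then CPU as the universal fallback.
PREFERRED = [
    "VitisAIExecutionProvider",  # NPU path used by the Ryzen AI software
    "DmlExecutionProvider",      # integrated GPU via DirectML
    "CPUExecutionProvider",      # CPU fallback
]


def pick_providers(available: Iterable[str]) -> List[str]:
    """Return the preferred providers that are available, in priority order."""
    avail = set(available)
    chosen = [p for p in PREFERRED if p in avail]
    # Always keep a CPU fallback so operators that the accelerator
    # cannot run still have somewhere to execute.
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


# Hypothetical usage with ONNX Runtime installed ("model.onnx" is a placeholder):
#   import onnxruntime as ort
#   providers = pick_providers(ort.get_available_providers())
#   session = ort.InferenceSession("model.onnx", providers=providers)
```

ONNX Runtime assigns graph nodes to providers in the order given, so listing the CPU provider last lets unsupported operators fall back without failing the whole model.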
Custom NPU applications:
- Develop custom AI Engine (AIE) kernels using MLIR-AIE.
- Design multi-tile graphs with DMA and streaming dataflow.
- Implement image processing, signal processing, and pre/post-processing pipelines.
Validation, tooling, and enablement:
- Benchmark execution across CPU, GPU, and NPU.
- Perform regression and long-run stability testing on Ryzen AI platforms.
- Deliver documentation, reference implementations, and developer training.
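As a rough illustration of the benchmarking step, a backend-agnostic latency harness might look like the sketch below. The function name `benchmark`, the warm-up and iteration counts, and the reported statistics are assumptions chosen for the example; a real comparison would also control for clocks, power mode, and thermal state.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(run_once: Callable[[], object],
              warmup: int = 5,
              iters: int = 50) -> Dict[str, float]:
    """Time a zero-argument inference callable; latencies are in milliseconds."""
    # Warm-up runs absorb one-time costs such as compilation and caching.
    for _ in range(warmup):
        run_once()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1e3)

    samples.sort()
    # Approximate 95th percentile taken from the sorted samples.
    p95_index = min(len(samples) - 1, int(round(0.95 * (len(samples) - 1))))
    return {
        "mean_ms": statistics.fmean(samples),
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
    }
```

The same harness can be called once per backend by passing a callable that wraps each one, for example a lambda around an ONNX Runtime `session.run(None, inputs)` call for sessions created with different providers.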
Overview
- System Design
- HDLs / High-Level Synthesis
- Custom IP Development
- 3rd Party IP Integration
- FPGA Tools – Xilinx Vivado, Vivado HLS, Intel Quartus, Intel HLS
- Simulation – ModelSim, NCSIM
- Others – Xilinx SDSoC, Xilinx reVISION, Intel SoC EDS
Hardware Layer
- MicroEngine, Hardware Kernel & DataFlow Computing
- Image, Video and Audio Processing
- ML and DL Optimised Implementation
- Algorithm Porting
- OpenCL
- xfOpenCV, xfDNN
- AWS-F1
OS / Firmware Layer
- Kernel and firmware development for bare-metal platforms
- Linux-based platforms
- Device drivers and kernel module development for IPs
- High-performance, zero-copy kernel modules
- Yocto
- PetaLinux
- Xilinx SDK
- Intel SDK
Middleware Layer
- JPEG 2000 (J2K)
- Multichannel H.264/H.265 Codec Integration
- Device Driver Integration
- GStreamer
- FFmpeg
Application Layer
- Image, Video and Audio Processing
- ML and DL Optimised Implementation
- Algorithm Porting
- Python libraries
- C and C++ based applications
- OpenCL, OpenCV
- OpenGL
Example Application Domains
Computer vision & imaging
- Real-time image conversions.
- Image enhancement, denoising, and background removal.
- Background blur and noise suppression.
- On-device inspection and camera-based analytics.
- Smart framing with low-latency inference.
- Classical vision pipelines on AIE tiles.
- Agri Processing Application: Sorting
- Textile Application: Fabric Inspection
- Pharma Application: Tablet Inspection
- Print Application: Print Quality Inspection