Company: Huawei Technologies Research & Development (UK) Ltd

Job Title: GPU Chief Architect / XPU Lab Director

Location: Cambridge

Job Description:

Job Summary

We are seeking a highly experienced GPU architect to lead the definition and execution of the next‑generation mobile GPU architecture in our Kirin SoC, driving architectural convergence between GPU and NPU toward a coherent xPU sub‑system design.

This role requires deep expertise in GPU microarchitecture, strong system‑level architectural capability spanning hardware and software, and a thorough understanding of common graphics and AI workloads. A proven track record of delivering related sub‑system IP or complex SoC silicon is highly desirable.

The successful candidate will shape a converged xPU architecture that is native to future AI compute and optimised for performance, power efficiency, and silicon area in next‑generation mobile compute platforms.

Key Responsibilities

Analyse and characterise future mobile graphics and AI workloads, and define from the ground up a converged xPU (GPU and NPU) architecture, spanning hardware and software, that is optimal for future applications.
Ensure compatibility with, or an easy transition from, the existing architecture.
Define unified or partially unified execution resources (vector, scalar, tensor units).
Develop shared scheduling and workload dispatch mechanisms for graphics and AI.
Design resource sharing and isolation strategies under mixed workloads.
Evaluate architectural trade‑offs between dedicated and converged compute blocks.
Mobile GPU Architecture Leadership
Ensure timely delivery of the next‑generation mobile GPU architecture and its long‑term roadmap.
Lead evolution of shader cores, execution pipelines, and cache hierarchy.
Drive performance, power efficiency (Perf/W), and area efficiency (Perf/mm²).
Provide architectural leadership from concept phase through tape‑out.
Memory & Interconnect Architecture
Define a memory hierarchy strategy for converged GPU/NPU workloads.
Architect shared cache structures and bandwidth arbitration policies.
Optimise on‑chip interconnect for heterogeneous compute traffic.
Reduce data movement overhead across compute domains.
System‑Level Architecture Collaboration
Collaborate with CPU, AI software, runtime, and system architecture teams.
Participate in SoC‑level power, thermal, and floorplanning trade‑offs.
Align hardware architecture with graphics APIs and AI frameworks.
Support performance modelling, workload characterisation, and silicon bring‑up.

Required

15+ years of experience in GPU, AI accelerator, or heterogeneous compute architecture.
Deep understanding of GPU microarchitecture (SIMD/SIMT, scheduling, memory systems).
Strong knowledge of tensor/matrix computation and AI acceleration techniques.
Proven experience delivering high‑volume silicon.
Expertise in performance modelling and power analysis.
Strong cross‑functional communication and leadership capability.

What we offer

33 days of annual leave (including UK public holidays).
Group Personal Pension.
Life insurance.
Private medical insurance.
Medical expense claim scheme.
Employee Assistance Program.
Cycle to work scheme.
Company sports club and social events.
Additional time off for learning and development.

Posted: May 22nd, 2026