AI performance and efficiency is the major tech theme for the next decade. We are building systems that autonomously discover, test, and ship state-of-the-art GPU kernels. We just closed an unannounced $4.2M pre-seed round from top-tier funds and technical angels, and have proven results with large and sophisticated enterprise partners on custom neural architectures.

We believe that revolutionary breakthroughs often happen at the intersections of fields. We are not a research lab, nor are we an AI agents company.

Own complex production ML/AI systems end-to-end
Communicate and share ideas through high-quality documentation, technical meet-ups and blogs

You’ve written and shipped high-performance or SOTA CUDA kernels
You’ve made deliberate choices about tiling, memory access patterns, warp-level primitives, and instruction scheduling
You’ve traced performance cliffs to their root cause through profiler output
You know transformers at the implementation level
You’ve worked with production inference or training frameworks such as vLLM or Megatron-LM
You’ve built performance-critical infrastructure before – compilers, profilers, auto-tuners, or search systems
You’re familiar with new or esoteric technical methods such as Neural Algorithmic Reasoning, Geometric Deep Learning, Category Theory, Neuroevolution, Megakernels, or the work of François Chollet, Kenneth Stanley, Jeff Clune, Jürgen Schmidhuber, David Ha, and Christian Szegedy

Bonus
Publications in ML/AI, kernel optimisation or evolutionary methods (NeurIPS, ICLR, CVPR, GECCO or equivalent)
Experience building agentic systems
Demonstrated work on KernelBench, Kaggle, GitHub, blogs, Stack Overflow answers, or any public work that demonstrates deep EA, ML or GPU/HW expertise

This is a full-time, permanent role. Competitive salary + significant founding equity. On-site/hybrid/remote flexible – Dublin, London, Paris or NYC preferred…
