About the Role
AI Engineer – Customer Engineering
Up to $250,000 base | ~$50,000 performance bonus | Meaningful equity
The Opportunity
We're partnering with one of the most exciting companies in the AI infrastructure space right now: a well-funded, US-based hardware startup that is genuinely reshaping the economics of AI inference. This isn't another GPU-adjacent play. This company has built purpose-built silicon from the ground up, with a product shipping in production today.
They're backed by some of the most credible investors and strategic partners in the industry, including major financial trading firms, global sovereign wealth funds, and one of the world's leading chip architecture companies. Their customers include hyperscalers, Tier 2 CSPs, CDN providers, and latency-sensitive financial services firms who are already deploying their hardware.

The team is around 50 people today and targeting 100 by the end of 2026. This is a rare window to join a company with real traction, real revenue, and a product roadmap that could genuinely dent Nvidia's inference dominance.
The Role
As an AI Engineer on the Customer Engineering team, you'll sit at the intersection of software, customer impact, and product validation. You won't be writing demos; you'll be building production-grade AI applications that real customers rely on.
Your work spans agentic systems, document intelligence pipelines, real-time audio processing, and conversational AI with a remit to design applications that showcase the unique capabilities of purpose-built inference hardware. You'll own integration into leading AI frameworks, contribute to open-source positioning, and build the reference implementations that customers take and extend.
This role directly drives revenue, customer adoption, and the company's technical differentiation story. You'll have real scope from day one.
What You'll Be Doing
- Building and deploying production-ready AI applications (tool use, document intelligence pipelines, real-time audio, and conversational AI) designed to highlight the hardware's unique performance advantages, including disaggregated inference, intelligent caching, and system-level optimization.
- Integrating the company's platform into leading AI frameworks and open-source tooling, building bridges between current capabilities and next-generation products through clever application design.
- Developing multi-tenant serving patterns, deployment frameworks, and application-layer optimizations that enable direct revenue generation on the company's cloud infrastructure.
- Architecting with advanced patterns: RAG with retrieval optimization, multi-agent orchestration, semantic caching, and disaggregated inference, then documenting and sharing results with customers and the broader ecosystem.
What We're Looking For
You've shipped production AI applications that real users depend on. You can point to specific things you've built and talk concretely about their impact. You have genuine hands-on experience with modern AI application patterns (RAG, agentic systems, tool use, conversational AI), not just awareness of them.
You have a track record of taking models from development through to production: containerisation, API design, monitoring, the works. You think about performance architecturally: caching strategies, request routing, latency, and throughput. And you're genuinely comfortable with AI-assisted development workflows and autonomous coding agents.
Beyond that, the following will make you stand out: experience with real-time and streaming applications (audio, WebSockets, SSE), document intelligence pipelines, inference server deployment (vLLM, TensorRT, Triton), understanding of LLM inference internals such as KV cache and attention mechanisms, and Kubernetes/cloud-native deployment experience.

This is a startup. They move fast and expect the same from you.
Why This Role
The inference market is the fastest-growing segment of AI infrastructure right now, and this company has a credible, differentiated position that the biggest names in finance and cloud are already backing with real dollars. The technical problems are challenging, the team is small enough that your work matters immediately, and the comp package reflects the calibre of person they want.