About the Role
At Two Bear Capital we believe in partnering with our portfolio companies to build the best teams possible. We look forward to working with you and CIQ on their new Principal AI Performance Engineer opportunity.
CIQ OVERVIEW
CIQ builds the enterprise infrastructure that powers the world's most demanding workloads. From the operating system layer through AI infrastructure, high-performance computing, and cloud-native orchestration, CIQ delivers the speed, security, scalability, and sovereignty that major enterprises, government agencies, and research institutions depend on.
CIQ is the founding support and services partner of Rocky Linux and the developer of the RLC Pro family of Enterprise Linux distributions, Fuzzball workload orchestration, Warewulf Pro cluster provisioning, and Ascender Pro automation. Our customers include some of the largest and most technically sophisticated organizations in the world, working across HPC, AI/ML, defense, and regulated industries.
We are a company of builders, operators, and open source practitioners. If you want to do work that matters, at a company that is genuinely changing how enterprise infrastructure gets built and run, we want to talk.
POSITION SUMMARY
CIQ is seeking a highly experienced Senior or Principal AI Engineer to own and drive AI/ML innovation across our product portfolio. This role sits at the intersection of AI engineering and systems performance - the right candidate brings deep expertise in model inference optimization, training workflows, and production AI deployment, combined with a strong instinct for performance at the systems level.
In this role, you will be the AI engineering standard-bearer at CIQ. You will design and build turn-key AI workload examples - both internal reference pipelines and customer-facing solutions - ensuring that CIQ's AI story is always compelling, practical, and demonstrably best-in-class. You will integrate deeply with Fuzzball, CIQ's cloud-native computing platform, running AI workloads end-to-end through it and helping customers do the same. This role is leveled as Senior or Principal based on qualifications and demonstrated capabilities.
KEY RESPONSIBILITIES
AI Inference Optimization
• Design, implement, and tune inference pipelines for large language models and other AI workloads, targeting maximum throughput and minimum latency.
• Apply state-of-the-art optimization techniques: quantization (INT4/INT8/FP8), model pruning, speculative decoding, continuous batching, and kernel fusion.
• Optimize inference-serving stacks, including vLLM, TensorRT-LLM, ONNX Runtime, and similar frameworks, for production deployment on CIQ's OS platform.
• Profile and tune GPU/accelerator utilization across the full inference stack, from model weights and memory bandwidth to CUDA kernels and driver overhead.
• Establish inference performance baselines and regression detection across CIQ's AI-focused solutions.
AI Training Workflows
• Design and optimize distributed training pipelines for large-scale models, including data, model, tensor, and pipeline parallelism strategies.
• Tune training efficiency through mixed-precision training, gradient checkpointing, activation recomputation, and optimizer-level improvements.
• Benchmark training throughput and scaling efficiency across multi-GPU and multi-node configurations on CIQ's infrastructure.
• Collaborate with infrastructure and performance teams to resolve training bottlenecks at the network (RDMA/InfiniBand), storage, and OS layers.
• Stay current on frontier model architectures and training techniques, including MoE models, RLHF pipelines, and emerging post-training methods.
Turn-Key AI Examples & Reference Workloads
• Build and maintain a library of turn-key AI workload examples that run on CIQ's platform, covering inference serving, fine-tuning, batch processing, RAG pipelines, and agentic workflows.
• Develop both internal reference pipelines for CI/testing and customer-facing examples designed for immediate productivity on CIQ's OS and Fuzzball.
• Package workloads using containers to deliver portable, reproducible AI environments across HPC and cloud-native settings.
• Create compelling, well-documented demos and reference architectures that communicate CIQ's AI capabilities to technical and business audiences alike.
• Partner with product and customer success teams to translate real-world AI use cases into reusable, production-quality examples.
AI Engineering & Tooling
• Build and maintain AI-powered engineering tooling - leveraging LLM-based agents, automated analysis pipelines, and AI-assisted code generation to accelerate the broader engineering organization.
• Champion an AI-first development culture: identify opportunities where AI tooling can reduce toil, surface insights faster, and improve software quality across CIQ's products.
• Evaluate and integrate emerging AI frameworks, libraries, and hardware as they become relevant to CIQ's customers an