Principal Data Engineer, LLM/AI Platforms (Remote)

CrowdStrike · Anywhere

Full-time · Staff+ · Python · AWS · GCP · Docker · Kubernetes · LangChain

About the Role

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn't changed: we're here to stop breaches, and we've redefined modern security with the world's most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We're also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We're always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

CrowdStrike is looking for a Principal Data Engineer with deep expertise in Large Language Models (LLMs) and AI platforms to join our growing Data Science Platform Engineering Team. You will be a key leader, responsible for designing, building, and deploying cutting-edge data infrastructure that powers our next generation of AI-driven security products. This role requires significant hands-on experience in LLM integration, agentic workflows, and agent harnessing to deliver high-impact, scalable solutions. You will champion engineering excellence, focusing on shipping fast, writing elegant, high-quality code, and actively mentoring and strengthening the team's technical knowledge and capabilities. The scale of our systems and data is approaching exabytes. Experience with extremely large-scale systems, including DevSecOps patterns, practices, and standards, is important for this work.

What You'll Do:
Architect, implement, and optimize data platforms and pipelines specifically designed to support LLMs, Retrieval-Augmented Generation (RAG), and sophisticated AI agentic systems at exabyte scale.
Drive the adoption and deployment of agentic workflows and agent harnessing techniques to create autonomous, data-driven security features.
Design and implement highly scalable, fault-tolerant, and cost-effective data solutions, emphasizing rapid iteration and high-quality deployment.
Write elegant, production-ready code with a focus on performance, maintainability, and testing rigor, ensuring the ability to ship fast without compromising quality.
Provide technical leadership and deep expertise in data modeling, normalization, and semantic cataloging for AI/ML workloads.
Establish best practices for MLOps/DataOps surrounding LLMs, including monitoring, observability, and zero-touch recovery mechanisms for AI services.
Actively mentor engineers by conducting technical workshops, leading design reviews, and strengthening the team's knowledge of cutting-edge AI platform technologies.
Collaborate across the organization with Data Scientists, Product Managers, and other engineering teams to transform research prototypes into robust, production-grade services.
Own the end-to-end lifecycle of critical data services: development, testing, deployment, and monitoring.

Tech Stack (expertise in several key areas is expected):
MLOps tools (MLflow, SageMaker, Vertex AI).
Experience with common agentic workflow frameworks (e.g., LangChain, LlamaIndex).
Expert-level proficiency in a high-level coding language (Python or JVM technologies).
Deep experience with distributed data processing frameworks (e.g., Spark, Dask, Flink).
Strong expertise with cloud platforms (AWS, GCP, or OCI) and related data services.
Containerization and orchestration mastery (Docker, Kubernetes).
Message queuing and streaming technologies (Kafka, Pulsar).
Data warehousing (Snowflake, BigQuery) and data orchestration (Airflow, Kubeflow).

What You'll Need:
Master's degree or PhD in Computer Science, Data Engineering, or a related STEM field, or equivalent practical experience.
10+ years of progressive experience in Data Engineering/Platform Engineering, with at least 3 years focused on architecting and building platforms for AI/ML or Data Science at massive scale.
Demonstrable hands-on experience in LLM engineering (fine-tuning, prompt engineering, deployment), RAG, and developing agentic workflows.
Proven track record of designing and delivering large-scale distributed systems (sharding, partitioning, concurrency).
Exceptional ability to write clean, elegant, performant, and well-tested code, coupled with a proactive mindset for delivering results quickly.
A thorough understanding of engineering practices, including effective peer code reviews, resilient architecture design, and comprehensive testing paradigms.
Prior experience in a Principal or Staff level engineering role, demonstrating technical leadership and mentorship capabilities.

Bonus Points:
Direct experience building, deploying, and managing LLMs in a production environment.
Prior experience in t
