About the Role
Read this first
This is a founding-team role. Compensation is equity-only today and shifts to base + equity the moment we close our first capital event — either our seed round or our first signed enterprise contract. Both are weeks away, not months. We have signed pipeline with multiple customers ready to go, the product is near-complete, and we're hiring the technical core that will own the AI side of the platform.
If equity-only doesn't work for you right now, that's fair — apply when cash is on the table. If you've done founding-engineer work before, know what early equity is worth, and want to own a meaningful slice of something with real customer demand: keep reading.
You'll be the technical owner of the AI agent and automation infrastructure at Betwixt: architecting retrieval pipelines, designing and shipping multi-step agents, and building the workflow plumbing that lets us deliver LLM-powered features that actually work in production, not demoware.
This role is end to end: data ingestion and chunking, embedding strategy and vector storage, retrieval and reranking, agent design, tool wiring, evaluation, observability, cost and latency tuning, and the deployment infrastructure that keeps everything reliable. You won't be writing prototypes for someone else to "productionize"; you are the person who productionizes.
What you'll do
• Architect and own end-to-end RAG pipelines: ingestion, chunking strategy, embeddings, vector storage, retrieval, reranking, and grounded answer synthesis.
• Design multi-step agents using tool use, function calling, and structured output, and decide when an agent is the right answer vs. a deterministic workflow.
• Build internal automations and pipelines that connect LLMs to the rest of our systems (databases, APIs, queues, schedulers, third-party SaaS).
• Choose and integrate the right tools across the stack: LLM providers (Anthropic, OpenAI, open-weights), vector DBs (Pinecone, Weaviate, Qdrant, pgvector), orchestration frameworks (LangGraph, LlamaIndex, custom), and eval harnesses (Braintrust, LangSmith, custom).
• Define and run evaluations: golden sets, regression suites, and online and offline metrics. Treat eval as production code, not a notebook afterthought.
• Instrument observability: tracing, prompt and response logs, cost and latency budgets, drift detection.
• Handle the unsexy production concerns: rate limits, retries, backoff, idempotency, timeouts, caching, fallbacks across providers, prompt-injection defense, PII handling.
• Partner with the rest of the team to expose agent capabilities as clean APIs that frontends and backends can consume.
• Help shape the technical direction of the company; this is a founding role.
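To give a concrete flavor of the "unsexy production concerns" bullet, here is a minimal sketch of retries with exponential backoff plus cross-provider fallback. Everything in it (function names, the `TimeoutError` failure mode, the delay parameters) is illustrative, not a description of our actual stack:

```python
import random
import time

def call_with_fallback(prompt, providers, max_retries=3, base_delay=1.0):
    """Try each provider in order; retry transient failures with
    exponential backoff plus jitter before falling through.

    providers: list of callables, each prompt -> answer string.
    """
    for call in providers:
        for attempt in range(max_retries):
            try:
                return call(prompt)
            except TimeoutError:
                # back off base, 2x base, 4x base... plus jitter, then retry
                time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay)
    raise RuntimeError("all providers exhausted")
```

In production this also needs idempotency keys, per-provider rate-limit awareness, and caching, but the shape (bounded retries per provider, then fall through) is the core pattern.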
What we're looking for
• 10+ years of professional software engineering experience overall, with strong fundamentals in Python or TypeScript.
• Hands-on experience designing and shipping production RAG pipelines: chunking strategy, embeddings, vector search, hybrid retrieval, reranking, citation/grounding.
• Built and operated agent systems in production: tool use / function calling, planning and reflection patterns (e.g., ReAct, plan-and-execute), structured output, multi-step orchestration.
• Deep practical knowledge of at least one major LLM provider API (Anthropic, OpenAI, etc.) and at least one orchestration framework (LangGraph, LlamaIndex, Haystack, or a justified custom stack).
• Hands-on experience with at least one vector store (Pinecone, Weaviate, Qdrant, Chroma, pgvector, etc.) and an opinion about when to use which.
• Strong grasp of evaluation: building golden sets, automated grading (LLM-as-judge with sanity checks), regression testing, online metrics.
• Experience integrating LLM-powered features into real backend systems via REST/GraphQL/queues/webhooks, not just notebooks or chat UIs.
• Demonstrated production use of AI coding tools (Claude Code, Cursor, Copilot) in your daily workflow.
• Excellent written communication; you can write a one-page design doc that an engineering team can ship from.
• Comfortable with founding-team risk and the equity-only structure described above.
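If you're wondering what "treat eval as production code" means to us: something in the spirit of the toy golden-set regression check below, checked in and run in CI. The cases, the substring grader, and the 90% threshold are all placeholders for illustration:

```python
# Toy golden-set regression check: each case pairs an input with
# required substrings; the suite fails if accuracy drops below a bar.

GOLDEN_SET = [
    {"query": "capital of France", "must_contain": ["Paris"]},
    {"query": "2 + 2", "must_contain": ["4"]},
]

def grade(answer, must_contain):
    # crude grader: every required substring must appear in the answer
    return all(s in answer for s in must_contain)

def run_regression(model_fn, threshold=0.9):
    """model_fn: query -> answer string. Returns (passed, accuracy)."""
    scores = [grade(model_fn(c["query"]), c["must_contain"]) for c in GOLDEN_SET]
    accuracy = sum(scores) / len(scores)
    return accuracy >= threshold, accuracy
```

A real suite would use LLM-as-judge grading with sanity checks, larger golden sets, and online metrics alongside it; the point is that it gates deploys the same way unit tests do.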
Nice to have
• Prior founding-engineer experience.
• Built workflow/automation systems (Temporal, Inngest, Airflow, Prefect, n8n, custom).
• Open-weights model experience (Llama, Mistral, Qwen) and self-hosted inference (vLLM, TGI, Ollama).
• Fine-tuning, distillation, or DPO experience.
• Background in classical ML, IR, or NLP that informs your retrieval and ranking choices.
• Prompt-injection defense, jailbreak red-teaming, and LLM safety patterns.
• Open-source contributions to AI/agent frameworks.
• Experience deploying agentic systems to enterprise customers with audit, compliance, and SOC 2 considerations.
• Familiarity with MCP (Model Context Protocol) and broader agent-tool standards.
Compensation, transparently
• Today through first capital event: equity-only. Founding-team grant, vesting from your start date.
• At first capital event (seed close or first major contract, imminent): compensation shifts to base salary + equity, as described above.