AI/ML Engineer - W2 Only

Steneral Consulting · United States

About the Role

Job Summary for AI/ML Engineer (KFORCE Urgent Requirement)

• Role: AI/ML Engineer
• Location: Remote (preference for Tampa-area candidates)

Key Responsibilities

• Design, deploy, and optimize open-source LLMs and AI frameworks for on-premises, bare-metal/private hardware clusters.
• Build, maintain, and secure local AI inference pipelines, advanced fine-tuning/embedding workflows, and RAG (Retrieval-Augmented Generation) systems.
• Optimize AI models and workflows for maximum efficiency (latency, throughput, CPU/GPU/memory usage) on bare-metal infrastructure.
• Ensure complete data isolation, integrating AI with internal data sources while adhering to compliance and regulatory requirements.
• Evaluate and select open-source, local-first AI tools (vector databases, orchestration frameworks, model serving layers).
• Collaborate with engineering and compliance teams to align AI solutions with regulatory needs.
• May perform additional duties as assigned.

Required Skills & Qualifications

• Extensive local (non-cloud) AI deployment experience, especially with open-source LLMs (e.g., LLaMA, Mistral) in air-gapped/private environments.
• Deep expertise in AI frameworks such as PyTorch, Hugging Face, LangChain, and LlamaIndex.
• Strong experience with RAG architectures, embeddings, semantic search, and vector databases (e.g., FAISS, Qdrant, Milvus, Chroma).
• Proficiency in containerizing AI workloads (Docker/Kubernetes) and managing GPU-based compute environments.
• Solid understanding of advanced ML concepts (e.g., LoRA/QLoRA fine-tuning, prompt engineering, and model quantization formats such as GGUF, AWQ, and EXL2).
• Ability to work autonomously in isolated, non-cloud development environments.
• Experience in compliance-driven or regulated industries (e.g., fintech, legal-tech).
• Familiarity with local-first agentic workflows, Model Context Protocol, and developing internal developer copilots.

Other Details

• Contractor, project-based role with immediate start and potential for extension.
• Mission: build a fully secure, scalable, and self-hosted AI foundation.
• All AI systems must operate 100% locally, with zero external API calls.
• Deliver high-throughput, low-latency inference pipelines and a secure, auditable AI architecture for future company-wide use.

