LLM Inference Deployment Engineer

EnCharge AI · Anywhere

Full-time · Python · Docker · Kubernetes · PyTorch · TensorFlow

About the Role

Job Description:

Deploy and optimize LLMs (GPT, LLaMA, Mistral, Falcon, etc.) post-training, using models from libraries such as Hugging Face.
Use inference runtimes such as ONNX Runtime and vLLM for efficient execution.
Optimize batching, caching, and tensor parallelism to improve LLM scalability in real-time applications.
Develop and maintain high-performance inference pipelines using Docker, Kubernetes, and inference servers.

Requirements:

Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field.
Experience in LLM inference deployment, model optimization, and runtime engineering.
Strong expertise in LLM inference frameworks (PyTorch, ONNX Runtime, vLLM, TensorRT-LLM, DeepSpeed).
In-depth knowledge of Python for model integration and performance tuning.
Strong understanding of high-level model representations, and experience implementing framework-level optimizations for generative AI use cases.
Experience with containerized AI deployments (Docker, Kubernetes, Triton Inference Server, TensorFlow Serving, TorchServe).
Strong knowledge of LLM memory optimization strategies for long-context applications.
Experience with real-time LLM applications (chatbots, code generation, retrieval-augmented generation).
