LockedIn AI is hiring an AI Cloud Engineer to design and operate the cloud infrastructure powering real-time AI systems used by over 1 million users worldwide.
About the Role
We are looking for a cloud-native, AI-infrastructure-focused engineer to build and scale the backbone of our AI systems. This role sits at the intersection of cloud engineering, DevOps, and machine learning infrastructure, where you will design the environments that train, serve, and optimize large-scale AI models in production.
You will own the full lifecycle of AI infrastructure, from GPU clusters for training to low-latency inference systems powering real-time interview assistance.
Key Responsibilities
AI-Optimized Cloud Architecture
- Design scalable cloud infrastructure for ML training, fine-tuning, and inference
- Architect GPU-based compute environments optimized for AI workloads
- Build multi-environment systems (training, staging, production) with proper isolation
- Implement auto-scaling systems for dynamic AI workloads
Model Serving & Inference Infrastructure
- Build production-grade inference systems for real-time AI responses
- Deploy and optimize model serving frameworks (vLLM, Triton, TensorRT, TGI, etc.)
- Optimize latency, throughput, batching, and GPU utilization
- Design load balancing, routing, and failover systems for AI APIs
GPU Compute & Training Systems
- Manage GPU clusters for model training and evaluation
- Configure distributed training (multi-node, multi-GPU setups)
- Optimize spot/preemptible instance usage for cost efficiency
- Operate managed ML platforms (SageMaker, Vertex AI, Azure ML, etc.)
Cloud Cost Optimization (FinOps for AI)
- Monitor and optimize cloud spend across GPU, storage, and API usage
- Implement cost dashboards and alerts for infrastructure usage
- Optimize LLM usage, token consumption, and inference efficiency
- Reduce idle compute and improve GPU utilization rates
Networking, Security & Reliability
- Design secure VPCs, private endpoints, and high-performance networking
- Implement IAM policies, encryption, and secrets management
- Ensure compliance readiness (SOC2, GDPR, CCPA)
- Build resilient systems with high availability and fault tolerance
Infrastructure as Code & Observability
- Build all infrastructure using Terraform, Pulumi, or CloudFormation
- Implement GitOps workflows for reproducible deployments
- Develop monitoring systems for GPU health, latency, and system performance
- Build alerting systems for failures, spikes, and anomalies
Required Qualifications
Experience
- 3+ years in cloud engineering, DevOps, or infrastructure roles
- Experience with ML/AI workloads in production environments
- Hands-on experience with GPU-based compute systems
- Startup or high-growth environment experience preferred
Technical Skills
- Strong proficiency in Python, Go, or Bash
- Deep experience with AWS, GCP, or Azure
- Strong Kubernetes expertise (GPU scheduling, autoscaling, Helm, etc.)
- Experience with model serving systems (vLLM, Triton, TensorRT, etc.)
- Infrastructure as Code (Terraform, Pulumi, CloudFormation)
- Monitoring tools (Prometheus, Grafana, Datadog, CloudWatch, etc.)
Soft Skills
- Strong systems thinking and cloud architecture mindset
- Cost-conscious engineering approach
- Clear communication and documentation skills
- Ability to work independently in fast-paced environments
Preferred Qualifications
- Experience with large-scale LLM inference systems
- Multi-GPU distributed training expertise
- Knowledge of real-time streaming or low-latency systems
- Experience with RDMA, InfiniBand, or high-performance networking
- Background in SaaS, edtech, or AI product companies
- Open-source or startup experience
What We Offer
- Competitive equity in a fast-growing AI company
- $140,000 – $195,000 USD / year compensation range
- Remote-first work model (US-based, NYC optional hybrid)
- Opportunity to build infrastructure used by 1M+ users
- High-impact engineering ownership from day one
- Fast-paced, AI-native development environment
About LockedIn AI
LockedIn AI is building the world's leading real-time AI interview and meeting copilot. Our platform helps users succeed in interviews, assessments, and professional conversations using advanced AI systems operating at scale.
How to Apply
Submit:
- Resume/CV
- Short note covering:
  - Why you want to join LockedIn AI
  - Experience with cloud or AI infrastructure
  - Ideas for improving AI system performance or scalability
- Optional: GitHub, portfolio, or technical writeups