Job Description

We are on the cusp of the artificial intelligence revolution, and Nebula AI Systems is building the infrastructure to power the future. We are seeking a visionary Senior AI Infrastructure Engineer to lead the architectural design of our next-generation neural processing systems. As we prepare for the paradigm shifts of 2026, your role will be pivotal in deploying scalable, fault-tolerant, and high-performance computing environments that support advanced generative models.
In this role, you will bridge the gap between cutting-edge AI research and robust, production-grade engineering. You will work in a high-velocity environment where your code directly impacts the capabilities of AI agents worldwide. If you are passionate about optimizing compute resources and building systems that scale to petabyte levels, we want to hear from you.

Responsibilities

Architect High-Performance Systems: Design and implement distributed computing architectures capable of handling massive inference loads for 2026-level generative models.
Optimize Inference Latency: Work closely with ML researchers to optimize model weights and kernels for specific GPU hardware, reducing latency and increasing throughput by up to 40%.
Cloud-Native Deployment: Leverage Kubernetes and containerization strategies to ensure seamless deployment across hybrid cloud environments with zero-downtime rollouts.
Infrastructure Automation: Develop IaC (Infrastructure as Code) pipelines using Terraform and Ansible to automate provisioning and scaling of GPU clusters.
Security & Compliance: Implement rigorous security protocols to protect sensitive training data and ensure adherence to industry standards.
Performance Monitoring: Establish real-time monitoring and alerting systems using Prometheus and Grafana to proactively identify and resolve bottlenecks.

Qualifications

Education: BS, MS, or PhD in Computer Science, Electrical Engineering, or a related technical field.
Experience: 5+ years of experience in software engineering, with at least 3 years specifically focused on AI infrastructure or high-performance computing.
Programming: Expert-level proficiency in Python and C++. Deep understanding of GPU programming (CUDA, OpenCL) or NPU architectures.
Systems: Strong working knowledge of Linux internals, distributed systems theory, and message queues (Kafka, RabbitMQ).
Tools: Experience with container orchestration (Docker, Kubernetes), cloud providers (AWS, GCP, or Azure), and ML frameworks (PyTorch, TensorFlow, JAX).
Soft Skills: Exceptional problem-solving abilities and the ability to communicate complex technical concepts to cross-functional teams.

Senior AI Infrastructure Engineer (2026 Readiness)

