Job Description
Are you ready to architect the future of Artificial Intelligence?
We are Nebula Horizon Solutions, a pioneer in next-generation technology. As we prepare for the massive shifts in the AI landscape projected for 2026, we are seeking a visionary GenAI Infrastructure Architect to lead our core research and deployment efforts. This is not just a job; it is an opportunity to define the standards of the industry.
In this role, you will bridge the gap between theoretical machine learning breakthroughs and scalable, production-grade infrastructure. You will be responsible for designing the systems that power our proprietary large language models and autonomous agents.
Why join us?
- Work on cutting-edge technology that will define the market for 2026 and beyond.
- Competitive compensation package and equity options.
- Flexible remote-first culture with a premium tech stack.
- Access to state-of-the-art hardware and research facilities.
Responsibilities
- Design and implement scalable, fault-tolerant AI infrastructure capable of handling petabyte-scale data.
- Lead the architecture for fine-tuning and deploying Large Language Models (LLMs) on edge devices and cloud environments.
- Reduce model inference latency and operational costs through caching and quantization strategies.
- Collaborate with cross-functional teams of data scientists, engineers, and product managers to translate research into product features.
- Establish best practices for MLOps, including CI/CD pipelines, experiment tracking, and model governance.
- Conduct research into emerging trends for 2026, such as neuromorphic computing or quantum-assisted machine learning.
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field; PhD preferred.
- 5+ years of experience in software engineering, with at least 3 years specifically in Machine Learning Engineering or AI Infrastructure.
- Deep proficiency in Python, C++, and CUDA.
- Extensive experience with ML frameworks such as PyTorch, TensorFlow, or JAX.
- Strong understanding of distributed systems, cloud platforms (AWS, GCP, or Azure), and containerization technologies (Docker, Kubernetes).
- Experience with model quantization and pruning, and with serving technologies such as vLLM, TGI, or TensorRT.
- Excellent problem-solving skills and the ability to work in a fast-paced, ambiguous environment.