Lead DevOps Engineer, Foundry RnD
2026-02-25T10:22:02+00:00
Mastercard Foundation
https://cdn.greatkenyanjobs.com/jsjobsdata/data/employer/comp_8415/logo/gggggg.jpeg
https://mastercardfdn.org/en/
FULL_TIME
Nairobi
Nairobi
00100
Kenya
Nonprofit, and NGO
Computer & IT, Science & Engineering, Management
2026-02-03T17:00:00+00:00
8
Mastercard is a leading global payments & technology company that connects consumers, businesses, merchants, issuers & governments around the world.
What You'll Do
Drive Platform Infrastructure: Own DevOps and infrastructure for MLOps and agentic AI systems, establishing reusable patterns for CI/CD, scalable inference, orchestration, observability, and cost control. Design secure, scalable, repeatable systems using Infrastructure as Code (IaC) to support R&D workloads.
Build Secure CI/CD & Automation Systems: Enable secure tool access, workload isolation, and infrastructure for LLM-backed APIs and MCP servers, while partnering with security and compliance on access control, infrastructure governance, and auditability.
Ensure Reliability & Observability: Implement monitoring, logging, and alerting. Tune observability for ML-specific workloads to ensure performance, reliability, and operational insight.
Provide Technical Leadership: Offer hands-on leadership across DevOps and platform initiatives. Review code, enforce best practices, improve tooling, and promote clean, well-tested infrastructure.
Cross-Functional Collaboration: Partner with ML, software, and platform engineers to design deployment strategies, scope work, manage agile deliverables, and meet milestones.
What You'll Bring
Extensive DevOps Experience: 8–12+ years in DevOps, SRE, or platform engineering, including senior/lead roles. Experience designing end-to-end infrastructure systems, solving scale/performance challenges, and operating platforms in production.
Cloud & Infrastructure Expertise: Strong skills in cloud platforms (AWS, Azure, or GCP) and AI/ML components such as Databricks, Azure ML, and MLflow. Deep experience with Infrastructure as Code using Terraform and orchestration tools like Terragrunt.
Container & Orchestration Mastery: Expertise in Kubernetes and Docker, including how they optimise ML development workflows. Experience with container security, networking, and cluster management at scale.
AI/ML Platform Knowledge: Understanding of ML workflow requirements—model registries, feature stores, AI agents, Retrieval-Augmented Generation (RAG) techniques, and frameworks like LangChain/LlamaIndex.
Leadership & Mentorship: Ability to translate ambiguous goals into clear plans, guide engineers, and lead technical execution.
Problem-Solving Mindset: Approach issues systematically, using analysis and data to select scalable, maintainable solutions.
Required Skills
Education & Background: Bachelor's degree in Computer Science, Engineering, or a related field. 8–12+ years of proven experience architecting and operating production-grade infrastructure, especially infrastructure supporting AI/ML workloads.
Infrastructure as Code: Expert in Terraform and IaC orchestration tools like Terragrunt. Strong experience with configuration management and GitOps practices.
Programming & Scripting: Advanced Bash and Python skills and strong software engineering fundamentals (version control, CI, code reviews). Familiarity with Go or other systems programming languages is a plus.
* DevOps
* SRE
* Platform Engineering
* Cloud platforms (AWS, Azure, or GCP)
* AI/ML components (Databricks, Azure ML, MLflow)
* Infrastructure as Code (Terraform, Terragrunt)
* Kubernetes
* Docker
* Container security
* Container networking
* Cluster management
* ML workflow requirements
* Model registries
* Feature stores
* AI agents
* Retrieval-Augmented Generation (RAG) techniques
* LangChain
* LlamaIndex
* Bash scripting
* Python scripting
* Software engineering fundamentals
* Version control
* CI
* Code reviews
* Go (plus)
* Systems programming languages (plus)
JOB-699ecd4a3b5c3
Vacancy title:
Lead DevOps Engineer, Foundry RnD
[Type: FULL_TIME, Industry: Nonprofit, and NGO, Category: Computer & IT, Science & Engineering, Management]
Jobs at:
Mastercard Foundation
Deadline of this Job:
Tuesday, February 3 2026
Duty Station:
Nairobi | Nairobi
Summary
Date Posted: Wednesday, February 25 2026, Base Salary: Not Disclosed
JOB DETAILS:
Work Hours: 8
Experience in Months: 12
Level of Education: Bachelor's degree