Senior Platform SRE (Platform Operations)
Role purpose:
- Ensure reliability, operability, and continuous improvement of TD SYNNEX enterprise platforms across hybrid cloud and on‑prem environments.
- Practice engineering‑driven operations focused on automation, Infrastructure‑as‑Code (IaC), observability, and toil reduction.
- Serve as the L3 escalation for complex incidents; continuously improve platform run posture and readiness for L1/L2 execution.
Core responsibilities:
- Platform reliability (hybrid cloud + on‑prem): Own L3 reliability posture; define SLOs/KPIs; lead operability gates and production readiness; maintain runbooks/SOPs.
- Automation & IaC: Design/build operational automation (health checks, remediation workflows); develop Terraform/Ansible configurations; script with Python (preferred), PowerShell, and/or Bash; integrate with ITSM for auditable self‑service and controlled remediation.
- Incident/problem/RCA (L3): Lead diagnosis, stabilization, and recovery for major incidents; drive problem management, RCA, preventive actions; reduce MTTR/MTTD via better signals, runbooks, and automation.
- Observability standards: Define actionable signals, alert quality, dashboards, logging; tune alerting to reduce noise; run data‑driven operational reviews.
- AIOps enablement: Advance predictive/proactive operations (anomaly detection, trend/capacity analysis); support Python‑based analytics and ML/DL where applicable; industrialize operational intelligence safely.
- Provider enablement (outsourced L1/L2): Equip provider with clear runbooks, training, standard changes, escalation criteria; govern performance and ITSM alignment; drive continuous improvement.
- Collaboration & continuous improvement: Partner with Platform Engineering to ensure operable‑by‑design capabilities; feed operational insights into the roadmap; mentor peers and promote an engineering‑led operations culture.
Required qualifications:
- 5+ years in platform/SRE/operations/platform engineering with production ownership in large‑scale environments.
- Hands‑on hybrid operations (cloud + on‑prem) with strong enterprise cloud fundamentals (compute, networking, storage, identity).
- Production IaC and automation (Terraform, Ansible); scripting with Python/PowerShell/Bash (Python strongly preferred).
- Proven L3 incident troubleshooting and major incident leadership.
- Strong infrastructure fundamentals: networking (including DNS/DHCP concepts), virtualization, storage, Windows Server and/or Linux.
- ITSM experience (incident, problem, change) and ticket‑based operations.
- Azure platform knowledge.
Preferred/valued:
- SRE practices (SLOs, error budgets, postmortems, toil reduction).
- Virtualization and backup/DR operations experience.
- Exposure to containers and DevOps/CI/CD; configuration drift control.
- Python for operational analytics; familiarity with ML/DL for anomaly detection, forecasting, clustering.
- Experience in large, multinational, 24/7 operations.
- Knowledge of AI/agentic approaches and modern automation patterns.
Desired attributes:
- Engineering mindset; automates and standardizes to reduce toil.
- Strong ownership; calm, structured incident leader.
- Clear communicator in global, matrixed environments; effective cross‑team/vendor partner.
- Comfortable across cloud and on‑prem; disciplined documentation; committed to operational excellence.
Key competencies:
- Site reliability/operations engineering
- Automation, IaC (Terraform/Ansible), scripting (Python/PowerShell/Bash)
- Hybrid infrastructure operations
- Incident/problem management, RCA, continuous improvement
- Observability and alert quality management (tool‑agnostic)
- Provider enablement and operational governance
- Data‑driven operations and AIOps‑oriented thinking
At TD SYNNEX, our values guide everything we do: Together, We Own It, We Dare to Go, We Grow and Win, and above all, We Do the Right Thing. These principles shape how we work with each other, our partners, and our communities as we drive innovation and create lasting impact.
What’s In It For You?
- Elective Benefits: Our programs are tailored to your country to best accommodate your lifestyle.
- Grow Your Career: Accelerate your path to success (and keep up with the future) with formal programs on leadership and professional development, and many more on-demand courses.
- Elevate Your Personal Well-Being: Boost your financial, physical, and mental well-being through seminars, events, and our global Life Empowerment Assistance Program.
- Diversity, Equity & Inclusion: It’s not just a phrase to us; valuing every voice is how we succeed. Join us in celebrating our global diversity through inclusive education, meaningful peer-to-peer conversations, and equitable growth and development opportunities.
- Make the Most of our Global Organization: Network with other new co-workers within your first 30 days through our onboarding program.
- Connect with Your Community: Participate in internal, peer-led inclusive communities and activities, including business resource groups, local volunteering events, and other environmental and social initiatives.
Don’t meet every single requirement? Apply anyway. At TD SYNNEX, we’re proud to be recognized as a great place to work and a leader in the promotion and practice of diversity, equity and inclusion. If you’re excited about working for our company and believe you’re a good fit for this role, we encourage you to apply. You may be exactly the person we’re looking for!