Senior Principal Site Reliability Engineer | Oracle Health Federal Operations Team

Remote · USA Full-time New today

About the position

Are you a creative person who loves a challenge? Solve the complex puzzles you’ve been dreaming of as our Engineer. If you have a passion for innovation in tech, we want you on our team! Thrive in this crucial automation role. Oracle is a technology leader that’s changing how the world does business. We’re looking for an experienced and self-motivated person. We appreciate you taking the time to review the list of qualifications and to apply for the position. Come and join us!

Building off our Cloud momentum, Oracle has formed a new organization: Oracle Health. This team will focus on product deployment, sustainability, troubleshooting, and product strategy for Oracle Health, while building out a complete platform supporting modernized, automated healthcare. This is a net new line of business, constructed with an entrepreneurial spirit that promotes an energetic and creative environment. We are unencumbered and will need your contribution to make it a world-class engineering center with a focus on excellence.

As a Senior Principal Site Reliability DevOps Engineer, you will be responsible for defining and deploying key services with a deep focus on architecture, production operations, capacity planning, performance management, deployment, and release engineering. You will work with multiple cross-functional teams, helping deliver new and outstanding experiences to our collaborators while ensuring reliability and performance.

Responsibilities

  • Own the full service lifecycle: design, implementation, deployment, on-call, and continuous improvement—maintaining high code and reliability standards.
  • Define and meet service-level objectives (availability, latency, durability) while reducing toil through automation, observability, and self-healing mechanisms.
  • Lead architecture, analysis, design, implementation, and production operations for Core System Framework solutions, with strong documentation and runbooks.
  • Create and maintain clear, version-controlled documentation—architectural diagrams, SOPs, runbooks, and incident playbooks—to ensure repeatable operations, auditability, and fast onboarding.
  • Design, write, and deploy software that improves the availability, scalability, and efficiency of platform services.
  • Develop designs, architectures, standards, and methods for large-scale distributed systems.
  • Build automation to prevent problem recurrence; drive real-time monitoring, alerting, and self-healing into production systems.
  • Conduct capacity planning and demand forecasting; perform software performance analysis, system tuning, and optimization.
  • Contribute to and support platform services across architecture, provisioning, configuration, deployment, and ongoing operations.
  • Partner with distributed teams to prototype and launch new platform services.
  • Stay current on emerging technologies and introduce innovations that improve reliability, security, and developer productivity.
  • Help define and govern approved cloud tooling, patterns, and implementations in support of OHAI customer environments.
  • Mentor and guide engineers in distributed systems design, high-scale data processing, and operational excellence.
  • Set and raise engineering standards across multiple teams; model best practices in reliability, security, and automation.
  • Collaborate closely with storage, networking, observability, and security teams to deliver platform features and secure-by-default designs.
  • Participate in an on-call rotation; lead incident response, postmortems, and follow-through on corrective actions to drive continuous improvement.

Requirements

  • This is a remote US role. Candidates must be US citizens in order to support Oracle Federal customers (Public Trust clearance).
