[Remote] Sr. Associate, Site Reliability Engineering
Note: This is a remote position open to candidates in the USA. McKesson is an impact-driven, Fortune 10 company that touches virtually every aspect of healthcare. They are seeking a Sr. Associate in Site Reliability Engineering to develop, deploy, and maintain cloud-based infrastructure and ensure the reliability and performance of their data systems.
Responsibilities
- Developing, deploying, and maintaining cloud-based infrastructure and data platforms hosted on AWS
- Designing and maintaining scalable, secure, and highly available cloud environments that support our production workloads
- Ensuring the reliability and performance of our Databricks-based data infrastructure, which is central to our business intelligence and data science operations
- Supporting rapid deployment cycles and maintaining consistency across development, staging, and production environments
- Diagnosing complex system failures and implementing preventive measures to minimize downtime
- Managing access controls, encryption, and vulnerability remediation
- Collaborating with software engineers, data scientists, and IT operations teams
Skills
- Master's degree, or a foreign equivalent, in Computer Science or a related field of study and two (2) years of experience in an SRE or DevOps role on any cloud platform, in the job offered or a related occupation
- Experience must include two (2) years in the following skills: Amazon Web Services (AWS), including EC2, S3, Lambda, CloudFormation, IAM, DMS, RDS Proxy, Event Bus, Athena, State Machines, API Gateway, and DynamoDB
- Databricks for managing large-scale data pipelines, real-time analytics, and machine learning workflows
- Infrastructure automation using Terraform, Ansible, GitLab, and CI/CD pipelines
- Incident management practices and high-availability system design to ensure 24/7 uptime of mission-critical systems
- Security best practices and compliance standards including SOC 2 and ISO 27001
- Linux system administration
- Programming in Python, Ruby, or Bash
- Cloud resources and concepts such as networking, load balancing, DNS, and security
- Identifying performance bottlenecks, detecting anomalous system behavior, and resolving the root cause of service issues
- Deploying and maintaining Docker applications and managing container orchestration environments in production using ECS or EKS
- Working with MySQL and PostgreSQL relational databases
- Infrastructure monitoring using Datadog, including setting up synthetic monitoring, on-call alerts, and pager alerts
- Participation in Incident Management teams
- Leading and managing technical projects including costing and time management projections
- Identity management within Okta and Azure AD
- Experience must include one (1) year in the following skills: Salesforce technical support and knowledge
- GoAnywhere MFT
- Tableau
- PHP
Benefits
- We are proud to offer a competitive compensation package at McKesson as part of our Total Rewards.
- In addition to base pay, other compensation, such as an annual bonus or long-term incentive opportunities, may be offered.
Company Overview
Company H1B Sponsorship