[Remote] Senior Site Reliability Engineer

Remote · USA · Full-time

Note: This is a remote position open to candidates in the USA. CURO Financial Technologies Corp, through its brand Attain Finance, is a leading consumer credit lender with over 50 years of expertise in providing financial solutions. The company is seeking a Senior Site Reliability Engineer to improve the reliability and operational excellence of its software deployment and management processes, with a focus on AWS, Kubernetes, and various DevOps tools.

Responsibilities

  • Build and operate the delivery platform. Work across AWS, EKS, ArgoCD, Helm, GitHub Actions, Azure DevOps, Terraform, and Python
  • Fix the problems you own. Find root cause across the AWS and Kubernetes stack, fix it, and harden it so it stays fixed
  • Respond to incidents. Help stabilize during outages, drive root-cause analysis, and ship corrective actions for your systems
  • Standardize how we build and ship. Define reproducible container builds and GitOps paths on ArgoCD and Helm that replace manual deployment
  • Help consolidate the CI estate. Standardize pipelines across GitHub Actions and Azure DevOps for your services: remove brittle steps and silent failures, and improve visibility
  • Support platform adoption. Build golden-path templates and tooling and help teams move services onto the platform
  • Use progressive delivery. Canary and blue-green deploys (Argo Rollouts) and automated rollback for the services you operate
  • Build observability in. Wire golden-signal metrics, logs, and traces (Prometheus/Mimir, Loki, Tempo, OpenTelemetry) into your services, surfaced in Grafana with SLOs for your domain
  • Operate production systems. Troubleshoot failed deployments, respond to alerts, and improve behavior based on real incidents
  • Help meet SLOs and carry on call. Track reliability metrics for the services you operate and share the rotation
  • Build across environments. Design dev, test, and prod for safe promotion, recovery from failed deployments, and zero-downtime upgrades
  • Help set the standard. Build reference implementations for build, deploy, GitOps, promotion gates, and observability
  • Uphold compliance with the pipeline. Support deployment traceability, approval trails, and segregation of duties for PCI DSS, SOC 2, SOX, and GLBA
  • Cut toil and cost. Automate repetitive ops work and help tune EKS compute, CI runners, and observability cardinality
  • Unblock across teams. Get hands-on with Cloud, Security, Application Engineering, Data, and Product to keep delivery moving
  • Kill knowledge silos. Write docs, runbooks, and incident learnings, so engineers operate independently

Skills

  • Kubernetes, ArgoCD, Helm, Terraform, Python. Deep hands-on production experience
  • Hands-on AWS. Operate and debug EKS, ECS, EC2, ECR, IAM/IRSA, VPC networking, ALB/NLB, CloudWatch, Secrets Manager, and KMS
  • GitHub Actions and/or Azure DevOps. Build and operate CI/CD at scale
  • Grafana and the observability stack. Hands-on with Grafana dashboards and alerting, and the metrics, logs, and traces stack (Prometheus/Mimir, Loki, Tempo, OpenTelemetry)
  • Strong scripting. Python and Bash, with the ability to grow into systems-level coding
  • Production troubleshooting. Comfortable getting into a system under load, finding root cause, and fixing it
  • Production ownership. Uptime and reliability accountability
  • Incident response. You respond and help drive postmortems that yield real improvements
  • Standards contribution. You contribute to engineering standards and best practices
  • Compliance awareness. Experience in regulated or high-rigor environments or implementing audit and access controls in pipelines
  • Mentorship. Through code review, examples, and pairing
  • 5+ years in site reliability, platform, DevOps, or software engineering, with production ownership of systems or pipelines
  • Advanced GitOps. ArgoCD (or Flux), reusable Helm patterns, Argo Rollouts
  • CI consolidation or migration. Moving between CI systems, such as Azure DevOps to GitHub Actions
  • Self-hosted observability at scale. Running Grafana, Mimir, Loki, and Tempo in production
  • Supply chain security. SBOMs, artifact signing (Sigstore/cosign), SLSA provenance
  • Platform migrations. Contributing to modernization with minimal disruption
  • .NET / C#. Enough to containerize and reason about application workloads
  • Low-level Kubernetes. Cilium/eBPF, Karpenter, or self-hosted networking and autoscaling
  • Resilience testing. Chaos/failure injection or disaster recovery drills
  • AI-assisted tooling. Responsible use with output validation
  • Certification. AWS Solutions Architect, AWS DevOps Engineer, or CKA/CKAD
  • Degree in computer science or equivalent practical experience

Benefits

  • Flexible Paid Time Off Program
  • Medical
  • Dental
  • Vision
  • Life Insurance
  • Disability
  • Other voluntary coverages
  • 401k program, starting on the first of the month following 30 days of employment with a company match

Company Overview

  • Big News! We’ve changed our name. CURO is now Attain Finance. The company was founded in 1997 and is headquartered in Wichita, Kansas, USA, with a workforce of 0-1 employees. Its website is https://www.curo.com/.