Lead Systems Monitoring Engineer

Remote · USA Full-time

Halvik Corp delivers a wide range of services to 13 executive agencies and 15 independent agencies. Halvik is a highly successful WOB with more than 50 prime contracts and 500+ professionals delivering Digital Services, Advanced Analytics, Artificial Intelligence/Machine Learning, Cyber Security, and Cutting-Edge Technology across the US Government. Be a part of something special!

The Lead Systems Monitoring Engineer is responsible for leading the design, implementation, and operational support of enterprise‑grade monitoring and observability solutions for cloud‑hosted applications and infrastructure. This role provides technical leadership to a team of engineers while working closely with government customers to translate operational, performance, and availability requirements into reliable monitoring solutions.

The position oversees the administration and integration of platforms such as Splunk, AWS CloudWatch, Azure Monitor, and other application performance monitoring (APM) tools to ensure end‑to‑end visibility across complex cloud environments. Responsibilities include developing dashboards and SLA and performance reports, managing high‑volume data ingestion architectures, and implementing synthetic monitoring capabilities to proactively detect service degradation.

The role also plays a critical part in performance engineering and reliability operations by analyzing load and regression test results, conducting Java/JVM performance analysis, and leading incident triage efforts for AWS‑hosted applications. By combining deep technical expertise with customer engagement and team leadership, this position ensures high availability, performance, and reliability of mission‑critical systems.

Core Responsibilities

  • Strategic Execution: Executes cloud observability and reliability strategy by leading engineering teams, translating customer requirements into scalable monitoring solutions, and operationalizing performance, availability, and SLA objectives through Splunk and cloud‑native services.
  • Operational Oversight: Provides operational oversight of cloud monitoring and observability platforms, ensuring SLA compliance, proactive issue detection, effective incident response, and continuous performance optimization for AWS‑hosted applications.
  • Collaboration: Collaborates with government customers, engineering teams, and stakeholders to translate monitoring requirements into scalable solutions, coordinate incident response, and drive continuous improvement in cloud reliability and performance.
  • Technical Performance: Ensures optimal application and infrastructure performance through advanced monitoring, JVM instrumentation, load and regression analysis, and Splunk‑based event analytics. Proactively identifies, analyzes, and resolves performance and reliability issues in AWS cloud environments to meet SLA and availability targets.

Minimum Requirements

  • Education: Bachelor's Degree in Computer Science with 8 years of experience, or Master's Degree with 5 years of experience
  • Experience: 8+ years of IT experience with 5+ years specializing in Application Performance Management, cloud monitoring, and performance engineering for complex, multi‑tier applications. Strong expertise in synthetic monitoring, Java/JVM performance analysis, and automation using Splunk, cloud‑native tools, and scripting.
  • Technical Proficiency: Application Performance Management, synthetic availability monitoring, and performance analysis across complex, multi‑tier systems using Java/JVM, cloud and APM tools. Strong scripting and automation skills (Shell, PowerShell, Regex) with expertise in performance tuning, monitoring, and data‑driven troubleshooting.
  • Compliance: Ensures monitoring and observability solutions adhere to government, security, and operational compliance requirements while meeting defined SLAs and availability standards. Maintains compliant configurations, reporting, and data handling practices across cloud monitoring platforms and performance engineering processes.
  • Certifications: Certifications in APM/observability (Splunk) and cloud platforms (AWS/Azure)

Preferred Expertise

  • Strong expertise in AWS cloud monitoring and Splunk observability platforms, with experience supporting government environments (USPTO preferred), combined with proven technical leadership and strong communication skills for customer‑facing collaboration.
  • Strong analytical and problem-solving capabilities.

Halvik offers a competitive full benefits package including:

  • Company-supported medical, dental, vision, life, STD, and LTD insurance
  • 11 federal holidays and PTO
  • Performance-based incentives for eligible employees, in recognition of individual and/or team achievements
  • 401(k) with company matching
  • Flexible Spending Accounts for commuter, medical, and dependent care expenses
  • Tuition Assistance
  • Charitable Contribution matching

Halvik Corp is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or veteran status.
