Data Pipeline Engineer - School of Computer Science - MLD

Remote · USA Full-time New today

Carnegie Mellon University is a private, global research university that stands among the world’s most renowned education institutions. With ground-breaking brain science, path-breaking performances, creative start-ups, big data, big ambitions, hands-on learning, and a whole lot of robots, CMU doesn’t imagine the future, we invent it. If you’re passionate about joining a community that challenges the curious to deliver work that matters, your journey starts here!

The Machine Learning Department (MLD) at Carnegie Mellon University is a leading hub for research and education in artificial intelligence and machine learning. It focuses on developing innovative algorithms and models to address complex problems in diverse fields such as robotics, healthcare, and finance. The department offers a range of undergraduate and graduate programs, fostering a collaborative environment that bridges theoretical research and practical applications. Faculty and students frequently collaborate with industry and other academic disciplines to push the boundaries of what is possible with machine learning.

We are seeking a Data Pipeline Engineer to join the team! As a Data Pipeline Engineer, your role is vital in ensuring the integrity and reliability of our data pipelines. This position is responsible for monitoring, troubleshooting, and conducting root cause analysis of data quality issues within our pipelines, but as a part time team member, you will consult and assist rather than lead in these areas. Your contributions are crucial to maintaining the high standard of our epidemiological tracking and forecasting tools. This role will report directly to the Delphi Engineering Manager.

Core Responsibilities

- Monitor and maintain the health and efficiency of data pipelines.
- Troubleshoot and perform root cause analysis for data discrepancies and pipeline issues.
- Communicate with data providers to understand data discrepancies and manage changes in data delivery.
- Implement fixes and enhancements to improve data quality and pipeline performance.
- Collaborate with data scientists and analysts to understand data needs and implement effective data solutions.
- Develop strategies for data validation and quality assurance.
- Optimize data flow and collection to improve system efficiency.
- Document and manage data pipeline architectures, including maintenance and update protocols.
- Use tools such as SQL, version control and CI/CD, containerization, task schedulers, python frameworks, and cloud services for data pipeline management.
- Ensure compliance with data governance and security standards.

Adaptability, excellence, and passion are vital qualities within Carnegie Mellon University. We are in search of a team member who can effectively interact with a varied population of internal and external partners at a high level of integrity. We are looking for someone who shares our values and who will support the mission of the university through their work.

Qualifications:

- Bachelor’s Degree required.
- Minimum one year of research computing experience required.
- Basic Linux use and administration: system layout, file permissions, shell, utilities (syslog, cron), diagnostic tools (ps, htop, grep, lsof)
- Experience in Apache Airflow, preferably version 3.0
- Basic database use, especially in Postgres
- Rough script programming (Python, bash)
- Team software development (git/GitHub, Jira, code reviews, agile methodologies)
- Data analysis: diagnosing and fixing runtime errors and logic bugs; performing basic growth projections to predict future problems; communicating results
- Required technologies: Python, MySQL/Postgres, Linux, git & GitHub, Apache Airflow

A combination of education and proven experience from which comparable knowledge is demonstrated may be considered.
Preferred Technologies and Languages:

- Linux, Ubuntu, Bash, Make
- Apache Airflow
- Python, pandas, Flask, PyPI publishing
- SQL, Postgres
- git, GitHub, GitHub Actions, GitHub Issues
- Docker, Docker Compose
- Elastic, Kibana, FileBeat
- G Suite (Calendar, Mail, Docs, Sheets, Slides, Forms, AppsScript, Groups)
- Jira Software

Requirements:

- Successful completion of a pre-employment background check

Joining the CMU team opens the door to an array of exceptional benefits. Benefits eligible employees enjoy a wide array of benefits including comprehensive medical, prescription, dental, and vision insurance as well as a generous retirement savings program with employer contributions. Unlock your potential with tuition benefits, take well-deserved breaks with ample paid time off and observed holidays, and rest easy with life and accidental death and disability insurance. Additional perks include a free Pittsburgh Regional Transit bus pass, access to our Family Concierge Team to help navigate childcare needs, fitness center access, and much more! For a comprehensive overview of the benefits available, explore our Benefits page.

At Carnegie Mellon, we value the whole package when extending offers of employment. Beyond credentials, we evaluate the role and responsibilities, your valuable work experience, and the knowledge gained through education and training. We appreciate your unique skills and the perspective you bring. Your journey with us is about more than just a job; it’s about finding the perfect fit for your professional growth and personal aspirations.

Are you interested in an exciting opportunity with an exceptional organization?! Apply today!

Location: Remote
Job Function: Software/Applications Development/Engineering
Position Type: Staff – Fixed Term (Fixed Term)
Full Time/Part time: Part time
Pay Basis: Hourly

More Information: Please visit “Why Carnegie Mellon” to learn more about becoming part of an institution inspiring innovations that change the world.
Carnegie Mellon University is an Equal Opportunity Employer/Disability/Veteran.
