Engineering - SRE Platforms - Software Engineer - Vice President - Dallas
The Goldman Sachs Group
United States, Texas, Dallas
Oct 29, 2025
Goldman Sachs is seeking a talented and motivated Site Reliability Engineering Manager to join our team. As a leader within the firm's Technology division, you will oversee the Site Reliability Engineering (SRE) function, ensuring the stability and reliability of critical applications and infrastructure. You will manage a team of site reliability engineers who work closely with developers, infrastructure engineers, and operations teams to build and maintain highly available systems.

Key Responsibilities:

* Manage a team of Site Reliability Engineers responsible for ensuring the reliability, availability, and performance of critical applications and infrastructure
* Develop and implement best practices for Site Reliability Engineering, including incident management, monitoring, automation, and capacity planning
* Collaborate with development teams to design and build highly available and scalable systems
* Work with infrastructure teams to ensure that critical infrastructure components are operating optimally and can support the needs of the business
* Develop and maintain Service Level Agreements (SLAs) and Service Level Objectives (SLOs) to ensure that critical systems meet the needs of the business
* Manage and prioritize the SRE team's workload, ensuring that it is aligned with business priorities
* Develop and maintain relationships with key stakeholders across the organization to ensure that the SRE function is aligned with business goals

Qualifications:

* Bachelor's degree in Computer Science, Engineering, or a related field
* 8+ years of experience in Site Reliability Engineering, with at least 3 years in a management role
* Strong leadership skills with the ability to manage a team of engineers
* Experience with cloud computing platforms such as AWS or Azure
* Experience with infrastructure as code (IaC) tools such as Terraform or CloudFormation
* Experience with containerization technologies such as Docker and Kubernetes
* Strong problem-solving skills with the ability to troubleshoot complex issues
 
                             
  
 