Principal Site Reliability Engineer, AI Platform Architecture
Link Group
Szczecin, zachodniopomorskie | Full-time | IT | 18 April 2026
€348,000 – €432,000
Full description
Key Responsibilities:
- Defining the reliability architecture for AI compute services, including SLO frameworks, fault tolerance patterns, and advanced capacity planning models.
- Driving hands-on development of automation and tooling that scales the SRE team's impact and eliminates operational toil.
- Designing a comprehensive observability strategy, leveraging existing platforms to build specialized telemetry and GPU-specific monitoring for AI workloads.
- Architecting deployment safety standards, in…