<p>At Klaviyo, Platform Engineering is what you get when you treat operating complex systems as a software engineering problem. Our Observability Platform group applies that philosophy to how we collect, store, and surface signals about the health of our products and infrastructure. We build and run the shared observability stack (metrics, logs, traces, alerting, and developer-facing tooling) that enables every product and platform team at Klaviyo to understand how their systems behave in production and to ship changes with confidence.</p> <p>As a Senior Observability Platform Engineer, you will design, build, and operate the core observability services that power Klaviyo's monitoring and incident response. You'll partner closely with product engineering, other platform teams, and security to define how we instrument services, standardize telemetry, and make it easy for engineers to debug issues in a fast-growing, distributed environment.</p> <p>How you'll make an impact</p> <ul> <li>Own observability platforms end-to-end - Design, implement, and operate scalable, highly available systems for metrics, logging, tracing, and alerting (e.g., Prometheus-compatible metrics, time-series storage, log pipelines, distributed tracing backends). </li><li>Build opinionated developer experiences - Create libraries, dashboards, runbooks, and self-service tooling that make "doing the right thing" for observability the easiest path for Klaviyo engineers. </li><li>Set standards for telemetry - Define and evangelize best practices for instrumentation, SLOs, alerting, and incident readiness across services and teams. </li><li>Drive reliability through data - Use observability data to identify performance bottlenecks, reliability risks, and architectural improvements, and collaborate with teams to address them. </li><li>Automate everything - Treat infrastructure as code; build automation for provisioning, configuration, scaling, and upgrades of observability components. 
</li><li>Mentor and multiply - Partner with engineers across Klaviyo to level up skills in debugging distributed systems, designing effective alerts, and using observability tools to make better product and reliability decisions. </li><li>Utilize AI - You've already experimented with AI in work or personal projects, and you're excited to dive in and learn fast. You're hungry to responsibly explore new AI tools and workflows, finding ways to make your work smarter and more efficient. </li></ul> <p>What we're looking for</p> <ul> <li>Strong software engineering experience in at least one modern language (e.g., Go, Python, Java) and comfort working in Linux-based production environments. </li><li>Hands-on experience designing and operating observability systems at scale (for example: Prometheus / Cortex / Thanos / Mimir, OpenTelemetry, Grafana, alerting pipelines, log aggregation systems, or distributed tracing backends). </li><li>A track record of improving reliability and performance of complex, distributed applications using telemetry and data-driven insights. </li><li>Experience with infrastructure-as-code and modern cloud-native tooling (e.g., Terraform, Kubernetes, service meshes, CI/CD systems). </li><li>Strong technical communication and collaboration skills: you're comfortable partnering with many teams, writing clear documentation, and leading technical discussions. </li><li>A mindset that values simple, well-understood systems, iterative improvement, and a bias toward empowering other engineers rather than being on the critical path for every change. 
</li></ul> <p>Technologies we use (not exhaustive):</p> <ul> <li>Backend: Python, Django, Go </li><li>Observability Platform: Chronosphere, Cortex, Prometheus, OTEL </li><li>Testing Frameworks: Pytest </li><li>Infrastructure and CI: AWS, Kubernetes, Terraform, Helm, Buildkite </li><li>Data: MySQL, Redis, Kafka </li></ul> <p>Klaviyo is growing fast and we have opportunities for engineers who care deeply about reliability, developer experience, and building strong foundational platforms. Learn more about our engineering culture at https://klaviyo.tech/.</p> <p>We use Covey as part of our hiring and/or promotional process. For jobs or candidates in NYC, certain features may qualify it as an AEDT. As part of the evaluation process, we provide Covey with job requirements and candidate-submitted applications. We began using Covey Scout for Inbound on April 3, 2025.</p> <p>Please see the independent bias audit report covering our use of Covey here</p>