2 days
<ul> <li>Release Management: Coordinate and manage release cycles for observability platforms. Ensure smooth and timely releases with minimal disruption to services. Work with partners to migrate legacy monitoring to modern solutions. Work with the observability engineering team to meet new requirements as they arise, by leveraging existing solutions or developing new ones. </li><li>Incident/Request Management: Troubleshoot and resolve incidents related to observability platforms. Manage escalated customer issues and requests, ensuring timely and effective resolution. Document incident remediation activities and automate them where possible. </li><li>Performance Optimization: Continuously monitor and enhance platform performance to support scalability and complexity. </li><li>Collaboration and Communication: Collaborate with cross-functional infrastructure, application, and business stakeholders to ensure observability solutions align with the broader IT strategy and infrastructure requirements. Communicate effectively with team members, management, and other stakeholders. </li><li>Continuous Improvement: Identify opportunities for process optimization and efficiency gains. Stay current with industry trends and best practices to continuously improve observability operations. </li><li>Customer Focus: Ensure high levels of customer satisfaction by effectively managing customer relationships. Provide excellent customer service and support for observability solutions. </li><li>Compliance and Security: Ensure observability platforms comply with organizational policies and security standards. Implement tools and processes to detect and remediate configuration drift and security risks. </li><li>Documentation and Reporting: Maintain comprehensive documentation of the observability platform, Product DOU, processes, and procedures. 
</li></ul> <p>Technical Expertise:</p> <ul> <li>5+ years of experience in IT operations, with significant responsibilities in system monitoring, performance tuning, and troubleshooting enterprise applications. </li><li>4+ years in a Site Reliability Engineering (SRE) role managing modern observability solutions. </li><li>5+ years of development experience on enterprise-class applications: JavaScript/Java, SQL, Spring Boot, and microservices. </li><li>5+ years managing and implementing observability and event management platforms (e.g., AppDynamics, Splunk, Prometheus, Grafana). </li><li>5+ years of experience with cloud computing platforms (e.g., AWS, Azure, GCP) and container orchestration (e.g., Kubernetes, Docker). </li><li>Familiarity with CI/CD pipelines and automation tools (e.g., Jenkins, GitLab, Argo CD). </li><li>Experience developing and implementing monitoring and logging standards for infrastructure, platforms, and applications. </li><li>Experience establishing and implementing event correlation policies and related rules to enrich event data and reduce TTD and TTR. </li></ul> <p>#LI-RJ2</p> <p>Salary Range - $90,790-$110,000 a year</p>