Your role
Do you have a knack for technology and an interest in improving existing processes? Are you at your best when supporting others? Are you passionate about IT process automation using cutting-edge tooling platforms? Have you supported large-scale production systems? We’re looking for someone with such experience to help us:
- Provide front-line technical support to end users on a variety of technology issues.
- Support and monitor various business/IT applications, batch jobs and infrastructure components as part of an application support team providing coverage from multiple regions.
- Continually drive improvement to our service offering using data-driven analytics, emphasizing opportunities for automation and robotics.
- Coordinate with multiple and varied teams to conduct and manage special events such as disaster recovery tests, infrastructure change weekends, data centre maintenance, and special event days.
- Ensure supportability and quality are maintained with the introduction of production changes.
Key Responsibilities:
- Incident Management: Investigate, diagnose and resolve incidents. Uphold high standards for timely issue resolution.
- Problem Management: Identify and manage tasks to analyze incidents and implement preventative measures that avoid repeat occurrences and major incidents.
- Change Management: Review and approve change requests, and support production change deployments and events, including weekend coverage.
- Service Transition: Work with Application Development to ensure new projects are implemented in line with current standards and are ready for production, including applications that are moving to the Cloud.
- Service Improvement: Leveraging data-driven analysis and metrics, partner with IT Infrastructure and Application Development to proactively improve and measure application serviceability, reliability, and scalability (an efficiency-first approach).
- Knowledge Management: Maintain knowledge articles, such as support and disaster recovery procedures, to ensure that they are kept relevant and up to date.
- Batch Management: Resolve batch events and help identify opportunities for optimization and automation.
- Performance Monitoring & Enhancement: Work with IT Infrastructure to ensure that all relevant metrics for server performance and capacity are available. Use metrics to maintain and enhance system performance and automation.
Your team
You will be working for Application Support & Reliability AM in London, United Kingdom. We are aligned to the Asset Management (AM) business and are responsible for providing production support in a 24x7 follow-the-sun model, utilizing hubs in Singapore, India, Poland and the US. In this role you are part of a high-performance team of Site Reliability Engineers with wide expertise in monitoring agents and platforms. You will work with committed, quality-driven and highly technical people who understand the monitoring area, all the tools that interact with it, and the needs of our clients.
Your expertise
- ideally 8+ years of hands-on support experience across multiple operating systems
- knowledge of diverse technology solutions and services (networks, operating systems, infrastructure components, applications)
- understanding of Microsoft Azure or related Cloud technologies, DevOps, orchestration and automation (Automation Anywhere, IPcenter)
- experience in monitoring and instrumentation (Netcool, Moogsoft, AppDynamics, Splunk)
- knowledge of Windows Server/Linux, scripting (SQL, PowerShell, Python) and relational databases (Oracle/Sybase/MSSQL/PostgreSQL)