Thinkfind Corporation
Salary: $45 – $50 per Hour
Dynatrace Operations Awareness Engineer Fort Worth, Texas
As an Operations Awareness Engineer for our leading client, you will monitor our client’s systems, manage alerting, and provide support to ensure seamless operations. Your duties will include Incident and System Management: collaborate with internal teams and suppliers to analyze and resolve critical IT and Telecom service interruptions, and protect system availability through incident, problem, and change management; System Monitoring and Optimization: monitor systems for faults, identify optimization opportunities, and implement tools and process changes to improve monitoring and alerting; and Incident Response and Root Cause Analysis: work with major incident response teams for escalations and monitoring during major incidents.
Ideally, you will have 3-5 years of experience with Dynatrace, CloudWatch, Zabbix, SCOM, or similar tools, and a solid understanding of cloud architecture and DevOps principles. AWS X-Ray experience is a substantial plus. Technical certifications or 5+ years in event monitoring and alerting, DevOps, infrastructure support, or IT major incident management are required. Other required skills include DevOps application performance tuning; strong writing skills for documentation; proficiency in distributed systems/administration (Windows, Unix, Linux, VMware, etc.); knowledge of ITIL best practices (certification is a plus); familiarity with the SDLC; experience in SLA/KPI-driven environments; ServiceNow proficiency; and general scripting/programming skills (Python, Node.js, Ruby, Perl, Bash/sh). Preferred qualifications include cloud certifications (AWS, Azure, etc.); experience with infrastructure as code tools (Terraform, Ansible, etc.); ITIL V3 or V4 certification; advanced technical skills in various operating systems and environments; and a proven ability to improve monitoring and alerting processes. Must be self-motivated with an ability to define, develop, and execute plans; manage system outages; and handle high-stress situations. Must be able to work in a 24/7 environment and provide on-call support. Bachelor’s degree in Computer Science, Information Systems, or Engineering preferred.
**Position requires 50% onsite work**
**Weekend availability may be required periodically**
Local candidates preferred. Currently open to U.S. citizens (USC) and green card (GC) holders only.