Role Overview:
We are seeking a NOC Infrastructure Engineer - L1 to join our IT operations team. The ideal candidate will have a foundational understanding of IT infrastructure and a strong desire to develop their technical skills in a fast-paced, mission-critical environment. The NOC Infrastructure Engineer - L1 will be the first point of contact for monitoring and responding to alerts, troubleshooting issues, and ensuring the stability and performance of infrastructure and systems.
Key Responsibilities:
Monitoring and Incident Detection:
- Continuously monitor IT Infrastructure and systems using SolarWinds.
- Respond to alerts and incidents as they arise, ensuring quick detection and resolution of issues.
- Escalate issues to L2 or L3 engineers as necessary, following established protocols.
Initial Troubleshooting:
- Perform initial troubleshooting on IT Infrastructure and system issues, including systems, Windows and Linux OS, databases, storage systems, cloud solutions, virtual environments, MS Azure, etc.
- Use remote tools to access and troubleshoot IT Infrastructure devices and servers.
- Document all incidents, troubleshooting steps, and resolutions in the incident management system.
Incident Management:
- Categorize and prioritize incidents based on severity and impact.
- Ensure timely communication with affected stakeholders and provide updates until the issue is resolved.
- Follow up on escalated issues to ensure they are being addressed promptly by higher-level support teams.
IT Infrastructure and System Maintenance:
- Assist in routine IT Infrastructure and system maintenance tasks, including applying patches, updates, and configuration changes.
- Support health checks on IT Infrastructure devices, servers, and critical services to ensure optimal performance.
Documentation and Reporting:
- Maintain accurate and up-to-date documentation of the IT Infrastructure topology, configurations, and standard operating procedures (SOPs).
- Generate daily, weekly, and monthly reports on IT Infrastructure performance, incidents, and other key metrics.
Collaboration and Communication:
- Collaborate with L2 and L3 engineers, as well as other IT teams, to resolve complex issues and improve IT Infrastructure reliability.
- Participate in team meetings to discuss ongoing issues, share knowledge, and contribute to continuous improvement initiatives.
Shift Work and On-Call Support:
- Be willing to work in a 24/7 shift environment, providing round-the-clock support as part of the NOC team.
- Participate in on-call rotations to ensure coverage during off-hours and weekends.
Required Qualifications:
- Experience: 3+ years of experience in a large industry setting, monitoring digital infrastructure in a production environment.
- Technical Skills: Experience in troubleshooting infrastructure components (systems, Windows and Linux OS, databases, storage systems, cloud solutions, virtual environments, MS Azure, etc.).
- Monitoring Tools: Experience with IT Infrastructure monitoring tools (SolarWinds) and incident management systems (ServiceNow).
- Troubleshooting: Strong analytical and problem-solving skills, with the ability to perform basic troubleshooting on IT Infrastructure and system issues.
- Communication: Excellent communication skills, both written and verbal, with the ability to document and explain technical issues clearly.
- Certifications: Microsoft, Linux, cloud, hardware, or SolarWinds certifications are preferred.
Preferred Qualifications:
- Experience in a 24/7 operations environment, particularly in a NOC or similar role.
- Familiarity with ITIL processes, particularly incident and problem management.
- Knowledge of cloud environments (e.g., AWS, Azure) and their monitoring tools.
- Good understanding of digital and cybersecurity service management processes.
- Exposure to configuration management tools.