Job Description
NVIDIA has continuously reinvented itself over two decades. Our invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI - the next era of computing. NVIDIA is a 'learning machine' that constantly evolves by adapting to new opportunities that are hard to pursue, that only we can tackle, and that matter to the world. This is our life's work, to amplify human creativity and intelligence. Make the choice to join us today.
We are looking for an Infrastructure Site Reliability Engineer for our Global Compute Team working from Bangalore. As part of this team, you will help craft, automate, and run the corporate servers and virtual infrastructure for an incredible customer experience. We are looking for creative minds, authorities in the field, with a passion for providing outstanding services to our employees and evolving our infrastructure into a cloud-centric IaaS.
What you will be doing:
Build automation workflows for self-service and auto-healing capabilities.
Participate in infrastructure operations with a follow-the-sun model.
Perform scheduled maintenance and provide support outside of normal working hours.
Plan and implement optimizations for the compute infrastructure in use.
Set up alerting, reports, and dashboards for monitoring overall system health.
Collect and review system data for capacity and planning purposes. Analyze capacity data and develop capacity plans for enterprise-wide systems at the appropriate level. Coordinate with appropriate management personnel in implementing changes.
Participate in customer engagement meetings to solve complex problems.
Work independently to document work logs, and develop and maintain a tool for inventorying data center assets.
What we need to see:
8+ years of experience with on-prem and cloud platforms, mainly in automating infrastructure configuration and management
Sound knowledge of DevOps development techniques, such as CI/CD and Agile
Design and deployment experience with VMware virtualization, VMware SDS (vSAN), and SDN (NSX-V and NSX-T)
Proficient in building and managing highly available and scalable IT infrastructure, with knowledge of Docker/virtualization, Git, SVN, Continuous Delivery, Continuous Monitoring (Zabbix, Nagios), Jenkins, and MySQL
Proficiency with at least one of these languages: Golang/Python/PowerShell/Ruby
Direct experience in one or more of the following: Kubernetes administration; Infrastructure as Code frameworks such as SaltStack/Puppet/Terraform/Ansible
Should have the ability to communicate both verbally and in writing with users, vendors and management.
Ability to communicate complex interaction concepts clearly and persuasively across different audiences and varying levels of the organization
Ways to stand out from the crowd:
Experience in network automation, for example with NSX-T
Prior experience in VMware Horizon VDI design and implementation
Knowledge of scripting / Infrastructure-as-Code (IaC)
With competitive salaries and a generous benefits package, NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most brilliant and talented people in the world working for us. If you're creative and motivated, we want to hear from you!
NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer.
Employment Category:
Employment Type: Full time
Industry: IT
Functional Area: IT
Role Category: Software Engineer
Role/Responsibilities: Infrastructure Site Reliability Engineer
Contact Details:
Company: NVIDIA
Location(s): Bengaluru