Site Reliability Engineer - New York, New York | STAND 8 Careers
Are you passionate about digital media, entertainment, and software services? Do you like big challenges and working within a highly motivated team environment? As a Site Reliability Engineer in the Hosting, Strategy, Sales & Scheduling (HSS&S) team, you will help develop modern software solutions that accelerate the migration of workflows into the cloud.
STAND 8 provides end-to-end IT solutions to enterprise partners across the United States, with offices in New Jersey, Los Angeles, Atlanta, New York, and more.
- Develop and implement platform and cloud services to drive our applications.
- Work closely with our development partners to define technical specifications based on conceptual design and business requirements.
- Assist with the design and implementation of security and forensics capabilities to ensure governance across multiple cloud venues, private and public.
- Evaluate new and emerging technologies and tools for infrastructure orchestration.
- Design, develop, test, debug, and document new and existing software and/or applications.
- Contribute to and respond to code and architecture reviews as needed.
- Write code and scripts to automate everything possible.
- Strong technical expertise and troubleshooting skills for large-scale distributed computing systems and software.
- Strong knowledge of DevOps tools for continuous integration
- BS in computer science or related field
- At least 4 years of experience supporting high-volume or large-scale environments
- Knowledgeable in public and private cloud technologies
- Demonstrated ability to conceive, manage, and complete project deliverables
- Linux systems administration skills, across distributions, and especially in a cloud or virtualized environment
- Understanding of IP networking and traffic scaling
- Proven ability to design and present understandable and practical solutions to complex problems
- Demonstrated leadership skills in a fast-paced, team-driven environment
- Strong verbal and written communication skills, including visual presentation skills
- Demonstrated experience in research data collection, analysis, and presentation
- Ability to work effectively across internal and external organizations
- Ability to travel when needed; expected travel is 5-25%
- Extensive experience using AWS, Azure, and/or Google Cloud Platform to deploy highly reliable and scalable cloud applications
- Expert in scripting-language development, including Python, Node.js, and Perl
- Experience creating and managing projects in version control, including Git and GitHub
- Experience with large-scale distributed infrastructures, including technologies for clustering and load balancing
- Understanding of distributed capacity management
- Understanding of Service-Oriented Architectures (SOA and REST), Infrastructure as a Service (IaaS) and Platform as a Service (PaaS)
- Experience implementing continuous integration and continuous delivery (CI/CD) tools and systems
- Specific experience with container management platforms, with a preference for Kubernetes
- Demonstrated ability to automate the deployment of infrastructure using tools like Terraform, Ansible, or Chef/Puppet.
- Deep understanding of HTTP, TCP, DNS, UDP, IPv4/IPv6 networking and protocols
- Understanding of network, database, and storage technologies, including NoSQL, NAS, and object stores
- Experience with Agile, including Scrum, Kanban, and Extreme Programming
- Understanding of software development in a DevOps culture
- Proponent of open source software licenses
- Ability and desire to mentor engineers, technologists, and managers