Earnbetter

Job Search Assistant

SITE RELIABILITY ENGINEER II - APPTIO

Octo Consulting Group • Austin, TX 73301 • Posted 1 day ago

In-person • Full-time • $121,000-$182,000/yr • Mid Level

Job Highlights

Using AI ⚡ to summarize the original job post

As a Site Reliability Engineer II at Apptio, you will be responsible for ensuring the smooth and stable operation of the company's infrastructure and applications. This role involves working proactively on system reliability, preventing outages, observing key metrics, taking urgent mitigation measures, and assisting other teams on infrastructure-related topics. You will interact with various technologies such as Kubernetes, Docker, Helm, Elasticsearch, DataDog, Grafana, Sensu, Puppet, Ansible/AWX, AWS, Azure, Python/Bash/PowerShell, Terraform/Terragrunt, and more.

Responsibilities

  • Scale systems sustainably through mechanisms like automation
  • Take ownership of the monitoring system
  • Maintain services in production by measuring and monitoring availability, latency, and overall system health
  • Application expansion and horizontal scaling
  • Work closely with developers, support and QA teams on maintaining and improving the whole lifecycle of services
  • Practice sustainable incident response and blameless post-mortems
  • Provide primary operational support and engineering for multiple large distributed software applications

Qualifications

Required

  • Familiarity with Site-Reliability Engineering
  • Ability to thrive in autonomy
  • Knowledge of configuration management tools (e.g. Ansible or Puppet)
  • Experience with any scripting language (Bash, Python, PowerShell, etc.)
  • Experience with containerization (e.g., Docker, Podman, etc.)
  • Experience with container orchestration tools (e.g., Kubernetes, OpenShift, Docker Swarm, etc.)
  • Experience with database administration and management (MS SQL Server, PostgreSQL, MongoDB)
  • Familiarity with public cloud providers such as AWS, Azure, or IBM Cloud
  • Experience with monitoring, observability & logging (e.g., DataDog, Prometheus, Grafana, ELK stack, Loki, etc.)
  • Familiarity with RESTful systems and their APIs
  • Experience with high-level programming languages (Golang, .NET, Java, etc.) is a plus

Preferred

  • Ability to thrive in autonomy
  • Experience in a large-scale, distributed Linux/Unix or Windows environment
  • Mentoring peers and sharing skills
  • Great communication skills

Full Job Description

Introduction

At IBM, work is more than a job – it’s a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you’ve never thought possible. Are you ready to lead in this new era of technology and solve some of the world’s most challenging problems? If so, let’s talk.

Your Role and Responsibilities

You:

You are passionate about observability, automation, and reliability. Your team can count on you to deliver creative and inventive solutions to hard problems. You are comfortable working with developers, senior leadership, and non-technical individuals to help provide value to the broader organization. You take opportunities to fix problems, mentor your peers, and step outside your comfort zone to develop your skillset.

Us:

Apptio Targetprocess empowers businesses to adopt and scale agile across the enterprise. We develop an Agile tool that connects teams, products, and portfolios to business objectives using SAFe, LeSS, and other Agile frameworks. In the 2021 Gartner Magic Quadrant for Enterprise Agile Planning Tools report, Apptio’s recently acquired Targetprocess was recognized as a “Leader”.

SRE Team:

The Apptio Targetprocess SRE team’s main responsibility is to make sure that the company’s infrastructure and applications run in a smooth and stable manner. We count on our site reliability engineers (SREs) to empower our users with a rich feature set, high availability, and stellar performance to pursue their missions.
That mostly means working proactively on system reliability, preventing outages, keeping an eye on key metrics, taking urgent mitigation measures when needed, and assisting other teams on infrastructure-related topics.

On a typical day in this role, you will interact with Kubernetes, Docker, Helm, Elasticsearch, DataDog, Grafana, Sensu, Puppet, Ansible/AWX, AWS, Azure, Python/Bash/PowerShell, and Terraform/Terragrunt. If you don’t know all these tools, don’t worry: we don’t expect you to know them all, and we understand that technology evolves quickly.

Major Responsibilities:

  • Scale systems sustainably through mechanisms like automation
  • Ownership of monitoring system
  • Maintain services in production by measuring and monitoring availability, latency, and overall system health
  • Application expansion and horizontal scaling
  • Work closely with developers, support and QA teams on maintaining and improving the whole lifecycle of services
  • Practice sustainable incident response and blameless post-mortems
  • Provide primary operational support and engineering for multiple large distributed software applications

Required Technical and Professional Expertise

  • Familiarity with Site-Reliability Engineering
  • The ability to thrive in autonomy
  • Knowledge of configuration management tools (e.g. Ansible or Puppet)
  • Experience with any scripting language (Bash, Python, PowerShell, etc.)
  • Experience with containerization (e.g., Docker, Podman, etc.)
  • Experience with container orchestration tools (e.g., Kubernetes, OpenShift, Docker Swarm, etc.)
  • Experience with database administration and management (MS SQL Server, PostgreSQL, MongoDB)
  • Familiarity with public cloud providers such as AWS, Azure, or IBM Cloud
  • Experience with monitoring, observability & logging (e.g., DataDog, Prometheus, Grafana, ELK stack, Loki, etc.)
  • Familiarity with RESTful systems and their APIs
  • Experience with high-level programming languages (Golang, .NET, Java, etc.) is a plus
  • Mentoring peers and sharing skills

Preferred Technical and Professional Expertise

  • Ability to thrive in autonomy
  • Experience in a large-scale, distributed Linux/Unix or Windows environment is a plus
  • Mentoring peers and sharing skills
  • Great communication skills

About Business Unit

IBM Software infuses core business operations with intelligence, from machine learning to generative AI, to help make organizations more responsive, productive, and resilient. IBM Software helps clients put AI into action now to create real value with trust, speed, and confidence across digital labor, IT automation, application modernization, security, and sustainability. Critical to this is the ability to make use of all data, because AI is only as good as the data that fuels it. In most organizations data is spread across multiple clouds, on premises, in private datacenters, and at the edge. IBM’s AI and data platform scales and accelerates the impact of AI with trusted data, and provides leading capabilities to train, tune and deploy AI across business. IBM’s hybrid cloud platform is one of the most comprehensive and consistent approaches to development, security, and operations across hybrid environments: a flexible foundation for leveraging data, wherever it resides, to extend AI deep into a business.

Your Life @ IBM

In a world where technology never stands still, we understand that dedication to our clients’ success, innovation that matters, and trust and personal responsibility in all our relationships live in what we do as IBMers as we strive to be the catalyst that makes the world work better.

Being an IBMer means you’ll be able to learn and develop yourself and your career, and you’ll be encouraged to be courageous and experiment every day, all whilst having continuous trust and support in an environment where everyone can thrive whatever their personal or professional background.

Our IBMers are growth-minded, always staying curious, open to feedback, and learning new information and skills to constantly transform themselves and our company. They are trusted to provide ongoing feedback to help other IBMers grow, as well as to collaborate with colleagues, keeping in mind a team-focused approach that includes different perspectives to drive exceptional outcomes for our customers. The courage our IBMers have to make critical decisions every day is essential to IBM becoming the catalyst for progress, always embracing challenges with the resources they have to hand and a can-do attitude, and always striving for an outcome-focused approach in everything that they do.

Are you ready to be an IBMer?

About IBM

IBM’s greatest invention is the IBMer. We believe that through the application of intelligence, reason and science, we can improve business, society and the human condition, bringing the power of an open hybrid cloud and AI strategy to life for our clients and partners around the world.

Restlessly reinventing since 1911, we are not only one of the largest corporate organizations in the world, we’re also one of the biggest technology and consulting employers, with many of the Fortune 50 companies relying on the IBM Cloud to run their business. At IBM, we pride ourselves on being an early adopter of artificial intelligence, quantum computing and blockchain. Now it’s time for you to join us on our journey to being a responsible technology innovator and a force for good in the world.

Location Statement

IBM offers a competitive and comprehensive benefits program. Eligible employees may have access to:

  • Healthcare benefits including medical & prescription drug coverage, dental, vision, and mental health & well-being
  • Financial programs such as 401(k), cash balance pension plan, the IBM Employee Stock Purchase Plan, financial counseling, life insurance, short & long-term disability coverage, and opportunities for performance-based salary incentive programs
  • Generous paid time off including 12 holidays, minimum 56 hours sick time, 120 hours vacation, 12 weeks parental bonding leave in accordance with IBM Policy, and other Paid Care Leave programs. IBM also offers paid family leave benefits to eligible employees where required by applicable law
  • Training and educational resources on our personalized, AI-driven learning platform where IBMers can grow skills and obtain industry-recognized certifications to achieve their career goals
  • Diverse and inclusive employee resource groups, giving & volunteer opportunities, and discounts on retail products, services & experiences

The compensation range and benefits for this position are based on a full-time schedule for a full calendar year. The salary will vary depending on your job-related skills, experience and location. Pay increment and frequency of pay will be in accordance with employment classification and applicable laws. For part-time roles, your compensation and benefits will be adjusted to reflect your hours. Benefits may be pro-rated for those who start working during the calendar year.

This position was posted on the date cited in the key job details section and is anticipated to remain posted for 21 days from this date, or less if not needed to fill the role.

We consider qualified applicants with criminal histories, consistent with applicable law.

Being You @ IBM

IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.

Role: Site Reliability Engineer II – Apptio
Location: Multiple Locations (Bellevue, San Jose, Boston, Research Triangle Park, Austin)
Category: Infrastructure & Technology
Employment Type: Full-Time
Travel Required: Up to 20% or 1 day a week
Contract Type: Regular
Company: (0147) International Business Machines Corporation
Req ID: 723277BR
Projected Minimum Salary: $121,000 per year
Projected Maximum Salary: $182,000 per year
Date Posted: September 17, 2024