Site Reliability Engineer V (McLean, VA or Sunnyvale, CA)

Company: ID.me


Details of the offer

Role Overview

The Site Reliability Engineer V (SRE) will combine software and systems engineering to build and run distributed, fault-tolerant systems at scale. SREs ensure our services have the appropriate reliability and uptime to protect and promote our customers' experience.

Note: Candidates must be located in the Washington, DC area (McLean, VA) or the San Francisco Bay Area (Sunnyvale, CA), as this role requires an onsite presence.

Responsibilities

- Design, build, implement, and maintain platform tooling that improves reliability across the entire product surface area, to improve the availability, scalability, latency, and efficiency of ID.me services
- Manage end-to-end distributed systems availability and ensure high performance of ID.me applications
- Build automation solutions to prevent problem recurrence
- Build visibility into SLIs, SLOs, SLAs, and dependency metrics to manage operational burden and systems reporting
- Design, build, implement, and maintain an observability ecosystem that provides visibility across the ID.me platform services and applications
- Proactively identify risks and develop engineering processes and/or tooling to reduce availability risk
- Evangelize best practices and mentor service owners on reliability, resiliency, and scalability for new and existing services and/or features
- Participate in an on-call rotation and hold retrospective root cause analysis meetings, focusing on identifying remediations and product resiliency opportunities

Minimum Qualifications

- At least 7 years of experience working in medium- or large-scale production systems
- The ability to take a systematic approach to analyzing, troubleshooting, and diagnosing system problems in order to identify, locate, resolve, and repair them
- Experience in software development or systems engineering with code
- Experience designing for scale and automation-forward ecosystems and solutions
- A breadth of engineering skills with an interest in service reliability, automation, monitoring, and capacity planning
- Understanding of modern application architecture (e.g. microservices, EDA)
- Experience with APM services and solutions (e.g. OpenTelemetry, Honeycomb, New Relic, Dynatrace, AppDynamics, Datadog)
- Experience with time-series observability solutions (e.g. InfluxDB, Prometheus, Grafana)
- Experience with scaled indexed logging solutions (e.g. Splunk, Elasticsearch, OpenSearch)
- Experience running and operating Ruby on Rails applications and infrastructure
- Deep knowledge of major cloud service providers and solutions (Amazon Web Services, Google Cloud Platform, Microsoft Azure)
- Previous experience working within a site reliability engineering culture (e.g. improving reliability through systems engineering automation, chaos testing, synthetics, and process improvement)
- Experience designing, building, implementing, and operating distributed systems and cloud infrastructure at scale
- Experience with container computing and container orchestration (e.g. proprietary systems such as Google Kubernetes Engine (GKE), multi-cloud solutions such as Kubernetes, or Nomad)
- Experience with configuration management systems (e.g. Ansible, Puppet, Chef, SaltStack, Consul)
- Experience with virtual networking (e.g. cloud networking, service mesh, SDN)
- Experience in security automation (e.g. cloud proprietary solutions such as Google Secret Manager or Vault)
- Experience with infrastructure-as-code (e.g. Terraform)
- Strong written communication skills
- Ability to work in an asynchronous environment
- Experience in supporting a 24/7 operational infrastructure, including on-call rotations

Preferred Qualifications

- Must have an obsession for building quality products
- Ability to thrive amid changing priorities and shifting gears
- Strong oral and written communication skills
- Must be a team player with a strong, self-managing work ethic
- Must be a self-starter with a passion for platform engineering, learning, and continuous improvement

Day to Day Life

- Ensure observability tooling and integrations are providing telemetry and logging statistics across the entirety of ID.me systems and applications
- Enable the Engineering organization to identify and triage operational issues, empowering teams to own and operate autonomously
- Contribute to defining and executing the Observability Roadmap for maintaining and modernizing cloud-native observability within the organization
- Integrate telemetry and logging frameworks into the cloud platform
- Evaluate new and existing observability technologies to ensure capabilities are inclusive of black box solutions (e.g. COTS) as well as Engineering-created software
- Manage distributed system and application scaling activity directly (as applicable) as well as in an advisory capacity on behalf of Engineering development teams

Vision: To be the world's leading digital identity network empowering people to control their own information and to prove their credentials across all channels: online, call center, and in-person.

Mission: To make the world a more trusted place by delivering the highest level of security with the least amount of friction at the lowest possible cost.

People: We have an audacious mission. We aim to fix the identity layer of the internet. Billions of people will live better lives with more trust and convenience thanks to ID.me. We are like Special Forces. We take on the most difficult challenges with amazing teammates.

At ID.me, we believe that an in-office culture fosters professional growth and development, mentorship, collaboration, and accelerated innovation. This position will be in-office based at one of our locations in either McLean, VA or Sunnyvale, CA. Working in an office together allows our culture to thrive and gives our team members the chance to establish real connections with their coworkers and the opportunity for lifelong friendships. Our work is critical to protecting online identity, and we're confident that working together is how we'll change the world.


Source: Greenhouse
