Site Reliability Engineer (Kafka)
Company: Instaclustr
Location: Redwood City
Posted on: June 24, 2022
Job Description:
Instaclustr is an Australia-based open source as-a-service company
delivering reliability at scale. We manage cutting-edge open source
technologies (Apache Cassandra, Apache Kafka, Apache Spark™,
Elasticsearch, Cadence, PostgreSQL) for our customers around the
world. We were recently named in Deloitte's 2020 Fast 500 list of
North American companies after another monumental growth year.
Since our foundation in 2013, we have seen significant growth, with
over 200 large-scale and exciting customers spread across the globe
and more than 6500 clusters under our management. We have offices in
Australia, the USA, and Europe, employ over 250 people worldwide,
and are still growing!

In this exciting growth phase, we are looking for a Site Reliability
Engineer with Kafka admin experience to join our growing TechOps
team in the USA. This is a 100% remote work opportunity within the
USA.

The Role

If you have excellent operational knowledge in managing Kafka
clusters, look no further! As a Site Reliability Engineer, you are
in the frontline
team keeping our large fleet of cloud-hosted Apache Kafka clusters
up and running. Every day, you will diagnose and solve challenging
and interesting technical problems while providing a Kafka service in a
highly automated environment. Our service is relied on by some of
the leading global names in Banking and Financial Services,
Telecom, IoT and other tech companies to deliver for millions of
end users.

I'm interested. What else will I be doing?
- Provide expert operational support to our nodes running in the
cloud (AWS/Azure/GCP), using technologies such as Linux (CoreOS,
Ubuntu), Docker, and languages including Java, Python and
bash.
- Liaise with our customers' engineers to resolve issues with
their use of Apache Kafka and other supported technologies.
- Undertake complex cluster operations such as migrations,
upgrades and maintenance on our fleet of 6500+ nodes.
- Develop and continually improve our suite of internal
automation tools, applications, and processes.

Note: This role follows a staggered workweek (5 days on, 2 days off),
with shift work on an as-needed basis.

Skills & Experience

We're looking for smart engineers with exceptional communication
skills, a positive attitude, and a passion for IT and learning new
things. We expect you to be, or quickly become, proficient in a
range of the technologies we use.
Successful candidates for this role will:
- Have strong experience in Apache Kafka and a desire to learn
more and develop to a true expert level. This should include
experience diagnosing issues such as ISR drops or broker failures
through the analysis of logs, the Apache Kafka codebase and the
Kafka Jira project.
- Have good fundamental computer science / software engineering
skills and knowledge, particularly in operating system internals,
memory management, and networking.
- Have strong knowledge of and experience with Linux, and be
comfortable working from the command line.
- Ideally, have programming skills in Ansible, Python or Java, and
experience with source code control using Git.
- Have an exceptional ability to communicate clearly and
professionally in written and verbal English (essential).
- Follow required processes and procedures.
- Work as part of a team and use your initiative to get things
done.
- Ideally, have some customer service experience.

What's in it for you?

Instaclustr prides itself on being a workplace of choice for all of
its employees. When you join our family, this is what you can
expect:
- You will be part of a growing global company with a start-up
culture
- You get to enjoy workplace flexibility and a great work-life
balance
- You will be given a structured personal development week to
undertake your own project
- Our rotation program, a great opportunity for broadening your
skills and experience by working with different people, teams, and
technologies
- A well-defined career model and opportunities for
progression
- Company swag: hoodies, T-shirts, etc.
- New laptop
- Wellbeing programs to nurture the body, mind and soul.

As a
company, we work to maximise productivity, provide flexible work
hours, offer benefits including 401k, medical, dental, and vision
coverage, and will work with you to understand what you need to be
productive (equipment, home-office, co-working, etc.). Trust, Team
and Tenacity. These are the core values that stand at the forefront
of who we are. Our values enable people from all walks of life to
work together in an environment that fosters belonging and
empowerment. We take immense pride in our diverse team and continue
to set the standard throughout the Open Source community. Being
unique is powerful. We promote an environment where you can bring
your whole self to work each and every day.
Keywords: Instaclustr, Redwood City, Site Reliability Engineer (Kafka), Engineering, Redwood City, California