Senior Site Reliability Engineer
Columbus, OH, USA
Posted on Thursday, January 18, 2024
Veeva Systems is a mission-driven organization and pioneer in industry cloud, helping life sciences companies bring therapies to patients faster. As one of the fastest-growing SaaS companies in history, we surpassed $2B in revenue in our last fiscal year with extensive growth potential ahead.
At the heart of Veeva are our values: Do the Right Thing, Customer Success, Employee Success, and Speed. We're not just any public company – we made history in 2021 by becoming a public benefit corporation (PBC), legally bound to balance the interests of customers, employees, society, and investors.
As a Work Anywhere company, we support your flexibility to work from home or in the office, so you can thrive in your ideal environment.
Join us in transforming the life sciences industry and making a positive impact on its customers, employees, and communities.
Veeva is seeking a talented and motivated Site Reliability Engineer (SRE) to join our dynamic team. As an SRE, you are innately curious, have a penchant for problem-solving, and will play a crucial role in ensuring the reliability, scalability, and performance of our systems. Our mission is to protect, provide for, and progress the software and systems utilized by our product engineering teams.
Ideal candidates have worked in enterprise software development or for a high-growth technology company.
What You'll Do
- Take responsibility for managing production and pre-production environments, security, change management, deployment, architecture, and tools
- Perform root cause analysis for complex failures and offer modern solutions and tools
- Analyze performance and ensure the applications (GitLab, Jira, Confluence, TestRail, Mattermost), hosted in AWS, meet the scalability and reliability needs of our internal teams
- Work closely with Infrastructure, DevOps, Security, and product teams to stabilize, secure, and scale applications for continued growth
- Automate deployment, monitoring, and incident response processes to enhance system reliability and performance
- Continuously monitor system health, proactively identify issues, and implement solutions to ensure optimal performance
- Identify and troubleshoot performance bottlenecks and reliability issues across the stack
- Implement best practices for cloud-based infrastructure, ensuring security, scalability, and cost efficiency
- Improve the system every day, learning whatever is necessary to provide full-stack diagnostics and determine the root cause of problems
- Lead the effort to triage and mitigate during incidents, and perform periodic on-call duty when issues are escalated
- Communicate effectively with engineering and infrastructure teams, and describe problems succinctly with sufficient detail
- Engage in real-time communication during outages with both technical and non-technical audiences
Requirements
- Bachelor’s degree in Computer Science, Information Technology, or related field (or equivalent work experience)
- 3+ years of working experience as a DevOps engineer or SRE
- Independent learner, curious about new technologies
- Experience with AWS and container orchestration tools (e.g., Kubernetes)
- Familiarity with infrastructure as code tools (e.g., Terraform, Ansible) and version control systems (e.g., Git)
- GitLab system administration experience
- Experience supporting GitLab including CI/CD processes and GitLab runners
- Solid scripting skills; experience with Shell, Bash, Ansible, Python, Go, Ruby, etc.
- Excellent problem-solving skills and the ability to troubleshoot complex issues under pressure
- 3+ years of experience in relational databases with a mastery of SQL
- Demonstrated history of incident management and leadership ability
- Hands-on operational experience in a high-volume or critical production service environment
- Effective communication skills across all levels — whether talking to individual contributors or executives
- Experience with disaster recovery planning and implementation
- Experience with performance tuning of databases and distributed storage systems
- Ability to handle periodic on-call duty
- Fluent in English – both written and verbal
- Strong mentoring skills, with a proven record of making your team better
Nice to Have
- Experience with serverless computing and serverless architectures (e.g., AWS Lambda, Azure Functions)
- Knowledge of security best practices
Perks & Benefits
- Medical, dental, vision, and basic life insurance
- Flexible PTO and company paid holidays
- Retirement programs
- 1% charitable giving program
- Base pay: $65,000 - $115,000
- The salary range listed here has been provided to comply with local regulations and represents a potential base salary range for this role. Please note that actual salaries may fall within, above, or below this range, depending on experience and location. We look at compensation for each individual and base our offer on your unique qualifications, experience, and expected contributions. This position may also be eligible for other types of compensation in addition to base salary, such as variable bonus and/or stock bonus.
Veeva’s headquarters is located in the San Francisco Bay Area with offices in more than 15 countries around the world.
Veeva is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, sex, sexual orientation, gender identity or expression, religion, national origin or ancestry, age, disability, marital status, pregnancy, protected veteran status, protected genetic information, political affiliation, or any other characteristics protected by local laws, regulations, or ordinances. If you need assistance or accommodation due to a disability or special need when applying for a role or in our recruitment process, please contact us at email@example.com.