SandboxAQ is seeking a generalist software engineer to build infrastructure, development tooling, data pipelines, and data storage systems for AI and simulation in chemistry and life sciences. In this role, you’ll play a crucial part in developing tools that ingest, process, and serve large amounts of data. You’ll also help improve the developer experience to increase developer velocity.

You’ll bring broad experience in data storage and processing technologies, orchestration tools, Python programming, CI/CD, and infrastructure as code. Most importantly, you’ll bring a track record of working in a small, fast-moving software development team, exploring new technologies, and solving problems across an entire software stack.

What You’ll Do

  • Build and operate scalable data pipelines for data ingestion, processing, analytics, and storage; optimize their performance and cost-effectiveness.
  • Maintain data warehouse and data lake solutions.
  • Collaborate closely with R&D teams to build and operate data tooling to meet project goals.
  • In collaboration with domain experts, design and implement data models for scientific data and APIs to store and manipulate data across file storage, graph databases, and relational databases.
  • Contribute to the design and implementation of security-sensitive data processing and storage systems with complex tenancy and data isolation requirements.
  • Collaborate closely with the product team and internal stakeholders in all phases of software development to validate the solutions you propose and implement.
  • In collaboration with the rest of the engineering team, build and manage infrastructure for SandboxAQ’s simulation and data platform.
  • Review code and participate in design and architectural discussions.

About You

  • 3+ years of experience with Python and strong knowledge of software design principles.
  • Understanding of database principles and best practices.
  • Experience with large-scale analytical databases; BigQuery preferred.
  • 2+ years of experience managing public cloud infrastructure with infrastructure-as-code tools such as Terraform; GCP preferred.
  • 2+ years of experience building data pipelines or data processing systems at scale.
  • Experience with orchestration tools like Airflow.
  • Experience writing and optimizing database queries; graph database experience is a plus.
  • Strong understanding of web and network fundamentals and experience with designing, building, and testing web APIs.
  • Knowledge of CI/CD best practices and building CI/CD pipelines.
  • Excellent communication and collaboration skills, with the ability to effectively influence a cross-functional team.

Nice to Have

  • Experience with GraphQL and familiarity with Strawberry or similar.
  • Experience with FastAPI.
  • Experience with CircleCI.