Thumbtack’s Data Engineering team is a centralized team that works closely with engineers, analysts, data scientists, and machine learning engineers to help design and curate data sets originating from internal and third-party sources to meet current and future needs. Over the next year, it will build on its prior successes in creating a more cohesive data warehouse while starting to work further upstream to build data best practices into the full software development lifecycle (SDLC).
Challenge
Teams across Thumbtack work with terabytes of data, and each faces unique challenges in cleaning and organizing that data to measure its performance. In this role, you will work with engineers, data scientists, managers, and others to understand their needs and build datasets that address these challenges.
What you’ll do
- Collaboratively refine and evangelize a comprehensive framework for integrating data-thinking into the software development lifecycle for product teams
- Design, architect, and maintain core marketing datasets, data marts, and feature stores that support a blend of mature products and features with a rapidly evolving product line, in partnership with analytics, data science, and machine learning
- Integrate with teams of product engineers, analysts, data scientists, and machine learning engineers throughout Thumbtack to understand their data needs, and help design datasets with the same engineering rigor as any other software we design
- Drive data quality and best practices across different business areas
- Help build the next generation of data products at Thumbtack: real-time data products on top of Apache Kafka
In order to be successful, you must bring
- 4 or more years of experience designing and building data sets and warehouses
- Excellent ability to understand the needs of and collaborate with stakeholders in other functions, especially Analytics, and identify opportunities for process improvements across teams
- Expertise in SQL for analytics/reporting/business intelligence and also for building SQL- and Python-based transforms inside an ETL pipeline, or similar
- Experience designing, architecting, and maintaining a data warehouse and data marts that seamlessly stitch together data from production databases, clickstream data, and external APIs to serve multiple stakeholders
- Expertise building the above with a modern data stack based on a cloud-native data warehouse (we use BigQuery, dbt, and Apache Airflow, but experience with a similar stack is fine)
- Strong sense of ownership and pride in your work, from ideation and requirements-gathering to project completion and maintenance
Expected salary ranges
- For candidates living in San Francisco / Bay Area, San Jose, New York City, or Seattle metros, the expected salary range for the role is currently $200,900 – $259,900.
- For candidates living in Austin, TX or Washington DC metros or in California, Massachusetts, New Jersey, or Washington states, the expected salary range for the role is currently $180,800 – $234,000.
- For candidates living in all other US locations, the expected salary range for this role is currently $170,800 – $221,000.