Create and maintain optimal data pipeline architecture.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and Databricks technologies.
Work with stakeholders including the executive, product, data, and design teams to assist with data-related technical issues and support their data infrastructure needs.
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
Assemble large, complex data sets that meet functional and non-functional business requirements.
Monitor and maintain the reliability of the big data platform and related tools.
Collaborate with cross-functional teams to deliver data features and products.
Control and automate the deployment of data APIs, jobs, and data pipelines on the platform.
Qualifications:
5 years of experience in building and optimizing big data pipelines, architectures, and data sets.
Advanced working knowledge of SQL, including query authoring, and experience with relational databases, plus working familiarity with a variety of other databases.
Strong analytic skills related to working with unstructured datasets.
Experience supporting and working with cross-functional teams in a dynamic environment.
Experience with big data tools: Hadoop, Hive, Spark, Presto.
Experience with cloud services such as AWS or Microsoft Azure is a plus.
Experience with relational SQL and NoSQL databases such as Postgres, Cassandra, and MongoDB is a plus.