AWS Databricks Engineer

Brij HR Sollutions

5 months ago

Exp: 2-5 Years

Full time

Nan, India

Job Description

This is a remote position.

Requirements

Strong experience as an AWS Data Engineer, including hands-on AWS Databricks experience.

Expert proficiency in Spark Scala, Python, and PySpark is a plus

Must have experience migrating data from on-premises systems to the cloud

Hands-on experience with Kinesis for processing and analyzing streaming data, and with AWS DynamoDB

In-depth understanding of the AWS cloud, AWS data lake, and analytics solutions.

Expert-level, hands-on experience designing and developing applications on Databricks, Databricks Workflows, AWS Managed Airflow, and Apache Airflow is required.

Extensive hands-on experience implementing data migration and data processing using AWS services and related tools: VPC/SG, EC2, S3, Auto Scaling, CloudFormation, Lake Formation, DMS, Kinesis, Kafka, NiFi, CDC processing, EMR, Redshift, Athena, Snowflake, RDS, Aurora, Neptune, DynamoDB, CloudTrail, CloudWatch, Docker, Lambda, Spark, Glue, SageMaker, AI/ML, API Gateway, etc.

Hands-on experience with the industry technology stack for data management, ingestion, capture, processing, and curation: Kafka, StreamSets, Attunity, GoldenGate, MapReduce, Hadoop, Hive, HBase, Cassandra, Spark, Flume, Impala, etc.

Knowledge of different programming and scripting languages

Good working knowledge of code versioning tools (such as Git, Bitbucket, or SVN)

Hands-on experience using Spark SQL with various data sources such as JSON, Parquet, and key-value pairs

Experience preparing data for Data Science and Machine Learning.

Experience preparing data for use in SageMaker and AWS Databricks.

Demonstrated experience preparing data and building and automating data pipelines for AI use cases (text, voice, image, IoT data, etc.).

Good to have: programming experience with .NET or Spark/Scala

Experience in creating tables, partitioning, bucketing, loading and aggregating data using Spark Scala, Spark SQL/PySpark

Knowledge of AWS/Azure DevOps processes like CI/CD as well as Agile tools and processes including Git, Jenkins, Jira, and Confluence

Working experience with Visual Studio, PowerShell Scripting, and ARM templates.

Strong understanding of data modeling, including defining conceptual, logical, and physical data models.

Experience with big data, analytics, information analysis, and database management in the cloud

Experience with IoT, event-driven, and microservices architectures in the cloud, and with private and public cloud architectures, including their pros and cons and migration considerations.

Ability to remain up to date with industry standards and technological advancements that will enhance data quality and reliability to advance strategic initiatives

Basic experience with or knowledge of agile methodologies

Working knowledge of RESTful APIs, OAuth2 authorization framework and security best practices for API Gateways

Responsibilities:

Work closely with team members to lead and drive enterprise solutions, advising on key decision points on trade-offs, best practices, and risk mitigation

Manage data related requests, analyze issues, and provide efficient resolution. Design all program specifications and perform required tests

Design and develop the data ingestion layer using Glue, AWS Managed Airflow, and Apache Airflow, and the processing layer using Databricks.

Work with the SMEs to implement data strategies and build data flows.

Write code for all modules according to the required specifications.

Monitor all production issues and inquiries and provide efficient resolution.

Evaluate all functional requirements, map documents, and troubleshoot all development processes

Document all technical specifications and associated project deliverables.

Design all test cases to provide support to all systems and perform unit tests.

Qualifications:

2+ years of hands-on experience designing and implementing multi-tenant solutions using AWS Databricks for data governance, data pipelines for near real-time data warehouse, and machine learning solutions.

5+ years of experience in software development, data engineering, or data analytics using Python, PySpark, Scala, Spark, Java, or equivalent technologies.

Bachelor's or Master's degree in Big Data, Computer Science, Engineering, Mathematics, or a similar area of study, or equivalent work experience

Strong written and verbal communication skills

Ability to manage competing priorities in a fast-paced environment

Ability to resolve issues

Self-motivated and able to work independently

Nice to have:

- AWS Certified Solutions Architect - Professional

- Databricks Certified Associate Developer for Apache Spark