Build and optimize 'big data' pipelines, architectures, and data sets.
Support the migration of data from the Hadoop platform to Snowflake.
Perform root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
Support and develop data modeling and data mapping activities.
Apply strong analytic skills to work with unstructured datasets.
Build processes supporting data transformation, data structures, metadata, dependency management, and workload management.
Manipulate, process, and extract value from large, disconnected datasets.
Apply working knowledge of message queuing, stream processing, and highly scalable 'big data' stores.
Perform CI/CD and deployment activities in both cloud and on-premises environments as required.
Requirements
Experience with data engineering tools: Python, Spark (Scala or Python), and Hadoop.
Experience with relational databases (Oracle, MySQL, MS SQL, Postgres) or NoSQL databases such as Cassandra.
Experience with data pipeline and workflow management tools.
Experience with stream-processing systems such as Storm and Spark Streaming.
Experience with object-oriented/functional scripting languages such as Python, Java, C++, and Scala.
Experience with the Cloudera, Databricks, or Snowflake platforms is an added advantage.
Experience with cloud platforms (AWS, Azure, or Google Cloud).