- Build and optimize ‘big data’ data pipelines, architectures, and data sets (see the pipeline sketch after this list).
- Perform root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Support and develop data modeling and implementation activities.
- Strong analytical skills for working with unstructured datasets.
- Build processes supporting data transformation, data structures, metadata management, dependency tracking, and workload management.
- Manipulate, process, and extract value from large, disconnected datasets.
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores (a minimal consumer sketch follows this list).
- Perform CI/CD and deployment activities in both cloud and on-premises environments, as required.
- Experience with data engineering tools (Python, Scala, Hadoop).
- Experience with relational databases (Oracle, MySQL, MS SQL Server, PostgreSQL) and NoSQL databases such as Cassandra.
- Experience with data pipeline and workflow management tools (see the workflow example after this list).
- Experience with stream-processing systems.
- Experience with object-oriented and functional scripting languages: Python, Java, C++, Scala, etc.
- Experience with Cloudera, Databricks, or Snowflake platforms is an added advantage.
- Experience with cloud platforms (AWS, Azure, or Google Cloud).
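
To illustrate the kind of data-transformation process the pipeline bullets describe, here is a minimal sketch in Python using pandas. The file paths, column names, and cleaning rules are hypothetical assumptions, not part of the role description; writing Parquet also assumes a pyarrow or fastparquet install.

```python
import pandas as pd

# Hypothetical input/output paths, for illustration only.
RAW_PATH = "raw_events.csv"
CLEAN_PATH = "clean_events.parquet"


def transform(raw_path: str, clean_path: str) -> pd.DataFrame:
    """Load a raw extract, apply basic cleaning, and write a typed output."""
    df = pd.read_csv(raw_path)

    # Drop exact duplicates and rows missing the (assumed) key column.
    df = df.drop_duplicates().dropna(subset=["event_id"])

    # Normalize an assumed timestamp column to timezone-aware datetimes.
    df["event_time"] = pd.to_datetime(df["event_time"], utc=True, errors="coerce")

    # Persist in a columnar format suited to downstream analytics.
    df.to_parquet(clean_path, index=False)
    return df


if __name__ == "__main__":
    transform(RAW_PATH, CLEAN_PATH)
```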
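For the message-queuing and stream-processing bullets, a minimal consumer sketch using the kafka-python client might look like the following; the topic name, broker address, and message schema are assumptions for illustration.

```python
import json

from kafka import KafkaConsumer  # pip install kafka-python

# Hypothetical topic and broker address, for illustration only.
consumer = KafkaConsumer(
    "events",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    auto_offset_reset="earliest",
)

# Consume messages and apply a trivial per-record transformation.
for message in consumer:
    record = message.value
    print(record.get("event_id"), record.get("event_time"))
```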
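As one example of a workflow management tool, here is a minimal Apache Airflow DAG wiring an extract step to a transform step. The DAG id, schedule, and task bodies are illustrative stubs, and the sketch assumes Airflow 2.x (where `schedule_interval` is still accepted).

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    ...  # pull data from a source system (stubbed for illustration)


def transform():
    ...  # clean and reshape the extracted data (stubbed for illustration)


# Hypothetical DAG id and daily schedule, for illustration only.
with DAG(
    dag_id="example_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    # Declare the dependency: transform runs only after extract succeeds.
    extract_task >> transform_task
```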