Experience with big data tools: Python, Spark, Scala, Hadoop, etc.
Experience migrating SAS code and strong SQL query-authoring skills.
Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc. (a minimal Airflow sketch follows this list).
Experience with stream-processing systems: Storm, Spark Streaming, etc. (a Spark Structured Streaming sketch follows this list).
Experience with object-oriented/functional scripting languages: Python, Java, C++, Scala, etc.
Experience with Cloudera, Databricks, or Snowflake platforms is an added advantage.
Advanced working SQL knowledge and experience with relational databases, including query authoring and familiarity with a variety of databases (an example query follows this list).
Experience building and optimizing ‘big data’ pipelines, architectures, and datasets.
Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
Strong analytic skills related to working with unstructured datasets.
Experience building processes that support data transformation, data structures, metadata, dependency management, and workload management.
A successful history of manipulating, processing, and extracting value from large, disconnected datasets.
Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
Willingness to provide L2/L3 application support.
Willingness to support production issues on a 24/7 rotational basis.
Willingness to provide support on weekends and public holidays.
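
For illustration, a minimal sketch of the kind of workflow DAG the Airflow bullet refers to, assuming Airflow 2.4+ with the standard Bash operator; the DAG id, schedule, and script paths are hypothetical:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="daily_sales_pipeline",   # hypothetical pipeline name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",               # run once per day (Airflow 2.4+ keyword)
        catchup=False,                   # do not backfill missed past runs
    ) as dag:
        extract = BashOperator(
            task_id="extract",
            bash_command="python /opt/jobs/extract_sales.py",  # illustrative path
        )
        load = BashOperator(
            task_id="load",
            bash_command="python /opt/jobs/load_warehouse.py",  # illustrative path
        )
        extract >> load  # load runs only after extract succeeds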
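Likewise, a minimal Spark Structured Streaming sketch, assuming PySpark 3.x, the spark-sql-kafka connector, and a running Kafka broker; the broker address and topic name are illustrative:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, window

    spark = SparkSession.builder.appName("event-counts").getOrCreate()

    # Read a stream of events from Kafka.
    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "localhost:9092")  # illustrative broker
        .option("subscribe", "events")                        # illustrative topic
        .load()
    )

    # Count events per 5-minute window, keyed on the Kafka message key.
    counts = (
        events.withWatermark("timestamp", "10 minutes")
        .groupBy(window(col("timestamp"), "5 minutes"), col("key"))
        .count()
    )

    # Write running counts to the console for inspection.
    query = counts.writeStream.outputMode("update").format("console").start()
    query.awaitTermination()

The watermark bounds how late an event may arrive before its window's count is considered final, which keeps state from growing without limit.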
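Finally, an example of the kind of SQL query authoring described above, shown through PySpark's SQL interface; the table and column names are invented for the example:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("sql-example").getOrCreate()

    # Illustrative in-memory data standing in for a relational table.
    orders = spark.createDataFrame(
        [(1, "alice", 120.0), (2, "alice", 80.0), (3, "bob", 200.0)],
        ["order_id", "customer", "amount"],
    )
    orders.createOrReplaceTempView("orders")

    # Rank each customer's orders by amount, highest first, using a window function.
    top_orders = spark.sql("""
        SELECT order_id, customer, amount,
               RANK() OVER (PARTITION BY customer ORDER BY amount DESC) AS rnk
        FROM orders
    """)
    top_orders.show()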