- Build real-time and batch data ingestion pipelines that integrate diverse data sources such as RDBMS, data warehouses, APIs, plant legacy systems, historian systems, and file storage systems (a minimal ingestion sketch follows this list).
- Design, develop, and launch efficient, reliable data pipelines.
- Troubleshoot issues in existing data pipelines and build their successors.
- Build modular pipelines to construct feature and modelling tables.
- Maintain data warehouse architecture and relational databases.
- Monitor for incidents, perform root cause analysis, and implement the appropriate corrective action.
- Write, document, and maintain highly readable code.
Additional:
- Migrate local code bases and pipelines (Databricks) to the cloud platform.
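For illustration only, a minimal batch-ingestion sketch in Python of the kind of pipeline described above; the connection string, table name, and output path are placeholders, not part of this posting:

```python
# Minimal batch ingestion sketch: pull a table from a source RDBMS and land
# it as Parquet in file storage. All names below are illustrative placeholders.
import pandas as pd
from sqlalchemy import create_engine

SOURCE_DSN = "postgresql+psycopg2://user:password@db-host:5432/plant_db"  # placeholder
OUTPUT_PATH = "/mnt/datalake/raw/plant_measurements.parquet"              # placeholder


def ingest_table(table_name: str) -> None:
    """Read a full table from the source database and write it to Parquet."""
    engine = create_engine(SOURCE_DSN)
    df = pd.read_sql(f"SELECT * FROM {table_name}", engine)  # fixed table name, not user input
    df.to_parquet(OUTPUT_PATH, index=False)  # requires pyarrow or fastparquet


if __name__ == "__main__":
    ingest_table("plant_measurements")
```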
Skills
- Bachelor's or master's degree in a computer-related field and 2 to 4 years of IT experience
- Minimum of 1 year of experience in a Data Engineering and DevOps environment on Azure, AWS, or Google Cloud Platform
- Must have hands-on experience with the following Azure services (or their equivalents on AWS or Google Cloud): Data Factory, Data Lake, Data Warehouse, Database, U-SQL, Databricks, Functions, Event Hub/IoT Hub, and Storage
- Must have experience in Python, SQL, and PowerShell
- Must have project experience with real-time streaming and batch processing implementations (see the streaming sketch after this list)
- Must have experience with Git repositories and branching using GitLab, Bitbucket, or an equivalent
- Must have experience in data modelling and visualization using Microsoft Power BI, Python scripting, and open-source tools (plotly, D3.js, Bokeh); see the visualization sketch after this list
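As an illustration of the real-time streaming requirement, a minimal Event Hub consumer sketch using the azure-eventhub Python SDK; the connection string and hub name are placeholders:

```python
# Minimal real-time consumer sketch using the azure-eventhub SDK.
# Connection string and hub name are illustrative placeholders.
from azure.eventhub import EventHubConsumerClient

CONN_STR = "Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=...;SharedAccessKey=..."  # placeholder
EVENTHUB_NAME = "plant-telemetry"  # placeholder


def on_event(partition_context, event):
    """Handle one incoming event; a real pipeline would parse and persist it."""
    print(f"Partition {partition_context.partition_id}: {event.body_as_str()}")


client = EventHubConsumerClient.from_connection_string(
    CONN_STR, consumer_group="$Default", eventhub_name=EVENTHUB_NAME
)
with client:
    # Blocks and invokes on_event for each message; "-1" starts reading from
    # the beginning of each partition.
    client.receive(on_event=on_event, starting_position="-1")
```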
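Similarly, a minimal visualization sketch with plotly (one of the open-source tools named above); the input file and column names are placeholders:

```python
# Minimal visualization sketch using plotly express.
# The CSV path and column names are illustrative placeholders.
import pandas as pd
import plotly.express as px

df = pd.read_csv("sensor_readings.csv")  # placeholder input
fig = px.line(
    df,
    x="timestamp",    # placeholder column
    y="temperature",  # placeholder column
    title="Plant sensor temperature over time",
)
fig.write_html("temperature_trend.html")  # standalone, shareable HTML report
```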