Job Description
1. Analyse and optimise database views, stored procedures, and SQL queries for the Rex project.
2. Led the migration of the backend API from Flask to FastAPI, enhancing performance and scalability (see the before/after sketch following this list).
3. Design and develop a custom data pipeline framework using Python, Airflow, and SQL, reducing data processing time from 2 days to 2 hours by automating data extraction, transformation, and loading across Oracle, MSSQL, and SharePoint.
4. Collaborate with stakeholders to gather requirements, then develop, test, and deploy pipelines and data loads using CI/CD, ensuring seamless integration.
5. Develop Airflow DAGs to automate the replication framework, eliminating manual processes and improving accuracy.
6. Automate reporting processes, ensuring timely data uploads to SharePoint for business users.
7. Implement automated reconciliation checks between source and target tables and integrate them into Airflow DAGs (see the DAG sketch following this list).
8. Continuously enhance the pipeline architecture to improve efficiency and reliability.
9. Migrate Informatica mappings to the new Rex replication framework, reducing dependency on Informatica and automating data loading.
10. Conduct research on AI coding tools such as GPT-Engineer and Aider to assess their potential for project development.
11. Monitor and resolve development and production issues in the data pipelines.
12. Perform data modelling per stakeholder requirements and provide cleansed, analysed data for business use cases.
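A minimal before/after sketch of the Flask-to-FastAPI migration referenced in item 2; the route and handler below are hypothetical placeholders, not the project's actual API.

    # Flask (before): synchronous endpoint with manual JSON serialisation
    # from flask import Flask, jsonify
    # app = Flask(__name__)
    # @app.route("/items/<int:item_id>")
    # def get_item(item_id):
    #     return jsonify({"id": item_id})

    # FastAPI (after): async endpoint with a typed, validated path parameter
    from fastapi import FastAPI

    app = FastAPI()

    @app.get("/items/{item_id}")
    async def get_item(item_id: int):
        # FastAPI validates item_id from the type hint and serialises the dict to JSON
        return {"id": item_id}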
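A minimal Airflow sketch of the replication-plus-reconciliation pattern referenced in items 5 and 7; the DAG id, schedule, and hard-coded counts are assumptions standing in for the framework's real MSSQL/Oracle queries.

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract_and_load():
        # Placeholder for the framework's extract/load step (MSSQL/Oracle/SharePoint -> Oracle)
        print("extracting from source and loading into the Oracle target")

    def reconcile():
        # Placeholder counts; the real check runs SELECT COUNT(*) on source and target tables
        source_count, target_count = 1000, 1000
        if source_count != target_count:
            raise ValueError(f"Reconciliation failed: source={source_count}, target={target_count}")

    with DAG(
        dag_id="rex_replication_example",  # hypothetical DAG id
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
        default_args={"retries": 1, "retry_delay": timedelta(minutes=5)},
    ) as dag:
        load = PythonOperator(task_id="extract_and_load", python_callable=extract_and_load)
        check = PythonOperator(task_id="reconcile", python_callable=reconcile)
        load >> check  # the reconciliation check runs only after a successful load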
SKILLS & COMPETENCIES
•Experience in data warehousing, big data, requirement gathering and analysis, data modelling, data analysis, and the development, implementation, and testing of business rules.
•Excellent team player with strong communication skills; proven track record of building and maintaining high client satisfaction ratings.
•Proficient with Oracle SQL, PL/SQL procedures, Hive, AWS, Python, Looker/Tableau, and Unix shell scripting, applied across multiple projects.
•Hands-on experience with data warehouse proof-of-concept (POC) builds and Spark implementation for data lake development; delivered a Python POC for Snowflake and a separate POC on Kafka Connect.
•Experience with data pipeline development, data integration, and interface development across the full project lifecycle.
•Strong analytical and problem-solving skills; keen to learn technologies as requirements demand and apply them in the project.
•Experienced in working with SMEs and onshore/offshore teams, playing an effective role in discussions of design, development, and efficient solutions to complex issues.
•Working experience of complex data analysis in data migration and interface development projects.
•Flexible in learning and working with data analytics technologies as project requirements dictate; basic knowledge of Kafka, big data technologies, blockchain, AI, Neo4j, Flask, and FastAPI.
•Automation and innovation experience, eliminating manual work through ETL scripts.
•Coordinate with teams and stakeholders through effective communication; built a data pipeline replication framework to extract data from MSSQL, Oracle, and SharePoint and load it into Oracle or SharePoint using Python, SQL, Airflow, and CI/CD.