We are seeking a talented and motivated Senior Data Engineer to join our dynamic team. As a Senior Data Engineer, you will be responsible for designing, implementing, and maintaining scalable data pipelines and infrastructure to support our data-driven initiatives. You will collaborate closely with cross-functional teams to understand data requirements, optimize data models, and ensure the reliability and performance of our data systems.
Key Responsibilities:
Design, develop, and maintain robust data pipelines to ingest, process, and transform large volumes of structured and unstructured data from various sources (e.g., databases, APIs, SFTP).
Build and optimize data models and schemas to support efficient data storage, retrieval, and analysis.
Implement data integration solutions to consolidate data from disparate systems and sources.
Work closely with Data Scientists and Analysts to understand their requirements and provide them with clean, reliable, and well-structured data.
Collaborate with DevOps and Infrastructure teams to deploy and manage data infrastructure in cloud environments (e.g., AWS, Azure).
Apply data security practices to ensure that data pipelines are secured and that PII (Personally Identifiable Information) is encrypted.
Monitor and optimize the performance, scalability, and reliability of data systems to ensure high availability and low latency.
Design and develop production MLOps pipelines, and support data scientists and ML engineers in deploying their ML/DL models at scale, within SLAs, on both cloud and on-premises GPU and CPU instances.
Develop and maintain documentation, standards, and best practices for data engineering processes and technologies.
Stay current with emerging technologies and trends in data engineering and contribute to the continuous improvement of our data architecture and practices.
Explore, evaluate, and champion the introduction of next-generation technologies in the data-ingestion workflow. Participate in project planning and provide technical guidance on cloud architecture for data projects.
Requirements:
A BS in Computer Science or a related discipline is required. An advanced degree (MS or PhD) in Computer Science is highly desirable.
Proven experience (3 years) working as a Data Engineer or in a similar role.
Strong proficiency in a programming language such as Python, with a solid grasp of data structures and algorithm design.
Hands-on experience with distributed computing frameworks and big data technologies such as Spark and Kafka.
Proficiency in SQL and experience with relational databases (e.g., Azure SQL Database, Amazon Redshift).
Experience with data modelling, ETL/ELT processes, and data warehousing concepts.
Familiarity with cloud platforms and services (e.g., Amazon S3, AWS Step Functions, Amazon Managed Workflows for Apache Airflow, Azure Logic Apps, Azure Blob Storage).
Experience setting up single sign-on (SSO) using Active Directory (AD) and implementing Role-Based Access Control (RBAC) for data security.
Excellent problem-solving skills and attention to detail.
Strong communication and collaboration skills, with the ability to work effectively in a cross-functional team environment.
Experience with version control systems (e.g., Git) and CI/CD pipelines is a plus.