Requirements:
· Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field
· Quantexa-certified data engineer / data architect is preferable
· Proven experience as a Data Engineer, working with Hadoop, Spark, and data processing technologies in large-scale environments
· Proficiency in Scala programming language and familiarity with functional programming concepts
· Experience with the Quantexa tool is highly preferred
· In-depth understanding of Apache Spark architecture, RDDs, DataFrames, and Spark SQL
· Strong expertise in designing and developing data infrastructure using Hadoop, Spark, and related tools (HDFS, Hive, Pig, etc.)
· Experience with containerization platforms such as OpenShift Container Platform (OCP) and container orchestration using Kubernetes
· Proficiency in programming languages commonly used in data engineering, such as Python, Scala, or Java
· Knowledge of DevOps practices, CI/CD pipelines, and infrastructure automation tools (e.g., Docker, Jenkins, Ansible, Bitbucket)
· Experience with Grafana, Prometheus, and Splunk is an added benefit
· Experience integrating and working with Elasticsearch for data indexing and search applications
· Solid understanding of Elasticsearch data modeling, indexing strategies, and query optimization
· Experience with distributed computing, parallel processing, and working with large datasets
· Proficiency in performance tuning and optimization techniques for Spark applications and Elasticsearch queries
· Strong problem-solving and analytical skills with the ability to debug and resolve complex issues
· Familiarity with version control systems (e.g., Git) and collaborative development workflows