- Project Description:
About the Project: The role is part of the Data Lake project team at Client.
- Responsibilities:
§ Undertaking the necessary requirements analysis
§ Designing the technical architecture and application customization
§ Developing and troubleshooting on Hadoop technologies such as HDFS, Hive, Sqoop, Spark, MapReduce2, HBase, and Kafka (see the illustrative sketch after this list)
§ Translating, loading, and presenting disparate data sets from a variety of sources and formats
§ Managing and reviewing Hadoop log files
§ Managing Hadoop jobs using a scheduler
§ Supporting MapReduce programs running on the Hadoop cluster
§ Fine-tuning applications and systems for high performance and higher-volume throughput
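For illustration only (not part of the posting itself), here is a minimal sketch of the kind of Spark development work described above, written against Spark's Java API; the application name, HDFS paths, and the status column are hypothetical placeholders:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Minimal sketch: read a CSV landed on HDFS, filter it, and persist it
// to a curated zone as Parquet. All paths and column names are hypothetical.
public class IngestJobSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("DataLakeIngestSketch")    // hypothetical app name
                .getOrCreate();

        Dataset<Row> raw = spark.read()
                .option("header", "true")
                .csv("hdfs:///landing/orders.csv"); // hypothetical source path

        // Keep only completed records before writing to the curated zone.
        raw.filter("status = 'COMPLETED'")          // hypothetical column/value
           .write()
           .mode("overwrite")
           .parquet("hdfs:///curated/orders");      // hypothetical target path

        spark.stop();
    }
}
```

A job like this would typically be packaged, launched with spark-submit, and triggered by the workflow scheduler mentioned above.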
- Mandatory Skills Description:
Must:
§ Extensive knowledge of Hadoop architecture and HDFS
§ Strong hands-on experience in Core Java (ideally including Java 8)
§ Familiarity with Unix/Linux operating systems
§ Expertise in writing HiveQL and Impala scripts, and in shell scripting
§ Java MapReduce
§ Proven knowledge of workflow schedulers
§ Ability to work with data loading tools such as Flume and Sqoop
§ Strong knowledge of SQL (Oracle, MySQL, MariaDB)
§ Cloud exposure with OpenShift and Docker image builds
§ Knowledge of Kafka and Spark Streaming (see the sketch after this list)
§ Good knowledge of Spark programming
§ Experience with ELK or Solr search tools would be advantageous
§ Knowledge of AWS cloud development
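As a further hedged illustration of the Kafka and Spark Streaming skills above, a minimal Structured Streaming consumer in Java; the broker address and topic name are hypothetical, and the spark-sql-kafka connector is assumed to be on the classpath:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

// Minimal sketch: subscribe to a Kafka topic and print key/value pairs
// to the console. Broker and topic names are hypothetical placeholders.
public class KafkaStreamSketch {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("KafkaStreamSketch")
                .getOrCreate();

        Dataset<Row> events = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "broker:9092") // hypothetical broker
                .option("subscribe", "events")                    // hypothetical topic
                .load();

        // Kafka delivers binary key/value columns; cast them for display.
        StreamingQuery query = events
                .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
                .writeStream()
                .format("console")
                .start();

        query.awaitTermination();
    }
}
```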