- Study the complex architecture, which involves integrations with multiple systems, and recommend and implement changes to the current architecture.
- Work on the entire development life cycle of Big Data projects: choose the right development tools, establish best practices for setting up the cluster-level Big Data ecosystem, and settle on software build/deployment practices with the team under time constraints.
- Evaluate Elasticsearch cluster performance and perform extensive tuning of ES queries.
- Find cluster-level solutions for our complex system and develop enterprise-level applications, followed by unit testing.
- Perform data transformation using complex Hive Query Language (HQL) and build a data lake used for data analysis. Run complex queries and work with bucketing, partitioning, joins, and sub-queries.
- Modify existing configuration files in accordance with development requirements and set up new Apache Hadoop components in the ecosystem.
- Perform complex data cleanups, extraction, and validation using MapReduce and Hive.
- Write applications in multiple languages, including Scala, Java, and Python.
- Write advanced Big Data business applications in both functional and object-oriented styles.
- Develop standalone Spark/Scala applications that read error logs from multiple upstream data sources and run validations on them.
- Write build scripts using tools like Apache Maven, Ant, and sbt, and deploy the code using Jenkins for CI/CD.
- Write complex workflow jobs using Oozie and set up a scheduler system that helps manage multiple Hadoop, Hive, Sqoop, and Spark jobs.
- Closely monitor pipeline jobs and troubleshoot failed ones. Set up several new property configurations within Oozie.
- Develop Kafka producers that listen to several streaming data sources within a specified duration.
- Develop a collector service using Akka, Akka Streams, and Kafka.
- Work on Spark Streaming applications that listen to multiple data streams from several HTTP web sources.
- Perform data cleanups and validations on streaming data using Spark, Spark Streaming, and Scala.
- Develop data ingestion scripts in Scala and Python.
- Build dashboards with Elastic Stack components such as Kibana and Logstash.
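The Hive transformation work described above (bucketing, partitioning, joins, and sub-queries) might look like the following sketch; the table and column names (`web_logs`, `raw_logs`, `event_date`, `user_id`) are illustrative assumptions, not from the posting:

```sql
-- Partitioned, bucketed target table (illustrative schema)
CREATE TABLE IF NOT EXISTS web_logs (
  user_id   BIGINT,
  url       STRING,
  status    INT
)
PARTITIONED BY (event_date STRING)
CLUSTERED BY (user_id) INTO 32 BUCKETS
STORED AS ORC;

-- Dynamic-partition insert from a raw staging table, with a simple validation filter
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE web_logs PARTITION (event_date)
SELECT user_id, url, status, event_date
FROM raw_logs
WHERE status IS NOT NULL;

-- Join with a sub-query: per-user error counts for one day
SELECT l.user_id, e.error_count
FROM web_logs l
JOIN (
  SELECT user_id, COUNT(*) AS error_count
  FROM web_logs
  WHERE event_date = '2017-01-01' AND status >= 500
  GROUP BY user_id
) e ON l.user_id = e.user_id
WHERE l.event_date = '2017-01-01';
```

Bucketing on the join key (`user_id`) lets Hive use bucket map joins, and partitioning by date keeps the daily queries from scanning the whole data lake.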
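An Oozie workflow of the kind described above, chaining a Hive transformation into a Spark validation job, might be sketched as follows; every name, path, and property value here is hypothetical:

```xml
<workflow-app name="daily-pipeline" xmlns="uri:oozie:workflow:0.5">
  <start to="hive-transform"/>

  <action name="hive-transform">
    <hive xmlns="uri:oozie:hive-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <script>transform.hql</script>
    </hive>
    <ok to="spark-validate"/>
    <error to="fail"/>
  </action>

  <action name="spark-validate">
    <spark xmlns="uri:oozie:spark-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <master>yarn-cluster</master>
      <name>log-validation</name>
      <class>com.example.LogValidator</class>
      <jar>${nameNode}/apps/log-validator.jar</jar>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
  </action>

  <kill name="fail">
    <message>Pipeline failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

A coordinator definition (not shown) would schedule this workflow on a fixed frequency, which is how the multi-job scheduling described above is typically wired up.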
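The cleanup-and-validation logic applied to error logs in the Spark jobs above could be sketched in plain Python; the field layout (`timestamp|level|source|message`) and the validation rules are assumptions for illustration, not the actual production schema:

```python
from datetime import datetime

# Assumed layout of one raw error-log line: timestamp|level|source|message
FIELDS = 4
LEVELS = {"ERROR", "WARN", "FATAL"}

def clean_and_validate(line):
    """Return a cleaned record dict, or None if the line fails validation."""
    parts = [p.strip() for p in line.split("|")]
    if len(parts) != FIELDS:
        return None
    ts, level, source, message = parts
    if level not in LEVELS or not source or not message:
        return None
    try:
        when = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return None
    return {"ts": when, "level": level, "source": source, "message": message}

def validate_batch(lines):
    """Split a micro-batch of raw lines into (valid_records, rejected_lines)."""
    valid, rejected = [], []
    for line in lines:
        record = clean_and_validate(line)
        if record is not None:
            valid.append(record)
        else:
            rejected.append(line)
    return valid, rejected
```

In a Spark Streaming job this kind of per-record function would typically be applied across partitions (for example via `mapPartitions`), with the rejected lines routed to a quarantine sink for inspection.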
- Spark, Spark Streaming, Spark SQL, Hadoop, HDFS, MapReduce, YARN, Hive, Sqoop, Kafka, Storm, Oozie, ZooKeeper, Teradata, Elasticsearch, Kibana, Logstash, Akka, HBase, Cassandra, Java, J2EE, Scala, HTML, XML, XSL, Git, CVS, SVN, JUnit, MRUnit, Ant, Jenkins, Maven, Log4j, WinSCP, PuTTY, Eclipse, Linux, Windows
If you are interested in working in a fast-paced, challenging, fun, entrepreneurial environment and would like the opportunity to be part of this fascinating industry, please send your resume to firstname.lastname@example.org.
This position qualifies for an employee referral bonus. The bonus paid is $1,000 if the referral is hired.