We are looking for a Big Data Engineer who loves solving complex problems across a full spectrum of technologies. You will help ensure our technological infrastructure operates seamlessly in support of our business objectives.
- Ability to take the lead and foster a trustworthy working environment.
- Partner with relevant teams to deliver consistent, high-quality outputs.
- Curiosity to learn and willingness to collaborate whenever needed.
- Ability to independently manage projects and report/present efforts to clients.
- Strong communication skills.
- Gather and process raw data and translate analyses into actionable insights.
- Define data retention policies.
- Design and develop data applications using the tools and frameworks selected for a variety of teams and projects.
- Design and implement relational databases for storage and processing.
- Read, extract, transform, stage, and load data into selected tools and frameworks as required.
- Perform tasks such as writing scripts, web scraping, calling APIs, and writing SQL queries.
- Work closely with the engineering team to integrate your work into our production systems.
- Process unstructured data into a form suitable for analysis.
- Analyze processed data.
- Evaluate new data sources for acquisition and integration.
- Support business decisions with ad hoc analysis as needed.
- Monitor data performance and modify infrastructure as needed.
- Work directly with the technology and engineering teams to align data processing with business objectives.
- Bachelor’s degree or higher in Computer Science, Mathematics, or a related field.
- Strong knowledge of and experience with statistics.
- Programming experience, ideally in Python or Java, along with experience in big data technologies such as Spark, Kafka, Hadoop, MapReduce, Pig, Hive, MySQL, Cassandra, MongoDB, other SQL and NoSQL databases, and data streaming, plus a willingness to learn new languages and tools to meet goals and objectives.
- Experience in data engineering.
- Experience in MapReduce is a plus.
- Experience with machine learning toolkits such as H2O, SparkML, or Mahout.
- Experience in production support and troubleshooting.
- Experience processing large amounts of structured and unstructured data, including integrating data from multiple sources.
- Knowledge of ETL tools, data APIs, data modelling, and data warehousing solutions.
- Knowledge of data cleaning, wrangling, visualization, and reporting, with an understanding of the most efficient tools and applications for completing these tasks.
- Familiarity with Mesos, AWS, and Docker tools.
- Deep knowledge of data mining, machine learning, natural language processing, or information retrieval.
- A willingness to explore new approaches to data mining problems, combining industry best practices, data innovations, and your own experience to get the job done.