We are looking for a Big Data Pipeline Developer who can manipulate data, query relational databases, and apply knowledge of fundamental data structures and formats.
Soft Skills:
- Ability to take the lead and foster a trustworthy working environment.
- Ability to partner with relevant teams and deliver seamless outputs.
- Curiosity to keep learning and to collaborate whenever needed.
- Ability to independently manage projects and report and present progress to clients.
- Strong communication skills.
Responsibilities:
- Develop and maintain data pipelines implementing ETL processes.
- Work on ingesting, storing, processing and analyzing large data sets.
- Work closely with the data science team to implement data analytics pipelines.
- Create scalable and high-performance web services for tracking data.
- Translate complex technical and functional requirements into detailed designs.
- Investigate and analyze alternative solutions for data storage, processing, etc. to ensure the most streamlined approaches are implemented.
- Serve as a mentor to junior staff by conducting technical training sessions and reviewing project outputs.
- Take responsibility for Hadoop development and implementation.
- Help define data governance policies and support data versioning processes.
- Maintain security and data privacy, working closely with the internal Data Protection Officer.
- Analyze vast data stores and uncover insights.
Required Skills:
- Bachelor’s degree or higher in Computer Science, Mathematics, or a related field.
- Experience with Python, Spark, and Hive.
- Understanding of data warehousing and data modeling techniques.
- Knowledge of industry-wide analytical and visualization tools (e.g., Tableau and R).
- Strong data engineering skills on the Azure Cloud Platform are essential.
- Experience with streaming frameworks such as Kafka.
- Knowledge of core Java, Linux, SQL, and any scripting language.
- Knowledge of ETL tools, data APIs, data modeling, and data warehousing solutions.