Responsibilities:
- Design, implement, and optimize end-to-end data pipelines for AI and machine learning applications using cloud platforms such as Azure and AWS.
- Collaborate with data scientists and other stakeholders to understand AI model requirements and deploy scalable solutions.
- Develop and maintain data processing and feature engineering workflows for machine learning model training.
- Implement data orchestration and workflow automation using tools like Apache Airflow, Azure Data Factory, or AWS Step Functions.
- Work with big data technologies, including Apache Spark and Databricks, to process and analyze large datasets.
- Implement data versioning and lineage tracking for model reproducibility and compliance.
- Collaborate with cross-functional teams to design and implement AI-driven applications.
- Ensure data quality, security, and compliance with data governance standards.
- Optimize and tune data pipelines for performance, scalability, and cost-effectiveness.
- Stay updated on the latest advancements in AI, machine learning, and data engineering.
Requirements:
- Minimum 5 years of experience in data engineering with a focus on AI, machine learning, and AI enrichment.
- Bachelor’s degree in Computer Science or a related field.
- Proficiency in cloud-based data engineering platforms, including Azure, AWS, and Databricks.
- Strong programming skills in languages such as Python or Scala.
- Experience with machine learning frameworks and libraries.
- Knowledge of data modeling, database design, and optimization.
- Familiarity with data warehousing concepts and technologies.
- Excellent problem-solving and analytical skills.
- Ability to work collaboratively in a team environment.
- Strong communication and documentation skills.
Preferred Skills:
- Industry certifications related to data engineering, AI, or machine learning (e.g., Microsoft Certified: Azure AI Engineer Associate).
- Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).
- Knowledge of MLOps practices for deploying and managing machine learning models.
- Understanding of natural language processing (NLP) and computer vision.
- Familiarity with distributed computing and parallel processing.