Data Architect

Posted by Virtusa

  • Category: IT
  • Job type: Full-time
  • Location: Colombo

Job description

· Design and Develop Scalable Data Pipelines: Build and maintain robust data pipelines using Python to process, transform, and integrate large-scale data from diverse sources.

· Orchestration and Automation: Implement and manage workflows using orchestration tools such as Apache Airflow to ensure reliable and efficient data operations (a minimal sketch follows this list).

· Data Warehouse Management: Work extensively with Snowflake to design and optimize data models, schemas, and queries for analytics and reporting.

· Queueing Systems: Leverage message queues like Kafka, SQS, or similar tools to enable real-time or batch data processing in distributed environments.

· Collaboration: Partner with Data Science, Product, and Engineering teams to understand data requirements and deliver solutions that align with business objectives.

· Performance Optimization: Tune data pipelines and queries to handle large volumes of data efficiently.

· Data Governance and Security: Ensure compliance with data governance and security standards to maintain data integrity and privacy.

· Documentation: Create and maintain clear, detailed documentation for data solutions, pipelines, and workflows.
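For illustration only, and not part of the posting itself: a minimal sketch of the kind of Python-based, Airflow-orchestrated pipeline these responsibilities describe. The DAG name, schedule, and data source are hypothetical, and a real pipeline would split extract, transform, and load into separate tasks with retries and alerting.

# Illustrative sketch only; dag_id, schedule, and source names are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_pipeline():
    # Placeholder extract -> transform -> load sequence, collapsed into one
    # task for brevity; the load into Snowflake is stubbed out here.
    records = [{"order_id": 1, "amount_cents": 1299}]  # extract (stub)
    rows = [{**r, "amount_usd": r["amount_cents"] / 100} for r in records]  # transform
    print(f"would load {len(rows)} rows into Snowflake")  # load (stub)


with DAG(
    dag_id="orders_pipeline",          # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    PythonOperator(task_id="run_pipeline", python_callable=run_pipeline)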

Qualifications

· 9+ years of experience in data engineering roles with a focus on building scalable data solutions.

· Proficiency in Python for ETL, data manipulation, and scripting.

· Hands-on experience with Snowflake or equivalent cloud-based data warehouses.

· Strong knowledge of orchestration tools such as Apache Airflow or similar.

· Expertise in implementing and managing message queues such as Kafka, AWS SQS, or similar.

· Demonstrated ability to build and optimize data pipelines at scale, processing terabytes of data.

· Experience in data modeling, data warehousing, and database design.

· Proficiency in working with cloud platforms like AWS, Azure, or GCP.

· Strong understanding of CI/CD pipelines for data engineering workflows.

· Experience working in an Agile development environment, collaborating with cross-functional teams.

· Familiarity with other programming languages like Scala or Java for data engineering tasks.

· Knowledge of containerization and orchestration technologies (Docker, Kubernetes).

· Experience with stream processing frameworks like Apache Flink.

· Experience with Apache Iceberg for data lake optimization and management.

· Exposure to machine learning workflows and integration with data pipelines.

Soft Skills:

· Strong problem-solving skills with a passion for solving complex data challenges.

· Excellent communication and collaboration skills to work with cross-functional teams.

· Ability to thrive in a fast-paced, innovative environment.

