Job Details
In this role, you will design and build data governance tooling, metadata workflows, and automation that ensure data quality, lineage, compliance, and consistency across complex data environments.
Responsibilities
· Architect, build, and maintain data governance services and workflow automation (APIs, metadata ingestion pipelines, lineage services, classification/tagging systems).
· Develop and optimize ETL/ELT data pipelines with embedded data quality and governance controls.
· Integrate and enhance data catalog/metadata tools (e.g., Collibra, Alation, DataHub, or internal equivalents).
· Support development of data dictionaries, data definitions, lineage documentation, and business-to-technical mapping.
· Implement monitoring, alerting, and governance dashboards to track data quality KPIs and metadata coverage.
· Ensure compliance with data governance policies, data access controls, and privacy regulations (e.g., GDPR, CCPA).
· Partner with data engineering, analytics, product, and privacy teams to translate governance requirements into scalable technical solutions.
· Create and maintain documentation, system designs, and operational processes.
Requirements
Required Qualifications
· Bachelor’s degree in Computer Science, Engineering, Information Systems, or equivalent work experience.
· 3+ years of experience in software development (Java, Python, Scala, or similar languages).
· Experience working with distributed data platforms (e.g., Spark, Hive, Snowflake, BigQuery, Presto, Redshift).
· Hands-on experience with data governance, metadata management, or data lineage systems.
· Strong SQL proficiency and familiarity with data pipeline/orchestration tools.
· Experience with cloud environments (AWS, GCP, or Azure).
· Excellent communication skills and ability to collaborate across technical and non-technical teams.
· Self-driven, organized, and comfortable operating in a fast-paced environment.
Preferred Qualifications
· Experience working in streaming media, consumer data, or other large-scale data environments.
· Familiarity with data catalog platforms (Collibra, Alation, DataHub).
· Experience with Airflow or other orchestration tools.