Senior Data Platform Engineer
Original Advert
A government-backed Abu Dhabi organization focused on advanced technology R&D (est. 2020), defining strategy, funding, and policy across AI, robotics, and emerging technologies. It oversees the full innovation lifecycle - from research and programs to commercialization - through dedicated applied research, innovation, and venture entities.
The first production system is an AI-enabled operational platform that gives a senior leadership team a shared situational picture, an AI-classified signal feed, a daily AI-generated briefing, and an action accountability tracker. MVP target: operational within two weeks of team formation. The platform is also the technical foundation for all subsequent Data & AI systems across the organization.
Build and operate the data platform that powers the Data & AI Office's (DAIO) production systems and the long-term data estate. In the immediate term: the signal ingestion pipeline, the data quality layer, and observability for all data flows. In the medium term: the enterprise data warehouse on Azure and sovereign compute, the metadata catalog, and the governed data access layer for AI agents.
WHAT THIS ROLE BUILDS & OWNS
Signal ingestion pipeline - 30-minute polling job across all defined open-source feeds (news wires, maritime AIS, financial feeds, social/keyword feeds)
Deduplication and normalization layer - common signal schema across all sources
Ingestion observability - every item logged with source, timestamp, processing status, and failure reason; no silent drops
PostgreSQL schema deployment and migration scripts (Alembic)
Azure Redis Cache - session management and ingestion queue configuration
Phase 2 data warehouse: ADLS + Synapse/Fabric, data ingestion from SAP, M365, and ATRC enterprise systems
Data quality monitoring - automated checks on signal completeness, classification coverage, and freshness
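As an illustration only - the field names and key strategy below are assumptions, not part of this advert - the "common signal schema" and a content-based deduplication key across sources might be sketched as:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
import hashlib

@dataclass(frozen=True)
class Signal:
    """Hypothetical normalized form every feed is mapped into."""
    source: str          # e.g. "news_wire", "ais", "financial", "social"
    external_id: str     # identifier assigned by the upstream feed, if any
    title: str
    body: str
    observed_at: datetime

def dedup_key(sig: Signal) -> str:
    """Content-based key: the same story arriving from two different
    wires on the same day collapses to a single signal, regardless of
    each feed's own identifiers."""
    canonical = f"{sig.title.strip().lower()}|{sig.observed_at.date().isoformat()}"
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Whether the key should hash the title, the body, an upstream identifier, or some combination is exactly the "what makes a signal unique" decision this role owns.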
KEY DECISIONS THIS ROLE OWNS
Polling frequency, retry logic, and backoff strategy for each signal source
Deduplication key design - what makes a signal unique across sources
Whether a data quality failure is a warning (flag it) or a stop (pause ingestion)
Schema migration approach - blue-green, Alembic auto-migrate, or manual rollout
Data retention schedule - what is archived, what is purged, and when
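To make the retry/backoff decision concrete, here is a minimal sketch of one common answer - capped exponential backoff with jitter around each source's fetch. All names and defaults are illustrative assumptions, not a prescribed design:

```python
import random
import time

def poll_with_backoff(fetch, max_retries=5, base_delay=2.0, cap=60.0):
    """Retry a flaky feed fetch, doubling the wait each attempt up to
    `cap` seconds, with a little jitter so parallel pollers do not
    hammer a recovering source in lockstep."""
    for attempt in range(max_retries):
        try:
            return fetch()
        except Exception:
            if attempt == max_retries - 1:
                raise  # surface the failure; never drop it silently
            delay = min(cap, base_delay * 2 ** attempt)
            time.sleep(delay + random.uniform(0, delay * 0.1))
```

Per-source tuning (tighter retries for AIS, looser for slow news wires) is one of the judgment calls the role would make here.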
WHAT THIS ROLE DOES NOT DO
Define the data model or classification schema - that is the Head of Data Architecture
Build the application API endpoints - that is the Backend/Systems Engineers
Write AI prompts or tune classification outputs
Manage cloud infrastructure provisioning - that is a DevOps/infra function
REQUIREMENTS
Microsoft data stack - Fabric or Synapse
Strong Python (custom integration code, async processing)
Building and supporting data pipelines (Airflow or similar)
PostgreSQL/Alembic
Observability and monitoring (Prometheus, Grafana, or similar)
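The observability requirement ties back to the "no silent drops" rule above: every item gets a logged record with source, timestamp, status, and failure reason. A stdlib-only sketch of such a record - field names are assumptions for illustration:

```python
import json
from datetime import datetime, timezone

def ingestion_log_record(source, item_id, status, failure_reason=None):
    """One structured JSON line per processed item, whether it succeeded
    or failed, so downstream dashboards can count drops by reason."""
    return json.dumps({
        "source": source,
        "item_id": item_id,
        "processed_at": datetime.now(timezone.utc).isoformat(),
        "status": status,                  # e.g. "ingested" or "failed"
        "failure_reason": failure_reason,  # None when status is success
    })
```

In practice these lines would feed the Prometheus/Grafana stack named above, e.g. as counters labeled by source and failure reason.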
Applications for this role are managed by ZooLATECH.