DataHub Manipulators are data transformation tools that let users process, modify, and validate data as it flows through the DataHub pipeline. The pipeline consists of a read pass (where all data manipulation occurs) and a write pass (where data is written out). Manipulators operate only during the read pass. They execute in a fixed sequence (Pass 1, Pass 2, Pass 3), so that types are settled before fields are combined and validation runs on the final values:
- Converter (Pass 1) – Single-column transformations, typically ensuring type validity. Converts source field values into the desired type or format. See DataHub Converter - Runs in Pass 1
- Mutator (Pass 2) – Multi-column logic, merging, splitting, or deriving fields. Combines or transforms one or more input fields into one or more output fields. See DataHub Mutator - Runs in Pass 2
- Validator (Pass 3) – Checks the validity of field values and raises errors if conditions are not met, but never modifies data. See DataHub Validator - Pass 3
Note: The script template must be written in JavaScript.
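The three roles above can be sketched in plain JavaScript. This is only an illustration of the division of labour: the function names (`convertQuantity`, `deriveLineTotal`, `validateLineTotal`) and the record shape are hypothetical and do not reproduce the actual DataHub script template, which is described on the linked pages.

```javascript
// Hypothetical sketch: function names and record shape are illustrative
// assumptions, not the DataHub script template.

// Converter (Pass 1): one column in, one typed value out.
function convertQuantity(value) {
  const n = Number.parseInt(String(value).trim(), 10);
  return Number.isNaN(n) ? null : n;
}

// Mutator (Pass 2): derives a new field from several input fields.
function deriveLineTotal(record) {
  return { ...record, lineTotal: record.quantity * record.unitPrice };
}

// Validator (Pass 3): reports problems and leaves the record unchanged.
function validateLineTotal(record) {
  const errors = [];
  if (!(record.lineTotal > 0)) {
    errors.push("lineTotal must be greater than zero");
  }
  return errors;
}

// Apply the three passes, in order, to one sample record.
const source = { quantity: " 3 ", unitPrice: 2.5 };
const converted = { ...source, quantity: convertQuantity(source.quantity) };
const mutated = deriveLineTotal(converted);
const errors = validateLineTotal(mutated);
```

Here the converter turns the string `" 3 "` into the number `3`, the mutator can then multiply it safely, and the validator inspects the derived `lineTotal` without changing it.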
Integration with DataHub Pipeline
Manipulators run at fixed points in the DataHub data flow:
- Data Ingestion: Source data enters the pipeline
- Pass 1 (Converters): Data type conversion and basic formatting
- Pass 2 (Mutators): Complex transformations and calculations
- Pass 3 (Validators): Final validation and quality checks
- Data Output: Processed data continues to destination
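The ordering of the stages listed above can be sketched as a small runner. Everything here is an assumption made for illustration: `runReadPass`, the shape of the `steps` object, and the choice to write out only rows without validation errors are hypothetical, not DataHub behaviour or API.

```javascript
// Hypothetical sketch of the read pass; runReadPass and the shape of
// `steps` are assumptions made for illustration, not a DataHub API.
function runReadPass(rows, steps) {
  return rows.map((row) => {
    // Pass 1: each converter handles exactly one column.
    let record = { ...row };
    for (const [column, convert] of Object.entries(steps.converters)) {
      record[column] = convert(record[column]);
    }
    // Pass 2: mutators may read and write several columns.
    for (const mutate of steps.mutators) {
      record = mutate(record);
    }
    // Pass 3: validators only collect errors; the record is frozen to
    // mirror the rule that validators never modify data.
    const frozen = Object.freeze({ ...record });
    const errors = steps.validators.flatMap((validate) => validate(frozen));
    return { record: frozen, errors };
  });
}

const results = runReadPass(
  [{ first: " Ada ", last: "Lovelace" }, { first: "", last: "Hopper" }],
  {
    converters: { first: (v) => String(v).trim() },
    mutators: [(r) => ({ ...r, fullName: `${r.first} ${r.last}`.trim() })],
    validators: [(r) => (r.first ? [] : ["first is required"])],
  }
);

// Write pass (assumed policy): emit only rows that passed validation.
const output = results
  .filter((r) => r.errors.length === 0)
  .map((r) => r.record);
```

Because the validator runs last, it sees the trimmed `first` value and the derived `fullName`, so the second row is rejected for its empty `first` field while the first row reaches the output.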
Conclusion
DataHub Manipulators provide a flexible framework for data transformation within DataHub. By understanding the three-pass execution model and how each manipulator type is configured, users can build data processing pipelines that enforce data quality, consistency, and business rules.
Because Converters, Mutators, and Validators always run in that order, each stage can rely on the output of the one before it, which keeps the system predictable and maintainable in complex data integration scenarios.