Milestone DataCleaner 2.4
Integration with the EasyDataQuality (aka EasyDQ) cloud platform. Concretely, the aim is to provide a customer data quality solution, which includes:
- Duplicate detection (aka Deduplication or Fuzzy matching)
- Address validation and cleansing
- Name validation and cleansing
- Phone validation and cleansing
- Email validation and cleansing
New analysis job components:
- "Table lookup", which looks up one or more values in any datastore table, optionally matching on multiple conditions.
- "Insert into table" writer, which inserts data into e.g. database tables and other writable datastore tables (CSV and Excel).
- Timestamp converter, which allows conversion from timestamp
New datastores supported:
- MongoDB support (read + write).
- Streaming XML file support (SAX based).
- Added support for header line numbering in Fixed Width value files.
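As an illustration of what SAX-based streaming means for XML files, here is a minimal sketch that uses only the standard Java SAX API. It is not DataCleaner's actual implementation; the class name `SaxRowStreamer`, the method `readNames` and the `<name>` element are assumptions made for the example. The parser pushes events to the handler as it reads, so the whole document never has to be held in memory.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of DataCleaner.
public class SaxRowStreamer {

    /** Collects the trimmed text of every <name> element in the document. */
    static List<String> readNames(byte[] xml) throws Exception {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inName = false;
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("name".equals(qName)) {
                    inName = true;
                    text.setLength(0); // start collecting a new value
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times per element, so append rather than assign.
                if (inName) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName)) {
                    names.add(text.toString().trim());
                    inName = false;
                }
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new ByteArrayInputStream(xml), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<people><person><name>Ann</name></person>"
                + "<person><name>Bob</name></person></people>";
        System.out.println(readNames(xml.getBytes(StandardCharsets.UTF_8))); // prints [Ann, Bob]
    }
}
```

A real datastore would hand each value to the consumer as soon as it is complete instead of collecting values in a list; the list only keeps the sketch short.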
Minor updates and bugfixes to DC 2.3:
- SAS versioning issue (resolution: SassyReader 0.3)
- CSV writer separator char issue (resolution: MetaModel 2.0.2)
Extensibility and stability:
- Command line interface now supports specifying job variables.
- UI components for selecting columns, enum values etc. have been refactored and made much easier to extend and combine in custom extensions.
- Properties can now have custom serialization strategies, e.g. for encrypting passwords in job XML files.
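To make the idea of a custom serialization strategy more concrete, the sketch below shows one way a password property could be encrypted before being written to a file and decrypted when read back. It uses only the standard `javax.crypto` API. The `PropertySerializer` interface and the `AesPropertySerializer` class are hypothetical names invented for this example and do not reflect DataCleaner's actual extension API; the hard-coded key in `main` is for demonstration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical example class, not part of DataCleaner.
public class EncryptedPropertyExample {

    /** Hypothetical extension point: converts a property value to file text and back. */
    interface PropertySerializer {
        String serialize(String value) throws Exception;
        String deserialize(String text) throws Exception;
    }

    /** Stores a value as Base64(IV + AES-GCM ciphertext). */
    static class AesPropertySerializer implements PropertySerializer {
        private static final int IV_LENGTH = 12;   // bytes, the usual GCM nonce size
        private static final int TAG_BITS = 128;   // authentication tag length
        private final SecretKeySpec key;

        AesPropertySerializer(byte[] keyBytes) {
            // keyBytes must be 16, 24 or 32 bytes long for AES
            this.key = new SecretKeySpec(keyBytes, "AES");
        }

        @Override
        public String serialize(String value) throws Exception {
            byte[] iv = new byte[IV_LENGTH];
            new SecureRandom().nextBytes(iv); // fresh IV for every value
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] encrypted = cipher.doFinal(value.getBytes(StandardCharsets.UTF_8));
            // Prepend the IV so that deserialize() can recover it.
            byte[] out = new byte[iv.length + encrypted.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(encrypted, 0, out, iv.length, encrypted.length);
            return Base64.getEncoder().encodeToString(out);
        }

        @Override
        public String deserialize(String text) throws Exception {
            byte[] in = Base64.getDecoder().decode(text);
            byte[] iv = Arrays.copyOfRange(in, 0, IV_LENGTH);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] plain = cipher.doFinal(in, IV_LENGTH, in.length - IV_LENGTH);
            return new String(plain, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        // Demo key only; a real setup would load the key from a protected location.
        byte[] demoKey = "0123456789abcdef".getBytes(StandardCharsets.UTF_8);
        PropertySerializer serializer = new AesPropertySerializer(demoKey);
        String stored = serializer.serialize("secret");
        System.out.println(serializer.deserialize(stored)); // prints secret
    }
}
```

Because a fresh random IV is used on every call, serializing the same password twice yields different stored strings, which avoids revealing that two jobs share a password.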
