DataCleaner 1.5.4 released with dBase and MS Access support
Here it is: DataCleaner 1.5.4 :)
Although this is a minor release, it contains a few exciting features and fixes:
- We've updated the MetaModel version to 1.2 which adds support for two new datastores:
  - dBase databases (.dbf files)
  - MS Access databases (.mdb files)
- We've fixed a bug pertaining to text-file dictionary "file not found" errors.
- A lot of the other underlying libraries have been updated, providing improvements to performance and stability.
Head on over to the downloads page to grab the new DataCleaner.
MetaModel 1.2 introduces cross-datastore querying and MS Access and dBase support
We're happy to present a new version of the wonderful MetaModel component. This version adds a radical new feature: cross-datastore querying, which means that you can now execute queries that span multiple datastores (i.e. with transparent client-side joining, filtering, grouping etc.). You can check out a simple example of this at Kasper's Source (blog).
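To give a feel for what "client-side joining" means, here is a minimal sketch of a hash-based inner join over rows fetched from two different sources. It does not use the MetaModel API; the class name `ClientSideJoinSketch`, the method `innerJoin` and the sample rows are all hypothetical and only illustrate the general idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: none of these names belong to the MetaModel API.
class ClientSideJoinSketch {

    // Inner-joins two lists of rows on the given key positions.
    static List<String[]> innerJoin(List<String[]> left, int leftKey,
                                    List<String[]> right, int rightKey) {
        // Index the right-hand rows by their join key.
        Map<String, List<String[]>> index = new HashMap<>();
        for (String[] row : right) {
            index.computeIfAbsent(row[rightKey], k -> new ArrayList<>()).add(row);
        }
        // Probe the index with every left-hand row and concatenate matches.
        List<String[]> result = new ArrayList<>();
        for (String[] leftRow : left) {
            List<String[]> matches = index.get(leftRow[leftKey]);
            if (matches == null) {
                continue;
            }
            for (String[] rightRow : matches) {
                String[] joined = new String[leftRow.length + rightRow.length];
                System.arraycopy(leftRow, 0, joined, 0, leftRow.length);
                System.arraycopy(rightRow, 0, joined, leftRow.length, rightRow.length);
                result.add(joined);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Pretend these rows came from a CSV file and a database table.
        List<String[]> csvRows = Arrays.asList(
                new String[] {"1", "Alice"}, new String[] {"2", "Bob"});
        List<String[]> dbRows = Arrays.asList(
                new String[] {"1", "Copenhagen"}, new String[] {"3", "Aarhus"});
        for (String[] row : innerJoin(csvRows, 0, dbRows, 0)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

In this sketch both row sets are held in memory, which is the simplest way to show the technique; a real engine would have to be more careful with large datastores.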
Version 1.2 also adds support for two long-awaited datastores: Microsoft Access databases and dBase databases. Access support is implemented for MetaModel with a core based on the Jackcess project. MetaModel's dBase support is based on a derivative of xBaseJ, courtesy of xBaseJ, American Coders and Joe McVerry.
To look into MetaModel 1.2, here are the crucial resources:
- Downloadables at Google Code.
- Javadocs available online.
- Maven-support out of the box:
<dependency>
  <groupId>dk.eobjects.metamodel</groupId>
  <artifactId>MetaModel-full</artifactId>
  <version>1.2</version>
</dependency>
With MetaModel 1.2 we're feature-complete with all of the 1.x features of the MetaModel-roadmap. We hope that you will find it to be as great and useful as we ever intended it to be!
DataCleaner 1.5.3 released
After much waiting, we are finally ready to release DataCleaner 1.5.3. Here's the wrap-up on what's been going on:
- The MetaModel dependency has been upgraded to version 1.1.8, which means:
  - Improved Excel spreadsheet support
  - Improved SQL Server support
  - Improved performance for CSV files
- Fixed a bug that caused certain database connection errors to be ignored in terms of user feedback.
- Fixed a bug that caused re-opening of database dictionaries to throw a NullPointerException.
- Fixed a bug related to dictionary lookups of null values.
- Added support for Teradata databases.
- Added connection templates for SQL Server connections.
- Added support for selection of custom encodings when reading CSV files.
- Fixed a minor bug relating to reading files on the classpath when running in Java WebStart mode (which manifested in an exception thrown when clicking on "About DataCleaner").
So as you can see, it's been a mix of minor bugfixes and a couple of improvements to compatibility and performance regarding certain datastores. We hope you enjoy this new release of DataCleaner. As always, you can ...
Let us know what you think!
MetaModel 1.1.8 adds better SQL Server support
I'm happy to announce the release of MetaModel 1.1.8.
This is a minor release with updates relating only to MS SQL Server. The changes are, however, profound in this regard. Microsoft SQL Server JDBC drivers are known to be quirky when it comes to metadata exploration and we are happy to say that MetaModel now addresses these issues. So if you're an MS SQL Server user, you should be sure to get the latest version of MetaModel!
MetaModel is as always available at the following locations:
- Downloadables at Google Code.
- Javadocs available online.
- Maven-support out of the box:
<dependency>
  <groupId>dk.eobjects.metamodel</groupId>
  <artifactId>MetaModel-full</artifactId>
  <version>1.1.8</version>
</dependency>
We hope you're all satisfied with the improvements in this release, and don't hesitate to give us any feedback.
New book on Open Source Business Intelligence tells the DataCleaner-story
About half a year ago we received an exciting inquiry from Jos van Dongen on behalf of himself and his co-author Roland Bouman, telling us that they were writing a new book about Open Source Business Intelligence and in particular Pentaho-based solutions. And for this they were looking into DataCleaner for the data profiling section of the book!
The book is now out! It's called "Pentaho Solutions" and it's published by Wiley Publishing. You can read about it and buy it on their website as well.
The book contains a walkthrough for building a data warehouse using Open Source tools and, in doing so, applies DataCleaner for the important job of profiling and validation.
We congratulate Roland Bouman and Jos van Dongen for their great work to promote Open Source Business Intelligence and thank them for mentioning DataCleaner while they're at it!
Explore and query all your datastores with MetaModel 1.1.7
We're pleased to announce the release of MetaModel 1.1.7. The major change since our last release is the introduction of two important improvements:
- Microsoft SQL Server is finally supported and integration tests have been added to our portfolio of tests of supported databases. Thank you to Asbjørn Leeth for his major contributions to this feature.
- We've added an option to configure the character encoding for opening CSV files.
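Why the encoding option matters can be shown with plain JDK classes. The sketch below is not MetaModel code; the class `CsvEncodingSketch` and its `readLines` method are hypothetical names. It decodes the same bytes with two different charsets to show how a wrong choice corrupts non-ASCII characters.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: this is plain JDK code, not the MetaModel API.
class CsvEncodingSketch {

    // Reads all lines from the stream, decoding the bytes with the given charset.
    static List<String> readLines(InputStream in, Charset charset) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // "S\u00f8rensen" contains a non-ASCII letter, encoded here as ISO-8859-1.
        byte[] latin1 = "id;name\n1;S\u00f8rensen\n".getBytes(StandardCharsets.ISO_8859_1);
        // Decoding with the matching charset preserves the name.
        System.out.println(readLines(new ByteArrayInputStream(latin1), StandardCharsets.ISO_8859_1));
        // Decoding with a mismatched charset garbles it.
        System.out.println(readLines(new ByteArrayInputStream(latin1), StandardCharsets.UTF_8));
    }
}
```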
With the addition of these two improvements we think that we've added some significant "drops in the ocean" on our way to becoming the most comprehensive and advanced framework for object-oriented querying and datastore-independent schema exploration.
If you use Maven, update your dependencies to the following:
<dependency>
  <groupId>dk.eobjects.metamodel</groupId>
  <artifactId>MetaModel-full</artifactId>
  <version>1.1.7</version>
</dependency>
... or if you don't, head on over to our download site at Google Code and download a copy of the release.
eobjects.org announces Open Source data quality with DataCleaner 1.5.2
Dear DataCleaner users,
We are happy to announce the release of DataCleaner 1.5.2. Users of DataCleaner 1.5.0 or 1.5.1 won't be able to see a lot of changes in the user interface, but this release actually holds quite a lot of improvements “beneath the surface”:
- The most notable improvement is in the Value Distribution Profile. Previously this profile consumed quite a lot of memory, which could lead to out-of-memory errors in extreme cases. This has been fixed by using on-disk caching with Berkeley DB when necessary.
- Another notable feature is that we can now distribute DataCleaner as a single JAR file. This means that we will be serving the application as a Java WebStart application (i.e. run it as if it's an online application) and we are also considering other distribution options.
- When starting the application, it automatically downloads regular expressions from the RegexSwap.
- A bug in regards to matching number-based columns in dictionaries was reported and fixed.
- A bug in regards to invalid characters in XML-export formats was reported and fixed.
- When opening files, we are now ignoring suffix case so that .CSV files can be opened as well as .csv.
- The number of columns shown in the preview window is automatically restricted if there are too many to show on a single screen.
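The suffix handling mentioned in the list above can be sketched in a few lines. This is not DataCleaner's actual code; `FileSuffixSketch` and `hasSuffix` are hypothetical names used to illustrate a case-insensitive file extension check.

```java
import java.util.Locale;

// Hypothetical illustration only: not taken from the DataCleaner code base.
class FileSuffixSketch {

    // Returns true if the filename ends with the suffix, ignoring letter case.
    static boolean hasSuffix(String filename, String suffix) {
        // Locale.ROOT avoids locale-specific surprises when lower-casing.
        return filename.toLowerCase(Locale.ROOT).endsWith(suffix.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(hasSuffix("CUSTOMERS.CSV", ".csv"));
        System.out.println(hasSuffix("customers.csv", ".csv"));
        System.out.println(hasSuffix("customers.csv.bak", ".csv"));
    }
}
```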
You can download DataCleaner from the downloads page or you can use our new feature: Get it via Java WebStart!
This release underlines the ongoing evolution of DataCleaner into an ever more professionally capable data profiling and data quality tool. Seeing that DataCleaner is being used in large corporations worldwide, I wish to address a question that I have been thinking about and that I know users are pondering: How do you best combine the low adoption cost of Open Source applications like DataCleaner with the high flexibility that most commercial business software provides? To service this need we've opened up a new division of the company that I work with, Lund&Bendsen. Whether you need to deploy DataCleaner to high-scale installations, integrate the application with your existing systems, develop customized profiles and validation rules, or satisfy other enterprise needs, we offer you first-class services and in-depth expertise you won't find anywhere else.
To cut to the chase: DataCleaner 1.5.2 is here and we wish to extend the community development with a professional effort. So don't hesitate to let us know if you see an opportunity to invest. Adding value by targeting your use of the product is in the interest of customer, developer and community alike, and this is the reason our business is there.
To all you non-business users out there: Sorry for the obvious commercial rant and we hope you all enjoy the newest DataCleaner release.
Best regards,
Kasper Sørensen
Founder of eobjects.org and the DataCleaner project
MetaModel 1.1.6 released: Small changes, a bug fixed
We've released yet another version of MetaModel, namely version 1.1.6.
This release contains very few changes to the 1.1.5 release:
- A convenience method was added to the Query class: select(FunctionType, Column).
- Upgrading the Apache POI version in MetaModel introduced a few bugs that we did not discover in the 1.1.5 milestone. In 1.1.6 we fixed these bugs, and unit testing was significantly improved for this part of the code to prevent any new bugs from emerging.
We hope you enjoy this release and we apologize for the hectic release schedule - the aforementioned bug fixes were critical and we hope that you appreciate the quick response from the community.
eobjects.org announces the release of MetaModel 1.1.5
We have just released the newest version of MetaModel, 1.1.5. This release is a minor release which means no API changes, but a few upgrades in terms of performance, flexibility and ease of distribution (full list):
- The most important upgrade has been to CSV performance. We encountered a bug when querying this type of datastore that meant that the whole DataSet was stored in memory while using it. This has undergone quite some refactoring so that it will now stream through memory as expected, thus keeping the door open for very large CSV files.
- A minor change in the column naming scheme has been implemented for the Excel-based DataContexts. This means that if the first row of a spreadsheet contains only blank fields, we will automatically assign the names "[column 1]", "[column 2]" etc. accordingly.
- The downloadable zip or tar.gz file will now contain a "MetaModel-1.1.5-all.jar" file, which is an assembled jar file containing the classes of all MetaModel modules (core, csv, jdbc, excel etc.), which should substantially ease deployment of the framework.
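The column naming rule described in the list above can be sketched as follows. This is an assumption about how such a rule might look, not MetaModel's implementation; `ColumnNamingSketch` and `columnNames` are hypothetical names, and treating null or whitespace-only cells as "blank" is also an assumption.

```java
import java.util.Arrays;

// Hypothetical illustration only: not MetaModel's actual naming code.
class ColumnNamingSketch {

    // Uses the header row as column names unless every cell in it is blank,
    // in which case the names "[column 1]", "[column 2]" etc. are generated.
    static String[] columnNames(String[] headerRow) {
        boolean allBlank = true;
        for (String cell : headerRow) {
            // Assumption: null and whitespace-only cells count as blank.
            if (cell != null && !cell.trim().isEmpty()) {
                allBlank = false;
                break;
            }
        }
        if (!allBlank) {
            return headerRow.clone();
        }
        String[] names = new String[headerRow.length];
        for (int i = 0; i < names.length; i++) {
            names[i] = "[column " + (i + 1) + "]";
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(columnNames(new String[] {"", " ", null})));
        System.out.println(Arrays.toString(columnNames(new String[] {"id", "name"})));
    }
}
```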
We hope you enjoy the new release of MetaModel and keep up the good work of providing the valuable feedback that drives development of it.
DataCleaner 1.5.1 released
We're happy to announce the release of DataCleaner version 1.5.1. This is a minor release that nevertheless contains a few nice features - especially for the users who are enjoying the exporting features that were introduced in 1.5:
- An additional HTML export format has been added to the built-in export formats (usable when exporting Profiler results in the desktop app and when executing the runjob command-line tool).
- The export format is now choosable directly in the desktop app.
- Four new measures were added to the String Analysis profile: avg. chars and max/min/avg white spaces.
The new version of DataCleaner is (as always) downloadable for free on the downloads page and feedback from users is also greatly appreciated, for example:
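For readers wondering what the four new String Analysis measures compute, here is a rough sketch. It is not the profile's real implementation; `StringMeasuresSketch`, its method names and the choice of `Character.isWhitespace` as the definition of a white space are assumptions made for illustration.

```java
// Hypothetical illustration only: not DataCleaner's String Analysis code.
class StringMeasuresSketch {

    // Counts whitespace characters in a single value.
    static int countWhitespace(String value) {
        int count = 0;
        for (int i = 0; i < value.length(); i++) {
            if (Character.isWhitespace(value.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    // Returns {avg chars, max white spaces, min white spaces, avg white spaces}.
    // Assumes non-null values; an empty input yields all zeros.
    static double[] measures(String[] values) {
        if (values.length == 0) {
            return new double[] {0, 0, 0, 0};
        }
        long totalChars = 0;
        long totalSpaces = 0;
        int maxSpaces = Integer.MIN_VALUE;
        int minSpaces = Integer.MAX_VALUE;
        for (String value : values) {
            int spaces = countWhitespace(value);
            totalChars += value.length();
            totalSpaces += spaces;
            maxSpaces = Math.max(maxSpaces, spaces);
            minSpaces = Math.min(minSpaces, spaces);
        }
        double n = values.length;
        return new double[] {totalChars / n, maxSpaces, minSpaces, totalSpaces / n};
    }

    public static void main(String[] args) {
        double[] m = measures(new String[] {"a b", "hello"});
        System.out.println("avg chars=" + m[0] + ", max ws=" + m[1]
                + ", min ws=" + m[2] + ", avg ws=" + m[3]);
    }
}
```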
- Fill out our online user survey, or
- Post your comments and questions at our discussion forum.
We hope that you all enjoy DataCleaner 1.5.1.
DataCleaner 1.5 released!
"Finally!" one might say. And this is definitely what is going through my head right as I write this news-item. Finally, DataCleaner 1.5 has been released! Once again the effort to bring about the best open source data quality solution is bearing fruit.
The new release is definitely one of the most significant ones in the history of DataCleaner. The overall goal of the release has been to step up from the shadows of the "small tools" pool and mark DataCleaner as an enterprise-ready application for profiling and validating datastores of all kinds - both in scheduled mode, on servers and in an intuitive desktop environment.
For those of you with an interest in every little detail about this release, please feel free to review the complete list of changes - for everyone else, here's the recap:
- Change of license to LGPL.
- Multi-threaded execution of Profiler and Validator.
- Command line (batch) execution of DataCleaner tasks.
- More elaborate status information during profiler and validator execution.
- New profile: Date mask matcher.
- New profile: Regex matcher.
- Load regex from the online RegexSwap repository.
- Automatic download and install of popular database drivers.
- More file types supported (.dat, .txt)
- XML file support improved (.xml)
- Memory improvements in Time analysis profile.
- Improved logging when running profiling and validation.
- Information schema provided for file-based datastores.
- Lazy-loading of columns in datastore-tree.
We hope you enjoy the new DataCleaner 1.5! Now go over and download it right away.
Data Quality Pro launches DataCleaner articles
Things are starting to shape up for the big release of DataCleaner 1.5. We are starting off with a bit of excitement in the data quality community.
Probably the most dedicated online magazine about data quality, Data Quality Pro, has launched a series of articles about profiling, validating and comparing data with DataCleaner. So far an introductory tutorial (including a complete and realistic example data-set) and a background article/interview have been published:
- Learn how to profile and validate data (for free) using DataCleaner
- Interview with Kasper Sørensen, creator of DataCleaner
We hope that you will enjoy the articles and we thank Data Quality Pro for their great interest in our community.
First commercial support company for DataCleaner and MetaModel
Today we are announcing the first company, Lund&Bendsen, to officially support DataCleaner and MetaModel on a commercial level. These eobjects.org projects are, as you know, independent projects that are run with the community in mind. But as time goes on they grow, and to help companies pick them up and start using them in a commercial setting we welcome third-party commercial support that can bring the projects to environments where community-based support is insufficient.
Lund&Bendsen is a Danish company with strong expertise in Java development and training. Their service offerings include training, customization, integration and enhancement of DataCleaner and MetaModel, so if your company is considering applying DataCleaner you might be interested in hiring some professionals to aid you in the process.
Over time more companies are expected to join in on commercial support for the eobjects.org projects. Keep up to date on the DataCleaner support page and don't hesitate to contact us with any inquiries in this regard.
Independent analysis firm points at DataCleaner for open source data quality
The Technology Evaluation Centers (TEC) has published an interesting, unbiased and independent analysis of the market for Open Source business intelligence products. We are delighted to see that the article features a section about data quality and that TEC points at DataCleaner as a competent choice among the open source products:
In such situations, where the vendor does not support a specific functionality, organizations can look to complementary open source solutions; the DataCleaner project from eobjects.org, for instance, provides functionality to help profile data and monitor data quality. It also points to a significant advantage with open source applications: the fact that software is developed by the community and for the community makes it much simpler to share innovative solutions quickly and seamlessly.
You can read the whole article by Anna Mallikarjunan from TEC by going to their website (user registration is required).
Another release candidate (2) of DataCleaner 1.5 ready for download
Another batch of updates, fixes and improvements for the upcoming DataCleaner release is ready. This time it's Release Candidate 2 offering a preview of what's to come in DataCleaner 1.5.
- DataCleaner download site: http://datacleaner.eobjects.org/downloads
The main changes since Release Candidate 1 are multithreaded execution, the command line interface (runjob.sh / runjob.cmd), some UI updates and a few bugfixes. Go download the release candidate and use it as an opportunity to influence the development process by posting your comments on the DataCleaner forum.
Release Candidate 1 of DataCleaner 1.5 out
After working hard for a couple of days to implement substantial new features regarding integration of eobjects services and automatic download and install of popular database drivers, a new release candidate of DataCleaner is ready!
- DataCleaner download site: http://datacleaner.eobjects.org/downloads
We hope that a lot of people will use the release candidate and provide feedback for further development towards the 1.5 final release.
A few screenshots of recent development
I've spent the last couple of days implementing a couple of cool enhancements to the DataCleaner desktop-application:
- Automatic download and install of popular database drivers. This is accompanied by template connection strings in the "Open database" dialog. This will hopefully make it much easier for less experienced users to set up a connection to their database of choice.
- Direct integration with the new RegexSwap system so that the regexes that you post online will be accessible from within the desktop-application.
Screenshots have been posted to the media page.
Wait for DataCleaner 1.5 for these features or build it yourself to check them out now.
MetaModel 1.1.4 released
A new release of MetaModel is ready for download. The new version, 1.1.4, is a bug-fix release that fixes a critical issue for PostgreSQL databases. Other than that, there are no changes from 1.1.3, so it should be a drop-in replacement update.
Enjoy.
- You can download an archived version
- Or get it using Maven:
<dependency>
  <groupId>dk.eobjects.metamodel</groupId>
  <artifactId>MetaModel-full</artifactId>
  <version>1.1.4</version>
</dependency>
DataCleaner launches new regex sharing subsite - RegexSwap
Only a few days after the launch of the new DataCleaner website, we are once again ready with new exciting features. This time we are launching the first edition of our new regular expression (regex) sharing subsite called "RegexSwap".
RegexSwap is a specialized forum for sharing, categorizing, commenting and voting on regular expressions that can be used in DataCleaner and other regex-based applications. It is really easy to post your own regular expressions, test them online on the website, comment and vote on the regexes that you have found useful. In time the next releases of DataCleaner will also take advantage of this online "always up to date" regex resource and offer direct integration with RegexSwap.
RegexSwap is still in beta but is ready at a functional level, which is why we are launching it publicly now. It will receive dedicated attention in the weeks and months to come.
A new website for DataCleaner
Dear everybody,
As a special Christmas present we have been working hard to design a new website for DataCleaner! Hopefully you will all enjoy the new site, which has been designed to further support our community and let it grow by incorporating more features to socialize and share ideas online. So go visit it now at the new URL:
Among the new features is a more personal profile system which is linked to some of the communities that our users already use frequently, namely LinkedIn and SourceForge. We have a whole new media section with cool screenshots and webcasts. We are also redesigning our mailing list structure. Instead of the single mailing list that we have been using so far, we are launching new "announcement" and "dev" mailing lists.
Our goal is to continuously launch new features on the website, the first one being a user survey to gain a better insight into the minds of our users and community. So be sure to fill it out. In the future we will add more exciting features such as online sharing of regular expressions and reference data for DataCleaner dictionaries.
The old website will continue to exist, but primarily as a wiki and bugtracking system. During the next couple of days we will be editing the wiki pages to make them more suitable for wiki-style editing (by everyone) as opposed to the former read-only strategy.
We hope you like our Christmas present and that you will let us know, and we wish you all a great 2009. Without a doubt, it will bring exciting times for DataCleaner and the DataCleaner community.
Maven issues and MetaModel 1.1.3-FINAL
As we were recently made aware, we have once again messed up our Maven deployments of MetaModel, sorry! If you're using Maven for your Java projects and you just updated the <dependency> tag in your POM files, replacing the version entry "1.1.2" with "1.1.3", I'm sure you ran into a lot of ClassNotFoundExceptions, because the Maven artifacts were in fact empty! We are very sorry about this poor release management situation, but here are a couple of ways that we (you) can fix this:
- You can add the eobjects maven repository to your POM. The eobjects maven repository contains valid maven artifacts so that's quite an easy fix:
<repositories>
  <repository>
    <id>eobjects-maven</id>
    <name>Eobjects repository for Maven</name>
    <url>http://datacleaner.sourceforge.net/m2-repo/</url>
  </repository>
</repositories>
- You can wait a few hours and the central maven repo will have been updated with a couple of new artifacts with the "1.1.3-FINAL" version literal. So your new dependency will look like this:
<dependency>
  <groupId>dk.eobjects.metamodel</groupId>
  <artifactId>MetaModel-full</artifactId>
  <version>1.1.3-FINAL</version>
</dependency>
MetaModel 1.1.3 released
We've just released MetaModel version 1.1.3. This is a stabilization release containing some microscopic bug fixes, specifically with regard to Schema serialization. If you're currently using any 1.1.x release of MetaModel, then you should be able to do a drop-in replacement and expect no changes to your code.
As always MetaModel is available from our downloads page and through the maven repositories.
Unless anything urgent comes up this will be the last release of the 1.1 branch of MetaModel. The next focus of MetaModel 1.2 will be to include support for more datastore formats, including dBase and improved XML tag-to-table modelling.
And of course if you have any ideas for development, don't hesitate to let us know!
New eobjects hosts, return of continuous integration!
I'm happy to announce that eobjects.org has gotten new hosts and that the troubles we have been experiencing over the last couple of months due to weird server crashes are finally over! My final word on the matter is - getting a large OSS-based J2EE environment up and running on a proprietary PowerPC platform is kind of a nasty affair! :-) So luckily we've found a better solution. This also means that we can once again say hello to our friend Hudson, the continuous integration system. While it is already online I will be tweaking it for the days to come so look out for periodic builds, test-reports and all that stuff that we all love!
Update: After some initial problems cloning the old environment, I think we have finally ruled out all the small defects. So let's have a cheer for our new PostgreSQL server (humbly hosting the trac system) and our new Hudson server:
Error in the maven repository version of MetaModel 1.1.1
We have identified a problem with the MetaModel 1.1.1 artifacts in the Maven repository, so we will be releasing MetaModel 1.1.2 very shortly. The problem was related to the upload process and caused the JARs in the repository to not contain any .class files! The downloadables from our website did not suffer from this problem, so if you're using those, you're OK.
The new maven artifacts can be downloaded using this dependency tag:
<dependency>
  <groupId>dk.eobjects.metamodel</groupId>
  <artifactId>MetaModel-full</artifactId>
  <version>1.1.2</version>
</dependency>
A single feature has been added in the 1.1.2 release - CSV and XML content is now accessible not only through files but through all kinds of input sources, including internet URLs.
Update: After some repository-synchronization waiting time the 1.1.2 release has finally been submitted to the public Maven repositories!
Minor fix release of MetaModel
We've just released MetaModel 1.1.1, the successor to the major 1.1 release!
This release is a minor fix release and you should be able to make an easy drop-in replacement of the 1.1 release. Here are the three fixes/improvements that we have been working on for the update:
- Minor bug fixed: The equals() method of SelectClause had a minor bug related to comparing the distinct property.
- Improvement: The Column and Table classes have had a getQualifiedLabel() method added. The qualified label is a dot-separated qualified name such as "MY_SCHEMA.MY_TABLE.MY_COLUMN". The qualified label can be used as a unique identifier for the column but is not necessarily directly transferable to SQL syntax.
- Improvement: Getters and setters have been added to the SelectItem class
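The idea behind the qualified label mentioned in the list above can be sketched like this. It is not the code behind getQualifiedLabel(); `QualifiedLabelSketch` and `qualifiedLabel` are hypothetical names, and skipping missing parts (such as a table without a schema) is an assumption.

```java
// Hypothetical illustration only: not the implementation of getQualifiedLabel().
class QualifiedLabelSketch {

    // Joins the non-empty name parts with dots, e.g. schema, table and column.
    static String qualifiedLabel(String... parts) {
        StringBuilder label = new StringBuilder();
        for (String part : parts) {
            // Assumption: missing parts are simply left out of the label.
            if (part == null || part.isEmpty()) {
                continue;
            }
            if (label.length() > 0) {
                label.append('.');
            }
            label.append(part);
        }
        return label.toString();
    }

    public static void main(String[] args) {
        System.out.println(qualifiedLabel("MY_SCHEMA", "MY_TABLE", "MY_COLUMN"));
        System.out.println(qualifiedLabel(null, "MY_TABLE", "MY_COLUMN"));
    }
}
```

As the announcement notes, a label like this works as an identifier but is not necessarily valid SQL, since no quoting or escaping is applied.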
MetaModel 1.1 released!
As I write this newsitem I'm uploading the new version of MetaModel to the maven repositories! So let me take the time to tell you what's new in this release.
First of all I'd like to say that this is really a release with lots of fundamental changes and we have sacrificed backwards-compatibility at some places, so be sure to check that everything is working exactly as before. That said - those things that we have changed will also cause you compilation problems, so if you do a drop-in replacement and your build fails, then it's because the features have changed. We think this is the easiest way for everybody to deal with changes - it's a lot more obvious that you need to do something if it's really keeping your application from working! The good thing is that the new MetaModel provides a lot of great improvements and new features!
Here's a sum-up of the changes made to MetaModel from version 1.0 to 1.1:
- We've done a major restructuring of the project as to make it more modular and easier to figure out.
- This also means that the way you create DataContext objects has changed. In 1.0 you used the constructor of DataContext. This approach has been replaced by a factory class, which does all the instantiation and initialization stuff for you: DataContextFactory.
- The MetaModel project is now LGPL licensed instead of using the Apache License version 2.0. For more info see MetaModelLicense.
- The built-in query-engine, "Query postprocessor", which is used to serve CSV, Excel and XML content, has gone through numerous improvements to performance and functionality.
- Column types can now be detected, narrowed and transformed using the Query postprocessor engine. This means that you can use the engine to detect and retrieve Integer, Double, Date, Time and Boolean types as well as the old String-based values, even from text-only datastores such as CSV files.
- The JDBC datastores now have a query rewriter component which allows for optimization of queries using native SQL-syntax.
- Query postprocessor now also generates information schemas used to investigate metadata about CSV, Excel and XML files.
- Database compliance has grown steadily during development and will keep doing so going forward. You can check the supported databases here: MetaModelCompliancy
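To illustrate the kind of column type detection described in the list above, here is a small sketch that picks the narrowest type fitting all values of a text-only column. It is not the Query postprocessor's implementation: `TypeDetectionSketch`, `DetectedType` and `detect` are hypothetical names, and the sketch only covers Integer, Double and Boolean before falling back to String (Date and Time are left out).

```java
// Hypothetical illustration only: not the Query postprocessor's detection logic.
class TypeDetectionSketch {

    enum DetectedType { INTEGER, DOUBLE, BOOLEAN, STRING }

    static boolean isInteger(String value) {
        try {
            Integer.parseInt(value);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    static boolean isDouble(String value) {
        try {
            // Note: parseDouble also accepts forms such as "NaN" or "1e3".
            Double.parseDouble(value);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    static boolean isBoolean(String value) {
        return "true".equalsIgnoreCase(value) || "false".equalsIgnoreCase(value);
    }

    // Returns the narrowest type that every (non-null) value can be parsed as.
    static DetectedType detect(String[] values) {
        if (values.length == 0) {
            return DetectedType.STRING;
        }
        boolean allInteger = true;
        boolean allDouble = true;
        boolean allBoolean = true;
        for (String raw : values) {
            String value = raw.trim();
            allInteger &= isInteger(value);
            allDouble &= isDouble(value);
            allBoolean &= isBoolean(value);
        }
        if (allInteger) {
            return DetectedType.INTEGER;
        }
        if (allDouble) {
            return DetectedType.DOUBLE;
        }
        if (allBoolean) {
            return DetectedType.BOOLEAN;
        }
        return DetectedType.STRING;
    }

    public static void main(String[] args) {
        System.out.println(detect(new String[] {"1", "42", "-7"}));
        System.out.println(detect(new String[] {"1", "2.5"}));
        System.out.println(detect(new String[] {"true", "FALSE"}));
        System.out.println(detect(new String[] {"1", "abc"}));
    }
}
```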
All in all I think this release marks a high degree of maturity for the MetaModel project and we're very proud to present it to you!
DataCleaner 1.5 "snapshot" released
As we're moving steadily along towards the release of DataCleaner 1.5 we are fixing a few bugs and enhancing a lot of features. This leads to the desire to release our work since practically nothing has undergone changes that could destabilize the application since the 1.4 release. So today we're releasing DataCleaner 1.5 "snapshot". This also marks the first release under our new LGPL license.
Here are the changes from 1.4 so far:
- Change of license to LGPL.
- New profile: Date mask matcher.
- New profile: Regex matcher.
- More file types supported (.dat, .txt)
- XML file support improved (.xml)
Although this is in principle a development/beta release, we feel that it would be worth working with for most of your profiling needs. So... Go on, download it, tell us what you think and we'll see you around!
Eobjects announces change in preferred license
We've made a principal decision at eobjects.org to change the preferred license of our projects from the Apache License 2.0 to the Lesser General Public License (LGPL).
The main difference between the two licenses is that the LGPL requires any modifications to be contributed back to the Open Source community (i.e. licensed under a similar license; LGPL or GPL). The eobjects.org projects gain the obvious advantage of the LGPL: it ensures that improvements are submitted back to the projects. This also means that we don't risk anyone selling modified versions of our projects. It is still just as appropriate to use the projects as a part of commercial applications, but any modifications must be contributed back to the community.
- The Apache License 2.0: http://www.apache.org/licenses/LICENSE-2.0
- Lesser General Public License: http://www.gnu.org/licenses/lgpl-3.0.txt
Initially this change in license will affect the two flagship projects of eobjects.org: DataCleaner and MetaModel. This means that the next versions of these projects (DataCleaner 1.5 and MetaModel 1.1 respectively) will be LGPL licensed. Also, new projects will be LGPL licensed unless special circumstances suggest otherwise.
Go watch the new appetizer webcast of DataCleaner 1.4
We've just uploaded a webcast of the new DataCleaner 1.4 which provides a long awaited update for the old 0.4 webcasts!
Go enjoy the webcast - and be sure to download the newest version of DataCleaner. Over and out!
DataCleaner 1.4 released!
I'm pleased to announce the release of DataCleaner 1.4! This is a release that we feel will satisfy a lot of users with improvements and fixes for a lot of issues. Here's a very short compilation of changes; for more details, take a look at the roadmap.
- Replaced "Repeated values" profile with better and more advanced "Value distribution" profile.
- Dictionary matcher drill-to-details options.
- New application logo.
- Lots of small bugfixes and UI beautifications.
- Lots of sample dictionaries and regexes.
We hope you enjoy the new version of DataCleaner - Get it now!
Two new releases planned for DataCleaner
After some consideration about the future of DataCleaner, we've updated the roadmap to reflect our current plans for the direction of development. We are planning on releasing DataCleaner 1.4 by the end of the month and after that two new milestones have been added:
- DataCleaner 1.5: The main focus of this release is to provide a command line interface for our data quality framework. This means that users will be able to easily create batch jobs that they can schedule using their favorite scheduler. Other features will also include Pattern Finder improvements and a couple of new profiles.
- DataCleaner 1.6: We have a lot of suggestions that have been filling up our backlog. DataCleaner 1.6 will be all about getting everybody's needs into the application before we get ready to begin the webapp. Some of the exciting features of DataCleaner 1.6 will be relationship profiling and exporting of results.
Kasper Sørensen presenting DataCleaner at Open Source Days '08
Great news everybody. The Open Source Days '08 conference in Copenhagen will feature a so-called Lightning Speak by Kasper Sørensen on the topic of DataCleaner and the eobjects.org community.
We're really happy to get the message of DataCleaner out to more people and a conference like this is an ideal spot for demonstrations, discussions and experiences. Read more about the lightning speak at Kasper's blog:
Update: The presentation is over and you can now also read the retrospective at Kasper's blog:
eobjects.org has been acquired
During the last year eobjects.dk has grown rapidly and attracted a lot of attention, both from Denmark, where the community was originally founded, and internationally from users and contributors in all parts of the world. We believe that this worldwide interest in eobjects should be reflected in the website name and address, which is why we have acquired the eobjects.org domain name as of today! Eobjects.dk will remain and the domain names are exact aliases, but going forward we will undergo a gradual name change from .dk to .org. This will be reflected in several ways:
- The official name of the website will change to eobjects.org.
- For the sake of compatibility we will not change the package names of our Java classes just yet. Only major version releases will include such changes (i.e. wait for DataCleaner 2.0 and MetaModel 1.1).
- The same principle goes for our Maven artifacts. In time they will probably change, but this also depends on the repository crew at Apache.
We are happy that we now have a domain name that symbolizes the international appeal of our software, and we hope that it will strengthen the community with a similarly global culture and sense of vitality.
MetaModel 1.0.7 is out!
We're happy to announce that MetaModel 1.0.7 has just been released! The new release should be a drop-in replacement with minor improvements and bugfixes:
- Improved memory handling and fixed a very slight memory leak (#191).
- Added support for RIGHT JOIN when using the embedded query engine (#175).
- XML support has been improved with more precise column types (#176).
You can download the new MetaModel at our Google Code download site. For all you Maven people out there, here's the update to your POM:
<dependency>
  <groupId>dk.eobjects.commons</groupId>
  <artifactId>MetaModel</artifactId>
  <version>1.0.7</version>
</dependency>
Enjoy.
Development/snapshot release of DataCleaner 1.4
We've released a development/snapshot release of DataCleaner 1.4 in order to get early reactions to all the improvements and new features, as well as to support our users with up-to-date functionality. In my opinion the development release is just as stable and "safe to use" as 1.3, but of course it lacks a bit of the manual testing that we put into the real releases.
You can download the development release at our SourceForge download site.
Here's a short list of fixes since DataCleaner 1.3:
- Better memory handling and garbage collection
- Reference columns in drill-to-details windows
- Better error handling when loading schemas
- Quoting of string values in visualized tables (in order to distinguish empty strings and white spaces)
- New profile: Value Distribution, which is an improved version of the Repeated Values profile. The Value Distribution profile has an option to configure the top/bottom n values to include in the result.
- Better control of profile result column width.
- Bugfix: Copy to clipboard functions now work properly.
- Bugfix: Scrollbars added to visualized tables.
Take a look at the roadmap for more current developments of DataCleaner.
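To illustrate what a top/bottom n value distribution means, here is a minimal sketch in plain Java. It is not DataCleaner's implementation or API: the class `ValueDistributionSketch`, its methods and the sample values are hypothetical and made up for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of DataCleaner.
public class ValueDistributionSketch {

    // Count how often each distinct value occurs (values are assumed non-null).
    public static Map<String, Integer> countValues(List<String> values) {
        Map<String, Integer> counts = new HashMap<>();
        for (String value : values) {
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    // Return up to n values ordered by frequency; ties are broken alphabetically.
    private static List<String> rank(Map<String, Integer> counts, int n, boolean mostFrequentFirst) {
        List<String> values = new ArrayList<>(counts.keySet());
        values.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(a), counts.get(b));
            if (mostFrequentFirst) {
                byCount = -byCount;
            }
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        int limit = Math.max(0, Math.min(n, values.size()));
        return new ArrayList<>(values.subList(0, limit));
    }

    // The n most frequent values.
    public static List<String> top(Map<String, Integer> counts, int n) {
        return rank(counts, n, true);
    }

    // The n least frequent values.
    public static List<String> bottom(Map<String, Integer> counts, int n) {
        return rank(counts, n, false);
    }

    public static void main(String[] args) {
        // Invented sample column, e.g. country codes.
        List<String> column = Arrays.asList("DK", "DK", "DK", "US", "US", "DE", "SE");
        Map<String, Integer> counts = countValues(column);
        for (String value : top(counts, 2)) {
            System.out.println("top: " + value + " x " + counts.get(value));
        }
        for (String value : bottom(counts, 2)) {
            System.out.println("bottom: " + value + " x " + counts.get(value));
        }
    }
}
```

In this sketch n is simply a method argument. In the real profile it is a configuration option, as described in the list above.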
Welcome to the new eobjects.dk website
After a great deal of work we're happy to announce the launch of the new eobjects.dk website at our new server host! Thanks to Copenhagen Business School we now have much better bandwidth available as well as more powerful hardware. Take a look around: a lot of things have changed, but the important stuff is still the same.
- The most remarkable change is probably what you're looking at right now - the News page! With the News page we'll be sure to keep you updated with all that goes on at eobjects.dk - project releases, roadmap changes, events, visions and goals etc. etc.
- There's a new left-hand side menu to ease navigation. We've created a new Docs page and a Downloads page for quick access to common inquiries. You'll also notice that the projects have been highlighted in the menu to give a better overview of our work.
- For contributors and developers: the Hudson continuous integration server has not been migrated yet, so we hope you have the patience and discipline to live without CI for a couple of days.
We hope that you like the new website. If there's anything you'd like to comment on or anything that doesn't work as it should, please don't hesitate to go to the discussion forums and point it out to us! We will then make sure that the new website lives up to all the hopes we have for it.
