pgloader, the PostgreSQL parallel ETL, in Python
What does such a mouthful of a title mean? The easiest way to find out is to check out the slides from the first PostgreSQL European Conference, held in Prato, or to read the following two paragraphs:
The PostgreSQL parallel ETL, written in Python. This little piece of software
is a layer atop the PostgreSQL COPY command. The difference between pgloader
and plain COPY is the behavior in case of errors in the source data files
and the range of input formats supported.
There is also the little fact that pgloader is able to parallelize its tasks,
loading several files at once or even using more than one CPU to load a single
source file.
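To make the "behavior in case of errors" point concrete: a plain COPY aborts the whole load as soon as one input row is bad. One common way to layer error handling on top of it is to retry a failed batch in halves until the offending rows are isolated. The sketch below illustrates that general idea only and is not taken from pgloader's source. The names load_with_rejects and make_fake_copy are hypothetical, and the in-memory fake_copy merely stands in for a real COPY call so that the example runs without a database.

```python
from typing import Callable, List, Sequence, Tuple

Row = Tuple[str, ...]


def load_with_rejects(
    rows: Sequence[Row],
    copy_batch: Callable[[Sequence[Row]], None],
) -> Tuple[int, List[Row]]:
    """Load rows through copy_batch, isolating the rows that make a batch fail.

    copy_batch is a hypothetical stand-in for one COPY call: it either loads
    the whole batch or raises ValueError and loads nothing.
    Returns (number of rows loaded, list of rejected rows).
    """
    if not rows:
        return 0, []
    try:
        copy_batch(rows)
        return len(rows), []
    except ValueError:
        if len(rows) == 1:
            # A single failing row cannot be split further: reject it.
            return 0, [rows[0]]
        # Split the failed batch in two and retry each half on its own.
        mid = len(rows) // 2
        loaded_left, rejected_left = load_with_rejects(rows[:mid], copy_batch)
        loaded_right, rejected_right = load_with_rejects(rows[mid:], copy_batch)
        return loaded_left + loaded_right, rejected_left + rejected_right


def make_fake_copy(table: List[Row]) -> Callable[[Sequence[Row]], None]:
    """Build an in-memory stand-in for COPY into an (id integer, name text) table."""

    def fake_copy(batch: Sequence[Row]) -> None:
        for row in batch:
            int(row[0])  # raises ValueError on a malformed id, like a failed COPY
        table.extend(batch)  # only reached when every row in the batch is valid

    return fake_copy


if __name__ == "__main__":
    table: List[Row] = []
    rows = [("1", "a"), ("x", "b"), ("3", "c"), ("4", "d")]
    loaded, rejected = load_with_rejects(rows, make_fake_copy(table))
    print(loaded, rejected)
```

With a real database, copy_batch would wrap an actual COPY statement, and the rejected rows would typically be written to a separate file for later inspection. The halving strategy keeps the number of extra COPY attempts small when bad rows are rare.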
Documentation
The fine manual of pgloader is online at the http://pgfoundry web site.
Tutorial
I should write a pgloader tutorial some day. Maybe.
Examples
Check them out in your distribution of pgloader, or directly browse the
pgfoundry CVS web.
Contributions
Any idea or comment is welcome, in the form of a simple mail, a documentation patch, or a source code patch. Don't be shy!
As a matter of fact, several of the most advanced features of pgloader are there only because some intrepid user asked for them. Join the fun and become one of those users anytime soon.

