John Karr on 5 Apr 2010 22:29:11 -0700

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

Re: [PLUG] postgres data loading

Someone on #postgres recommended a python script, pgloader, which I wasn't
able to get to work and I also tried out the incomplete perl port of the
script, which said it worked but didn't load any data. In the meantime I
observed that "copy from" is about 1000 times as fast as individual inserts.
So rather than try to understand the code of someone else's large and
complex program I've been writing my own, smaller, simpler pg_bulkload
program, it partially works at this point. If I can polish it up enough, I
will submit it to CPAN and pgfoundry. 

Needless to say I'm still unhappy that I wasted a bunch of time searching
for a solution to a problem that is solved in mySQL and all 3 commercial SQL
products, and then ended up having to write my own solution. The only good
that will come of it is, that it's about time I wrote a cpan module. 

-----Original Message-----
[] On Behalf Of sean finney
Sent: Monday, April 05, 2010 4:24 AM
To: Philadelphia Linux User's Group Discussion List
Subject: Re: [PLUG] postgres data loading

On Fri, Apr 02, 2010 at 02:53:54PM -0400, John Karr wrote:
> Any ideas on how to do this load faster? Normally this load would 
> occur about once every 3 months, but in times peak activity it could 
> be weekly and it wouldn't be acceptable to have the application down 
> for a day for every load.

what postgres version?  8.3 or 8.4 offers PITR recovery which would probably
be the fastest way to replace one large dataset with another (think of it
like transferring a journalled version of the datadir between hosts).


Philadelphia Linux Users Group         --
Announcements -
General Discussion  --