[ale] perl/postgresql question
Jim Kinney
jim.kinney at gmail.com
Thu Mar 31 16:52:42 EDT 2011
I was searching for a clue and saw the post on the pg list :-) Got all
excited thinking I had found a solution to point to you to! Then I realized,
"Hey! This looks REALLY similar..."
Maybe a single run then a grep for non-ascii characters?
On Thu, Mar 31, 2011 at 4:11 PM, Geoffrey Myers <lists at serioustechnology.com
> wrote:
> Jim Kinney wrote:
> > The "g" _should_ do it. Quite puzzling.
> > Can you do a run and find an example of when it fails? Is it always 0xbd?
>
> The sad thing is, all it tells me is which table in the database has the
> error. It does not identify the field. I've posted to the Postgres
> list to see if there's any way to narrow down the culprit, there is not.
>
> Once it fails, it stops loading that table, so I never get pass this error.
>
> When I first caught this error, I was simply trying to load an SQL_ASCII
> database dump into a UTF8 database. That's when I wrote my cleandata.pl
> code. The code cleans 62750 records. I just don't understand why it
> doesn't fix this one.
>
> > Maybe someone in Tibet doesn't like your code :-) (It's a Tibetan
> > character code in utf8)
>
> But what is it in SQL_ASCII? What happened is, some enterprising users
> figured out how to enter special characters from the keyboard, say, like
> a degrees symbol. That goes in to the SQL_ASCII database fine, but when
> you try to load that into the UTF8 encoded database, it pukes.
>
> >
> > On Thu, Mar 31, 2011 at 2:59 PM, Geoffrey Myers
> > <lists at serioustechnology.com <mailto:lists at serioustechnology.com>>
> wrote:
> >
> > Jim Kinney wrote:
> > > should it be replaced with nothing or with a NULL? Does the perl
> > pointer
> > > jump to the next character automatically when a match is made thus
> > > skipping a character in your replace? Does a double run fix it?
> >
> > Missed the first question. It should be removed, not replaced with
> an
> > null. For example I have:
> >
> > This is my <UGLY CHARACTER>data
> >
> > Should then be:
> >
> > This is my data
> >
> > >
> > > On Thu, Mar 31, 2011 at 12:04 PM, Geoffrey Myers
> > > <lists at serioustechnology.com <mailto:lists at serioustechnology.com>
> > <mailto:lists at serioustechnology.com
> > <mailto:lists at serioustechnology.com>>> wrote:
> > >
> > > So, we are trying to convert a number of postgresql databases
> > that were
> > > created with SQL_ASCII encoding to UTF8 encoding. As such, I
> > need to
> > > strip certain characters out of the data before dumping and
> > reloading
> > > the new databases. I'm using the following:
> > >
> > > data =~ s/(.)/((ord($1) >= 0) && (ord($1) <= 8))
> > > || (ord($1) == 11)
> > > || ((ord($1) >= 13) && (ord($1) <= 31))
> > > || ((ord($1) >= 127)) ?"": $1/egs;
> > >
> > > Yet I'm getting the following error, which indicates a
> > character that
> > > should be handled by the above code is still in the data:
> > >
> > > pg_restore: [archiver (db)] COPY failed: ERROR: invalid byte
> > sequence
> > > for encoding "UTF8": 0xbd
> > >
> > >
> > > Certainly, the above code should replace the 0xbd with
> nothing?
> > >
> > > Any perl/postgres gurus out there?
> > >
> > > --
> > > Until later, Geoffrey
> > >
> > > "I predict future happiness for America if they can prevent
> > > the government from wasting the labors of the people under
> > > the pretense of taking care of them."
> > > - Thomas Jefferson
> > > _______________________________________________
> > > Ale mailing list
> > > Ale at ale.org <mailto:Ale at ale.org> <mailto:Ale at ale.org
> > <mailto:Ale at ale.org>>
> > > http://mail.ale.org/mailman/listinfo/ale
> > > See JOBS, ANNOUNCE and SCHOOLS lists at
> > > http://mail.ale.org/mailman/listinfo
> > >
> > >
> > >
> > >
> > > --
> > > --
> > > James P. Kinney III
> > > I would rather stumble along in freedom than walk effortlessly in
> > chains.
> > >
> > >
> > >
> > >
> >
> ------------------------------------------------------------------------
> > >
> > > _______________________________________________
> > > Ale mailing list
> > > Ale at ale.org <mailto:Ale at ale.org>
> > > http://mail.ale.org/mailman/listinfo/ale
> > > See JOBS, ANNOUNCE and SCHOOLS lists at
> > > http://mail.ale.org/mailman/listinfo
> >
> >
> > --
> > Until later, Geoffrey
> >
> > "I predict future happiness for America if they can prevent
> > the government from wasting the labors of the people under
> > the pretense of taking care of them."
> > - Thomas Jefferson
> > _______________________________________________
> > Ale mailing list
> > Ale at ale.org <mailto:Ale at ale.org>
> > http://mail.ale.org/mailman/listinfo/ale
> > See JOBS, ANNOUNCE and SCHOOLS lists at
> > http://mail.ale.org/mailman/listinfo
> >
> >
> >
> >
> > --
> > --
> > James P. Kinney III
> > I would rather stumble along in freedom than walk effortlessly in chains.
> >
> >
> >
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > Ale mailing list
> > Ale at ale.org
> > http://mail.ale.org/mailman/listinfo/ale
> > See JOBS, ANNOUNCE and SCHOOLS lists at
> > http://mail.ale.org/mailman/listinfo
>
>
> --
> Until later, Geoffrey
>
> "I predict future happiness for America if they can prevent
> the government from wasting the labors of the people under
> the pretense of taking care of them."
> - Thomas Jefferson
> _______________________________________________
> Ale mailing list
> Ale at ale.org
> http://mail.ale.org/mailman/listinfo/ale
> See JOBS, ANNOUNCE and SCHOOLS lists at
> http://mail.ale.org/mailman/listinfo
>
--
--
James P. Kinney III
I would rather stumble along in freedom than walk effortlessly in chains.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.ale.org/pipermail/ale/attachments/20110331/6d66bab7/attachment-0001.html
More information about the Ale
mailing list