I was searching for a clue and saw the post on the pg list :-) Got all excited thinking I had found a solution to point to you to! Then I realized, "Hey! This looks REALLY similar..."<br><br>Maybe a single run then a grep for non-ascii characters?<br>
<br><div class="gmail_quote">On Thu, Mar 31, 2011 at 4:11 PM, Geoffrey Myers <span dir="ltr"><<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
<div class="im">Jim Kinney wrote:<br>
> The "g" _should_ do it. Quite puzzling.<br>
> Can you do a run and find an example of when it fails? Is it always 0xbd?<br>
<br>
</div>The sad thing is, all it tells me is which table in the database has the<br>
error. It does not identify the field. I've posted to the Postgres<br>
list to see if there's any way to narrow down the culprit, there is not.<br>
<br>
Once it fails, it stops loading that table, so I never get pass this error.<br>
<br>
When I first caught this error, I was simply trying to load an SQL_ASCII<br>
database dump into a UTF8 database. That's when I wrote my <a href="http://cleandata.pl" target="_blank">cleandata.pl</a><br>
code. The code cleans 62750 records. I just don't understand why it<br>
doesn't fix this one.<br>
<div class="im"><br>
> Maybe someone in Tibet doesn't like your code :-) (It's a Tibetan<br>
> character code in utf8)<br>
<br>
</div>But what is it in SQL_ASCII? What happened is, some enterprising users<br>
figured out how to enter special characters from the keyboard, say, like<br>
a degrees symbol. That goes in to the SQL_ASCII database fine, but when<br>
you try to load that into the UTF8 encoded database, it pukes.<br>
<div class="im"><br>
><br>
> On Thu, Mar 31, 2011 at 2:59 PM, Geoffrey Myers<br>
</div><div class="im">> <<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a> <mailto:<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a>>> wrote:<br>
><br>
> Jim Kinney wrote:<br>
> > should it be replaced with nothing or with a NULL? Does the perl<br>
> pointer<br>
> > jump to the next character automatically when a match is made thus<br>
> > skipping a character in your replace? Does a double run fix it?<br>
><br>
> Missed the first question. It should be removed, not replaced with an<br>
> null. For example I have:<br>
><br>
> This is my <UGLY CHARACTER>data<br>
><br>
> Should then be:<br>
><br>
> This is my data<br>
><br>
> ><br>
> > On Thu, Mar 31, 2011 at 12:04 PM, Geoffrey Myers<br>
> > <<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a> <mailto:<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a>><br>
</div>> <mailto:<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a><br>
<div><div></div><div class="h5">> <mailto:<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a>>>> wrote:<br>
> ><br>
> > So, we are trying to convert a number of postgresql databases<br>
> that were<br>
> > created with SQL_ASCII encoding to UTF8 encoding. As such, I<br>
> need to<br>
> > strip certain characters out of the data before dumping and<br>
> reloading<br>
> > the new databases. I'm using the following:<br>
> ><br>
> > data =~ s/(.)/((ord($1) >= 0) && (ord($1) <= 8))<br>
> > || (ord($1) == 11)<br>
> > || ((ord($1) >= 13) && (ord($1) <= 31))<br>
> > || ((ord($1) >= 127)) ?"": $1/egs;<br>
> ><br>
> > Yet I'm getting the following error, which indicates a<br>
> character that<br>
> > should be handled by the above code is still in the data:<br>
> ><br>
> > pg_restore: [archiver (db)] COPY failed: ERROR: invalid byte<br>
> sequence<br>
> > for encoding "UTF8": 0xbd<br>
> ><br>
> ><br>
> > Certainly, the above code should replace the 0xbd with nothing?<br>
> ><br>
> > Any perl/postgres gurus out there?<br>
> ><br>
> > --<br>
> > Until later, Geoffrey<br>
> ><br>
> > "I predict future happiness for America if they can prevent<br>
> > the government from wasting the labors of the people under<br>
> > the pretense of taking care of them."<br>
> > - Thomas Jefferson<br>
> > _______________________________________________<br>
> > Ale mailing list<br>
</div></div>> > <a href="mailto:Ale@ale.org">Ale@ale.org</a> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a>> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a><br>
<div><div></div><div class="h5">> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a>>><br>
> > <a href="http://mail.ale.org/mailman/listinfo/ale" target="_blank">http://mail.ale.org/mailman/listinfo/ale</a><br>
> > See JOBS, ANNOUNCE and SCHOOLS lists at<br>
> > <a href="http://mail.ale.org/mailman/listinfo" target="_blank">http://mail.ale.org/mailman/listinfo</a><br>
> ><br>
> ><br>
> ><br>
> ><br>
> > --<br>
> > --<br>
> > James P. Kinney III<br>
> > I would rather stumble along in freedom than walk effortlessly in<br>
> chains.<br>
> ><br>
> ><br>
> ><br>
> ><br>
> ------------------------------------------------------------------------<br>
> ><br>
> > _______________________________________________<br>
> > Ale mailing list<br>
> > <a href="mailto:Ale@ale.org">Ale@ale.org</a> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a>><br>
> > <a href="http://mail.ale.org/mailman/listinfo/ale" target="_blank">http://mail.ale.org/mailman/listinfo/ale</a><br>
> > See JOBS, ANNOUNCE and SCHOOLS lists at<br>
> > <a href="http://mail.ale.org/mailman/listinfo" target="_blank">http://mail.ale.org/mailman/listinfo</a><br>
><br>
><br>
> --<br>
> Until later, Geoffrey<br>
><br>
> "I predict future happiness for America if they can prevent<br>
> the government from wasting the labors of the people under<br>
> the pretense of taking care of them."<br>
> - Thomas Jefferson<br>
> _______________________________________________<br>
> Ale mailing list<br>
> <a href="mailto:Ale@ale.org">Ale@ale.org</a> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a>><br>
> <a href="http://mail.ale.org/mailman/listinfo/ale" target="_blank">http://mail.ale.org/mailman/listinfo/ale</a><br>
> See JOBS, ANNOUNCE and SCHOOLS lists at<br>
> <a href="http://mail.ale.org/mailman/listinfo" target="_blank">http://mail.ale.org/mailman/listinfo</a><br>
><br>
><br>
><br>
><br>
> --<br>
> --<br>
> James P. Kinney III<br>
> I would rather stumble along in freedom than walk effortlessly in chains.<br>
><br>
><br>
><br>
> ------------------------------------------------------------------------<br>
><br>
> _______________________________________________<br>
> Ale mailing list<br>
> <a href="mailto:Ale@ale.org">Ale@ale.org</a><br>
> <a href="http://mail.ale.org/mailman/listinfo/ale" target="_blank">http://mail.ale.org/mailman/listinfo/ale</a><br>
> See JOBS, ANNOUNCE and SCHOOLS lists at<br>
> <a href="http://mail.ale.org/mailman/listinfo" target="_blank">http://mail.ale.org/mailman/listinfo</a><br>
<br>
<br>
--<br>
Until later, Geoffrey<br>
<br>
"I predict future happiness for America if they can prevent<br>
the government from wasting the labors of the people under<br>
the pretense of taking care of them."<br>
- Thomas Jefferson<br>
_______________________________________________<br>
Ale mailing list<br>
<a href="mailto:Ale@ale.org">Ale@ale.org</a><br>
<a href="http://mail.ale.org/mailman/listinfo/ale" target="_blank">http://mail.ale.org/mailman/listinfo/ale</a><br>
See JOBS, ANNOUNCE and SCHOOLS lists at<br>
<a href="http://mail.ale.org/mailman/listinfo" target="_blank">http://mail.ale.org/mailman/listinfo</a><br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br>-- <br>James P. Kinney III<br>I would rather stumble along in freedom than walk effortlessly in chains.<br><br><br>