<p>Try a single table dump then feed through perl and output to a file. Now attempt the pgrestore to find which table is choking. I'm starting to think the problem is the restore not the perl. <br>
I've never had to do a character translation like this before on a database. My pg stuff always goes to files as an intermediate step unless it's an emergency replication.<br>
Need to do a search for the offending character in the pgdump.</p>
<div class="gmail_quote">On Apr 1, 2011 7:45 AM, "Geoffrey Myers" <<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a>> wrote:<br type="attribution">> Jim Kinney wrote:<br>>> I was searching for a clue and saw the post on the pg list :-) Got all <br>
>> excited thinking I had found a solution to point to you to! Then I <br>>> realized, "Hey! This looks REALLY similar..."<br>>> <br>>> Maybe a single run then a grep for non-ascii characters?<br>
> <br>> Not sure what you're suggesting. The perl code is pulling the data out <br>> of the database, cleaning it, putting it back in. Are you suggesting <br>> pulling the data out and piping to grep? Are you thinking there's a bug <br>
> in the perl code?<br>> <br>>> <br>>> On Thu, Mar 31, 2011 at 4:11 PM, Geoffrey Myers <br>>> <<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a> <mailto:<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a>>> wrote:<br>
>> <br>>> Jim Kinney wrote:<br>>> > The "g" _should_ do it. Quite puzzling.<br>>> > Can you do a run and find an example of when it fails? Is it<br>>> always 0xbd?<br>
>> <br>>> The sad thing is, all it tells me is which table in the database has the<br>>> error. It does not identify the field. I've posted to the Postgres<br>>> list to see if there's any way to narrow down the culprit, there is not.<br>
>> <br>>> Once it fails, it stops loading that table, so I never get pass this<br>>> error.<br>>> <br>>> When I first caught this error, I was simply trying to load an SQL_ASCII<br>
>> database dump into a UTF8 database. That's when I wrote my<br>>> <a href="http://cleandata.pl">cleandata.pl</a> <<a href="http://cleandata.pl">http://cleandata.pl</a>><br>>> code. The code cleans 62750 records. I just don't understand why it<br>
>> doesn't fix this one.<br>>> <br>>> > Maybe someone in Tibet doesn't like your code :-) (It's a Tibetan<br>>> > character code in utf8)<br>>> <br>>> But what is it in SQL_ASCII? What happened is, some enterprising users<br>
>> figured out how to enter special characters from the keyboard, say, like<br>>> a degrees symbol. That goes in to the SQL_ASCII database fine, but when<br>>> you try to load that into the UTF8 encoded database, it pukes.<br>
>> <br>>> ><br>>> > On Thu, Mar 31, 2011 at 2:59 PM, Geoffrey Myers<br>>> > <<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a> <mailto:<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a>><br>
>> <mailto:<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a><br>>> <mailto:<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a>>>> wrote:<br>
>> ><br>>> > Jim Kinney wrote:<br>>> > > should it be replaced with nothing or with a NULL? Does<br>>> the perl<br>>> > pointer<br>>> > > jump to the next character automatically when a match is<br>
>> made thus<br>>> > > skipping a character in your replace? Does a double run<br>>> fix it?<br>>> ><br>>> > Missed the first question. It should be removed, not<br>
>> replaced with an<br>>> > null. For example I have:<br>>> ><br>>> > This is my <UGLY CHARACTER>data<br>>> ><br>>> > Should then be:<br>
>> ><br>>> > This is my data<br>>> ><br>>> > ><br>>> > > On Thu, Mar 31, 2011 at 12:04 PM, Geoffrey Myers<br>>> > > <<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a><br>
>> <mailto:<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a>><br>>> <mailto:<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a><br>>> <mailto:<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a>>><br>
>> > <mailto:<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a><br>>> <mailto:<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a>><br>
>> > <mailto:<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a><br>>> <mailto:<a href="mailto:lists@serioustechnology.com">lists@serioustechnology.com</a>>>>> wrote:<br>
>> > ><br>>> > > So, we are trying to convert a number of postgresql<br>>> databases<br>>> > that were<br>>> > > created with SQL_ASCII encoding to UTF8 encoding. As<br>
>> such, I<br>>> > need to<br>>> > > strip certain characters out of the data before<br>>> dumping and<br>>> > reloading<br>>> > > the new databases. I'm using the following:<br>
>> > ><br>>> > > data =~ s/(.)/((ord($1) >= 0) && (ord($1) <= 8))<br>>> > > || (ord($1) == 11)<br>>> > > || ((ord($1) >= 13) && (ord($1) <= 31))<br>
>> > > || ((ord($1) >= 127)) ?"": $1/egs;<br>>> > ><br>>> > > Yet I'm getting the following error, which indicates a<br>
>> > character that<br>>> > > should be handled by the above code is still in the data:<br>>> > ><br>>> > > pg_restore: [archiver (db)] COPY failed: ERROR:<br>
>> invalid byte<br>>> > sequence<br>>> > > for encoding "UTF8": 0xbd<br>>> > ><br>>> > ><br>>> > > Certainly, the above code should replace the 0xbd with<br>
>> nothing?<br>>> > ><br>>> > > Any perl/postgres gurus out there?<br>>> > ><br>>> > > --<br>>> > > Until later, Geoffrey<br>
>> > ><br>>> > > "I predict future happiness for America if they can<br>>> prevent<br>>> > > the government from wasting the labors of the people under<br>
>> > > the pretense of taking care of them."<br>>> > > - Thomas Jefferson<br>>> > > _______________________________________________<br>>> > > Ale mailing list<br>
>> > > <a href="mailto:Ale@ale.org">Ale@ale.org</a> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a>> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a><br>>> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a>>> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a>><br>
>> > <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a>>>><br>>> > > <a href="http://mail.ale.org/mailman/listinfo/ale">http://mail.ale.org/mailman/listinfo/ale</a><br>
>> > > See JOBS, ANNOUNCE and SCHOOLS lists at<br>>> > > <a href="http://mail.ale.org/mailman/listinfo">http://mail.ale.org/mailman/listinfo</a><br>>> > ><br>
>> > ><br>>> > ><br>>> > ><br>>> > > --<br>>> > > --<br>>> > > James P. Kinney III<br>>> > > I would rather stumble along in freedom than walk<br>
>> effortlessly in<br>>> > chains.<br>>> > ><br>>> > ><br>>> > ><br>>> > ><br>>> > <br>>> ------------------------------------------------------------------------<br>
>> > ><br>>> > > _______________________________________________<br>>> > > Ale mailing list<br>>> > > <a href="mailto:Ale@ale.org">Ale@ale.org</a> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a>> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a><br>
>> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a>>><br>>> > > <a href="http://mail.ale.org/mailman/listinfo/ale">http://mail.ale.org/mailman/listinfo/ale</a><br>>> > > See JOBS, ANNOUNCE and SCHOOLS lists at<br>
>> > > <a href="http://mail.ale.org/mailman/listinfo">http://mail.ale.org/mailman/listinfo</a><br>>> ><br>>> ><br>>> > --<br>>> > Until later, Geoffrey<br>
>> ><br>>> > "I predict future happiness for America if they can prevent<br>>> > the government from wasting the labors of the people under<br>>> > the pretense of taking care of them."<br>
>> > - Thomas Jefferson<br>>> > _______________________________________________<br>>> > Ale mailing list<br>>> > <a href="mailto:Ale@ale.org">Ale@ale.org</a> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a>> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a><br>
>> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a>>><br>>> > <a href="http://mail.ale.org/mailman/listinfo/ale">http://mail.ale.org/mailman/listinfo/ale</a><br>>> > See JOBS, ANNOUNCE and SCHOOLS lists at<br>
>> > <a href="http://mail.ale.org/mailman/listinfo">http://mail.ale.org/mailman/listinfo</a><br>>> ><br>>> ><br>>> ><br>>> ><br>>> > --<br>
>> > --<br>>> > James P. Kinney III<br>>> > I would rather stumble along in freedom than walk effortlessly in<br>>> chains.<br>>> ><br>>> ><br>
>> ><br>>> ><br>>> ------------------------------------------------------------------------<br>>> ><br>>> > _______________________________________________<br>
>> > Ale mailing list<br>>> > <a href="mailto:Ale@ale.org">Ale@ale.org</a> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a>><br>>> > <a href="http://mail.ale.org/mailman/listinfo/ale">http://mail.ale.org/mailman/listinfo/ale</a><br>
>> > See JOBS, ANNOUNCE and SCHOOLS lists at<br>>> > <a href="http://mail.ale.org/mailman/listinfo">http://mail.ale.org/mailman/listinfo</a><br>>> <br>>> <br>>> --<br>>> Until later, Geoffrey<br>
>> <br>>> "I predict future happiness for America if they can prevent<br>>> the government from wasting the labors of the people under<br>>> the pretense of taking care of them."<br>
>> - Thomas Jefferson<br>>> _______________________________________________<br>>> Ale mailing list<br>>> <a href="mailto:Ale@ale.org">Ale@ale.org</a> <mailto:<a href="mailto:Ale@ale.org">Ale@ale.org</a>><br>
>> <a href="http://mail.ale.org/mailman/listinfo/ale">http://mail.ale.org/mailman/listinfo/ale</a><br>>> See JOBS, ANNOUNCE and SCHOOLS lists at<br>>> <a href="http://mail.ale.org/mailman/listinfo">http://mail.ale.org/mailman/listinfo</a><br>
>> <br>>> <br>>> <br>>> <br>>> -- <br>>> -- <br>>> James P. Kinney III<br>>> I would rather stumble along in freedom than walk effortlessly in chains.<br>>> <br>>> <br>
>> <br>>> ------------------------------------------------------------------------<br>>> <br>>> _______________________________________________<br>>> Ale mailing list<br>>> <a href="mailto:Ale@ale.org">Ale@ale.org</a><br>
>> <a href="http://mail.ale.org/mailman/listinfo/ale">http://mail.ale.org/mailman/listinfo/ale</a><br>>> See JOBS, ANNOUNCE and SCHOOLS lists at<br>>> <a href="http://mail.ale.org/mailman/listinfo">http://mail.ale.org/mailman/listinfo</a><br>
> <br>> <br>> -- <br>> Until later, Geoffrey<br>> <br>> "I predict future happiness for America if they can prevent<br>> the government from wasting the labors of the people under<br>> the pretense of taking care of them."<br>
> - Thomas Jefferson<br>> _______________________________________________<br>> Ale mailing list<br>> <a href="mailto:Ale@ale.org">Ale@ale.org</a><br>> <a href="http://mail.ale.org/mailman/listinfo/ale">http://mail.ale.org/mailman/listinfo/ale</a><br>
> See JOBS, ANNOUNCE and SCHOOLS lists at<br>> <a href="http://mail.ale.org/mailman/listinfo">http://mail.ale.org/mailman/listinfo</a><br></div>