On Tue, Feb 15, 2000 at 09:04:21AM -0500, Chris Champness wrote:
> Binary is slower because each byte is encoded as ascii, meaning that only a
> portion of the payload data is contained in each byte that goes over the
> wire; hence more bytes to transmit. This is to prevent modems from tripping
> up.
What! No way, no how...
I've written ftp applications. It just doesn't work that way.
Go read the sources. Show me in the wu-ftp sources where that
data gets encoded. You're thinking about E-Mail and MIME. FTP transfers
raw binary data over a binary data connection. It does NOT encode the data
into any form of ascii armoring.
If it were encoded, that encoding (be it uuencode, base64,
MIME, or whatever) would have to be in the standard so both ends would
understand how to encode and decode it. Show me the RFC that specifies
that encoding.
I saw ONE ftp application on an old Xenix system where the binary
mode ftp was slower than the ascii mode. It turned out it was because it
was opening the destination file with an unbuffered "open" instead of
"fopen" and then writing the data as received off the wire. Buffering
the file writes solved that problem.
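
A rough sketch of why the buffering mattered, assuming the data comes off the wire in small pieces. This is Python rather than the C that old ftp used, and CountingRaw is a made-up stand-in for an unbuffered file descriptor; all it does is count how many low-level writes get issued.

```python
import io

class CountingRaw(io.RawIOBase):
    """Hypothetical stand-in for an unbuffered file: counts low-level writes."""

    def __init__(self):
        super().__init__()
        self.calls = 0

    def writable(self):
        return True

    def write(self, b):
        # Each call here models one trip to the operating system.
        self.calls += 1
        return len(b)

def count_writes(chunks, buffered):
    """Write every chunk, optionally through a buffer, and return the write count."""
    raw = CountingRaw()
    out = io.BufferedWriter(raw, buffer_size=8192) if buffered else raw
    for chunk in chunks:
        out.write(chunk)
    out.flush()
    return raw.calls

# 1000 small pieces, the way data might dribble in from a network read.
chunks = [b"x" * 64] * 1000
unbuffered_calls = count_writes(chunks, buffered=False)
buffered_calls = count_writes(chunks, buffered=True)
```

The unbuffered path issues one low-level write per piece received, while the buffered path collapses them into a handful of large writes.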
Receiving data in binary mode is the simplest thing. You inhale
the data from the network data socket and you blow it into the target
file descriptor with no conversions, no changes, no alterations
whatsoever. A real simple clean loop.
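
That loop, as a minimal Python sketch. The function name and the chunk size are my own choices for illustration; data_sock is assumed to be anything with a socket-style recv() that returns empty bytes at end of stream, and out_file anything with write().

```python
def receive_binary(data_sock, out_file, chunk_size=8192):
    """Copy bytes from the data connection to the file, unchanged, until EOF."""
    total = 0
    while True:
        chunk = data_sock.recv(chunk_size)
        if not chunk:
            # Empty read: the sender closed the data connection.
            break
        out_file.write(chunk)  # no conversion of any kind
        total += len(chunk)
    return total
```

Every byte value, including "\r", "\n" and anything non-printable, passes through untouched, so the file that lands on disk is the same size as what was sent.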
The ASCII mode is a royal pain in the fanny because you have to
transmit the data on the wire in "\r\n" convention. So Unix-to-Unix
means that you insert a "\r" in the data stream every time you encounter
a "\n" in the data stream on the sending side while you strip a "\r" from
the data stream any time you encounter EITHER a "\r\n" or a "\n\r"
(some systems do it one way and others do it the other - you must be
conservative in what you send and liberal in what you receive). So you
are constantly changing the size of the data you are transmitting and
receiving.
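
The two conversions described above, sketched in Python for the Unix-to-Unix case. The function names are made up for illustration, and the receiver here is one reasonable reading of "liberal in what you receive", not what any particular ftp does.

```python
import re

def to_wire(data: bytes) -> bytes:
    """Sending side: insert a "\\r" before every "\\n"."""
    return data.replace(b"\n", b"\r\n")

def from_wire(data: bytes) -> bytes:
    """Receiving side: collapse either "\\r\\n" or "\\n\\r" to a bare "\\n"."""
    return re.sub(rb"\r\n|\n\r", b"\n", data)

text = b"one\ntwo\nthree\n"
wire = to_wire(text)
# The wire form is longer than the file by one byte per line.
growth = len(wire) - len(text)
restored = from_wire(wire)
```

For a clean Unix text file the round trip gives back the original, but the byte count on the wire differs from the byte count on disk at both ends.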
Now for those paying real close attention to those conversions in
the preceding paragraph, you'll notice that they are not symmetrical.
What happens if there is already a "\r" or a "\r\n" or a "\n\r" in the
file data? Some servers will convert a "\r\n" to a "\r\r\n" while others
will only insert a "\r" before a "\n" if one didn't already exist (what
about "\n\r" then?). Then what does the receiving end do?
It should be pretty obvious that ASCII mode is non-deterministic
across arbitrary servers and clients. You may very easily get different
results from different clients and different servers with certain files.
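
To make that concrete, here is a small Python sketch of the two sender behaviors mentioned above, fed a file that already contains a "\r\n". All three function names are hypothetical, and the receiver is the same liberal one assumed earlier (strip the "\r" from either "\r\n" or "\n\r").

```python
import re

def naive_sender(data: bytes) -> bytes:
    """Always insert "\\r" before "\\n", so "\\r\\n" becomes "\\r\\r\\n"."""
    return data.replace(b"\n", b"\r\n")

def careful_sender(data: bytes) -> bytes:
    """Insert "\\r" before "\\n" only if one is not already there."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)

def receiver(data: bytes) -> bytes:
    """Collapse "\\r\\n" or "\\n\\r" to a bare "\\n"."""
    return re.sub(rb"\r\n|\n\r", b"\n", data)

original = b"x\r\ny\n"  # a file that already has a "\r\n" in it

via_naive = receiver(naive_sender(original))
via_careful = receiver(careful_sender(original))
```

With this receiver, the file sent by the naive sender comes back intact, while the one sent by the careful sender loses its "\r". Same file, same client, two servers, two different results.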
As far as modems tripping up, I have a news flash, they do. It's
called a TIES bomb and it happens. If you have one of the cheap modems
that tried to get around the Hayes patent by dropping the guard time
around the +++, you will get screwed anytime that sequence shows up in data.
So what! It just showed up in this E-Mail and a few systems are going to
get burned. But this is ascii text. ASCII armoring does nothing to
protect your modem. If you have a TIES modem and leave S2 set to a '+'
then your modem is going to hang up the connection on ANYTHING
containing this:
+++ATH0
(Including this message - This is not a test - this will happen).
Some jokers are out there now ping sweeping entire ISPs with
ping packets containing TIES bombs in the payloads. They CAN do worse
than merely hanging up. They can do anything you could do from the command
line of that modem. Think about that for a bit.
> Chris
Mike
--
Michael H. Warfield | (770) 985-6132 | mhw@WittsEnd.com
(The Mad Wizard) | (770) 331-2437 | http://www.wittsend.com/mhw/
NIC whois: MHW9 | An optimist believes we live in the best of all
PGP Key: 0xDF1DD471 | possible worlds. A pessimist is sure of it!