[ale] Python regex (Now: readline())

Alex LeDonne aledonne.listmail at gmail.com
Mon Sep 11 12:20:23 EDT 2006


More thoughts/suggestions inline and below.

On 9/11/06, Christopher Fowler <cfowler at outpostsentinel.com> wrote:
> On Mon, 2006-09-11 at 11:21 -0400, Alex LeDonne wrote:
> > On 9/10/06, Christopher Fowler <cfowler at outpostsentinel.com> wrote:
> > > Is there a readline method in a class that can turn a socket into
> > > buffered I/O?  I'm writing a class to interface with a server on an
> > > embedded device and I need to interact by line.
> > >
> > > I simply added it to my class since my class is the client to the server
<snip original version>
> >
> > This seems... inefficient. I think you should safely be able to read a
> > much larger chunk at a time, say data = self.socket.recv(1024). If
> > there's a newline, then len(data) < 1024. The only thing you have to
> > check is if len(data) == 1024, whether data.endswith("\n").
> >
> > By the way, you'll want to test data against '' (empty string) -
> > that's what recv will return if the socket disconnects.
> >
>
> I did not know that was the way recv worked
>
> Here is the new method:
>   def readline(self, timeout=0):

>     buffer = None

If you initialize buffer = "" (empty string), then you don't have to
test for None later.


>     if self.__socket is None: return None
>
>     data = self.read(1024,timeout)
>
>     while data is not None:

If you initialize buffer as "" above, the conditional below goes
away and you can just use buffer = buffer + str(data). In fact, if the
guts of your read method use socket.recv(), which already returns a
string, you can throw out the str() call, too.

>       if buffer is None:
>         buffer = str(data)
>       else:
>         buffer = buffer + str(data)

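Logic or cut-paste error below: your if and else both return buffer,
so the method returns after the first read no matter what and the
"read again" code is never reached. Also, compare the length with ==
rather than "is"; "is" tests object identity, not equality.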

>       if len(data) is 1024 and data.endswith("\n"):
>         return buffer
>       else:
>         return buffer
>
>       # No NL yet.  Read again
>       data = self.read(1024,timeout)
>
>     # If we get here then we've encountered
>     # an EOF condition.  Return the buffer. The
>     # next call to readline will cause None
>     # to be sent to caller
>     return buffer
>
>
> I thought recv returns None on socket disconnect?
>

My reference on recv returning "" on socket disconnect is Python in a
Nutshell, which I trust.
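
A quick way to check this for yourself is a connected pair of sockets:
close one end and call recv on the other. This is only a minimal
sketch, written for a current Python 3 interpreter, where recv returns
bytes, so the empty result shows up as b'' rather than ''. The helper
name recv_after_peer_close is made up for this example.

```python
import socket

def recv_after_peer_close():
    # socketpair() gives two already-connected sockets, which
    # stands in for a real client/server connection here.
    a, b = socket.socketpair()
    try:
        b.close()            # the peer disconnects
        return a.recv(1024)  # returns right away with no data
    finally:
        a.close()

print(repr(recv_after_peer_close()))
```

The call does not raise and does not return None; it returns an empty
bytes object, which is what the loop condition needs to test for.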

Philosophically, I've started to buy into the idea that a method
should have as few return points as possible. If your read() method
behaves like socket.recv() and returns an empty string on disconnect,
rather than None, you could collapse the two self.read calls into a
single call at the top of the loop. Also, all the returns in the loop
can be replaced with "break"s so that you are only returning at the
end of the method.

Oh, and I had the a-ha that you don't need to worry about length -
just test for endswith("\n") regardless.

If I understand your logic correctly, consider this:
------------------

  def readline(self, timeout=0):
    # No socket at all: nothing to read.
    if self.__socket is None:
      return None

    buffer = ""
    data = None

    # Assumes self.read() behaves like socket.recv() and
    # returns "" (not None) once the socket disconnects.
    while data != "":
      data = self.read(1024, timeout)
      buffer = buffer + data

      if data.endswith("\n"):
        break

      # No NL yet and the socket is still
      # connected. Loop.

    # Either we got a newline, or the
    # socket disconnected.
    return buffer

----------

There's a possibility of a DoS (denial of service) here: a peer that
never sends a newline makes buffer grow without bound. You may want to
add a parameter for the max length of a "line" in this method.
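
One way that limit could look, as a rough sketch rather than a drop-in
replacement for your class: LineReader, max_len and pending are names
invented for this example, and it talks to a plain socket using
Python 3 bytes instead of going through your read() method and its
timeout. It also keeps anything received after the newline for the
next call, since a single recv can return more than one line.

```python
import socket

class LineReader:
    """Hypothetical helper: reads newline-terminated lines from a socket."""

    def __init__(self, sock, max_len=4096):
        self.sock = sock
        self.max_len = max_len  # cap on one line, not counting the newline
        self.pending = b""      # data received but not yet returned

    def readline(self):
        while b"\n" not in self.pending:
            if len(self.pending) > self.max_len:
                raise ValueError("line longer than max_len")
            data = self.sock.recv(1024)
            if data == b"":
                # Peer disconnected: hand back whatever is left.
                line, self.pending = self.pending, b""
                return line
            self.pending += data
        line, sep, rest = self.pending.partition(b"\n")
        if len(line) > self.max_len:
            raise ValueError("line longer than max_len")
        self.pending = rest  # keep the remainder for the next call
        return line + sep
```

With this shape, a caller gets one line per call, an empty result once
the peer has disconnected and nothing is left, and an exception instead
of unbounded memory use when no newline ever arrives.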

-Alex


