[ale] Text Processing Happiness - I'm lost
Bruce
callmebruce2002 at yahoo.com
Fri Aug 17 23:07:11 EDT 2007
Hey all, it's been a while since I was on the Ale list
- but I have a question, and figured this is the best
place to ask.
I am running a NetFlow Collector (NFC 5.0.2) whose
config file is XML. The config file associates
applications with TCP and UDP ports. Since the config
file is pretty limited, most of my traffic is not
getting associated with the right application.
I pulled down a listing of well-known and registered
ports from IANA, figuring I would take the scattershot
approach.
A short section is here:
"<case><value> 1 </value><label> TCP_ tcpmux - 1 -tcp
</label></case>"
"<case><value> 2 </value><label> TCP_ compressnet - 2
-tcp </label></case>"
"<case><value> 3 </value><label> TCP_ compressnet - 3
-tcp </label></case>"
"<case><value> 5 </value><label> TCP_ rje - 5 -tcp
</label></case>"
"<case><value> 7 </value><label> TCP_ echo - 7 -tcp
</label></case>"
"<case><value> 9 </value><label> TCP_ discard - 9 -tcp
</label></case>"
"<case><value> 11 </value><label> TCP_ systat - 11
-tcp </label></case>"
"<case><value> 13 </value><label> TCP_ daytime - 13
-tcp </label></case>"
"<case><value> 17 </value><label> TCP_ qotd - 17 -tcp
</label></case>"
"<case><value> 18 </value><label> TCP_ msp - 18 -tcp
</label></case>"
"<case><value> 19 </value><label> TCP_ chargen - 19
-tcp </label></case>"
"<case><value> 20 </value><label> TCP_ ftp-data - 20
-tcp </label></case>"
"<case><value> 21 </value><label> TCP_ ftp - 21 -tcp
</label></case>"
"<case><value> 22 </value><label> TCP_ ssh - 22 -tcp
</label></case>"
"<case><value> 23 </value><label> TCP_ telnet - 23
-tcp </label></case>"
"<case><value> 25 </value><label> TCP_ smtp - 25 -tcp
</label></case>"
"<case><value> 27 </value><label> TCP_ nsw-fe - 27
-tcp </label></case>"
"<case><value> 29 </value><label> TCP_ msg-icp - 29
-tcp </label></case>"
"<case><value> 31 </value><label> TCP_ msg-auth - 31
-tcp </label></case>"
"<case><value> 33 </value><label> TCP_ dsp - 33 -tcp
</label></case>"
"<case><value> 37 </value><label> TCP_ time - 37 -tcp
</label></case>"
"<case><value> 38 </value><label> TCP_ rap - 38 -tcp
</label></case>"
"<case><value> 39 </value><label> TCP_ rlp - 39 -tcp
</label></case>"
"<case><value> 41 </value><label> TCP_ graphics - 41
-tcp </label></case>"
"<case><value> 42 </value><label> TCP_ name - 42 -tcp
</label></case>"
"<case><value> 42 </value><label> TCP_ nameserver - 42
-tcp </label></case>"
"<case><value> 43 </value><label> TCP_ nicname - 43
-tcp </label></case>"
"<case><value> 44 </value><label> TCP_ mpm-flags - 44
-tcp </label></case>"
And what I want it to look like is here:
<case><value>1</value><label>TCP_tcpmux-1-tcp</label></case>
<case><value>2</value><label>TCP_compressnet-2-tcp</label></case>
<case><value>3</value><label>TCP_compressnet-3-tcp</label></case>
<case><value>5</value><label>TCP_rje-5-tcp</label></case>
<case><value>7</value><label>TCP_echo-7-tcp</label></case>
<case><value>9</value><label>TCP_discard-9-tcp</label></case>
<case><value>11</value><label>TCP_systat-11-tcp</label></case>
<case><value>13</value><label>TCP_daytime-13-tcp</label></case>
<case><value>17</value><label>TCP_qotd-17-tcp</label></case>
<case><value>18</value><label>TCP_msp-18-tcp</label></case>
<case><value>19</value><label>TCP_chargen-19-tcp</label></case>
<case><value>20</value><label>TCP_ftp-data-20-tcp</label></case>
<case><value>21</value><label>TCP_ftp-21-tcp</label></case>
<case><value>22</value><label>TCP_ssh-22-tcp</label></case>
<case><value>23</value><label>TCP_telnet-23-tcp</label></case>
<case><value>25</value><label>TCP_smtp-25-tcp</label></case>
<case><value>27</value><label>TCP_nsw-fe-27-tcp</label></case>
<case><value>29</value><label>TCP_msg-icp-29-tcp</label></case>
<case><value>31</value><label>TCP_msg-auth-31-tcp</label></case>
<case><value>33</value><label>TCP_dsp-33-tcp</label></case>
<case><value>37</value><label>TCP_time-37-tcp</label></case>
<case><value>38</value><label>TCP_rap-38-tcp</label></case>
<case><value>39</value><label>TCP_rlp-39-tcp</label></case>
<case><value>41</value><label>TCP_graphics-41-tcp</label></case>
<case><value>42</value><label>TCP_name-42-tcp</label></case>
<case><value>42</value><label>TCP_nameserver-42-tcp</label></case>
<case><value>43</value><label>TCP_nicname-43-tcp</label></case>
<case><value>44</value><label>TCP_mpm-flags-44-tcp</label></case>
The label is the application name. I am keeping TCP_
(and UDP_) at the start of the label because the tool I
use to display stats looks for the TCP or UDP prefix. I
follow the IANA name with the port and protocol so I
won't get duplicate application names (a lot of the
apps listen on both UDP and TCP).
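As a rough sketch of that naming scheme in Python (the
function name make_case and its arguments are made up for
illustration; they are not part of NFC or any IANA tooling):

```python
from xml.sax.saxutils import escape


def make_case(name: str, port: int, proto: str) -> str:
    """Build one <case> element in the layout shown above.

    name  -- IANA service name, e.g. "tcpmux"
    port  -- port number, e.g. 1
    proto -- "tcp" or "udp" (any case, surrounding blanks ignored)
    """
    proto = proto.strip().lower()
    # Prefix with TCP_/UDP_ and suffix with port and protocol so the
    # same service on both protocols gets two distinct labels.
    label = f"{proto.upper()}_{name.strip()}-{port}-{proto}"
    # escape() guards against &, < or > in a name; plain IANA names
    # normally contain none of these.
    return f"<case><value>{port}</value><label>{escape(label)}</label></case>"
```

Calling make_case("tcpmux", 1, "tcp") and
make_case("tcpmux", 1, "udp") yields two different labels,
which is the point of appending the port and protocol.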
Any pointers? How do I get rid of the " character? I'm
guessing there are tabs in the file, since I created
it using Excel (I know, I should have figured out a way
to simply grab the IANA well-known ports page and
process it directly). How do I get rid of the tabs?
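One possible approach, sketched in Python under two
assumptions: each record sits on its own line in the real
file, and no label legitimately contains a space, tab, or
double quote. With those in place it is enough to delete
every double quote and every whitespace character within
each line. The names clean_line and clean_text and the file
names in the usage comment are hypothetical.

```python
import re
import sys

# One or more double quotes or whitespace characters
# (spaces, tabs, carriage returns, newlines).
JUNK = re.compile(r'["\s]+')


def clean_line(line: str) -> str:
    """Remove quotes and all whitespace from a single record."""
    return JUNK.sub("", line)


def clean_text(text: str) -> str:
    """Clean every line, drop lines that end up empty, keep one record per line."""
    cleaned = (clean_line(line) for line in text.splitlines())
    return "".join(line + "\n" for line in cleaned if line)


if __name__ == "__main__":
    # Usage (hypothetical file names):
    #   python clean_cases.py ports_raw.txt > ports_clean.xml
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            sys.stdout.write(clean_text(handle.read()))
```

Because the whitespace class includes tabs, this handles the
suspected Excel tabs and the stray spaces in one pass. If a
label ever needs an embedded space, the pattern would have to
be narrowed to just quotes and tabs.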