Source-Changes-HG archive


[src/trunk]: src/share/misc sed script to parse cctld-whois.htm and produce t...



details:   https://anonhg.NetBSD.org/src/rev/01190b599d4b
branches:  trunk
changeset: 543698:01190b599d4b
user:      jhawk <jhawk%NetBSD.org@localhost>
date:      Sun Mar 02 20:10:39 2003 +0000

description:
sed script to parse cctld-whois.htm and produce the "domains" file

diffstat:

 share/misc/domains.sed |  35 +++++++++++++++++++++++++++++++++++
 1 files changed, 35 insertions(+), 0 deletions(-)

diffs (39 lines):

diff -r e84ea172257d -r 01190b599d4b share/misc/domains.sed
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/share/misc/domains.sed    Sun Mar 02 20:10:39 2003 +0000
@@ -0,0 +1,35 @@
+# $NetBSD: domains.sed,v 1.1 2003/03/02 20:10:39 jhawk Exp $
+:top
+#                              Join all lines with unterminated HTML tags
+/<[^>]*$/{
+       N
+       b top
+}
+#                              Replace all <BR> with EOL marker ($)
+s/<BR>/$/g                     
+#                              Join all data lines (containing ">.") not ending in $
+/>\..*[^$]$/{
+       N
+       s/\n//g
+       b top
+}
+#                              Remove all HTML tags
+s/<[^>]*>//g
+#                              Remove EOL markers
+s/\$$//
+#                              Remove HTML character encodings
+s/&nbsp;/ /g
+s/&#150;//g
+#                              Compress spaces/tabs
+s/[    ][      ]*/ /g
+s/^ //
+#                              Output metadata to file "top"
+/updated/{
+  s/.*updated/# Latest change:/
+  s/ *$//
+  w top
+}
+#                              Delete all non-data lines
+/^\./!d
+#                              Remove leading '.'
+s/^\.//
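
The script above can be exercised with a short shell sketch. Everything outside the sed commands is an assumption made for illustration: the sample HTML is invented (the real layout of cctld-whois.htm is not shown in this commit), the file names sample.htm and domains are hypothetical, and the whitespace class is written as [[:blank:]] because the tab characters in the archived diff appear to have been expanded to spaces. The commit does not say how the metadata written to "top" is combined with the data lines, so the sketch simply prints both.

```shell
#!/bin/sh
# Hypothetical demo of domains.sed; sample input and file names are invented.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# The sed commands from the commit, with each comment placed above its command.
cat > domains.sed <<'EOF'
:top
# Join all lines with unterminated HTML tags
/<[^>]*$/{
	N
	b top
}
# Replace all <BR> with EOL marker ($)
s/<BR>/$/g
# Join all data lines (containing ">.") not ending in $
/>\..*[^$]$/{
	N
	s/\n//g
	b top
}
# Remove all HTML tags
s/<[^>]*>//g
# Remove EOL markers
s/\$$//
# Remove HTML character encodings
s/&nbsp;/ /g
s/&#150;//g
# Compress spaces/tabs (assumption: [[:blank:]] stands in for space+tab)
s/[[:blank:]][[:blank:]]*/ /g
s/^ //
# Output metadata to file "top"
/updated/{
  s/.*updated/# Latest change:/
  s/ *$//
  w top
}
# Delete all non-data lines
/^\./!d
# Remove leading '.'
s/^\.//
EOF

# Invented input: one entry with a tag split across lines, and one entry
# whose text wraps before the closing <BR>.
cat > sample.htm <<'EOF'
<HTML><BODY>
<P>Last updated 1 March 2003</P>
<A
 HREF="aq.htm">.aq</A>&nbsp;&#150;&nbsp;Antarctica<BR>
<A HREF="nz.htm">.nz</A> &#150; New
 Zealand<BR>
</BODY></HTML>
EOF

sed -f domains.sed sample.htm > domains
cat top domains
```

With this invented input, the run should print "# Latest change: 1 March 2003" followed by "aq Antarctica" and "nz New Zealand". Note that both N commands rely on a following input line being available; the sample is arranged so that neither join is attempted on the last line.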


