On Wed, 13 Jul 2016, Paul Goyette wrote:
> On Wed, 13 Jul 2016, coypu%SDF.ORG@localhost wrote:
>
> > Hi, you'll be able to continue getting data by using an alternate
> > user agent. Try e.g. `curl -A 'Lynx'`.
>
> Actually that doesn't seem to help here. I still get back all the
> JavaScript code between <script>...</script> tags.
>
> There's a small section following that, between
> <noscript>...</noscript> tags, but it doesn't contain any useful data.

Further experimentation shows that it does indeed have the raw data that I am looking for. I just need to run 'curl -A Lynx -s $URL' and parse through a 256 KB line of text! Yes, not a typo, one line of text with 256k characters! Needless to say, I decided to replace some shell symbol manipulation with some grep-foo and sed-goo. :)
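
For anyone facing the same single-line blob, here is a minimal sketch of one way the grep/sed step might look. It is not the actual script: the helper name `split_tags` and the `some-pattern` search string are hypothetical, and `$URL` stands in for whatever page is being fetched. The idea is only to break the line before every `<` so that grep and sed get short lines to work on.

```shell
#!/bin/sh
# Hypothetical helper (not from the original script): put a newline in
# front of every '<' so each tag starts its own line.
# The backslash-newline form of the replacement is used because '\n' in
# a sed replacement is not portable across sed implementations.
split_tags() {
    sed 's/</\
</g'
}

# Assumed usage; URL and 'some-pattern' are placeholders:
#   curl -A Lynx -s "$URL" | split_tags | grep 'some-pattern'
```

After the split, the first output line is whatever preceded the first `<` (possibly empty), and every later line begins with a tag, so ordinary line-oriented filters apply.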

I'm a happy camper again!

+------------------+--------------------------+------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:      |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+------------------+--------------------------+------------------------+