Associated Press lapse into insane gibbering incoherence
Quick, get the nurse. There’s frothing at the mouth. It must be rabies!
What else would explain this?
The Associated Press have a new and evil plan: put a tiny little piece of JavaScript inside their articles so that they can be traced across the web. The JavaScript will inform the Associated Press where the article has been reposted. It’s a “beacon”. When the article gets republished on the Web, the rabies-assisted thinking goes, the person who republished it will have republished the JavaScript beacon too. And then the game is up! The beacon sends a request across the network to their servers saying “Hey, someone has nicked this. Go get ’em!”
Of course, if you are intelligent enough to be able to write the code to automatically repost other people’s RSS feeds onto your own site, you are intelligent enough to be able to use your XML/HTML parsing library to remove the JavaScript. In fact, you don’t even need to parse it into a DOM. Removing an HTML script tag from some HTML tag soup takes not much more than a fucking regular expression.
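To show how little effort that takes, here is a minimal sketch in Python. The function name `strip_scripts`, the sample markup and the `tracker.example` URL are all made up for illustration; nothing here is the AP's actual beacon code. It is a crude regular expression, not a proper HTML parser, so it will miss some deliberately mangled markup, but it handles the ordinary case.

```python
import re

# Matches an opening <script ...> tag, everything inside it, and the closing
# </script>, ignoring case and spanning newlines. Non-greedy so that two
# separate script blocks are not merged into one match.
SCRIPT_TAG = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def strip_scripts(html: str) -> str:
    """Return the markup with every script element removed."""
    return SCRIPT_TAG.sub("", html)


# Hypothetical scraped article with a made-up beacon in the middle.
sample = (
    "<p>Story text.</p>"
    '<script src="http://tracker.example/beacon.js?id=123"></script>'
    "<p>More.</p>"
)

cleaned = strip_scripts(sample)
print(cleaned)
```

One compiled pattern and a one-line function. Anyone already running a scraper can bolt this onto the pipeline in about the time it takes to read this paragraph.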
I mean, Christ, if you are trying to leave a tracking device, you have to make it tiny and microscopic and hide it somewhere nobody will find it. If even an idiot like me can programmatically remove the tracking device with only a fraction more mental energy than one uses while struggling to remember how to take a piss in the morning, consider your design stupid.
They seem to think that the primary problem isn’t automated scraping software but idiots who View Source and then copy and paste. I’m guessing WordPress and Movable Type most likely have a filter for script tags and other potentially behaviour-fucking markup. Also, JavaScript in RSS? I’m guessing most of the RSS readers are going to block that too.
Even more amusing is that the tracking beacons are going to be a real good giggle for those of us who like to fuck with the legal departments of major reporting bureaus. I’m guessing that every news article produced by the AP has some kind of identifying string - perhaps a number that predictably increments by one, with some letters and stuff. Whatever. It’ll be possible for anyone of reasonable intelligence to figure out an article ID. Then you just insert the tracking code, with a different random article ID each time, on every page of your site. Only you don’t have any of their content there. Just wait for visits from ap.org and start inserting large animated GIFs of pulsating dongs. I’m guessing the tracking device is wired up to a bunch of crazy sue-everyone-to-hell lawyers. Getting them chasing up Encyclopedia Dramatica for blatant misuse of their articles or whatever is just going to be sweet to watch.
The news media has gone rabid. Unreasoned attack dogs of stupidity. Remember: NYTimes went online in 1995. The Telegraph here in Britain went online in November 1994. It’s not like they haven’t had any warning that the Internet may compete with their interests - they’ve had the best part of fifteen years to realise. So, what are they doing now? A lot of conferences and press releases blaming Google for, err, indexing their content and making it publicly available. Of course, if they want Google to stop indexing their content, it’d take all of five minutes to read the Robots Exclusion Standard and create a robots.txt file. If, say, the Associated Press or any particular news outlet wants Google to stop indexing, two or three lines of robots.txt will do that.
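For the record, here is roughly what those two or three lines look like, checked with Python's standard `urllib.robotparser` module. This is a sketch under assumptions: the `news.example` URL and the `SomeOtherBot` name are placeholders, and the rules shown block only Google's crawler from the whole site, which may or may not be what any given outlet would actually want.

```python
from urllib.robotparser import RobotFileParser

# The entire robots.txt needed to tell Google's crawler to stay away.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""

parser = RobotFileParser()
# Feed the rules in directly rather than fetching them over the network.
parser.parse(ROBOTS_TXT.splitlines())

# Placeholder article URL on a made-up site.
article = "http://news.example/some-article"

googlebot_allowed = parser.can_fetch("Googlebot", article)
other_allowed = parser.can_fetch("SomeOtherBot", article)

print("Googlebot allowed:", googlebot_allowed)
print("SomeOtherBot allowed:", other_allowed)
```

Two lines of actual configuration: Googlebot is turned away and everyone else carries on as before. Swapping `Googlebot` for `*` would shut out every well-behaved crawler instead.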
But they actually don’t want Google to stop indexing their content. They’d much rather complain and whine like an annoying child. Tracking beacons are just another idiotic idea in the whining arsenal.