Forbes accuses Perplexity AI of bypassing robots.txt web standard to scrape content, Tollbit startup gains publicity by baselessly accusing everyone of doing this too in open letter. Why do we listen to this shit?

[…]

A letter to publishers seen by Reuters on Friday, which does not name the AI companies or the publishers affected, comes amid a public dispute between AI search startup Perplexity and media outlet Forbes involving the same web standard and a broader debate between tech and media firms over the value of content in the age of generative AI.

The business media publisher publicly accused Perplexity of plagiarizing its investigative stories in AI-generated summaries without citing Forbes or asking for its permission.

A Wired investigation published this week found Perplexity likely bypassing efforts to block its web crawler via the Robots Exclusion Protocol, or “robots.txt,” a widely accepted standard meant to determine which parts of a site are allowed to be crawled.

Perplexity declined a Reuters request for comment on the dispute.

The News Media Alliance, a trade group representing more than 2,200 U.S.-based publishers, expressed concern about the impact that ignoring “do not crawl” signals could have on its members.

“Without the ability to opt out of massive scraping, we cannot monetize our valuable content and pay journalists. This could seriously harm our industry,” said Danielle Coffey, president of the group.

Source: Exclusive-Multiple AI companies bypassing web standard to scrape publisher sites, licensing firm says

So the original clickbait headline comes from a content licensing startup scaring content providers up but with no details whatsoever. Why is this even news?!

Robin Edgar

Organisational Structures | Technology and Science | Military, IT and Lifestyle consultancy | Social, Broadcast & Cross Media | Flying aircraft

robin@edgarbv.com https://www.edgarbv.com

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30