{"id":5452,"date":"2025-07-09T18:58:30","date_gmt":"2025-07-10T01:58:30","guid":{"rendered":"http:\/\/www.contrapositivediary.com\/?p=5452"},"modified":"2025-07-09T18:58:30","modified_gmt":"2025-07-10T01:58:30","slug":"the-nyt-vs-chatgpt","status":"publish","type":"post","link":"https:\/\/www.contrapositivediary.com\/?p=5452","title":{"rendered":"The NYT Vs. ChatGPT"},"content":{"rendered":"<p>You may have seen this story come up over the last year and change: <a href=\"https:\/\/www.npr.org\/2023\/12\/27\/1221821750\/new-york-times-sues-chatgpt-openai-microsoft-for-copyright-infringement\" target=\"_blank\">The <em>New York Times<\/em> is suing OpenAI, creator of ChatGPT, for copyright infringement.<\/a> Earlier this year, <a href=\"https:\/\/www.fastcompany.com\/91306958\/new-york-times-copyright-lawsuit-against-openai-proceed-says-federal-judge\" target=\"_blank\">a federal judge ruled that the lawsuit can move forward<\/a>. And now\u2014good grief!\u2014the <em>Times<\/em> is demanding that OpenAI save all discussions people have with ChatGPT. All of them. The whole wad&#8212;<a href=\"https:\/\/thehill.com\/opinion\/technology\/5383530-chatgpt-users-privacy-collateral-damage\/\" target=\"_blank\"><em>even conversations that people have deleted<\/em><\/a>.<\/p>\n<p>You want a privacy violation? They\u2019ll give you a privacy violation, of a sort and at a scale that I\u2019ve not seen before. The premise is ridiculous: The <em>Times<\/em> suspects that people who delete their conversations with ChatGPT have been stealing <em>New York Times <\/em>IP, and then covering it up to hide the fact that they were stealing IP. After all, if they weren\u2019t stealing IP, why did they delete their conversations?<\/p>\n<p>Privacy as the rest of us understand it doesn\u2019t enter into the <em>Times<\/em>\u2019 logic at all. The whole business smells of legal subterfuge; that is, to strengthen their copyright infringement case, they\u2019re blaming ChatGPT users. 
I\u2019ve never tried ChatGPT, and I\u2019m certainly not going anywhere near it now. But this question arises: If a user asks an AI for an article on topic X, does the AI bring back the literal article? Golly, Google does that right now, granting that Google respects paywalls. Can ChatGPT somehow get past a paywall? I rather doubt it. If the <em>Times<\/em> wants to go after something that <em>does<\/em> get past its paywall, it had better go after archive.is, over in Iceland. I won\u2019t say much more about that, as it does get past most paywalls and is almost certainly massive copyright infringement.<\/p>\n<p>And all this brings into the spotlight the central question about commercial AI these days: How do AIs use their training data? I confess I don\u2019t fully understand that. <a href=\"https:\/\/www.understandingai.org\/p\/metas-llama-31-can-recall-42-percent\" target=\"_blank\">This article is a good place to start<\/a>. Meta\u2019s Llama v3.1 70B was able to cough up 42% of <em>Harry Potter and the Sorcerer\u2019s Stone<\/em>, though not in one chunk. Meta\u2019s really big problem is that <a href=\"https:\/\/www.tomshardware.com\/tech-industry\/artificial-intelligence\/meta-staff-torrented-nearly-82tb-of-pirated-books-for-ai-training-court-records-reveal-copyright-violations\" target=\"_blank\">it trained Llama on <em>81.7 terabytes<\/em> of pirated material<\/a> torrented from \u201cshadow libraries\u201d like Anna\u2019s Archive, Z-Library, and LibGen, and probably other places. I consider these pirate sites: not as blatant as the Pirate Bay, but pirate sites nonetheless.<\/p>\n<p>I\u2019m still looking for a fully digestible explanation of how training an AI actually works, but that\u2019ll come around eventually.<\/p>\n<p>So how might an AI be trained without using pirated material? My guess is that the big AI players will probably cut a deal with major publishers for training rights. 
A lot of free stuff will come from small Web operators, who don\u2019t have the resources to negotiate a deal with the AI guys. Most of them probably won\u2019t care. In truth, I\u2019d be delighted if AIs swallowed Contra\u2019s 3500+ entries in one gulp. Anything that has my name in it will make the AI more likely to cite me in answer to user questions, and that\u2019s all I\u2019ll ask for.<\/p>\n<p>Ultimately, I\u2019m pretty sure Zuck will cut a deal with NYT, WaPo, the Chicago Trib, and other big IP vendors. Big money will change hands. Meta will probably have to charge people to use Llama to pay off IP holders, and that\u2019s only right.<\/p>\n<p>But lordy, this is a supremely weird business, and I\u2019m pretty sure the bulk of the weirdness is somehow hidden from public scrutiny. Bit by bit it will come out, and I (along with a lot of you) will be watching for it.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>You may have seen this story come up over the last year and change: The New York Times is suing OpenAI, creator of ChatGPT, for copyright infringement. Earlier this year, a federal judge ruled that the lawsuit can move forward. And now\u2014good grief!\u2014the Times is demanding that OpenAI save all discussions people have with ChatGPT. 
[&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[23],"tags":[16,14,269],"class_list":["post-5452","post","type-post","status-publish","format-standard","hentry","category-ideasandanalysis","tag-publishing","tag-software","tag-web-weirdness"],"_links":{"self":[{"href":"https:\/\/www.contrapositivediary.com\/index.php?rest_route=\/wp\/v2\/posts\/5452","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.contrapositivediary.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.contrapositivediary.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.contrapositivediary.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.contrapositivediary.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5452"}],"version-history":[{"count":1,"href":"https:\/\/www.contrapositivediary.com\/index.php?rest_route=\/wp\/v2\/posts\/5452\/revisions"}],"predecessor-version":[{"id":5453,"href":"https:\/\/www.contrapositivediary.com\/index.php?rest_route=\/wp\/v2\/posts\/5452\/revisions\/5453"}],"wp:attachment":[{"href":"https:\/\/www.contrapositivediary.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5452"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.contrapositivediary.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5452"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.contrapositivediary.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5452"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}