Difference between revisions of "Neural Networks"

From jWiki
(→‎Lawsuits: Add Authors Guild v. OpenAI Inc.)
(→‎Non-academic works: Add reference to original SolidGoldMagikarp posting)
 
(10 intermediate revisions by the same user not shown)
Line 1: Line 1:
Herein lie some of my thoughts and resources about neural networks. Because I work for a company that builds models for computer vision, I have a bit of a professional bias towards [[#image models|image models]], but I have tried to represent my knowledge/opinions about a broader range of subjects here.
Herein lie some of my thoughts and resources about neural networks. Because I work for a company that builds models for computer vision, I have a bit of a professional bias towards [[#Image models|image models]], but I have tried to represent my knowledge/opinions about a broader range of subjects here.


= What do you think about generative "AI"? =
Line 26: Line 26:
These references are totally unclassified


* [https://www.nature.com/articles/s41746-023-00939-z "Large language models propagate race-based medicine"]
* [https://gist.github.com/veekaybee/be375ab33085102f9027853128dc5f0e "Normcore LLM Reads"] - a reading list
* [https://arxiv.org/abs/2307.11760 Large Language Models Understand and Can be Enhanced by Emotional Stimuli] - (Note: I consider the use of "Understand" here to be unprofessional and irresponsible, but it's an interesting paper)
Line 34: Line 35:
=== Writings by others ===
==== Academic works ====
* [https://conf.researchr.org/details/ast-2024/ast-2024-papers/2/Using-GitHub-Copilot-for-Test-Generation-in-Python-An-Empirical-Study Using GitHub Copilot for Test Generation in Python: An Empirical Study] - "''we find that 45.28% of test generated...are passing tests, containing no syntax or runtime errors. The majority (54.72%) of generated tests...are failing, broken, or empty tests. We observe that tests generated within an existing test code context often mimic existing test methods''"
* [https://arxiv.org/abs/2311.17035 Scalable Extraction of Training Data from (Production) Language Models] - "''Using only $200 USD worth of queries to ChatGPT (gpt-3.5-turbo), we are able to extract over 10,000 unique verbatim-memorized training examples. Our extrapolation to larger budgets (see below) suggests that dedicated adversaries could extract far more data…we estimate the…memorization of ChatGPT…[at] a gigabyte of training data. In practice we expect it is likely even higher.''"
* [https://arxiv.org/abs/2310.20216 Does GPT-4 Pass the Turing Test?]
Line 52: Line 54:


==== Non-academic works ====
* [https://www.technologyreview.com/2024/05/17/1092649/gpt-4o-chinese-token-polluted/ GPT-4o’s Chinese token-training data is polluted by spam and porn websites] - 'glitch' tokens continue to plague models long after [https://www.alignmentforum.org/posts/aPeJE8bSo6rAFoLqg/solidgoldmagikarp-plus-prompt-generation the phenomenon] became well known to practitioners
* [https://tante.cc/2023/11/10/thoughts-on-generative-ai-art/ tante's "Thoughts on “generative AI Art”"] - "''…people using these [generative] systems don’t care about the…process of creation or the thought that went into it, they care about the output and what they feel that that output gives them…It’s “idea guy” heaven.''"
* [http://decomposition.al/CSE232-2023-09/course-overview.html#policy-on-the-use-of-llm-based-tools-like-chatgpt Lindsey Kuper's CSE232 syllabus section on LLM usage] - "''Aside from the fact that the resounding hollowness of the ChatGPT-produced prose has sucked away all of my zest for life…please understand that while you are welcome to use LLM-based tools in this course, you should be aware of their limitations.''"
* [https://time.com/6247678/openai-chatgpt-kenya-workers/ Time: "OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic"]
Line 59: Line 63:
* [https://www.newyorker.com/tech/annals-of-technology/chatgpt-is-a-blurry-jpeg-of-the-web Ted Chiang: "ChatGPT Is a Blurry JPEG of the Web"] - "''Large language models identify statistical regularities in text...When we’re dealing with sequences of words, lossy compression looks smarter than lossless compression.''"
* [https://www.newyorker.com/science/annals-of-artificial-intelligence/will-ai-become-the-new-mckinsey Ted Chiang: "Will A.I. Become the New McKinsey?"] - "''I’m not very convinced by claims that A.I. poses a danger to humanity because it might develop goals of its own and prevent us from turning it off. However, I do think that A.I. is dangerous inasmuch as it increases the power of capitalism.''"
* [https://www.schneier.com/blog/archives/2023/12/ai-and-trust.html Bruce Schneier: "AI and Trust"] - "''the corporations controlling AI systems will take advantage of our confusion to take advantage of us…our fears of AI are basically fears of capitalism''"


= Lawsuits =
The legal status of generative models and their implications for intellectual property in the US is something I'm trying to keep an eye on. The cases given below are of particular interest to me.
==== The New York Times Company v. Microsoft Corporation ====
* [https://www.nytimes.com/2023/12/27/business/media/new-york-times-open-ai-microsoft-lawsuit.html December 2023 coverage: initial complaint]
* Latest [https://www.courtlistener.com/docket/68117049/the-new-york-times-company-v-microsoft-corporation/ case proceedings]:
<rss max=3>https://www.courtlistener.com/docket/68117049/feed/</rss>


==== Andersen v. Stability AI Ltd. ====
Line 83: Line 93:
* Latest [https://www.courtlistener.com/docket/67569254/silverman-v-openai-inc/ case proceedings]:
<rss max=3>https://www.courtlistener.com/docket/67569254/feed/</rss>
==== Kadrey v. Meta Platforms, Inc. ====
* Similar suit to Silverman v. OpenAI, brought by the same plaintiffs against Meta.
* Notable for a [https://storage.courtlistener.com/recap/gov.uscourts.cand.415175/gov.uscourts.cand.415175.56.0_1.pdf prominent dismissal] of the class-action nature of the case, as the blatantly copied copyrighted works in the training data are not the works of the plaintiffs.
* Latest [https://www.courtlistener.com/docket/67569326/kadrey-v-meta-platforms-inc/ case proceedings]:
<rss max=3>https://www.courtlistener.com/docket/67569326/feed/</rss>


==== Authors Guild v. OpenAI Inc. ====

Latest revision as of 13:14, 21 May 2024

Herein lie some of my thoughts and resources about neural networks. Because I work for a company that builds models for computer vision, I have a bit of a professional bias towards image models, but I have tried to represent my knowledge/opinions about a broader range of subjects here.

What do you think about generative "AI"?

tl;dr - mostly dancing bearware, some novel uses in responsibility laundering

Resources

Image models

Text models

For code

For everything else

  • Washington Post coverage of the data contained in the 'C4' dataset and how it influences the training of popular large models. Also allows users to check if arbitrary URLs are part of the dataset. (NOTE: C4 is not the only source of training text for the models being discussed, and the authors aren't doing a great job highlighting that, but it should still be pretty representative)
  • How well does ChatGPT speak Japanese? - an April 2023 evaluation of GPT-3.5 and GPT-4 performance on Japanese language assessments. Also includes an interesting comparison of the number of tokens required to represent the "Lord's Prayer" in multiple languages. I found the results of the latter particularly surprising.

Misc.

  • I gave a talk on the fundamentals of neural networks to Boston Python in March 2023
  • 3blue1brown has an excellent series of lessons about the fundamentals of neural networks. Particularly interesting to me is the lesson on backpropagation for its excellent visualization of the process of adjusting neural network weights.

Dumping ground

These references are totally unclassified

Writings by others

Academic works

Non-academic works

Lawsuits

The legal status of generative models and their implications for intellectual property in the US is something I'm trying to keep an eye on. The cases given below are of particular interest to me.

The New York Times Company v. Microsoft Corporation

Entry #522 in The New York Times Company v. Microsoft Corporation, 1:23-cv-11195
CERTIFIED TRUE COPY OF CONDITIONAL MDL TRANSFER IN ORDER FROM THE MDL PANEL... that pursuant to 28 U.S.C. 1407, the action(s) listed... and pending... be, and the same hereby are, transferred to th...
Entry #521 in The New York Times Company v. Microsoft Corporation, 1:23-cv-11195
LETTER addressed to Magistrate Judge Ona T. Wang from Michelle Ybarra, Elana Nightingale Dawson, Rose Lee, and Annette Hurst dated April 10, 2025 re: Response to Plaintiffs Letter Regarding Case Sc...
Minute entry from 2025-04-08 in The New York Times Company v. Microsoft Corporation, 1:23-cv-11195
Notice Regarding Pro Hac Vice Motion

Andersen v. Stability AI Ltd.

Minute entry from 2025-04-11 in Andersen v. Stability AI Ltd., 3:23-cv-00201
Notice of Appearance/Substitution/Change/Withdrawal of Attorney
Entry #272 in Andersen v. Stability AI Ltd., 3:23-cv-00201
Stipulation and Proposed Order
Entry #271 in Andersen v. Stability AI Ltd., 3:23-cv-00201
Stipulation and Proposed Order


Getty Images (US), Inc. v. Stability AI, Inc.

Entry #67 in Getty Images (US), Inc. v. Stability AI, Inc., 1:23-cv-00135
NOTICE requesting Clerk to remove Laura Gilbert Remus as co-counsel. Reason for request: no longer with the firm. (Flynn, Michael) (Entered: 04/11/2025)
Entry #66 in Getty Images (US), Inc. v. Stability AI, Inc., 1:23-cv-00135
Letter to The Honorable Jennifer L. Hall from Robert M. Vrana regarding Rule 26(f) conference - re 52 Status Report. (Vrana, Robert) (Entered: 11/25/2024)
Entry #65 in Getty Images (US), Inc. v. Stability AI, Inc., 1:23-cv-00135
NOTICE requesting Clerk to remove Allyson R. Bennett as co-counsel. Reason for request: no longer with the firm. (Flynn, Michael) (Entered: 09/20/2024)

Doe 1 v. GitHub, Inc.

Silverman v. OpenAI, Inc.

Entry #71 in Silverman v. OpenAI, Inc., 3:23-cv-03416
NOTICE of Change In Counsel by Joseph R. Saveri Withdrawal of Counsel - Kathleen J. McMahon (Saveri, Joseph) (Filed on 8/12/2024) (Entered: 08/12/2024)
Entry #70 in Silverman v. OpenAI, Inc., 3:23-cv-03416
PRETRIAL ORDER as Modified. Signed by Judge Araceli Martinez-Olguin on 2/16/2024. (ads, COURT STAFF) (Filed on 2/16/2024) (Entered: 02/16/2024)
Entry #69 in Silverman v. OpenAI, Inc., 3:23-cv-03416
Order as Modified by Judge Araceli Martinez-Olguin granting (60) Stipulation Consolidating Cases in case 3:23-cv-03223-AMO. Associated Cases: 3:23-cv-03223-AMO, 3:23-cv-03416-AMO, 3:23-cv-04625-AMO...

Kadrey v. Meta Platforms, Inc.

  • Similar suit to Silverman v. OpenAI, brought by the same plaintiffs against Meta.
  • Notable for a prominent dismissal of the class-action nature of the case, as the blatantly copied copyrighted works in the training data are not the works of the plaintiffs.
  • Latest case proceedings:
Entry #536 in Kadrey v. Meta Platforms, Inc., 3:23-cv-03417
Certificate of Interested Entities by Association of American Publishers re 535 Brief of Amicus Curiae Association of American Publishers (Strassman, Ruby) (Filed on 4/11/2025) (Entered: 04/11/2025)
Entry #535 in Kadrey v. Meta Platforms, Inc., 3:23-cv-03417
Brief of Amicus Curiae Association of American Publishers filed by Association of American Publishers. (Strassman, Ruby) (Filed on 4/11/2025) (Entered: 04/11/2025)
Entry #534 in Kadrey v. Meta Platforms, Inc., 3:23-cv-03417
NOTICE of Appearance filed by Jacqueline Charlesworth on behalf of Association of American Publishers (Charlesworth, Jacqueline) (Filed on 4/11/2025) (Entered: 04/11/2025)

Authors Guild v. OpenAI Inc.

Entry #405 in Authors Guild v. OpenAI Inc., 1:23-cv-08292
MOTION for Leave to File Amended Complaint (Redacted). Document filed by Authors Guild, Jonathan Alter. (Attachments: # 1 Affidavit / Declaration of Rachel J. Geman, # 2 Exhibit 1 to the Geman Decl...
Entry #404 in Authors Guild v. OpenAI Inc., 1:23-cv-08292
***EX-PARTE*** MOTION for Leave to File Amended Complaint. Document filed by Authors Guild, Jonathan Alter. (Attachments: # 1 Affidavit / Declaration of Rachel J. Geman, # 2 Exhibit 1 to the Geman...
Entry #403 in Authors Guild v. OpenAI Inc., 1:23-cv-08292
MOTION to Seal Motion for Leave to Amend the Complaint. Document filed by Authors Guild, Jonathan Alter. Filed In Associated Cases: 1:23-cv-08292-SHS-OTW, 1:23-cv-10211-SHS-OTW. (Geman, Rachel) (Ente...

Sancton v. OpenAI Inc. et al

[Document 21]
ORDER granting #17 Motion for Alejandra Christina Salinas to Appear Pro Hac Vice (HEREBY ORDERED by Judge Sidney H. Stein)(Text Only Order) (lab)
2023-11-30 08:00:00
[Document 20]
ORDER granting #14 Motion for Rohit Dwarka Nath to Appear Pro Hac Vice (HEREBY ORDERED by Judge Sidney H. Stein)(Text Only Order) (lab)
2023-11-30 08:00:00
[Document 19]
ORDER granting #16 Motion for Justin Adatto Nelson to Appear Pro Hac Vice (HEREBY ORDERED by Judge Sidney H. Stein)(Text Only Order) (lab)
2023-11-30 08:00:00

Mata v. Avianca, Inc. (closed)

Note: this case is not itself about machine learning, but it is included in this list as a notable example of gross misuse of a language model: plaintiff's counsel used one to prepare falsified documents that were submitted to the court. This led to censure of plaintiff's counsel and dismissal of the case.