Neural Networks

From jWiki
Revision as of 13:14, 21 May 2024 by Snoopj (talk | contribs) (→‎Non-academic works: Add reference to original SolidGoldMagikarp posting)

Herein lie some of my thoughts and resources about neural networks. Because I work for a company that builds models for computer vision, I have a bit of a professional bias towards image models, but I have tried to represent my knowledge/opinions about a broader range of subjects here.

What do you think about generative "AI"?

tl;dr - mostly dancing bearware, some novel uses in responsibility laundering

Resources

Image models

Text models

For code

For everything else

  • Washington Post coverage of the data contained in the 'C4' dataset and how it influences the training of popular large models. Also allows users to check if arbitrary URLs are part of the dataset. (NOTE: C4 is not the only source of training text for the models being discussed, and the authors aren't doing a great job highlighting that, but it should still be pretty representative)
  • How well does ChatGPT speak Japanese? - an April 2023 evaluation of GPT-3.5 and GPT-4 performance on Japanese language assessments. Also includes an interesting comparison of the number of tokens required to represent the "Lord's Prayer" in multiple languages. I found the results of the latter particularly surprising.

Misc.

  • I gave a talk on the fundamentals of neural networks to Boston Python in March 2023
  • 3Blue1Brown has an excellent series of lessons about the fundamentals of neural networks. The lesson on backpropagation is particularly interesting to me for its clear visualization of how neural network weights are adjusted.

Dumping ground

These references are totally unclassified.

Writings by others

Academic works

Non-academic works

Lawsuits

The legal status of generative models and their implications for intellectual property in the US is something I'm trying to keep an eye on. The cases given below are of particular interest to me.

The New York Times Company v. Microsoft Corporation

Entry #753 in The New York Times Company v. Microsoft Corporation, 1:23-cv-11195
LETTER MOTION for Discovery [Proposal} Regarding AEO Documents in Response to Order ECF No. 308 addressed to Magistrate Judge Ona T. Wang from Tiffany Cheung, Christopher S. Sun, Allison S. Bla...
Entry #752 in The New York Times Company v. Microsoft Corporation, 1:23-cv-11195
LETTER addressed to Magistrate Judge Ona T. Wang from Davida Brook dated July 11, 2025 re: The Court's July 9 Order. Document filed by The New York Times Company. (Attachments: # 1 Exhibit 1)F...
Entry #751 in The New York Times Company v. Microsoft Corporation, 1:23-cv-11195
DECLARATION of Michael Trinh in Support re: (318 in 1:25-md-03143-SHS-OTW, 749 in 1:23-cv-11195-SHS-OTW) LETTER MOTION to Seal re ECF No. 293 addressed to Magistrate Judge Ona T. Wang from Rose S....

Andersen v. Stability AI Ltd.

Entry #316 in Andersen v. Stability AI Ltd., 3:23-cv-00201
ORDER GRANTING 300 DEFENDANTS' REQUEST REGARDING DISCLOSURE TO DR. ZHAO by Judge Lisa J. Cisneros. (bns, COURT STAFF) (Filed on 7/14/2025)
Entry #315 in Andersen v. Stability AI Ltd., 3:23-cv-00201
Stipulation and Proposed Order
Entry #314 in Andersen v. Stability AI Ltd., 3:23-cv-00201
Order on Discovery Letter Brief


Getty Images (US), Inc. v. Stability AI, Inc.

Entry #68 in Getty Images (US), Inc. v. Stability AI, Inc., 1:23-cv-00135
NOTICE requesting Clerk to remove Melissa Rutman as co-counsel. Reason for request: no longer with Weil, Gotshal & Manges LLP. (Vrana, Robert) (Entered: 05/02/2025)
Entry #67 in Getty Images (US), Inc. v. Stability AI, Inc., 1:23-cv-00135
NOTICE requesting Clerk to remove Laura Gilbert Remus as co-counsel. Reason for request: no longer with the firm. (Flynn, Michael) (Entered: 04/11/2025)
Entry #66 in Getty Images (US), Inc. v. Stability AI, Inc., 1:23-cv-00135
Letter to The Honorable Jennifer L. Hall from Robert M. Vrana regarding Rule 26(f) conference - re 52 Status Report. (Vrana, Robert) (Entered: 11/25/2024)

Doe 1 v. GitHub, Inc.

Minute entry from 2025-02-11 in DOE 1 v. GitHub, Inc., 4:22-cv-06823
Notice of Appearance/Substitution/Change/Withdrawal of Attorney
Entry #289 in DOE 1 v. GitHub, Inc., 4:22-cv-06823
NOTICE of Withdrawal filed by Vera Ranieri, no longer appearing on behalf of OpenAI Startup Fund Management, LLC, OpenAI OpCo, L.L.C., OpenAI, Inc., OPENAI, L.L.C., OPENAI GLOBAL, LLC, OAI CORPORAT...
Entry #288 in DOE 1 v. GitHub, Inc., 4:22-cv-06823
Transcript Designation Form for proceedings held on 5/4/2023 and 11/9/2023 before Judge Jon S. Tigar, (Saveri, Joseph) (Filed on 1/10/2025) (Entered: 01/10/2025)

Silverman v. OpenAI, Inc.

Minute entry from 2025-04-28 in Silverman v. OpenAI, Inc., 3:23-cv-03416
Case opened in Southern District of New York as 1:25-cv-03483, filed 04/27/2025. (far, COURT STAFF) (Filed on 4/28/2025)
Minute entry from 2025-04-28 in Silverman v. OpenAI, Inc., 3:23-cv-03416
Remark
Entry #72 in Silverman v. OpenAI, Inc., 3:23-cv-03416
MDL TRANSFER ORDER transferring case to the Southern District of New York re MDL No. 3143. (far, COURT STAFF) (Filed on 4/21/2025) (Entered: 04/22/2025)

Kadrey v. Meta Platforms, Inc.

  • Similar suit to Silverman v. OpenAI, brought by the same plaintiffs against a different defendant.
  • Notable for a prominent dismissal of the class-action nature of the case, as the blatantly copied copyrighted works in the training data are not the works of the plaintiffs.
  • Latest [1]:
Minute entry from 2025-07-14 in Kadrey v. Meta Platforms, Inc., 3:23-cv-03417
Case Management Conference - Further
Entry #607 in Kadrey v. Meta Platforms, Inc., 3:23-cv-03417
Minute Entry for proceedings held before Judge Vince Chhabria: Further Case Management Conference held via Zoom on 7/11/2025. Stipulation or competing proposed schedules re motion for summary judgm...
Minute entry from 2025-07-10 in Kadrey v. Meta Platforms, Inc., 3:23-cv-03417
Order

Authors Guild v. OpenAI Inc.

Sancton v. OpenAI Inc. et al


Mata v. Avianca, Inc. (closed)

Note: this case is not itself about machine learning, but it is included in this list because it is a notable example of gross misuse of a language model by plaintiff's counsel, who relied on it to produce falsified documents that were submitted to the court. This led to censure of plaintiff's counsel and dismissal of the case.