Difference between revisions of "Neural Networks"

From jWiki
(Add "Mathematical Capabilities of ChatGPT")
Line 24:
=== Writings by others ===
==== Academic works ====
* [https://arxiv.org/abs/2301.13867 "Mathematical Capabilities of ChatGPT"] - in which ChatGPT and GPT-4 largely fail to muster passing performance on a mathematical problem set, compared to a domain-specific model that achieves nearly 100% performance.
* [https://doi.org/10.1038/s41467-019-08987-4 "Unmasking Clever Hans predictors and assessing what machines really learn"] - "''...it is important to comprehend the decision-making process itself...transparency of the what and why in a decision of a nonlinear machine becomes very effective for the essential task of judging whether the learned strategy is valid and generalizable or whether the model has based its decision on a spurious correlation in the training data''"
* [https://doi.org/10.1145/3442188.3445922 "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜"] - "''LMs with extremely large numbers of parameters model their training data very closely and can be prompted to output specific information from that training data''"

Revision as of 11:55, 21 September 2023

Herein lie some of my thoughts and resources about neural networks. Because I work for a company that builds models for computer vision, I have a bit of a professional bias towards image models, but I have tried to represent my knowledge/opinions about a broader range of subjects here.

What do you think about generative "AI"?

tl;dr - mostly dancing bearware, some novel uses in responsibility laundering

Resources

Image models

Text models

For code

For everything else

  • Washington Post coverage of the data contained in the 'C4' dataset and how it influences the training of popular large models. Also allows users to check if arbitrary URLs are part of the dataset. (NOTE: C4 is not the only source of training text for the models being discussed, and the authors aren't doing a great job highlighting that, but it should still be pretty representative)
  • How well does ChatGPT speak Japanese? - an April 2023 evaluation of GPT-3.5 and GPT-4 performance on Japanese language assessments. Also includes an interesting comparison of the number of tokens required to represent the "Lord's Prayer" in multiple languages. I found the results of the latter particularly surprising.
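The token-count comparison mentioned above can be roughly approximated without a tokenizer. The following is a minimal sketch, not the method used in the linked evaluation: it assumes a byte-level BPE tokenizer (where each token covers at least one UTF-8 byte, so the byte count is an upper bound on the token count) and uses UTF-8 bytes per character as a crude proxy for how expensive a script is to represent. The helper names and sample strings are hypothetical and chosen only for illustration.

```python
def utf8_cost(text):
    """Return (number of characters, number of UTF-8 bytes) for a string."""
    return len(text), len(text.encode("utf-8"))


def bytes_per_char(text):
    """Average UTF-8 bytes per character; a crude proxy, NOT a real token count."""
    chars, nbytes = utf8_cost(text)
    return nbytes / chars


# Hypothetical sample strings, chosen only for illustration.
samples = {
    "English": "hello world",
    "Japanese": "こんにちは世界",
}

if __name__ == "__main__":
    for language, text in samples.items():
        chars, nbytes = utf8_cost(text)
        print(f"{language}: {chars} chars, {nbytes} bytes, "
              f"{bytes_per_char(text):.1f} bytes/char")
```

Real token counts depend on the tokenizer's learned merges, so this only suggests why text in scripts outside the ASCII range tends to start from a larger byte budget.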

Misc.

  • I gave a talk on the fundamentals of neural networks to Boston Python in March 2023
  • 3Blue1Brown has an excellent series of lessons about the fundamentals of neural networks. Particularly interesting to me is the lesson on backpropagation, for its clear visualization of the process of adjusting neural network weights.
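For readers who prefer code to video, the weight-adjustment process that backpropagation performs can be sketched on the smallest possible network. This is a minimal illustration, not taken from the talk or the lessons above: it assumes a single sigmoid neuron, a squared-error loss, and one training example, and all names and hyperparameters (`train_step`, `lr=0.5`, and so on) are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_step(w, b, x, y, lr=0.5):
    """One forward and backward pass; returns (new_w, new_b, loss)."""
    # Forward pass: compute the prediction and its squared-error loss.
    z = w * x + b
    a = sigmoid(z)
    loss = 0.5 * (a - y) ** 2

    # Backward pass: apply the chain rule from the loss back to each parameter.
    dloss_da = a - y            # d(loss)/d(activation)
    da_dz = a * (1.0 - a)       # derivative of the sigmoid
    dloss_dz = dloss_da * da_dz
    grad_w = dloss_dz * x       # dz/dw = x
    grad_b = dloss_dz           # dz/db = 1

    # Gradient descent: nudge each parameter against its gradient.
    return w - lr * grad_w, b - lr * grad_b, loss


def train(x=1.0, y=1.0, steps=100):
    """Train from zero-initialized parameters on a single (x, y) example."""
    w, b = 0.0, 0.0
    losses = []
    for _ in range(steps):
        w, b, loss = train_step(w, b, x, y)
        losses.append(loss)
    return w, b, losses


if __name__ == "__main__":
    w, b, losses = train()
    print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

A real network repeats the same chain-rule bookkeeping across many layers and examples; the single-neuron case just keeps every intermediate derivative visible.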

Writings by others

Academic works

Non-academic works

Lawsuits

The legal status of generative models and their implications for intellectual property in the US is something I'm trying to keep an eye on. The cases given below are of particular interest to me.

ANDERSEN v. STABILITY AI LTD.

Entry #277 in Andersen v. Stability AI Ltd., 3:23-cv-00201
Transcript Order
Entry #276 in Andersen v. Stability AI Ltd., 3:23-cv-00201
STIPULATED PROTECTIVE ORDER FOR LITIGATION INVOLVING HIGHLY SENSITIVE CONFIDENTIAL INFORMATION AND/OR TRADE SECRETS re Docket No. 264 by Magistrate Judge Lisa J. Cisneros. (bns, COURT STAFF) (Fi...
Entry #275 in Andersen v. Stability AI Ltd., 3:23-cv-00201
Order by Magistrate Judge Lisa J. Cisneros granting 272 Stipulation Re: Discovery of Electronically Stored Information.(bns, COURT STAFF) (Filed on 4/15/2025)


GETTY IMAGES (US), INC. v. STABILITY AI, INC.

Entry #67 in Getty Images (US), Inc. v. Stability AI, Inc., 1:23-cv-00135
NOTICE requesting Clerk to remove Laura Gilbert Remus as co-counsel. Reason for request: no longer with the firm. (Flynn, Michael) (Entered: 04/11/2025)
Entry #66 in Getty Images (US), Inc. v. Stability AI, Inc., 1:23-cv-00135
Letter to The Honorable Jennifer L. Hall from Robert M. Vrana regarding Rule 26(f) conference - re 52 Status Report. (Vrana, Robert) (Entered: 11/25/2024)
Entry #65 in Getty Images (US), Inc. v. Stability AI, Inc., 1:23-cv-00135
NOTICE requesting Clerk to remove Allyson R. Bennett as co-counsel. Reason for request: no longer with the firm. (Flynn, Michael) (Entered: 09/20/2024)

DOE 1 v. GITHUB, INC.

SILVERMAN v. OPENAI, INC.

Entry #72 in Silverman v. OpenAI, Inc., 3:23-cv-03416
Order
Entry #71 in Silverman v. OpenAI, Inc., 3:23-cv-03416
NOTICE of Change In Counsel by Joseph R. Saveri Withdrawal of Counsel - Kathleen J. McMahon (Saveri, Joseph) (Filed on 8/12/2024) (Entered: 08/12/2024)
Entry #70 in Silverman v. OpenAI, Inc., 3:23-cv-03416
PRETRIAL ORDER as Modified. Signed by Judge Araceli Martinez-Olguin on 2/16/2024. (ads, COURT STAFF) (Filed on 2/16/2024) (Entered: 02/16/2024)

MATA v. AVIANCA, INC. (closed)

Note: this case is not about machine learning per se, but it is included in this list as a notable example of gross misuse of a language model: plaintiff's counsel submitted falsified documents to the court. This led to censure of plaintiff's counsel and dismissal of the case.