Difference between revisions of "Neural Networks"

From jWiki
* I gave [https://git.snoopj.dev/SnoopJ/talks/src/branch/master/2023/explaining_neural_networks a talk] on the fundamentals of neural networks to Boston Python in March 2023
* 3blue1brown has an excellent [https://www.3blue1brown.com/topics/neural-networks series of lessons] about the fundamentals of neural networks. Particularly interesting to me is the lesson on [https://www.3blue1brown.com/lessons/backpropagation backpropagation] for its excellent visualization of the process of adjusting neural network weights.
=== Dumping ground ===
These references are totally unclassified
* [https://gist.github.com/veekaybee/be375ab33085102f9027853128dc5f0e "Normcore LLM Reads"] - a reading list


=== Writings by others ===
==== Academic works ====
* [https://arxiv.org/abs/2311.17035 Scalable Extraction of Training Data from (Production) Language Models] - "''Using only $200 USD worth of queries to ChatGPT (gpt-3.5-turbo), we are able to extract over 10,000 unique verbatim-memorized training examples. Our extrapolation to larger budgets (see below) suggests that dedicated adversaries could extract far more data…we estimate the…memorization of ChatGPT…[at] a gigabyte of training data. In practice we expect it is likely even higher.''"
* [https://arxiv.org/abs/2310.20216 Does GPT-4 Pass the Turing Test?]
* [https://dl.acm.org/doi/10.1145/3531146.3533158 "The Fallacy of AI Functionality"] - "''...fear of misspecified objectives, runaway feedback loops, and AI alignment presumes the existence of an industry that can get AI systems to execute on any clearly declared objectives, and that the main challenge is to choose and design an appropriate goal. Needless to say, if one thinks the danger of AI is that it will work too well, it is a necessary precondition that it works at all.''"
* [https://arxiv.org/pdf/1806.11146.pdf "Adversarial Reprogramming of Neural Networks"] - "''In each [of six cases], we reprogrammed the [classification] network [trained on ImageNet] to perform three different adversarial tasks: counting squares, MNIST classification, and CIFAR-10 classification… Our finding…[suggests] that the reprogramming across domains is likely [possible].''"

Revision as of 02:48, 5 December 2023

Herein lie some of my thoughts and resources about neural networks. Because I work for a company that builds models for computer vision, I have a bit of a professional bias towards image models, but I have tried to represent my knowledge/opinions about a broader range of subjects here.

What do you think about generative "AI"?

tl;dr - mostly dancing bearware, some novel uses in responsibility laundering

Resources

Image models

Text models

For code

For everything else

  • Washington Post coverage of the data contained in the 'C4' dataset and how it influences the training of popular large models. Also allows users to check if arbitrary URLs are part of the dataset. (NOTE: C4 is not the only source of training text for the models being discussed, and the authors aren't doing a great job highlighting that, but it should still be pretty representative)
  • How well does ChatGPT speak Japanese? - an April 2023 evaluation of GPT-3.5 and GPT-4 performance on Japanese language assessments. Also includes an interesting comparison of the number of tokens required to represent the "Lord's Prayer" in multiple languages. I found the results of the latter particularly surprising.

Misc.

  • I gave a talk on the fundamentals of neural networks to Boston Python in March 2023
  • 3blue1brown has an excellent series of lessons about the fundamentals of neural networks. Particularly interesting to me is the lesson on backpropagation for its excellent visualization of the process of adjusting neural network weights.

Dumping ground

These references are totally unclassified

Writings by others

Academic works

Non-academic works

Lawsuits

The legal status of generative models and their implications for intellectual property in the US is something I'm trying to keep an eye on. The cases given below are of particular interest to me.

ANDERSEN v. STABILITY AI LTD.

Entry #190 in Andersen v. Stability AI Ltd., 3:23-cv-00201
NOTICE of Appearance by Luke P. Apfeld (Apfeld, Luke) (Filed on 4/19/2024) (Entered: 04/19/2024)
Entry #189 in Andersen v. Stability AI Ltd., 3:23-cv-00201
RESPONSE re 165 Request for Judicial Notice by Runway AI, Inc.. (Malhotra, Paven) (Filed on 4/18/2024) (Entered: 04/18/2024)
Entry #188 in Andersen v. Stability AI Ltd., 3:23-cv-00201
REPLY (re 164 MOTION to Dismiss Plaintiffs First Amended Complaint ) filed by Runway AI, Inc.. (Malhotra, Paven) (Filed on 4/18/2024) (Entered: 04/18/2024)


GETTY IMAGES (US), INC. v. STABILITY AI, INC.

Entry #39 in Getty Images (US), Inc. v. Stability AI, Inc., 1:23-cv-00135
NOTICE requesting Clerk to remove Nicole M. Jantzi, Paul M. Schoenhard, Michael C. Keats and Amir R. Ghavi as co-counsel. Reason for request: no longer associated with the case. (Flynn, Michael) (E...
Entry #38 in Getty Images (US), Inc. v. Stability AI, Inc., 1:23-cv-00135
NOTICE to Take Deposition of Peter O'Donoghue on February 22, 2024 filed by Getty Images (US), Inc..(Vrana, Robert) (Entered: 02/13/2024)
Entry #37 in Getty Images (US), Inc. v. Stability AI, Inc., 1:23-cv-00135
NOTICE OF SERVICE of (1) Stability AI Ltd.'s Second Supplemental Objections and Responses to Plaintiff's Jurisdictional Interrogatories Nos. 2 and 12; and (2) Stability AI, Inc.'s Secon...

DOE 1 v. GITHUB, INC.

Entry #248 in DOE 1 v. GitHub, Inc., 4:22-cv-06823
Discovery Letter Brief
Entry #247 in DOE 1 v. GitHub, Inc., 4:22-cv-06823
Joint Discovery Letter Brief filed by J. DOE 1, J. DOE 2, J. Doe 3, J. Doe 4, J. Doe 5. (Saveri, Joseph) (Filed on 4/17/2024) (Entered: 04/17/2024)
Entry #246 in DOE 1 v. GitHub, Inc., 4:22-cv-06823
Order by Judge Jon S. Tigar denying 218 Motion for Reconsideration re 218 MOTION for Reconsideration re 189 Order on Motion to Dismiss, filed by J. Doe 5, J. DOE 1, J. Doe 4,...

SILVERMAN v. OPENAI, INC.

Entry #70 in Silverman v. OpenAI, Inc., 3:23-cv-03416
PRETRIAL ORDER as Modified. Signed by Judge Araceli Martinez-Olguin on 2/16/2024. (ads, COURT STAFF) (Filed on 2/16/2024) (Entered: 02/16/2024)
Entry #69 in Silverman v. OpenAI, Inc., 3:23-cv-03416
Order as Modified by Judge Araceli Martinez-Olguin granting (60) Stipulation Consolidating Cases in case 3:23-cv-03223-AMO. Associated Cases: 3:23-cv-03223-AMO, 3:23-cv-03416-AMO, 3:23-cv-04625-AMO...
Entry #68 in Silverman v. OpenAI, Inc., 3:23-cv-03416
Order by Judge Araceli Martinez-Olguin granting in part and denying in part 32 Motion to Dismiss. Cross-posted in 23-cv-03223. (amolc2, COURTSTAFF) (Filed on 2/12/2024) (Entered: 02/12/2024)

MATA v. AVIANCA, INC. (closed)

Note: this case is not itself about machine learning, but it is included in this list as a notable example of gross misuse of a language model: plaintiff's counsel used one to produce falsified documents that were submitted to the court. This led to censure of plaintiff's counsel and dismissal of the case.