Neural Networks

Herein lie some of my thoughts and resources about neural networks. Because I work for a company that builds models for computer vision, I have a bit of a professional bias towards image models, but I have tried to represent my knowledge and opinions about a broader range of subjects here.

What do you think about generative "AI"?

tl;dr - mostly dancing bearware, some novel uses in responsibility laundering

Resources

Image models

Text models

For code

For everything else

  • Washington Post coverage of the data contained in the 'C4' dataset and how it influences the training of popular large models. The article also lets readers check whether arbitrary URLs are part of the dataset. (NOTE: C4 is not the only source of training text for the models being discussed, and the authors don't do a great job of highlighting that, but it should still be pretty representative)
  • How well does ChatGPT speak Japanese? - an April 2023 evaluation of GPT-3.5 and GPT-4 performance on Japanese language assessments. It also includes an interesting comparison of the number of tokens required to represent the "Lord's Prayer" in multiple languages; I found the results of the latter particularly surprising (a rough sketch of how such a token count might be reproduced follows this list).
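
To make the token-count comparison above concrete, here is a minimal sketch of how one might reproduce that kind of count, assuming OpenAI's tiktoken library and the cl100k_base encoding used by GPT-3.5 and GPT-4. The sample strings below are stand-ins of my own choosing, not the texts or method used in the linked article.

    # Sketch of a per-language token-count comparison using OpenAI's
    # `tiktoken` library (pip install tiktoken). The sample strings are
    # illustrative; the linked article may have used different texts.
    import tiktoken

    SAMPLES = {
        "English": "Our Father, who art in heaven, hallowed be thy name.",
        "Japanese": "天にまします我らの父よ、願わくは御名をあがめさせたまえ。",
    }

    # cl100k_base is the encoding used by gpt-3.5-turbo and GPT-4
    enc = tiktoken.get_encoding("cl100k_base")

    for language, text in SAMPLES.items():
        tokens = enc.encode(text)
        # Languages underrepresented in the tokenizer's training data tend
        # to need more tokens per character of input text
        print(f"{language}: {len(text)} characters -> {len(tokens)} tokens")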

Misc.

  • I gave a talk (https://git.snoopj.dev/SnoopJ/talks/src/branch/master/2023/explaining_neural_networks) on the fundamentals of neural networks to Boston Python in March 2023
  • 3blue1brown has an excellent series of lessons (https://www.3blue1brown.com/topics/neural-networks) about the fundamentals of neural networks. Particularly interesting to me is the lesson on backpropagation (https://www.3blue1brown.com/lessons/backpropagation) for its excellent visualization of the process of adjusting neural network weights (a minimal numerical sketch of that weight-update step follows this list).
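
As a companion to the backpropagation lesson above, here is a minimal numerical sketch of the weight-adjustment step it visualizes, assuming NumPy and a single sigmoid neuron of my own invention: run a forward pass, apply the chain rule to get the gradient of the loss with respect to each weight, then nudge the weights against that gradient. The shapes, values, and learning rate are illustrative, not taken from the lesson.

    # Minimal sketch of one backpropagation + gradient-descent step for a
    # single sigmoid neuron; all values here are illustrative.
    import numpy as np

    rng = np.random.default_rng(0)

    x = np.array([0.5, -1.2, 3.0])   # one training example with 3 features
    y = 1.0                          # target output for that example
    w = rng.normal(size=3)           # weights to be learned
    b = 0.0                          # bias term
    lr = 0.1                         # learning rate (step size)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    for step in range(5):
        # forward pass: prediction and squared-error loss
        z = w @ x + b
        y_hat = sigmoid(z)
        loss = 0.5 * (y_hat - y) ** 2

        # backward pass (chain rule): dL/dw = (y_hat - y) * sigmoid'(z) * x
        delta = (y_hat - y) * y_hat * (1.0 - y_hat)
        grad_w = delta * x
        grad_b = delta

        # adjust the weights slightly against the gradient
        w -= lr * grad_w
        b -= lr * grad_b
        print(f"step {step}: loss = {loss:.4f}")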

Writings by others

Academic works

  • "Unmasking Clever Hans predictors and assessing what machines really learn" (https://doi.org/10.1038/s41467-019-08987-4) - "...it is important to comprehend the decision-making process itself...transparency of the what and why in a decision of a nonlinear machine becomes very effective for the essential task of judging whether the learned strategy is valid and generalizable or whether the model has based its decision on a spurious correlation in the training data"
  • "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜" (https://doi.org/10.1145/3442188.3445922) - "LMs with extremely large numbers of parameters model their training data very closely and can be prompted to output specific information from that training data"
  • "Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions" (https://arxiv.org/abs/2108.09293) - "In total, we produce 89 different scenarios for Copilot to complete, producing 1,689 programs. Of these, we found approximately 40% to be vulnerable."
  • "ChatGPT is fun, but it is not funny! Humor is still challenging Large Language Models" (https://arxiv.org/abs/2306.04563) - "Over 90% of 1008 generated jokes were the same 25 Jokes."
  • "How is ChatGPT's behavior changing over time?" (https://arxiv.org/abs/2307.09009) - "We find that the performance and behavior of both GPT-3.5 and GPT-4 can vary greatly over time."

Non-academic works

  • Donald Knuth: correspondence with Stephen Wolfram (https://cs.stanford.edu/~knuth/chatGPT20.txt) - "I myself shall certainly continue to leave such research to others, and to devote my time to developing concepts that are authentic and trustworthy. And I hope you do the same."
  • Douglas Hofstadter: "Gödel, Escher, Bach, and AI" (https://www.theatlantic.com/ideas/archive/2023/07/godel-escher-bach-geb-ai/674589/) - "I frankly am baffled by the allure, for so many unquestionably insightful people...of letting opaque computational systems perform intellectual tasks for them."
  • Ted Chiang: "ChatGPT Is a Blurry JPEG of the Web" (https://www.newyorker.com/tech/annals-of-technology/chatgpt-is-a-blurry-jpeg-of-the-web) - "Large language models identify statistical regularities in text...When we’re dealing with sequences of words, lossy compression looks smarter than lossless compression."
  • Ted Chiang: "Will A.I. Become the New McKinsey?" (https://www.newyorker.com/science/annals-of-artificial-intelligence/will-ai-become-the-new-mckinsey) - "I’m not very convinced by claims that A.I. poses a danger to humanity because it might develop goals of its own and prevent us from turning it off. However, I do think that A.I. is dangerous inasmuch as it increases the power of capitalism."

Lawsuits

The legal status of generative models, and their implications for intellectual property in the US, is something I'm trying to keep an eye on. The cases listed below are of particular interest to me.

ANDERSEN v. STABILITY AI LTD.

GETTY IMAGES (US), INC. v. STABILITY AI, INC.

DOE 1 v. GITHUB, INC.

SILVERMAN v. OPENAI, INC.

MATA v. AVIANCA, INC. (closed)

Note: this case is not itself about machine learning, but it is included in this list because it is a notable example of gross misuse of a language model by plaintiff's counsel to submit falsified documents to the court. This led to sanctions against plaintiff's counsel and dismissal of the case.