The specifics are a bit different, but the main ideas are much older than this. I’ll leave the Wikipedia summary here:
"Frank Rosenblatt, who published the Perceptron in 1958,[10] also introduced an MLP with 3 layers: an input layer, a hidden layer with randomized weights that did not learn, and an output layer.[11][12] Since only the output layer had learning connections, this was not yet deep learning. It was what later was called an extreme learning machine.[13][12]
The first deep learning MLP was published by Alexey Grigorevich Ivakhnenko and Valentin Lapa in 1965, as the Group Method of Data Handling.[14][15][12]
The first deep learning MLP trained by stochastic gradient descent[16] was published in 1967 by Shun’ichi Amari.[17][12] In computer experiments conducted by Amari’s student Saito, a five layer MLP with two modifiable layers learned internal representations required to classify non-linearly separable pattern classes.[12]
In 1970, Seppo Linnainmaa published the general method for automatic differentiation of discrete connected networks of nested differentiable functions.[3][18] This became known as backpropagation or reverse mode of automatic differentiation. It is an efficient application of the chain rule derived by Gottfried Wilhelm Leibniz in 1673[2][19] to networks of differentiable nodes.[12] The terminology “back-propagating errors” was actually introduced in 1962 by Rosenblatt himself,[11] but he did not know how to implement this,[12] although Henry J. Kelley had a continuous precursor of backpropagation[4] already in 1960 in the context of control theory.[12] In 1982, Paul Werbos applied backpropagation to MLPs in the way that has become standard.[6][12] In 1985, David E. Rumelhart et al. published an experimental analysis of the technique.[7] Many improvements have been implemented in subsequent decades.[12]"
Thank you for your criticism, but I’m not writing a research paper here, so Wikipedia is a good resource for the uninitiated public. This is also why I think it’s sufficient to convey a) what an artificial neural network is, by talking about the simplest examples, and b) that this field of research didn’t start 10 years ago, when the first big headlines were made, as the public often assumes. This tradeoff always has to be made: correctness vs. simplification. I see that you disagree with this point of view, but that’s no reason to be condescending.