For my own practice, I like to use a small network with a single hidden layer, as shown in the diagram. In this layout, X represents the input, and the subscripts i, j, k index the units in the input, hidden and output layers respectively; w_ij represents the weights connecting the input to the hidden layer, and w_jk the weights connecting the hidden to the output layer. A common choice of loss function is the sum of squared errors. Here I use the sigmoid activation function and assume the bias b is 0 for simplicity, meaning the weights are the only variables that affect the model output. Let's derive the formula for calculating the gradients of the hidden-to-output weights w_jk. The difficulty with the input-to-hidden weights is that they affect the output error only indirectly, through the hidden layer.
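As a rough illustration of the setup described above, here is a minimal NumPy sketch of both gradients for a single training example. It assumes the loss is written as E = 1/2 * sum_k (y_k - t_k)^2 (the factor 1/2 is a convenience assumption, not something stated in the text), and the names `h`, `y`, `t`, `delta_k`, `delta_j` and the layer sizes are hypothetical choices made for this example. Under those assumptions the chain rule gives dE/dw_jk = (y_k - t_k) * y_k * (1 - y_k) * h_j, while the gradient for w_ij has to pass the output error back through w_jk first.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w_ij, w_jk):
    # Bias is assumed to be 0, so each layer is just sigmoid(inputs @ weights).
    h = sigmoid(x @ w_ij)   # hidden activations, shape (n_hidden,)
    y = sigmoid(h @ w_jk)   # network outputs, shape (n_output,)
    return h, y

def loss(x, t, w_ij, w_jk):
    # Sum of squared errors, with an assumed factor of 1/2.
    _, y = forward(x, w_ij, w_jk)
    return 0.5 * np.sum((y - t) ** 2)

def gradients(x, t, w_ij, w_jk):
    h, y = forward(x, w_ij, w_jk)
    # Output layer: dE/dnet_k = (y_k - t_k) * sigmoid'(net_k)
    delta_k = (y - t) * y * (1.0 - y)
    grad_jk = np.outer(h, delta_k)               # dE/dw_jk = h_j * delta_k
    # Hidden layer: each hidden unit collects error from every output unit
    # through w_jk, which is the indirect effect on the output error.
    delta_j = (w_jk @ delta_k) * h * (1.0 - h)
    grad_ij = np.outer(x, delta_j)               # dE/dw_ij = x_i * delta_j
    return grad_ij, grad_jk

def numerical_grad(f, w, eps=1e-6):
    # Central finite differences, used only to sanity-check the formulas.
    g = np.zeros_like(w)
    for idx in np.ndindex(w.shape):
        w_plus, w_minus = w.copy(), w.copy()
        w_plus[idx] += eps
        w_minus[idx] -= eps
        g[idx] = (f(w_plus) - f(w_minus)) / (2 * eps)
    return g

# Hypothetical sizes: 3 inputs, 4 hidden units, 2 outputs.
rng = np.random.default_rng(0)
x = rng.normal(size=3)
t = np.array([0.0, 1.0])
w_ij = rng.normal(size=(3, 4))
w_jk = rng.normal(size=(4, 2))

grad_ij, grad_jk = gradients(x, t, w_ij, w_jk)
```

Comparing `grad_jk` and `grad_ij` against `numerical_grad` applied to `loss` is a quick way to confirm that the analytic expressions match the loss actually being minimised.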