[R] Architecture Determines Optimization: Deriving Weight Updates from Network Topology (seeking arXiv endorsement for cs.LG)
Abstract: We derive neural network weight updates from first principles without assuming gradient descent or a specific loss function. St...