Abstract:
Whether sub-optimal local minima and saddle points exist in the highly non-convex loss landscape of deep neural networks has a significant impact on the performance of optimization algorithms. In this paper, we theoretically study the existence of non-differentiable sub-optimal local minima and saddle points for deep ReLU networks of arbitrary depth. We prove that, under reasonable assumptions, non-differentiable saddle points always exist in the loss surface of deep ReLU networks with squared loss or cross-entropy loss. We also prove that deep ReLU networks with cross-entropy loss have non-differentiable sub-optimal local minima if some outermost samples do not belong to a certain class. Experimental results on real and synthetic datasets verify our theoretical findings.
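For intuition about the non-differentiability discussed in the abstract, here is a minimal numerical sketch in Python/NumPy. It is an illustration under simplified assumptions (a single ReLU unit, one sample, squared loss; the names `relu` and `loss` are hypothetical), not the paper's construction: the one-sided derivatives of the loss disagree at the ReLU kink, so the loss surface is non-differentiable there.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# Toy 1-parameter "network": prediction relu(w * x), squared loss on one sample.
# Illustration only, not the construction from the paper.
x, y = 1.0, 1.0

def loss(w):
    return (relu(w * x) - y) ** 2

# One-sided difference quotients at the ReLU kink w = 0.
h = 1e-6
left = (loss(0.0) - loss(-h)) / h   # approx  0.0 (ReLU inactive for w < 0)
right = (loss(h) - loss(0.0)) / h   # approx -2.0 (ReLU active for w > 0)
print(f"left derivative  approx {left:.4f}")
print(f"right derivative approx {right:.4f}")
# The one-sided derivatives disagree, so the loss is non-differentiable at w = 0.
```

The paper's results concern deep networks of arbitrary depth, where such kinks can coincide with saddle points or, under the stated conditions, sub-optimal local minima.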
Source:
NEURAL NETWORKS
ISSN: 0893-6080
Year: 2021
Volume: 144
Page: 75-89
Impact Factor: 7.800 (JCR@2022)
ESI Discipline: COMPUTER SCIENCE
ESI HC Threshold: 87
JCR Journal Grade: 1
Cited Count:
WoS CC Cited Count: 4
SCOPUS Cited Count: 6
ESI Highly Cited Papers on the List: 0