Quantized Eigenvector Matrices for 4-bit Second-Order Optimization of Deep Neural Networks

Deep neural networks (DNNs) have achieved exceptional success across varied fields, including computer vision, natural language processing, and speech recognition. This success is largely attributed to first-order optimizers such as stochastic gradient descent with momentum (SGDM) and AdamW. However, these methods face challenges in efficiently training large-scale models. Second-order optimizers, such as K-FAC,…