From Variance-Reduced Initialization To Knowledge Distillation-Inspired Pruning At Initialization: Embedding Efficiency Right From The Onset Of Neural Network Training