We introduce a novel Phased Training approach, Branch-Train-Stack, that is highly compute-efficient while offering a sim...