(yet) PyTorch Impl of Distributed Shampoo


< Table of Contents >


Optimizers for Single Processor

Vanilla AdamW


Shampoo


SOAP


Distributed Optimizer

scalable_2nd_order_paper_fig1 Fig.

scalable_2nd_order_paper_fig3 Fig.

distributed_shampoo_fig1 Fig.

distributed_shampoo_fig2 Fig.

distributed_shampoo_fig3 Fig.

distributed_shampoo_fig4 Fig.

distributed_shampoo_fig7 Fig.

Learning Rate Grafting: Transferability of Optimizer Tuning

adam_grafting Fig.

References