Model implementations with various configurations (native ViT, ResNet+ViT hybrid, different patch/heads/blocks setups, Stochastic Depth/DropPath, etc.). Training and ...