Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
Researchers from Meta have developed a new, simpler, and faster hierarchical vision transformer called Hiera.
![Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles](/content/images/size/w1200/2023/06/Screenshot_20230606_132326.png)
![](https://ssv.ai/content/images/2023/06/Screenshot_20230606_132326-1.png)
Researchers at Meta have developed Hiera, a new hierarchical vision transformer that is simpler and faster than its predecessors. By pretraining with a strong visual pretext task, the masked autoencoder (MAE), they were able to strip out the vision-specific components that earlier models had added, yielding a transformer that is both more accurate and faster. They evaluated Hiera on a variety of image and video recognition tasks.
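To give a feel for the MAE pretext task mentioned above, here is a minimal sketch of the random masking step: a fraction of the image patches is hidden and the model is trained to reconstruct them from the visible ones. The function name `random_patch_mask`, the 16x16 patch size, and the 0.75 mask ratio are illustrative assumptions, not values taken from the Hiera paper or code, and the sketch does not attempt to reflect how Hiera itself handles masking.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into (visible, masked) lists for one image.

    Hypothetical helper for illustration only; not part of the Hiera codebase.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)  # random permutation of patch indices
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(order[:num_masked])   # hidden from the encoder
    visible = sorted(order[num_masked:])  # the only patches the encoder sees
    return visible, masked


# Assumed setup: a 224x224 image cut into 16x16 patches gives 14 * 14 = 196 patches.
visible, masked = random_patch_mask(14 * 14, mask_ratio=0.75, seed=0)
print(len(visible), len(masked))  # 49 147
```

Because the encoder only processes the visible patches, a high mask ratio also makes each pretraining step cheaper, which is part of why this kind of pretext task is attractive.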
Paper
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
Modern hierarchical vision transformers have added several vision-specific components in the pursuit of supervised classification performance. While these components lead to effective accuracies and attractive FLOP counts, the added complexity actually makes these transformers slower than their vani…
Source Code
GitHub - facebookresearch/hiera: Hiera: A fast, powerful, and simple hierarchical vision transformer.
Hiera: A fast, powerful, and simple hierarchical vision transformer.