Nocaps Challenge Resources
Novel Object Captioning Resources
New Challenge
- Caption images containing objects that do not appear in the training data: Nocaps Challenge
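To make the idea of a "novel" object concrete, here is a minimal sketch that separates detected object labels into those seen during caption training and those that were not. The function name `split_by_novelty`, the class set, and the example labels are all hypothetical and chosen only for illustration; they are not taken from the nocaps tooling.

```python
def split_by_novelty(detected_labels, training_classes):
    """Split detected labels into (seen, novel) lists, preserving order.

    `training_classes` is an assumed set of object classes that had
    captions available at training time (hypothetical example data).
    """
    seen = [label for label in detected_labels if label in training_classes]
    novel = [label for label in detected_labels if label not in training_classes]
    return seen, novel


# Hypothetical data: classes with training captions vs. detector output.
training_classes = {"person", "dog", "car"}
detected = ["person", "accordion", "dog", "jellyfish"]

seen, novel = split_by_novelty(detected, training_classes)
print("seen:", seen)
print("novel:", novel)
```

A captioning model for this challenge has to describe the labels in `novel` even though no training caption ever mentioned them, which is why the object detectors listed further down matter so much.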
Before Nocaps Dataset
Reinforcement Learning
Pretrained Models & Current SOTA
- Microsoft Oscar
- Microsoft VIVO
- Microsoft VinVL: For the nocaps part, the approach is the same as VIVO, except that the authors trained a new object detector themselves and achieved even better performance.
Open Images Detectors
- Nocaps Default Detector
- 2020 Open Images Challenge #1 UT Austin UniDet
- 2019 Open Images Challenge #1 SenseTime TSD
Reading List (on arXiv)
- Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
- Neural Baby Talk
- nocaps: novel object captioning at scale
- Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
- VIVO: Surpassing Human Performance in Novel Object Captioning with Visual Vocabulary Pre-Training
- VinVL: Making Visual Representations Matter in Vision-Language Models
- Self-critical Sequence Training for Image Captioning
- Self-critical Sequence Training for Image Captioning with SPICE
- ECOL-R: Encouraging Copying in Novel Object Captioning with Reinforcement Learning
- CIDEr: Consensus-based Image Description Evaluation
- SPICE: Semantic Propositional Image Caption Evaluation
All articles in this blog are licensed under CC BY-NC-SA 4.0 unless otherwise stated.