Text-image retrieval (T2I) refers to the task of recovering all images relevant to a keyword query. Popular datasets for text-image retrieval, such as Flickr30k, Visual Genome (VG), and MS-COCO, use annotated image captions, e.g., “a man playing with a kid”, as a surrogate for queries. With such surrogate queries, current multi-modal machine learning models, such as CLIP or BLIP, perform remarkably well.