Text-image retrieval (T2I) is the task of recovering all images relevant to a keyword query. Popular datasets for text-image retrieval, such as Flickr30k, Visual Genome (VG), or MS-COCO, use annotated image captions, e.g., “a man playing with a kid”, as a surrogate for queries. With such surrogate queries, current multi-modal machine learning models, such as CLIP or BLIP, perform remarkably well.