Sitemap
A list of all the posts and pages found on the site. For the robots out there, an XML version is available for digesting as well.
Pages
Posts
Portfolio
Publications
Retrieval Augmented Verification: Unveiling Disinformation with Structured Representations for Zero-Shot Real-Time Evidence-guided Fact-Checking of Multi-modal Social media posts
Published as an arXiv preprint, 2024
Social media posts, in which real images are unscrupulously reused alongside provocative text to promote a particular idea, have been one of the major sources of disinformation. By design, these claims appear without editorial oversight and are accessible to a vast population that otherwise may not have access to multiple information sources. This implies the need to fact-check these posts and clearly explain which parts of them are fake. In the supervised learning setup, this is often reduced to a binary classification problem, neglecting all intermediate stages. Further, these claims often involve recent events on which systems trained on historical data are prone to fail. In this work, we propose a zero-shot approach that retrieves real-time web-scraped evidence from multiple news websites and matches it against the claim text and image using pretrained vision-language systems. We propose a graph-structured representation, which a) allows us to gather evidence automatically and b) helps generate interpretable results by explicitly pointing out which parts of the claim cannot be verified. Our zero-shot method, with improved interpretability, generates competitive results against state-of-the-art methods.
Recommended citation: A. U. Dey, A. Llabrés, E. Valveny, and D. Karatzas, “Retrieval Augmented Verification: Unveiling Disinformation with Structured Representations for Zero-Shot Real-Time Evidence-guided Fact-Checking of Multi-modal Social media posts,” arXiv:2404.10702, Apr. 29, 2024. doi: 10.48550/arXiv.2404.10702.
Download Paper
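The evidence-matching step at the core of this work can be illustrated with a small sketch: the snippet below (not the authors' released code) scores web-scraped evidence snippets against a claim image with an off-the-shelf CLIP model from Hugging Face transformers. The checkpoint name, file name, example snippets, and the 0.25 threshold are illustrative assumptions only; the evidence retrieval and graph-structured representation described in the abstract are not reproduced here.

```python
# Minimal sketch (assumptions, not the authors' pipeline): score retrieved
# evidence texts against a claim image in CLIP's joint embedding space.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")      # illustrative checkpoint
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

claim_image = Image.open("claim_post.jpg")        # image attached to the post (placeholder path)
evidence_texts = [                                # web-scraped headlines/captions (made-up examples)
    "Flood hits coastal town in 2018, hundreds evacuated",
    "Election rally draws record crowd in the capital",
]

inputs = processor(text=evidence_texts, images=claim_image,
                   return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    out = model(**inputs)

# Cosine similarity between the claim image and each evidence snippet.
img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
scores = (txt @ img.T).squeeze(-1)

for text, score in zip(evidence_texts, scores.tolist()):
    verdict = "supports image" if score > 0.25 else "unverified"      # threshold is arbitrary
    print(f"{score:.3f}  {verdict}: {text}")
```

In the paper this kind of claim-evidence scoring feeds the graph-structured representation, which is what makes the unverifiable parts of a claim explicit; the sketch only shows the underlying similarity computation.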
Image-text matching for large-scale book collections
Published in 16th IAPR International Workshop on Document Analysis Systems, 2024
We address the problem of detecting and mapping all books in a collection of images to entries in a given book catalogue. Instead of performing independent retrieval for each detected book, we treat the image-text mapping problem as a many-to-many matching process, looking for the best overall match between the two sets. We combine a state-of-the-art segmentation method (SAM) to detect book spines and extract book information using a commercial OCR. We then propose a two-stage approach for text-image matching, where CLIP embeddings are used first for fast matching, followed by a second, slower stage that refines the matching using either the Hungarian Algorithm or a BERT-based model trained to cope with noisy OCR input and partial text matches. To evaluate our approach, we publish a new dataset of annotated bookshelf images that covers the whole book collection of a public library in Spain. In addition, we provide two target lists of book metadata: a closed set of 15k book titles that corresponds to the known library inventory, and an open set of 2.3M book titles to simulate an open-world scenario. We report results in two settings: a matching-only task, where the book segments and OCR output are given and the objective is to perform many-to-many matching against the target lists, and a combined detection and matching task, where books must first be detected and recognised before they are matched to the target list entries. We show that both the Hungarian Matching and the proposed BERT-based model outperform a fuzzy string matching baseline, and we highlight inherent limitations of the matching algorithms as the target list grows in size and when either of the two sets (detected books or target book list) is incomplete. The dataset and code are available at https://github.com/llabres/library-dataset
Recommended citation: A. Llabrés, A. U. Dey, D. Karatzas, and E. Valveny, “Image-text matching for large-scale book collections,” arXiv:2407.19812, Jul. 29, 2024. doi: 10.48550/arXiv.2407.19812.
Download Paper
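To make the two-stage matching idea concrete, here is a hedged sketch: a fast embedding-based shortlist per detected spine, followed by a Hungarian-algorithm assignment over the whole set, mirroring the many-to-many view taken in the paper. The sentence-transformers encoder stands in for the CLIP embeddings used in the paper, the OCR strings, catalogue entries, and top-k value are made-up placeholders, and the BERT-based matcher is not shown.

```python
# Minimal sketch (assumptions, not the paper's released code) of two-stage matching:
# stage 1 shortlists catalogue candidates per spine, stage 2 solves a global
# one-to-one assignment with the Hungarian algorithm.
import numpy as np
from scipy.optimize import linear_sum_assignment
from sentence_transformers import SentenceTransformer

spine_texts = ["el quijote cervantes", "cien anos de soledad g. marquez"]   # noisy OCR (examples)
catalogue = [
    "Don Quijote de la Mancha - Miguel de Cervantes",
    "Cien años de soledad - Gabriel García Márquez",
    "La casa de los espíritus - Isabel Allende",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")    # stand-in text encoder, not the paper's CLIP model
spine_emb = encoder.encode(spine_texts, normalize_embeddings=True)
cat_emb = encoder.encode(catalogue, normalize_embeddings=True)
sim = spine_emb @ cat_emb.T                          # cosine similarities

# Stage 1: keep only the top-k catalogue candidates per spine (fast filtering).
k = 2
pruned = np.full_like(sim, -1e6)                     # large penalty blocks non-candidates
topk = np.argsort(-sim, axis=1)[:, :k]
for i, cols in enumerate(topk):
    pruned[i, cols] = sim[i, cols]

# Stage 2: Hungarian algorithm finds the best overall one-to-one assignment
# (maximising total similarity by minimising its negation).
rows, cols = linear_sum_assignment(-pruned)
for r, c in zip(rows, cols):
    print(f"{spine_texts[r]!r} -> {catalogue[c]!r} (sim={sim[r, c]:.2f})")
```

Solving the assignment globally, rather than taking the best catalogue entry per spine independently, is what prevents two detected spines from being mapped to the same catalogue record; the scaling limitations noted in the abstract show up as this cost matrix grows with the target list.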