Research Project

Accelerat.AI

Funder

Authors

Publications

Combining Text and Visual Modalities for Enhanced Portuguese Image Retrieval
Publication . Duarte, Rodrigo Manuel Teixeira; Campos, Ricardo Nuno Taborda; Proença, Hugo Pedro Martins Carriço
The availability of digital images on the Internet has grown exponentially in recent years. This has made it challenging for users to find relevant images in the context of Information Retrieval (IR) tasks, as search engines are often unable to understand image content accurately. The challenge is even greater when searching for images in languages other than English, especially low-to-mid resource languages like Portuguese, which often lack the necessary linguistic resources. Several approaches have been proposed to address these issues, such as multimodal language models that attempt to understand both image content and the associated textual information. However, most of these models are fine-tuned primarily for English. Another common strategy relies on translation models, where queries in a target language are translated into English before being processed. This solution is also imperfect, as the meaning of the query can be lost in translation, leading to suboptimal results. This MSc thesis tackles the challenge by developing and evaluating multimodal approaches for Portuguese image retrieval, with a specific focus on understanding the limitations and opportunities of current vision-language models. Our hypothesis is that combining text-based and image-based retrieval modalities through score adjustment mechanisms will lead to more effective results than either approach alone. The primary objective of this research is to develop an effective image IR system for Portuguese queries and to establish performance baselines for this domain. To achieve this, we created a Portuguese image retrieval evaluation dataset comprising 80 queries and 5,201 annotated images from the Portuguese Presidency website.
We developed a novel hybrid retrieval algorithm that combines text-based and image-based retrieval through mathematical score adjustment mechanisms, using K-Nearest Neighbors (KNN) algorithms for similarity matching. Our evaluation covered traditional text-based IR methods, commercial search engines, Portuguese-specific language models, and state-of-the-art vision-language models. The results revealed that multilingual vision-language models, particularly OpenCLIP xlm-roberta-base, outperformed traditional text-based approaches by 62% in Mean Reciprocal Rank (MRR), and achieved 71% better performance with shorter queries than with longer descriptive formulations. Surprisingly, fine-tuning experiments showed decreased performance across all metrics, with degradations ranging from 16% to 28%, suggesting that pre-trained multilingual representations are more valuable than domain-specific adaptations. The proposed hybrid algorithm achieved a 1.8% improvement in MRR over the best baseline approach.
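
As an illustration of the two ideas named in the abstract, the sketch below shows how Mean Reciprocal Rank is conventionally computed and one simple way text-based and image-based scores could be combined. It is not the thesis's algorithm: the abstract does not specify the score adjustment mechanism, so the weighted sum of min-max normalised scores, the `alpha` weight, and all function names here are assumptions made for illustration only.

```python
from typing import Dict, List, Set


def reciprocal_rank(ranked_ids: List[str], relevant_ids: Set[str]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is retrieved."""
    for position, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant_ids:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Average the reciprocal rank over every query in `runs`."""
    if not runs:
        return 0.0
    total = sum(
        reciprocal_rank(ranking, qrels.get(query, set()))
        for query, ranking in runs.items()
    )
    return total / len(runs)


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; a constant score list maps to all zeros."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {key: 0.0 for key in scores}
    return {key: (value - low) / (high - low) for key, value in scores.items()}


def fuse_scores(
    text_scores: Dict[str, float],
    image_scores: Dict[str, float],
    alpha: float = 0.5,
) -> List[str]:
    """Hypothetical fusion: weighted sum of normalised text and image scores.

    `alpha` weights the text-based score; (1 - alpha) weights the image-based
    score. An image missing from one modality contributes 0.0 for it.
    Returns image ids ordered from best to worst (ties broken by id).
    """
    text_norm = min_max(text_scores)
    image_norm = min_max(image_scores)
    candidates = set(text_norm) | set(image_norm)
    fused = {
        image_id: alpha * text_norm.get(image_id, 0.0)
        + (1.0 - alpha) * image_norm.get(image_id, 0.0)
        for image_id in candidates
    }
    return sorted(fused, key=lambda image_id: (-fused[image_id], image_id))
```

In a setup like this, `fuse_scores` would be called once per query to produce a ranking, and the rankings for all queries would then be passed to `mean_reciprocal_rank` together with the relevance annotations.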

Organizational Units

Description

Accelerat.ai aims to create disruptive solutions based on Conversational Artificial Intelligence (AI) Agents and CCaaS that will enable faster, more efficient interaction between public/private entities and customers/citizens, thereby creating new, more sustainable digital business models for an Economy of the Future. The goal is to pioneer a platform of cognitive services in European Portuguese (across its various accents and age groups) with a fully transformative effect on the user experience when interacting with products or services. This innovative platform will be more inclusive, always available (24/7), and customized by industry, and is expected to resolve about 80% of support cases. The consortium will invite all Portuguese people, from the country's various regions, to contribute, with their voice and with their knowledge as native speakers, to the training of AI models and disruptive services for this new Digital Era in Portugal and in Europe.

Keywords

Contributors

Funders

Funding agency

IAPMEI - AGÊNCIA PARA A COMPETITIVIDADE E INOVAÇÃO, I.P.

Funding programme

Reforço: Agendas/Alianças mobilizadoras para a Inovação Empresarial (Empréstimos)

Funding Award Number

01/C05-i11/2024.PC644865762-00000008

ID