ViTOR: Learning to Rank Webpages Based on Visual Features

Dataset

The visual appearance of a webpage carries valuable information about the page's quality and can be used to improve the performance of learning to rank (LTR). We introduce the Visual learning TO Rank (ViTOR) model, which integrates state-of-the-art visual feature extraction methods: (i) transfer learning from a pre-trained image classification model, and (ii) synthetic saliency heatmaps generated from webpage snapshots. Since no public dataset is currently available for the task of LTR with visual features, we also introduce and release the ViTOR dataset, which contains visually rich and diverse webpages. The ViTOR dataset consists of visual snapshots, non-visual features, and relevance judgments for ClueWeb12 webpages and TREC Web Track queries. We experiment with the proposed ViTOR model on the newly introduced ViTOR dataset and show that our model significantly improves the performance of LTR with visual features.

Identifier
DOI	https://doi.org/10.17026/dans-xah-fkcq
PID	https://nbn-resolving.org/urn:nbn:nl:ui:13-ai-5j8y
Related Identifier	http://www2019.thewebconf.org
Related Identifier	https://lemurproject.org/clueweb12/
Related Identifier	https://trec.nist.gov/data/web2013.html
Related Identifier	https://trec.nist.gov/data/web2014.html
Metadata Access	https://easy.dans.knaw.nl/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=oai:easy.dans.knaw.nl:easy-dataset:115980

Provenance
Creator	Akker, B van den; Markov, I; Rijke, M de
Publisher	Data Archiving and Networked Services (DANS)
Contributor	University of Amsterdam
Publication Year	2023
Rights	info:eu-repo/semantics/openAccess; License: https://creativecommons.org/licenses/by-nd/4.0/
OpenAccess	true

Representation
Resource Type	Dataset
Discipline	Other