The CLEF-IP 2010 Test Collection

Dataset

CLEF-IP: Cross-Language Evaluation Forum - Intellectual Property

The CLEF-IP track was launched in 2009 to investigate IR techniques for patent retrieval, and it is part of the CLEF 2010 evaluation campaign. The track uses a collection of more than 1.3 million patent documents (about 2.6 million files) derived from EPO (European Patent Office) sources and published before 2001. The collection contains documents in English, French and German, with at least 150,000 documents in each language. The 2010 track has two tasks. The first is to find patent documents that are candidates to constitute prior art for a given document. The second is to classify a given document according to the International Patent Classification (IPC) system. Relevance judgements are produced using the patent citations and metadata (bibliographic data).

Files

Document Collection	The collection contains over 2.6 million XML files.
Topics and Answers	Both the training and the test topic sets also contain the relevance assessments for the topics.
Guidelines	A detailed explanation of how to work on the tasks with the corpus.

Identifier
DOI	https://doi.org/10.48436/jqrsc-jbq51
Related Identifier	IsDescribedBy http://ceur-ws.org/Vol-1176/CLEF2010wn-CLEF-IP-PiroiEt2010.pdf
Related Identifier	IsVersionOf https://doi.org/10.48436/dfds7-7az62
Metadata Access	https://researchdata.tuwien.ac.at/oai2d?verb=GetRecord&metadataPrefix=oai_datacite&identifier=oai:researchdata.tuwien.ac.at:jqrsc-jbq51
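
The Metadata Access link above is an OAI-PMH GetRecord request. As a minimal sketch of how such a request could be assembled and its response read, the Python below rebuilds that URL from its parts and pulls any title elements out of the returned XML. The function names (build_getrecord_url, extract_titles, fetch_titles) are hypothetical helpers introduced for illustration, and the assumption that the response carries DataCite-style title elements is not confirmed by this record.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Values copied from the Metadata Access field of this record.
OAI_BASE = "https://researchdata.tuwien.ac.at/oai2d"
RECORD_ID = "oai:researchdata.tuwien.ac.at:jqrsc-jbq51"


def build_getrecord_url(base=OAI_BASE, identifier=RECORD_ID, prefix="oai_datacite"):
    """Assemble an OAI-PMH GetRecord URL (hypothetical helper)."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": prefix,
        "identifier": identifier,
    }
    # safe=":" keeps the colons of the OAI identifier unescaped,
    # matching the form of the URL shown in the record.
    return base + "?" + urlencode(params, safe=":")


def extract_titles(xml_text):
    """Return the text of every <title> element, ignoring XML namespaces.

    Assumption: the metadata uses elements whose local name is "title".
    """
    root = ET.fromstring(xml_text)
    return [
        el.text.strip()
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "title" and el.text
    ]


def fetch_titles(timeout=30):
    """Download the record and extract its titles (requires network access)."""
    with urlopen(build_getrecord_url(), timeout=timeout) as resp:
        return extract_titles(resp.read().decode("utf-8"))
```

Calling build_getrecord_url() with its defaults should reproduce the Metadata Access URL listed above; fetch_titles() performs the actual download and is left uncalled here because it depends on the repository being reachable.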

Provenance
Creator	Piroi, Florina ; Tait, John
Publisher	TU Wien
Publication Year	2021
Rights	Creative Commons Attribution Non Commercial Share Alike 3.0 Unported; https://creativecommons.org/licenses/by-nc-sa/3.0/legalcode
OpenAccess	true
Contact	tudata(at)tuwien.ac.at

Representation
Language	English
Resource Type	Dataset
Version	1.0.0
Discipline	Other