Research in IR puts a strong focus on evaluation, with many past and ongoing evaluation campaigns. However, most evaluations utilize offline experiments with single queries only, while most IR applications are interactive, with multiple queries in a session. Moreover, context (e.g., time, location, access device, task) is rarely considered. Finally, the large variance of search topic difficulty makes performance prediction especially hard.
Several types of prediction may be relevant in IR. One case is that we have a system and a collection and we would like to know what happens when we move to a new collection, keeping the same kind of task. In another case, we have a system, a collection, and a kind of task, and we move to a new kind of task. A further case is when collections are fluid, and the task must be supported over changing data.
Current approaches to evaluation mean that predictability can be poor, in particular:
Perhaps the most significant issue is the gap between offline and online evaluation. Correlations between system performance, user behavior, and user satisfaction are not well understood, and offline predictions of changes in user satisfaction remain poor because the mapping from metrics to user perceptions and experiences is still unclear.
July 9, 2018, extended to July 16, 2018
Notification of acceptance: July 30, 2018, moved to August 10, 2018
Camera ready: August 27, 2018
Workshop day: October 22, 2018
Conference days: October 23-26, 2018
General areas of interest include, but are not limited to, the following topics:
Papers should be formatted according to the ACM SIG Proceedings Template.
Beyond research papers (4-6 pages), we will solicit short (1 page) position papers from interested participants.
Papers will be peer-reviewed by members of the program committee through double-blind peer review, i.e., submissions must be anonymized. Selection will be based on originality, clarity, and technical quality. Papers should be submitted in PDF format to the following address:
Accepted papers will be published online as a volume of the CEUR-WS proceedings series.
Towards a Basic Principle for Ranking Effectiveness Prediction without Human Assessments: A preliminary Study
Novel Pre-Retrieval Query Performance Predictors and their Correlations for Heterogeneous Medical Applications
Data Analytics to study the transferability of parameter settings in IR
Causality, prediction and improvements that (don't) add up
The Challenges of Moving from Web to Voice in Product Search
Offline vs. Online Evaluation in Voice Product Search