A Central Repository Collecting Structured Data Via Collaborative Online Communities

1414 Words | Aug 29, 2015 | 6 Pages
Qiong Bu
School of Electronics and Computer Science, University of Southampton
q.bu@soton.ac.uk

1. INTRODUCTION

[[Wikidata, and the problem]]
Wikidata is a central repository that collects structured data through collaborative online communities. It provides support for the content of Wikipedia as well as other sites [1]. Like many other projects that rely on volunteer contributions, Wikidata content is largely edited by volunteers (more than fifteen thousand active users [2]) from diverse locations, backgrounds and skill levels. Inevitably, there is demand for understanding the quality and trustworthiness of Wikidata. One thing that sets Wikidata apart from other linked open data repositories is its focus on curating sourced data, which makes each statement verifiable. In reality, however, not only do a large number of statements lack any references, but there is also no overall view of how good the sources are for the statements that do have references.

[[Our focus]]
Our focus in this paper is to investigate the trustworthiness of the statements by looking at how good the existing sources are, and to explore using paid crowdsourcing microtasks to curate source data for the statements whose references are missing. The trustworthiness of a statement is defined as the degree to which the data is deserving of trust or

[1] https://www.wikidata.org/wiki/Wikidata:Main_Page
[2] http://stats.wikimedia.org/wikispecial/EN/TablesWikipediaWIKIDATA.htm#editor_activity_levels

