ClassRank

Results of ClassRank

Wikidata

On this page we publish some results obtained by applying ClassRank to a dump of Wikidata. The dump used (dated 2016/10/26) is no longer offered on Wikidata's download server, and the file is too large to host on our current server. If you are interested in reproducing our computations, please contact the people listed in the footer of this site to obtain the source dataset.

We also publish some other files from experiments that compare the notion of ClassRank with some metrics of Wikidata.

We provide the following contents:

  • Wikidata class-pointers. CSV file with an evaluation of more than 400 Wikidata properties, each marked as class-pointer or not class-pointer.
  • ClassRank results. The settings used are the following:
    • Threshold: 5
    • Class-pointers: the ones indicated in the previous file.
    The format of this file is JSON, using the base model specified in the home page of this prototype. The classes are sorted by their ClassRank score in descending order. In this case, we have made a couple of additions to the base JSON model within the dictionary of each class:
    • Field "thrs": A summary of the results that each class would have obtained with a different threshold. We include the ClassRank score and the total number of instances for the following threshold values: [5, 10, 15, 25, 50, 100, 300, 500, 1000].
    • Field "under_t_cps": It contains the identifiers of instances that do not affect the ClassRank score of the class because the corresponding class-pointer is not used enough (with threshold = 5).
  • List of classes according to Wikidata rules. Wikidata applies some rules in order to detect classes among its items. A given item is labelled as a class if it satisfies at least one of the following conditions:
    • The item is 'o' in at least one triple (x, instance of, o).
    • The item is 's' or 'o' in at least one triple (s, subclass of, o).
    We have tracked all the items that meet one of those conditions and sorted them by their total number of connections (incoming or outgoing links with other items in the graph).
  • Precision experiments. We have drawn a random sample of classes detected by ClassRank and of classes according to Wikidata rules in order to compare the precision of both approaches.
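
The two Wikidata rules listed above, together with the sorting by number of connections, can be illustrated with a minimal Python sketch. This is not the code used to produce the published list. It assumes the dump has already been reduced to (subject, property, object) triples between items, and it uses the Wikidata property identifiers P31 ("instance of") and P279 ("subclass of"). The function name `wikidata_classes` and the tie-breaking by identifier are hypothetical choices made for illustration.

```python
from collections import Counter

# Wikidata property identifiers for the two rules.
INSTANCE_OF = "P31"   # "instance of"
SUBCLASS_OF = "P279"  # "subclass of"


def wikidata_classes(triples):
    """Return (item, connections) pairs for every item that satisfies
    one of the two rules, sorted by number of connections (descending).

    `triples` is an iterable of (subject, property, object) tuples in
    which both subject and object are items (assumption: literal values
    have been filtered out beforehand).
    """
    classes = set()
    connections = Counter()
    for s, p, o in triples:
        # Every triple is an outgoing link for s and an incoming link for o.
        connections[s] += 1
        connections[o] += 1
        if p == INSTANCE_OF:
            classes.add(o)            # rule 1: object of "instance of"
        elif p == SUBCLASS_OF:
            classes.update((s, o))    # rule 2: subject or object of "subclass of"
    # Ties are broken by identifier only to make the output deterministic.
    return sorted(((c, connections[c]) for c in classes),
                  key=lambda pair: (-pair[1], pair[0]))


sample = [
    ("Q42", "P31", "Q5"),       # Douglas Adams, instance of, human
    ("Q5", "P279", "Q215627"),  # human, subclass of, person
    ("Q42", "P106", "Q36180"),  # Douglas Adams, occupation, writer
]
print(wikidata_classes(sample))
```

In this toy input, Q36180 is not reported as a class because it never appears in an "instance of" or "subclass of" triple, even though it is linked to another item.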

We provide a link to the source code of the release of ClassRank used to obtain the files listed.