WRPA stands for the “Relational Paraphrase Acquisition from Wikipedia” corpus. It contains relational paraphrases extracted from Wikipedia by the WRPA system and is divided into several sub-corpora:
WRPA-person is composed of a group of 362 paraphrases expressing the person-date_of_birth relation, 449 paraphrases expressing the person-date_of_death relation and 965 paraphrases expressing the person-place_of_birth relation.
WRPA-person-2 is composed of a group of 55 paraphrases expressing the person-alternate_name relation, 40 paraphrases for person-charge, 54 for person-child, 238 for person-residence, 233 for person-employee_of, 375 for person-member_of, 555 for person-origin, 40 for person-parent, 62 for person-religion, 94 for person-school_attended, 413 for person-spouse and 532 for person-title.
WRPA-authorship is composed of 81,101 pairs of paraphrases expressing the authorship relation.
WRPA-authorship-A is composed of 1,000 paraphrase pairs from WRPA-authorship manually annotated with the paraphrase phenomena they contain.
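To make the sub-corpus sizes easier to compare, the per-relation counts listed above can be tallied with a short script. This is only an illustrative sketch: the dictionary and function names are assumptions made for this example and are not part of the WRPA distribution; the numbers are copied from the description above.

```python
# Paraphrase counts per relation, copied from the corpus description.
# The variable names are hypothetical and chosen for this example only.
WRPA_PERSON = {
    "date_of_birth": 362,
    "date_of_death": 449,
    "place_of_birth": 965,
}

WRPA_PERSON_2 = {
    "alternate_name": 55,
    "charge": 40,
    "child": 54,
    "residence": 238,
    "employee_of": 233,
    "member_of": 375,
    "origin": 555,
    "parent": 40,
    "religion": 62,
    "school_attended": 94,
    "spouse": 413,
    "title": 532,
}


def total(counts):
    """Return the total number of paraphrases in one sub-corpus."""
    return sum(counts.values())


print("WRPA-person:", total(WRPA_PERSON))      # 1776
print("WRPA-person-2:", total(WRPA_PERSON_2))  # 2691
```

Under these counts, WRPA-person holds 1,776 paraphrases and WRPA-person-2 holds 2,691, both much smaller than the 81,101 pairs in WRPA-authorship.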
For further information on the corpus, refer to the README.txt file in the corresponding download package.
This research work is carried out in the framework of the following projects and grants: