During data processing, I found that when two authors share the same name, for example Yang, Xia (University A) and Yang, Xia (University B), they are assigned to the same author group even though they are at different universities and are distinct individuals. Could this be a significant issue for the project?
Here is an example dataset (these authors are different individuals but are incorrectly grouped as the same author):
| | authorID | groupID | author_name | author_order | address | university | department | postal_code | city | state | country | RP_address |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 429 | 429 | 4595 | Yang, Xia | 6 | Univ Malaya, Kuala Lumpur, Malaysia. | univ malaya | NA | NA | kuala lumpur | NA | malaysia | NA |
| 1211 | 1211 | 4595 | Yang, Xia | 1 | Cent South Univ, Xiangya Hosp 3, Dept Pediat, 138 Tongzipo Rd, Changsha 410013, Hunan, Peoples R China. | cent south univ | xiangya hosp 3 | 41001 | changsha | hunan | peoples r china | NA |
| 1294 | 1294 | 4595 | Yang, Xia | 5 | Air Force Med Ctr, Dept Anesthesiol, Beijing, Peoples R China. | air force med ctr | NA | NA | dept anesthesiol | beijing | peoples r china | NA |
| 1505 | 1505 | 4595 | Yang, Xia | 6 | Shenzhen Univ, Shenzhen Peoples Hosp 2, Affiliated Hosp 1, Dept Traumat Orthoped,Shenzhen Translat Med Inst, Shenzhen 518028, Peoples R China. | | | | | | | |
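One way to surface cases like this for manual review is to flag every `groupID` whose rows span more than one university. The snippet below is only a rough sketch in plain Python, not part of refsplitr. The function name `groups_with_mixed_universities` is hypothetical. The rows are copied from the example, keeping only the columns needed and leaving out the row whose parsed fields are not shown. A flagged group is not proof of an error, since one person can change institutions.

```python
from collections import defaultdict

# Rows copied from the example dataset (only the columns needed here).
rows = [
    {"authorID": 429, "groupID": 4595, "author_name": "Yang, Xia", "university": "univ malaya"},
    {"authorID": 1211, "groupID": 4595, "author_name": "Yang, Xia", "university": "cent south univ"},
    {"authorID": 1294, "groupID": 4595, "author_name": "Yang, Xia", "university": "air force med ctr"},
]


def groups_with_mixed_universities(rows):
    """Return {groupID: sorted universities} for groups spanning more than one university.

    Hypothetical helper for illustration; missing values ("NA" or empty) are ignored.
    """
    seen = defaultdict(set)
    for row in rows:
        university = row.get("university")
        if university and university != "NA":
            seen[row["groupID"]].add(university)
    return {gid: sorted(unis) for gid, unis in seen.items() if len(unis) > 1}


suspicious = groups_with_mixed_universities(rows)
print(suspicious)
```

With the rows above, group 4595 is flagged because it mixes three different universities.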
For reference, the refsplitr documentation describes the matching step as follows: "Once we have our subset of possible similar entries, we match the existing info of row 1 against the subset. The entry only needs to match one extra piece of information - either address, email, or middle name. If it matches we assume it is the same person, and change the groupID numbers to reflect this."
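To make the quoted rule concrete, here is a minimal Python sketch of "match on one extra piece of information". It is not refsplitr's actual code. The function names (`shares_extra_info`, `assign_group`), the field names `email` and `middle_name`, the sample email, and the use of exact string equality are all assumptions made for illustration. Under this reading of the rule, two same-name records with different addresses and no shared email or middle name would stay in separate groups, which is what I expected for the Yang, Xia rows.

```python
def shares_extra_info(a, b, fields=("address", "email", "middle_name")):
    """True if both records hold the same non-missing value in at least one field.

    Field names and exact-equality comparison are illustrative assumptions.
    """
    for field in fields:
        value_a, value_b = a.get(field), b.get(field)
        if value_a and value_b and value_a != "NA" and value_a == value_b:
            return True
    return False


def assign_group(row, candidates):
    """Copy the groupID of the first candidate that shares one extra piece of info.

    Returns True if the row was merged into an existing group, False otherwise.
    """
    for candidate in candidates:
        if shares_extra_info(row, candidate):
            row["groupID"] = candidate["groupID"]
            return True
    return False


# Two same-name records from the example, each starting in its own group.
malaya = {"author_name": "Yang, Xia", "groupID": 429,
          "address": "Univ Malaya, Kuala Lumpur, Malaysia."}
beijing = {"author_name": "Yang, Xia", "groupID": 1294,
           "address": "Air Force Med Ctr, Dept Anesthesiol, Beijing, Peoples R China."}

merged = assign_group(beijing, [malaya])  # no shared address, email, or middle name

# A hypothetical record that shares an (invented) email with a candidate is merged.
with_email = {"author_name": "Yang, Xia", "groupID": 9001, "email": "x.yang@example.org"}
candidate = {"author_name": "Yang, Xia", "groupID": 429, "email": "x.yang@example.org"}
merged_by_email = assign_group(with_email, [candidate])
```

In this sketch `merged` is `False` and the Beijing record keeps its own `groupID`, while the record sharing an email is moved into group 429.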
Reference: https://docs.ropensci.org/refsplitr/articles/refsplitr.html#author-address-parsing-and-name-disambiguation