In the Arabic validation set arabic_gsr_validation_18-11-14.xml, the sentence 5b6757616203c433883a1f0b produces a target actor with the code USAMED, whereas the actual target is "American soldiers" (جندي_أميركي_), which should code to USAMIL.

The MED (media) agent comes from the word موقع (site/location), which is in both the sentence and the agent dictionary, and which a chain of dependencies (possibly due to a parsing error) connects to أميركي (American). However, the phrase جندي_أميركي_ is in the actor dictionary and should have taken precedence: once a country-agent combination has been matched, there is no need to look further for agents (at least this is how TABARI and PETR-1 worked, and it is thus still implicit in the UDP dictionaries). Also, if multiple agents are present, the more proximate one should take priority, and جندي (soldiers) is in the agent dictionary. At the very least, if agents were being concatenated, you'd get USAMILMED or USAMEDMIL.

This is, granted, a somewhat odd situation, as موقع probably shouldn't be in the agent dictionary in the first place: it is too general (it's there, presumably, as a synonym for موقع موقع_إلكتروني (website) and got there via automated translation). But the agent assignment precedence rules for dictionaries and proximity are more general.
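To make the expected precedence concrete, here is a minimal Python sketch of the two rules described above: an actor-dictionary phrase wins outright, and otherwise the nearest agent is attached to the country code. Everything in it is a simplifying assumption for illustration: the dictionary contents, the `code_actor` function, and the use of token distance as the measure of proximity are hypothetical and are not taken from the actual coder or the UDP dictionary format (a real implementation would presumably measure proximity in the dependency parse).

```python
# Hypothetical, heavily simplified dictionaries (not the real UDP format).
ACTOR_DICT = {"جندي_أميركي": "USAMIL"}      # full phrase: "American soldier"
COUNTRY_DICT = {"أميركي": "USA"}            # "American"
AGENT_DICT = {"جندي": "MIL", "موقع": "MED"}  # "soldier", "site"


def code_actor(tokens, head_index, actor_dict=ACTOR_DICT):
    """Return an actor code for the country word at tokens[head_index]."""
    head = tokens[head_index]

    # Rule 1: a phrase found in the actor dictionary takes precedence,
    # and no further agent lookup is done.
    if head_index > 0:
        phrase = tokens[head_index - 1] + "_" + head
        if phrase in actor_dict:
            return actor_dict[phrase]

    country = COUNTRY_DICT.get(head)
    if country is None:
        return None

    # Rule 2: otherwise attach only the nearest agent
    # (token distance is an assumption made for this sketch).
    candidates = [
        (abs(i - head_index), i)
        for i, tok in enumerate(tokens)
        if tok in AGENT_DICT
    ]
    if not candidates:
        return country
    _, nearest = min(candidates)
    return country + AGENT_DICT[tokens[nearest]]


# Toy sentence: "site ... soldier American"
tokens = ["موقع", "قرب", "جندي", "أميركي"]
result = code_actor(tokens, 3)
```

Under either rule this toy sentence codes to USAMIL rather than USAMED: rule 1 matches the actor-dictionary phrase directly, and if that entry were missing, rule 2 would still pick جندي over the more distant موقع.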