Pronunciation variants #751

Open
stannam opened this issue Feb 4, 2021 · 3 comments

@stannam
Member

stannam commented Feb 4, 2021

An example csv file (csv_pron_var.txt) is in 'csv_sample'.

  • Unwanted transcription column (only 'Canonical' is expected but another column 'Transcription' gets added when creating a corpus) -- edit: solved

  • PCT doesn't autogenerate the inventory chart (all segments go under 'uncategorized') -- edit: this issue doesn't arise in the most recent version of 'master'.

  • Question: Does Phonotactic probability not allow '... as separate entry' as a variant option? (cf. this)
    image

  • Error when calculating MI, FL, PrOD, etc. by pronunciation variants. -- edit: solved?

Traceback (most recent call last):
  File "D:\PycharmProjects\CorpusTools\corpustools\gui\migui.py", line 44, in run
    call_back = kwargs['call_back'])
  File "D:\PycharmProjects\CorpusTools\corpustools\mutualinfo\mutual_information.py", line 197, in pointwise_mi
    probability=True, need_wb=need_wd)
  File "D:\PycharmProjects\CorpusTools\corpustools\contextmanagers.py", line 98, in get_frequency_base
    tier = getattr(word, self.sequence_type)
AttributeError: 'Word' object has no attribute 'Transcription'
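
The last frame looks the tier up with getattr(word, self.sequence_type), so this error shows up whenever the selected sequence type names a column that the Word object does not carry. A minimal sketch of that failure mode, for illustration only: the Word class and the get_tier helper below are hypothetical stand-ins rather than the CorpusTools code, and the tier names are assumed from the description above.

```python
class Word:
    """Hypothetical stand-in for a corpus word, NOT the real CorpusTools class."""

    def __init__(self, spelling, **tiers):
        self.spelling = spelling
        # Each keyword becomes an attribute named after its column,
        # e.g. Canonical=["k", "ae", "t"] creates word.Canonical.
        for name, value in tiers.items():
            setattr(self, name, value)


def get_tier(word, sequence_type):
    """Dynamic lookup by column name; on failure, list the tiers that do exist."""
    try:
        return getattr(word, sequence_type)
    except AttributeError:
        available = sorted(k for k in vars(word) if k != "spelling")
        raise AttributeError(
            f"{type(word).__name__!r} object has no attribute {sequence_type!r}; "
            f"available tiers: {available}"
        ) from None


# A corpus created with only a 'Canonical' column.
word = Word("cat", Canonical=["k", "ae", "t"])
print(get_tier(word, "Canonical"))

# Asking for 'Transcription' fails the same way as in the traceback,
# but the message now names the tiers that are actually present.
try:
    get_tier(word, "Transcription")
except AttributeError as exc:
    print(exc)
```

Listing the available tier names in the message makes it easier to tell a mismatched column name apart from a word that has no transcription at all.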

@stannam stannam added the bug label Feb 4, 2021
@stannam stannam self-assigned this Feb 4, 2021
@stannam stannam changed the title Mutual Information on Buckeye Corpus Buckeye Corpus pronunciation variants Feb 4, 2021
@stannam stannam changed the title Buckeye Corpus pronunciation variants Pronunciation variants Feb 15, 2021
stannam added a commit that referenced this issue Feb 16, 2021
Need to confirm the change in the Word class (lexicon.py) does not raise any error elsewhere. ('create corpus', and 'add word' functions work properly)
@stannam stannam mentioned this issue Mar 8, 2021
25 tasks
@YuHsiangLo
Contributor

Hmmm, I think this is caused by mixing the attributes _transcription, Transcription, and _transcription_name with the transcription getter and setter... This issue is related to #756, and we'll need to do some fundamental refactoring of the Word class to solve this type of problem once and for all.
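
One possible direction for that refactoring, as a rough sketch only: keep the segments in a single private field and route every public name to it. The class below is hypothetical and is not the existing Word class in lexicon.py; its constructor signature and the fallback to the default name 'Transcription' are assumptions made for illustration.

```python
class Word:
    """Hypothetical sketch: one private field, every public name resolves to it."""

    def __init__(self, spelling, transcription, transcription_name="Transcription"):
        self.spelling = spelling
        self._transcription = list(transcription)
        # Column name chosen when the corpus was created (e.g. 'Canonical').
        self._transcription_name = transcription_name

    @property
    def transcription(self):
        return self._transcription

    @transcription.setter
    def transcription(self, value):
        self._transcription = list(value)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        # Private names are never aliased, which also avoids infinite
        # recursion if _transcription_name has not been set yet.
        if name.startswith("_"):
            raise AttributeError(name)
        if name in (self._transcription_name, "Transcription"):
            return self._transcription
        raise AttributeError(
            f"{type(self).__name__!r} object has no attribute {name!r}"
        )


w = Word("cat", ["k", "ae", "t"], transcription_name="Canonical")
# The corpus-specific name, the default name, and the property agree.
print(w.Canonical, w.Transcription, w.transcription)

# Updating through the setter is visible under every name.
w.transcription = ["k", "a", "t"]
print(w.Canonical)
```

With a single source of truth, a call such as getattr(word, 'Transcription') and a call using the corpus-specific column name cannot drift apart. Whether aliasing the default name is desirable would depend on how pronunciation variants are stored.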

@stannam
Member Author

stannam commented Apr 19, 2021

cf. we have a separate branch for this: 'pronunciation_variants'

@stannam
Member Author

stannam commented Apr 26, 2021

The recent commit that forces the column name (92113c5) gets rid of the 'Unwanted transcription column' issue (i.e., there is no 'canonical' column to start with). However, for an independent reason, "List pronunciation variants" is acting up again 😳.
I have added an example file csv_pron_var.txt to the 'csv_sample' folder.

@YuHsiangLo YuHsiangLo added enhancement and removed bug labels May 4, 2021