I get the following error when running NER: TypeError: 'NoneType' object is not subscriptable
After debugging the error, I found that NER tries to access the document's text attribute, which is None. I'm loading the document from a CoNLL-U file created with Stanza, using the function stanza.utils.conll.CoNLL.conll2doc. It seems loaded documents don't get their text attribute set: each sentence has its own text, but the document itself does not, and Stanza reads the document text in order to create the entity spans.
Is it possible to build the document's text from the sentences? I think that would fix the problem.
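As a rough illustration of what "building the document's text from the sentences" could look like, here is a minimal sketch in plain Python. The helper name `join_sentence_texts` and the single-space separator are assumptions made for illustration; they are not part of Stanza, and the original whitespace between sentences cannot be recovered this way.

```python
from typing import List, Tuple


def join_sentence_texts(
    sentence_texts: List[str], sep: str = " "
) -> Tuple[str, List[Tuple[int, int]]]:
    """Concatenate sentence texts and record each sentence's (start, end) offsets.

    Hypothetical helper: the separator is a guess, since the original
    inter-sentence whitespace is not known once only sentence texts remain.
    """
    parts: List[str] = []
    offsets: List[Tuple[int, int]] = []
    cursor = 0
    for i, sent in enumerate(sentence_texts):
        if i > 0:
            # Insert the assumed separator between consecutive sentences.
            parts.append(sep)
            cursor += len(sep)
        start = cursor
        parts.append(sent)
        cursor += len(sent)
        offsets.append((start, cursor))
    return "".join(parts), offsets


sentences = ["Dr. Pritchett gave me a new hip.", "It works well."]
text, offsets = join_sentence_texts(sentences)
# Each recorded offset pair slices its sentence back out of the joined text.
for (start, end), sent in zip(offsets, sentences):
    assert text[start:end] == sent
```

With a loaded Stanza document, the input would presumably be `[s.text for s in loaded_doc.sentences]`, assuming each sentence's text is set as described above. Note that any character offsets already stored on the tokens may not line up with a text rebuilt like this, so the offsets would likely need to be recomputed as well.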
This is the entire stack trace:
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/zbeloki/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 71, in <module>
cli.main()
File "/home/zbeloki/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 501, in main
run()
File "/home/zbeloki/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 351, in run_file
runpy.run_path(target, run_name="__main__")
File "/home/zbeloki/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 310, in run_path
return _run_module_code(code, init_globals, run_name, pkg_name=pkg_name, script_name=fname)
File "/home/zbeloki/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 127, in _run_module_code
_run_code(code, mod_globals, init_globals, mod_name, mod_spec, pkg_name, script_name)
File "/home/zbeloki/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 118, in _run_code
exec(code, run_globals)
File "stanza/prepare_eval_data.py", line 59, in <module>
main(args)
File "stanza/prepare_eval_data.py", line 30, in main
doc = nlp(doc_tokenized)
File "/home/zbeloki/workspace/nlp_processors_evaluation/venv/lib/python3.10/site-packages/stanza/pipeline/core.py", line 480, in __call__
return self.process(doc, processors)
File "/home/zbeloki/workspace/nlp_processors_evaluation/venv/lib/python3.10/site-packages/stanza/pipeline/core.py", line 431, in process
doc = process(doc)
File "/home/zbeloki/workspace/nlp_processors_evaluation/venv/lib/python3.10/site-packages/stanza/pipeline/ner_processor.py", line 123, in process
total = len(batch.doc.build_ents())
File "/home/zbeloki/workspace/nlp_processors_evaluation/venv/lib/python3.10/site-packages/stanza/models/common/doc.py", line 433, in build_ents
s_ents = s.build_ents()
File "/home/zbeloki/workspace/nlp_processors_evaluation/venv/lib/python3.10/site-packages/stanza/models/common/doc.py", line 752, in build_ents
self.ents.append(Span(tokens=ent_tokens, type=e['type'], doc=self.doc, sent=self))
File "/home/zbeloki/workspace/nlp_processors_evaluation/venv/lib/python3.10/site-packages/stanza/models/common/doc.py", line 1601, in __init__
self.init_from_tokens(tokens, type)
File "/home/zbeloki/workspace/nlp_processors_evaluation/venv/lib/python3.10/site-packages/stanza/models/common/doc.py", line 1618, in init_from_tokens
self.text = self.doc.text[self.start_char:self.end_char]
TypeError: 'NoneType' object is not subscriptable
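The last frame fails because it slices `self.doc.text` while that attribute is None. The sketch below reproduces that pattern in isolation and shows one possible guard. `span_text` is a hypothetical helper written for illustration, not Stanza's actual code, and the fallback of joining token texts with single spaces is an assumption that only approximates the original surface string.

```python
from typing import List, Optional


def span_text(
    doc_text: Optional[str],
    start_char: int,
    end_char: int,
    token_texts: List[str],
) -> str:
    """Return the surface text of a span.

    Hypothetical helper: prefers slicing the document text, and falls back
    to joining the span's token texts when no document text is available.
    """
    if doc_text is not None:
        return doc_text[start_char:end_char]
    # Fallback (assumption): approximate the span by joining tokens with spaces.
    return " ".join(token_texts)


# Slicing None is what raises the error seen in the traceback.
try:
    None[0:13]  # type: ignore[index]
except TypeError as err:
    print(err)

# With the guard, a missing document text no longer raises.
print(span_text(None, 0, 13, ["Dr.", "Pritchett"]))
```

A guard of this kind only avoids the exception; whether the approximate span text is acceptable depends on how the entity spans are used downstream.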
Thanks for your response. NER doesn't work with documents loaded from CoNLL-U files. This snippet reproduces the error:
>>> import stanza
>>> pipe = stanza.Pipeline("en", processors="tokenize,ner")
>>> doc = pipe("Dr. Pritchett gave me a new hip")
# At this point NER is correct. But...
>>> from stanza.utils.conll import CoNLL
>>> CoNLL.write_doc2conll(doc, "output.conllu")
>>> loaded_doc = CoNLL.conll2doc("output.conllu")
>>> pipe = stanza.Pipeline("en", processors="tokenize,ner", tokenize_pretokenized=True)
>>> pipe(loaded_doc)
# TypeError: 'NoneType' object is not subscriptable