
Error running example.py #450

Open
artptz opened this issue Jun 4, 2024 · 3 comments

Comments


artptz commented Jun 4, 2024

On: 31/08/2017	 -2.04
CARD PAYMENT TO SHELL TOTHILL,2.04 GBP, RATE 1.00/GBP ON 29-08-2013
My guess is: 
> 6
Traceback (most recent call last):
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/decorators.py", line 35, in decorated
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/tokenizers.py", line 59, in tokenize
    return nltk.tokenize.sent_tokenize(text)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/nltk/tokenize/__init__.py", line 106, in sent_tokenize
    tokenizer = load(f"tokenizers/punkt/{language}.pickle")
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/nltk/data.py", line 750, in load
    opened_resource = _open(resource_url)
                      ^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/nltk/data.py", line 876, in _open
    return find(path_, path + [""]).open()
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/nltk/data.py", line 583, in find
    raise LookupError(resource_not_found)
LookupError: 
**********************************************************************
  Resource punkt not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('punkt')
  
  For more information see: https://www.nltk.org/data.html

  Attempted to load tokenizers/punkt/PY3/english.pickle

  Searched in:
    - '/Users/arturo/nltk_data'
    - '/Users/arturo/Documents/GitHub/BankClassify/.venv/nltk_data'
    - '/Users/arturo/Documents/GitHub/BankClassify/.venv/share/nltk_data'
    - '/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - ''
**********************************************************************


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/arturo/Documents/GitHub/BankClassify/example.py", line 5, in <module>
    bc.add_data("Statement_Example.txt")
  File "/Users/arturo/Documents/GitHub/BankClassify/BankClassify.py", line 58, in add_data
    self._ask_with_guess(self.new_data)
  File "/Users/arturo/Documents/GitHub/BankClassify/BankClassify.py", line 154, in _ask_with_guess
    self.classifier.update([(stripped_text, category)   ])
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/classifiers.py", line 292, in update
    self._word_set.update(_get_words_from_dataset(new_data))
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/classifiers.py", line 64, in _get_words_from_dataset
    return set(all_words)
           ^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/classifiers.py", line 63, in <genexpr>
    all_words = chain.from_iterable(tokenize(words) for words, _ in dataset)
                                    ^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/classifiers.py", line 59, in tokenize
    return word_tokenize(words, include_punc=False)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/tokenizers.py", line 76, in word_tokenize
    for sentence in sent_tokenize(text)
                    ^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/base.py", line 67, in itokenize
    return (t for t in self.tokenize(text, *args, **kwargs))
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/decorators.py", line 37, in decorated
    raise MissingCorpusError() from error
textblob.exceptions.MissingCorpusError: 
Looks like you are missing some required data for this feature.

To download the necessary data, simply run

    python -m textblob.download_corpora

or use the NLTK downloader to download the missing data: http://nltk.org/data.html
If this doesn't fix the problem, file an issue at https://github.com/sloria/TextBlob/issues.


Process finished with exit code 1

I ran

python -m textblob.download_corpora

but I still get the error above.
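One likely explanation (an assumption, not confirmed in this thread) is a version mismatch: NLTK 3.9 switched `sent_tokenize` to the `punkt_tab` resource, while older downloader scripts only fetch `punkt`, so `download_corpora` can succeed yet leave the needed resource missing. A minimal sketch that checks which of the two Punkt resources are actually installed:

```python
import nltk

# Resources NLTK's sentence tokenizer may look for, depending on version.
# Assumption: newer NLTK (>= 3.9) wants "punkt_tab", older wants "punkt".
CANDIDATES = ("tokenizers/punkt", "tokenizers/punkt_tab")


def missing_resources(candidates=CANDIDATES):
    """Return the NLTK resource paths not found in any local nltk_data dir."""
    missing = []
    for path in candidates:
        try:
            nltk.data.find(path)
        except LookupError:
            missing.append(path)
    return missing


print("missing:", missing_resources())
# For each missing path, download its last component explicitly, e.g.:
#     nltk.download("punkt_tab")
```

If `punkt_tab` turns up missing, downloading it directly (rather than relying on `textblob.download_corpora`) may resolve the error; alternatively, pinning an NLTK version that still ships `punkt` support should also work.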

yunhuiy commented Aug 14, 2024

I have the same error



elifbeyzatok00 commented Aug 27, 2024

I have the same error too:

from textblob import TextBlob

sample_text = "I love data science and machine learning. I love coding. I love data science and coding."
TextBlob(sample_text).ngrams(3)  # 3-gram

LookupError                               Traceback (most recent call last)
File c:\Users\tokel\anaconda3\Lib\site-packages\textblob\decorators.py:35, in requires_nltk_corpus.<locals>.decorated(*args, **kwargs)
     34 try:
---> 35     return func(*args, **kwargs)
     36 except LookupError as error:

File c:\Users\tokel\anaconda3\Lib\site-packages\textblob\tokenizers.py:59, in SentenceTokenizer.tokenize(self, text)
     58 """Return a list of sentences."""
---> 59 return nltk.tokenize.sent_tokenize(text)

File c:\Users\tokel\anaconda3\Lib\site-packages\nltk\tokenize\__init__.py:119, in sent_tokenize(text, language)
    110 """
    111 Return a sentence-tokenized copy of *text*,
    112 using NLTK's recommended sentence tokenizer
   (...)
    117 :param language: the model name in the Punkt corpus
    118 """
--> 119 tokenizer = _get_punkt_tokenizer(language)
    120 return tokenizer.tokenize(text)

File c:\Users\tokel\anaconda3\Lib\site-packages\nltk\tokenize\__init__.py:105, in _get_punkt_tokenizer(language)
     98 """
     99 A constructor for the PunktTokenizer that utilizes
    100 a lru cache for performance.
...
    python -m textblob.download_corpora

or use the NLTK downloader to download the missing data: http://nltk.org/data.html
If this doesn't fix the problem, file an issue at https://github.com/sloria/TextBlob/issues.
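For what it's worth, the failing call itself is simple to reproduce without the NLTK corpora: `ngrams(3)` just slides a 3-word window over the tokenized text. A rough sketch using naive whitespace splitting (an assumption for illustration only; TextBlob's real tokenizer is what needs the missing Punkt data):

```python
def ngrams(text, n=3):
    """Consecutive n-word windows over text, using naive whitespace
    tokenization instead of TextBlob's NLTK-backed tokenizer."""
    words = text.split()
    return [words[i:i + n] for i in range(len(words) - n + 1)]


sample = "I love data science and machine learning."
print(ngrams(sample))  # 3-word windows over the sample sentence
```

This doesn't fix the missing-corpus error, but it confirms the failure is in tokenization (which requires the Punkt data), not in the n-gram logic.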
