[4.0] Prevent finding no stemmer in finder when language is "*" #31795
Conversation
@Hackwar could you please check here and give your feedback?
There are loads of languages which have no stemmer, thus we can't assume a stemmer is present. If the language is |
To extend a bit: stemming is a process that takes processing power, and I can see several situations where you don't want to have a stemmer at all.
It was not the intention to prevent the exception but to find a stemmer if possible. But I have no strong feelings about whether this improvement should be merged or not.
@bembelimen Could you change this PR so it follows @Hackwar's suggestion? To me it seems to be the better way, too.
That would mean that a default install will never use the stemmer even though the stemmer is available for English, wouldn't it?
Hmm, that's true ... not what's desired.
That is intended behavior. The stemmer is resource intensive and I want people to make a conscious decision to enable this, also to select the right one/configure the search in its entirety for their system instead of simply enabling it and not looking into it further.
And where do they do that?
The option is in the component configuration.
That is why that isn't visible to the user. I did not decide that |
I asked because I don't see anything regarding a stemmer in the component configuration for Joomla 4. I can see it in Joomla 3 but not 4.
It is the "Default Language" option.
Oh, that is so obvious. Not. How on earth is anyone to know that selecting a default language causes (as you claim) a performance hit? That list shows the languages on the site. It has nothing to do with whether those languages have a stemmer available.
And as the J3 field no longer exists, you're breaking functionality on upgrade without any documentation.
Can we fix this with better text in the default language select? Maybe None -> None (Stemmer disabled)
From what @Hackwar has said, no, because the default site language might not have a stemmer available.
Because it's not only the stemmer. This setting selects which language to use for indexing the content. That selection influences both how the text is tokenised and which stemmer is used. For example, Chinese texts don't have spaces between words, so we can't simply do a |
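To illustrate the tokenisation point, here is a minimal sketch in Python. It is not Joomla's PHP code: the `tokenise` function, the `zh` prefix check, and the one-token-per-character fallback are all assumptions made for illustration, and real Chinese segmentation is considerably more involved.

```python
import re


def tokenise(text: str, language: str) -> list[str]:
    """Split text into tokens; the strategy depends on the language.

    Hypothetical helper for illustration only, not part of Smart Search.
    """
    if language.lower().startswith("zh"):
        # Chinese has no spaces between words, so splitting on whitespace
        # would yield one giant token. As a naive stand-in, emit one token
        # per CJK ideograph.
        return [ch for ch in text if "\u4e00" <= ch <= "\u9fff"]
    # Space-delimited languages: lower-case, then split on runs of
    # non-word characters and drop empty strings.
    return [t for t in re.split(r"\W+", text.lower()) if t]
```

Calling `tokenise("Smart Search indexes content.", "en-GB")` splits on spaces and punctuation, while a `zh-CN` text is split per character, which shows why the indexing language has to be known before any stemmer is even considered.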
WOW - it's no wonder we get nowhere when you keep changing your mind on what information you will share about your code. This is the first time in this entire thread that you have said it was about something more than a stemmer. Every comment you have made here has only referred to this as being a stemmer. @bembelimen obviously thought it was just about a stemmer, as he wrote in the PR. You didn't correct him then. I thought it was about a stemmer and you didn't correct me then. Only now, three months later, you wake up and start telling everyone it's more than a stemmer. As I said before, and now even more reinforced by your latest revelation, this is all an undocumented backwards compatibility break that is being hidden by you from everyone.
Since this feature was broken in J3, there is no b/c to be broken. Besides, our b/c claims don't actually extend to components, and a major version does allow us to break b/c. I've described the functionality in the original PR here: #20391 I did not go into an elaborate rant about how the complete indexing system of Smart Search works, because I tend to not write 6-page-long essays to shift blame or try to make others feel stupid. I was asked to give feedback to this PR and concentrated on the context of this PR, which was the code to generate a stemmer object. All the relevant code in the |
Breaking b/c is fine; it just has to be documented.
So the missing piece is that we need to document this somewhere. Can @Hackwar or @bembelimen do that, please? Thanks.
I don't understand why you have removed the release blocker. It's not resolved.
This pull request has been automatically rebased to 4.2-dev.
This pull request has been automatically converted to the PSR-12 coding standard.
Closing because this change is not a solution for the initial problem. Reopened issue #30372
Pull Request for Issue #30372.
Summary of Changes
Try to find the correct language when the default star ("*") is given in the finder stemmer.
Testing Instructions
See: #30372
Actual result BEFORE applying this Pull Request
Current language is not found when the language is "*".
Expected result AFTER applying this Pull Request
Current language found.