Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dutch language #173

Open
wants to merge 18 commits into
base: 2.1.x
Choose a base branch
from
Open

dutch language #173

wants to merge 18 commits into from

Conversation

noud
Copy link

@noud noud commented Sep 6, 2020

hi,
can we have this?
it is the basic of Dutch language inflector.
thanks,
Noud

@noud
Copy link
Author

noud commented Sep 6, 2020

oyg, it has "Review required" that i am not used to ....

@noud
Copy link
Author

noud commented Sep 6, 2020

ow and Dutch is very irregular so more additions will follow.

@greg0ire
Copy link
Member

greg0ire commented Sep 6, 2020

oyg, it has "Review required" that i am not used to ....

It needs we in the team need to review the PR. I will do so when you manage to make the test suite pass. I just contributed #174 , that will help you figure out what is left to do on your next push or if you close and reopen this PR.

@greg0ire
Copy link
Member

greg0ire commented Sep 6, 2020

You can run vendor/bin/phpcbf, it will take care of most issues.

@noud
Copy link
Author

noud commented Sep 6, 2020

all checks passed and i have now that phpcbf running local as well.
thanks.

Copy link
Member

@greg0ire greg0ire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codecov is right, please add some tests. There are examples in tests/Doctrine/Tests/Inflector/Rules/{English,French}

@noud
Copy link
Author

noud commented Sep 6, 2020

ok will do was already scared about tests ....

@noud
Copy link
Author

noud commented Sep 6, 2020

okay,

so i partly solved a problem
given i have this during actual use:

$ php artisan api-platform:generate
"Adres"
"Afbeelding"
"Link"
"Locatie"
"Politiebureau"
"PolitiebureausLocatie"
"Translation"
"Twitter"
"Wijkagent"
"WijkagentenLinks"
"WijkagentenLocatie"
"WijkagentenTranslations"

(this is from https://github.com/noud/politie-open-data-api)

see the correct "Adres" there
and the failing one in test.

Noud

*/
public static function getPlural() : iterable
{
// allready in plural
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's spelt "already", and I don't understand what this comment is about…

Copy link
Author

@noud noud Sep 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

corrected.
and i also have a non-understanding w/ the comment.

originally it came from github.com/noud/cakephp-dutch/blob/master/Config/inflections.php
also used in github.com/enflow/component-inflector/blob/master/src/Language/Dutch.php

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove it then?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may it be let in, as a remider?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a reminder of what? We have no idea what it means…

@greg0ire
Copy link
Member

greg0ire commented Sep 6, 2020

Please kindly squash your commits together. If you don't, we'll try to remember to do it for you but it's best if you save us this trouble.

How to do that?

  1. git rebase -i origin/master, assuming origin is a git remote that points to this repository, and not your fork. If you're not sure what your remotes are, run git remote -vvv, there should be your fork and the holy/reference/base/origin/whatever-you-call-it repository.
  2. A window will show up with many lines, replace pick with fixup on every line but the first one
  3. Close your editor, git should do its magic, and you should end up with one commit
  4. Use git push --force to overwrite what you already push. Don't forget the --force option otherwise git will try to merge both things together.

@noud
Copy link
Author

noud commented Sep 6, 2020

oyg, i will squash my commits but not immediate .... and if i manage to do that squashing ....

@noud
Copy link
Author

noud commented Sep 7, 2020

i managed to squash.

@noud noud changed the title dutch dutch language Sep 7, 2020
@greg0ire greg0ire closed this Oct 27, 2020
@greg0ire greg0ire reopened this Oct 27, 2020
@greg0ire greg0ire closed this Oct 28, 2020
@greg0ire greg0ire reopened this Oct 28, 2020
@greg0ire greg0ire changed the base branch from master to 2.1.x October 28, 2020 15:04
@greg0ire greg0ire force-pushed the master branch 2 times, most recently from a4cf50b to eea1cb9 Compare October 28, 2020 15:23
@noud
Copy link
Author

noud commented Oct 28, 2020

Users don't need to know for how long we supported a language in this lib, do they?

correct

@noud
Copy link
Author

noud commented Oct 28, 2020

no tweet

@noud
Copy link
Author

noud commented Oct 28, 2020

might w/ merge

public function dataSampleWords(): array
{
return [
['schip', 'schepen'],
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need some extra tests here.

Suggested change
['schip', 'schepen'],
['schip', 'schepen'],
['idee', 'ideeën'],

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi you 2,

did swap my incorrectness and did add note for possible rule.

tests running again now.

Copy link

@TimoBakx TimoBakx Dec 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello! 👋

Please also add:

  • ['meer', 'meren'],
  • ['baas', 'bazen'],
  • ['oog', 'ogen'],
  • ['as', 'assen'],

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added but outcommented for now.

@noud
Copy link
Author

noud commented Dec 1, 2020

@greg0ire question:

PHP CodeSniffer Config installed_paths set to ../../slevomat/coding-standard,../../doctrine/coding-standard/lib
Error: The operation was canceled.

and

Generating code coverage report in Clover XML format ... done [00:00.009]
Error: Process completed with exit code 1.

???

this previous worked say half an hour back.
is your doing?


yield new Substitution(new Word('idee'), new Word('ideeën'));

// @todo: above 3 examples maybe could be compacted into a rule
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This indeed needs to be a rule instead of specific exceptions for these words, for example wee, zee, twee, drie, theorie etc.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for now i see 2 rules:

  • wee, zee, twee
  • drie, theorie
    but did not make it into rules yet, given i do not know yet if that does conflict other words.

// http://nl.wikipedia.org/wiki/Meervoud_(Nederlands)#Klinkerverandering
yield new Substitution(new Word('lid'), new Word('leden'));

yield new Substitution(new Word('smid'), new Word('smeden'));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and the word above should be a rule. Also includes words like ooglid (and other samples of ~lid)

@TimoBakx
Copy link

TimoBakx commented Dec 1, 2020

Hello.

One more set to test for: ['kies', 'kiezen'],.

I'll check if there are more things to add tomorrow, if that's okay.

Thank you for your work on this!

@noud
Copy link
Author

noud commented Dec 1, 2020

I'll check if there are more things to add tomorrow, if that's okay.

and i maybe have to make 1 or 3 rules extra now tomorrow.

now: tomorrow is today and skips to the day after today, pardon me.

@noud
Copy link
Author

noud commented Dec 1, 2020

@greg0ire question, remark:

now we also have this:
PHP Fatal error: Uncaught Error: Class 'Doctrine\Tests\Inflector\Rules\LanguageFunctionalTest' not found in /home/noud/workspaces/laravel-workspace/inflector/tests/Doctrine/Tests/Inflector/Rules/English/EnglishFunctionalTest.php:14

??

@greg0ire
Copy link
Member

greg0ire commented Dec 2, 2020

I don't have that error locally on 7f6a1b435cdcd14cf1e24dabfb2951c69e04be2c, nor does the CI…

@noud
Copy link
Author

noud commented Dec 2, 2020

I don't have that error locally on 7f6a1b435cdcd14cf1e24dabfb2951c69e04be2c, nor does the CI…

@greg0ire can this be related to:

PHP CodeSniffer Config installed_paths set to ../../slevomat/coding-standard,../../doctrine/coding-standard/lib
Error: The operation was canceled.

and

Generating code coverage report in Clover XML format ... done [00:00.009]
Error: Process completed with exit code 1.

???

this previous worked say half an hour back.
is your doing?

i'll look at 7f6a1b4 as well ....

@greg0ire
Copy link
Member

greg0ire commented Dec 2, 2020

Yes maybe, try composer install?

@noud
Copy link
Author

noud commented Dec 2, 2020

Yes maybe, try composer install?

the github web checks we are talking about.

i just did rm -r vendor && composer install
nothing changed so nothing to commit.

i will add one bogus char or line now and we see....

okay, runs as expected now failing over some later added words to test.

@noud
Copy link
Author

noud commented Dec 2, 2020

@SenseException

I can't tell anything about the language, but under this premise we should still proceed with this PR even if we can't find someone who can review the parts of the dutch language.

Please also check the failing coding style build.

maybe my previous reply got lost, error on me.
this reply: i'd like to keep the comments in as pointers for further work.
is okay?

@greg0ire
Copy link
Member

greg0ire commented Dec 4, 2020

i'd like to keep the comments in as pointers for further work. is okay?

Given the large number of comments, it seems that there will be a significant amount of work to finish this, right?

Maybe this sounds a bit lame and half work done?

Not going to lie, it does 😬 Why should people use this in production if it only has a 50/50 chance of doing the job properly? Won't this generate a lot of support?

i did read somewhere someone in the known to doctrine gave up on German his own language

Can you find this again? I think it wouldn't make sense to accept a PR on Dutch if German was rejected in a similar case, would it?

@TimoBakx
Copy link

TimoBakx commented Dec 4, 2020

There is a lot of work still ahead. I think it would be best to implement more rules before merging this. I would love to help out if you want. I have little knowledge of the syntax for the rules, perhaps we could work on this together and improve it for at least most of the rules and hopefully quite a few exceptions.

As it is right now, with some rules for quite common words missing, I don't think it's a good idea to accept/merge this.

@noud
Copy link
Author

noud commented Dec 4, 2020

Not going to lie, it does Why should people use this in production if it only has a 50/50 chance of doing the job properly? Won't this generate a lot of support?

support.
For one i think most dutch projects are done in English.
Second i welcome any issues to pick them up. (mind is as we all know free and spare time work)
And, as i experience myself, i have over 50% good results and for sure in dev environment are already a little happy w/ that.
(furthermore i think if anyone keeps the stanza it should be 90% i guess it will never be done we will near never see just one or two PRs on such somewhat complex languages)

I did the same inflections (a bit more) some years ago fot CakePHP and i did read remarks commercial company
we are inspired by that cake work (them copy pasted in full) so there the least is a commercial web build company that is (somewhat) happy w/ it.

Now if we have a uni level linguistic organization around that is well known to Dutch it would not be a problem.

i did read somewhere someone in the known to doctrine gave up on German his own language

Can you find this again? I think it wouldn't make sense to accept a PR on Dutch if German was rejected in a similar case, would it?

It was not rejected, Did not come so far the person coding gave up.
i think this is what i did read (mind German is even worse then Dutch)
#99
"dereuromark commented on Sep 28, 2018": "I once tried as well, and after 500 lines of exceptions and still no where close to having a reliable package " (my idea is Dutch is a bit simpler)

@noud
Copy link
Author

noud commented Dec 4, 2020

There is a lot of work still ahead. I think it would be best to implement more rules before merging this. I would love to help out if you want. I have little knowledge of the syntax for the rules, perhaps we could work on this together and improve it for at least most of the rules and hopefully quite a few exceptions.

As it is right now, with some rules for quite common words missing, I don't think it's a good idea to accept/merge this.

O i welcome this Timo, at first i did think yeah you give tests but do not think rules but you do well seem to have the interest, good and nice. So yes help and working together for sure appreciated and accepted. Furthermore i noticed your name might tell about you having a dutch connection like me as well? (mind it is spare and free time work for me)

We have to find and understand the linguistic rules. Maybe see what linguistic organizations are able to help us, think organizations that do dutch text to speech, it is a bit in there lane of work.Some years ago Symfony had a web inflector, see where that thing went, use it as low level step-in for non coding people, maybe also have some web interface to the rule sets and exceptions, so non coding people can have give there input as well. (just a few thoughts)

@noud
Copy link
Author

noud commented Dec 4, 2020 via email

@alcaeus
Copy link
Member

alcaeus commented Dec 9, 2020

Hello @noud,

apologies if me removing myself from the reviewer list caused confusion. I don't have anything to contribute to the review: the technical side is very well covered by @greg0ire and @SenseException, and without any basic knowledge in the Dutch language I am more than unqualified to review language changes.

That said, I welcome the effort you and @TimoBakx are making to add support for Dutch language inflection. I had previously looked at adding support for German, but quickly gave up as many inflections depend on context which this library does not provide. Your plan to take word lists, throw them at the inflector and work to raise the score seems like a sensible approach. When it comes to context sensitive inflections, it may be worth exploring other avenues of solving this.

To bring an example that previously caused breakage, take the word "weather": there are two pluralisations for this word, "weather" and "weathers", and the appropriate one depends on the context. For example, it's correct to say "we're going out in all kinds of weather" (with "weather" being plural here), but also "we've encountered many different types of weathers on our trip". It's impossible for this library to know the context, so instead of thinking how this library can guess this context we should rather allow the user to specify context and add custom inflection rules for their own contexts. So please, don't try to solve every possible use-case, especially the ones this library currently doesn't support. Instead, aim for providing a good baseline and rely on users to deal with strange edge cases themselves.

@noud
Copy link
Author

noud commented Mar 23, 2021

to have better insight the dutch nouns i started
https://github.com/noud/nouns-db
and
https://github.com/noud/nouns-laravel

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants