Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add feature: Ignore lists #62

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

push0ret
Copy link

@push0ret push0ret commented Nov 30, 2024

Hi!

Thank you for all your work with mpd/mpdscribble. It's a seriously great project!

I'm hoping to contribute with this little feature I built for my own use.
It simply allows you to specify an ignore list per scrobbler in which you can specify tracks for the scrobbler to ignore.

I decided to make it per scrobbler, as for my personal use case, I have a file scrobbler that does not have excludes while last.fm has some excludes.

The PR includes a manpage update describing the file format in detail.
I'm adding the rendered excerpt here for your convenience.

IGNORE FILE FORMAT
       Tracks can be ignored by listing them in an ignore file.  Each line  in
       the  file  specifies a pattern to match tracks you wish to exclude from
       scrobbling.  The format is simple and flexible, allowing you  to  match
       by artist, album, title, track number or a combination of these fields.

   File Format
       A  tag match is specified as tagname="value", where tagname is replaced
       with one of the supported tags (artist, album, title,  track).   Values
       must be quoted and spaces are not allowed surrounding the equal sign.

       Each line consists of one or more tag matches separated by spaces:
           tag1="value1" tag2="value2" ...

       Tags can appear in any order and blank lines are ignored.

       A  backslash  (\) is interpreted as escape character and may be used to
       escape literal double quotes within a value.  To write a literal  back-
       slash, use a double backslash (\\).

       Each  line  is limited to 4096 characters including the newline charac-
       ter.  Superflous characters are silently ignored.


   Examples
       If a tag is omitted, any value for that field is matched:

       title="Bohemian Rhapsody"
              Matches any track titled Bohemian Rhapsody.

       artist="Queen"
              Matches any track by Queen, regardless of album or title.

       artist="Queen" album="A Night at the Opera"
              Matches any track by Queen from A Night at the Opera, regardless
              of title.

       artist="Queen" album="A Night at the Opera" title="Bohemian Rhapsody"
              Matches a specific track by Queen, from A Night  at  the  Opera,
              titled Bohemian Rhapsody.

       artist="Queen" album="A Night at the Opera" track="01"
              Matches  the  first  track  on the album A Night at the Opera by
              Queen.  Note that track tags are interpreted  as  text  and  not
              numbers, meaning "01" is not the same as "1".

       artist="Clark \"Plazmataz\" Powell"
              Matches tracks by Clark "Plazmataz" Powell.

Kind regards,
Fabian

EDIT: updated rendered man page excerpt to reflect new file format

src/util/Compiler.h Outdated Show resolved Hide resolved
@@ -126,6 +126,58 @@ Your Last.fm password, either cleartext or its MD5 sum.
The file where mpdscribble should store its journal in case you do not
have a connection to the scrobbler. This option used to be called
"cache". It is optional.
.TP
.B ignore = FILE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need an indirection with another file? This complicates the thing both for the code and for users.
Also this file format looks rather arbitrary and cannot be extended or changed later. One has to remember what those columns are.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used a separate file for the following reasons:

  • Avoid complicating the config file parser itself.
  • Managing the lists is easier when they are separate files. Users may want to interact with these lists through other utilities, for example a shell script that gets the current song playing from mpd and writes the details into a list.
  • Makes sharing the lists across multiple scrobblers simply work.

I came up with the file format to make the parser implementation trivial / easily verifiable. I wanted to avoid having weird audio metadata end up breaking the parser. I considered two options: Very simple easy to parse format and write a small parser myself, or use a standardized machine readable format and implement it using a library. I'm happy to discuss alternatives.
The format also allows easy construction of list entries from audio metadata. In most cases, a simple mpc current -f "%artist%;%album%;%title%" >> ignorelist should just work. For the rare cases the metadata itself contains a semicolon, the escaping rules are simple to apply via regex substitution.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Max

I understand your concern about the inflexibility of the file format and to address this I would propose a "ignore_file_tags" (naming is not final) option that is global for all scrobblers.
This option would be set to a space delimited string of tag names, which will then correspond to the columns of the ignore file. The current behavior would be represented by the option ignore_file_tags = artist album title.

At the moment, the Record structure only exposes artist, album, title and track tags, so the flexibility is quite limited. However, making the format configurable allows us to add more tags in the future without breaking users existing configurations while also allowing users to only match the tags they care about in their ignore file, avoiding useless extra columns.

Please let me know if you have concerns with this approach. I will implement this after your green light.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This idea sounds like it will grow into a complexity monster. And you're going to split the specification between the configuration file and the ignore file - the ignore file will no longer be self-contained, it depends on external data (the tag list specified in the configuration file).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your feedback.
I think based on this we could opt for one of the following options:

  • Make the first line of the ignore file a mandatory format specification. E.g. artist;album;title as first line would set the format to what is hard coded now. I would avoid an optional format specification line as this would then require some marker to distinguish the specification line from an ignore line. Such a marker in turn requires an escaping mechanism to avoid clashing with metadata.
  • Drop the configurable columns altogether. New tags can always be supported with additional columns at the end. This would not break existing configurations, as trailing semicolons are already not required. The current parser treats Queen, Queen; and Queen;; identically. Downside is potentially worse user experience if at some point many more tags are supported. Users may have to deal with an annoying number of empty columns in their ignore files.
  • Scratch the custom format and parser, move to something standardized. I do like the minimal format we have at the moment, I think it's easy to deal with and doesn't pull in any dependencies. But if you prefer to use a standardized format, I can implement that (but will require a new dependency).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not do something like:

ignore artist="Queen" album="A Night at the Opera"
ignore artist="Queen" title="Bohemian Rhapsody"

... right in the main configuration file. Simpler and super flexible.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the dedicated file is useful for:

  • Automated management by tools/scripts (see my first comment).
  • Making it easy to use an ignore list on a subset of scrobblers, without duplicating the lists themselves.

The reason I did not use the format you proposed in the ignore file is simply parser complexity, having to handle quoting properly. But that is obviously not a blocker.
Considering the aforementioned benefits of the dedicated file, would it be an acceptable compromise to keep the dedicated ignore file, but use a tag="value" format as you suggested?

With the cost of additional complexity, we could also support both methods (in-place in the main config and dedicated ignore files). I'm not positive that that's worth it, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quoting is difficult, of course - but your CSV-like syntax doesn't solve this - look at how you had to implement backslash-quoting. No difference here.

I think it's okay to have this external file. I don't quite like the idea of having two files, but it's a situation where all possible solutions are bad, and maybe you're right and it's the least-bad one.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree. I will implement the tag="value" format for the ignore files then.
Thanks for taking the time to discuss this, it is appreciated.

src/Config.hxx Outdated Show resolved Hide resolved
src/IgnoreList.cxx Outdated Show resolved Hide resolved
src/IgnoreList.cxx Outdated Show resolved Hide resolved
src/IgnoreList.hxx Outdated Show resolved Hide resolved
src/ReadConfig.cxx Outdated Show resolved Hide resolved
src/ReadConfig.cxx Outdated Show resolved Hide resolved
src/ReadConfig.cxx Outdated Show resolved Hide resolved
src/ReadConfig.cxx Outdated Show resolved Hide resolved
@MaxKellermann
Copy link
Member

Your way of splitting the PR into commits doesn't make sense. If you have a known-bad commit, don't add another commit to fix it; instead, amend the bad commit.

About UNREACHABLE: let's just not rename it. Yes, the "gcc_" prefix isn't exactly correct, but making all UPPERCASE is a different kind of ugliness, but doesn't improve anything. This whole macro is obsolete anyway - once we have C++23, we can simply do std::unreachable() and remove the macro completely.

@push0ret
Copy link
Author

push0ret commented Dec 5, 2024

Your way of splitting the PR into commits doesn't make sense. If you have a known-bad commit, don't add another commit to fix it; instead, amend the bad commit.

Okay, then I will for the most part squash this down with the next patch.

About UNREACHABLE: let's just not rename it.

Okay, I'll revert to the original name.

@push0ret push0ret force-pushed the feature_ignore_lists branch from 4469242 to 63b6172 Compare December 16, 2024 15:05
@push0ret push0ret force-pushed the feature_ignore_lists branch from 63b6172 to 0065020 Compare December 16, 2024 15:12
@push0ret
Copy link
Author

Hi Max

I updated the file format as discussed previously and also implemented the other requested changes.
The changes are squashed into the original two commits.

@MaxKellermann
Copy link
Member

../../src/ReadConfig.cxx: In function ‘IgnoreList* load_ignore_list(const std::string&, Config::IgnoreListMap&)’:
../../src/ReadConfig.cxx:322:14: error: ‘file_exists’ was not declared in this scope; did you mean ‘std::errc::file_exists’?
  322 |         if (!file_exists(path.c_str())) {
      |              ^~~~~~~~~~~
      |              std::errc::file_exists

Windows build fails. And I wonder why this check exists at all - what is the point?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants