Rewrite ETIP as a GitHub repository #133
Comments
What about the existing database of trackers in ETIP? |
Not to create a PR, but indeed we would need to create a script to migrate our existing trackers to the new format. That goes into this point:
|
I think it's a great idea and I can help write scripts and set up CI. I recommend
using YAML for this: it is basically JSON that is meant to be human edited
(indeed all valid JSON is valid YAML). YAML is also widely understood since it
is used in GitHub Actions, FUNDING.yml, .gitlab-ci.yml, .travis.yml, F-Droid's
metadata .yml format, and many more.
I think it should probably use [StrictYAML](https://hitchdev.com/strictyaml/) to
make human editing easier. For example, `version: 1.1` and `version: 1.1.0`
would always be parsed as the strings `"1.1"` and `"1.1.0"`, while in plain
YAML they would be a float `1.1` and a string `"1.1.0"`.
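To make the `version` example concrete, here is a toy sketch in Python using only the standard library. It is not a YAML parser: `plain_yaml_scalar` and `strict_scalar` are hypothetical helpers that only imitate, in simplified form, the implicit number typing plain YAML applies to unquoted scalars and the "everything stays a string" behaviour described for StrictYAML.

```python
import re

# Hypothetical helpers for illustration only; neither is part of any YAML library.
_INT = re.compile(r"[-+]?[0-9]+")
_FLOAT = re.compile(r"[-+]?[0-9]*\.[0-9]+")  # simplified: real YAML accepts more float forms


def plain_yaml_scalar(text):
    """Imitate plain YAML's implicit typing of an unquoted scalar."""
    if _INT.fullmatch(text):
        return int(text)
    if _FLOAT.fullmatch(text):
        return float(text)
    return text  # anything else stays a string


def strict_scalar(text):
    """Imitate the strict behaviour: no implicit typing, the scalar stays a string."""
    return text


for raw in ("1.1", "1.1.0", "1.10"):
    print(repr(raw), "->", repr(plain_yaml_scalar(raw)), "vs", repr(strict_scalar(raw)))
```

The `1.10` case is the one that tends to bite human editors: with implicit typing it collapses to the same float as `1.1`, while the strict variant keeps the text exactly as written.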
|
This sounds like a very good idea! Without much knowledge about the existing setup, I'd guess this change would make that data a lot easier for machines to read and thus enable other projects (like F-Droid) to use it as well.
If you use YAML or JSON you can use a JSON schema for validation.
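As a rough illustration of what such validation could catch, here is a minimal sketch in Python. A real setup would more likely feed a JSON Schema to a validator library; this hand-rolled version only uses the standard library. The field names (`name`, `website`, `code_signature`, `network_signature`) and the `validate_tracker` helper are assumptions made for the example, not a format anyone in this thread has agreed on.

```python
import re

# Assumed, illustrative rules: these field names are not an agreed tracker format.
REQUIRED_FIELDS = {"name": str, "website": str, "code_signature": str}
REGEX_FIELDS = ("code_signature", "network_signature")


def validate_tracker(entry):
    """Return a list of human-readable problems; an empty list means the entry passed."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        value = entry.get(field)
        if not isinstance(value, expected_type) or not value.strip():
            problems.append(f"{field}: required, must be a non-empty string")
    for field in REGEX_FIELDS:
        pattern = entry.get(field)
        if isinstance(pattern, str) and pattern:
            try:
                re.compile(pattern)  # only checks that the signature is a valid regex
            except re.error as exc:
                problems.append(f"{field}: invalid regex ({exc})")
    return problems


good = {"name": "Example SDK", "website": "https://example.com",
        "code_signature": r"com\.example\.sdk"}
bad = {"name": "", "code_signature": "com.example.(sdk"}

print(validate_tracker(good))
print(validate_tracker(bad))
```

A check like this could run in CI on every submitted file, so that blank fields and broken signatures are flagged before a reviewer has to look.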
It'd be great to see you on federated Gitea when it's there. Speaking of submission: https://www.datenanfragen.de/ / https://www.datarequests.org does provide a web form [0] that will create a PR with a JSON file (e.g. [1]) at GitHub for example. [0] - German : https://www.datenanfragen.de/suggest/#!type=new&for=cdb |
I agree. A change at this point should serve multiple purposes: it should be easier to administer as well as easier to consume into Exodus. I don't know what bottleneck caused the 200 tracker signatures to pile up, but if it is a delay in moving them from ETIP into the machine-readable formats of Exodus itself, tracker signature finders can certainly step up and write things in a format that is more consumable and needs less hands-on work. Or if we need to set up test environments and see how test APKs handle the signatures, I'm fine with that. I just want to get the current backlog whittled down and let that process dictate how a new system could introduce improvement. My only concern with GitHub is that another bottleneck gets introduced because pull requests get sat on and people get caught up in a back-and-forth discussion about a tracker signature instead of having someone with the interest to implement new tracker definitions. I don't think the implementer should immediately add every tracker as it is submitted. Rather, wait for 20 or so to pile up and then implement several at a time in one operation. If the problem is that submitters leave fields blank or the regex isn't correct, then we need required fields and a note that the value needs to be in a particular format for the implementers to integrate it. Just a note by the field. It doesn't need to be some fancy syntax checker. |
Thanks for your input, @jawz101! I share your concern about reducing the current backlog. I just added a couple of (very) minor changes to ETIP to ease the review, and we had a meeting this week within the organization to try to put more (volunteer) people on this task. Actually, moving the trackers from ETIP to exodus is probably the only thing that works really well (and it's automated, so it requires very little human time). What I miss the most in the current version of ETIP is:
What happens for most trackers currently in the backlog:
I would say that case 3 is the most common, then 2, then 1. I'm thinking that moving to a code repository would ease the discussion between submitters and reviewers, and help us avoid a huge backlog like the current one. But I could be wrong, and this won't solve every issue of ours. And yes, we probably need to tackle the backlog before moving to a new system. |
Perhaps something that indicates "needs more information" if there is a question about whether it is indeed a tracker or not. I still like seeing that a signature is in there even if it does not fit the definition of a tracker, because it would likely come up again. I mainly look for publicly accessible technical documentation, which tells me that the library must have been in some application somewhere at some point. Since there is not a convenient way to upload unknown APKs directly from the phone, a cumbersome part of the submission process is having to go to the Exodus site with a package name in mind and upload it. And with the library only representing 80,000 or so apps, that leaves a large chunk unchecked. But yeah, the ETIP website did seem like a lot of effort to invest rather than using something like GitHub. Though if it functions on the backend with a database, that has its own conveniences. |
I have some time to work on this, so I started sketching it out. Here is the first stab at a YAML conversion. It definitely needs work, but it is a good place to continue the conversation: @pnu-s did you have time to work on this at all? If you have code for getting the data out of the database, I'm happy to work on getting it nicely output to YAML. I've been working from the JSON from https://reports.exodus-privacy.eu.org/api/trackers |
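For anyone following along, a conversion of this kind could be sketched roughly as below. This is not the script referred to above. It assumes the exported JSON is a `trackers` mapping of id to tracker object, the sample data and field names are made up, and `slugify` and `export_trackers` are hypothetical helpers. It writes JSON rather than YAML to stay within the Python standard library.

```python
import json
import re
import tempfile
from pathlib import Path

# Assumed shape: a "trackers" mapping of id -> tracker dict. The sample data is made up.
SAMPLE = {
    "trackers": {
        "1": {"name": "Example Analytics", "code_signature": "com.example.analytics",
              "network_signature": r"example\.com", "website": "https://example.com"},
        "2": {"name": "Demo Ads SDK", "code_signature": "org.demo.ads",
              "network_signature": "", "website": "https://demo.example"},
    }
}


def slugify(name):
    """Turn a tracker name into a filesystem-friendly file stem."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def export_trackers(data, out_dir):
    """Write one file per tracker and return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for tracker_id, tracker in sorted(data["trackers"].items()):
        path = out_dir / f"{slugify(tracker['name'])}.json"
        record = {"id": tracker_id, **tracker}
        # Sorted keys and a trailing newline keep later diffs small and reviewable.
        path.write_text(json.dumps(record, indent=2, sort_keys=True) + "\n",
                        encoding="utf-8")
        written.append(path)
    return written


with tempfile.TemporaryDirectory() as tmp:
    for path in export_trackers(SAMPLE, tmp):
        print(path.name)
```

One file per tracker means a pull request that touches a single tracker shows up as a single-file diff, which is the review workflow being discussed here.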
I was just working with @Miriam-cpu / mobilsicher.de and we thought that we could standardize on a data format here that would work for:
I think we can clearly use the same core data fields and structures, and additional project-specific fields can be added as needed without conflicting with these core fields. This works well when the base data structure is a dictionary. The only notable difference I can think of between these lists would be that Exodus and F-Droid's |
@eighthave Thanks for the work you've put into this! To be honest, we have put our recent ETIP efforts into adding new features to its current form, for instance to make it more explicit why some trackers are not accepted into εxodus yet (which is our main problem at the moment). Rewriting ETIP would cost us, and I'm not entirely convinced that we would gain more than we would lose in terms of ease of use and features. That can obviously still be discussed and is not a final decision, but we decided to keep investing in ETIP's current form. That being said, we are obviously open to discussing the data format for trackers, and changes to the ETIP UI, the JSON export format or the εxodus JSON API response. |
Can you point me to the new ETIP work? I couldn't find anything. I'm still convinced that managing the ETIP/Exodus process via files and pull requests will make it easier to follow the work, and contribute to it. Millions of people are familiar with the git workflow at this point, so that alone means it is easier for people to follow. I have time to work on building this out, and we're going to do it anyway for the F-Droid.org proprietary libs list, and probably also the mobilsicher.de third-party list |
What I meant is that we added a couple of new features, such as the number of matches in exodus and the new badge for each tracker, which easily show why a tracker is not added to εxodus yet
I have mixed feelings about this, mostly because we would lose all the effort we have put into the current form of ETIP (such as the automated integration of trackers from ETIP to εxodus, which would need to be rewritten). But I obviously see some benefits (otherwise I would not have created this issue in the first place 😄)
What do you imagine here? |
Where can I see that?
If you point me to the code that does that integration, I can look and see if I can handle the porting.
I think it is possible, as long as we can find agreement on how it should be maintained. I'm talking with mobilsicher.de and @IzzySoft about how to make this happen. mobilsicher.de currently maintains their own list, and @IzzySoft's library list is here in JSON Lines format: |
I just put together some examples to start thinking about this more: I don't yet see a clear logic to how the libraries are grouped. I think ETIP groups them more or less by "product" as defined by the companies that release it. Now that I've gone through this more, I think
anti_features:
- NonFreeDep
- Tracking
Then all of the |
After sleeping on this, I think we can actually leave the grouping pretty open because it should be fine if multiple profiles match a given library every now and then. These profiles are ultimately about showing info to a human, so multiple hits for a single library should be fine. |
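A minimal sketch of what "multiple hits are fine" could look like in Python follows. The profile names and signatures are invented for illustration, and matching each signature as a regex anchored at the start of the class name is an assumption, not a description of how Exodus or F-Droid actually match.

```python
import re

# Made-up profiles for illustration; names and signatures are not real entries.
PROFILES = {
    "Example Ads": r"com\.example\.",
    "Example Ads Mediation": r"com\.example\.mediation\.",
    "Demo Analytics": r"org\.demo\.analytics\.",
}


def matching_profiles(class_name, profiles=PROFILES):
    """Return the names of every profile whose signature matches the class name."""
    return [name for name, signature in profiles.items()
            if re.match(signature, class_name)]  # re.match anchors at the start


for cls in ("com.example.mediation.Adapter", "org.demo.analytics.Tracker", "net.other.Lib"):
    print(cls, "->", matching_profiles(cls))
```

Returning a list rather than the first match keeps overlapping profiles visible to the person reading the report, instead of forcing the grouping to be unambiguous.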
You can see the first version of F-Droid rewriting its signature profiles as a git repo of YAML files now. We call it "suss": https://gitlab.com/fdroid/fdroid-suss |
Here's more on F-Droid's work on a YAML/git setup for signature profiles: |
After a lot of thinking and experiencing various pains with the current version of ETIP (as well as seeing the same pains from many contributors), I'm wondering whether we could/should totally revamp ETIP in another way.
Hear me out: what about switching to a GitHub repository with a specific file for each tracker? I'd expect Markdown or JSON format (but preferably Markdown to ease the review).
The name of the repository could be: https://github.com/Exodus-Privacy/trackers
Advantages:
Drawbacks:
This is a major change so I would love to hear your opinions @U039b @jawz101 @eighthave @blaueente @IzzySoft (feel free to tag any potentially interested person)