Virtuous Incentives / Compensation to join FLoC? #45

lbdvt · 2021-02-15T08:36:45Z

This issue stems from Issue #41, raised by @fischerbach, which I believe brings up a broader, key question: "Why would any site owner want to allow FLoC computation on its sites?"

In essence, one could compare FLoC to a data-cooperation program that everyone could benefit from, regardless of how much they contribute to it. Usually, data-coops include rules like:

- To benefit from the coop, you must contribute to the pool.
- The value you extract out of the coop must be on par with the value you bring to it.

But we don't see any of that in the current FLoC proposal.

Moreover, using only the top-level domain for FLoC computation (cf. issue #43) creates a contribution asymmetry that depends on domain size and specificity.

Big platforms like 'video.example', 'social.example', 'shoppingmall.example', even if they allow the visits they get to be included in FLoC computation, won't contribute much value (as they are widely popular, generic websites).

On the other hand, they will extract a lot of value and keep concentrating demand, while small websites like 'gadgets.net', 'moviereviews.com' or 'cookingwares.com' will contribute a great deal through their specific, qualified audiences. This makes the small sites' contextual monetization less relevant, because FLoC siphons off that audience signal and makes it available on bigger platforms.

We think that to make FLoC a useful, attractive and fair system, the solution should:

- Set as a rule that FLoC ids will be available for users on a site only if this site is allowing the visits it gets to be used in FLoC computation.
- Make the contribution independent from website architecture, by getting signals based on the content of the page, and not only its top-level domain or URL address.
- Define rules and compensation mechanisms to ensure that each participant gets a fair share of the value produced by FLoC.

What are your thoughts on this?

dmarti · 2021-02-15T14:32:51Z

The other side of FLoC reciprocity is: because FLoC may reveal membership in a legally protected or otherwise sensitive group, sites will need to check their own audience data and content in order to decide whether or not to turn FLoC on.

#33

michaelkleber · 2021-02-15T16:01:06Z

Hello folks,

For the long-term behavior of FLoC, I think "FLoC ids will be available for users on a site only if this site is allowing the visits it gets to be used in FLoC computation" (Don's proposal in #33) is the right approach: a site accesses the user's flock, and that access serves as a signal that the site wants to be included in subsequent flock calculation. There will also be an overriding opt-out mechanism, where a site can explicitly say that it does not want to be used for flock calculation; in that case, subsequent requests for the user's flock on that site will return an empty value.

That approach isn't feasible during the Origin Trial, though. The point of an OT is to offer an opportunity for a small number of early adopters to try out a new technology and give feedback on it, even before it is stable or has any chance for widespread adoption. If a first adopter gets zero benefit, then the OT is useless.

As for "getting signals based on the content of the page", I quite agree that using only the domain means flock leaves a lot of information out, and the signal would probably gain substantial value from being able to distinguish more finely the different pieces of a large site. It seems like there are two basic approaches to incorporating this information into clusters:

1. Pick some well-known ontology of topics, and give pages a way to self-declare what topics they are about. I like this in the abstract, but a few things worry me. For example, this only gets off the ground if a bunch of sites (or, probably, their ad networks) put effort into self-labeling, and there's the risk of the meaning of flock suddenly changing if one ad network changes its label assignment method.
2. Pick some well-known ontology of topics, and the browser runs some kind of ML classifier on the contents of each page. On-device content classification seems probably feasible, and Chrome is experimenting with it. But obviously this is a more researchy kind of approach, and I wouldn't say I'm confident it's going to work.

There's the possibility of using 1 and 2 together: pages declare their own topics, and the browser has a classifier that helps out for undeclared pages. If we got the benefits of both, that would be great! If we pay the costs of both, it would be depressing.
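A minimal sketch of how approaches 1 and 2 might be combined, assuming self-declared topics take precedence and a classifier is consulted only for pages that declare nothing. The function name `page_topics`, its parameters, and the classifier callback are all hypothetical; nothing here describes an actual Chrome mechanism.

```python
from typing import Callable, Iterable, Set


def page_topics(
    declared: Iterable[str],
    content: str,
    classify: Callable[[str], Iterable[str]],
) -> Set[str]:
    """Return the topics attributed to a page.

    Assumption: topics the page declares for itself win outright; the
    (hypothetical) on-device classifier is only a fallback for pages
    that declare no topics at all.
    """
    declared_set = set(declared)
    if declared_set:
        return declared_set
    # No self-declared topics: fall back to classifying the page content.
    return set(classify(content))
```

Giving declared topics strict precedence is only one possible design; a real system might instead merge or cross-check the two sources.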

Are these the sorts of approaches you had in mind, or do you have some other ideas on using the content of the page?

dmarti · 2021-02-16T15:48:05Z

During the origin trial, will FLoC only be training on the sites that have opted in to the origin trial? (Seems like that will be necessary in order to prevent the origin trial from being a window of opportunity for sites participating in the trial to collect possible protected/sensitive group membership info.)

lbdvt · 2021-02-17T14:43:01Z

Hi,

Thanks for your answer!

> That approach isn't feasible during the Origin Trial, though. The point of an OT is to offer an opportunity for a small number of early adopters to try out a new technology and give feedback on it, even before it is stable or has any chance for widespread adoption. If a first adopter gets zero benefit, then the OT is useless.

Sure, but we'll have to keep this in mind when considering OT performance results. It will positively bias OT results compared to general availability, when adoption will be lower.

> Are these the sorts of approaches you had in mind, or do you have some other ideas on using the content of the page?

These are definitely interesting ideas that would be worth testing.

These ideas, and the rule "to benefit from FLoC you have to allow your users' visits to be taken into account in FLoC computation", would probably give most publishers (content producers that get remunerated through advertising) an incentive to join FLoC: "As a publisher, I share some knowledge on the habits of my users on my site (in a private manner), but in return I get some knowledge on the habits of my users outside my site, which enables me to get more value for my own ad inventory."

I think we still need to find a way to give advertisers (e.g. retailers) an incentive to join FLoC. User buying intent on advertisers' sites is an important part of the value captured by third-party cookies, so capturing it with FLoC would be essential.

But advertisers do not care about getting their users' FLoC ids, as they do not display ads. Moreover, allowing visits on an advertiser's property to be taken into account in FLoC computation could invite nefarious use by the advertiser's competitors, enabling them to target and steal its users on the web.

I'd love to hear if anyone has ideas for addressing this concern.

dmarti · 2021-02-17T15:40:51Z

Some advertisers are likely to want to collect FLoC ids, for the common use case of "buy ads reaching people who are similar to those who already bought something from me".

Yes, this does involve a risk of leaking customer data to competitors, but there are already systems that enable data sharing across vendors, including competitors. An example is Adobe Experience Cloud Device Co-op.

lbdvt · 2021-02-17T15:55:12Z

> Some advertisers are likely to want to collect FLoC ids, for the common use case of "buy ads reaching people who are similar to those who already bought something from me".

This could be achieved by collecting FLoC ids on only a small fraction of the advertiser's visitors, without contributing to FLoC calculation for the vast majority of them.

> An example is Adobe Experience Cloud Device Co-op

This is indeed interesting, but it features rules like:

> Believes in data fairness: Equitable data sharing is an important concept in the Device Co-op. All Device Co-op members receive value relative to what they contribute. If you’ve never interacted with an anonymous person through a site visit or ad impression, you won’t get any information about their devices in the Device Graph. The Device Co-op helps brands recognize familiar consumers using unfamiliar devices.

Similarly, some advertiser data-coops include rules like "As an advertiser, I'm ok to share my visitors' data with other advertisers, provided they do not operate in the same sector as I do".

Without these kinds of rules, I fear that advertisers won't be willing to join FLoC.

michaelkleber · 2021-02-18T04:32:48Z

Hi folks,

During the Origin Trial, the default for whether a page will be used for FLoC computation will be based on Chrome's existing infrastructure which detects pages that load ads-related resources. Our thinking here is that pages detected as including ads-related resources probably fetched something with an ads-related 3p cookie attached, which means it's reasonable to guess that the page visit contributes to some ads profile today.

This certainly isn't perfect, and I'm sure it will have both false positives and false negatives. But it seems like the best way to have a FLoC Origin Trial that lets participants form a high-level opinion about how the signal might be useful, even if only a small number of ad tech companies choose to be involved at the OT stage.

Both the opt-out (via an interest-cohort permissions policy) and the opt-in (via calling the API on a page that did not opt out) will work; the ad resource detection will be the default for pages that don't do either one.
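For readers unfamiliar with the opt-out, it is delivered as a `Permissions-Policy` HTTP response header with an empty allowlist for the `interest-cohort` feature. A minimal sketch in Python, assuming response headers are held in a plain dict keyed by the exact string `Permissions-Policy`; the helper name `with_floc_opt_out` is hypothetical and not part of any library:

```python
from __future__ import annotations


def with_floc_opt_out(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of `headers` that opts the page out of FLoC computation.

    `interest-cohort=()` is an empty allowlist, which disables the feature.
    Assumption: if the existing policy already mentions interest-cohort,
    it is left untouched rather than overridden.
    """
    directive = "interest-cohort=()"
    merged = dict(headers)
    existing = merged.get("Permissions-Policy", "").strip()
    if not existing:
        merged["Permissions-Policy"] = directive
    elif "interest-cohort" not in existing:
        # Append to whatever policy directives the site already sends.
        merged["Permissions-Policy"] = f"{existing}, {directive}"
    return merged
```

A site would attach the resulting headers to every HTML response it does not want used for cohort calculation.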

Our expectation for the full launch is to default to pages not contributing, with the opt-out and opt-in both available. But please remember that the point of an Origin Trial is to learn our lessons before the details are fixed — so the final design is still subject to change based on OT feedback.

dmarti · 2021-02-26T19:32:14Z

Thank you @michaelkleber. As I understand it...

1. If a Permissions Policy for "interest-cohort" applies to the page, then follow the permissions policy.
2. Otherwise, if any script on the page calls document.interestCohort, then use the page for FLoC computation.
3. Otherwise, check if the page has any resource on it that would be blocked by EasyList.
   3a. If any resource on the page would be blocked by EasyList, use the page for training.
   3b. If no resource on the page would be blocked by EasyList, then assume the page is opted out.

I have put in a pull request including a simple example of an opt out: #47

Is that right?
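The decision order summarized in this comment can be sketched as a small function. This models only the commenter's understanding of the Origin Trial defaults, not confirmed Chrome behaviour; `PageSignals` and its field names are hypothetical, and the ads-resource check is reduced to a boolean.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageSignals:
    # None: no interest-cohort permissions policy applies to the page.
    # True / False: a policy applies and allows / denies the feature.
    policy_allows: Optional[bool]
    # Some script on the page called the interest cohort API.
    calls_interest_cohort: bool
    # The page loaded at least one resource detected as ads-related.
    loads_ads_resources: bool


def used_for_floc_computation(page: PageSignals) -> bool:
    """Apply the three steps in order: policy, API call, ads detection."""
    if page.policy_allows is not None:
        return page.policy_allows      # step 1: an explicit policy wins
    if page.calls_interest_cohort:
        return True                    # step 2: calling the API opts in
    return page.loads_ads_resources    # step 3: default from ads detection
```

For example, a page with no policy and no API call that loads an ads-related resource would be included, while the same page with a denying policy would not.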

peligio · 2021-03-08T17:24:57Z

> During the Origin Trial, the default for whether a page will be used for FLoC computation will be based on Chrome's existing infrastructure which detects pages that load ads-related resources. Our thinking here is that pages detected as including ads-related resources probably fetched something with an ads-related 3p cookie attached, which means it's reasonable to guess that the page visit contributes to some ads profile today.

Given that this method will be used to determine publisher opt-in/opt-out for the origin trial, can you confirm whether FLoCs would be 'reset' once FLoC moves into full release?

Otherwise, publishers who want to opt out should consider setting the initial FLoC opt-out mechanism before the origin trial launches. Indeed, this is something publishers should consider if they expect their default behaviour to be opt-out, so that the trial data better reflects that scenario.

michaelkleber · 2021-03-08T19:45:50Z

The cohort calculation is only based on the past one week of behavior. A publisher who chooses to opt out in the future can be sure that activity on their site from before they opted out will not have any long-term implications.
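To illustrate why older activity ages out, here is a minimal sketch of a seven-day input window. The representation of browsing history as `(domain, timestamp)` pairs, the exact cutoff handling, and the function name `domains_in_window` are assumptions for illustration, not a description of Chrome's implementation.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

# Assumption: "the past one week" is modelled as exactly seven days.
WINDOW = timedelta(days=7)


def domains_in_window(
    visits: Iterable[Tuple[str, datetime]], now: datetime
) -> List[str]:
    """Keep only the domains visited within the last seven days.

    Visits older than the cutoff are dropped, so they can no longer
    influence a cohort computed at time `now`.
    """
    cutoff = now - WINDOW
    return [domain for domain, visited_at in visits if visited_at > cutoff]
```

Under this model, a visit to a publisher's site stops affecting the cohort once it is more than a week old, whether or not the publisher later opts out.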

dmarti · 2021-03-22T17:11:24Z

If a Chrome extension injects an ad into a page, will the page be counted as a page that loads ads-related resources for purposes of FLoC training? (Related: #61)

michaelkleber mentioned this issue Mar 5, 2021: Make Computation opt-in for websites #51 (Closed)

michaelkleber mentioned this issue Mar 8, 2021: Availability for experimentation #25 (Closed)

lbdvt mentioned this issue Mar 15, 2021: Pages used for cohort calculation during the Origin Trial #62 (Closed)

michaelkleber mentioned this issue Mar 18, 2021: PING Privacy Review: Other browser implementations #70 (Open)

michaelkleber mentioned this issue Mar 25, 2021: Access to CohortID - api vs protocol's header #79 (Open)

bcyphers mentioned this issue Apr 2, 2021: SimHash may leak information about aggregate traffic of specific publishers #90 (Open)

This was referenced Apr 15, 2021: Provide rationale for FLoC being opt-out rather than opt-in #103 (Closed); "Default allow" violates the core tenet of Internet communication protocols #106 (Open)

kyrofa mentioned this issue Apr 28, 2021: Add permission-policy header, opt out of FloC nextcloud-snap/nextcloud-snap#1715 (Closed)

TheMaskMaker mentioned this issue May 12, 2021: Integration with FedCM (formerly WebID) privacycg/is-logged-in#44 (Open)
