%%%
title = "Test to Falsify Claims of End-to-End Security or End-to-End Encryption in Encrypted Messenger or Encrypted Messaging Apps"
abbrev = "e2esm"
category = "info"
docName = "apparently this tool demands a doc name but does not use it"
ipr = "trust200902"
area = "Internet"
workgroup = "individual submission"
keyword = ["messaging", "end to end", "end to end encryption", "end to end secure", "end to end security", "encryption", "security"]

[seriesInfo]
status = "informational"
name = "Internet-Draft"
value = "draft-muffett-end-to-end-secure-messaging-04"
stream = "IETF"

[[author]]
initials = "A."
surname = "Muffett"
fullname = "Alec Muffett"
organization = "Security Researcher"

[author.address]
email = "[email protected]"
%%%
.# Abstract
This draft describes a test which MAY be used to falsify claims that a messaging or messenger application, platform, solution, or service ("messaging solution") provides either or both of "end-to-end security" or "end-to-end encryption" (either or both: "E2E").
Any messaging solution, or clearly defined subset thereof, which claims to provide E2E, MUST satisfy this test; however, satisfaction of this test is not by itself sufficient to determine that the messaging solution actually provides E2E.
{mainmatter}
"End-to-end security" and "end-to-end encryption" offer digital analogues of "closed distribution lists" for sharing content amongst a set of intended recipients, where all others are fully excluded from access to content.
This draft assumes a specific application of "end-to-end security" or "end-to-end encryption" towards the specific use case of individual and group messaging solutions where entities who are later added to a messaging group MUST NOT be able to access previously-sent content.
In turn, use cases for such messaging solutions include the sending and receiving of any or all of:
- UNICODE or ASCII messages
- images, video files or audio files
- one-way streaming video or audio
- two-way streaming video or audio, as in live calls
The application of this test does not depend upon whether the messaging solution is built upon a centralized, distributed, hybrid, or any other network model.
Comments are solicited and should be addressed to the working group's mailing list and/or the author(s).
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [@RFC2119] [@RFC8174] when, and only when, they appear in all capitals, as shown here.
The following terminology SHALL be used for this test.
An "entity" is that which is distinguished by possessing a distinct [@TrustedComputingBase].
Use cases of an entity MAY include being a human being, a software bot, a conversation archiver, or some other thing that sends and/or receives messages.
Plaintext content ("content") is information of 0 or more bits, to be communicated.
Plaintext content and sensitive metadata ("PCASM") is the union set of content and associated "metadata" that describes the content, comprising any or all of:
Content Metadata is any data that can offer better than 50% certainty regarding the value of any bit of the content. Self-evidently, this also includes the value of the content itself.
- For block encryption of content, "size metadata" is the unpadded size of the content.
- For stream encryption of content, "size metadata" is currently undefined. (TODO)
- For transport encryption of content, accurate "size metadata" SHOULD NOT be observable or inferable.
Analytic Metadata is data that analyses, describes, reduces, or summarises the content.
A "message" is zero-or-more bits of content which has been composed by a sender and which is bound to a fixed and immutable set of zero or more recipients ("intended recipients") for that message.
A "recipient" of a message is an entity which MAY derive any PCASM for that message. Recipients of a message MAY exist outside of the set of intended recipients for that message. Means of derivation MAY include analysis of a larger corpus of messages.
The "sender" of a message is an entity which composes that message to a set of intended recipients and sends that message into the messaging solution.
A "platform" is an entity which provides a messaging solution.
TODO
A "backdoor" is any intentional or unintentional feature of a messaging solution whereby, in respect of a given message, some PCASM of that message MAY become available to an entity that is not an intended recipient of that message, other than by the intentional action of an intended recipient.
The following preconditions MUST be met for the test to be satisfied. Failure to satisfy these preconditions is a failure of the test.
For any message, there exists no method to access its PCASM where that method is not equally available to all recipients.
TODO, obvious
TODO, obvious
Consistent inheritance of group membership as intended recipients in centralized messaging solutions
TODO, no cheating or sneaky insert/elisions
The test fails if, for any message that is sent through the messaging solution, the set of recipients for that message exceeds the set of intended recipients for that message.
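The failure condition above is a subset check, and can be sketched in a few lines of Python. This is a minimal illustration, not part of the test's definition: the `Message` record and the `e2e_test_fails` function are hypothetical names, and the sketch assumes an auditor who already knows, for each message, both sets of entities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical auditor's record of one message."""
    intended_recipients: frozenset  # fixed by the sender at composition
    recipients: frozenset           # every entity able to derive any PCASM


def e2e_test_fails(messages):
    """Return True if any message has a recipient outside its intended set."""
    return any(not m.recipients <= m.intended_recipients for m in messages)


# Two illustrative messages: one closed, one leaking to the platform.
ok = Message(frozenset({"bob", "carol"}), frozenset({"bob", "carol"}))
leaky = Message(frozenset({"bob", "carol"}),
                frozenset({"bob", "carol", "platform"}))
```

In this sketch `e2e_test_fails([ok])` is false and `e2e_test_fails([ok, leaky])` is true, because a single message with one unintended recipient is enough to fail the whole solution.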
TODO, non-PCASM, stuff out of scope, Ricochet, etc.
A "conversation" is a sequence of one or more messages, and the responses or replies to them, over a period of time, amongst a constant or evolving set of participants.
A given platform MAY distinguish between and support more than one conversation at any given time.
In "centralised" E2ESM such as WhatsApp or Signal, the software MAY offer collective "group" conversation contexts that provide prefabricated sets of recipients for the client to utilise when a message is composed or sent.
In "decentralised" E2ESM such as PGP-Encrypted Email or Ricochet the recipients of each message are individually determined by each sender at the point of composition; however "group" metadata may also exist, in terms of (e.g.) email addressees or subject lines.
For a series of one or more "messages", each of which is composed of "plaintext content and sensitive metadata" (PCASM), and which together constitute a "conversation" amongst a set of "participants", providing E2ESM will require:
In the nature of "closed distribution lists", the participants in a message MUST be frozen into an immutable set at the moment when the message is composed or sent.
The complete set of all recipients MUST be visible to the sender at the moment of message composition or sending.
The complete set of participants in a message MUST be visible to all other participants.
Excusing the "retransmission exception", PCASM of any given message MUST only be available to the fixed set of conversation participants from whom, to whom, and at the time when it was sent.
If a participant that can access an "original" message intentionally "retransmits" (e.g. quotes, forwards) that message to create a new message within the E2ESM software, then the original message's PCASM MAY become available to a new, additional, and possibly different set of conversation participants, via that new message.
All participants MUST be peers, i.e. they MUST have equal access to the PCASM of any message; see also "Integrity of Participation".
The set of participants in a conversation SHALL NOT be increased except by the intentional action of one or more existing participants.
Per "Transparency of Participation", that action (introducing a new participant) MUST be visible to all other participants.
Existing participants MAY publicly share links to the conversation, identifying data to assist discovery of the conversation, or other mechanisms to enable non-participant entities to subscribe themselves as conversation participants. This MAY be considered legitimate "intentional action" to increase the set of participants in the group.
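The rule that a conversation can only be extended from within, and visibly so, might be modelled as follows. This is a sketch under stated assumptions: the `Conversation` class, its `notices` list, and the participant names are all hypothetical, and a real system would enforce the rule cryptographically, not with an access check.

```python
class Conversation:
    """Hypothetical model of a conversation that is only extensible from within."""

    def __init__(self, founders):
        self.participants = set(founders)
        self.notices = []  # (recipient, text) pairs shown to participants

    def add_participant(self, actor, newcomer):
        # Only an existing participant may enlarge the set.
        if actor not in self.participants:
            raise PermissionError(f"{actor} is not a participant")
        # Make the change visible to everyone already present.
        for p in sorted(self.participants):
            self.notices.append((p, f"{actor} added {newcomer}"))
        self.participants.add(newcomer)


chat = Conversation({"alice", "bob"})
chat.add_participant("alice", "carol")   # allowed: alice is a participant
try:
    chat.add_participant("mallory", "eve")  # refused: mallory is an outsider
    injected = True
except PermissionError:
    injected = False
```

Here the attempt by the non-participant is rejected, while the legitimate addition leaves a notice for every pre-existing participant.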
Where there exists centralised E2ESM software that hosts participants:
- The E2ESM software MUST provide each participant entity with means to review or revoke access for that participant's clients or devices that can access future PCASM.
- The E2ESM software MUST provide each participant entity with notifications and/or complete logs of changes to the set of clients or devices that can or could access message PCASM.
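One way to picture these two requirements together is a per-participant registry that supports review and revocation and records every change. The sketch below is illustrative only; `DeviceRegistry` and the device names are hypothetical, and no particular platform is claimed to work this way.

```python
class DeviceRegistry:
    """Hypothetical per-participant registry of clients and devices."""

    def __init__(self):
        self.devices = set()
        self.log = []  # append-only list of (action, device) entries

    def link(self, device):
        # Record every newly linked client so the participant can see it.
        self.devices.add(device)
        self.log.append(("linked", device))

    def revoke(self, device):
        # Revocation removes future access and is itself logged.
        if device in self.devices:
            self.devices.remove(device)
            self.log.append(("revoked", device))

    def review(self):
        # The participant can list every device that currently has access.
        return sorted(self.devices)


reg = DeviceRegistry()
reg.link("phone")
reg.link("web-session")
reg.revoke("web-session")
```

After these calls `reg.review()` lists only the phone, while `reg.log` still records that the web session was once linked, which is what lets a participant notice a device they did not add.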
This explanatory section regarding the principles has been broken out for clarity and argumentation purposes.
Content PCASM MUST be protected as it comprises that which is "closed" from general distribution.
The test for measuring this is (intended to be) modeled upon ciphertext indistinguishability [@CipherInd].
Exact size PCASM MUST be protected as it MAY offer insight into Content PCASM.
The test for measuring this is (intended) to address risk of content becoming evident via plaintext length.
Analytic PCASM MUST be protected as it MAY offer insight into Content PCASM, for instance that the content shares features with other, specimen, or known plaintext content.
Conversational Metadata MAY offer insight into Content PCASM, however the abstractions of transport mechanism, group management, or platform choice, MAY render this moot.
For example, a PGP-Encrypted email distribution list named "[email protected]" would leak its implicit topic and participant identities to capable observers.
The term "participant" in this document exists to supersede the more vague notion of "end" in the phrase "end-to-end".
Entities, and thus participants, are defined in terms of their [@TrustedComputingBase] to acknowledge that an entity MAY legitimately store, forward, or access messages by means that are outside of the E2ESM software.
It is important to note that the concept of "entity" as defined by their TCB, is the foundation for all other trust in E2ESM. This develops from the basic definitions of a [@TrustedComputingBase] and from the concepts of "trust-to-trust" discussed in [@RoleOfTrust]. Failure of a participant to maintain integrity or control over their TCB should not be considered a failure of an E2ESM that connects it to other participants.
For example: if a participant accesses their E2ESM software via remote desktop software, and their RDP session is hijacked by a third party; or if they back up their messages in cleartext to cloud storage, leading somehow to data exfiltration, neither of these would be a failure of E2ESM. This would instead be a failure of the participant's [@TrustedComputingBase].
Further: it would be obviously possible to burden an E2ESM with surfacing potential integrity issues of any given participant to other participants, e.g. "patch compliance". But to require such in this standard would risk harming the privacy of the participant entity. See also: "Mutual Identity Verification" in "OPTIONAL Features of E2ESM".
The "ends" of "end to end" are the participants; for a message to be composed to be exclusively accessible to that set of participants, all participants must be visible.
For decentralised "virtual point-to-point" E2ESM solutions such as PGP-Encrypted Email or Ricochet, the set of participants is fixed by the author at the time of individual message composition, and MUST be visible to all participants.
For "centralised" E2ESM solutions such as Signal or WhatsApp, the set of participants is a "group context" shared amongst all participants and at the time of individual message composition it MUST be inherited into a set of "fixed" per-participant access capabilities by the author.
Inherent in the term "end to end secure messenger" is the intention that PCASM will only be available to the participants ("ends") at the time the message was composed.
If this were not the intention, we would deduce that an E2ESM would automatically make past content available to newly-added conversation participants, thereby breaking forward secrecy. This is not a characteristic of any E2ESM, but it is characteristic of several non-E2ESM. Therefore the converse is true.
As a concrete example this means that participants who are newly added to a "group" MUST NOT be able to read messages that were sent before they joined that group - unless (for instance) one pre-existing participant is explicitly intended to provide a "searchable archive" or similar function. The function of such a participant is considered to be out of scope for the messenger.
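The behaviour described here, where a centralised "group context" is inherited into a fixed per-message set at the time of composition, can be sketched as a snapshot. The `Group` class and its method names are hypothetical, and plain strings stand in for entities and content.

```python
class Group:
    """Hypothetical centralised "group context" of current members."""

    def __init__(self, members):
        self.members = set(members)
        self.sent = []  # list of (content, frozen participant set)

    def send(self, content):
        # Freeze the participant set at the moment of composition.
        self.sent.append((content, frozenset(self.members)))

    def readable_by(self, entity):
        # An entity can read only messages whose frozen set includes it.
        return [c for c, who in self.sent if entity in who]


group = Group({"alice", "bob"})
group.send("first")
group.members.add("carol")   # carol joins after "first" was sent
group.send("second")
```

Because each message carries its own immutable copy of the membership, `group.readable_by("carol")` yields only the second message, while the original members can read both.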
Without equality of participation it would be allowed for a person to deploy a standalone cleartext chat server, available solely over TLS-encrypted links, declare themselves to be "participants" in every conversation from its outset, access all message PCASM on that basis, and yet call themselves an E2ESM.
So this is an "anti-cheating" clause: all participant access to PCASM MUST be via the same mechanisms for all participants without favour or privilege, and in particular PCASM MUST NOT be available via other means, e.g. raw block-device access, raw filestore, raw database access, or network sniffing.
If a conversation is not "only extensible from within" then it would be possible for participants to be injected into the conversation thereby defeating the closure of message distribution.
A subtle centralised vs. decentralised edge-case is as follows: consider a PGP-encrypted email distribution list. Would it break "closure of conversation" for a non-participant email administrator to simply add new users to the maillist?
Answer: no, because in this case the maillist is functioning as a "platform" for multiple "conversation" threads, and mere addition of a new "transport-level" maillist member would not include them as a participant in ongoing E2ESM conversations; such inclusion would be a future burden upon existing participants.
However: similar external injection of a new entity into a centralised WhatsApp or Signal "group" would be clearly a breach of "closure of conversation".
There is little benefit in requiring conversations to be closed against "participant injection" if a non-participant may obtain PCASM access by forcing a platform to silently add extra means of PCASM access to an existing participant on behalf of that non-participant.
Therefore to be an E2ESM the platform MUST provide the described management of participant clients and devices.
"Disappearing", "expiring", "exploding", "ephemeral" or other forms of time-limited access to PCASM are strongly RECOMMENDED but not obligatory mechanisms for E2ESM, not least because they are impossible to implement in a way that cannot be circumvented by e.g. screenshots.
Some manner of "shared key" which mutually assures participant identity and communications integrity is a strongly RECOMMENDED but not obligatory mechanism for E2ESM.
The benefits of such mechanisms are limited to certain perspectives of certain platforms.
For instance: in Ricochet the identity key of a user is the absolute source of truth for their identity, and excusing detection of typographic errors there is nothing which can be added to that in order to further assure their "identity".
Similarly WhatsApp provides each participant with a "verifiable security QR code" and "security code change notifications", but these codes do not "leak" the number of "WhatsApp For Web" connections, desktop WhatsApp applications, or other clients which are bound to the E2ESM software which executes on that phone.
Participant-client information of this kind MAY be a highly private aspect of that participant's TCB, and SHOULD be treated sensitively by platforms.
Consider an example message with content ("content") of "Hello, world.", which for the purposes of this example is encoded as an ASCII string of length 13 bytes without a terminator character.
Examples of Content PCASM would include, non-exclusively:
- The content is "Hello, world."
- The content starts with the word "Hello"
- The top bit of the first byte of the content is zero
- The MD5 hash of the content is 080aef839b95facf73ec599375e92d47
- The Salted-MD5 Hash of the content is: ...
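Several of the facts listed above can be derived mechanically from the content, which is why each of them counts as Content PCASM. The following sketch computes them for the example string; the `facts` dictionary and its key names are hypothetical, and the digest is computed, not asserted, so that a reader can compare it with the value listed above.

```python
import hashlib

# The example content, encoded as ASCII without a terminator.
content = "Hello, world.".encode("ascii")

facts = {
    # The content starts with the word "Hello".
    "starts_with_hello": content.startswith(b"Hello"),
    # The most significant bit of the first byte.
    "top_bit_of_first_byte": content[0] >> 7,
    # Hex-encoded MD5 digest of the content.
    "md5_hex": hashlib.md5(content).hexdigest(),
}
```

Any entity holding even one of these values has better-than-chance knowledge of some bits of the content, without holding the content itself.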
Size PCASM is defined in the main text, as it relates to the transport and/or content encryption mechanisms.
Examples of Analytic PCASM would include, non-exclusively:
- The content contains the substring "ello"
- The content does not contain the word "Goodbye"
- The content contains a substring from amongst the following set: ...
- The content does not contain a substring from amongst the following set: ...
- The hash of the content exists amongst the following set of hashes: ...
- The hash of the content does not exist amongst the following set of hashes: ...
- The content was matched by a machine-learning classifier with the following training set: ...
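Analytic PCASM of the kinds listed above is typically the output of a predicate applied to the content. The sketch below evaluates a few such predicates for the example string. The watch lists are hypothetical stand-ins for the sets that the list above elides, and SHA-256 is an arbitrary choice of hash for illustration.

```python
import hashlib

content = b"Hello, world."

# Hypothetical watch lists; the real sets are not specified in this draft.
substrings = {b"ello", b"Goodbye"}
known_hashes = {hashlib.sha256(b"Hello, world.").hexdigest()}

analytic = {
    "contains_ello": b"ello" in content,
    "lacks_goodbye": b"Goodbye" not in content,
    "matches_any_substring": any(s in content for s in substrings),
    "hash_in_set": hashlib.sha256(content).hexdigest() in known_hashes,
}
```

Each result is a single bit, yet each one narrows down what the content can be, which is the reason such results need the same protection as the content.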
Examples of Conversation Metadata would include, non-exclusively:
- maillist email addresses
- maillist server names
- group titles
- group topics
- group icons
- group participant lists
Information which would not be PCASM would include, non-exclusively:
- The content is sent from Alice
- The content is sent to Bob
- The content is between 1 and 16 bytes long
- The content was sent at the following date and time: ...
- The content was sent from the following IP address: ...
- The content was sent from the following geolocation: ...
- The content was composed using the following platform: ...
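The size example in the list above ("between 1 and 16 bytes long") can be read as the result of rounding the unpadded size up to a block boundary. The sketch below assumes a 16-byte block and simple rounding up; it is not a description of any specific padding scheme, and `padded_size` is a hypothetical helper.

```python
BLOCK = 16  # assumed block size in bytes, matching the 1-16 byte example


def padded_size(unpadded, block=BLOCK):
    """Round a size up to a multiple of the block size."""
    return -(-unpadded // block) * block


# Sizes of 1, 13, and 16 bytes all land in the same 16-byte bucket.
buckets = [padded_size(n) for n in (1, 13, 16, 17)]
```

An observer who sees only the padded size learns which bucket the content falls into, but not the exact unpadded size, which is the quantity treated as size metadata for block encryption.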
A different approach to defining (specifically) end-to-end encryption is discussed in [@I-D.knodel-e2ee-definition].
Live working drafts of this document are at: https://github.com/alecmuffett/draft-muffett-end-to-end-secure-messaging
This document has no IANA actions.
This document is entirely composed of security considerations.
{backmatter}
<title>The End-to-End Argument and Application Design: The Role of Trust</title>
<title>Ricochet Refresh</title>
<title>BREACH</title>
<title>Ciphertext indistinguishability</title>
<title>Clipper chip</title>
<title>Crypto Wars</title>
<title>Dual-use technology</title>
<title>Export of cryptography from the United States</title>
<title>Logjam</title>
<title>Meltdown</title>
<title>Spectre</title>
<title>Trusted Computing Base</title>