
Chapter 43

Privacy-Enhancing Technologies1
Simone Fischer-Hübner
Professor, Karlstad University, Karlstad/Sweden

Stefan Berthold
Karlstad University

Privacy is considered a core value and is recognized either explicitly or implicitly as a fundamental human right by most constitutions of democratic societies. In Europe, the foundations for the right to privacy of individuals were embedded in the European Convention on Human Rights and Fundamental Freedoms of 1950 (Art. 8) and the Charter of Fundamental Rights of the European Union in 2009 (Art. 7 & 8). In 1980, the OECD recognized the importance of privacy protection with the publication of the OECD Privacy Guidelines [1], which served as the foundation for many national privacy laws.

1. THE CONCEPT OF PRIVACY

Privacy as a social and legal issue has long been a concern of social scientists, philosophers, and lawyers. The first definition of privacy by legal researchers was given by the two American lawyers Samuel D. Warren and Louis D. Brandeis in their famous Harvard Law Review chapter “The Right to Privacy” [2], in which they defined privacy as “the right to be let alone.” At that time, the risks of modern technology in the form of photography used by the yellow press to infringe privacy rights were the motivation for Warren and Brandeis’s discussion of the individual’s right to privacy.

In the age of modern computing, an early and often referred to definition of privacy was given by Alan Westin: “Privacy is the claim of individuals, groups and institutions to determine for themselves, when, how and to what extent information about them is communicated to others” [3]. Even though, according to Westin’s definition, natural persons (humans) as well as legal persons (groups and institutions) have a right to privacy, in most legal systems privacy is defined as a basic human right that only applies to natural persons.

In general, the concept of personal privacy has several dimensions. This chapter mainly addresses the dimension of informational privacy, which can be defined, similarly to Westin and to the German Constitutional Court in its Census decision,2 as the right to informational self-determination (the right of individuals to determine for themselves when, how, and to what extent information about them is communicated to others). Furthermore, so-called spatial privacy can be defined as another dimension of the concept of privacy, which also covers the “right to be let alone,” where spatial privacy is defined as the right of individuals to control what is presented to their senses [4]. Further dimensions of privacy, which will, however, not be the subject of this chapter, are territorial privacy, which concerns the setting of limits on intrusion into the domestic, workplace, and other environments (public spaces), and bodily privacy, which concerns protecting a person against undue interference, such as physical searches, drug testing, or information violating his or her moral sense (see [5,6]).

Data protection concerns protecting personal data in order to guarantee privacy and is only part of the concept of privacy. Privacy, however, is not an unlimited or absolute right, because it can be in conflict with other rights or legal values and because individuals cannot participate fully in society without revealing personal data. Nevertheless, even in cases where privacy needs to be restricted, the very core of privacy still needs to be protected. For this reason, the objective of privacy and data protection laws is to define fundamental privacy principles that need to be enforced if personal data is collected, stored, or processed.

1. Parts of this work were conducted within the scope of the PetWeb II project funded by the Norwegian Research Council (NFR) and the U-PrIM project funded by the Swedish Knowledge (KK) Foundation.

2. German Constitutional Court, Census decision, 1983 (BVerfGE 65,1).


An Agenda for Action for Privacy-Enhancing Technologies

The Privacy-Enhancing Technology (PET) Concept: (Check All Tasks Completed)
_____1. Enforces making sparing use of data.
_____2. Makes privacy the default.
_____3. Transfers control to individuals.
_____4. Sends tags to a secure mode automatically.
_____5. Can prove that automatic activation of the secure mode always works.
_____6. Prevents eavesdropping of tag-reader communication.
_____7. Protects individuals from producer.
_____8. Protects individuals from retailer.
_____9. Protection includes in-store problem.
_____10. Protects tag in secure mode against presence-spotting.
_____11. Does not require individuals to take active protection measures.
_____12. Does not interfere with active protection measures.
_____13. Avoids creation and use of central database(s).
_____14. Avoids creation and use of databases at all.
_____15. Enables functionality after point-of-sale in a secure way.
_____16. Can be achieved without changing Radio Frequency Identification (RFID) physical technology.
_____17. Does not make tags much more expensive.
_____18. Does not introduce additional threats to privacy.
_____19. Introduces additional benefits for privacy.
_____20. Provides benefits for the retailer.

2. LEGAL PRIVACY PRINCIPLES

In this section, we present an overview of internationally well-accepted, basic legal privacy principles, for which PETs implementing these principles have been developed. These principles are also part of the general EU Data Protection Directive 95/46/EC [7], which is an important legal instrument for protection of privacy in Europe, as it codifies general privacy principles that have been implemented in the national privacy laws of all EU member states and of many other states. The principles also correspond to principles of the OECD Privacy Guidelines, to which we will also refer.

Legitimacy

Personal data processing has to be legitimate, which according to Art. 7 EU Directive 95/46/EC is usually the case if the data subject3 has given his unambiguous (and informed) consent, if there is a legal obligation, or if there is a contractual agreement (cf. the Collection Limitation Principle of the OECD Guidelines). The requirement of informed consent poses special challenges for the design of user interfaces (UIs) for PETs.

Purpose Specification and Purpose Binding (Also Called Purpose Limitation)

Personal data must be collected for specified, explicit, and legitimate purposes and may not be further processed in a way that is incompatible with these purposes (Art. 6 I (b) EU Directive 95/46/EC). The purpose limitation principle is of key importance for privacy protection, as the sensitivity of personal data does not only depend on how “intimate” the details described by the personal data are, but is also mainly influenced by the purposes of data processing and the context of use. For this reason, the data processing purposes need to be specified in advance by the lawmaker or by the data processor before obtaining the individual’s consent, and personal data may later not be (mis)used for any other purposes (cf. Purpose Specification and Use Limitation Principles of the OECD Guidelines). The objective of privacy policy languages and tools is to enforce this principle.

Data Minimization

The processing of personal data must be limited to data that are adequate, relevant, and not excessive (Art. 6 I (c) EU Directive 95/46/EC). Besides, data should not be kept in a personally identifiable form any longer than necessary (Art. 6 I (e) EU Directive 95/46/EC—cf. Data Quality Principle of the OECD Guidelines, which requires that data should be relevant to the purposes for which they are to be used). In other words, the collection of personal data and the extent to which personal data are used should be minimized, because obviously privacy is best protected if no personal data at all (or at least as little data as possible) are collected or processed. The data minimization principle derived from the Directive also serves as a legal foundation for PETs (see checklist: An Agenda for Action for Privacy-Enhancing Technologies) that aim at protecting “traditional” privacy goals, such as anonymity, pseudonymity, or unlinkability for users and/or other data subjects.

3. A data subject is a person about whom personal data is processed.

Transparency and Rights of the Data Subjects

Transparency of data processing means informing a data subject about the purposes and circumstances of data processing, identifying who is requesting personal data, how the personal data flow, where and how long the data are stored, and what type of rights and controls the data subject has in regard to his personal data. The Directive 95/46/EC provides data subjects with respective information rights according to its Art. 10. Transparency is a prerequisite for informational self-determination, as “a society, in which citizens can no longer know who knows what about them, when, and in which situations, would be contradictory to the right of informational self-determination.”4 Further rights of the data subjects include the right of access to data (Art. 12 (a) EU Directive 95/46/EC), the right to object to the processing of personal data (Art. 14 EU Directive 95/46/EC), and the right to correction, erasure, or blocking of incorrect or illegally stored data (Art. 12 (b) EU Directive 95/46/EC, cf. Openness and Individual Participation Principle of the OECD Guidelines).

Security

The data controller needs to install appropriate technical and organizational security mechanisms to guarantee the confidentiality, integrity, and availability of personal data (Art. 17 EU Directive 95/46/EC) (cf. Security Safeguards Principle of the OECD Guidelines). Classical security mechanisms, such as authentication, cryptography, access control, or security logging, which need to be implemented for technical data protection, will, however, not be discussed in this chapter.

In January 2012, the EU Commission published a proposal for a new EU Regulation [8], which defines a single set of modernized privacy rules and which (once the regulation is in force) will be directly valid across the EU. In particular, it includes the principle of data protection/privacy by design and by default (Art. 23), requiring building PETs already into the initial system design. Besides, the requirements of transparency for data handling and easily accessible policies (Art. 11) are explicitly included. Furthermore, the right to be forgotten (Art. 17) and a right to data portability (Art. 18) are newly included, which require technical support by appropriate PETs.

4. German Constitutional Court, Census decision, 1983 (BVerfGE 65,1).

3. CLASSIFICATION OF PETS

Privacy-enhancing technologies (PETs) can be defined as technologies that enforce legal privacy principles in order to protect and enhance the privacy of users of information technology (IT) and/or data subjects (see [9]). While many fundamental PET concepts for achieving data minimization were already introduced, mostly by David Chaum, in the 1980s (see [10-12]), the term privacy-enhancing technologies was first introduced in 1995 in a report on “privacy-enhancing technologies,” which was jointly published by the Dutch Registratiekamer and the Information and Privacy Commissioner in Ontario/Canada [13]. PETs can basically be divided into three different classes.

The first class comprises PETs for enforcing the legal privacy principle of data minimization by minimizing or avoiding the collection and use of personal data of users or data subjects. These types of PETs provide the “traditional” privacy goals of anonymity, unlinkability, unobservability, and pseudonymity, which will be elaborated on in the next section. This class of PETs can be further divided, dependent on whether data minimization is achieved on the communication or application level. Examples for some of the most prominent data minimization technologies will be presented in Section 6.

While data minimization is the best strategy for protecting privacy, there are many occasions in daily life when individuals simply have to reveal personal data, or when users want to present themselves by disseminating personal information (especially on social network sites). In these cases, the privacy of the individuals concerned still needs to be protected by adhering to other relevant legal privacy requirements. The second class of PETs therefore comprises technologies that enforce legal privacy requirements, such as informed consent, transparency, right to data subject access, purpose specification and purpose binding, and security, in order to safeguard the lawful processing of personal data. In this chapter, so-called transparency-enhancing technologies will be discussed, which are PETs that enforce or promote informed consent and transparency. Besides, we will refer to privacy models and privacy authorization languages for enforcing the principle of purpose binding.

The third class of PETs comprises technologies that combine PETs of the first and second class. An example is provided by privacy-enhancing identity management technologies such as the ones that have been developed within the EU FP6 project PRIME (Privacy and Identity Management for Europe) [14]. The PRIME architecture supports strong privacy by default by anonymizing the underlying communication and achieving data minimization on the application level by the use of anonymous credential protocols.

Besides privacy presentation and negotiation tools, a privacy authorization language for enforcing negotiated policies and tools allowing users to “track” and access the data that they released to remote services’ sides ensure the technical enforcement of all privacy requirements mentioned in the section above.

4. TRADITIONAL PRIVACY GOALS OF PETS

In this section, privacy goals for achieving data minimization are defined, which we call traditional privacy goals because early PETs that were developed already in the 1980s followed these goals. Data minimization as an abstract strategy describes the avoidance of unnecessary or unwanted data disclosures. The most fundamental information that can be disclosed about an individual is who he is, that is, an identifier, or which observable events he is related to. If this information can be kept secret, the individual remains anonymous. Pfitzmann and Hansen, who pioneered the technical privacy research terminology, define anonymity as follows: Anonymity of a subject means that the subject is not identifiable within a set of subjects, the anonymity set [14].

By choosing the term subject, Pfitzmann and Hansen aim to define the term anonymity as generally as possible. The subject can be any entity defined by facts (names or identifiers) or causing observable events, such as by sending messages. If an adversary cannot narrow down the sender of a specific message to less than two possible senders, the actual sender of the message remains anonymous. The two or more possible senders in question form the anonymity set. The anonymity set, and particularly its size, will be the first privacy metrics discussed in Section 5, Privacy Metrics.

An adversary that discovers the relation between a fact or an event and a subject identifies the subject. Relations cannot only exist between facts or events and subjects, but may exist between facts, actions, and subjects. An adversary may, for instance, discover that two messages have been sent by the same subject, without knowing this subject. The two messages would be part of the equivalence relation [15] that is formed by all messages that have been sent by the same subject. Knowing this equivalence relation (and maybe even others) helps the adversary to identify the subject. Pfitzmann and Hansen define the inability of the adversary to discover these equivalence relations as unlinkability.

Unlinkability of two or more items of interest (IOIs (subjects, messages, actions, . . . )) from an attacker’s perspective means that within the system (comprising these and possibly other items), the attacker cannot sufficiently distinguish whether these IOIs are related or not [14].

A special type of unlinkability is the unlinkability of a sender and recipient of a message (or so-called relationship anonymity), which means that the relation of who is communicating with whom is kept secret. Data minimization can also be implemented through obfuscating the presence of facts and events. The idea is that adversaries who are unable to detect the presence of facts or events cannot link them to subjects. Pfitzmann and Hansen define this privacy goal as undetectability.

Undetectability of an item of interest (IOI) from an attacker’s perspective means that the attacker cannot sufficiently distinguish whether it exists or not [14].

The strongest privacy goal in data minimization is unobservability, which combines undetectability and anonymity: Unobservability of an item of interest (IOI) means:

● undetectability of the IOI against all subjects uninvolved in it and
● anonymity of the subject(s) involved in the IOI even against the other subject(s) involved in that IOI [14].

The third and last way to implement data minimization, apart from obfuscating the facts, the events (undetectability, unobservability), or the relation between them and the subjects (unlinkability), is the use of pseudonyms in the place of subjects. Pseudonyms may be random numbers, email addresses, or (cryptographic) certificates. In order to minimize the disclosed information, pseudonyms must not be linkable to the subject. The corresponding privacy goal is pseudonymity: Pseudonymity is the use of pseudonyms as identifiers [14].

Pseudonymity is related to anonymity, as both concepts aim at protecting the real identity of a subject. The use of pseudonyms, however, allows maintaining a reference to the subject’s real identity (for accountability purposes [16]). A trusted third party could, for instance, reveal the real identities of misbehaving pseudonymous users. Pseudonymity also enables a user to link certain actions under one pseudonym. For example, a user could reuse the same pseudonym in an online auction system (such as eBay) for building up a reputation.

The degree of anonymity protection provided by pseudonyms depends on the amount of personal data of the pseudonym holder that can be linked to the pseudonym, and on how often the pseudonym is used in various contexts/for various transactions. The best privacy protection can be achieved if for each transaction a new so-called transaction pseudonym is used that is unlinkable to any other transaction pseudonyms and at least initially unlinkable to any other personal data items of its holder (see also [14]).
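To make the idea of transaction pseudonyms concrete, the following minimal Python sketch (not from the original chapter; function and variable names are illustrative) contrasts a reused persistent pseudonym with fresh, unlinkable transaction pseudonyms generated per transaction:

```python
import hashlib
import secrets

def persistent_pseudonym(user_id: str, salt: bytes) -> str:
    """One stable pseudonym per user: all transactions made under it are linkable."""
    return hashlib.sha256(salt + user_id.encode()).hexdigest()[:16]

def transaction_pseudonym() -> str:
    """Fresh random pseudonym per transaction: transactions are unlinkable to each
    other and, at least initially, to the holder."""
    return secrets.token_hex(16)

if __name__ == "__main__":
    salt = secrets.token_bytes(16)
    # Two purchases under a persistent pseudonym carry the same identifier ...
    print(persistent_pseudonym("alice", salt), persistent_pseudonym("alice", salt))
    # ... while two purchases under transaction pseudonyms share nothing.
    print(transaction_pseudonym(), transaction_pseudonym())
```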

5. PRIVACY METRICS

Privacy metrics aim to quantify the effectiveness of schemes or technologies with regard to the privacy goals defined in the previous section. A simple metrics for measuring anonymity is the anonymity set [14]. The anonymity set comprises all subjects that may have caused an event that is observed by the adversary. The subjects in the set cover up for each other against the adversary. The adversary can thus not hold a single subject responsible for the observed event as long as the set size of the anonymity set is greater than one. Greater set sizes are useful for protecting against (stronger) adversaries that would accept false positives up to a certain threshold among the subjects that are held responsible. In this case, the set size has to exceed such a threshold.

The anonymity set size is also related to the metrics in k-anonymity [17]. K-anonymity is defined as a property or a requirement for databases that must not leak sensitive private information. This concept was also applied as an anonymity metrics in location-based services [18] and in VoIP [19,20]. The underlying assumption is that database tables store two kinds of attributes. The first kind is identifying information, and the second is sensitive information. The claim is that the database table is anonymous if every search for identifying information results in a group of at least k candidate records. The k in k-anonymity is thus the privacy parameter that determines the minimum group size. Groups of candidate records form anonymity sets of identifiers in the database table.

The k-anonymity as a privacy property and k as a metrics are not undisputed. In particular, the fact that k-anonymity only depends on the identifying information in the database table—that is, it is independent of the sensitive information—leads to remaining privacy risks [21]. A simple attack building on the k-anonymity’s blindness for sensitive information is described in [22]. The trick is to search for candidate groups where the sensitive attribute has a constant value for all candidates. The sensitive attribute value is immediately disclosed for all records in the candidate group; thus, privacy is breached. A solution to these risks is a new privacy property, l-diversity, with the privacy parameter l. The claim of l-diversity is that a database table is anonymous if the diversity of sensitive attribute values is at least l (> 1) in every candidate group. In some cases, l-diversification would not sufficiently protect from attribute disclosure or would be too difficult to establish. A third property, t-closeness [23], is solving this problem. T-closeness restricts the distribution of sensitive information within a candidate group. The claim is that a database table is anonymous if the distribution of sensitive information within each candidate group differs from the table’s distribution at most up to a threshold t.

All these privacy metrics with the exception of the anonymity set are tailored to static database tables. Changes in the data set are possible, but the privacy properties have to be reestablished afterward. T-closeness, as one of the latest developed properties in this category, is a close relative to information-theoretic privacy metrics. Information-theoretic metrics measure the information that an adversary learns by observing an event or a system, for example, all observable events in a communication system. The information always depends on the knowledge the adversary had before his observation, the a-priori knowledge. Knowledge is expressed as a probability distribution over the events. For a discrete set of events X = {x_1, x_2, . . ., x_n} and the probability mass function Pr: X → [0, 1], Shannon [24] defines the self-information of x_i, 1 ≤ i ≤ n, as

I(x_i) = −log2 Pr(x_i)

The self-information I(x_i) is what an adversary learns when he observes the event x_i with his a-priori knowledge about all possible events encoded in the probability mass function Pr. The self-information takes the minimal value zero for Pr(x_i) = 1; that is, the adversary will learn minimal information from observing x_i if he is a-priori certain that x_i will be observed. The self-information approaches infinity for Pr(x_i) → 0; that is, the more information the adversary learns, the less likely the observed events are. The expected self-information of a system with the events X is the entropy of the system. Shannon defines the entropy as

H(X) = Σ_{x_i ∈ X} Pr(x_i) · I(x_i)

The entropy is maximal when the distribution of all events is uniform; that is, Pr(x_i) = 1/n for all 1 ≤ i ≤ n. The maximal entropy is thus

H_max = −log2 (1/n) = log2 n

The entropy is minimal when one event x_i is perfectly certain; that is, Pr(x_i) = 1 and Pr(x_j) = 0 for all x_j ∈ X with x_j ≠ x_i.

The “degree of anonymity” is the entropy of the communication system in question [25,26]. This degree of anonymity measures the anonymity of the message’s sender within the communication system. When the adversary learns the actual sender of the message, he will learn the self-information I(x_i), where each x_i ∈ X encodes the fact that one specific subject is the sender of the message. The probability distribution Pr encodes the a-priori knowledge of the adversary about the sender. A uniform distribution (maximal entropy) indicates minimal a-priori knowledge (maximal sender anonymity). Absolute certainty, on the other hand (minimal entropy), indicates perfect a-priori knowledge (minimal sender anonymity). The degree of anonymity can be used to compare communication systems with different features (numbers of subjects) when normalized [25] with the maximal entropy H_max, and otherwise without normalization [26].
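The following short Python sketch (not part of the original chapter; names and the toy data are illustrative) shows how both families of metrics discussed above can be computed: the minimum candidate-group size k over a table’s identifying attributes, and the entropy-based degree of anonymity for an adversary’s probability distribution over possible senders:

```python
import math
from collections import Counter
from typing import Dict, List, Sequence

def k_anonymity(table: List[Dict[str, str]], identifying: Sequence[str]) -> int:
    """Smallest candidate-group size over the identifying attributes.
    The table is k-anonymous for every k up to this value."""
    groups = Counter(tuple(row[a] for a in identifying) for row in table)
    return min(groups.values())

def degree_of_anonymity(pr: Sequence[float], normalized: bool = True) -> float:
    """Entropy H(X) of the adversary's distribution over possible senders,
    optionally normalized by H_max = log2(n)."""
    h = -sum(p * math.log2(p) for p in pr if p > 0)
    if not normalized:
        return h
    h_max = math.log2(len(pr))
    return h / h_max if h_max > 0 else 0.0

if __name__ == "__main__":
    # Toy table: ZIP code and birth year are treated as identifying attributes.
    table = [
        {"zip": "65187", "year": "1980", "diagnosis": "flu"},
        {"zip": "65187", "year": "1980", "diagnosis": "asthma"},
        {"zip": "65189", "year": "1975", "diagnosis": "flu"},
        {"zip": "65189", "year": "1975", "diagnosis": "flu"},
    ]
    print("k =", k_anonymity(table, ["zip", "year"]))       # k = 2
    print(degree_of_anonymity([0.25, 0.25, 0.25, 0.25]))    # 1.0 (uniform distribution)
    print(degree_of_anonymity([0.7, 0.1, 0.1, 0.1]))        # < 1.0 (adversary has a-priori knowledge)
```

Note that the second candidate group in the toy table has a constant sensitive value (“flu”), which is exactly the situation that motivates l-diversity.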

The same entropy-based metrics can be applied to measure the anonymity provided by facts about a subject [27]. The adversary’s a-priori knowledge, encoded in the probability mass function Pr, comprises how well a feature vector with facts about a subject fits to each subject x_i ∈ X. The adversary learns the self-information I(x_i) when learning that the feature vector applies to the subject x_i. The entropy H(X) can be seen as the confusion of the adversary before he learns the subject that the feature vector is applying to.

The entropy can be used to calculate the expected anonymity set size, which is 2^H(X). The expected anonymity set size is equal to the anonymity set size if Pr corresponds to the uniform distribution (if the message [25,26] or feature vector [27] is not more or less linkable to one subject than to any other subject in X). The anonymity set size can be seen as an overestimation of the expected anonymity set size if Pr does not correspond to the uniform distribution; that is, some subjects are more linkable than others in the anonymity set. In this case, the expected anonymity set size or the degree of anonymity is the more accurate metrics.

Metrics that are based on entropy have the disadvantage that the a-priori knowledge Pr of the adversary has to be known when evaluating the metrics. When this a-priori knowledge can be derived from publicly available observations (message routing data in a network [25,26]), the metrics are easy to apply. If the Pr depends on personal information which is not available to nonadversaries [27], the metrics are hard to evaluate without additional tools being effective (legal transparency tools).

6. DATA MINIMIZATION TECHNOLOGIES

In this section, we will present the most relevant PETs for minimizing data on both communication and application levels. It is important to note that applications (such as eCommerce applications) can only be designed and used anonymously if their users cannot be identified on a communication level (via their IP addresses). Hence anonymous communication is a prerequisite for achieving anonymity, or more generally data minimization, on the application level.

Anonymous Communication

Already in 1981, David Chaum presented the Mix net protocol for anonymous communication, which has become the fundamental concept for many practical anonymous communication technologies that have been broadly used for many years. In this section, we will provide an overview of the anonymous communication technologies that we think are most relevant from a practical and scientific point of view.

DC Network

David Chaum’s DC (Dining Cryptographers) network protocol [28] is an anonymous communication protocol which, even though it cannot be easily used in practice, is still very interesting from a scientific perspective. It provides unconditional sender anonymity, recipient anonymity, and unobservability, even if we assume a global adversary who can observe all communication in the network. Hence, it can guarantee the strongest anonymity properties of all known anonymous communication protocols. DC nets are based on binary superposed sending. Before any message can be sent, each participant in the network (user station) has to exchange via a secure channel a random bit stream with at least one other user station. These random bit streams serve as secret keys and are at least as long as the messages to be sent. For each single sending step (round), every user station adds modulo 2 (superposes) all the key bits it shares and its message bit, if there is one. Stations that do not wish to transmit messages send zeros by outputting the sums of their key bits (without any inversions). The sums are sent over the net and added up modulo 2. The result, which is broadcast to all user stations, is the sum of all sent message bits, because every key bit was added twice (see Figure 43.1). If exactly one participant transmits a message, the message is successfully broadcast as the result of the global sum to each participant. Collisions are easily detected (as the message sent by a user and the one that is broadcast back to him will be different) and have to be resolved, for example, by retransmitting the message after a random number of rounds.

FIGURE 43.1 A DC network with three users. Sender a sends a message, but from observing the communication in the network alone, the adversary cannot tell whether it was sent by a, b, or c (and not even if a meaningful message was sent at all). (Values shown in the figure: sender a: message 01001000, key with b 10101010, key with c 11100011, transmission 00000001; sender b: message 00000000, key with a 10101010, key with c 00111010, transmission 10010000; sender c: message 00000000, key with a 11100011, key with b 00111010, transmission 11011001; broadcast sum 01001000.)
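The superposed-sending round can be reproduced in a few lines. The following Python sketch (illustrative, not from the chapter; it reuses the bit strings of Figure 43.1) XORs each station’s message with its shared keys and shows that the broadcast sum reveals the message but not its sender:

```python
def xor_bits(*words: str) -> str:
    """Bitwise modulo-2 sum of equally long bit strings."""
    return "".join(str(sum(int(w[i]) for w in words) % 2) for i in range(len(words[0])))

# Pairwise shared one-time keys (exchanged over secure channels beforehand).
key_ab, key_ac, key_bc = "10101010", "11100011", "00111010"

# Only station a sends a meaningful message in this round.
msg_a, msg_b, msg_c = "01001000", "00000000", "00000000"

# Each station broadcasts its message XORed with all keys it shares.
out_a = xor_bits(msg_a, key_ab, key_ac)   # 00000001
out_b = xor_bits(msg_b, key_ab, key_bc)   # 10010000
out_c = xor_bits(msg_c, key_ac, key_bc)   # 11011001

# Every key is added twice, so the global sum equals the sent message.
assert xor_bits(out_a, out_b, out_c) == msg_a
print(xor_bits(out_a, out_b, out_c))      # 01001000
```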

In theory, superposed sending provides, in the information-theoretic sense, perfect (unconditional) sender anonymity and unobservability, as the fact that someone is sending a meaningful message (and not only zeros) is hidden by a one-time pad encryption. From a metrics perspective, all senders in the DC network form the anonymity set. No user is more likely to be the sender of the message than any other user for an outside adversary. Perfect recipient anonymity can be achieved by reliable broadcasting. However, the DC network and the one-time pad share the same practical shortcomings, which have prevented them from being broadly used: the security of DC networks depends on the perfect randomness of keys and the secure distribution of keys. Moreover, each key is to be used only once, and the keys need to be perfectly unavailable to the adversary (the adversary may not get hold of the keys before or after the message has been sent).

Mix Nets

Mix nets [12] are more practical than DC networks, but do not provide security against adversaries with unlimited resources. Nevertheless, most anonymity networks (Mixmaster, Tor, Onion Routing, and AN.ON) build on the mix net concept. A mix is a relay or proxy server that performs four steps to hide the relation between incoming and outgoing messages (see Figure 43.2):

1. Duplicates (replayed messages) are discarded. Without this functionality, an adversary could launch a replay attack by sending two identical messages to be forwarded as two identical output messages by the mix. The adversary could thus link these messages and therefore “bridge over” the mix.
2. All messages are (randomly) delayed (by temporarily storing them in a buffer). Without this functionality (if messages were immediately forwarded), the anonymity set for one message would be reduced to one sender (no anonymity at all).
3. All messages are recoded. This is usually done by cryptography. Without this functionality, the adversary could link the input with the output messages by comparing the contents of the messages.
4. The sending sequence of delayed messages is determined independently of the receiving sequence. Without the delay and reordering of messages, an adversary could link the input to the output messages by a time correlation attack (he could be sure that the first message in is the first message out).
5. Mixes can be used to achieve unlinkability of sender and recipient, sender anonymity, as well as recipient anonymity. For achieving the latter two properties, different recoding functions are used. For providing sender anonymity, asymmetric cryptography is used. The mix user (or more precisely his machine) encrypts the message m with the public key e_R of the recipient and achieves enc_eR(m). He then encrypts enc_eR(m) together with the address of the recipient and a nonce with the public key e_1 of the mix and sends the resulting message enc_e1(r_1, A_R, enc_eR(m)) to the mix. Adding the nonce is necessary to achieve nondeterministic encryption, which prevents an adversary from monitoring the output message enc_eR(m) and address A_R of the recipient, and then simply encrypting both values with the public key of the mix and comparing the result with the messages sent to the mix. Moreover, it prevents the mix from discarding one of two messages when identical contents are intended to be sent. The mix decrypts the message with its private key, discards the nonce, and sends enc_eR(m) to the address A_R of the recipient.
6. Using a single mix can only provide anonymity if it is fully trustworthy and cannot be compromised. For improving security, several mixes can be used in a chain or a “cascade.” Let us assume that the sender (or more precisely his machine) chooses a chain of n mixes with addresses A_i and public keys e_i, i = 1. . .n. The sender will first add layers of encryption using the public keys of the mixes in the path in reverse order. Each layer includes the message to be forwarded by the mix, the address to which the message should be sent (next mix in the chain or the final recipient), plus a nonce to be discarded. The resulting message enc_e1(r_1, A_2, enc_e2(. . . enc_en(r_n, A_R, enc_eR(m)). . .)) is sent to the first mix (with the address A_1). Each mix on the path decrypts the message with its private key and thereby gets a nonce, which is discarded, as well as an encrypted message and the address to which it sends this message. The last mix in the path finally sends enc_eR(m) to the recipient (a code sketch of this layered construction follows below). Unlinkability of sender and recipient can in principle also be provided in the presence of an adversary who monitors all communication lines, as long as the crypto operations cannot be broken and one mix in the path is trustworthy, that is, one mix that is not controlled by the adversary.

FIGURE 43.2 Processing steps within a mix: (1) filter duplicates, (2) delay messages, (3) recode messages, (4) reorder messages.
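The nested construction in item 6 can be illustrated with a toy implementation. The Python sketch below (illustrative only; it replaces real asymmetric encryption with a trivial, insecure stand-in so that the layering logic stays visible) builds the onion of encryption layers in reverse path order and lets each mix peel off one layer:

```python
import json
import secrets

# Insecure stand-in for public-key encryption, used only to show the structure.
def enc(key: str, payload: dict) -> str:
    return f"enc[{key}]" + json.dumps(payload)

def dec(key: str, ciphertext: str) -> dict:
    prefix = f"enc[{key}]"
    assert ciphertext.startswith(prefix), "wrong key"
    return json.loads(ciphertext[len(prefix):])

def build_onion(message: str, recipient: str, mixes: list[str]) -> str:
    """Wrap the message in one layer per mix, innermost layer first."""
    onion = enc(f"e_{recipient}", {"m": message})
    next_hop = recipient
    for mix in reversed(mixes):
        onion = enc(f"e_{mix}", {"nonce": secrets.token_hex(4),
                                 "next": next_hop, "payload": onion})
        next_hop = mix
    return onion

def mix_process(mix: str, onion: str) -> tuple[str, str]:
    """A mix removes its layer, discards the nonce, and forwards the payload."""
    layer = dec(f"e_{mix}", onion)
    return layer["next"], layer["payload"]

if __name__ == "__main__":
    onion = build_onion("hello", recipient="R", mixes=["mix1", "mix2"])
    hop, payload = mix_process("mix1", onion)      # forwarded to mix2
    hop, payload = mix_process("mix2", payload)    # forwarded to R
    print(hop, dec("e_R", payload)["m"])           # R hello
```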

Figure 43.3 illustrates how sender anonymity can be achieved with a path consisting of two mixes.

FIGURE 43.3 Sender anonymity with two mixes: the sender S sends enc_e1(r_1, A_2, enc_e2(r_2, R, enc_eR(m))) to mix A_1, which forwards enc_e2(r_2, R, enc_eR(m)) to mix A_2, which finally delivers enc_eR(m) to the recipient R.

For achieving recipient anonymity, symmetric cryptography is used as a recoding function.5 The recipient first chooses a sequence of n mixes with addresses A_i and public keys e_i, i = 1. . .n, a sequence of symmetric encryption keys k_j, j = 0. . .n, and a label L_R, and creates an anonymous return address R_S = (k_0, A_1, R_1), which contains a symmetric key k_0, the address of the first mix A_1 in the chain, and another anonymous return address R_1, which is calculated according to the following scheme: R_j = enc_ej(k_j, A_j+1, R_j+1) for j = 1. . .n, where A_n+1 is the address of the recipient and R_n+1 = L_R.

5. In the following, we will use capital letters and curly brackets for the symmetric encryption function ENC.

The recipient makes the anonymous return address available to the sender, who uses key k_0 to encrypt his message m. The encrypted message ENC_k0{m} is then sent along with the return address R_1 to the first mix A_1. The first mix decrypts R_1 = enc_e1(k_1, A_2, R_2), encrypts ENC_k0{m} with the symmetric key k_1, and sends the encrypted message ENC_k1{ENC_k0{m}} along with R_2 to the next mix with the address A_2. This is repeated for all mixes along the path. The last mix in the path finally forwards ENC_kn{. . .ENC_k1{ENC_k0{m}}. . .} and L_R to the recipient. The label L_R of the return address indicates to the recipient which sequence of symmetric keys he has to use to decrypt the message.

The two schemes above provide either sender or recipient anonymity. By combining both schemes, it is possible for two communication partners to communicate anonymously in both directions. Suppose that Alice anonymously sends a message including an anonymous return address to a discussion forum. Bob can then anonymously send his reply via a self-chosen sequence of mixes to the first mix used in the anonymous return address. If Bob’s message also contains an anonymous return address, Alice can reply as well in the same manner. Thus, Alice and Bob can communicate without knowing each other’s identities.

The mix net protocol was invented by Chaum in 1981 for high-latency communication, such as email communication. In the mid-1990s, when interactive Internet services became broadly used, low-latency anonymous communication protocols were developed. The most broadly used low-latency protocols, AN.ON and Onion Routing/Tor, which are based on the mix net concept, will be briefly presented in the next sections.

AN.ON

AN.ON [29] is an anonymity service that has been developed and operated since the late 1990s at the Technical University of Dresden. Because it aims at providing a network of mixes for low-latency traffic routing, symmetric cryptography is replacing asymmetric cryptography where possible (asymmetric cryptography is only used to exchange symmetric session keys between mixes and users). Moreover, low latency requires that message delays are reduced, and in fact, AN.ON mixes implement practically no message delay. The downside of reducing delays is that the size of the message buffer in the mixes, and thus the anonymity set, decrease. In order to increase the size of anonymity sets, AN.ON provides standard routes through the mix network, the so-called mix cascades. A mix cascade typically contains a sequence of two or three mixes, and every message sent to the cascade runs through the mixes in the same order as any other message sent to the same cascade. Predefined and stable mix cascades have a number of advantages over dynamic routing:

1. Mixes can be audited and certified with regard to their performance, their geographical position, the legislation in which they operate, and the operator (the company or the governmental institution operating the mix infrastructure).
2. Cascades can be designed to cross different nations, different legislations, and different operators in order to enjoy the protection of the most liberal regulation; they can also be designed to provide a certain performance.
3. Security measures focus on a small number of mixes while the costs can be distributed to a large number of users.

Disadvantages of implementing mix cascades include:

1. Each mix is a possible bottleneck and thus needs to provide a stable and high-bandwidth installation, as one mix going offline stops all cascades in which it was involved.

2. Setting up and operating mixes is expensive due to the considerable organizational overhead for establishing mix cascades and due to the high performance requirements.

Onion Routing/Tor

Onion Routing [30] is a low-latency, mix-based routing protocol developed in the 1990s at the Naval Research Laboratory. It provides anonymous socket connections by means of proxy servers. Onion Routing uses the mix net concept of layers of public key encryption (the so-called onion) to build up an anonymous bidirectional virtual circuit between communication partners. The initiator’s proxy (for the service being requested) constructs a “forward onion,” which encapsulates a series of routing nodes (“mixes”) forming a path to the responder, and sends it with a create command to the first node. Each layer of the onion is encrypted with the public key of each node on the path and contains symmetric crypto function/key pairs as a payload. After sending the onion, the anonymous path is established and the initiator’s proxy sends data through this anonymous connection. The symmetric function/key pairs are applied by each node on the path to crypt data that will be sent along the virtual circuit. All information (onions, data, and network control) is sent through the Onion Routing network in uniform-sized cells. All cells arriving at an onion router within a fixed time interval are mixed together to reduce correlation by network insiders. Reply onions, which correspond to untraceable return addresses, allow a responder to send back a reply anonymously after its original circuit is broken.

Since individual routing nodes in each circuit only know the identities of adjacent nodes, and since the nodes further encrypt multiplexed virtual circuits, traffic analysis is made difficult. However, if the first node behind the initiator’s proxy and the last node of the circuit cooperate, they will be able to determine the source and recipient of communication through the number of cells sent over this circuit or through the duration for which the virtual circuit was used.

Tor [31], the second generation of Onion Routing, has added several improvements. In particular, it provides forward secrecy: once the session keys are deleted, they cannot be obtained any longer, even if all communication has been wiretapped and the long-term secret keys of the onion routers (“mixes”) become compromised. Therefore, instead of using hybrid encryption for distributing symmetric session keys, the Diffie-Hellman key negotiation protocol is used, which provides forward secrecy. The key negotiation and a simple communication via two onion routers are outlined in Figure 43.4. The symmetric session key shared by the user with the first onion router OR_1 is negotiated by means of a Diffie-Hellman handshake. The first half of the handshake, g^x1, is encrypted with the public key of OR_1 (to prevent man-in-the-middle attacks), and enc_OR1(g^x1) is then sent to OR_1. The onion router replies with the second half of the handshake, that is, g^y1 and a hash over the negotiated session key k_1 = g^(x1·y1). Any further communication between the sender and OR_1 will be crypted with this negotiated symmetric session key. In order to extend the path from the sender over OR_1 to a second onion router OR_2, the sender transmits ENC_k1{OR_2, enc_OR2(g^x2)} to OR_1. The first onion router decrypts the symmetric encryption and forwards the encrypted first half of a new Diffie-Hellman handshake, enc_OR2(g^x2), to OR_2. The second onion router replies with the second half of the handshake, g^y2, and the hash over the session key k_2 = g^(x2·y2). The reply ENC_k1{g^y2, hash(k_2)} is forwarded by OR_1 back to the sender. Only the sender and OR_1 are now in possession of k_1, and only the sender and OR_2 are now in possession of k_2. The communication between the sender and OR_2 can now be crypted with k_2. Once a circuit has been established, the symmetric encryption with the negotiated session keys is applied by each node on the path to crypt data that will be sent along the circuit.

FIGURE 43.4 The Tor key negotiation and a simple Web site request [31]. (Legend from the figure: enc(·) – public-key encryption; ENC{·} – symmetric encryption; H(·) – one-way hash; c – channel (“circuit”) id; g, x, y – Diffie-Hellman generator and exponents.)
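The circuit-extension handshake described above can be mimicked in a few lines. The Python sketch below (illustrative; it uses a small, insecure Diffie-Hellman group and omits the public-key wrapping and TLS links of the real protocol) shows how the client ends up sharing one session key with each onion router:

```python
import hashlib
import secrets

# Toy Diffie-Hellman parameters: prime modulus p and generator g (insecurely small).
P, G = 2039, 7

def dh_keypair():
    x = secrets.randbelow(P - 2) + 1        # private exponent
    return x, pow(G, x, P)                  # (x, g^x mod p)

def dh_shared(own_private: int, other_public: int) -> bytes:
    k = pow(other_public, own_private, P)   # g^(x*y) mod p
    return hashlib.sha256(str(k).encode()).digest()

# CREATE: client negotiates k1 with the first onion router OR1.
x1, gx1 = dh_keypair()                      # gx1 would be sent encrypted under OR1's public key
y1, gy1 = dh_keypair()                      # OR1's half, returned together with H(k1)
k1_client, k1_or1 = dh_shared(x1, gy1), dh_shared(y1, gx1)
assert k1_client == k1_or1                  # only the client and OR1 know k1

# EXTEND: client sends a fresh half (relayed via OR1 under k1) to negotiate k2 with OR2.
x2, gx2 = dh_keypair()
y2, gy2 = dh_keypair()
k2_client, k2_or2 = dh_shared(x2, gy2), dh_shared(y2, gx2)
assert k2_client == k2_or2                  # OR1 relays but never learns k2
print("circuit keys established")
```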

Advantages of Tor over AN.ON are as follows:

1. Tor provides forward secrecy.
2. It is easy to set up new onion routers (“mixes”), which are run by many volunteers all over the world.
3. There are lower performance requirements for each “mix.”
4. Each mix is a possible bottleneck; however, in Tor, “mixes” that do not perform can be excluded from the dynamic routing.

Disadvantages include:

1. Anyone can set up “mixes” independent of their performance, that is, bandwidth, latency, security.
2. There is no audit or certification, thus a lack of reliable data about legislation and operator.
3. Bridging the “mixes” and thus breaching anonymity by controlling the entry node and the exit node is easier for adversaries, since they can easily set up their own new nodes. In particular, an adversary can easily try to attract user traffic by establishing a few well-performing exit nodes and a lot of stable intermediate mix nodes that eventually become entry nodes [32].

Data Minimization at Application Level

Even if the communication channel is anonymized, users can still reveal personal and identifying data on the application level; often users have to reveal more personal data than needed. Hence, data minimization techniques are also needed on the application level. Many such data minimization techniques are based on cryptographic protocols (see also [33] for an overview). The classical types of privacy-protecting cryptography that have already been applied for decades are of course encryption schemes themselves. However, there are a number of more recent crypto schemes for protecting data and authenticating information, which are variations or extensions of basic crypto schemes; they have “surprising properties” that in many cases can offer better data minimization properties [33]. In this chapter, some of the most relevant examples of mainly cryptographic mechanisms for protecting privacy at the application level will be given.

Blind Signatures and Anonymous eCash

Blind signatures are an extension of digital signatures and provide privacy by allowing someone to obtain a signature from a signer on a document without the signer seeing the actual content of the “blinded” document he is signing. Hence, if the signer is later presented with the signed “unblinded” document, he cannot relate it with the signing session and with the person on behalf of whom he has signed the document. Blind signatures were invented by David Chaum as a basic building block for anonymous eCash (see below). They can also be used to achieve anonymity in other applications, such as eVoting, and also serve as a basic building block for other privacy crypto protocols, such as anonymous credentials (see below). David Chaum et al. have invented protocols based on blind signatures [10,34,35], which allow electronic money to flow perfectly tracelessly from the bank through consumer and merchant before returning to the bank. Chaum’s cryptographic “online” payment protocol based on blind signatures can be summarized as follows (see also Figure 43.5).

Let (d, n) be the bank’s private key indicating a certain value of a signature under this key (in this example: one dollar) and (e, n) the bank’s public key.6 f is a suitable one-way function. Electronic money has the form (x, f(x)^d (mod n)), where the one-way function is needed to prevent forgery of electronic money (see also [36] for more explanations).

6. Using the RSA encryption scheme.

1. The customer Alice (her computer) first generates a bank note number x (of at least 100 digits) at random and (in essence) multiplies it with a blinding factor r, which she has also chosen at random: B = r^e · f(x) (mod n). She then signs the blinded bank note number with her private key and sends it to the bank.
2. The bank verifies and removes Alice’s signature. Then, it signs the blinded note (and thereby creates the blinded signature) with its “worth one dollar” signature: B^d (mod n) = (r^e · f(x))^d (mod n) = r · f(x)^d (mod n). The bank then withdraws one dollar from her account and returns the note with the blind signature.
3. Alice divides out the blinding factor and thereby extracts C = B^d/r (mod n) = f(x)^d (mod n) from the blindly signed note. For paying the online merchant Bob one dollar, Alice sends him the pair (x, f(x)^d (mod n)).
4. Bob verifies the bank’s signature and immediately contacts the bank to verify that the note has not already been spent.
5. The bank verifies its signature, checks the note against a list of those notes already spent, and credits Bob’s account by one dollar.

FIGURE 43.5 The flow of eCash’s untraceable electronic money [35].

The blind signature scheme provides (unconditional) anonymity of the electronic money: Even if the bank and the merchant cooperate, they cannot determine who spent the notes. Since the bank does not know the blinding factors, it cannot correlate the note it was signing blindly for Alice with the note that was spent. (However, Alice’s identity is only protected if she also uses an anonymized communication channel and if she does not personally reveal identifying information, such as a personal delivery address.)

In addition to the online eCash protocol version, where the bank needs to be constantly online for checking whether notes have already been spent, a protocol for offline electronic money is presented by Chaum et al. in [28]. With the offline protocol, a user remains unconditionally anonymous as long as he spends each bank note only once. If, however, a note is spent twice, the bank will get enough information to identify the spender’s account. Disappointingly, there has been a lack of adoption of anonymous eCash (attempts at commercial deployment of Chaum’s schemes failed in the late 1990s), and today there are still no widely deployed anonymous electronic payment services.

Zero-Knowledge Proofs

A zero-knowledge proof is defined as an interactive proof in which a prover can prove to a verifier that a statement is true without revealing anything else than the veracity of the statement. Zero-knowledge proofs were first presented in 1985 by Goldwasser et al. [37]. A zero-knowledge proof must fulfill the following three properties:

● Completeness: If the statement is true, the honest verifier will be convinced of this fact by an honest prover.
● Soundness: If the statement is false, no cheating prover can convince the honest verifier that it is true, except with some very small probability.
● Zero-knowledge: If the statement is true, no cheating verifier learns anything other than this fact.

Zero-knowledge proofs are building blocks for data-minimizing technologies, such as anonymous credential systems. The anonymous credential protocol IdeMix is, for example, based on proofs of knowledge, in which a prover proves that he knows a secret value or that he is able to solve some number theoretic problem, which would contradict the assumption that the problem cannot be solved by a polynomially bounded Turing machine.
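As a concrete illustration of such a proof of knowledge, the Python sketch below (not from the chapter) implements one round of a Schnorr-style interactive proof: the prover convinces the verifier that she knows the discrete logarithm x of y = g^x mod p without revealing x. The parameters are toy-sized, and this basic protocol is only honest-verifier zero-knowledge:

```python
import secrets

# Toy group parameters: p = 2q + 1 with q prime; g generates the subgroup of order q.
p, q, g = 2039, 1019, 4

# Prover's secret x and public value y = g^x mod p.
x = secrets.randbelow(q)
y = pow(g, x, p)

# Commitment: the prover picks a random r and sends t = g^r mod p.
r = secrets.randbelow(q)
t = pow(g, r, p)

# Challenge: the verifier sends a random c.
c = secrets.randbelow(q)

# Response: the prover sends s = r + c*x mod q.
s = (r + c * x) % q

# Verification: g^s must equal t * y^c mod p.
assert pow(g, s, p) == (t * pow(y, c, p)) % p
print("proof accepted; the verifier learned nothing about x beyond y = g^x")
```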

Anonymous Credentials

A traditional credential (often also called a certificate or attribute certificate) is a set of personal attributes, such as birth date, name, or personal number, signed (and thereby certified) by the certifying party (the so-called issuer) and bound to its owner by cryptographic means (by requiring the user’s secret key to use the credential). In terms of privacy, the use of (traditional or anonymous) credentials is better than the direct request to the certifying party, as this prevents the certifying party from profiling the user. Traditional credentials require, however, that all attributes are disclosed together if the user wants to prove certain properties, so that the verifier can check the issuer’s signature. This makes different uses of the same credential linkable to each other. Moreover, the verifier and issuer can link the different uses of the user’s credential to the issuing of the credential.

Anonymous credentials (also called private certificates) were first introduced by Chaum [10] and later enhanced by Brands [38] and by Camenisch and Lysyanskaya [39], and have stronger privacy properties than traditional credentials. Microsoft’s U-Prove technology based on Brands’s protocols and IBM’s IdeMix technology based on the credential protocols by Camenisch et al. are currently the practically most relevant anonymous credential technologies.

Anonymous credentials allow the user to essentially “transform” the certificate into a new one that contains only a subset of attributes of the original certificate (it allows proving only a subset of its attributes to a verifier (selective disclosure property)). Instead of revealing the exact value of an attribute, anonymous credential systems also enable the user in the transformation to apply any mathematical function to the (original) attribute value, allowing him to prove only attribute properties without revealing the attribute itself. Besides, with the IdeMix protocol by Camenisch et al., the issuer’s signature is also transformed in such a way that the signature in the new certificate cannot be linked to the original signature of the issuer [33]. Hence, different credential uses cannot be linked by the verifier and/or issuer (unlinkability property). Cryptographically speaking, with IdeMix, the user is basically using a zero-knowledge proof to convince the verifier of possessing a signature generated by the issuer on a statement containing the subset of attributes.

Figure 43.6 provides a scenario illustrating how data minimization can be achieved in an identity management online transaction: First, user Alice obtains an anonymous driving license credential issued by the Swedish Road authority (the so-called identity provider), with personal attributes typically stored in the license, including her birth date. Later, she would like to purchase a video from an online shop (the so-called relying party, which is also the verifier in this scenario), which is only permitted for adults. After sending a service request, the online shop will answer her with a data request for a proof that she is older than 18. Alice can now take advantage of the selective disclosure feature of the anonymous credential protocol to prove with her credential just the fact that she is older than 18, without revealing her birth date or any other attributes of her credential. If Alice later wants to purchase another video that is only permitted for adults at the same video online shop, she can use the same anonymous credential for a proof that she is over 18. If the IdeMix protocol is used, the video shop is unable to recognize that the two proofs are based on the same credential. Hence, the two rental transactions cannot be linked to the same person.

FIGURE 43.6 Example for achieving data minimization for an online transaction by the use of anonymous credentials: (1) the issuer (identity provider, IdP) issues credentials to user Alice; (2) Alice sends a service request to the verifier (relying party); (3) the relying party sends a data request; (4) Alice answers with an (unlinkable) selective disclosure, for example of the property “age > 18.” Advantages: selective disclosure, unlinkability of transactions (IdeMix), no profiling by IdPs or relying parties.

Private Information Retrieval

Private Information Retrieval (PIR) allows a user to retrieve an item (record) from a database server without revealing which item he is interested in; that is, the privacy of the item of interest is provided. A typical application example is a patent database from which inventors would like to retrieve information without revealing their interests. A trivial approach would be to simply download the entire database and to make a local selection, which would, however, be too costly bandwidth-wise. In [40], one of the first PIR solutions was introduced, which requires the existence of t + 1 noncooperating identical database servers (with n records each). To each server, one n-bit query vector (out of a set of t + 1 query vectors) is sent via an encrypted channel, where each bit represents one record of the queried database: If the bit is one, then the record is selected; otherwise not. The user creates t of the query vectors randomly. The remaining (t + 1)st vector is calculated by XOR-ing (superposing) all random vectors and flipping the bit representing the record of interest. One of the query vectors is sent to each database. Each server superposes (XORs) the records selected by its query vector and returns the result; superposing all returned results yields exactly the requested database item. If each of the bits in the t random query vectors is set to 1 with a probability of 0.5, then an adversary who has access to at most t of the requests or responses associated with the query vectors will gain no information about the database item that the user is retrieving.
A disadvantage of a multi-database solution is, how-
ever, that several identical databases must be obtained. Ex-ante Transparency-Enhancing Tools
Especially updates are complex, as changes should
Examples of ex-ante TETs are privacy policy languages
take place simultaneously. Besides information-theoretic
tools and Human Computer Interaction (HCI) components
PIR schemes with database replication, several single-
that make privacy policies of services sides more trans-
database (computational) PIR schemes have been
parent, such as the Platform for Privacy Preferences
subsequently developed and advanced (see, for instance,
(P3P) language [45] and P3P user agents, such as the pri-
[41,42] for surveys and references). Oblivious transfer
vacy bird.7 P3P enables Web sites to express their privacy
[42] is private information retrieval, where additionally
policy (basically stating what data are requested by
the user may not learn any item other than the one that he
whom, for what purposes, and how long the data will be
requested.
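The replicated-server XOR scheme described above can be illustrated with a short, self-contained sketch. This is a toy illustration only (record contents, parameter values, and function names are made up), not an implementation of the scheme in [40]:

```python
import secrets

def make_queries(n, index, t):
    """Build t random n-bit query vectors plus one derived vector.

    XOR-ing all t + 1 vectors yields a vector that is 1 only at `index`,
    so each individual vector (and any set of t of them) looks uniformly random.
    """
    vectors = [[secrets.randbelow(2) for _ in range(n)] for _ in range(t)]
    last = [0] * n
    for v in vectors:
        last = [a ^ b for a, b in zip(last, v)]
    last[index] ^= 1          # flip the bit of the record of interest
    return vectors + [last]

def server_answer(database, query):
    """Each server XORs (superposes) the records selected by its query vector."""
    answer = bytes(len(database[0]))
    for record, bit in zip(database, query):
        if bit:
            answer = bytes(a ^ b for a, b in zip(answer, record))
    return answer

# Toy database of n equally long records, replicated on t + 1 servers.
database = [b"patent-A", b"patent-B", b"patent-C", b"patent-D"]
t = 2                                    # adversary may observe at most t queries
queries = make_queries(len(database), index=2, t=t)

# The user XORs all t + 1 answers to recover exactly the record of interest.
result = bytes(len(database[0]))
for q in queries:
    result = bytes(a ^ b for a, b in zip(result, server_answer(database, q)))

print(result)                            # b'patent-C'
```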
7. TRANSPARENCY-ENHANCING TOOLS

As we elaborated above, transparency is a basic legal principle. It is also an important social trust factor, as trust in an application can be enhanced if procedures are clear, transparent, and reversible, so that users feel they are in control [43]. Transparency-enhancing technologies provide tools to end users, or to proxies acting on behalf of the users' interests (such as data protection commissioners), for making personal data processing more transparent to them.

Classification

Transparency-enhancing tools (TETs) for privacy purposes can be classified into the following categories (see also [44]):

1. TETs that provide information about the intended data collection and processing to the data subject, in order to enhance the data subject's privacy;
2. TETs that provide the data subject with an overview of what personal data have been disclosed to which data controller under which policies;
3. TETs that provide the data subject, or his proxy, online access to his personal data, to information on how his data have been processed and whether this was in line with privacy laws and/or negotiated policies, and/or to the logic of data processing, in order to enhance the data subject's privacy;
4. TETs that provide "counter-profiling" capabilities to the data subject, helping him to "guess" how his data match relevant group profiles, which may affect his future opportunities or risks.

TETs of the first and last categories are also called ex-ante TETs, as they provide transparency of any intended data processing to the data subjects before they release any personal data. TETs of the second and third categories provide transparency in regard to the processing of personal data that the user has already disclosed and are therefore called ex-post TETs.

Ex-ante Transparency-Enhancing Tools

Examples of ex-ante TETs are privacy policy language tools and Human Computer Interaction (HCI) components that make the privacy policies of services sides more transparent, such as the Platform for Privacy Preferences (P3P) language [45] and P3P user agents, such as the Privacy Bird.7 P3P enables Web sites to express their privacy policy (basically stating what data are requested by whom, for what purposes, and how long the data will be retained) in a standard (machine-readable XML) format that can be retrieved automatically, interpreted easily by user agents, and matched with the user's privacy preferences. Thus, P3P user agents enable users to be better informed of a Web site's data-handling practices (in both machine- and human-readable formats).

Recently developed visualization techniques for displaying P3P-based privacy policies are based on the metaphor of a "Nutrition Label": since people already understand nutrition, warning, and energy labels, a similarly designed privacy label can allow users to find and digest policy-related information more accurately and quickly (see [46]).

7. http://www.privacybird.org.

FIGURE 43.7 “Send Data?” PPL user interface [48].

A more advanced policy language, the PrimeLife Policy Language (PPL), was developed in the EU FP7 project PrimeLife [47]. PPL is a language for specifying not only the privacy policies of data controllers but also those of third parties (so-called downstream controllers) to whom data are further forwarded, as well as the privacy preferences of users. It is based on two widespread industry standards, XACML (eXtensible Access Control Markup Language) and SAML (Security Assertion Markup Language). The data controller and downstream data controllers have policies basically specifying which data are requested from the user, for which purposes, and under which obligations (e.g., under the obligation that the data will be deleted after a certain time period). PPL allows specifying both uncertified data requests and certified data requests based on proofs of the possession of (anonymous IdeMix or traditional X.509) credentials that fulfill certain properties. The user's preferences allow expressing, for each data item, to which data controllers and downstream data controllers the data can be released and how the user expects his data to be treated. The PPL engine conducts an automated matching of the data controller's policy and the user's preferences. The result can be a mutual agreement concerning the usage of data in the form of a so-called sticky policy, which should be enforced by the access control systems at the backend sides and which will "travel" with the data transferred to downstream controllers.

As PPL has many features that P3P does not provide (downstream data controllers, credential selection for certified data, and obligations), the design of usable PPL user interfaces poses many challenges. "Send Data?" user interfaces, which let the PPL engine interact with the user to display the result of policy matching, to perform identity/credential selection, and to obtain informed consent for disclosing selected certified and uncertified data items, were developed and presented in [48]. Figure 43.7 depicts an example PPL "Send Data?" dialogue.

The PPL user interfaces follow the Art. 29 Data Protection Working Party recommendation of providing policy information in a multilayered format [49] to make policies more understandable and usable. According to this recommendation, a short privacy notice on the top layer offers individuals the core information required under Art. 10 EU Directive 95/46/EC, including at least the identity of the service provider and the purpose of data processing. In addition, a clear indication (in the form of URLs—in our example the "privacy policy" URLs of the two data controllers) must be given as to how individuals can access the other layers presenting the additional information required by Art. 10 and national privacy laws.
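As a rough illustration of the kind of matching a policy engine such as PPL's performs, the following sketch compares a much simplified, hypothetical controller policy with user preferences and, on success, produces the agreed terms as a "sticky policy" to be attached to the disclosed data. The data structures and field names are invented for illustration and do not reflect the actual XACML-based PPL syntax.

```python
from dataclasses import dataclass

@dataclass
class Policy:
    """Simplified stand-in for a data-handling policy (not real PPL/XACML)."""
    purposes: set          # purposes the data may be used for
    retention_days: int    # how long the data may be kept
    downstream_allowed: bool

def match(controller_policy: Policy, user_preference: Policy):
    """Return a sticky policy if the controller's policy satisfies the user's
    preferences, otherwise None (no agreement, nothing is disclosed)."""
    if not controller_policy.purposes <= user_preference.purposes:
        return None
    if controller_policy.retention_days > user_preference.retention_days:
        return None
    if controller_policy.downstream_allowed and not user_preference.downstream_allowed:
        return None
    # The agreement travels with the data to the backend and to any downstream
    # controllers, which are expected to enforce it.
    return {
        "purposes": sorted(controller_policy.purposes),
        "retention_days": controller_policy.retention_days,
        "downstream_allowed": controller_policy.downstream_allowed,
    }

site = Policy(purposes={"order-processing"}, retention_days=30, downstream_allowed=False)
user = Policy(purposes={"order-processing", "delivery"}, retention_days=90, downstream_allowed=False)

sticky = match(site, user)
print(sticky)   # agreed terms to be attached to the disclosed data item
```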

Ex-Post Transparency-Enhancing Tools8

An important transparency-enhancing tool that falls into categories 2 and 3 of our classification is the data track that has been developed in the PRIME9 and PrimeLife10 projects [51]. The data track is a user-side transparency tool, which includes both a history function and online access functions. The history function keeps, for each transaction in which a user discloses personal data to a communication partner, a record for the user of which personal data are disclosed to whom (the identity of the controller), for which purposes, which credentials and/or pseudonyms have been used in this context, as well as the details of the negotiated or given privacy policy. These transaction records are stored either at the user side or centrally (in the Cloud—see [52]) in a secure and privacy-friendly manner. User-friendly search functionalities, which allow the user to easily get an overview of who has received what data about him or her, are also included. The online access functions allow end users to exercise their rights to access their data at the remote services sides online. In this way, they can compare what data they have disclosed to a services side with what data are still stored by the services side. This allows them to check whether data have been changed, processed, or deleted (in accordance with the data retention periods of the negotiated or given privacy policy). Online access is granted to a user if he can provide a unique transaction ID (currently implemented as a 16-byte random number), which is shared between the user (stored in his data track) and the services side for each transaction of personal data disclosure. This in principle also allows anonymous or pseudonymous users to access their data.

While the data track is still in the research prototype stage, the Google Dashboard is already available in practice and grants its users access to a summary of the data stored with a Google account, including account data and the users' search query history, which are, however, only a part of the personal data that Google processes. It does not provide any insight into how these data provided by the users (the users' search queries) have subsequently been processed by Google. Besides, access is provided only to authenticated Google users.

Further examples of transparency tools that allow users to view and control how their personal data have been processed, and to check whether this is in compliance with a negotiated or given privacy policy, are based on secure logging systems that usually extend the Kelsey-Schneier log [53] and protect the integrity and the confidentiality of the log data. Such a secure logging system and an automated privacy audit facility are key components of the privacy evidence approach proposed by [54]. This privacy evidence system allows a user to inspect, with a special view tool, all log entries that record actions of that user, and permits sending the log view created by that tool to the automated privacy audit component, which compares the log view with the privacy policy in order to construct privacy evidence. This privacy evidence indicates to the user whether the privacy policy has been violated.

The unlinkability of log entries, which means that they should not be stored in the sequence of their creation, is needed to prevent an adversary from correlating the log with other information sources, such as other external logs, which could allow him to identify the data subjects to whom the entries refer (cf. [55]). Also, anonymous access is needed to prevent an adversary from observing who views which log entries and in this way concluding to whom the log entries refer. Wouters et al. [56] have presented such a secure and privacy-friendly logging system for eGovernment services. However, it addresses the unlinkability of logs between logging systems in eGovernment rather than the unlinkability of log entries within a log. Moreover, it does not address insider attacks, nor does it allow anonymous access to log entries.

Within the PrimeLife project, a secure logging system has been developed that addresses these aspects of unlinkability of log entries and anonymous user access. In particular, it fulfills the following requirements (see [55]):

● Only the data subject can decrypt log entries after they have been committed to the log.
● A user can check the integrity of his log entries. A service provider can check the integrity of the whole log file.
● It is not possible for an attacker to secretly modify log entries that have been committed to the log before the attacker took over the system (forward integrity).
● It is practically impossible to link log entries that refer to the same user.
● For efficiency reasons, it should be possible for a data subject to read his log entries without the need to download and/or fully traverse the whole log database.
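To make the forward-integrity and confidentiality requirements above concrete, the following is a minimal sketch in the spirit of the Kelsey-Schneier construction [53]: each entry is protected with keys derived from an evolving secret, and the secret is hashed forward and the old value deleted, so an attacker who compromises the current state cannot forge or quietly alter earlier entries. This is an illustration under simplifying assumptions only; it uses a toy hash-based keystream instead of a real authenticated cipher and omits the unlinkable storage and anonymous access features of the PrimeLife scheme.

```python
import hashlib, hmac, os

def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR with a SHA-256-based keystream (illustration only)."""
    out, counter = b"", 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, out))

class ForwardSecureLog:
    def __init__(self, initial_secret: bytes):
        self.secret = initial_secret      # evolving secret A_i
        self.entries = []                 # list of (ciphertext, mac) pairs

    def append(self, message: bytes):
        enc_key = hashlib.sha256(b"encrypt" + self.secret).digest()
        ciphertext = _keystream_xor(enc_key, message)
        mac = hmac.new(self.secret, ciphertext, hashlib.sha256).digest()
        self.entries.append((ciphertext, mac))
        # Evolve the secret and forget the old one: entries already committed
        # can no longer be forged or decrypted from the current state alone.
        self.secret = hashlib.sha256(b"evolve" + self.secret).digest()

def verify_and_decrypt(initial_secret: bytes, entries):
    """The data subject, holding the initial secret, replays the key evolution."""
    secret, messages = initial_secret, []
    for ciphertext, mac in entries:
        expected = hmac.new(secret, ciphertext, hashlib.sha256).digest()
        if not hmac.compare_digest(mac, expected):
            raise ValueError("log entry has been tampered with")
        enc_key = hashlib.sha256(b"encrypt" + secret).digest()
        messages.append(_keystream_xor(enc_key, ciphertext))
        secret = hashlib.sha256(b"evolve" + secret).digest()
    return messages

seed = os.urandom(32)
log = ForwardSecureLog(seed)
log.append(b"2012-08-29: profile data disclosed to shop X")
log.append(b"2012-08-30: data forwarded to downstream controller Y")
print(verify_and_decrypt(seed, log.entries))
```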
The need for, and requirements of, tools that can anticipate profiles (category 4 of our classification above) have been analyzed within studies of the FIDIS project (see, for instance, [57]). To the best of our knowledge, no practical transparency-enhancing tools fulfill these requirements yet. In academia, promising approaches [58,59] have been formulated and are the subject of ongoing research.

8. This section corresponds to most parts of Section 2.4.2 that the leading author has contributed to ENISA's study on "Privacy, Accountability and Trust—Challenges and Opportunities," published in 2011 [50].
9. EU FP6 project PRIME (Privacy and Identity Management for Europe), www.prime-project.eu.
10. EU FP7 project PrimeLife (Privacy and Identity Management for Life), www.primelife.eu.

8. SUMMARY

This chapter has provided an introduction to the area of privacy-enhancing technologies. We presented the legal foundation of PETs and provided a classification of PETs, as well as a selection of some of the most relevant PETs. Following the Privacy by Design paradigm for effectively protecting privacy, privacy protection should be incorporated into the overall system design (it should be embedded throughout the entire system life cycle).

While technical PET solutions to many practical privacy issues have existed for many years, these solutions have not been widely adopted by either industry or users. The main factors contributing to this problem are low demand by industry, which, rather, is under competitive business pressure to exploit and monetize user data, and low user awareness in combination with a lack of easily usable, efficient PET implementations (see also [60]). While this chapter is part of a technical computer security handbook and has therefore focused on the technical (and partly legal) aspects of PETs, clearly the economic, social, and usability aspects of PETs also need further attention in the future.

Finally, let's move on to the real interactive part of this chapter: review questions/exercises, hands-on projects, case projects, and optional team case project. The answers and/or solutions by chapter can be found in the Online Instructor's Solutions Manual.

CHAPTER REVIEW QUESTIONS/EXERCISES

True/False

1. True or False? Data protection concerns the protection of personal data in order to guarantee privacy and is only a part of the concept of privacy.
2. True or False? Personal data processing has to be legitimate, which according to Art. 7 EU Directive 95/46/EC is usually the case if the data subject11 has given his ambiguous (and informed) consent, if there is a legal obligation, or if there is contractual agreement (cf. the Collection Limitation Principle of the OECD Guidelines).
3. True or False? Personal data must be collected for specified, explicit, and legitimate purposes and may be further processed in a way incompatible with these purposes (Art. 6 I b EU Directive 95/46/EC).
4. True or False? The processing of personal data must not be limited to data that are adequate, relevant and not excessive (Art. 6 I (c) EU Directive 95/46/EC).
5. True or False? Transparency of data processing means informing a data subject about the purposes and circumstances of data processing, identifying who is requesting personal data, how the personal data flow, where and how long the data are stored, and what type of rights and controls the data subject has in regard to his personal data.

Multiple Choice

1. What needs to install appropriate technical and organizational security mechanisms to guarantee the confidentiality, integrity, and availability of personal data (Art. 17 EU Directive 95/46/EC)?
A. Privacy-enhancing technology
B. Location technology
C. Web-based
D. Technical improvement
E. Data controller
2. What can be defined as technologies that are enforcing legal privacy principles in order to protect and enhance the privacy of users of information technology (IT) and/or data subjects?
A. Information technology
B. Location technology
C. Web-based
D. Privacy-enhancing technologies
E. Web technology
3. What as an abstract strategy describes the avoidance of unnecessary or unwanted data disclosures?
A. Data minimization
B. XACML
C. XML-based language
D. Certification authority
E. Security
4. What aim quantifies the effectiveness of schemes or technologies with regard to the privacy goals defined in the previous section?
A. Privacy metrics
B. Languages
C. Privacy-Aware Access Control
D. Privacy preferences
E. Taps
5. What is a prerequisite for achieving anonymity, more generally, or data minimization, on an application level?
A. Release policies
B. Anonymous communication
C. Data-handling policies
D. Intellectual property
E. Social engineering

11. A data subject is a person about whom personal data is processed.

EXERCISE

Problem

Why now? Why is the National Strategy for Trusted IDs in Cyberspace needed?

Hands-On Projects

Project

Won't having a single password and credential be less secure and private than having many usernames and passwords?

Case Projects

Problem

Who will make sure that companies follow the rules?

Optional Team Case Project

Problem

Will new laws be needed to create the Identity Ecosystem?

REFERENCES

[1] OECD, Guidelines on the Protection of Privacy and Transborder Flows of Personal Data, September 1980.
[2] S.D. Warren, L.D. Brandeis, The right to privacy, Harv. Law Rev. 4 (5) (1890) 193–220.
[3] A. Westin, Privacy and Freedom, Atheneum, New York, 1967.
[4] G. Hogben, Annex A, in: PRIME project deliverable D14.0a, PRIME Framework V0, June 2004.
[5] Global Internet Liberty Campaign, Privacy and Human Rights—An International Survey of Privacy Laws and Practice. [Online]. Available: <http://gilc.org/privacy/survey>.
[6] R. Rosenberg, The Social Impact of Computers, Academic Press, 1992.
[7] European Union, Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data, Off. J. L 281 (1995).
[8] Proposal for a Regulation of the European Parliament and of the Council on the protection of individuals with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation), COM(2012) 11 final, January 25, 2012. [Online]. Available: <http://ec.europa.eu/justice/data-protection/document/review2012/com_2012_11_en.pdf>.
[9] S. Fischer-Hübner, Anonymity, in: Encyclopedia of Database Systems, Springer, Heidelberg, 2009, pp. 90–91.
[10] D. Chaum, Security without identification: card computers to make big brother obsolete, Informatik Spektrum 10 (1987) 262–277.
[11] D. Chaum, The dining cryptographers problem: unconditional sender and recipient untraceability, J. Cryptol. 1 (1) (January 1988) 65–75.
[12] D. Chaum, Untraceable electronic mail, return addresses, and digital pseudonyms, Commun. ACM 24 (2) (February 1981) 84–88.
[13] Registratiekamer & Information and Privacy Commissioner of Ontario, Privacy-Enhancing Technologies: The Path to Anonymity, Achtergrondstudies en Verkenningen 5B, vol. I & II, Rijswijk, August 1995.
[14] A. Pfitzmann, M. Hansen, Anonymity, Unlinkability, Undetectability, Unobservability, Pseudonymity, and Identity Management—A Consolidated Proposal for Terminology, 2010. [Online]. Available: <http://dud.inf.tu-dresden.de/Anon_Terminology.shtml>.
[15] S. Steinbrecher, S. Köpsell, Modelling unlinkability, in: Workshop on Privacy Enhancing Technologies, 2003.
[16] Common Criteria for Information Technology Security Evaluation, Version 3.1, Part 2: Security Functional Requirements, Common Criteria Project, September 2006. [Online]. Available: <www.commoncriteriaportal.org>.
[17] L. Sweeney, k-Anonymity: a model for protecting privacy, Int. J. Uncertain. Fuzz. Knowl. Based Syst. 10 (5) (2002) 571–588.
[18] M. Gruteser, D. Grunwald, Anonymous usage of location-based services through spatial and temporal cloaking, Proceedings of the 1st International Conference on Mobile Systems, Applications and Services (MobiSys), 2003.
[19] W. Wang, M. Motani, V. Srinivasan, Dependent link padding algorithms for low latency anonymity systems, Proceedings of the 15th ACM Conference on Computer and Communications Security (CCS), 2008.
[20] M. Srivatsa, A. Iyengar, L. Liu, Privacy in VoIP networks: a k-anonymity approach, Proceedings of the IEEE INFOCOM 2009, 2009.
[21] A. Narayanan, V. Shmatikov, Robust de-anonymization of large sparse datasets, Proceedings of the IEEE 29th Symposium on Security and Privacy, Oakland, CA, USA, 2008.
[22] A. Machanavajjhala, D. Kifer, J. Gehrke, M. Venkitasubramaniam, l-Diversity: privacy beyond k-anonymity, ACM Trans. Knowl. Discov. Data 1 (2007).
[23] N. Li, T. Li, S. Venkatasubramanian, t-Closeness: privacy beyond k-anonymity and l-diversity, Proceedings of the IEEE 23rd International Conference on Data Engineering, 2007.
[24] C.E. Shannon, A mathematical theory of communication, Bell Syst. Tech. J. 27 (1948) 379–423, 623–656.
[25] C. Diaz, S. Seys, J. Claessens, B. Preneel, Towards measuring anonymity, Workshop on Privacy Enhancing Technologies, 2002.
[26] A. Serjantov, G. Danezis, Towards an information theoretic metric for anonymity, Workshop on Privacy Enhancing Technologies, 2002.
[27] S. Clauss, A framework for quantification of linkability within a privacy-enhancing identity management system, Emerg. Trends Inf. Commun. Secur. (ETRICS) (2006).
[28] D. Chaum, The dining cryptographers problem: unconditional sender and recipient untraceability, J. Cryptol. 1 (1) (January 1988) 65–75.
[29] [Online]. Available: <http://anon.inf.tu-dresden.de> (accessed August 29, 2012).
[30] D.M. Goldschlag, M.G. Reed, P.F. Syverson, Hiding routing information, Inf. Hiding (1996).

[31] R. Dingledine, N. Mathewson, P. Syverson, Tor: The Second-Generation Onion Router, Naval Research Lab, Washington, DC, 2004.
[32] R. Böhme, G. Danezis, C. Díaz, S. Köpsell, A. Pfitzmann, On the PET Workshop Panel "Mix Cascades Versus Peer-to-Peer: Is One Concept Superior?", Workshop on Privacy-Enhancing Technologies (PET) 2004, 2005.
[33] J. Camenisch, M. Dubovitskaya, M. Kohlweiss, J. Lapon, G. Neven, Cryptographic mechanisms for privacy, in: Privacy and Identity Management for Life, Springer, Heidelberg, 2011, pp. 117–134.
[34] D. Chaum, A. Fiat, M. Naor, Untraceable electronic cash, Adv. Cryptol. Crypto'88 (1988).
[35] D. Chaum, Achieving electronic privacy, Sci. Am. (1992) 76–81.
[36] S. Fischer-Hübner, IT-Security and Privacy—Design and Use of Privacy-Enhancing Security Mechanisms, LNCS, Springer, Heidelberg, 2001.
[37] S. Goldwasser, S. Micali, C. Rackoff, The knowledge complexity of interactive proof systems, Proceedings of the 17th ACM Symposium on Theory of Computing, 1985, pp. 291–304.
[38] S. Brands, Rethinking Public Key Infrastructure and Digital Certificates—Building in Privacy, PhD thesis, Eindhoven University of Technology, 1999.
[39] J. Camenisch, A. Lysyanskaya, Efficient non-transferable anonymous multi-show credential system with optional anonymity revocation, Adv. Cryptol. Eurocrypt 2045 (2001) 93–118.
[40] D. Cooper, K. Birman, Preserving privacy in a network of mobile computers, Proceedings of the 1995 IEEE Symposium on Security and Privacy, Oakland, May 1995.
[41] R. Ostrovsky, W. Skeith, A Survey of Single-Database Private Information Retrieval: Techniques and Applications, Public Key Cryptography—PKC 2007, Springer, 2007.
[42] H. Lipmaa, Oblivious Transfer or Private Information Retrieval. [Online]. Available: <http://www.cs.ut.ee/~lipmaa/crypto/link/protocols/oblivious.php>.
[43] R. Leenes, M. Lips, R. Poels, M. Hoogwout, User aspects of Privacy and Identity Management in Online Environments: Towards a theoretical model of social factors, PRIME Framework V1 (Chapter 9), project deliverable, 2005.
[44] H. Hedbom, A survey on transparency tools for privacy purposes, Proceedings of the 4th FIDIS/IFIP Summer School, Brno, September 2008, published by Springer, 2009.
[45] W3C, P3P—The Platform for Privacy Preferences 1.1 (P3P1.1) Specification, 2006. [Online]. Available: <http://www.w3.org/P3P/>.
[46] P. Kelley, L. Cesca, J. Bresee, L. Cranor, Standardizing privacy notices: an online study of the nutrition label approach, Proceedings of the 28th International Conference on Human Factors in Computing Systems, ACM, 2010, p. 1573.
[47] PrimeLife, Privacy and Identity Management in Europe for Life—Policy Languages. [Online]. Available: <http://primelife.ercim.eu/results/primer/133-policy-languages>.
[48] J. Angulo, S. Fischer-Hübner, E. Wästlund, T. Pulls, Towards usable privacy policy display & management for PrimeLife, Inf. Manag. Comput. Secur. (Emerald) 20 (1) (2012) 4–17.
[49] Opinion on More Harmonised Information Provisions, 11987/04/EN WP 100, Article 29 Data Protection Working Party, November 25, 2004. [Online]. Available: <http://ec.europa.eu/justice_home/fsj/privacy/docs/wpdocs/2004/wp100_en.pdf>.
[50] ENISA, Privacy, Accountability and Trust—Challenges and Opportunities, 2011. [Online]. Available: <http://www.enisa.europa.eu/activities/identity-and-trust/privacy-and-trust/library/deliverables/pat-study>.
[51] E. Wästlund, S. Fischer-Hübner, End User Transparency Tools: UI Prototypes, PrimeLife Deliverable D4.2.2, <www.primelife.eu>, June 2010.
[52] T. Pulls, Privacy-Friendly Cloud Storage for the Data Track: An Educational Transparency Tool, NordSec 2012—17th Nordic Conference on Secure IT Systems, Blekinge Institute of Technology, Karlskrona, October 2012.
[53] B. Schneier, J. Kelsey, Cryptographic Support for Secure Logs on Untrusted Machines, The Seventh USENIX Security Symposium Proceedings, USENIX Press, 1998, pp. 53–62.
[54] S. Sackmann, J. Strüker, R. Accorsi, Personalization in privacy-aware highly dynamic systems, Commun. ACM 49 (9) (September 2006).
[55] H. Hedbom, T. Pulls, P. Hjärtquist, A. Lavén, Adding secure transparency logging to the PRIME Core, in: 5th IFIP WG 9.2, 9.6/11.7, 11.4, 11.6/PrimeLife International Summer School, Nice, France, 2009, revised selected papers, Springer, 2010.
[56] K. Wouters, K. Simoens, D. Lathouwers, B. Preneel, Secure and privacy-friendly logging for eGovernment services, 3rd International Conference on Availability, Reliability and Security (ARES 2008), IEEE, 2008, pp. 1091–1096.
[57] M. Hildebrandt, Biometric Behavioral Profiling and Transparency Enhancing Tools, FIDIS Deliverable D7.12, <www.fidis.net>, 2009.
[58] S. Berthold, R. Böhme, Valuating privacy with option pricing theory, in: T. Moore, D.J. Pym, C. Ioannidis (Eds.), Economics of Information Security and Privacy, Springer, 2010, pp. 187–209.
[59] S. Berthold, Towards a formal language for privacy options, in: Privacy and Identity Management for Life, 6th IFIP WG 9.2, 9.6/11.7, 11.4, 11.6/PrimeLife International Summer School 2010, Revised Selected Papers, 2011.
[60] S. Fischer-Hübner, C.J. Hoofnagle, I. Krontiris, K. Rannenberg, M. Waidner, Online privacy: towards informational self-determination on the internet (Dagstuhl Perspectives Workshop 11061), Dagstuhl Manifestos 1 (1) (2011) 1–20.
