
ACEEE Int. J. on Information Technology, Vol. 02, No. 01, March 2012

Introduction of a New Non-Repudiation Service to Protect Sensitive Private Data


Rainer Schick, Christoph Ruland

Chair for Data Communications Systems, University of Siegen, Siegen, Germany
Email: {rainer.schick, christoph.ruland}@uni-siegen.de

Abstract: Current security systems dealing with sensitive private data do not provide sufficient options to find data leaks. An approach to find the last authorized receiver of a protected copy is proposed in this paper. Existing security concepts are extended by a new security service based on reliable tracking data embedding. Additionally, a new mechanism to protect the new tracking data is shown. Digital watermarking techniques are used to provide tracking abilities for forwarded copies of the protected data. This paper briefly describes approaches to improve security for both the owner of protected data and its recipients.

Index Terms: information security, security services, digital forensic, data hiding, digital watermark, fingerprinting

I. INTRODUCTION

Nowadays most sensitive and private data are generated, processed and stored digitally. Considerable effort is therefore spent on protecting these data from access by unauthorized attackers. Fortunately, modern security services provide confidentiality, authenticity and integrity for data worth protecting. But these services only provide security against attacks by unauthorized external attackers. The even worse attacks conducted by employees and authorized receivers of such data are often neglected. The main problem is that control over confidentiality ends with decryption. The sender does not know what the receiver does with the data. This is even worse for a data owner: if he or she shares information with trusted users and one of them misbehaves, the owner cannot prove who the mole was.

Fig. 1 illustrates the problem for a data owner in a decentralized information distribution system. The approaches described in this paper pursue two different goals. First, a reliable data tracking system is provided. Each receiver is able to track the way protected data have taken so far. Additionally, only authorized receivers are able to decrypt the received data. The goal is a new security service which provides non-repudiation of forwarding for recipients. A receiver of such protected data cannot repudiate that he or she had access to it. Second, a mechanism to prove forwarding of such protected data is provided for the owner. If the owner finds a copy of these data, he or she is able to track the last authorized receiver of it. Then the owner can check whether it is an authorized or unauthorized copy.

Finally, the approaches shown in this paper should lead to a new security service. This service extends the non-repudiation services described in ISO/IEC 13888 [1], [2], [3]. The fields of application of the data tracking service are those depending on provably authentic information. Examples are the protection of company secrets and warrants or the realization of notary authorities. Authorized receivers of protected data should be able to collect evidence to prove the receipt and the forwarding by previous receivers of these data. Taking Fig. 1 as an example, Carol should be able to prove the forwarding by the Owner and FLR3. Additionally, the owner (the first sender) of this sensitive information can prove the unauthorized forwarding by the last authorized receiver of data protected using the data tracking service. If Carol forwards the data to an unauthorized receiver, the owner should be able to prove this misbehavior.

The data tracking service does not prevent unauthorized forwarding of the plaintext information. Instead, it provides mechanisms to track data leaks. It is not meant to replace copyright protections in the sense of preventing or detecting illegal file sharing applications.

II. RELATED WORK

As already stated in the introduction, the new data tracking service is proposed as a non-repudiation service. The goal of a non-repudiation service is to generate and collect evidence concerning a claimed action or event [4]. None of the services described in ISO/IEC 13888 provides options to find the last authorized receiver of a suspicious copy of sensitive data. If a misbehaving authorized receiver claims the forwarding, all of the next receivers can use the evidence to prove the forwarding [5]. There are several approaches to providing control over digital data. For example, companies often install so-called

Figure 1. Unauthorized data forwarding in a decentralized system

2012 ACEEE DOI: 01.IJIT.02.01. 30

Data Leakage Prevention on their systems [6]. This is also called Endpoint Security, because it is often a software extension installed on client computers. These software solutions limit the user rights on these systems, such that an employee cannot copy protected data to a private USB device or use a private e-mail account for conversations. Another approach is Digital Rights Management (DRM). DRM provides access control for copyright holders to limit the use of digital content. It is often described as a copy protection mechanism, but it is not: copying cannot be prevented by this technique. Instead, the copies contain digital watermarks. These watermarks help copyright owners to track the leak if an unauthorized copy is found [7], [8], [9], [10], [11]. Digital watermarking has to cope with several problems. One is the limited embedding capacity in relation to the size of the carrier [12]. Another is the collusion attack, in which attackers combine their copies to remove their fingerprints. Frameproof codes attempt to counter this attack [13], [14].

Unfortunately, there is no feasible way to prevent unauthorized information forwarding under all circumstances. If a user is able to view the plaintext information (and authorized receivers should be able to do so), he or she has several possibilities to create copies. The recipient may print it, photograph the screen or at least rewrite it (if the confidential information is a text). As it is easy to copy or manipulate digital data, a rethink in using and believing its content is inevitable. Receivers of digital information should not trust its content unless its authenticity and the authenticity of its source can be proven, at least when sensitive and confidential information is shared. Nevertheless, attackers usually do not care about the existence of a valid digital signature. They forward data they have stolen or received from misbehaving authorized personnel. It is not part of this work to find the one who published the data. Instead, the last authorized receiver should be traceable, such that the source of the data leak can be found. In contrast to the approaches mentioned above, each authorized receiver of the data is trusted. If he or she is not in possession of the required security module, the receiver is not able to obtain a plaintext copy of the protected information.

III. NOTATION

The following terms and notations apply for this paper:

- m: Data/information that must be protected by the document tracking service.
- CD: Specific configuration data added by the data owner. This may contain an expiry date, specific receiver identifiers or group policies.
- x||y: The result of the concatenation of x and y in that order. An appropriate encoding must be used so that the data items can be recovered from the concatenated string.
- O(m): The source signature calculated by the data owner. As long as this signature accompanies m and can be verified successfully, the data is valid. The signature is calculated over the concatenation m||CD.
- mO: The concatenated data m||O(m).
- PIDn: The unique personal identifier of user n.
- FID: The unique file identifier of data m.
- TIDn: The unique transaction identifier for the transmission of data m from user n to user n+1.
- TSn: Timestamp of TIDn.
- TDn: The tracking data of user n. These data are defined as the concatenation PID0||...||PIDn+1||FID||TID0||...||TIDn||TS0||...||TSn.
- n(TDn): The signature calculated by user n signing the current tracking data TDn.
- SSTKn: Secret Storage Key.
- DEKn: Secret Data Encryption Key.
- SWKn: Secret Watermarking Key.
- SCK: Secret Confusion Key.
- TDEK: Secret Tracking Data Encryption Key.

IV. SYSTEM DESIGN

The system aims at two different goals, so the approach consists of two main parts: one is the data tracking part and the other is the displaying or watermarking part. Before these parts are described in detail, the basic idea of the data flow is explained. The data tracking part secures the sensitive information during storage, transmission and processing. The embedded tracking data provide non-repudiation of forwarding for an authorized receiver n. Using the embedded tracking data, a receiver n can prove the chain of receivers for all users 0 to n-1. The displaying part embeds a digital watermark into the visible content on the receiver's side. If such a watermarked copy is found and has not been manipulated, the data owner can prove the forwarding by the last authorized receiver.

Fig. 2 sketches the flow of such protected data. It shows the functionality for sending and receiving data using the data tracking service. It also shows the branch of the visible data. These data are watermarked using a watermarking key SWKn. The watermarked copy is for viewing only and should not be forwarded to anybody by the authorized receiver. The encrypted data contain the tracking data of all previous recipients of the confidential information.
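The tracking data TDn from the notation section, on which this design relies for the chain of receivers, could be represented roughly as follows. This is only a sketch under assumptions: the paper does not specify an encoding, so the 4-byte length prefix, the class name TrackingData and the helpers concat, split, add_hop and chain are hypothetical choices made for illustration.

```python
import struct
from dataclasses import dataclass, field


def concat(*items: bytes) -> bytes:
    """x||y with a 4-byte length prefix per item (assumed encoding),
    so the items can be recovered from the concatenated string."""
    return b"".join(struct.pack(">I", len(i)) + i for i in items)


def split(blob: bytes) -> list[bytes]:
    """Inverse of concat: recover the individual items."""
    items, pos = [], 0
    while pos < len(blob):
        (n,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        items.append(blob[pos:pos + n])
        pos += n
    return items


@dataclass
class TrackingData:
    """TDn = PID0||...||PIDn+1 || FID || TID0||...||TIDn || TS0||...||TSn."""
    fid: bytes
    pids: list[bytes]                      # starts with PID0 (the owner)
    tids: list[bytes] = field(default_factory=list)
    timestamps: list[bytes] = field(default_factory=list)

    def add_hop(self, next_pid: bytes, tid: bytes, ts: bytes) -> None:
        # One transmission from user n to user n+1 adds PIDn+1, TIDn and TSn.
        self.pids.append(next_pid)
        self.tids.append(tid)
        self.timestamps.append(ts)

    def encode(self) -> bytes:
        return concat(concat(*self.pids), self.fid,
                      concat(*self.tids), concat(*self.timestamps))

    def chain(self) -> list[tuple[bytes, bytes, bytes, bytes]]:
        # (sender, receiver, transaction id, timestamp) for every hop so far.
        return [(self.pids[i], self.pids[i + 1], self.tids[i], self.timestamps[i])
                for i in range(len(self.tids))]
```

With such a structure, a receiver n could list every hop from the owner up to itself by calling chain(), which mirrors the statement that receiver n can prove the chain of receivers 0 to n-1. The signature over the encoded tracking data is deliberately left out here.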

Figure 2. Data flow of the data tracking scheme

Figure 3. Structure of the data

V. DATA TRACKING

The approach of data tracking can basically be conceived as a smart letter: the receiver of the data gets a letter in an envelope. The envelope contains a signature field made of carbon paper. Once the user has signed the receipt on the closed envelope, the letter internally checks the signature and reveals the secret content only if that signature is valid. That is, the signature and the confirmation of receipt of the receiver are already added to the letter by the time he or she can view it, and they cannot be removed anymore. When this letter is published, the receiver cannot repudiate that he or she was the last authorized recipient of it. For each receiver, the personal signature and confirmation of receipt are added to the tracking information of the letter, such that it contains the information of all previous receivers. When the letter is sent to the next authorized receiver, it is put into a smart envelope again and the letter is now accompanied by the tracking data.

The data tracking part is depicted as the path of the encrypted data shown in Fig. 2. In summary, each receiver of the protected data signs the receipt before he or she is able to process the data. This measure improves security for all authorized users of the security system. A receiver of such protected information can verify the way the data have taken up to him or her. The owner of the data protects them by access control: only users with an appropriate security module can decrypt the data. Additionally, the owner can prove whether a suspicious plaintext copy of his or her document is authorized or not. This idea of non-repudiation of forwarding is explained in the following.

VI. TRACKING DATA PROTECTION

In order to protect the tracking data from targeted manipulations, two solutions are proposed in this paper. One is based on a known block cipher mode of operation with infinite error propagation. The other is a new mechanism called data confusion. This mechanism confuses data of arbitrary size, such that data in one block are not only permuted within that block. The following requirements must be fulfilled by the approaches:

- If an attacker manipulates any of the protected data, the source signature must be destroyed with very high probability. Thus, the confidential information is not authentic anymore.
- The tracking data of a receiver must be added before he or she is able to access the plaintext.

A. Security Module

The previously mentioned requirements make the use of a security module inevitable. This module must provide different functionalities:

- A secure storage for different secret and private keys. The owner of the security module must not be able to read them out.
- Generation and validation of digital signatures.
- Support for SSL/TLS. The key agreement is done using a public key that corresponds to a securely stored private key within the module. The negotiated secret data encryption key is DEKn.

Three different functions must be provided for data processing. These functions are described in the following.

The prepare-function works as follows:
1. The sensitive data m which must be protected by the data tracking service are input into the security module.
2. The user adds configuration data CD, such as expiry dates for the protected data, valid receiver PIDs or a maximum number of allowed forwardings.
3. The source signature O(m) is calculated. Thus, mO is generated.
4. Finally, mO is encrypted using a secret storage key SSTK0. The encrypted copy is stored locally until it is processed again.

The receive-function works as follows:
1. Data encrypted using DEKn are received and decrypted. The structure of such data is shown in Fig. 3. The tracking data within this output are protected using either the PCBC data encryption or the data confusion mechanism as described later in this chapter.
2. The tracking data signature over TDn-1 is verified. If the validation fails, the module stops processing.
3. The previous tracking data are displayed. The receiver can check the chain of receivers of the protected information.
4. If the receiver confirms the receipt, the source signature O(m) is verified to check integrity and authenticity of m.
5. A digital watermark is added to m using SWKn. The watermarked copy of m is output to the receiver.
6. For local storage, the tracking data are protected again and the data are encrypted using a secret storage key SSTKn.
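The order of checks in steps 2 to 4 of the receive-function can be sketched as follows. Several things here are assumptions rather than part of the paper: HMAC-SHA256 stands in for the digital signatures (so, unlike a real signature, verification needs the secret key), decryption with DEKn, watermarking and local storage are omitted, and the names receive, sign, verify, signed_payload, TrackingError and the dictionary layout are hypothetical.

```python
import hashlib
import hmac


class TrackingError(Exception):
    """Raised when the module must stop processing."""


def sign(key: bytes, data: bytes) -> bytes:
    # Stand-in for a digital signature (assumption): HMAC-SHA256.
    return hmac.new(key, data, hashlib.sha256).digest()


def verify(key: bytes, data: bytes, sig: bytes) -> bool:
    return hmac.compare_digest(sign(key, data), sig)


def signed_payload(m: bytes, cd: bytes) -> bytes:
    # The length prefix keeps the concatenation m||CD unambiguous.
    return len(m).to_bytes(4, "big") + m + cd


def receive(pkg: dict, owner_key: bytes, sender_key: bytes, confirm_receipt):
    """Return m only after both signatures check out and receipt is confirmed."""
    # Step 2: verify the tracking data signature, stop on failure.
    if not verify(sender_key, pkg["td"], pkg["td_sig"]):
        raise TrackingError("tracking data signature invalid")
    # Step 3: show the previous tracking data and ask for the receipt.
    if not confirm_receipt(pkg["td"]):
        return None
    # Step 4: verify the source signature over m||CD.
    if not verify(owner_key, signed_payload(pkg["m"], pkg["cd"]), pkg["source_sig"]):
        raise TrackingError("source signature invalid")
    # Step 5 (watermarking) would happen here before anything is output.
    return pkg["m"]
```

The point of the ordering is that the plaintext is never returned unless the chain of receivers has been authenticated first and the receiver has acknowledged it.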

Figure 4. The three functions of the security module

The send-function works as follows:
1. The locally stored data encrypted by SSTKn are decrypted again. If the sender is the data owner (i.e. no tracking data are available yet), continue with step 2; otherwise continue with step 3.
2. The PID0 of the data owner and the unique FID are added. The owner proceeds with step 4.
3. Both the source signature O(m) and the tracking data signature over TDn-1 are verified again in order to detect manipulations during storage. If an error occurs, the module stops processing.
4. The security module adds the PIDn+1 of the next receiver, a unique TIDn for the transmission and the current timestamp TSn.
5. The signature over TDn-1 is discarded and replaced by the new tracking data signature n(TDn). The new signature authenticates all tracking data including those of previous receivers.
6. The resulting data are encrypted using DEKn+1 and transmitted to the next receiver.

The three security module functions work as a black box. The data are input into the module, and certain data are output (if no error occurs). Fig. 4 illustrates the functions from the user's view. The receive-function additionally outputs a watermarked plaintext copy of the protected data. This output will be explained in the watermarking chapter and is not shown in Fig. 4.

B. PCBC Data Encryption

The propagating cipher-block chaining (PCBC) mode is used if small changes in a ciphertext should cause infinite error propagation when the data are decrypted. This mode of operation is chosen so that all data following a manipulated block are also corrupted. Naturally, an attacker will try to manipulate or remove his or her personal tracking data. One requirement is to make sure that such an attack is not successful. Therefore the tracking data have to be placed ahead of the existing data as shown in Fig. 3. If these data are encrypted using the PCBC mode, the manipulation of any of the tracking data leads to a useless plaintext. Neither the source signature O(m) nor the original message m can be recovered. The tracking data signature n(TDn) is also destroyed if any preceding data are manipulated. For that reason, a receiver of such manipulated data recognizes the attack before he or she gets access to the message and before his or her tracks are added. Unfortunately, the PCBC mode suffers from known problems and is considered insecure.
If two adjacent ciphertext blocks are exchanged, it does not affect the decryption of subsequent blocks [15]. For this reason, an alternative mechanism is presented: the data confusion.

C. Tracking Data Confusion

The data confusion mechanism confuses the structure of certain data. It is an approach to protect tracking data embedded by the security module from manipulations by authorized receivers. Nobody can remove or change certain information in the data unless he or she is in possession of the required private key. Unlike other mixing schemes or encryption functions, the permutations in this approach do not shuffle data block by block [16], [17]. Instead, the mechanism considers the protected data as a single block of arbitrary size. This is for a good reason: if an attacker knows that new tracking data are always appended to the end of the data, he or she also knows which block must be manipulated. Targeted obliteration of traces must be prevented by the data tracking scheme.

Fig. 5 sketches the flow of the protected data from the data owner to the n-th receiver. The tracking data including configuration data CD and the source signature O(m) are first mixed using the data confusion mechanism. This mechanism uses a pseudorandom generator (PRNG) whose minimum period is defined in terms of the length of the data in bytes. This function must be initialized using a private key SCK. The PRNG calculates the new positions within the confused data block. Under these circumstances, each byte may have moved to any new position. Only users who know the start value SCK are able to reverse the confusion process.

The confused data are additionally encrypted using a common symmetric cipher like AES. As a side effect, the cipher behaves as a block-by-block mixing function. It follows that each bit of the confused data is additionally permuted within its block. Symmetric key algorithms are key-controlled and therefore another secret key TDEK is needed. Again, this key must only be known to the security module.
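Since the paper leaves the precise confusion mechanism to future work, the following is only one possible reading of the description above: a key-seeded pseudorandom permutation over all bytes of the data, treated as a single block. The function names confuse and unconfuse are hypothetical, and Python's random.Random (a Mersenne Twister) is used purely for illustration. It is not cryptographically secure, and a real module would need a keyed, secure generator. The subsequent AES encryption under TDEK is omitted.

```python
import hashlib
import random


def _permutation(sck: bytes, length: int) -> list[int]:
    # Derive the PRNG start value from the secret confusion key SCK.
    seed = int.from_bytes(hashlib.sha256(sck).digest(), "big")
    rng = random.Random(seed)  # illustration only, NOT a secure PRNG
    positions = list(range(length))
    rng.shuffle(positions)     # positions[src] = new index of byte src
    return positions


def confuse(data: bytes, sck: bytes) -> bytes:
    """Move every byte of the whole data to a key-dependent new position."""
    perm = _permutation(sck, len(data))
    out = bytearray(len(data))
    for src, dst in enumerate(perm):
        out[dst] = data[src]
    return bytes(out)


def unconfuse(data: bytes, sck: bytes) -> bytes:
    """Reverse the confusion; requires the same SCK."""
    perm = _permutation(sck, len(data))
    return bytes(data[dst] for dst in perm)
```

Because the permutation spans the entire input rather than fixed-size blocks, an attacker without SCK cannot tell where the bytes of his or her own tracking record ended up, which is the property the text asks for.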

Figure 5. Data flow using the data confusion mechanism

The proposal of the data confusion mechanism is an approach to randomize data of arbitrary size. If such data are encrypted only, the manipulation of certain tracking data might lead to an invalid tracking data block while the source signature still remains valid. The new data confusion mechanism ensures that the source signature O(m) and the tracking data signature (TDn) become invalid if an attacker manipulates any of the confused data. A precise description of the data confusion mechanism is part of future work.
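For comparison, the behaviour of the PCBC mode from subsection B, including the block-swap weakness mentioned there, can be reproduced with a small experiment. The block cipher below is a toy 4-round Feistel construction built from SHA-256, chosen only so that the sketch needs no external library. It is an assumption for illustration and must not be used for real encryption; all function names are hypothetical.

```python
import hashlib

BLOCK = 16  # toy block size in bytes


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def pcbc_encrypt(key: bytes, iv: bytes, blocks: list[bytes]) -> list[bytes]:
    out, chain = [], iv
    for p in blocks:
        c = toy_encrypt_block(key, _xor(p, chain))
        out.append(c)
        chain = _xor(p, c)  # PCBC feeds plaintext XOR ciphertext forward
    return out


def pcbc_decrypt(key: bytes, iv: bytes, blocks: list[bytes]) -> list[bytes]:
    out, chain = [], iv
    for c in blocks:
        p = _xor(toy_decrypt_block(key, c), chain)
        out.append(p)
        chain = _xor(p, c)
    return out
```

Flipping a bit in one ciphertext block corrupts that block and every later plaintext block, which is the desired error propagation. Exchanging two adjacent ciphertext blocks, however, garbles only those two positions: the chaining value after the pair is unchanged, so all later blocks decrypt correctly.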

VII. DIGITAL WATERMARKING

As already stated in the introduction, the unauthorized forwarding of confidential data cannot be prevented under all circumstances. The services mentioned before focus on protecting data from attacks by authorized users during storage and transmission. The system proposed in this paper should also provide traceability for plaintext copies of the protected data. Digital watermarks are chosen to meet this requirement.

The plaintext document output by the receive-function is the watermarked copy of m. The watermark contains the unique identifier PIDn of the last authorized receiver. Additionally, the timestamp of watermark generation TSn is added. Finally, these data are signed, thus the signature n(PIDn||TSn) is generated and added to the watermark. If a copy of the watermarked document is found by the data owner, the signature n(PIDn||TSn) is used to prove the authenticity of the watermark. If the watermark was extracted and the signature was verified successfully, the user with identifier PIDn as extracted from the watermark cannot repudiate that he or she was the last authorized recipient of it. This feature of the proposed watermarking process is called the non-repudiation of forwarding.

The data owner (or another administrative instance) decides whether this data forwarding was authorized or not. Unauthorized information distribution need not be intentional. It is also possible that the security module of the user was stolen or broken, or that the watermarked plaintext data output by the security module were stolen. These scenarios lead to digital forensic aspects. This research field will be considered in more detail in future work. The investigation or punishment of a proven forwarding is not part of this paper.

The watermark embedding process is key-controlled using the securely stored watermarking key SWKn. This key initializes a pseudorandom generator to choose the embedding positions within the carrier in the frequency domain.
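The two ingredients just described, a signed payload PIDn||TSn and key-controlled embedding positions, can be illustrated with a deliberately simplified sketch. The assumptions are substantial: the carrier is modelled as a plain list of integer coefficients (the transform into the frequency domain is omitted), bits are written into least significant bits, which is fragile rather than robust, HMAC-SHA256 stands in for the receiver's signature, and random.Random is not a secure generator. All function names are hypothetical.

```python
import hashlib
import hmac
import random


def watermark_payload(pid: bytes, ts: bytes, signing_key: bytes) -> bytes:
    # PIDn||TSn followed by a signature stand-in (assumption: HMAC-SHA256).
    body = len(pid).to_bytes(2, "big") + pid + ts
    return body + hmac.new(signing_key, body, hashlib.sha256).digest()


def embedding_positions(swk: bytes, carrier_size: int, n_bits: int) -> list[int]:
    # SWKn seeds the generator that picks distinct coefficient indices.
    if n_bits > carrier_size:
        raise ValueError("carrier too small for payload")
    seed = int.from_bytes(hashlib.sha256(swk).digest(), "big")
    return random.Random(seed).sample(range(carrier_size), n_bits)


def embed(coeffs: list[int], payload: bytes, swk: bytes) -> list[int]:
    bits = [(byte >> i) & 1 for byte in payload for i in range(8)]
    out = list(coeffs)
    for pos, bit in zip(embedding_positions(swk, len(out), len(bits)), bits):
        out[pos] = (out[pos] & ~1) | bit  # overwrite the least significant bit
    return out


def extract(coeffs: list[int], n_bytes: int, swk: bytes) -> bytes:
    positions = embedding_positions(swk, len(coeffs), n_bytes * 8)
    bits = [coeffs[p] & 1 for p in positions]
    return bytes(sum(bits[8 * k + i] << i for i in range(8))
                 for k in range(n_bytes))
```

Without SWKn the positions are unknown, so the payload cannot simply be read out or overwritten at fixed locations. The sketch also makes the capacity constraint concrete, since the carrier must offer at least as many coefficients as the payload has bits.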
Due to the nature of digital watermarks, the carrier must provide enough embedding capacity for invisible and robust data embedding. It is desirable that protected data are destroyed if someone tries to manipulate the protected information or the embedded tracking data. A trade-off between robustness, imperceptibility and embedding capacity must be found. More precise descriptions of embedding capacities and sizes of the embedded data are part of future work.

CONCLUSIONS

This paper proposes a new security service which provides data tracking abilities. A suitable security module is needed to decrypt the protected data. A receiver is able to track the way the data have taken and prove the forwarding of the sensitive information by previous receivers. If a suspicious watermarked copy appears and has not been manipulated, the data owner can associate the copy with the last authorized receiver. Thus, a new non-repudiation service is introduced in this paper: the non-repudiation of forwarding. This service can be used both by the data owner and by all receivers of protected data.

Two suggestions to protect the new tracking data are made. The PCBC encryption protects the data with infinite error propagation: if an attacker manipulates his or her personal tracking data, the sensitive information is also corrupted and useless. The new data confusion mechanism shuffles certain data and encrypts them: if an attacker manipulates any of these confused data, all of the confused data are also corrupted with very high probability.

The proposed system is split into two different parts. Currently, the data tracking part can handle any kind of data. The watermarking part is subject to the restrictions of current digital watermarking algorithms. It is therefore currently limited to data to which such algorithms can be applied. This paper describes the approaches at a high level and shows the big picture of the ideas. More detailed descriptions will be published in future work, including precise descriptions of the mechanisms used and the key management. It is also planned to describe another way to protect the tracking data using authenticated encryption.

ACKNOWLEDGMENT

This work is funded by the German Research Foundation (DFG) as part of the research training group GRK 1564 "Imaging New Modalities".

REFERENCES
[1] ISO/IEC FDIS 13888-1:2009, "Information technology - Security Techniques - Non-repudiation - Part 1: General", 2009
[2] ISO/IEC FDIS 13888-2:2010, "Information technology - Security Techniques - Non-repudiation - Part 2: Mechanisms using symmetric techniques", 2010
[3] ISO/IEC FDIS 13888-3:2009, "Information technology - Security Techniques - Non-repudiation - Part 3: Mechanisms using asymmetric techniques", 2009
[4] S. Kremer, O. Markowitch and J. Zhou, "An Intensive Survey of Fair Non-Repudiation Protocols", Computer Communications, 2002, pp. 1606-1621
[5] R. Schick and C. Ruland, "Document Tracking - On the Way to a New Security Service", Proc. of Conference on Network Architectures and Technologies (SAR-SSI), 2011
[6] V. Scheidemann, "Endpoint Security: Data Loss Prevention", Security Advisor ePublication, 2008
[7] K. J. Liu, W. Trappe, Z.J. Wang, M. Wu and H. Zhao, "Multimedia fingerprinting forensics for traitor tracing", EURASIP Book Series on Signal Processing and Communications, Hindawi Publishing Corporation, 2005, ISBN 977-5945-18-6
[8] J.J. Chae and B.S. Manjunath, "A robust embedded data from Wavelet coefficients", Proceedings of Storage and Retrieval for Image and Video Databases (SPIE), 1998, pp. 308-319
[9] Y. Wang, J.F. Doherty and R.E. van Dyck, "A watermarking algorithm for fingerprinting intelligence images", Proceedings of Conference on Information Sciences and Systems, 2001, pp. 21-24
[10] M.U. Celik, G. Sharma, A.M. Tekalp and E. Saber, "Lossless generalized-LSB data embedding", IEEE Transactions on Image Processing, Vol. 14, No. 2, 2005, pp. 253-266
[11] J. Dittmann, A. Behr, M. Stabenau, P. Schmitt, J. Schwenk and J. Ueberberg, "Combining digital watermarks and collusion secure fingerprints for digital images", JEI, 2000, pp. 456-467
[12] A. Barg, G.R. Blakley and G.A. Kabatiansky, "Digital fingerprinting codes: problem statements, constructions, identification of traitors", IEEE Transactions on Information Theory, 2003, pp. 852-865
[13] Y. T. Lin and J. L. Wu, "Traceable multimedia fingerprinting based on the multilevel user grouping", Proceedings of Multimedia and Expo, 2008, doi: 10.1109/ICME.2008.4607442, pp. 345-348
[14] D. Boneh and J. Shaw, "Collusion-secure fingerprinting for digital data", Proceedings of CRYPTO 95, 1995, pp. 452-465
[15] J. Kohl, "The Use of Encryption in Kerberos for Network Authentication", Proc. of Crypto 89, 1989
[16] M. Matyas, M. Peyravian, A. Roginsky and N. Zunic, "Reversible data mixing procedure for efficient public-key encryption", Computers & Security, Vol. 17, No. 3, 1998, pp. 265-272
[17] M. Jakobsson, J.P. Stern and M. Yung, "Scramble all, encrypt small", FSE 99, LNCS 1636, Springer-Verlag, 1999, pp. 95-111
