Privacy Law and Policy Reporter
Dr John Borking
This article is based upon a paper delivered to the ICSI conference on anonymity and unobservability at the University of California, Berkeley, in July 2000. The project described here (the PISA project) has now received substantial funding from the European Commission.
In the coming years, electronic commerce (e-commerce) and electronic government (e-government) will become an increasingly large part of citizens’ everyday life. Many transactions will be performed by means of computers and computer networks. However, several hurdles have to be overcome before e-commerce and e-government can develop their full potential.
An essential element of e-commerce and e-government is the collection of information on large computer networks, where either the information itself is the product sought or information about products and services is being retrieved. Currently the retrieval of information on large computer networks, in particular the internet, is becoming increasingly complicated. The volume of stored information is overwhelming, and the time and capacity needed for its retrieval are growing. Congestion in computer networks is a serious problem.
Numerous services are currently available to ease these problems, ranging from simple push technologies such as ‘PointCast’, which brings information to your doorstep by ‘narrow-casting’ or filtering information based on an individual’s specified interests, to sophisticated systems that allow for the ‘personalisation’ of network user sessions and the tracking of user activities. Collaborative filtering of a user’s ‘clickstream’ or history of web based activity, combined with neural networks which look for detailed patterns in a user’s behaviour, are just beginning to emerge as powerful tools used by organisations of all kinds.
While the majority of these technologies are at the moment still in the design phase or of limited utility, they are indicative of the types of products that are being developed. The end result will be the creation and development of intelligent software agent technologies (ISATs). Intelligent software agents are software programs, at times coupled with dedicated hardware, which are designed to complete tasks on behalf of their user without any direct input or supervision from the user. For that purpose, agents contain a profile of their users. The data in this profile are the basis for the actions an agent performs: searching for information, matching this information with the profile, and performing transactions on behalf of its user.
At first glance, ISATs appear to hold out great promise for automating routine duties and even conducting high level transactions. However, upon greater reflection, it becomes clear that ISATs could present a significant threat to privacy, given the wealth of personal information in their possession and under their control. Accordingly, it is highly desirable that their development and use reflect European privacy standards (such as European Union Directives 95/46/EC and 97/66/EC) in order to safeguard the personal information of their users.
The functionality and utility of user agents lie in what they can do for the user. Their whole raison d’être is to act on the user’s behalf and function as trusted personal servants, serving their users’ needs and managing their day to day activities.
Their powers are constrained by a number of factors: the degree of software sophistication, the number of services with which they can interact and, most importantly, the amount of personal information that they possess about the user.
It is this issue of ‘user profiling’ that is at the core of the privacy risk associated with the use of ISATs. Typically, an ISAT user profile would contain a user’s name, contact numbers and email addresses.
Beyond this very basic information, the profile could contain a great deal of additional information about a user’s likes and dislikes, habits and personal preferences, frequently called telephone numbers, contact information about friends and colleagues, and even a history of websites visited and a list of electronic transactions performed.
Because agents could be requested to perform any number of tasks, ranging from downloading the daily newspaper to purchasing concert tickets for a favourite singer, the agent is required to know a great deal of information about the user.
In order to function properly, ISATs must also have the following characteristics:
Depending upon the level of security associated with the user profile, this information may be saved in a plain text file or encrypted. However, the security of the data residing within the agent is only one part of the concerns regarding privacy. The arguably more significant concern is the dissemination of information during transactions, and the general conduct of the agent’s activities on behalf of the user.
As an agent collects, processes, learns, stores and distributes data about its user and the user’s activities, the agent will possess a wide variety of information which should not be divulged unless specifically required for a transaction. In the course of its activities, an agent could be required, or be forced, to divulge information about the user that he or she may not wish to be shared.
The most important issue here is one of openness and transparency. As long as it is clear to the user exactly what information is being requested, the purpose for which it is needed, and how it will be used (and stored), the user will be in a position to freely make decisions based on informed consent.
Of even greater concern is the situation where the ISAT may not be owned directly by the user but is made available (rented or leased) to the user by a third party in order to assist in accessing services.
Summarising, there are two main types of privacy threats that are posed by the use of ISATs:
The user is required to place a certain degree of trust in the agent to perform its functions correctly as requested. However, this trust could well come with a very high price tag, one of which the user may have no knowledge or awareness — the price of his or her privacy. Failing to ensure such trust may prove to be a major hindrance for the development of electronic transactions and commerce.
Conventional information systems generally record a large amount of information. This information is often easily linked to an individual. Sometimes these information systems contain information that is privacy-sensitive to some individuals. To prevent information systems from recording too much information, the information systems need to be adjusted.
There are a number of options to prevent the recording of data that can be easily linked to individuals. The first is not to generate or record data at all. The second is not to record data that is unique to an individual (identifying data) — the absence of such data makes it almost impossible to link existing data to a private individual. These two options can be combined into a third one. With this third option, only strictly necessary identifying data will be recorded, together with the non-identifying data.
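The third option can be illustrated with a minimal Python sketch. The field names, the IDENTIFYING_FIELDS set and the minimise function are hypothetical and serve only as an illustration; which data count as identifying would in practice depend on the system and on the applicable law.

```python
# Hypothetical classification of fields; a real system would derive this
# from a legal and functional analysis of every data item.
IDENTIFYING_FIELDS = {"name", "email", "phone"}


def minimise(record, required_identifiers):
    """Keep all non-identifying fields, plus only those identifying
    fields that the transaction strictly needs."""
    kept = {}
    for field, value in record.items():
        if field not in IDENTIFYING_FIELDS or field in required_identifiers:
            kept[field] = value
    return kept


record = {
    "name": "A. Jansen",
    "email": "a.jansen@example.org",
    "phone": "555-0100",
    "interest": "concert tickets",
}

# A search needs no identifying data at all.
anonymous = minimise(record, required_identifiers=set())

# Delivering an electronic ticket needs the email address, nothing more.
delivery = minimise(record, required_identifiers={"email"})
```

With an empty set of required identifiers the sketch reduces to the second option (no identifying data recorded); recording nothing at all, the first option, needs no code.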
In the privacy incorporated software agents (PISA) project, we will study how this privacy enhancing technology (PET) can be implemented in software agents.
The conventional information system contains the following processes: authorisation, identification and authentication, access control, auditing and accounting. In the conventional information system, knowledge of the user’s identity is often needed to perform these processes. The identity is used within the authorisation process, for instance, to identify and record the user’s privileges and duties. The user’s identity is thus introduced into the information system. Because in a conventional information system all processes are related, the identity travels through the information system.
The main question is: is identity necessary for each of the processes of the conventional information system? For authorisation, in most cases, it is not necessary to know the user’s identity in order to grant privileges. However, there are some situations in which the user must reveal his or her identity to allow verification of certain required characteristics.
For identification and authentication, access control and auditing, the identity is not necessary. For accounting, the identity could be needed in some cases. It is possible that a user needs to be called to account for the use of certain services, for example when the user misuses or improperly uses the information system.
The introduction of an identity protector (IP) as a part of the conventional information system will structure the information system in order to protect the privacy of the user. The IP can be seen as a part of the system that controls the exchange of the user’s identity within the information system.
An important functionality of the IP is conversion of a user’s identity into a pseudo-identity. The pseudo-identity is an alternate (digital) identity that the user may adopt when consulting an information system.
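The conversion of an identity into a pseudo-identity can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the PISA design: the IdentityProtector class and its methods are hypothetical, and a keyed hash (HMAC) is used purely as a simple stand-in for a pseudonym scheme.

```python
import hashlib
import hmac
import secrets


class IdentityProtector:
    """Toy identity protector: maps a real identity to a pseudo-identity."""

    def __init__(self):
        # Secret key that never leaves the domain where the identity is known.
        self._key = secrets.token_bytes(32)
        # Reverse table, so that re-identification stays possible when it
        # is legitimately required (for example, accounting after misuse).
        self._reverse = {}

    def pseudonym(self, identity, service):
        """Derive a stable pseudo-identity that differs per service."""
        message = f"{service}:{identity}".encode("utf-8")
        digest = hmac.new(self._key, message, hashlib.sha256).hexdigest()[:16]
        self._reverse[digest] = identity
        return digest

    def reveal(self, pseudonym):
        """Return the real identity behind a pseudo-identity."""
        return self._reverse[pseudonym]


protector = IdentityProtector()
p_jobs = protector.pseudonym("alice@example.org", "job-portal")
p_cars = protector.pseudonym("alice@example.org", "car-portal")
```

Because the service name is part of the hashed message, the same user appears under different pseudo-identities at different services, which makes it harder to link records across them.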
The user must be able to trust the way his personal data is handled in the domain where his identity is known. The IP (as a system element) can be placed anywhere in the system where personal data is exchanged. This offers some solutions for privacy compliant information systems.
Techniques that can be used to implement an IP are: digital signatures, blind digital signatures, digital pseudonyms and trusted third parties.
In PISA, the technological objective is to solve the problems of designing and implementing an IP in software agents that are used on the electronic highway, in order to address privacy issues.
The challenge is to design a PET agent that independently performs these tasks while fully preserving the privacy of the persons involved, or at least up to the level specified by the persons themselves. For that purpose, the agent should be able to distinguish what information should be exchanged, in what circumstances and with which party. The challenge here is to translate privacy laws, specifically the European Directive 95/46/EC (being the most rigorous privacy standard at this moment in the world), and other rules into specifications for a product. These specifications then have to be implemented in software. There should also be appropriate (cryptographic) protection mechanisms to ensure the security of the data and prevent ‘leakage’ to third parties.
PET agents will enable the user to protect him or herself against loss of informational privacy as a consumer or citizen in e-commerce and e-government transactions and communications — unlike systems like P3P, where an asymmetric situation exists to the benefit of the website owner. PISA empowers the consumer and citizen to decide at any time and in any circumstance when to reveal his or her identity.
If the use of agents could lead to so many potential privacy risks, one wonders if it could be possible for anyone to use ISATs safely. I believe this still remains within the realm of possibility, and that the answer lies in the use of PET agents.
The tracking and logging of a person’s use of computer networks is a major source of potential privacy violation. Conventional information systems perform the following transactions: authorisation, identification and authentication, access control, auditing and accounting. At each phase, a user’s identification is connected with the transaction. However, use of a system element (filter) called the IP in the design of a system will go a long way towards protecting privacy. The introduction of an IP into an information system can improve the protection of the user’s information by structuring the system in such a way as to remove all unnecessary linkages to the user’s personally identifying information.
The IP system element can be placed between the user and the agent; this prevents the ISAT from collecting any personal data about the user without the knowledge and prior consent of the user. Alternatively, the IP can be located between the agent and the external environment, preventing the ISAT from divulging any personal information unless specifically required to do so in order to perform a particular task or conduct a specific transaction.
Additional technical means may also be integrated into the ISAT in order to bring even more transparency to the user in the operation of the agent, thus ensuring the user’s knowledge and, if necessary, informed consent.
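The placement of an IP between the agent and the external environment, combined with a record of what was disclosed, might be sketched as follows. The DisclosureFilter class, its fields and the sample data are hypothetical; the sketch shows only the filtering and logging idea, not a complete consent mechanism.

```python
class DisclosureFilter:
    """Toy IP placed between an agent and the external environment."""

    def __init__(self, profile, consented):
        self._profile = dict(profile)
        self._consented = set(consented)  # fields the user agreed to share
        self.log = []                     # disclosure record the user can inspect

    def disclose(self, requester, requested_fields, purpose):
        """Release only the requested fields that the user has consented to."""
        released = {}
        for field in requested_fields:
            if field in self._consented and field in self._profile:
                released[field] = self._profile[field]
        # Every request is logged, including those that release nothing.
        self.log.append((requester, purpose, sorted(released)))
        return released


profile = {
    "name": "A. Jansen",
    "email": "a.jansen@example.org",
    "seat_preference": "balcony",
}
guard = DisclosureFilter(profile, consented={"seat_preference"})
released = guard.disclose(
    "ticket-shop", ["name", "seat_preference"], "ticket purchase"
)
```

The log gives the user a view of which party asked for what and for which purpose, which is one way of supporting the openness and transparency discussed earlier.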
An international team consisting of the Dutch Physics and Electronics Laboratories (TNO), the Dutch Technical University of Delft, Sentient, a Dutch company designing data mining systems, Italsoft from Rome (Italy), CIME-Labs and Toutela from Paris, the Canadian National Research Council (with Nortel and three other Canadian companies) and the Registratiekamer (Dutch Data Protection Authority) will build a demonstrable PISA not later than 2003. The architecture of PISA will be published at the beginning of 2002. A special PISA website will soon be opened for publication of the progress of the project and discussion.
The main objectives of the demonstration are to show the security of the privacy of the user in the types of processes that could be employed:
The integration of these technologies into the core of the ISAT, combined with a process that places similar technology between the agent and the external environment, would result in a PISA demonstration system that ensures maximum protection against threats to the user’s privacy, enabling users to protect themselves without being dependent on systems like P3P or other privacy seals. The international team hopes to get a subsidy from the European Commission under the EU Technology Program.
The PISA project contributes to the objectives of the EU IST programme, in particular those of key action lines II4.1 and II4.2:
II4.1 states as an objective:
[t]o develop and validate novel, scalable and interoperable technologies, mechanisms and architectures for trust and security in distributed organisations, services and underlying infrastructures [with a focus on] building technologies to empower users to consciously and effectively manage and negotiate their ‘personal rights’ (ie privacy, confidentiality, etc). This includes technologies that enable anonymous or pseudonymous access to information society applications and services, for example by minimising the generation of personal data.
The II4.2 objective is:
[t]o scale-up, integrate, validate and demonstrate trust and confidence technologies and architectures in the context of advanced large-scale scenarios for business and everyday life. This work will largely be carried out through trials, integrated test-beds and combined RTD and demonstrations [with a focus on] generic solutions that emphasise large-scale interoperability and are capable of supporting broad array of transactions (eg e-purses and e-money), applications and processes. Development of solutions that reconcile new e-commerce models and processes with security requirements, paying particular attention to the needs of SMEs and consumers. Validation should generally include assessing legal implications of proposed solutions, especially in the context of solutions aimed at empowering users to consciously and effectively manage their personal ‘rights and assets’.
The PISA project aims to build a model of a software agent within a network environment to demonstrate that it is possible to perform complicated actions on behalf of a person without the personal data of that person being compromised. In the design of the agent an effective selection of the presented PETs will be implemented.
The PISA demonstration model is planned to be a novel piece of software that incorporates several advanced technologies in one product:
Additionally, the project involves:
In order to prove the capability of the PISA model, we propose to test it in a model environment, in two cases of e-commerce that closely resemble real life situations.
The demonstration model will be applied to a practical case which is suitable for testing several aspects of privacy protection. Testing of the demonstration model will be done in a local network environment. The proposed test object is the matching of supply and demand on the labour market. In the coming years it is expected that the matching on the labour market will increasingly be performed through such (internet based) intermediaries. In this process the agent carries the profile of a person, including sensitive information about his or her history, (dis)abilities, skills, labour history and so on. When this information is compared with the available positions, a match has to be made between the user profile and the demands of the party requiring personnel. In this process personal data will be exchanged in an anonymous way. After the match has been made successfully and has been reported by the agent to its user, he or she may decide to reveal his or her identity.
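The labour market case can be illustrated with a small Python sketch in which matching is done on a pseudonymous profile and the identity is released only after the user agrees. All names (AnonymousProfile, Vacancy, find_matches, reveal_identity) and the sample data are hypothetical and are not taken from the PISA specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnonymousProfile:
    pseudonym: str        # stands in for the job seeker's identity
    skills: frozenset


@dataclass(frozen=True)
class Vacancy:
    reference: str
    required_skills: frozenset


def find_matches(profile, vacancies):
    """Return the vacancies whose requirements the profile satisfies."""
    return [v for v in vacancies if v.required_skills <= profile.skills]


def reveal_identity(pseudonym, identities, user_agrees):
    """Release the real identity only if the user explicitly agrees."""
    return identities[pseudonym] if user_agrees else None


# The mapping from pseudonym to identity stays on the user's side.
identities = {"p-7f3a": "A. Jansen"}

profile = AnonymousProfile("p-7f3a", frozenset({"python", "sql", "dutch"}))
vacancies = [
    Vacancy("V1", frozenset({"python", "sql"})),
    Vacancy("V2", frozenset({"java"})),
]
matched = find_matches(profile, vacancies)
```

The matching function never sees the identities table; only the reveal step, which is under the user's control, links a pseudonym back to a person.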
The second demonstration will take place in an extension of an existing web portal providing a range of intermediation services oriented towards both individual consumers and business users, addressing areas like the buying and selling of vehicles and renting of real estate. Respecting and protecting the privacy of users posting requests on this site is a key aspect of user acceptance of the intermediation services proposed. Testing of the demonstration model will be done in a local network environment. The proposed test object is the matching of supply and demand on the vehicle and real estate markets. In the intermediation process, the agent matches requests and offerings posted by individuals, which may encompass confidential information not to be disclosed to the outside world.
E-commerce, e-government and other internet developments introduce fundamentally new modes of consumer transactions, with new challenges to data protection. A growing awareness exists that, in addition to a legal framework, effective privacy protection should be supported by technical and organisational measures. This is particularly needed in view of the difficulty in ascribing global electronic transactions to a particular jurisdiction.
In order to increase the visibility of privacy compliance, websites involved in e-commerce activities apply so-called ‘privacy seals’ like P3P. These are mere visual statements of compliance, the credibility of which is difficult for consumers to judge.
Rather than relying on legal protection and self-regulation, the protection of consumers’ privacy is probably more effective if transactions are performed by means of technologies that are privacy enhancing. Without PETs, privacy cannot be protected adequately.
In the context of e-commerce, a major current initiative to increase privacy compliance is P3P, under development by the World Wide Web Consortium (W3C). Major players in the software industry back the W3C. P3P provides a standardised language and interface through which consumer and website can exchange personal data and privacy preferences. Transactions can be accomplished according to the level of openness that consumers specify for their personal data.
P3P, as it is being developed now, does not provide for any ‘negotiation’ between consumer and market party. It is basically a compliance check between consumer preferences and website policy; therefore it is relatively static and needs active user input for each transaction. Given the limited capability of P3P, it is difficult to implement the privacy principles underlying the European Directive into this technology. Although an essential first step, P3P is not sufficient for effective and efficient privacy protection of consumers.
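The static nature of such a compliance check can be seen in a minimal Python sketch. It does not use the actual P3P vocabulary or syntax; the policy_acceptable function, the data items and the purposes are hypothetical and only mimic the idea of comparing stated preferences with a declared policy.

```python
def policy_acceptable(preferences, site_policy):
    """Return True only if every purpose the site declares for a data
    item is among the purposes the user allows for that item."""
    for data_item, declared_purposes in site_policy.items():
        allowed = preferences.get(data_item, set())
        if not set(declared_purposes) <= allowed:
            return False
    return True


# Purposes the user permits, per data item.
preferences = {
    "email": {"order-confirmation"},
    "postcode": {"delivery", "statistics"},
}

# Purposes two hypothetical websites declare.
shop_a = {"email": {"order-confirmation"}, "postcode": {"delivery"}}
shop_b = {"email": {"order-confirmation", "marketing"}}

accept_a = policy_acceptable(preferences, shop_a)
accept_b = policy_acceptable(preferences, shop_b)
```

The outcome is a plain yes or no: the sketch has no way of proposing a counter-offer, such as accepting the second shop on condition that the marketing purpose is dropped, which is the kind of negotiation the text finds missing.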
E-commerce and e-government will only take off if the search for matches of supply and demand and the accompanying negotiation on privacy preferences can be performed in a distributed way. This would decrease both the load on network capacity and the time spent by a consumer. Intelligent agents are the right type of software for this task. In contrast to current systems like P3P, agents should be able to negotiate intelligently with the consumer, a website or with each other. The skills to negotiate privacy and a common platform for communication will have to be developed. This level of sophistication will enable developers to implement the principles of privacy more deeply into the technology itself. Once equipped with these skills, a PET agent will become an enabler of secure e-commerce and e-government. It will provide structural feedback to users as well as allowing them to control their identifiable data in cyberspace, preventing harmful privacy intrusions.
The project results will be of importance to the Dutch Data Protection Authority (DDPA) and other privacy commissioners in several respects.
Firstly, the resulting model software will be an important step towards an effective and user-friendly implementation of the European Directive 95/46/EC. The emergence of new forms of electronic business and interactions requires a data protection approach that ties in more closely with the dynamic technological environment. Over the next few years, software agents will become an important instrument for data protection, in addition to the legal instruments used now. The results to be achieved will have a flow on effect on the legislative process as well.
For the DDPA, participating in the project helps the office to develop practical ways to make its task easier as well as being a way to keep up its renowned high standard of expertise on the connection between information technology and privacy.
It is necessary to be conversant with technology developments and to translate the Directives 95/46/EC and 97/66/EC into design recommendations if the organisation is to retain its viability as an authority, and to serve the consumers within the EU, in matters of data protection.
Finally, a fruitful collaboration between the DDPA and the industrial partners will help to keep up the competitive advantage of the European software industry in designing tools that provide trust and security as added values.
Dr John J Borking is vice-president of the Dutch Data Protection Authority.