The Open Data Commons: A New Vision For The Future of Open Data

The Citadel
Open Data Commons
A collection of excerpts from reports and deliverables of
the CIP (ICT-PSP) Project Citadel On the Move
Jesse Marsh, Francesco Molinari, and Ricardo Stocco

Alfamicro Lda, Cascais, PT
with contributions from the partners and participants in the Citadel project consortium
www.citadelonthemove.eu
LICENSE
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0
Unported License. https://creativecommons.org/licenses/by-nc-sa/3.0/
ACKNOWLEDGEMENT
The material in this book was produced as a result of work carried out in the context of the
Citadel on the Move project, which was partially funded through the EU's CIP ICT PSP
programme under Grant Agreement n. 297188.
Cover image: http://commons.wikimedia.org/wiki/File:Leipzig_1632_Theatrum_Europaeum.jpg

FURTHER INFORMATION
For the most recent updates on the project, its objectives and results, visit the Citadel website
at www.citadelonthemove.eu.
DISCLAIMER
This project has been made possible by EU co-financing under the CIP (ICT-PSP) programme
2012-2014. However, the opinions expressed here are solely those of the author(s) and do not
represent the official standpoint of any EU institution.
FOREWORD
Citadel on the Move was an EC-funded project under the CIP (ICT-PSP) programme 2012-2014,
with a simple objective: to make Open Data an achievable reality for every city in Europe. By
working with the four pilot cities of Ghent (BE), Issy-les-Moulineaux (FR), Athens (EL), and
Manchester (UK), Citadel developed an easy-to-use platform that makes it possible for all
governments, especially the small ones that often get left behind, to open their data and unlock
smart city innovation. The tools and the datasets and apps generated (over 500 apps in the
project's lifetime) are all part of a central concept, the Open Data Commons, that extends the
scope of Open Data from specific city portals to all actors in a Smart City, promoting the active
engagement of citizens and local businesses as well as city departments and agencies to
contribute their own data and build a common data resource for the whole city.
The Open Data Commons is a concept that was developed by Alfamicro, one of the key partners
of the Citadel consortium, and reported in a fragmented way across a range of project
documents. To make the Open Data Commons concept less obscure and popularize it with a
broader international audience, we have assembled in this book some of the key contents
developed over the course of the project. This book therefore briefly presents the Citadel project and the
stakeholder-based approach applied to the development and governance of the Open Data
Commons in the four pilot cities. It then describes the Open Data Commons and how it was
implemented in the course of the Citadel project, with a special focus on the semantic
dimension and a specific chapter on the issue of privacy. Finally, the policy implications of the
Open Data Commons and the future of Open Data are briefly explored.
This work has of course been made possible by our interaction with all of the partners of the
Citadel consortium, under the leadership of Geert Mareels of CORVE (BE) supported by Julia
Glidden and the team at the 21c Consultancy of London. The pilot-driven approach was
successfully carried out by the City of Ghent, Issy Media of Issy-les-Moulineaux, MDDC of
Manchester, and DAEM of Athens, and supported by the evaluation work of iMinds (BE). The
technical team included Intrasoft and ATC of Greece as well as Derby University (UK), ITEMS
(FR), and V-ICT-OR (BE). Project dissemination was entrusted to the Euractiv Foundation and
coordination to IS-practice, both in Brussels. Part of the work described here was also made
possible by additional funding from the FI-WARE project's Lisbon Pilot, whose technical team
included FullIT, IPN, and DRI, all from Portugal.
Our thanks go out to all of the dedicated people in these organisations with whom we have had
the pleasure of working over the last few years, as well as the citizens and developers who have
engaged with us in the design and development of the Citadel Platform. Finally, my thanks go to
the Alfamicro team who have written and compiled this book, as well as to Leonardo Alberto dal
Zovo, who played the essential role of developing the main tools of the Open Data Commons.
Álvaro Duarte de Oliveira, President
Alfamicro Lda
CONTENTS
Foreword
Contents
Index of Figures
Index of Tables
Definitions used in this Book
The Citadel On the Move project
The Citadel approach to Open Data
Defining the Open Data Commons
Stakeholder dynamics for Open Data
Experimentation in Pilot Cities
Roles in the ODGG
Reaching ODGG Objectives
The Open Data Commons at Work
Operationalisation through Experimentation
First issue (2012)
Second issue (2014)
The Semantic Dimension of the ODC
Standards issues in the first ODC concept
The Emergence of the Converter-AGT model
The central issue of semantics
Semantic convergence in the pilot cities
The ODC as a Semantic Framework
Privacy and the Open Data Commons
General Framework
The Privacy Impact Assessment Framework
Privacy types
Privacy at the Community level
Periodic surveys of the Pilot Cities
Analysis of Outcomes
Towards a Community PIA
Privacy at the App Level
Mapping of Citadel Apps
Analysis of Implications
Proposal for an App PIA Framework
Privacy at the Data level
Privacy in the Open Data Commons
Analysis of implications
Proposal for a licensing mechanism
Maturity of Open Data Governance
Capability of Open Data ecosystems
Governance roles
Process
Guidelines for completion
The Citadel Governance Toolkit
The ODC and the Future of Open Data
Towards the Semantic Web
The Citadel Vision: Territories of Data
Flattening datasets
Semantic relationships in Citadel apps
Back to LOD
The ODC as a Policy Concept
The ODC in practice
Open Data as a public good
Policy implications
Towards a Regional Cloud for a Territory of Data
References
Annex I: The ODC Toolkit
The Citadel Converter
The Library
The GUI Standalone
Portlet for Liferay Portal
The PHP Converter Library
The CitySDK-Citadel conversion script
Annex II: Standards adopted in the Citadel Platform
File Formats Mapping
File Formats Choices
Data Model Choice
Points of Interest (POI) Standards
Geospatial Standards
Date and Time Data
Sensor and IoT
Metadata
POI Dataset Categories
Mobile Application Templates
Gaps in Existing Standards
Character Encoding Issues
POI Issues
Events Issues
Geospatial Issues
Annex III: The Citadel Charter
Preamble
The Malmoe Declaration
The Citadel Statement
Learning from the Citadel Project
Towards the Malmoe Objectives
The Citadel Manifesto
Vision
Commitments and challenges
About the Authors
INDEX OF FIGURES
Figure 1. Open Data Value Chain
Figure 2. The Citadel Vision
Figure 3. Typologies of innovation
Figure 4. Mapping of stakeholder domains
Figure 5. Mapping of stakeholder transactions
Figure 6. Citadel additional stakeholder transactions
Figure 7. Citadel integrated ecosystem
Figure 8. Key areas of stakeholder interaction
Figure 9. Outcome of stakeholder interaction
Figure 10. Open Data activity contribution to Citadel objectives
Figure 11. Pilots' contributions to Citadel objectives
Figure 12. Issues for specification of the Open Data Commons
Figure 13. The range of functions in the ODC
Figure 14. The vision for the Citadel ODC
Figure 15. The ODC as a semantic model
Figure 16. The Semantic Core of the ODC
Figure 17. The Open Semantic Ecosystem
Figure 18. The ODC in the 'Real World'
Figure 19: Stepwise PIA process (source: [20])
Figure 20: Typologies of privacy in Citadel
Figure 21: Stakeholders' role in the definition of privacy policies (Citadel members)
Figure 22: Stakeholders' role in the definition of privacy policies (Citadel non-members)
Figure 23: Conditions and purposes of MoU definition
Figure 24. A typical relational database structure
Figure 25. A typical GTFS folder unzipped
Figure 26. A typical CKAN Data Store
Figure 27. AGT Apps by Month
Figure 28. The basic RDF syntax
Figure 29. The LOD schema for the statue of Einstein
Figure 30. Citadel JSON Data Model
INDEX OF TABLES
Table 1. Comparison of Pilot ODGGs
Table 2. Governance Roles in Pilot ODGGs
Table 3. ODGG Objectives in Pilot Cities
Table 4. Summary of the three levels of the Citadel PIA framework
Table 5. Mapping of Citadel Application Templates
Table 6. Taxonomy of Data/Application Pairings in Citadel
Table 7. Proposed data privacy classification scheme
Table 8. Proposal for a Data Licensing Mechanism (based on the CC scheme)
Table 9. Proposal for a Data PIA Framework
Table 10. Open Data Ecosystem capability matrix
Table 11. A CMM for LOD (example)
Table 12. Ecosystem role definition and potential MoU contribution
Table 13. Citadel Charter Approaches
Table 14. Citadel Common File Formats Mapping Grid
DEFINITIONS USED IN THIS BOOK

OPEN DATA
A statement of principles regarding the right of people to access, use and republish certain
datasets as they wish, without any restrictions from copyrights, patents or other IPR regimes. In
Citadel, we mostly consider Open Government Data, which may refer to citizens or enterprises,
based on the common assumption that its content is owned by the general public and should
therefore be made freely accessible to everyone for the promotion of transparency and/or the
provision of incentives to economic and social initiatives. In this sense, one of the project goals
has been to make life easier for those public sector organisations that hold raw datasets and are
willing to publish them electronically but do not know exactly how.
APPLICATION
Any piece of software purposefully developed to perform a certain task, particularly, but not
exclusively, for the benefit of mobile users. In Citadel, we focused on applications using Open
Government Data resources, and one of the main project goals has been to provide an easy-to-use
platform that could dramatically simplify the creation of applications even by non-expert
users (the so-called Citizen Developers). A notable component of the Citadel vision has been
the requirement that application templates be unbundled from the datasets, with the app
accessing the information only when needed, directly through a remote server located
somewhere on the Internet.
API
API stands for Application Programming Interface, and is a software module that allows a
programmer designing an application direct access to another application. In the area of Open
Data, APIs are generally used for an app to fetch data from (or, if the programmer is allowed
access, to write information to) an on-line service such as transport or weather info or a city's
information services.
COMMUNITY
In Citadel, a stylized representation of the Open Data Community has been provided, composed
of the following actors: Policy Makers, in charge of the high-level direction and regulation
of the process of opening up and cleansing government-owned datasets; Data Providers,
responsible for the creation and management (setup, organization, structuring, cleansing) of
those datasets; Application Developers, usually ICT-savvy companies/individuals but also
Citizen Developers, with the mission of transforming available datasets into human-readable
forms (products, services, or both); and Business/Citizen Groups, including not-for-profit
entities and NGOs as well as City travellers and visitors, the ultimate beneficiaries of the
generation, transformation and utilization of datasets according to their respective purposes.
OPEN DATA COMMONS

A repository concept specifically developed in Citadel, where it was designed as a mixed
socio-technical platform showing eco-systemic features (promoting cooperation between Data
Providers and Application Developers) and acting as an open and public collection of tools and
services that users can navigate, acquire/adopt, and populate as they wish. Its name
reflects the simple idea that Open Data should be considered a common good, in a public
sphere whose stewardship benefits public and private stakeholders as well as
individual citizens.
CITIZEN DEVELOPER
In Citadel, the Citizen Developer metaphor has been used to identify a particular category of
users of the Open Data Commons. Within this category, we make a further distinction
between the Citizen as Application Co-Developer, who improves an existing app provided to
her/him with open source code, and the Citizen as Garage Developer, who uploads to the
project platform a new app that she/he has built from scratch, possibly without immediate
business purposes (e.g. a free app).
PRIVACY
A human right enforced by almost all constitutions worldwide, which pertains to the protection
of the personal sphere against the abuses of the State or Law. According to EU Directive
95/46/EC, privacy concerns may emerge wherever personally identifiable information is
collected and stored in digital form or with partly automated means, and only partial,
improper or non-existent prevention of disclosure risk is guaranteed to the data subject by the
so-called data controller (or processor, if acting on the data controller's behalf). In Citadel, we
have made the distinction between three types of privacy: Community, Application and Data
level.
PRIVACY AS A SERVICE
The Privacy as a Service concept [10, 12] refers to a group of security protocols through which
the privacy and legal compliance of user data can be exercised in cloud service architectures. It
can be associated with the provision of feedback to the user on the current risk of personal
information exposure, depending on his/her privacy settings. Together with the Privacy by
Design operational principles, it has guided the first instantiation of Citadel's Privacy Impact
Assessment Framework (PIAF).
PRIVACY BY DESIGN
Privacy by Design is a concept developed by the former Information and Privacy Commissioner
of Ontario, Dr. Ann Cavoukian, to mitigate the impact on personal data of ICT and large-scale
networked data systems. The objectives of Privacy by Design are, for people, to gain personal
control over their own information and, for organizations, to acquire a sustainable competitive
advantage by taking a positive-sum, not a zero-sum, approach to privacy protection.¹
SEMANTICS
Semantics is technically the study of meaning, but in the area of data management it takes on a
more specific definition as types of data structures specifically designed to represent
information content. Semantics can thus refer, for instance, to the meaning of column headings
in an Excel table and whether to expect the same information under the headings "name" and
"title".
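As an illustration of this point, the following is a minimal Python sketch of how two tables with different column headings might be brought under one shared vocabulary. It is not the Citadel Converter itself: the synonym table, the function name normalise and the sample rows are all assumptions made for the example.

```python
import csv
import io

# Hypothetical synonym table: local column heading -> shared field name.
SYNONYMS = {
    "name": "title",
    "title": "title",
    "lat": "latitude",
    "latitude": "latitude",
}

def normalise(csv_text, synonyms=SYNONYMS):
    """Return rows keyed by shared field names instead of local headings."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        clean = {}
        for heading, value in row.items():
            if heading is None or value is None:
                continue  # skip ragged rows with extra or missing cells
            key = synonyms.get(heading.strip().lower())
            if key is not None:  # headings with no agreed meaning are dropped
                clean[key] = value.strip()
        rows.append(clean)
    return rows

# Two invented datasets describing the same kind of information.
city_a = "Name,Lat\nBelfry,51.0536\n"
city_b = "title,latitude\nTown Hall,53.4794\n"
merged = normalise(city_a) + normalise(city_b)
```

Once both tables use the same field names, a single application can read either of them without knowing which city published the data.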
CITADEL HUB
Available online at http://www.citadelonthemove.eu/en-us/Thehub.aspx, it is a collection of
Open Data, mobile application templates, user extensions and discussions about these. It was
set up in the early stages of the Citadel project, before its contents were migrated to GitHub
(https://github.com/citadel-eu).
OPEN DATA GOVERNANCE GROUPS
Established in the early project stages, in compliance with the Living Lab approach, they
were informal groups consisting of the key stakeholders from the Open Data Community in each
of the pilot settings.
CITADEL CHARTER
Better referred to as the Citadel Open Data Charter, it was originally conceived of as a formal
protocol (MoU) to be signed in preparation for, and to accompany the governance of, opening-up
processes. What has in fact emerged as a common need for the cities involved in Citadel is to
share a common vision and principles, so that the final version of the Charter has taken the
form of a manifesto, in continuation of the Citadel Statement of 2010 on which the project was
originally based.
¹ The 7 foundational principles of Privacy by Design are described at:
http://www.privacybydesign.ca/index.php/about-pbd/7-foundational-principles/
THE CITADEL ON THE MOVE PROJECT

THE CITADEL APPROACH TO OPEN DATA
The Citadel... on the Move project builds on two key political statements of principle that
underpin much of the Smart City movement:
• The Malmö Declaration, agreed on 18 November 2009 at the 5th Ministerial
  eGovernment Conference in Malmö, Sweden, calling for a new generation of open,
  flexible and personalised eGovernment services of administrations at local, regional,
  national, and European level
• The Citadel Statement, signed in Ghent on 14 December 2010, aiming to operationalize
  the Malmö Declaration with a local government Action Plan based on five key
  principles: a common architecture, Open Data, citizen participation, privacy, and rural
  inclusion.
The Citadel on the Move project is in many respects carrying that process one step further by
implementing these principles in practice, building Open Data-based mobile application
templates for pilot experimentation in four pilot cities across Europe (Ghent, Issy-les-Moulineaux,
Manchester, and Athens) and since extending engagement to over 120 cities on 5
continents. The results of the project allow all cities, even the smaller villages, to offer public
services on the mobile phones of their citizens and visitors at a low cost. All they need to do is
to publish their data using the Citadel tools and formats and the mobile apps built according to
the same standards will be usable in their municipality.
Citadel thus makes it possible for all municipalities to offer:
• Data to be used by Mobile Apps developed by citizens or companies (local or from other
  cities)
• Mobile Apps to their citizens and visitors.
The Mobile Apps are the most visible and concrete services based on open data to make life
easier for people. But of course the cities will have to do part of the work themselves: they have
to publish their data in a way that allows it to be picked up by the Applications. This sounds easy, but they
will have to overcome the political, administrative and legal constraints that still slow down
the Open Data movement.
On the other hand, policy makers can expect a growing demand from their citizens to
be able to use the same app in their city as they have experienced in a neighbouring city. Just as
the use of the Internet slowly but surely found its way into the public sector twenty years ago,
Citadel supports the same modernization of public services through the use of Open Data in mobile
applications.
In the three years of development and experimentation of the Citadel concept, three key
principles have emerged, which we can fix as strategic guidelines for Smart Cities:
• The new roles of citizen developer and traveller (visitor)
• Standardization processes rather than standards
• Open Data as a Commons.
CITIZEN DEVELOPERS AND TRAVELLERS

The emphasis in the Malmö Declaration and Citadel Statement on citizen empowerment and
participation is finding application in Citadel through concrete approaches that really place
citizens at the centre of the process.
The concept of Citizen Developer means that technology development can no longer be entirely
in the hands of technicians; the products of the technology industry must instead be imagined as
incomplete artefacts whose construction as technological tools is only finalized by citizens.
This approach builds on a trend that has been in vogue for some time; as long ago as
2006, Time magazine deemed the computer user to be Person of the Year, in recognition of
the increasing role of User Generated Content. Nonetheless, the divide between producer and
consumer has remained intact to date. Citadel's Citizen Developer instead gains full status as
producer of services and applications.
The way Citadel implements this concept is through the idea of Application Templates. In
common office software, templates are ready-to-go models for standard uses such as a business
letter or a monthly budget, allowing non-experts to add the content and finishing touches.
Citadel provides Templates for Open Data, as modules that carry out the technical work of, say,
accessing a database of events or air quality data and visualizing the information contained
therein. Citizen Developers, or rather those with some familiarity with HTML5, can then piece
these modules together to build mobile applications for their own city, such as an app to advise
those with allergies as to whether or not to attend an open-air concert.
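As a rough illustration of how such modules might be pieced together, the following Python sketch combines an events module with an air quality module to produce the allergy advice described above. It is not the Citadel template code: the function name, the field names (`title`, `district`, `open_air`), the sample values and the threshold are all assumptions made for the example.

```python
def advise_allergy_sufferers(events, air_quality_by_district, threshold):
    """Return (event title, verdict) pairs for open-air events only.

    All field names and the threshold scale are hypothetical.
    """
    advice = []
    for event in events:
        if not event["open_air"]:
            continue  # indoor events are not affected by outdoor air quality
        index = air_quality_by_district.get(event["district"])
        if index is None:
            verdict = "no data"
        elif index > threshold:
            verdict = "avoid"
        else:
            verdict = "ok"
        advice.append((event["title"], verdict))
    return advice


# Hypothetical sample data standing in for two open datasets.
SAMPLE_EVENTS = [
    {"title": "Concert", "district": "north", "open_air": True},
    {"title": "Lecture", "district": "north", "open_air": False},
    {"title": "Market", "district": "south", "open_air": True},
]
SAMPLE_AIR_QUALITY = {"north": 80, "south": 20}
```

Calling `advise_allergy_sufferers(SAMPLE_EVENTS, SAMPLE_AIR_QUALITY, 50)` flags the concert and clears the market, while the indoor lecture is skipped.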
The Citizen Developer is thus not just a technical concept, but a new form of empowerment and
democratization of Internet technologies. Mobile applications can now be designed by the same
people that will use them, rather than devised in far-away research laboratories, and they thus
belong to a city and its citizens in a new way. In this sense, Citadel is part of the growing
number of initiatives in the Human Smart City movement2, which envisages future urban
services driven by people more than by the underlying infrastructures, through processes able
to seamlessly blend technical with social innovation.
The second role concept to emerge from Citadel is that of the Traveller. When dealing hands-on
with a mobile application such as a city tour or a guide to local restaurants, the question
emerges as to whom this application is really for. From there, the idea of Traveller takes shape
as a cosmopolitan figure who is neither a total stranger nor a member of a generations-old local
family. In this perspective, we are perhaps all Travellers, both in our own cities (whose various
quarters we may know in greater or lesser detail) and across Europe, as the Erasmus
generation builds trans-cultural fluency into an emerging notion of nomadic European
citizenship.
2 http://humansmartcities.eu/
If the author-reader dialogue in Citadel is between citizen developers and travellers, then the
idea of Smart City services can be framed in the concept of hospitality. Open Data becomes a
welcoming gesture on the part of a City Administration, whose role is no longer one of cold
efficiency but as the host who opens the house to familiar and unfamiliar faces alike. Public
sector information becomes a shared asset, the basis on which the dialogue within and between
cities can unfold, with travellers as messengers of other places and experiences offering new
understandings of local urban life that only an external eye can bring.
STANDARDS AND STANDARDISATION PROCESSES
Both the Malmö Declaration and the Citadel Statement call for the adoption of standard
technical formats to facilitate the uptake of Open Data based applications and their
interoperability across different cities and even Member States. Indeed, over the last decades
we have seen how agreement within an industry on a technical standard such as VHS or MP3
can be a determining factor in speeding up innovation and opening up markets. In a sort of
policy paradox, however, most attempts to impose such standards by decree (such as the
attempts to introduce Digital Rights Management to protect copyrighted material) have failed,
overtaken by rapid technological advances that either bypass top-down constraints or shift the
foundations on which the proposed standards are intended to work, thus making them
irrelevant. Nonetheless, standards do emerge and they are important; it is just very
risky to try to pick a winner.
Citadel on the Move addresses this conundrum by shifting the emphasis from standards to
standardization processes. If we look at the history of standards, especially the kind of data and
semantic standards most relevant to Citadel, we see that they tend to first appear in the midst
of a flurry of innovation focused in a specific area, with one or more of the key innovators at
that moment proposing to clean things up. The key innovators can be end users or technology
providers, each with a different self-interest in proposing a standard, but each with their own
concrete needs and an eye to the benefits of scaling up. It is this combination
of healthy self-interest together with an awareness of positive network effects that often leads
to the emergence of a good standard.
As innovation literature from Thomas Kuhn onwards shows, the key then lies in part in the
intrinsic quality, even elegance, of the proposed standard, but also in the credibility of the
body proposing it and the degree to which it is judged to be acting in the general interest and
willing to defend and maintain that standard over time. An important sign is thus the degree of
acceptance of a standards proposal, evident not only through adoption by lead users but
above all by the emergence of tools, translators etc. that can adapt non-conformant datasets to
that standard or, conversely, allow access to that standard by a tool originally designed for
another one. This ecosystem of tools often means that two or three standards can be adopted
concurrently with a high degree of interoperability.
Rather than proposing single standards for a given area, Citadel thus prefers to focus on building
literacy in standardization processes, for both today's and tomorrow's emergent needs.
Indeed, this means learning to clearly identify the area of standardization, search for on-going
activities and standards proposals, search for the richness of the toolkit ecosystem built around
the available options, and evaluate the best strategy in relation to the current landscape.
In short:
Standards are relevant but cannot be defined top-down
Standards result from social adoption and technology convergence processes
Standards tend to define a path towards a coherent vision (i.e. Web of Data)
In each area, such as file formats, data models, etc., usually a limited set of standards
emerges and prevails
Tools exist to translate between standards and allow for on-the-fly interoperability
The key strategy is thus to understand these processes and align standards to practice
Citadel offers a solution that integrates up-to-date standards at each step.
OPEN DATA AS A COMMONS

Open Data is generally promoted, beyond its technical modularity within the Web of Data
vision, as a matter of principle: Open Data is a good thing in terms of a) transparent government
and b) collaboration between the public and private sectors for the creation of services in the
public interest. Open Datasets themselves are in the public domain, as Public Sector Information
(PSI), while it is private businesses who build the applications using PSI. As we will see below,
however, the situation is not as simple as it appears, as a mature Open Data ecosystem includes
a wealth of tools, interfaces and toolkits between the data and the applications, making it
increasingly difficult to draw the line between where the obligations of the public authority end
and the opportunities for individual enterprises begin.
Citadel on the Move proposes an innovative approach in which the set of all available tools
and services that can be considered non-specific to either a given dataset or a given
application is treated as a Commons: a collection of re-useable items that belong to the
community. This Open Data Commons (ODC) approach aims to provide the greatest benefits to
data providers and application developers alike, lowering the risk of specific standards decisions
by the former and the investment required for a new application by the latter. In fact, the
required tools that allow the two to interact are likely to be already in the Commons; if they
are not, the required re-useable part of the new application is developed and donated for use
by others, in a win-win situation for the developers themselves.
Governance of the ODC is a collaboration between the city administration, citizens and
businesses, and application developers, but it is the city government that oversees that the
process is open and fair. All parties discuss Open Data strategies for the city, i.e. which datasets
to open, possible applications, standards, privacy and security, and so forth, since the ODC
highlights the public relevance of these issues. Indeed, with the Open Data Commons, the role
of the public sector is elevated from mere data provider to the stewardship of the collective
interest.
In short:
Citadel has defined a common space in the public domain as key to uptake of Open Data
The Open Data Commons, as the on-going collection of shared tools and resources,
allows datasets to be published and accessed transparently
Promoting the emergence of standards and sharing standards of practice
Based on a partnership of the data and development communities
Governance principles are required to define ODC structure and nature
Role of the City Government in guaranteeing openness and transparency of governance
DEFINING THE OPEN DATA COMMONS

STAKEHOLDER DYNAMICS FOR OPEN DATA
Over the last decade, several authoritative studies [6, 13, 14] have dealt with the definition of a
value chain for the commercial and non-commercial re-use of Public Sector Information,
including Open Data as a specification thereof. These attempts were based on a number of
assumptions [17], namely that:
Enabling technologies such as the Internet and open source software applications are
supporting and enhancing the main value-creating functions;
Much of the currently expanding re-use activity only started once low-cost ICT applications
and networks became available;
A positive economic value is actually created out of Open Data / Public Sector Information
reuse, according to a number of relevant business models [8];
Recent trends on collaborative data and service production between governments and
citizens [15] do not add significant feedback loops to the workflow schematized in the
following Figure:
Figure 1. Open Data Value Chain
In the above representation, four main actors, or stakeholder categories, can be identified,
each closely associated with specific tasks:
Policy Makers, being in charge of the high-level direction and regulation of the whole
process, and with specific respect to Data Providers;
Data Providers, usually, though not always, public bodies or agencies (such as public utility
companies, statistical offices, chambers of commerce etc.), being responsible for the
creation (setup, organization, structuration) of the open datasets, and sometimes also of
their adaptation and specialisation to the needs of the Application Developers;
Application Developers, usually ICT companies, sometimes under the control of public
bodies, otherwise acting on the free market, with the mission of transforming the datasets
available into human-readable forms: products, services, or both;
Business/Citizen Communities, including not-for-profit entities and NGOs, who are
ultimately beneficiaries of the transformation, generation and utilization of public datasets
according to their respective (business / non business) purposes.
Activities beyond raw data creation, collection and aggregation that can be relevant to value
creation include, for instance, data processing, editing and packaging, marketing and delivery.
More recently, they have also come to include the development of APIs, mash-ups and other forms
of user-friendly, if not user-generated, content. However, as the following picture shows, the
essence of the Citadel vision is to complicate the previous representation of the value chain by
adding three forms of interaction between the four stakeholder categories introduced above:
a) Data co-production, deriving from the Business/Citizen Communities themselves, as
parallel and additional sources with respect to Data Providers;
b) Application co-design, again reflecting the spirit of freedom and initiative that
characterizes most end user communities;
c) And policy co-creation, as the joint result of the feedback sought by the smarter
Policy Makers and received back from all of the remaining stakeholder categories, after
a complex process of Living Lab interaction that the Citadel development activities aim
to achieve.
Figure 2. The Citadel Vision
As the final outcome of this set of feedback loops and interrelations, two main goals should
be achieved: intelligent policy learning, from the perspective of workflow directors and
regulators; and the creation of additional value from the disclosure of Open Data and the
reuse of Public Sector Information, beyond what could reasonably be guaranteed using the
conventional, one-way logic depicted in Figure 1 above.
The way this outcome becomes feasible can be described as follows. In Figure 3, we add
another relevant analytical dimension to our vision, namely the distinction between
technological and social (including also institutional) innovation. Among the many definitions of
the latter, we would like to adopt the following: innovative solutions and new forms of
organisation and interactions to tackle social issues.
Figure 3. Typologies of innovation
By the combination of the value chain tasks depicted in Figure 1 with the typologies of
innovation introduced above, we can easily locate the four stakeholder groups as per the
following diagram:
Figure 4. Mapping of stakeholder domains
Here, the corresponding value transactions, to use the jargon popularized by the Value Network
Analysis paradigm [16], can be depicted as in the following Figure:
Figure 5. Mapping of stakeholder transactions
One can notice the addition of the Impact and Requirements function from the
Business/Citizen Communities to the Policy Makers, so that the linear workflow
outlined in Figure 1 permanently acquires an iterative feature.
However, the contribution of the Citadel project to refining the above vision goes further
than what has been discussed so far. In particular, the operational objectives set out in our
work on Open Data de facto introduce a symmetrical iteration, going counterclockwise as
described by the following scheme:
Figure 6. Citadel additional stakeholder transactions
In this scenario, Policy Makers act as prime movers with respect to the Business/Citizen
Communities, in launching and promoting the constitution of the ODGGs in the respective Cities
(for now, those that are formal partners of the Citadel consortium; in the future, those that will
adhere to the proposed scheme and play the role of supporting or affiliated partners). This
ensures the definition of the scope, limitations and conditions under which the whole
experiment takes place, including, no less importantly, the privacy, confidentiality and
security aspects related to the procedures of Open Data disclosure and Public Sector
Information dissemination.
Within this overall framework, it is desired, and to some extent expected, that the local
Business/Citizen Communities, adequately stimulated and supported, may start defining their
range of expectations, desires, and purposes with respect to the specific uses of
the various applications developed, or to be designed and worked out by integrating
the public datasets available or to be made available. This backward process, which also
includes the generation of their own datasets, whereby citizens and/or businesses themselves act as
complementary Data Sources with respect to the Public Sector, should positively influence the
strategic behaviour of the Application Developers, who could stay more focused on the
developments that hold the maximum level of utility, usability and social acceptance, instead of
wasting precious resources in a tedious and never-ending process of ex post validation for the
APIs or other ICT applications established in the meantime.
As a by-product of this virtuous interaction between prospective end users and solution
providers, a new range of access and acquisition protocols should also be foreseen, between
the Application Developers and the Public Sector Data Providers. The latter should make
reference to the Policy Makers again, for revised and revamped guidelines concerning pricing
and availability of datasets, in relation to the priorities expressed or signalled by the ultimate
beneficiaries.
Although the proposed representation may look oversimplified (as it does not include, for
instance, the cases of user-generated or private-sector-owned datasets, nor does it consider
application developers as capable of achieving social innovation), most of its heuristic value is
given by the juxtaposition of Figure 5 and Figure 6 in a single, integrated ecosystem, as shown
in the picture below:
Figure 7. Citadel integrated ecosystem
This exercise is helpful, in that it identifies four main areas of interaction, with the
corresponding feedback loops:
Figure 8. Key areas of stakeholder interaction
As a result of those interactions, the goals of policy learning and value creation (as per Figure 2
above) should ultimately be achieved.
Figure 9. Outcome of stakeholder interaction
The overarching objective of Citadel is to grow and nurture such an ecosystem by providing tools,
methodologies, cases and exploitation opportunities.
In the next two (and final) pictures, we identify the contribution of Citadel Open Data and
development activities, respectively, to the achievement of such objective, through a number of
instrumental and operational reports.
Figure 10. Open Data activity contribution to Citadel objectives
Besides the definition of management rules for the ODGGs, Citadel also dealt with the creation
of Open Data Charters concerning the use of public datasets. Later in the project, Privacy Impact
Assessments were defined to identify the risks for personal/sensitive data related to the
introduction of a culture of openness and transparency in Public Sector Information and Data
handling. In parallel to this effort, the semantic dimension of dataset production and usage was
explored, in order to define a common Semantic Framework. Finally, as a collective space on the
Citadel project website, an Open Data Commons Repository was conceived, in its first instance
as a collection of links to available datasets together with a variety of open source tools
providing for adaptation, refinement, and access to public datasets and application resources.
Most of the above achievements, including the technical developments related to each Citadel
pilot, were ensured by the joint contribution of technical partners and pilot cities. As far as the
latter are concerned, the following diagram summarizes their contribution to the Citadel
objectives, namely:
Scenario Development, to create a shared understanding of what makes a City smart;

User Requirements Gathering and Technical Development;
Future Proofing with Geo-Based Technologies;
Template Mobile Applications Creation, Testing and Review.
The intention is not to describe each of these steps in depth, but rather to highlight the close
interconnections between the Open Data Commons Repository and the Template Applications,
both lying at the crucial point of convergence between Business/Citizen Communities (as the
prime receptors of the commercial/non-commercial value created) and Application Developers
(including the project's technical partners, as well as third-party organizations, including citizens
and NGOs acting under the Web 2.0 / FLOSS logic on the ICT market).
Figure 11. Pilots contributions to Citadel objectives
EXPERIMENTATION IN PILOT CITIES

At the heart of the Citadel approach was the actual experimentation of its Open Data tools and
concepts in real settings, engaging city administrations with local citizens and developers in four
pilot cities: Ghent (BE), Issy-les-Moulineaux (FR), Manchester (UK), and Athens (EL). In order to
frame the stakeholder-based approach across the four cities, the common idea of an Open Data
Governance Group (ODGG) was put forth as a means of collectively managing Open Data as a
common good.
Pilot experimentation of the ODGG concept took a different path in each of the four cities, as a
function of the degree of development of Open Data strategies and engagement with local
developers, as follows:
For Ghent, Citadel coincided with the launch of a double strategy for Open Data and
constitution of the local Living Lab. Citadel thus helped frame and guide this process as
it rapidly grew; the ODGG constitutes somewhat of a lead-user forum. This explains the
significant number of meetings held throughout the project.
In Issy, Citadel was aligned with a new strategy definition process, so the ODGG
brought together the different stakeholders, including both the various city government
officials and application developers, to define a common strategy together. This
explains the larger composition of the group and the slower process of data publishing.
For Manchester, Citadel reinforced long-standing Smart City and Living Lab strategies.
There already existed a strong Open Data community in Manchester, so the ODGG
mainly aligns those activities with the work in Citadel.
In Athens, Citadel helped to define a new Open Data strategy, which, due to the sheer
size and complexity of the city, needed to raise awareness among many institutional
departments as well as citizen stakeholder groups. The ODGG thus worked as the
strategic core group guiding this process, and tended to focus on the actions required
for pilot start-up to deliver concrete results.
The final reports from the pilot cities, following two years of experimentation of the Citadel
tools and platform, confirm the different approaches taken in terms of governance strategies,
while at the same time delivering successful outcomes across the board, as shown in the
following table:

Table 1. Comparison of Pilot ODGGs
City
Ghent
ODGG
members
11
Events
24
Total
participants
>575
Issy
34
195
Manchester
10
>250
Athens
32
Governance style
Ghent: Tightly technical ODGG with active engagement of developer and end-user communities in numerous events.
Issy: Broad representation in the ODGG, with more selective and structured events.
Manchester: Continuity with on-going Open Data strategy, increasing links with community.
Athens: ODGG composed of key political actors, coupled with direct engagement of active citizen groups.
ROLES IN THE ODGG

Mid-way through the pilot testing, a survey carried out mostly within the project and pilot
communities aimed to obtain a first mapping of what the roles for different actors in the
community should be. The findings for each role (i.e. the functions for which a leading role was
attributed) were as follows:
Mayor, City Government: defining strategies, promotion, privacy, and evaluation

City ICT Department: leading role for all activities, namely ODGG coordination
Public/private data providers: leading role for all except app development and
promotion
Software companies: leading role for refinement, app development, promotion, and
R&D
Citizen developers: as above but with a leading role also for evaluation
User communities: as above but not developing apps
Citizens and visitors: evaluation
Against this background, we can note the roles actually played by the different actors in the
ODGGs of the pilot cities as the project evolved, as shown in the following table:
Table 2. Governance Roles in Pilot ODGGs
Ghent
Mayor, City Government: A clear Open Data strategy was already in place with political support.
City ICT Department: The ICT department coordinated the pilot throughout.
Public and private data providers: Key roles in the ODGGs; data provision became demand-driven over the course of the project.
Software companies: Ghent tended to work with SMEs and citizen developers.
Citizen developers: Played an active role in Hackathons and app development.
User communities: Strong role for Open Knowledge Foundation, also for the cultural sector.
Citizens and visitors: Engaged through Living Lab activities, especially in the open co-design events.
Issy
Mayor, City Government: The Open Data strategy was launched and defined in the course of Citadel.
City ICT Department: Issy Media already had a strong mandate to promote innovation.
Public and private data providers: Issy engaged many city offices (e.g. tourism) but also neighbouring municipalities and multi-level stakeholders (agglomerate and regional levels).
Manchester
Mayor, City Government: A clear Open Data strategy was already in place with political support.
City ICT Department: Manchester MDDA with a strong mandate, acting as pilot leader.
Athens
Mayor, City Government: Attaining strong political support was a key objective for Citadel in Athens.
Dealing predominantly with municipal data holders.
Issy has strong connections with the software industry (e.g. Microsoft).
Engaged through workshops and Issy Media events.
Good representation of development community.
DAEM carried out a broad policy of one-on-one meetings and discussions with data holders, based on a demand sparked by citizen engagement. This also included other government portals.
Industrial participation will be defined with the Athens Living Lab follow-up.
Athens in particular encouraged the role of citizen developers.
Role for schools, urban communities, etc.
Engaged through testing, but also workshops and conferences.
Multiple roles identified and engaged for citizen developers.
User communities mainly engaged through pilot activities.
User communities mainly engaged through pilot activities.
DAEM already has a strong mandate to manage innovation.
Citizen community NGOs played a driving role in the demand-driven strategy.
The tourism industry is a key concern for Athens.
REACHING ODGG OBJECTIVES

From the reports, it is also possible to identify the different ways in which the objectives for the
ODGG have been met. These can be synthesized as follows:
Open up data: the first and foremost objective of the ODGGs was to spark off processes
for opening up datasets.
Engage with the community: the second objective was to actively involve data owners,
the development community, and local citizens and businesses in Open Data.
Define a strategy: the final objective for the ODGGs was to enable the community to
identify the best way forward to maximize the value of Open Data for their city.
Table 3. ODGG Objectives in Pilot Cities
Open up
Ghent: In the context of an Open Data policy already in place, the Citadel ODGG helped reinforce the link between the city and the open development community.
Issy: Issy used the Citadel ODGG to launch its Open Data policy, going from nothing to a significant number of opened datasets.
Manchester: Citadel helped reinforce an already existing Open Data strategy and extend the user base.
Athens: In a situation of political hurdles and severe austerity, the Athens ODGG approach has been successful in gaining strong political support for Open Data.
Engage
Ghent: Ghent's engagement with the community was reinforced through the Citadel co-design events: Ghent pioneered the Apps4Dummies format.
Issy: Issy Media's existing structures and activity frameworks provided the setting through which to engage citizens and local businesses.
Manchester: The Citadel tools helped bring new actors with less technical skills into the picture.
Athens: The Athens ODGG engaged directly with key government data holders, while at the same time co-designing application scenarios with citizens and community groups to gain bottom-up consensus.
Define a strategy
Ghent: Through the work of the ODGG, Ghent shifted from a data-push to a demand-pull strategy, particularly as regards the cultural sector.
Issy: The ODGG is helping Issy carry out an original multi-level strategy involving nearby municipalities, coordinating with national and regional portals.
Manchester: The Manchester Open Data strategy is extended and reinforced by the availability of the tools.
Athens: Athens now has a clear, Citadel-driven Open Data strategy that will be sustained by the Athens Living Lab currently being established.
THE OPEN DATA COMMONS AT WORK

OPERATIONALISATION THROUGH EXPERIMENTATION
The Citadel ODC was originally formulated in general terms, as a shared collection of items in
the public space (APIs, transformers, converters, etc.) situated between datasets on the one
hand and applications on the other. Any further definition of the ODC concept required a
definition of the two main interfaces: between the ODC and applications on the one hand, and
between the ODC and datasets on the other. The higher the common ODC-App interface (closer to the
single applications), the more work the ODC would have to do to comply with the interface
standards, but the greater the number of applications that could access the ODC. Conversely,
the lower the interface, the less work for the ODC but the more for the individual Apps; the
same logic goes for the interface between the ODC and the datasets. Once the levels of these
two interfaces were defined, it would then become relatively easy to develop the tools to
connect them.
Such a concept could only be formulated in the context of a Living Lab methodology, since only
through the engagement of real users and real cities in real settings could the most appropriate
level for these interfaces be established. In this sense the user groups in the pilot cities
effectively co-designed the ODC. The upper interface corresponded to the JSON data model
required by the Citadel templates in the experimentation with local developers. The data model
proved sufficiently flexible, especially since more than one template was available with slightly
different data models, but hardly any datasets were available to be used.
In fact, the large majority of datasets were only available as Excel or CSV files, at best properly
structured and refined. A particular case in point was Issy-les-Moulineaux, which followed
the data models of the templates but built the datasets by manually entering information into
Excel spreadsheets for feasibility. A further input came from a first prototype Converter built by
the Ghent pilot, which transforms parking data into the JSON format compatible with the Parking
template.
Building this input from the pilots into the ODC concept basically required two steps: first, the
apps using Citadel templates would need to unbundle their data by configuring the template
to read the JSON file from a separate external source. This approach works only if the JSON files
that are remotely accessed are exactly in the form that the Template expects to see. As a
consequence, the second step involved defining a standard tool to convert files from Excel or
CSV into this standard format: the Citadel Converter.
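The conversion step can be illustrated with a minimal Python sketch. This is not the Citadel Converter itself: the output layout (a top-level `records` list), the target field names and the sample rows are assumptions chosen for the example, since the actual template schema is not reproduced here.

```python
import csv
import io
import json


def convert_csv_to_json(csv_text, column_map):
    """Convert well-structured CSV text into a JSON document.

    column_map maps source column headers to the (hypothetical) field
    names a template expects; columns not listed are dropped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        record = {target: row[source].strip() for source, target in column_map.items()}
        records.append(record)
    return json.dumps({"records": records}, indent=2)


# Hypothetical parking dataset, as it might be exported from a spreadsheet.
SAMPLE_CSV = """Name,Address,Spaces
Central Car Park,1 Main Street,120
Station Car Park,5 Rail Road,80
"""

# Hypothetical mapping from spreadsheet headers to template field names.
COLUMN_MAP = {"Name": "title", "Address": "address", "Spaces": "capacity"}
```

Calling `convert_csv_to_json(SAMPLE_CSV, COLUMN_MAP)` yields a JSON string that could be published at a stable URL and read remotely, provided its structure matches exactly what the consuming template expects.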
We thus have the two interfaces defined to fit the needs that emerged during the pilot
experiments: at the lower level (dataset-ODC), any well-structured Excel or CSV file; at the
upper level (ODC-application), a standard JSON schema. The ODC thus consists of the conversion
tool or tools (variations on the converter have already been carried out, i.e. to read from
GeoJSON or CitySDK databases on the input side and write to alternative data models on the
output side), together with the listing of JSON files that are compatible with one or more
Templates.
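The idea of listing which JSON files are compatible with which Templates can be sketched as a simple field check. The template names and required fields below are purely hypothetical, as is the `records` layout; the sketch only shows the kind of test such a listing could rely on.

```python
import json

# Hypothetical templates and the fields each one requires in every record.
TEMPLATE_REQUIRED_FIELDS = {
    "parking": {"title", "latitude", "longitude", "capacity"},
    "events": {"title", "latitude", "longitude", "start_date"},
    "points_of_interest": {"title", "latitude", "longitude"},
}


def compatible_templates(json_text):
    """Return the sorted names of templates whose required fields appear in every record."""
    records = json.loads(json_text)["records"]
    if not records:
        return []  # an empty file cannot be shown to fit any template
    matches = []
    for name, required in TEMPLATE_REQUIRED_FIELDS.items():
        if all(required.issubset(record) for record in records):
            matches.append(name)
    return sorted(matches)
```

A file whose records carry a title, coordinates and a capacity would, under these assumed definitions, be listed for both the parking and the points-of-interest templates.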
FIRST ISSUE (2012)

The preliminary specification of the application templates and the datasets for the pilot cities
also involved some preliminary investigation of the ODC concept, since this constitutes the
interface between the templates and the datasets. Even in the case where a template can
directly access a dataset (a scenario that may work for a given pilot instance but with a very low
level of flexibility and transferability for other uses), the ODC needs to play the role of
interpreting the URI (Uniform Resource Identifier) for the datasets. In the vast majority of cases,
however, the ODC will be called upon to carry out more sophisticated functions, such as
filtering, translation, and/or access interfacing.
During the requirements capture phase, the main issue for Citadel was to find the right balance
between dataset formats, the ODC, and the templates. If the ODC is required to do too much,
e.g., in order to have a broad variety of datasets and light templates, then the entire ecosystem
becomes too dependent on the ODC, betraying the principle of Open Data. If the ODC aims to
be lighter and more efficient, then the dilemma is between templates with a greater
complexity (in order to access datasets with different formats) and placing greater constraints
on datasets, thus increasing the cost of opening up data.
Figure 12. Issues for specification of the Open Data Commons
At this point it was useful to look more closely at the functions that could be included in the
ODC, mainly by looking at the different tools and approaches that currently exist for supplying
open data applications with an external dataset. The main approaches can involve:
Direct access, as mentioned above: this can occur in the case that an external dataset
offers data exactly as the application expects it to be formatted and structured;
Plugs or translators, that have the function of translating file formats or re-mapping
data structures to make data fit applications (we can consider some XML translators in
this category).
Data dumps, mirrored databases constructed for a variety of purposes such as: a)
storing translated datasets as per the above; b) copying a dynamically updated
database (e.g. Meteo) at regular intervals so that external applications avoid
overloading the primary system with queries; or c) other reasons such as security.
One or more APIs can be developed either from the data side (as in the CitySDK project)
or from the application side (as with Pachube or Foursquare) providing interface
functionalities.
Various combinations of the above.
We considered all of these approaches as appropriate to the functional requirements of the
ODC, as illustrated below:
Figure 13. The range of functions in the ODC
The interesting fact for Citadel is that the above tools and devices are part of an ever-evolving
ecosystem in which application developers, data owners, and third parties interact. Indeed, they
are indirectly the key drivers of standardisation processes in Open Data, since the emergence of
a standard such as GTFS (General Transit Feed Specification) is generally accompanied by the
production of APIs, translators etc. to help fit that standard, even in the presence of emergent
enhancements to that standard such as GTFS Realtime.
The second noteworthy fact is that most of these tools and devices are freely available in the
public domain if not Open Source. The only exceptions to this are APIs that are paid for as part of
the business model for applications such as Foursquare, but this does not mean that the ODC
cannot list the API and facilitate its adoption, leaving it up to the user to decide whether or not
to adopt a commercial (generally proprietary) format.
This opens an interesting scenario for the ODC within an Open Data Smart City strategy. The
ODC can become the space which manages an ever-evolving ecosystem that, as a collection of
tools, can be said to define the public space of a city's information capital. On the one hand, it
makes it possible for the city's data to be seen by a broad range of applications; on the other,
it allows applications easy access to the city's information capital. Use of the ODC would occur
not through any compulsion but through the convenience it offers to data holders and application
developers alike.
In order to maintain this role, ODC cities (through the Open Data Governance Group) would
negotiate with application developers to ensure that the components they develop that can be
said to be of public utility (data access tools that could be re-used by other applications) are
donated to the ODC and remain in the public domain, in exchange for access to the city's data
through the ODC. The community of developers that participate in such an endeavour would
thus have an interest in collaborating to ensure that the different components developed work
together smoothly where appropriate as well as promoting adherence to emergent standards.
This vision for the ODC can be illustrated as follows:
Figure 14. The vision for the Citadel ODC
The above figure also illustrates an additional added value that can be provided by the ODC in
line with this scenario. Since the ODC will be managing the interfacing between application
queries, it can maintain a record of queries coming from the application side as well as records
of which applications access a given dataset and for what purposes. This allows us to imagine
the introduction of the concept of bi-directional traceability of Open Data, which opens up
interesting possibilities for the management of privacy and security issues. In addition, analysis
of the queries and transactions over a given period of time could allow for the identification of
semantic patterns, thus allowing for a bottom-up definition of emergent semantics that could
then be fed back into the definition of appropriate data structures.
SECOND ISSUE (2014)

The concept for the Converter tool was born during the first phase of Closed User Group testing,
as on the one hand the pilot cities were opening their data mostly as Excel files, while on the
other the Citadel Templates required JSON files based on specific data models linked to each
template. Indeed, the City of Ghent first built a simple tool for this purpose, translating parking
information to the required JSON format. The request therefore arose for a general Converter
that could transform any CSV or Excel file into the required JSON format for use in Citadel.
This idea was taken up as the first step in actual realisation of the Open Data Commons concept
that had been defined in the first year of the project. Indeed, the Converter as described would
be the first of a series of tools bridging the gap between datasets (as they currently stand in 95%
of public administrations, i.e. Excel files) and applications built using the Citadel Templates.
By the time the request was formalized and the first UML specifications of possible Converter
workflows had been defined, the pilot cities were waiting anxiously for the Converter in order to
finally begin building apps. An unusual development plan was thus defined for the Converter,
following three main stages in an open sequence that allowed the concept to be tested as early as
possible and then to proceed on the basis of user feedback:
A first prototype was realised in less than a month, using php for a server-based
converter. This version only worked with CSV and mapped columns in the original
spreadsheet directly onto the data schema of the JSON format. This version was
released in December 2013 and was an immediate success with the pilot cities.
A second prototype was built using Java, as an off-line tool. This was meant to provide
more stable features and possibly be used for batch processing for files with the same
data structure and/or with constant updating. This version also separated a first phase
of semantic mapping (pairing source column headings with standard field names) with
that of the export schema (matching with the actual fields of the template data model,
and in addition adding necessary metadata such as language, licensing, etc.). This
version was less successful, due in part to the large size of the file to download but
mainly because pilot cities preferred the simpler though less sophisticated
conversion of the php version. This version was released in mid-January 2014 but received
little feedback.
Shortly thereafter the final version was released as an on-line Java tool encapsulated in
Liferay. The basic functionalities were essentially the same, and it was this version that
has been gradually improved through interaction with pilot users in the Living Lab
settings. The first step was to add help texts along the way, as well as feedback on
possible errors in the mapping to the export schema. This version was released in time
for demonstration and testing at the Data Days conference in Ghent in February 2014.
Since then, further refinements of the Java code, together with significant upgrades of the
server features, have been carried out with the objective of improving performance, and in fact
the response times have been notably reduced (slow response having been another of the reasons why the pilot cities
initially preferred the php version). Following initial user testing, the Converter tool was then
integrated into the Citadel platform. This involved stripping away the user registration of the
Liferay environment in order to allow a smooth passage from the Citadel Hub, within which the
Converter is inserted as a simple i-frame. Other enhancements to the Converter have been
carried out in a dialogue with end-users, and are reported in the following section.
As a final note, it is interesting to see that one of the hypotheses of the development plan (i.e.
that outside developers would prefer to work with the php version) has been validated by the
recent adaptation to geoJSON of the Converter and other activity on GitHub. Thus both the php
and the Java versions continue to co-exist, the first as a more open, technical, and experimental
version and the second as a more stable, user-friendly version useful for the front end of the
Citadel platform.
The Citadel Converter's operational maintenance has mostly been a question of adapting to
continuous user requests for enhancement. The main problem is that the Converter significantly
raises expectations (promising to convert just about anything into an app) while in fact there
are many problems to address, mostly related to errors or inconsistencies in the original
dataset. Addressing these issues has involved a combination of technical improvements,
accompanying information, and human support.
The technical improvements have mainly been carried out on two fronts: geographical
coordinates and dataset publishing. As for geographical coordinates, the Citadel JSON format
foresees latitude and longitude in one field separated by a comma and a space. Other common
formats (e.g. with a comma only, as in Google) returned an error message. Work was therefore
carried out to automatically recognize different formats and adjust them where necessary. The
second question is related to the desire to directly save the converted file and publish the
metadata on the Citadel Hub. This issue introduced the option of saving converted files to any
CKAN server, though this also required an API through which to write to the Citadel Index.
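The kind of coordinate recognition described above can be sketched as follows. This is only an illustration under stated assumptions, not the code of the Citadel Converter: the function name, the set of accepted separators (comma, semicolon or whitespace) and the range check are choices made for the example. The only detail taken from the text is the target form, with latitude and longitude in one field separated by a comma and a space.

```python
import re

# Two decimal numbers separated by a comma, a semicolon, or whitespace.
# The accepted separators are an assumption made for this sketch.
_COORD_RE = re.compile(
    r"^\s*(-?\d+(?:\.\d+)?)\s*[,;\s]\s*(-?\d+(?:\.\d+)?)\s*$"
)


def normalise_coordinates(raw):
    """Return 'lat, lon' (comma plus space), or None if not recognised."""
    match = _COORD_RE.match(raw)
    if match is None:
        return None
    lat_text, lon_text = match.group(1), match.group(2)
    # Reject values outside the valid latitude and longitude ranges.
    if not (-90 <= float(lat_text) <= 90 and -180 <= float(lon_text) <= 180):
        return None
    return f"{lat_text}, {lon_text}"


for value in ["51.0543,3.7174", "51.0543 3.7174", "51.0543, 3.7174", "n/a"]:
    print(repr(value), "->", normalise_coordinates(value))
```

Returning None rather than raising an error leaves it to the caller to decide whether to skip the row or report the problem to the user.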
Another aspect is related to providing user information. This has occurred through: presentation
of the Converter so as to lower expectations (raising awareness of the difficulties involved),
explanatory trouble-shooting pages on the Citadel website, improvements in the help texts
accompanying the different phases of conversion, and improvements in the error messages to
make them more understandable. A final aspect, human interaction, has led to a series of
actions that are not within the scope of this book, except perhaps for the preparation of a series
of help sheets and template Excel files used as support tools for the Apps4Dummies
workshops [3].
THE SEMANTIC DIMENSION OF THE ODC

STANDARDS ISSUES IN THE FIRST ODC CONCEPT
It is useful to remember how the actual implementation of the ODC developed as compared to
the original vision. The first idea for the ODC was in fact as an open collection of APIs,
converters, transformers and similar tools in the public domain, with the idea that they might be
somehow configured or chained so as to connect a given dataset with a given application
template.
While this description provided little guidance from an operational point of view, its innovation
consisted in the clear vision of some sort of autonomous space between datasets and
applications, considered as a public good. By suggesting that transformation resources should be
pooled rather than selected, the ODC concept implied that it would not be necessary for the
Citadel project to choose a semantic standard. More precisely, it implied that if Citadel were to
define a data model for practical purposes, then it didn't necessarily have to become a
standards proposal, since other data models could perfectly well co-exist with it in the ODC
space, together with the transformation tools needed to convert towards them.
[3] These were awareness-raising events organised by the Citadel consortium in several locations across Europe.
The ODC model was thus presented as a vision, without the specific intention of implementing it
in practice. The goal was rather to use the ODC as a framework capable of guiding the thinking
and actions of the pilot cities. The main objective at this early stage was to see whether in
practice such an autonomous space between data and applications did exist and, if so, to
identify where the borders or interfaces above and below this space were and what defined
them.
What did emerge in the first cycle of pilot testing was the emptiness of this common space.
Feedback from the pilots noted the significant gap between city datasets on the one hand and
the application templates on the other, which require data to be in a specific JSON format.
Normally, this gap would be bridged by a specific tool such as an API, but the whole idea of the
ODC is to introduce a different concept that, rather than bridge this gap, fills it with elements
that are open and re-usable [4].
In addition, APIs generally work only with on-line relational database services, while the great
majority of the datasets of the pilot cities consist of Excel files. The gap between data and
applications was thus a significant one, and cities found themselves with few applications with
which to use their data (giving them little motivation to open more data), while developers had
little data ready to use with the application templates (giving them little motivation to go
through the complicated process of installing the templates' client-server configurations). To
begin to overcome this problem, the Ghent pilot devised a simple tool to convert data from one
of the city's services for parking data to the JSON format required for the parking template. This
was heralded as the first instance of the ODC concept at work, with the spontaneous emergence
of a conversion tool to begin the process, but it actually laid the ground for the further
development of the ODC itself as a more complex system. In the process, what initially appeared
as a question of technical formats (the use of JSON), ultimately emerged as a question of
semantics.
THE EMERGENCE OF THE CONVERTER-AGT MODEL

Evaluation and reflection on this experience and that of the other pilot cities led to the proposal
for the introduction of the Converter-AGT toolkit to bridge the gap between datasets and
templates. The idea was to build on the example of the Ghent converter but in a more
structured way that would work for most of the datasets in all of the pilot cities. Indeed:
The Application Generator Tool (AGT) adopts a generic version of the Citadel data format
(mostly based on the POI template's data model) to generate apps that read one or
more appropriately structured JSON files and then visualize the POIs in a list or map
visualization.
The Converter, which starts from a row-based Excel or CSV dataset as input (like the ones
most cities had produced, notably Issy-les-Moulineaux), then carries out a mapping of
the column headings to the generic Citadel data model of the AGT (e.g. mapping Name
to Title) and finally saves the output as a Citadel-compliant JSON file.
[4] APIs in fact are in general not conceived as belonging in the common space of the ODC. They are either designed as
an accessory to an application, so that a data service needs to write specific code to feed data to it (for example Google
Maps or Xively), or they are written for a specific data service, so the application has to have special code to be able to
use each data service's API. The idea of the CitySDK project (http://www.citysdk.eu/) is to standardize the APIs
associated with common types of data services, but it only works with data services and not, for example, with the
spreadsheet-type static files that make up most of the available public sector information.
This required an important step to be taken as regards the architecture of the Citadel templates.
In the first round of pilot testing, apps were created by selecting the appropriate template,
encapsulating the necessary datasets (after downloading the files), and installing the template
software (with the data inside it) on the server side with the visualization part on the client side.
This procedure, which is normal practice for mobile applications, essentially creates a closed
system, even though the data was originally downloaded from an open portal and structured
according to a common data model.
In order for the AGT to be able to generate an app quickly, it was necessary to separate the data
from the application or application template that reads it, thus externalizing the dataset. In this
way, the app resides entirely on the client side (e.g. in a smartphone), the data resides as an
autonomous file on the Internet (e.g. on the Citadel Hub), and the app reads the data in real time
when it needs it. This is similar to the way an API reads data from an on-line web service,
essentially by connecting to the web service, asking the right questions, and knowing how to
expect the data to be returned and subsequently adapted to the needs of the application. The
difference, however, is that the Citadel JSON file is live [5] but already in exactly the format the
application expects to see: all the application needs is the URL to read the information directly
from the external server, with no further need for an API.
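A minimal sketch of this externalized arrangement, seen from the application side, might look as follows. The top-level key (records), the field names and the function names are assumptions made for illustration and do not reproduce the actual Citadel JSON schema. The point is only that the application needs nothing beyond a URL and prior knowledge of the structure it will find there.

```python
import json
from urllib.request import urlopen


def parse_dataset(json_text):
    """Parse JSON text and return the list of records the app displays.

    The 'records' key is a hypothetical stand-in for the real data model.
    """
    document = json.loads(json_text)
    records = document.get("records") if isinstance(document, dict) else None
    if not isinstance(records, list):
        raise ValueError("JSON does not follow the expected structure")
    return records


def load_dataset(url, timeout=10):
    """Fetch the JSON file at the given URL and return its records."""
    with urlopen(url, timeout=timeout) as response:
        return parse_dataset(response.read().decode("utf-8"))


# Offline demonstration with an in-memory document instead of a real URL.
example = (
    '{"records": [{"title": "Central Car Park",'
    ' "coordinates": "51.0543, 3.7174"}]}'
)
titles = [record["title"] for record in parse_dataset(example)]
print(titles)
```

In a real app, load_dataset would be called with the address of the hosted file each time the data is needed, so that updates to the file are picked up without reinstalling anything.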
From a functional standpoint, this means that for every original dataset published by a given
city, there is the need to also store a JSON version of the same file so that the AGT can read the
information from the Internet. This may appear to be an unnecessary proliferation of files, but
there are three important benefits to this new approach, all driven by different aspects of the
Citadel scenario:
A Citadel JSON file can be updated at any time; even live feeds can work as long as
the URL returns the expected JSON schema following the expected semantic structure.
Any city or user can add a new file respecting the same semantic schema and the
application will be able to read it and use the data, as long as it knows the URL: the
same application can be reused with no changes.
Any application developer can access the available JSON files for any purpose, simply by
knowing in advance what semantic structure to expect: the same data can be reused
with no changes.
[5] By 'live' we mean that, rather than having to download a file and then read the information, the application can
directly access the information from the server hosting the JSON file.
THE CENTRAL ISSUE OF SEMANTICS

This new scheme met with an enthusiastic reception from the pilot cities, but from the
standpoint of the ODC a doubt remained: was this only a way of defining the AGT schema as a
standard data model for Citadel, thus defeating the principle of the ODC as an open space
without standards? Is it possible to extend the Converter-AGT model to allow for other data
models and other types of applications? The answer to these questions lies in the original
template concept.
Each of the Citadel templates in fact is designed to visualize a specific kind of information
(events, parking, etc.) and thus each works with its own data model. We can thus imagine that
different templates or applications can be paired with their own set of JSON files with the data
structured the way they expect to see it, just as the AGT currently works with its own JSON files.
The standards negotiation processes can still take place, based on the co-existence of different
data models in different JSON files feeding information to different templates and applications.
The ODC concept is therefore still alive, if we consider the first Converter-AGT toolkit to be just a
first instantiation of it. The interesting thing to note is that in this process the semantic
dimension has become the driving force of the ODC model.
Figure 15. The ODC as a semantic model
The above diagram is in fact based on the above discussion, showing the ODC no longer as a set
of tools but rather as a set of JSON files (the green boxes) linked to different templates and
applications, all stored somewhere on the web and accessible through a URL. The tools are still
there (the dotted line below) but there is an important shift of focus: the common space is now
characterized by the way it accommodates datasets with different semantic models. Indeed, the
figure shows the different types of applications on the top (the generic template of the AGT,
the specific application templates developed in the first months of the project, third-party
templates, and even third-party applications and programs), each of which gets its data from
the datasets in the ODC that have been formatted in the way they expect to see it.
[6] As of October 2013. It should be remembered that this is a conceptual schema that is broader than what the pilot
cities actually tested, which instead had the Converter using only the generic data format required by the AGT.
From an operational standpoint, this approach requires an index function to keep track of which
datasets can be read by which applications, but once the pairing between a template or
application and one or more JSON files has been configured, the job is done [7]. The Citadel Index
in its current configuration only foresees pairings between apps generated by the AGT and
Citadel JSON datasets, so it does not yet reflect the open nature of the ODC model. How this
index function might evolve and scale up is one of the issues for future development of the ODC.
SEMANTIC CONVERGENCE IN THE PILOT CITIES

One of the key hypotheses of the ODC concept is that it provides an open framework within
which user-driven semantic convergence processes occur or, in the words of the project
workplan, 'convergence of the use of terms'. Although the pilot cities only tested the direct
Converter-AGT data model in successive versions of the Converter (and not the multiple data
models of the broader schema in the preceding section), the usage of the operational version of
the Citadel Converter in the pilot cities already demonstrated some interesting dynamics.
The first php prototype of the Converter converts a source spreadsheet file directly to the export
schema as required by the AGT (the generic template data model). This achieved its main
objective of bridging the gap between Excel spreadsheets and the applications requiring the
JSON, thus allowing pilot cities to get on with their activities of opening datasets and testing
applications. As stated elsewhere, however, the ease with which end users could convert
datasets and make an app with them also allowed them to experience the Open Data paradigm
end-to-end [8].
One of the most evident effects of this was that users could easily see that a poorly structured
spreadsheet with, say, different ways of abbreviating 'street' or a missing address, led to either a
failure of the conversion process or visible problems in the final result [9]. People in the pilot cities,
once it was clear to them that the cause of the problems they were experiencing was in the
datasets they themselves produced and not a bug in the Converter software, actually went
back and corrected or revised their datasets until the desired result was achieved.
[7] The exception to this is Discovery, a Citadel function which allows the user to move from city to city and the
application to automatically detect the presence of a new dataset with information relevant to that city, since
Discovery is essentially a dynamic configuration of the App-URL link that takes place through the Index. For the
purposes of this discussion, however, the semantics of the underlying data model remain the same.
[8] This may seem obvious, but very few civil servants have had the gratifying experience of seeing a dataset they have
published actually used in a mobile application.
[9] Cleaning up source data is one of the main costs of traditional Open Data initiatives. The original workplan for
Open Data activities envisaged the engagement of citizen groups in this lengthy process, but the dynamics described
here have proven far more effective.
This educational aspect of the Citadel converter was further highlighted in the introduction of
the second version, which divided the conversion process into two steps: semantic mapping
and export schema. This new structure was mainly intended to support further developments,
but it also had the effect of raising awareness of the importance of semantics among end users. At
first some complained that it appeared to be a duplication of effort. Then the pilot participants
gradually saw the usefulness of an intermediate step in which the terms they used ('name' or
even 'titre' for 'title') are mapped onto a standardized vocabulary, and that the output format
required for the AGT was just one of many possible ways in which their data could be used, once
the semantic mapping to standardized terms had been carried out.
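The separation between the two steps can be sketched in a few lines of Python. The vocabulary, the two export schemas and all field names below are hypothetical and chosen only to show the principle: once the source headings have been mapped to standard terms, the same row can be exported to more than one data model.

```python
# Step 1: semantic mapping from local column headings to standard terms.
# The vocabulary is a hypothetical example, not the Citadel vocabulary.
SEMANTIC_MAP = {
    "name": "title",
    "titre": "title",
    "title": "title",
    "adresse": "address",
    "address": "address",
}

# Step 2: export schemas, one per target data model (both hypothetical).
EXPORT_SCHEMAS = {
    "generic_poi": {"title": "title", "address": "address"},
    "parking": {"title": "parkingName", "address": "location"},
}


def semantic_mapping(row):
    """Rename recognised column headings to standard terms; drop the rest."""
    standard = {}
    for heading, value in row.items():
        term = SEMANTIC_MAP.get(heading.strip().lower())
        if term is not None:
            standard[term] = value
    return standard


def export(standard_row, schema_name):
    """Rename standard terms to the field names of the chosen data model."""
    schema = EXPORT_SCHEMAS[schema_name]
    return {field: standard_row.get(term, "") for term, field in schema.items()}


row = {"Titre": "Parking Centre", "Adresse": "1 rue de la Gare"}
standard = semantic_mapping(row)
print(export(standard, "generic_poi"))
print(export(standard, "parking"))
```

Adding a further output format then only requires a new entry in the export schemas, leaving the semantic mapping untouched.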
It is hard to overestimate the impact of this engagement of civil servants, and how building a
sense of ownership of a dataset in the person who generates it (generally seen as a source of
trouble and not a resource) can be the best way to ensure quality. The uniqueness of the
Citadel approach is to actively empower the people who create datasets in the first place,
influencing their behaviour by showing the consequences of sloppy data, directly rewarding
good data with a working app, and thus promoting the convergence of behaviour patterns
towards common standards of practice.
Once the Converter had reached a point of stability and was in active use by the pilot cities, it
was useful to explore how it could be adapted to different data models and different applications,
in order to steer the process from the pragmatic Converter-AGT toolkit prepared for the pilots
towards the multiple-standard approach of the ODC model as originally conceived.
These enhancements effectively open up the conversion process to other options such as Open
Street Map, geoJSON, the CitySDK APIs, etc. Since they were developed in the final stages of the
project, they were not fully tested in the pilot cities, though they nonetheless demonstrate the
flexibility of the Citadel approach and the possibility of migrating from the original
Converter-AGT toolkit towards the multi-standard ODC concept that has inspired the developments of the
Open Data Commons throughout the project.
THE ODC AS A SEMANTIC FRAMEWORK

On the basis of these experiences, we can say that through the Citadel piloting and development
processes, the ODC has evolved according to a path that alternated conceptual modelling with
concrete solutions:
First, the ODC was a conceptual model of an open, multi-standard ecosystem.
Next, the ODC became the Converter-AGT toolkit, designed to overcome a specific
problem identified by the pilots.
Finally, a series of alternative data models and conversion scenarios were explored,
framed by the original ODC framework, extending the Converter-AGT toolkit to re-gain
the goal of an open system.
In this process, ODC development was shaped by the semantic issues that eventually defined a
core, a-standard (in the sense of not requiring standards) semantic framework that is at the
heart of the model.
Figure 16. The Semantic Core of the ODC
This schema considers that the common space is driven by the presence of three elements,
JSON files with pre-defined semantic structures (provided by one or more applications
and registered in the Index),
Converter tools that carry out the necessary semantic mapping from the unstructured
CSV files towards the structured JSON files,
CSV files with any semantic structure (namely with whatever choice and sequence of
column headings the original user defines),
plus the index which keeps track of a) where the CSV files are and where they come from; b)
data models used by applications and the JSON files that conform to them; and c) which
converters can be used to produce which data models.
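To make the three roles of the index more concrete, the sketch below models them as a small in-memory registry. It is not the Citadel Index: the class, its method names, the data model labels and the example URLs are all hypothetical, and a real index would need persistence and a public interface.

```python
from dataclasses import dataclass, field


@dataclass
class OdcIndex:
    """Toy in-memory index; all names and structures are hypothetical."""
    csv_sources: dict = field(default_factory=dict)  # (a) CSV URL -> publisher
    json_files: dict = field(default_factory=dict)   # (b) data model -> JSON URLs
    app_models: dict = field(default_factory=dict)   # (b) application -> data model
    converters: dict = field(default_factory=dict)   # (c) converter -> data models

    def register_csv(self, url, publisher):
        self.csv_sources[url] = publisher

    def register_json(self, data_model, url):
        self.json_files.setdefault(data_model, []).append(url)

    def register_application(self, app, data_model):
        self.app_models[app] = data_model

    def register_converter(self, name, data_models):
        self.converters[name] = set(data_models)

    def datasets_for(self, app):
        """JSON files an application can read, via its data model."""
        return list(self.json_files.get(self.app_models.get(app), []))

    def converters_for(self, data_model):
        """Converters able to produce files in the given data model."""
        return sorted(name for name, models in self.converters.items()
                      if data_model in models)


index = OdcIndex()
index.register_csv("https://example.org/parking.csv", "City A")
index.register_json("generic_poi", "https://example.org/parking.json")
index.register_application("agt_app", "generic_poi")
index.register_converter("csv_converter", ["generic_poi"])
index.register_converter("geojson_converter", ["geojson"])
print(index.datasets_for("agt_app"))
print(index.converters_for("generic_poi"))
```

The pairing between applications and datasets is resolved through the data model, so several data models can be registered side by side without any of them being privileged.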
The various tools developed above and below this core interact through it, allowing for different
standards to co-exist by providing the semantic framework that matches data models to the
applications that can use them, forming an open semantic ecosystem.
This open ecosystem, upon closer inspection, consists of the very tools that were originally
conceived of as populating the Open Data Commons, considered as the shared space in the
public domain. Indeed, the services and tools shown are all reusable, generic components,
whose semantic interoperability is guaranteed by the fact that they can speak to and get
information from the semantic core.
In addition, this schema fulfils the original idea of the ODC as a negotiation space for user-driven
convergence towards standard semantic structures. As stated above, any data model can
be registered in the index together with the conversion tools to create the necessary files. At
the same time, however, users are likely to converge on the data models with a greater number
of JSON files available, so long as they meet their needs. This encourages the development of
user-driven standards both for specific data models for precise requirements (e.g. restaurant
menus) and for data models of general relevance (e.g. POIs), with the balance being
gradually defined within this operational semantic framework.
Figure 17. The Open Semantic Ecosystem
The model of the open data ecosystem above contains nearly everything developed to date in
Citadel: what, then, is outside of the Open Data Commons? The fact is, Open Data systems in
the real world are ultimately fed by 'real' (in the sense of not necessarily Open Data) office
systems, files, and services on the one hand, and are used to contribute to the development of
'real' (in the sense of normal use, not only finding a parking place) applications on the other. This
is easy to forget, since most Open Data discourse to date seems to take place in a separate
world from our daily life of interacting with ICT systems. The end objective of Open Data, at
least in the Citadel perspective, is to become part of the 'real world', simply as an efficient way
of addressing interoperability issues when linking different data sources to applications, as
illustrated in Figure 18.
This scenario is conceivable only with a massive uptake of the Open Data paradigm, which
Citadel considers to be a possibility enabled by the ODC with its semantic core. At least in the
context of this book, we can say that the definition of the semantic core has been the key
enabling mechanism for unlocking the Open Data Commons, and remains the driving force of
the concept.
Figure 18. The ODC in the 'Real World'
PRIVACY AND THE OPEN DATA COMMONS

GENERAL FRAMEWORK
Citadel reflections on privacy issues commenced in January 2013. At that time, the EC proposal
for a general reform of data protection legislation in Europe was already available [10]. This
included a Communication, stating the EC policy objectives, and two legislative drafts: a
Regulation setting out a general EU framework for data protection, and a Directive on
protecting the personal data processed for the purposes of prevention, detection, investigation
or prosecution of criminal offences and related judicial activities. Therefore, our analysis started
by outlining the current legislative setup in the EU as far as privacy protection was concerned,
and followed on by analysing the impact of the upcoming legal provisions on the framework
that informed both the Open Data and the AGT development activities in Citadel. Incidentally,
that analysis is still up to date, as neither the Regulation nor the Directive has entered into force
as yet. The former in particular, in order to become law, has to be adopted by the EU Council of
Ministers using the ordinary legislative procedure (i.e. co-decision mechanism). On this we note
that the Conclusions of the 24-25 October 2013 summit of EU27 heads of State and
Government committed to a "timely" adoption of the new Regulation, which was supported by
the European Parliament in its plenary assembly of 12 March 2014, with 621 votes in favour, 10
against and 22 abstentions. On the same day, the EP also expressed its support for the
Directive with 371 votes in favour, 276 against and 30 abstentions.
Among the key findings of Citadel's early analysis, two were particularly notable, as they seemed
to point in the same direction as the newly proposed legislation:
• The existence of compelling socio-economic reasons to speed up the process of opening up
  and cleansing government owned datasets, particularly at the local level. Such reasons refer
  to the definition of a sustainable value chain for the (commercial and non-commercial)
  reuse of Public Sector Information, including Open Data as a specification thereof. In fact,
  one of the key aims of the upcoming reform is the provision of a single
  legislative framework for businesses and citizens, in order to promote the digital market and
  ultimately economic growth in Europe;
• The need to tackle privacy protection issues upfront, not as an afterthought: in
  particular, the 'Privacy by Design' concept means that data protection safeguards should be
  built into new open data based applications and services from the earliest stage of
  development. This concept is also invoked by the principles of the new legislation. In Citadel,
  it has led to the definition of a number of 'Privacy as a Service' scenarios for the Open Data
  Commons, including the possibility of logging Open Data related transactions as well as the
  appropriate management of access rights as a function of specifically designed metadata
  structures.
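To make the two 'Privacy as a Service' ideas mentioned above more concrete (transaction logging and access rights driven by metadata), the following is a minimal sketch in Python. It is an illustration under stated assumptions, not the Citadel implementation: the class names, the `allowed_groups` metadata field and the in-memory log are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record attached to a dataset in the Commons."""
    dataset_id: str
    owner: str
    # Groups allowed to read the dataset; "public" means anyone (assumed convention).
    allowed_groups: set = field(default_factory=lambda: {"public"})


@dataclass
class AccessLog:
    """Append-only log of Open Data transactions (kept in memory in this sketch)."""
    entries: list = field(default_factory=list)

    def record(self, user, dataset_id, granted):
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "dataset": dataset_id,
            "granted": granted,
        })


def request_access(user, user_groups, meta, log):
    """Decide access from the dataset metadata and log the attempt either way."""
    granted = ("public" in meta.allowed_groups
               or bool(set(user_groups) & meta.allowed_groups))
    log.record(user, meta.dataset_id, granted)
    return granted


# Example: an open city dataset and a dataset restricted to one group.
log = AccessLog()
parking = DatasetMetadata("city-parking", owner="city")
members = DatasetMetadata("club-members", owner="club", allowed_groups={"bridge-club"})
print(request_access("alice", ["bridge-club"], members, log))
print(request_access("visitor", [], parking, log))
print(request_access("visitor", [], members, log))
```

In this sketch every request is logged, including refused ones, so that a data owner could later see who tried to use a dataset; a real service would also need persistent storage and authentication, which are left out here.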
However, the analysis carried out at the time also showed the existence of a latent tension
between the promotion of the commercial (and/or non-commercial) use of Public Sector
Information and an excessive protection against the risk of personal data disclosure that did not
take into account the blurring of traditional distinctions between data holders and collectors,
application providers and final users. Such tension is epitomized by the so-called 'Citizen
Developer' profile: a natural person engaged in the processing of open data with the support of
Citadel application templates. This person is empowered to create, own and control his or her
own datasets and applications, and share them with other participants in the Open Data
Commons on terms that are set and negotiated as need be. For instance, a local bridge club
member can be incentivised to publish online the list of fellow members, together with their
home addresses, to make it easier to find a location for the next game, through a newly
developed app that mashes up the bridge club list with the official dataset of city parking
facilities. Unfortunately, the same list, once made public, could be of interest to some IT-savvy
burglar, who could note which homes will be left unattended during the next club meeting.
While possibly trivial as an example, it shows how significant and irreversible the
unwanted consequences of privacy carelessness can be, even in the case of prior approval by
personal data owners, and despite the fact that nobody acted with profit making purposes in this
scenario (except perhaps the burglar). This is, however, anything but an occasional risk in the
Citadel world: in fact, it was a precise mission of our project to facilitate the realisation of open
data driven applications by non-expert users through the Open Data Commons resources. And the
objection that publishing someone's home address goes beyond open data is weak, for at least
two reasons: a) many other public sources may already have disclosed the association
between a certain name and a specific address, and b) there could be a shared interest in
running the privacy risks of this disclosure, for instance to make known to other potential
members the existence of an active bridge club in the city with practical venues for playing in
the same neighbourhood, or to invite other IT-savvy people to co-create a mash up service
showing all the right PoIs (Points of Interest) and useful connections at hand.
Footnote 10: http://ec.europa.eu/justice/newsroom/data-protection/news/120125_en.htm
As a matter of fact, due to its seeming irrelevance for digital business, this case is not
considered by the upcoming legislative reform. According to Art. 3 of Directive 95/46/EC, its
provisions do not apply to the processing of personal data by a natural person in the course of
a purely personal or household activity. The new Regulation's Art. 2 repeats the same concept.
Certainly, the Citizen Developer figure complies with the first part of the definition (being a
natural person) but not necessarily with the second qualification, given the possibility offered
to him/her by the Open Data Commons either of adding data to an initial City PoI collection or
of making improvements to an existing application provided by someone else. Paradoxically, if
someone started at some later point to charge (probably little) money for the mash up service,
all the legal consequences of past privacy carelessness would fall on the last link of the
value chain, although this could also be taken as a (clumsy, yet innovative) example of open
data exploitation for business purposes.
With this case description, we do not intend to imply that Community Law should consider
adding to its scope the activities of Citizen Developers, but only reinforce the importance of
preventive privacy assessments, rather than corrective or punitive actions. A relatively
short-term scenario, also driven by the likely popularisation of the Open Data Commons, will see a
growing number of applications relying heavily on citizens' own datasets, if not also on user
generated improvements, according to the Open Source or Living Lab logic. There, the risk of
involuntarily merging personal data that are relevant to the newly targeted service with
irrelevant personal data is high and must be considered upfront.
THE PRIVACY IMPACT ASSESSMENT FRAMEWORK

As a first level of response to the need for preventive actions, a Privacy Impact
Assessment Framework (PIAF) has been developed in Citadel. Generally speaking, a PIA is a
managerial process that helps an organisation identify and remove or reduce the privacy risks of
a certain project, initiative or IT system. To this end, the PIA analyzes the way(s) personal data
and information are collected, stored, protected, shared and managed by and within the
organisation. For instance, Art. 33 of the proposed EU Regulation obliges organisations to
conduct a data protection impact assessment where processing operations present specific
risks to the rights and freedoms of data subjects. Despite the change in terminology, the
requirement is unequivocal, also in light of the previous Recommendation of May 2009 on
privacy in RFID applications [7] where the EC asked Member States to make sure that industry,
in collaboration with relevant civil society stakeholders, develops a framework for privacy and
data protection impact assessments. This framework should be submitted for endorsement to
the Art. 29 Data Protection Working Party. As far as RFID based innovations are concerned, the
Working Party formulated an Opinion on the framework in February 2011, welcoming the
explicit inclusion of a stakeholder consultation process as part of the internal procedures
needed to support the execution of a PIA. It also observed that a PIAF should aim to
promote Privacy by Design, better information provision to individuals as well as transparency
and dialogue with competent authorities [3].
In order to be effective in fulfilling those requirements, a PIAF should allow systematic detection
and monitoring of how the privacy of involved people is affected by the proposed project,
initiative or IT system. Historically, several distinct approaches have been developed and carried
forward in a number of countries, including Australia, Canada, Ireland, New Zealand, UK and US.
In 2002, the Canadian government became the first jurisdiction to make PIA mandatory for
government bodies [19]. The EC-funded project PIAF (A Privacy Impact Assessment Framework
for data protection and privacy rights) reviewed existing PIA methodologies in the above
mentioned countries with the most experience in PIA, identifying the principal similarities and
differences between the different PIA guidance documents and the best practice elements that
a successful PIAF should include [20]. Most of these elements are mentioned in Art. 33 of the EC
Regulation, including the requirement for data controllers to seek the views of data subjects or
their representatives on the intended processing, without prejudice to the protection of
commercial or public interests or the security of the processing operations. Such consultation
allows gathering inputs on stakeholder perceptions of the severity of each privacy risk and the
possible measures to mitigate it. This inclusive approach implies that, despite the process level
commonalities (most PIA handbooks comply with the workflow described in the Figure below),
the outputs and outcomes of individual instantiations of the process may differ considerably
from one another.
Figure 19: Stepwise PIA process (source: [20])
Currently, a PIA standardisation effort is ongoing at the ISO/IEC Joint Technical Committee No.
1/SC 27 IT Security techniques [9], but its results will only be made known in 2016 or later. The
ISO/IEC 27002 standard for IT systems security already includes privacy protection. Yet, despite
this claim, it leaves privacy policies and measures unspecified. Therefore, no single agreed PIA
procedure or guideline exists at the moment and the Citadel PIAF only adds to a number of
concurrent methodologies and approaches. In particular, current PIA schemes follow a risk
assessment approach, aiming to minimize the risk of privacy breaches and their consequences
for a particular organisation. In Citadel, this is complemented by an empowerment
approach, whereby communities and citizens can have greater control over their own
information, thus also helping to lower the risks for whoever manages it.
PRIVACY TYPES
The original contribution of Citadel to the theoretical and practical debate on PIAs comes from
the distinction between three types of privacy: Community, Application and Data level. These
have gradually emerged from the iterative activities done across the project tasks during the
past couple of years. This distinction becomes relevant in three respects: first, because it is
commonly agreed that a fully functioning PIA should deal with all types of privacy within their
respective scope; second, because with the introduction of an Open Data Commons it is no
longer clear who the data controllers are, to whom the liability for privacy protection should be
attributed by law; third, because the practical measures to embed privacy
concerns into the design vary considerably with the specific nature of the privacy issues
tackled.
These three typologies of privacy are only partly overlapping, as the following picture shows:
Figure 20: Typologies of privacy in Citadel
Community level privacy can be defined as the way this concern is perceived and assessed by
the stakeholders potentially affected by it. As the PIA process outlined in Figure 19 documents,
it is essential for data controllers to make sure they understand the distinct interests and
arguments of the people and organisations involved in their community of reference, as far as
the management and the potential risk of disclosure of personal data and information is
concerned. In fact, since its early stages the Citadel project has promoted the
constitution, in each of the pilot Cities, of so-called Open Data Governance Groups (ODGGs),
consisting of the key stakeholders interested in the opening up and cleansing of public (normally
local government owned) datasets. One of the key topics of discussion within the ODGGs
has inevitably been how to deal with the privacy implications of the utilization of open datasets,
particularly in association with non-profit activities. The resulting Open Data Charter template
is meant to include a specific section on the terms and conditions of privacy protection, with its
contents reflecting the specific outcomes of the thematic discussion in the Cities.
In turn, Application level privacy can be defined as the extent to which applications such as
those experimented with in Citadel deal with user information by either disclosing or protecting it.
Apps can collect significant information about users and their devices, often without their
knowledge or permission. It is quite rare that comprehensive information in clear and plain
language is provided to new users about the features of a given app, what information will be
accessed by whom and how it will be used or to whom it will be disclosed. Merely offering a
single 'Accept' or 'Install' button is unlikely to support valid user consent. In February 2013, the
Art. 29 Data Protection Working Party formulated an opinion on the security and privacy risks
associated with the use of applications and proposed a set of recommendations to each of the
different players in the marketplace [2]. During the Citadel project, a number of application
templates as well as a generic resource called Application Generation Tool (AGT) were
developed and published on GitHub and the Citadel Hub. Ideally, each of these tools can help
generate innumerable apps (as they already have) with only a few differences across the various
possible datasets, locations and utilisations. Therefore, it makes sense to define the privacy
policy of each application template as well as of the AGT.
Finally, Data level privacy is a concept that has been developed during the project as a result of
the reflections and experimentations done as mentioned above. It can be defined as the
qualification of a single data item (or row in a dataset) in terms of whether it can be safely
disclosed without causing any harm to the original data owner. Of course, being an
attribute of the single data entry, it cannot be assigned by anyone other than the data
owner, nor can it be changed from public back to private (see footnote 11). Obviously, the specific
attribute of a data item affects the quality of the dataset it belongs to and each transformation
thereof. For example, a JSON file created out of existing CSV or similar with the Citadel
Converter (which is another free tool of the project) should in theory preserve the same data
level privacy attribution as the source of this transformation.
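As an illustration of what preserving the data level privacy attribution across a transformation could look like, here is a minimal Python sketch. It is not the Citadel Converter's code: the `privacy` column name, the 'private by default' rule and the helper functions are assumptions made purely for the example.

```python
import csv
import io
import json

PRIVACY_FIELD = "privacy"  # hypothetical column name, not a Citadel convention


def csv_to_json(csv_text):
    """Convert CSV text to a JSON array, carrying each row's privacy flag over.

    Rows without an explicit flag are treated as private (an assumed safe default).
    """
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        flag = (row.get(PRIVACY_FIELD) or "private").strip().lower()
        if flag not in ("public", "private"):
            raise ValueError(f"unknown privacy flag: {flag!r}")
        row[PRIVACY_FIELD] = flag
        rows.append(row)
    return json.dumps(rows, indent=2)


def publishable(json_text):
    """Return only the items that their owner marked as public."""
    return [item for item in json.loads(json_text) if item[PRIVACY_FIELD] == "public"]


def set_privacy(item, new_flag, requested_by, owner):
    """Only the data owner may set the flag; public cannot revert to private."""
    if requested_by != owner:
        raise PermissionError("only the data owner can set the privacy attribute")
    if item[PRIVACY_FIELD] == "public" and new_flag == "private":
        raise ValueError("a published item cannot be made private again")
    item[PRIVACY_FIELD] = new_flag
    return item


sample = (
    "name,address,privacy\n"
    "Club venue,Main Square 1,public\n"
    "A. Member,Elm Street 5,private\n"
)
converted = csv_to_json(sample)
print([item["name"] for item in publishable(converted)])
```

The design choice worth noting is that the flag travels with each row rather than with the file, so any later mash-up can still filter on it item by item.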
The following three sections deal separately with these privacy typologies, also in relation to
the prospective impact of the forthcoming legislative reform described earlier in this book.
Taken together, they form the three distinct conceptual elements of the Citadel PIAF, as shown
by the table below. Each of these aspects implies specific governance issues that will be
discussed in the perspective of their being instantiated in the agreements supporting Citadel's
Open Data Governance Groups. In this way, the PIAF contributes to ongoing PIA standardization
efforts, by further highlighting the communitarian (not only organizational) dimension of
privacy management, even in a context like that of Open Data, which at first sight poses few
(if any) challenges to the protection of personal information.
Table 4. Summary of the three levels of the Citadel PIA framework

Level: Community
  Focus: City leading Open Data Governance Group(s)
  Goal: City level PIA
  Issues: Beyond risk assessment in organisations to community stewardship of open data policies
  Proposal: PIA embedded in Open Data Governance Charter as a multifaceted framework to
    highlight emergent privacy issues

Level: Application
  Focus: Professional Developer (or Citizen Developer)
  Goal: App level PIA
  Issues: Citadel scenarios leading to multiple authorship and personal data mash-ups with
    potentially unforeseen outcomes
  Proposal: Open Data Commons, AGT and templates with privacy policy embedded by design
    (scope for more privacy as a service features)

Level: Data
  Focus: Individuals generating data items
  Goal: Data level PIA
  Issues: Adequate information on who is using personal data and guarantees that individual
    privacy requirements will be respected
  Proposal: ODC Index based licensing system (explicitly dealing with converted datasets) based
    on the Creative Commons analogy for supporting Privacy lifestyle decisions

Footnote 11: This statement could soon be reversed, according to the results of the debate on the 'right to be forgotten'.
PRIVACY AT THE COMMUNITY LEVEL

PERIODIC SURVEYS OF THE PILOT CITIES
During the project's lifetime, the key stakeholders of the four pilot cities of Athens, Ghent,
Issy-les-Moulineaux and Manchester have periodically been surveyed by the Citadel consortium, to
gather - in a cost efficient and effective manner - inputs and experiences on open data
publication and exploitation as well as related issues and concerns.
The first survey was launched in April-May 2012 on a platform managed by CORVE and LOLA. It
was based on the principles of the Citadel Statement (http://bit.ly/citadel-statement), which
was presented at the 'Flemish Conference on local eGovernment' on December 14th, 2010 in
Ghent. The Citadel Statement was a call for a European policy supporting cities and
municipalities in their implementation of the eGovernment action plan, which received the
support of numerous organisations across Europe.
Following the analysis of survey results, a number of key performance indicators were
proposed, in order to provide a more precise definition of a viable benchmark for use in the
remainder of the Citadel project. The concerns about Privacy have been addressed in KPI-1
Legal and Policy Frameworks, as the latter can be a catalyst for open data publication and
exploitation against the constraints of privacy and security, and in KPI-4 Scope of Guidance for
local policies, which was requested to include: principles, standards, privacy, security, filtering,
data formats, data definitions, repositories, and channels of open data publication.
In January 2014, an online survey was administered to member and non-member organizations
(as well as individual persons) to assess the relevance and progress of governance related issues
in the emergent Open Data scenario. The survey was anonymous. The questionnaire started by
introducing a list of possible actors of the city open data governance system, namely the
following:
• Mayor, City government
• City ICT Department
• Public Data providers
• Private Data providers
• Software companies
• Citizen developers
• User communities
• Citizens and visitors
and the interviewed panel was asked to select which role was (or should be) most appropriate for
each category, choosing from the following options:
a) Being informed and consulted - the actors are kept up to date on developments and
consulted when general strategies or plans need feedback;
b) Participating in decisions - actors contribute to specific decisions on how to implement
an Open Data strategy;
c) Active, leading role - they are directly involved in the concrete implementation of the
Open Data strategy;
d) Not relevant - none of the above options is applicable to the actor group at hand.
As far as the definition and enforcement of privacy related policies are concerned, the
respondents assigned a crucial role of leadership to the City government and ICT department,
and to a lesser extent to the public and private data providers. As far as the remaining
stakeholders are concerned, while information and consultation is always welcome during the
process, a slightly more participative role was invoked for the Citizen Developers only, as the
following diagrams show:
Figure 21: Stakeholders' role in the definition of privacy policies (Citadel members)
Figure 22: Stakeholders' role in the definition of privacy policies (Citadel non-members)
A third survey was run in conjunction with the Evaluation activity and it was specifically
designed to capture detailed feedback about the Citadel tools and their usage. At that time, the
Citadel tools (Converter plus AGT) were already available and thus the prospects for a much
more open approach to Open Data were also evident. Over 100 people participated in the
experiment, thus providing an adequate statistical base for extrapolating results from the
answers received to the online questionnaire. It is relevant to note here that the opinions
expressed on privacy related issues seemed to differ very much across the interviewed panel.
On the one hand, some respondents rightly affirmed that the privacy regulations in force have
little or no impact on the process of opening up the public datasets belonging to the pilot cities.
While this may well be true within the Citadel partnership, in other cases
there is evidence of privacy issues and concerns building a sort of psychological barrier against a
faster and more widespread implementation of open government and data principles. The Open Data
Charter has been named as a viable solution to make people aware of possible breaches or
consequences of opening up data.
On the other hand, it should be noted that the upcoming reform of EU data protection legislation
reinforces the responsibilities of data and service providers, placing a heavy compliance burden
on them. As was argued in the previous section, this reform seems to endanger the status
of the Citizen Developer lying at the heart of the Citadel vision, even though natural persons as
such do remain out of the scope of the privacy legislation. Again, the Open Data Charter is seen
as the right place for some ad hoc provisions in the direction of privacy management.
More generally, the judgement concerning the current and prospective framework and
guidelines on data protection was mixed. While most respondents were aligned with the
principles and directions of the EU initiative, others thought that the real issues of concern were
not properly addressed, e.g. that social norms about privacy are shifting over time, and that the
exploitation context proposed by the Citadel project, in which published datasets are used by
citizens for non-profit purposes, is worthier of trust than mere commercial exploitation.
A closer look at the findings from the questions in this survey specifically related to privacy, in
the light of the three-level Citadel PIAF, instead provides an explanation of these apparent
contradictions.
• The highest risk people see in not opening data is missed opportunities for useful
  applications and services. Therefore, people see the value of open data and are expecting
  someone to solve privacy issues upfront, as we suggest in the Citadel PIAF.
• People want to be able to manage their own data in more detailed ways than foreseen by
  EU legislation, for instance allowing access to specific groups (44%), prohibiting commercial
  use (42%) and so forth.
• People are very concerned about what happens to their private information; fears of
  inappropriate disclosure, unacceptable use, and insecure storage are the top three
  concerns, all above 60%.
• Surprisingly, over 70% of respondents (many of whom work within city administrations) don't
  know whether or not their City has even published a PIA. Since people are indeed
  concerned about privacy, the PIA appears to be considered more as a question of
  compliance than as a process that guarantees what they're looking for.
• A specific question confirms the result of the previous survey, namely that people trust
  municipal authorities, and in particular municipal IT departments, more than anyone else and
  don't trust suppliers to the government or the Communication and PR office.
• While City governments may be trusted to manage privacy issues, they should not act on their
  own but rather follow guidelines and policies decided upon through public consultation.
ANALYSIS OF OUTCOMES
Alongside the periodic surveys, some discussions on privacy issues can be reported within the
Open Data Governance Groups in the four pilot Cities. Overall, the results in the pilot cities echo
the last external survey: people are concerned about privacy but the PIA looks insufficient to
meet these concerns. However, during the project the contribution of the ODGGs has stayed
well below initial expectations. While everyone agrees that the ODGGs can be defined as a (sort
of) knowledge broker on the challenges and barriers of opening and using government data,
the input received on privacy and data protection issues has been limited. In particular:
1) An Early Stage Approach has been invoked, to integrate best practice around privacy and
data protection right at the start of any City release of Open Data;
2) Trust and Confidence building are deemed essential, although the question remains of how
to create and maintain them in local citizens and businesses;
3) External (Third Party) Control was also required, for instance through appointing an official
Ethical Advisor to help the local authority oversee privacy and data protection matters.
In addition, as noted above, the initial concept of Open Data Charter proposed in
the project gradually evolved from a standard protocol to be adopted and formally signed by
the interested City stakeholders, to an alternative, more flexible structure akin to a set of
guidelines that focuses more on the respective roles and contributions of governance group
actors together with specific indications on how privacy is to be managed.
TOWARDS A COMMUNITY PIA

This evidence, together with the recommendations for an inclusive and consultative
preparation process, leads us towards integrating an on-going Privacy Impact Assessment
exercise into the mandate of Open Data Governance Groups, accompanied by a broader scope
and tools to manage privacy in a more objective and transparent manner (as we shall see in the
following sections). Advantages of so doing include:
• The opportunity to ignite a specific discussion on privacy among the ODGG actors to explore
  the consequences of data opening and utilisation, and especially to design scenarios where
  the Open Data Commons acquires value-adding features such as Privacy by Design or
  Privacy as a Service at the community level;
• The possibility to establish a more favourable regime in the City, as far as data protection is
  concerned, for those datasets and applications which do not fall (or not easily so) within
  the provisions of extant and forthcoming legislation;
• The chance of bringing these aspects to the attention of public decision makers with even
  greater relevance and urgency with the prospective diffusion of Open Data Commons
  facilities, such as the Converter (facilitating the publication of own data by individual
  citizens) and the App Generator (facilitating the 'own app creation' scenario for non IT-savvy
  or non-expert users).
PRIVACY AT THE APP LEVEL

MAPPING OF CITADEL APPS
As reported above, the project has developed five application templates across four consecutive
iterations; they are available at http://demos.citadelonthemove.eu/ and summarized
in the following table. All the templates are open source and released
under the New BSD (BSD 3-Clause) license (http://opensource.org/licenses/BSD-3-Clause).
Table 5. Mapping of Citadel Application Templates

Find a Parking Lot
  Athens: http://demos.citadelonthemove.eu/parking-athens/
  Ghent: [...]onthemove.eu/parking-gent/
  Manchester: [...]onthemove.eu/parking-manchester/

Events in the City
  Athens: [...]onthemove.eu/events-athens/
  Ghent: [...]onthemove.eu/events-gent/
  Issy-Les-Moulineaux: [...]onthemove.eu/events-issy/

Points of Interest in the City
  Athens: [...]onthemove.eu/poisathens/
  Ghent: [...]onthemove.eu/poisgent/
  Issy-Les-Moulineaux: [...]onthemove.eu/poisissy/

User Generated Points of Interest
  Athens: [...]onthemove.eu/crowd-sourcing-athens/
  Ghent: [...]onthemove.eu/crowd-sourcing-gent/
  Issy-Les-Moulineaux: [...]onthemove.eu/crowd-sourcing-issy/

Environmental Data
  Athens: [...]onthemove.eu/environment-athens/
  Manchester: [...]onthemove.eu/environmentmanchester/
In short, Find a Parking Lot provides information about the parking lots of a given city. The city
is configured in the back end of the application. The first page of the application presents all the
available parking lots on a map that is centred on the centre of the selected city. Events in the
City displays the events of a city on a map or list view and helps the user get information on the
types of events (s)he is interested in. The first page of the application presents a map of the city
centre, just as in the Find a Parking Lot template. The events are always geo-localized, so those
that take place near the city centre are displayed first. Points of
Interest in the City is a general application to display any kind of PoIs on the map of a chosen
city. Every PoI is assigned to one or more categories, e.g. Museums, Transportation, etc.
The template offers a filtering functionality that uses all the categories of PoIs found in the
given dataset and provides users with a list of checkboxes corresponding to those categories.
User Generated Points of Interest is a template that provides user-generated PoIs of a given
city and other crowd-sourced information. The city is configured in the back end of the
application. The first page of the application presents a map that is centred on the centre of the
selected city. Users can select different categories of user-generated PoIs to be shown.
Different colours of pins represent different categories of PoIs. Finally, Environmental Data
provides information about the environmental data and, in the future, the traffic and
transportation data of a given location in the city.
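The category filter described for the Points of Interest template can be sketched in a few lines of Python. This is only an illustration of the idea (derive the checkbox list from the dataset, then keep the PoIs matching the ticked boxes); the dictionary layout with `title` and `categories` keys is an assumption and does not reproduce the actual Citadel data format or template code.

```python
def collect_categories(pois):
    """Return the sorted set of categories found in the dataset (one checkbox each)."""
    return sorted({category for poi in pois for category in poi["categories"]})


def filter_pois(pois, selected):
    """Keep the PoIs that belong to at least one of the ticked categories."""
    selected = set(selected)
    return [poi for poi in pois if selected & set(poi["categories"])]


# Hypothetical sample data: every PoI has one or more categories.
pois = [
    {"title": "City Museum", "categories": ["Museums"]},
    {"title": "Central Station", "categories": ["Transportation"]},
    {"title": "Tram Museum", "categories": ["Museums", "Transportation"]},
]

print(collect_categories(pois))
print([poi["title"] for poi in filter_pois(pois, ["Museums"])])
```

Because the checkbox list is computed from the data itself, the same template can be pointed at a different city's dataset without any change to the filtering logic.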
With these templates at hand, also made available on GitHub (https://github.com/citadel-eu),
every Citizen Developer can potentially create their own mobile applications, linking to the
open public datasets made available by the respective local governments (and communities, as
is the case for user generated points of interest). As Table 5 shows, at least two pilot
cities have taken on the task of feeding each template with (at least simulated)
data.
Additionally, and with the purpose of making life easier for the less IT-savvy Citizen Developers,
the project has delivered an App Generation Tool (AGT), available online at the following URL:
http://www.citadelonthemove.eu/en-us/createanapp/applicationgenerationtool.aspx
At present, over 100 European Cities are subscribed as data publishers in the AGT and over 500
mobile apps have been created in about one year. About one tenth of all apps created are
multi-city, namely they concretely demonstrate potential for reuse, and about one in four are
multi-data, namely they provide valuable information to their users through data mash-ups.
ANALYSIS OF IMPLICATIONS
The argument that EU legislation on personal data protection does not
apply to Citadel application developments is based on the fact that, with only one partial
exception (i.e. User Generated PoIs), all the templates in Table 5 are totally dependent on
government data from different sources. Therefore, the project has been successful in
promoting new and original, if not innovative, ways to exploit the publication of open public
datasets for basically non-commercial purposes by the private sector and particularly the
Citizen Developers.
However, the Citadel vision itself, configuring an active role for people in the development of
applications, as well as in the generation of data (like PoIs and other crowdsourced
information), introduces scenarios for privacy management that are only partially foreseen by
current and pending legislation.
Based on the periodic reports from the ODGGs in the Citadel pilot cities, the main privacy risks
to be considered at application level can be listed as follows:
a) Geolocalisation. The User Generated Points of Interest are inevitably georeferenced on
the City map, which can create a feedback loop with personal information.
Additionally, to ensure better performance, each Citadel app template may require the
communication of the exact location of the user, which can considerably facilitate
his or her identification by third parties13.
b) Shared Access. It is unclear whether the apps developed with the support of the Citadel
AGT should be considered for private use only. The notion of private use might include
the creation of small closed groups (including e.g. friends or relatives), though it is
unlikely that a group registration system would be added in that case. As a matter of
fact, when shared with other users or linked on Web 2.0 communities, the app can
disclose and publicize personal information (on user localization, behaviour and related
aspects) to a broader audience than originally planned or desired14.
c) Extension in Scope. In principle, the app templates provided for a baseline 100% open
and public datasets scenario could be profitably reused for other purposes, where
underlying datasets, still owned by third parties, may be private or confidential or just
not authorised for such particular use. A variant of this extension in scope can occur if
and when the user decides to mash-up the open and public with his/her own generated
datasets (like in the example of the bridge club above).
13 Actually, one uses the localisation feature of HTML5, which therefore has to do with the browser more than the
app. But the app has to be authorized to work with it anyways.
14 Not surprisingly, some Associate cities (eg. Amsterdam) have asked for "private" ODC spaces.
d) Crowdsourced Data Inputs. Depending on the way an (extended) app is configured,
users can be called (or tempted) to add their own datasets or complete/integrate the
data items in the existing ones. This may facilitate personal identification or change in
the level of privacy protection of used datasets.
e) Open Source Software Improvements. The Citadel app templates are already available
on Github. According to the OSS logic, developers can bring improvements to the
existing release(s) under the condition of free redistribution. However, depending on
the nature of the app or template, this can reinforce or reduce the level of privacy
guarantee.
f) Recurrent need for user consent. For the reasons expressed above, it might be
necessary to repeat the request for user consent even after the specific application has
been installed, for instance whenever a new user enters the community or does specific
actions with and on the app.
PROPOSAL FOR AN APP PIA FRAMEWORK

Most of the previous problems emerge because of an improper management of the
implications of data and application ownership. The conventional approach of data protection
regulators rightly focuses on the clarification of privacy policy from the perspective of mobile
app developers in order to strengthen the information basis in support of user consent.
However, this approach takes the following principles as its starting point:
• That app developers are juridical persons, usually profit-making organisations
• That the personal data under scrutiny only belong to the individual user
• That the app features are consolidated and do not change much over time.
As the examples provided above demonstrate, these principles are no longer valid in the Citadel
world, where
• The app developers can be natural persons (citizens) or not-for-profit entities, e.g. social
networks or associations
• The datasets used may belong to several owners and be subject to different policies (of
openness and permission to reuse), including no policy at all
• The app features are subject to continuous revision by multiple parties at the same
time, according to the OSS logic.
In the early stages of the Citadel project, a taxonomy of data/application pairings was proposed
in order to make room for such variations on the theme, in the perspective of existing and
upcoming EU legislation on data protection. At that time, we identified nine possible instantiations (use
cases) worth consideration in terms of liability for data providers and app developers. That
taxonomy is proposed again here, with modifications, in the following table, where the
use cases of interest have now become eight.
Table 6. Taxonomy of Data/Application Pairings in Citadel
| | Data is stored locally on the app (be it one's own or a 3rd party's) | Data is shared with peers in a closed group through 3rd party's app | Data is peer shared or made public through one's own app | Data is made public through 3rd party app |
|---|---|---|---|---|
| Person owns data / Person produces data | 1. Out of the scope of current and future data protection legislation, as no disclosure occurs | 2. In the scope for the app provider, provided it acts professionally, grey zone otherwise | 3. Out of scope, but awareness of citizen developer should be promoted somehow | 4. In the scope, if app is professionally run, grey zone otherwise |
| Person uses 3rd party data | 5. Out of scope, but common sense should be used (eg diligence in custody) | 6. Out of scope for the individual, provided data is for personal use or he/she acts within the group | 7. Out of scope for the individual, within the limitations posed by data owner | 8. Out of scope for the individual, as data captured was already public |
1. Person owns/produces data, which is stored locally on the application. Example: annotating
events in the agenda of a cellular phone or tablet PC. This case (highlighted in green colour)
should remain out of the scope of the current and future legislation on data protection, being
referred to personal use only. No difference should be made by the circumstance that the app
used was bought from a third party or directly produced by a citizen developer for his/her own
purposes.
2. Person owns/produces data and shares it with peers in a closed group using a 3rd party app.
Example: the person sends an email to a friend or chats on Facebook or other social network
about a certain topic. The contents of this communication are private, but what if the receiver
discloses them without a prior consent from the sender? The infringement of privacy would be
certain, not yet its legal relevance (unless a crime is committed due to this disclosure). An early
warning (or a periodic reminder) from the system might be desirable, however, in order to
minimize this risk. From the perspective of the developer of that app (normally a 3rd party, but
there could be some exceptions), since the group is closed, liability for any privacy infringement is
limited if a registration procedure existed that provided for the collection of user consent before
entering the system. This is the prior consent required by EU Directive 95/46/EC to be formally
and explicitly given by the owner (or subject) before any data treatment.
3. Person owns/produces data and peer shares it or makes it public through his/her own app.
Example: a self-developed application that spreads information about energy consumption of
home appliances for monitoring and benchmarking purposes. Given the full identity between
data owner and application developer, pre-emptive consent to data sharing should be
considered as embedded in the system. However, taking into account the risk that data
publication may lead to third party appropriation for other uses (e.g. commercial) not expressly
authorised by the owner, if not for the commission of crimes, it would be advisable to promote
better awareness of these risks by the user in some way. Here the provisions of current and
future data protection legislation do not apply, but a role could be identified for a specific
service in the Open Data Commons, for instance.
4. Person owns/produces data and makes it public using a 3rd party app. This may be the case
of a service (residing in the cloud, for instance) for the benefit of car drivers in a city centre
that shows their respective locations in order to promote e.g. the exchange of parking spaces by
those going in and out of the town. Disclosing a trivial piece of information like one's own car
plate number can have unwanted consequences, which should be the object of prior caveats
and informed consent. However, an alternative action would be to embed Privacy as a Service
within the application, so that each user can select the acceptable level of personal data
protection in relation to the scope and purposes of the service required.
5. Person uses 3rd party data, which is stored locally on the application. If private information
belonging to any third party is stored on local devices only, then we might presume it is only for
the individual use of that person. Example: a standard phone directory or address book in
someone's cellular phone or tablet PC. While the case is not relevant for the current and future
legislation on data protection, it may become of interest for criminal law if a fraudulent use
materializes. Without going that far, common sense recommendations like adding a password
and regularly changing it can be appropriate, to minimize the risk of involuntary disclosure.
6. Person uses 3rd party data, which is peer shared in a closed group. Example: information
that is acquired on Facebook. This can be reused without limits in the same context (eg.
Facebook) but some behavioural rules (Netiquette?) should possibly be adopted. Use of this
information only for personal purposes (provided they are legal) is also allowed. Another
question might be whether the disclosure of such information outside the borders of the group
would be allowed. Here the answer would certainly be negative.
7. Person uses 3rd party data, which is peer shared or made public by the data owner through
an application developed by him/her. Example: a searchable repository of knowledge provided
to registered users (peer shared) or to the general public by the owner of that knowledge.
Depending on the limitations posed by the data owner, reuse may be free or subject to
conditions. However, the provisions of data protection legislation may not apply.
8. Person uses 3rd party data, which has been made public on 3rd party applications. Example:
downloading information from a repository of open datasets. Here the public nature of
information used is already clear from the start. Therefore, no infringement of privacy law could
be foreseen.
The two grey zones identified in points 2 and 4 above refer to the case of the citizen developer,
who most probably lacks the nature of a juridical person and is thus difficult to bring with
certainty within the scope of application of privacy legislation. Here, two contrasting interests
are also visibly in action: one to promote the transformation of this spontaneous initiative into a real
business with potentially huge returns, the other to avoid that, when this happens, it is already
too late to protect individual users against a voluntary or involuntary disclosure of the embedded
personal information.
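Read as a decision table, the taxonomy of Table 6 can be encoded as a simple lookup. The following Python sketch is only an illustration: the enum and function names are hypothetical, and the short assessments paraphrase the table cells rather than state any legal conclusion.

```python
from enum import Enum


class Relation(Enum):
    OWNS_OR_PRODUCES = "person owns/produces data"
    USES_THIRD_PARTY = "person uses 3rd party data"


class Channel(Enum):
    LOCAL = "stored locally on the app"
    CLOSED_GROUP = "shared in a closed group through a 3rd party's app"
    OWN_APP = "peer shared or made public through one's own app"
    PUBLIC_THIRD_PARTY = "made public through a 3rd party app"


# (use case number, assessment) per cell; the wording paraphrases Table 6.
TAXONOMY = {
    (Relation.OWNS_OR_PRODUCES, Channel.LOCAL): (1, "out of scope"),
    (Relation.OWNS_OR_PRODUCES, Channel.CLOSED_GROUP): (2, "in scope for the app provider"),
    (Relation.OWNS_OR_PRODUCES, Channel.OWN_APP): (3, "out of scope, promote awareness"),
    (Relation.OWNS_OR_PRODUCES, Channel.PUBLIC_THIRD_PARTY): (4, "in scope"),
    (Relation.USES_THIRD_PARTY, Channel.LOCAL): (5, "out of scope, use common sense"),
    (Relation.USES_THIRD_PARTY, Channel.CLOSED_GROUP): (6, "out of scope for the individual"),
    (Relation.USES_THIRD_PARTY, Channel.OWN_APP): (7, "out of scope for the individual"),
    (Relation.USES_THIRD_PARTY, Channel.PUBLIC_THIRD_PARTY): (8, "out of scope for the individual"),
}

# Cases 2 and 4 become a grey zone when the app is not professionally run.
GREY_ZONE_CASES = {2, 4}


def assess(relation, channel, professional_app=True):
    """Return (use case number, assessment) for a data/application pairing."""
    number, assessment = TAXONOMY[(relation, channel)]
    if number in GREY_ZONE_CASES and not professional_app:
        assessment = "grey zone"
    return number, assessment


print(assess(Relation.OWNS_OR_PRODUCES, Channel.CLOSED_GROUP, professional_app=False))
```

The professional_app switch stands in for the distinction between a professionally run app and one produced by a citizen developer, which is where the two grey zones arise.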
By identifying several cases as out of scope of the proposed EU legislation, we do not intend to
imply any shortcoming of the proposed reforms nor propose any extension of the normative
framework. On the contrary, Citadel holds that the above dilemma requires a soft regulation to
be designed in the context of the principles of the Open Data Commons and the multi-level
PIAF, namely by linking the out-of-scope issues at the application level to the other two
levels identified:
• At the Community level, by collectively reviewing app developers' privacy policies and how
they are implemented on a regular basis, with more frequent and clear reminders of privacy
risks to app users, in the context of local ODGGs' PIA exercises.
• At the Data level, by allowing application developers to have greater certainty as regards the
privacy implications of the datasets they are accessing. This is the subject of the following
section.
PRIVACY AT THE DATA LEVEL

PRIVACY IN THE OPEN DATA COMMONS
As argued in the previous section, it seems quite unlikely that the data
protection vs. valorisation dilemma can be fully solved through a more accurate handling of application level
privacy within a Community PIA exercise. While valuable, this approach needs to be
complemented by a mechanism through which to embed privacy concerns in the Open Data
Commons by design.
The way it has been conceived and offered to European stakeholders' attention, the Open Data
Commons is not meant to be (or become) a single, giant information silo where someone,
typically a Leviathan acting as cloud service provider, collects and stores resources in the
perspective of future utilization by registered or even unregistered customers. It is rather a
new kind of network or a data and application ecosystem, in which all actors are peers,
interacting on a level playing field and where every connection (be it person-to-person, person-to-business, or business-to-business) is peer-to-peer, i.e. with no middleman in between.
In this ecosystem, no single entity has full access to or stores all the data, which is only linked to
and from the location of its original source(s). Additionally, individuals are empowered to
create, own and control their own datasets and applications, and to share them with other
participants in the Open Data Commons on terms that they set and negotiate, as need be.
Embedding privacy by design in the Open Data Commons (ODC) implies making sure that a list
of Privacy as a Service features are baked in from conception, possibly including the following:
1. Data Identification: Every single data item (and as a consequence, any dataset containing it)
should always be associated to its source and ownership15. Every dataset transformation (eg.
converting, merging, purging, cleansing, etc.) should be able to preserve this attribution and
communicate it transparently to ODC actors.
15 Two fields of metadata that are already mandatory in the Citadel Index.
2. Real time Updating: Every time the source system adds, deletes, or changes a record in a
dataset, this should in theory be immediately reflected in the availability of the new,
transformed dataset on the ODC. How this translates into practice will need to be seen in future
developments, but the issue assumes a specific urgency in the light of privacy considerations.
3. Data Classification: The owner of each dataset (data item) should always be able to classify it
as "open and public" or "private and restricted in use" according to a certain licensing
mechanism. A facility on the ODC should help users associate the latest
license with each dataset (data item) and visualise it before usage16.
4. Data Anonymization: Another facility on the ODC should enable data owners to anonymize a
dataset, cleansing it of any reference to specific individuals or organisations during a conversion
process. This can be implemented in future versions of the Citadel Converter.
5. Transparency and Control: Every ODC actor should be entitled to make real time searches on
the log files, discovering who has inspected, appropriated or transformed any datasets
available. In perspective, this should lead to the possibility of assessing whether any ownership
rights have been broken illegally or without justification.
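Two of the features listed above, anonymization that keeps attribution intact and a searchable log of who did what to which dataset, can be pictured with a minimal Python sketch. It is not part of the Citadel software: all class, field and function names are hypothetical, and the list of personal fields is assumed to be supplied by the data owner.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class DataItem:
    values: dict
    source: str  # where the item comes from (feature 1)
    owner: str   # who owns it (feature 1)


def anonymize(item, personal_fields):
    """Return a copy without the listed personal fields (feature 4).

    Source and ownership are carried over unchanged, so attribution
    survives the transformation.
    """
    cleaned = {k: v for k, v in item.values.items() if k not in personal_fields}
    return DataItem(values=cleaned, source=item.source, owner=item.owner)


@dataclass
class AccessLog:
    entries: list = field(default_factory=list)

    def record(self, actor, action, source):
        """Track an event such as 'inspected' or 'transformed' (feature 5)."""
        self.entries.append({
            "actor": actor,
            "action": action,
            "source": source,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def who_touched(self, source):
        """Let any ODC actor search the log for a given dataset source."""
        return [e for e in self.entries if e["source"] == source]


poi = DataItem({"name": "Ann", "lat": 51.05, "lon": 3.72},
               source="http://example.org/pois.csv", owner="Example City")
log = AccessLog()
public_poi = anonymize(poi, {"name"})
log.record("converter", "transformed", poi.source)
print(public_poi.values, log.who_touched(poi.source)[0]["actor"])
```

Keeping the log keyed by source rather than by dataset copy reflects the idea that data is only linked to and from the location of its original source.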
Given the extremely varying nature of the linked datasets and the early maturity stage of the
ODC implementation, it can be more productive to stay focused on process level innovations
such as those above17 as a contribution to the roadmap for future development.
ANALYSIS OF IMPLICATIONS
At present, the ODC prototype environment as developed within the Citadel project is made
up of two main components:
• A collection of online datasets (i.e. accessible via their URL) that contain data in a format
(i.e. platform and data structure) that can be accessed remotely by at least one template or
third party application without any further conversion;
• A unique Index that includes a listing of a) online datasets converted as described, including
where and how they can be accessed, and b) the relevant template and application data
formats that are compatible with the files.
In the future roadmap outlined in Citadel, the ODC (together with additional enhancements to
the Index) offers a distinctive opportunity to fulfil the requirements of embedding Privacy as a
Service into an Open Data ecosystem.
First and foremost, basic logging has been implemented for the Converter and the Index, so
that key events are tracked such as the registration of new datasets (by whom and how),
accesses by templates, configurations used, etc. Further enhancements of these features
can considerably promote the transparency and control requirement, provided that all
actors can effectively access the log information contained therein.
Secondly, recent work with the Citadel Converter has explored features that contribute to
the requirements listed above: besides allowing the ad hoc transformation of multiple file
formats into JSON files, compatible with one or more data models, on-the-fly conversion
scripts have also been tested. This can lead to enhancements that will allow the Converter
to directly access an original data source (or a regularly copied data dump) and only save
the configuration info to the ODC, thus better preserving the integrity of attribution.
16 A CC license field is also mandatory in the Citadel Index, at the dataset level.
17 As compared to exploring privacy implications of different kinds of information, i.e. financial, health, etc.
Thirdly, the initial project policy based on developing application templates has been
considerably altered by the extremely good performance of the Application Generation
Tool, which has allowed the generation of dozens of original applications but also has the
limit (in its current version) of using only one data model that includes all fields from the
PoI, parking, and event templates. A different roadmap can be that of developing and
storing new application templates, able to visualize types of datasets that go beyond the
Citadel application scenarios. This can significantly contribute to realizing the open and
interoperable vision of the ODC in an incremental fashion.
Finally, the Citadel platform (the first common project space where both datasets and
application templates have been made publicly available) will continue being open and
accessible after the end of funded activities. This allows anyone to publish
georeferenced datasets and immediately visualise them on a map on their mobile phone.
As already demonstrated by the Associate city outreach activity, this has great potential for
motivating the publication of datasets beyond those of the project partners. However, the
functionalities of data classification and (if required) anonymization need to be enhanced.
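The attribution-preserving conversion discussed above can be pictured with a minimal Python sketch. It does not reproduce the actual Citadel Converter or its JSON data model: the function name, the metadata layout and the example URL are assumptions made for illustration only.

```python
import csv
import io
import json


def convert_csv_to_json(csv_text, source_url, owner, license_id):
    """Convert CSV text to a JSON string that carries its attribution along.

    The 'metadata' block is a hypothetical layout; it mirrors the source,
    ownership and license fields that the text describes as mandatory.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    document = {
        "metadata": {"source": source_url, "owner": owner, "license": license_id},
        "items": rows,
    }
    return json.dumps(document, indent=2)


sample = "title,lat,lon\nTown Hall,51.05,3.72\n"
converted = json.loads(
    convert_csv_to_json(sample, "http://example.org/pois.csv", "Example City", "CC-BY")
)
print(converted["metadata"]["source"])  # attribution survives the conversion
```

Because only the source URL and configuration are needed to repeat the conversion, a sketch of this kind is compatible with saving just the configuration info rather than a copy of the data.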
PROPOSAL FOR A LICENSING MECHANISM

We conclude this section by formulating a technical proposal for a data classification scheme
and a licensing mechanism. Its purpose is to provide a framework for qualifying data sensitivity
upfront, i.e. right at the moment a data item is created by the respective owner. This proposal
provides the foundation for establishing protection profile requirements and use allowances for
each class of data, irrespective of the application that handles or transforms it.
Table 7. Proposed data privacy classification scheme
| Protection level | Consequences of disclosure | Sample data (non exhaustive list) | Required action(s) |
|---|---|---|---|
| 0 (Limited or none) | Information intended for public access | City PoIs; Official statistics; Environmental data | Acknowledge source |
| 1 (Moderate) | Association with personal identity of data owner(s) | Geolocalisation; Personal contacts; Unique device identifiers | Ask permission |
| 2 (Contractual) | Non-compliance with terms and agreements set out by data owner(s) | License keys; Electronic subscriptions; Access logs to services | Pay against usage |
| 3 (Sensitive) | Disclosure of sensitive personal information about data owner(s) | Health records; Political opinions; Sexual orientations | Anonymize |
| 4 (Confidential) | Privacy breach, likely criminal offense | Legal files; Credit card data; Commercial records | Destroy |
The way this scheme should work is the following:
• At data item level, the system (most likely a Privacy as a Service resource in the ODC, like
the aforementioned Index) should enable a user to add a metadata field, akin to a Creative
Commons license18, that clarifies the extent to which it will be possible to copy, distribute,
and make some use of that data item, either non-commercially or for business-related
purposes;
• At dataset level, the data item with the highest protection level should determine the whole
dataset's qualification and classification. However, it should become possible to delete
some data items in order to create a new dataset with a lower protection level. In other
words, the classification is confined to the data; it does not extend its scope to any dataset
created with this data, provided the data itself is not manipulated;
• At application level, the risk of privacy breach is reduced to zero if a particular intelligence is
added to the system, which attributes to e.g. data mash-ups the highest protection level of
all the datasets used for that specific purpose, asking e.g. the permission of the user only if and
when necessary (of course, all the caveats and provisions of the data protection legislation
should remain valid);
• At community level, five events would be particularly relevant and interesting to explore:
1. What happens after the data owner has e.g. given the permission to use that data in a
certain context? For sure, the licensing metadata should trace this circumstance. But is
this enough to authorize future reuse, in other contexts than the former? Here the
answer should be "probably not". However, it would be hard to prevent unauthorised
reuse once e.g. someone's phone number has been published on the web for the first
time. Therefore we might think of a lighter approach, which allows free reuse of a data
item (for instance, by lowering protection level from 1 to 0 permanently) under the
only condition that the owner should be informed whenever that data is used again.
2. What happens if a third party (like another user) manipulates a dataset by coherently
adding new records to it? In this case, each additional data item should bear its own
metadata and again the protection level of the dataset would match the highest
protection level of any single data item contained therein. The outcome is the same if the
addition was made by an application, rather than a human being. At least in principle,
an algebra of data transformations could be devised in that case: e.g. if a new data item
is the sum of two, the result should automatically bear the protection level of the
addend with the highest protection level.
3. Of course, any conversion of a dataset into a different format should not determine any
change in the predefined privacy attribution. For example, a JSON file created out of
an existing CSV or similar file with the Citadel Converter should preserve the same data
protection level as the original source of this transformation.
4. What if there is a mistake in a dataset proposed by someone, which is corrected by
someone else? The case is not that relevant per se, because it should be treated as the
previous in this list (with the new data item bearing the licensing metadata attributed
by its owner), but more as an example of conflict between false positives and false
negatives, which severely affect the world of data19. Probably the best way to solve a
potential impasse is to keep track of all versions of a certain dataset, in order to allow
the recovery from mistakes in correcting mistakes.
18 For more information see https://creativecommons.org/licenses/?lang=en
19 By false negative we intend a data item discarded because it is seemingly wrong, while it is true, and by false
positive a data item which is considered true, while it is wrong.
5. Could a user change his/her mind and modify the attribution of a certain data item from
a previous, lower level to a higher level of protection? While the answer is certainly
"yes" in principle, this would prove impossible to do in practice after the protection has
been set at 0 for the first time. Following this train of logic, we might infer that once
the user has received sufficient advice for informed consent to data publication, this
decision should be presented as irreversible before collecting his/her approval on it. The
lesson learnt from this case is that EU privacy regulators should reinforce the procedure
of information delivery prior to user consent collection.
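The propagation rules described above (the highest item level determines the dataset level, deleting items can lower it, and a mash-up or a sum inherits the highest level involved) can be sketched in a few lines of Python. The names below are hypothetical and the sketch only restates the rules of Table 7 and of the preceding list; it is not an existing ODC component.

```python
from enum import IntEnum


class Protection(IntEnum):
    LIMITED_OR_NONE = 0
    MODERATE = 1
    CONTRACTUAL = 2
    SENSITIVE = 3
    CONFIDENTIAL = 4


# Required action per level, as listed in Table 7.
REQUIRED_ACTION = {
    Protection.LIMITED_OR_NONE: "acknowledge source",
    Protection.MODERATE: "ask permission",
    Protection.CONTRACTUAL: "pay against usage",
    Protection.SENSITIVE: "anonymize",
    Protection.CONFIDENTIAL: "destroy",
}


def dataset_level(item_levels):
    """The data item with the highest level determines the dataset's level."""
    return max(item_levels, default=Protection.LIMITED_OR_NONE)


def mashup_level(*datasets):
    """A mash-up (or a sum of items) bears the highest level of its inputs."""
    return max((dataset_level(d) for d in datasets),
               default=Protection.LIMITED_OR_NONE)


def without_items_above(item_levels, ceiling):
    """Delete the items above a ceiling to obtain a lower-level dataset."""
    return [level for level in item_levels if level <= ceiling]


levels = [Protection.LIMITED_OR_NONE, Protection.MODERATE, Protection.SENSITIVE]
print(dataset_level(levels).name, REQUIRED_ACTION[dataset_level(levels)])
```

A format conversion would simply pass the list of item levels through unchanged, which is why it needs no function of its own in this sketch.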
The proposed framework, however, poses important challenges to the Open Data Commons and
all similar ecosystems where important collections of data and applications would be made
available. In particular, the ODC should both provide the framework for the collective definition
of privacy guidelines related to applications and datasets, and be an integral part of the
governance of open data as a public good.
The above cases, which derive from the way applications use data, all have to do with the
dynamics of applications during their use. And here we should remember that high data
protection levels (if unjustified, of course, in relation to envisaged uses) prevent the
development of the digital market and ultimately economic growth in Europe, two major goals
of the incumbent and upcoming EU legislation. Therefore, an adequate privacy scheme should
be closer to a licensing scheme, in the sense that it is not just a question of "see" vs. "don't see"
but a more articulated issue of "what happens to my data". In other words, as for the
classification scheme above, it is at the level of the individual data item that a license
mechanism should be applied, as the following table shows:
Table 8. Proposal for a Data Licensing Mechanism (based on the CC scheme)
| Abbr. | Meaning | Status | Use for Open Data |
|---|---|---|---|
| PR | Privacy Restriction | Proposed as new umbrella | Framework for derogation to data protection level 1 (see Table 7) |
| PR-BY | Re-use with attribution (by) | Existing in CC | Data item can be re-used but link to source (of data item, either directly or indirectly) must be maintained |
| PR-SA | Share Alike | Existing in CC | Data item can be re-used but same license must be applied to derivatives (eg. when aggregating data or data mining) |
| PR-ND | Non-derivative | Existing in CC | Data item can be re-used but cannot be modified (ie. no aggregation or data mining) |
| PR-NC | Non-commercial | Existing in CC | Data item can be re-used but not for any commercial purposes |
| PR-NI | Non-identifiable | Proposed | Data item can be used but without identification of data generator/owner |
| PR-NP | Non-position | Proposed | Data item can be used but not in association with the location |
19 (continued) For a discussion of related issues in the world of big data see: http://jeffjonas.typepad.com/jeff_jonas/2011/02/sensemaking-on-streams-my-g2-skunk-works-projectprivacy-by-design-pbd.html
A given individual, either at the moment of providing data or enabling a device to generate
private data, could thus assign a PR license to each data item (e.g. a positioning reading), for
example PR SA ND NC NI. The way such a license should be applied varies according to the way
data is captured20:
• Volunteered data by people who explicitly share information about themselves through
electronic media, for example when someone creates a social network profile or enters
credit card information for online purchases;
• Observed data captured by third parties while recording activities of users (in contrast to
data they volunteer); examples include Internet browsing preferences, location data when
using cell phones or telephone usage behaviour;
• Inferred data from the analysis of personal data belonging to the previous categories. For
instance, credit scores are calculated based on a number of factors relevant to an
individual's financial history. (This is a derivative form of data capture, only allowed if the ND
restriction is not present; in the case of SA, the aggregated data item should maintain the same
license).
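To make the licensing logic more concrete, the following Python sketch models a PR license as a set of flags and checks the two rules just stated for inferred data (ND forbids it, SA carries the license over). The class and function names are hypothetical, and what license an inferred item receives when SA is absent is not specified in the proposal, so the sketch leaves it undetermined.

```python
from enum import Flag, auto


class PR(Flag):
    """Restrictions of the proposed PR license (see Table 8)."""
    BY = auto()  # keep the link to the source
    SA = auto()  # derivatives carry the same license
    ND = auto()  # no derivatives (no aggregation or data mining)
    NC = auto()  # no commercial use
    NI = auto()  # no identification of the data generator/owner
    NP = auto()  # no association with the location


def parse_license(text):
    """Parse a string such as 'PR SA ND NC NI' (or 'PR-SA') into PR flags."""
    tokens = text.replace("-", " ").split()
    if not tokens or tokens[0] != "PR":
        raise ValueError("a PR license must start with 'PR'")
    flags = PR(0)
    for token in tokens[1:]:
        flags |= PR[token]  # raises KeyError for unknown abbreviations
    return flags


def may_use(flags, commercial=False, identifies_owner=False, uses_location=False):
    """Check an intended use against the NC, NI and NP restrictions."""
    if commercial and PR.NC in flags:
        return False
    if identifies_owner and PR.NI in flags:
        return False
    if uses_location and PR.NP in flags:
        return False
    return True


def license_of_inferred(flags):
    """License of a data item inferred from one carrying the given flags."""
    if PR.ND in flags:
        raise PermissionError("ND forbids derived (inferred) data")
    # With SA the derived item keeps the same license; otherwise the
    # proposal does not say, so None marks 'not determined'.
    return flags if PR.SA in flags else None


reading = parse_license("PR SA ND NC NI")
print(may_use(reading, commercial=True), may_use(reading))
```

Representing the license as combinable flags mirrors the way the example license in the text is written, as a sequence of abbreviations attached to a single data item.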
The following table summarizes the required characteristics of ODC or similar environments to
align with the logic and implications of the proposed data licensing scheme.
Table 9. Proposal for a Data PIA Framework
| Data level | License handling | Compliant system | Agreements | Validation |
|---|---|---|---|---|
| Data item | Defines level of data protection thru PR license | Only accepts data items with license | Users can configure devices to generate data with selected licenses | User can always inspect data item in CSV (through the ODC Index) |
| Original Dataset (CSV) | Contains mix of data protection levels and a PR license as common denominator | Contains all data with all licenses, held by trusted administrator | Community (ODGG) can agree to assign basic licenses by default to non-licensed data items | Community (ODGG) defines terms for control of administrator activity while Index can handle the license aggregation |
| Dataset (CJSON) | Is compliant with specific level of data protection by the corresponding PR license | Contains only data items coherent with dataset license | Administrator guarantees coherence of Dataset license with data items | Users can trace datasets using data items through the Index logs |
| Application | Guarantees usage coherent with data protection level as stated by PR license | Only reads/accepts compliant datasets | Application developers agree to abide by licenses | Community (ODGG) can trace use made by applications via the ODC Index |
| Third parties (agency / company) | Guarantees usage coherent with data protection level and the PR license | Reports usage of datasets | Community (ODGG) can make specific agreements with third parties | Community (ODGG) defines terms for control of third party activity & configures Index accordingly |
We could consider this as an innovative "Data PIA Framework", to be embedded by design in
the future structure and functioning of the ODC itself.
20 We owe this distinction to [18].
Overall, the diffusion of this licensing mechanism may have important consequences, both in
terms of personal rights and commercial exploitation of open data. On the one hand, the license
helps data owners to keep track of who uses their data and when (retaining copyright and
credit if that is the case) while not preventing others from appropriating and manipulating it.
Differently from the Creative Commons license, ShareAlike would possibly be allowed, under
the condition of giving feedback to owners about the uses of their data, at least for commercial
purposes. On the other hand, particularly after a dataset has reached the protection level of 0
or has been licensed as PR-BY, app developers and other digital businesses would be facilitated
in their activities, being able to demonstrate that data legitimately belongs to the public
domain. In this way, the unwanted legal consequences of past privacy carelessness would no
longer be charged to the last link of the value chain.
MATURITY OF OPEN DATA GOVERNANCE

In the Citadel project, Common Data Charters (the plural referring to the intention to have one
per pilot city, then a single and final edition for the whole consortium) were defined as an
operational set of principles that will specify the common goals and principles for common
Data, as well as the rights and responsibilities for both data providers and data users. On the
side of data providers, this includes the principles of common data formats in the public
interest, as well as duties and obligations regarding privacy and identity management. For the
data users, this includes conditions for a) contributing to the collective assets of common SDK
and API components and b) rights to exploitation of data by compliant applications. Therefore,
the contents and prescriptions of Common Data Charters are inherently procedural.
This does not imply, however, that they also have an intrinsic contractual nature, namely to
engage the key stakeholders from each pilot city into reciprocal commitments related to
provision and usage of data and information. At least, this has been the way during the
project, the four Citadel pilot partners have been first trying to gain the commitment of the
most relevant socio-economic actors: propose drafts of Open Data Agreements (as Memoranda
of Understanding) and then collect as many signatures on them as possible.
Unfortunately, early feedback received from the respective local communities was that the
Citadel project was at too early a stage to propose a written agreement able to attract a
sufficient number of qualified stakeholders. This is also why we soon decided to abandon the
idea of a City-specific Memorandum of Understanding requiring formal signatures from the
participants in open data groups, opting instead for a looser, non-binding registration form,
provided as a Google Document, which is still available at
https://docs.google.com/spreadsheet/viewform?formkey=dDVtekh5YzBGVkxtcXpoYnd4eWNNT2c6MQ#gid=0
In so doing, additional degrees of freedom were allowed to prospective participants in the
Citadel project pilots, without losing track of the respective requirements in terms of data,
information and technological development.
CAPABILITY OF OPEN DATA ECOSYSTEMS

Now that the final objective is to issue a common Open Data Charter for the entire Citadel
community, the following considerations can be made:
First and foremost, flexibility is a value. While discussions within the Citadel consortium led
us to rethink the priority of signing a local Data Charter agreement, this has not reduced the
intensity of the pilot partners' efforts, as documented in the previous section, to deploy open
datasets on their respective infrastructure. In the meantime, very few Cities worldwide have
demonstrated appreciation for the value of such an approach to kick-start open data generation
processes in their respective areas of competence or interest. A relevant exception is the City
of Edmonton, Canada; see their MoU (Memorandum of Understanding) at
https://docs.google.com/a/edmonton.ca/viewer?a=v&pid=sites&srcid=ZWRtb250b24uY2F8b3Blbi1kYXRhLWNhdGFsb2d1ZS0yLTB8Z3g6MmUwMTMyNjhiNTBkZDNiOQ
Other Cities or public institutions, particularly in Europe, have used the MoU instrument to
establish strategic relationships with different levels of government (as in the case of the
Vienna Smart City agreement between the City Mayor and the Austrian Ministry of Infrastructure;
see the news about it at
https://smartcity.wien.at/site/die-initiative/strategie/smart-city-wien-neueinitiative-bundelt-krafte/),
with organizations and institutions at the same level (compare the Mid-America Regional
Council, which gathers 9 member counties from Kansas City, MO; see
http://cfakc.tumblr.com/post/60775247513/digital-innovation-in-government-resources), or with
leading think tanks and experts in the domain (for example, the four MoUs signed by the BBC in
2013; see the news at http://www.techweekeurope.co.uk/news/bbc-agrees-open-data132653). Other
Cities have preferred a more formal establishment of rules concerning Open Data governance,
such as through ad hoc legislation (for example New York City; see
http://www.nyc.gov/html/doitt/html/open/local_law_11_2012.shtml). Still others have issued ad
hoc licensing agreements (such as Goteborg, see
http://gbgdata.files.wordpress.com/2012/02/avtal-goopen-1-3-0-copy-eng.pdf, or Nantes, see
http://data.nantes.fr/licence/).
Second, the Open Data governance system outlined by the Citadel vision has the merit of
integrating all the local stakeholders belonging to the public sector's data and information
reuse value chain, which we first outlined at the beginning of this book and reproduced in
Figure 7 above. While the eight stakeholder typologies presented to the Citadel survey
respondents fully map (with more internal specifications) onto the four communities originally
displayed in that picture, there is an obvious need to clarify their respective roles and
contributions to the achievement of a common vision, and the reciprocal gains and benefits that
can derive from it. This need will be partly fulfilled in the remainder of this section.
The vision behind this ecosystem representation is that of a socio-technical environment, made
up of people, networks, institutions and technology artefacts, which co-determine the direction
and progress of (in this case) open data publication and use policies. In addition to
communication and collaboration activities among the four groups of stakeholders that make up
this ecosystem, a set of behavioural rules, resources and practices contribute to shaping the
main function that this environment has to deliver in order to survive: innovation. According
to [11], which draws on an extensive review of literature from various fields (economics of
innovation, entrepreneurship, sociology of technology and political science), there are seven
key capabilities to be enhanced for such systems to evolve and perform well in terms of
innovation:
1. Knowledge base creation, development and diffusion
2. Influence on the direction of search and investment processes
3. Entrepreneurial discovery and experimentation
4. Formation and support of new markets for innovation
5. Visioning and legitimization of a common future
6. Mobilization of resources (human, financial, etc.)
7. Development of positive externalities
In the following table, we provide a few examples of how different governance activities
contribute to improving the above capabilities. Some of these can well be enhanced by the
application of a MoU-style agreement, while others cannot, depending on historical and
cultural circumstances.
Table 10. Open Data Ecosystem capability matrix
| Ecosystem capability | Example from Citadel pilots | Governance system contributions to capability enhancement | Supported in Citadel vision? If so, how |
|---|---|---|---|
| 1. Knowledge base creation, development and diffusion | http://data.gent.be/datasets | Empowerment of citizens and mobile users to create own datasets. Publication of private, non-government data according to an open access approach that will enable its subsequent re-use by the public. | Citadel data converter |
| 2. Influence on the direction of search and investment processes | Local debates on published and to-be-published datasets to figure out new applications | Formulation of shared requirements for providing open data, including standards, formats, licensing approach etc. Empowerment of interested stakeholders from the bottom of the value chain to propose changes in open data policies and plans if reasonable. | Establishment of open data governance groups in the four pilot cities |
| 3. Entrepreneurial discovery and experimentation | | Empowerment of citizens and mobile users to develop own applications using open data. Fast prototyping, innovation and uptake of new templates for smart city services. | Citadel app templates |
| 4. Formation and support of new markets for innovation | http://data.gent.be/apps | Creation of new mobile apps relying on current and to be published open datasets. Promotion of application interoperability, rather than mere data convergence and/or integration. Privacy as an embedded service. | The ODC as a virtual brokering system that brings offer close to demand of open data and applications |
| 5. Visioning and legitimization of a common future | http://fr.amiando.com/Citadel_EN.html | Generation of political backing to the provision of open access to data. Understanding citizen needs in terms of innovation in public services. Creation of new partnership models of working and co-creating services between government and citizens. Monitoring of the impact and outcomes of making data and applications open and available. | MoUs, Open Data Charter |
| 6. Mobilization of resources (human, financial, etc.) | http://opendatamanchester.org.uk/ Hackathons, Open Data Days, etc. | Ability to bring together all relevant city stakeholders. Inclusion of domain experts and technical advisors to support specific parts of the process. Incubation and financial support of most relevant entrepreneurial initiatives. | MoUs, Open Data Charter |
| 7. Development of positive externalities | N/A | Provision to all interested stakeholders of a free access to existing knowledge base(s), against the only commitment to share and augment knowledge at equal conditions (reduction of uncertainty and costs of information acquisition). Migration of new apps from a city to another, following the common needs of the end users and the similarities of city datasets (induction of spillover effects, increased efficiency due to shared and incremental development). Scaling up/out of existing governance systems through targeted communication campaigns (and possibly the signature of MoUs) to extend the Citadel network to additional Cities and stakeholders. | The ODC as a holistic concept that takes further momentum and gains credibility across time and cities. |
The table should be read as follows: in the first column, we identify a number of capabilities
that a generic ecosystem should be able to demonstrate. Where relevant, we provide in the
second column some examples of these capabilities taken from the Citadel pilots. In the third
column, we list the enhancements that a well-established open data governance system should
offer to existing capabilities. Finally, the last column shows examples of these enhancements as
emerging from within the Citadel project and partnership.
What can be gathered from the table is a latent conflict between two opposed visions of how
the process of opening up data and promoting their utilization can be finalized and made more
effective through formal agreements: on the one side, there are some functions (like 1 through
4) that do not necessarily require formalization through city level MoUs unless there is a need
to attract and include in the process all of the key stakeholders belonging to the ecosystem at
hand. In fact, this is the experience that has emerged with strength from the Citadel pilots. On
the other side, the table lists a few additional functions (namely 5 through 7) where the utility of
a signed MoU can be validly argued.
We hereby propose to resolve the conflict in terms of a Maturity Model for the Cities that are
involved in this process. In the literature, several such models exist, serving heterogeneous
purposes that range from merely descriptive to evaluative to normative goals. In Citadel, we
decided to focus on the CMM (now a registered service mark of Carnegie Mellon University in the
US). The CMM, or Capability Maturity Model, is a five-level qualitative model assessing the
maturity of an organization with respect to software development processes [5]. Historically,
the first CMM was developed between 1987 and 1997 for the US Air Force. Prior to the
introduction of the CMM, organizations tended to emphasize the results of development rather
than how to improve the process. In principle, the five-level structure of the CMM and its
underlying logic can be replicated and applied to any other process, including the gradual
establishment of an Open Data Ecosystem like the one described in this book.
Instantiated to the Citadel socio-technical environment, the five CMM levels of a City could be
redefined as follows:
I. Accessible (e.g. when large sets of public and private data are provided free of charge
to consumers of content and developers of knowledge services in the city);
II. Inclusive (e.g. when all the major value chain stakeholders, including citizens as both
developers and users, are integrated in periodic consultations to express their individual
judgment and evaluation about the opening of data process);
III. Participatory (e.g. when a joint system of decision-making is permanently set up and
used to integrate local communities of data holders, service providers and users in
collective decisions regarding the design, implementation and evaluation of new
services and apps);
IV. Co-creating (e.g. when resources are in place that enable individual persons as well
as local entrepreneurs and larger companies to create new services by the mash-up and
orchestration of existing resources, application templates, or chunks of data);
V. Leader (e.g. when the city government and/or community become attractive leaders,
creed ambassadors, authoritative gurus and opinion catalysers for sustainable
innovation in public services through open data).
Unlike other maturity models, we do not necessarily see these five stages as steps of a
ramping-up pathway. In other words, our vision is neither to promote a once-and-for-all jump of
a city from level I to (say) IV by the introduction of an ODC instantiation, nor to assume that
a city needs to land on level IV before taking off towards level V. Our vision is more similar
to a spiral model, where progress can be incremental over time in all five maturity stages, and
a city or community may well experience several recurring cycles that go from I to V. As an
additional clarification, we can use the familiar 5-star deployment scheme for linked open data
introduced by Sir Tim Berners-Lee as early as 2006 (see
http://www.w3.org/DesignIssues/LinkedData.html). The following matrix simply maps the proposed
CMM against Berners-Lee's 5-star scheme, to demonstrate that (depending also on the starting
point) a city may well be a leader in open licensing of public data on the web, and still lag
behind in other, more advanced kinds of deployment. Presumably (although this would require an
empirical demonstration) the process evolves gradually across time, but it may also be subject
to quantum leaps or radical innovation experiments, here shown as a zig-zag pattern.
Table 11. A CMM for LOD (example)
Deployment levels (5-star scheme):
5 Stars (link your data to other data to provide context-related information)
4 Stars (use URIs to denote things, so that people can point at your data more easily and quickly)
3 Stars (use non-proprietary formats for data publication in machine-readable form, e.g. CSV instead of Excel)
2 Stars (make your data available on the Web in structured form, e.g. Excel instead of an image scan of a table)
1 Star (make your data available on the Web in whatever format under an open license)
City maturity levels:
I. Accessible City (e.g. when large sets of public and private data are provided free of charge to consumers of content and developers of knowledge services in the city)
II. Inclusive City (e.g. when all the major value chain stakeholders, including citizens as both developers and users, are integrated in periodic consultations to express their individual judgment and evaluation about the opening of data process)
III. Participatory City (e.g. when a joint system of decision-making is permanently set up and used to integrate local communities of data holders, service providers and users in collective decisions regarding the design, implementation and evaluation of new services and apps)
IV. Co-creating City (e.g. when resources are in place that enable individual persons as well as local entrepreneurs and larger companies to create new services by the mash-up and orchestration of existing resources, application templates, or chunks of data)
V. Leader (e.g. when the city government and/or community become attractive leaders, creed ambassadors, authoritative gurus and opinion catalysers for sustainable innovation in public services through open data)
Possible patterns: ** *** **** ***** ** *** **
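The idea that maturity level and star rating are two independent scales, and that a city can move back and forth along both, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the names MaturityLevel, Snapshot, star_deltas and is_zig_zag are hypothetical, and the example trajectory is invented rather than taken from any Citadel pilot.

```python
from dataclasses import dataclass
from enum import IntEnum


class MaturityLevel(IntEnum):
    """The five city maturity levels proposed in the text (I to V)."""
    ACCESSIBLE = 1
    INCLUSIVE = 2
    PARTICIPATORY = 3
    CO_CREATING = 4
    LEADER = 5


@dataclass(frozen=True)
class Snapshot:
    """State of a city at one point in time (hypothetical structure)."""
    maturity: MaturityLevel
    stars: int  # 5-star linked open data rating, from 1 to 5

    def __post_init__(self):
        if not 1 <= self.stars <= 5:
            raise ValueError("stars must be between 1 and 5")


def star_deltas(trajectory):
    """Change in star rating between consecutive snapshots."""
    return [b.stars - a.stars for a, b in zip(trajectory, trajectory[1:])]


def is_zig_zag(trajectory):
    """True if the star rating both rises and falls along the trajectory."""
    deltas = star_deltas(trajectory)
    return any(d > 0 for d in deltas) and any(d < 0 for d in deltas)


# Invented example: a city that completes part of a cycle, then starts a
# new one at level I with a different kind of data deployment.
trajectory = [
    Snapshot(MaturityLevel.ACCESSIBLE, 2),
    Snapshot(MaturityLevel.INCLUSIVE, 3),
    Snapshot(MaturityLevel.PARTICIPATORY, 5),
    Snapshot(MaturityLevel.ACCESSIBLE, 2),
    Snapshot(MaturityLevel.INCLUSIVE, 3),
]

print(star_deltas(trajectory))
print(is_zig_zag(trajectory))
```

Keeping the two scales as separate fields, rather than deriving one from the other, reflects the point made above that a city may lead on one dimension while lagging on the other.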
GOVERNANCE ROLES
Another important outcome of the pilot experiences has been a clarification of the distinct roles
played by the various stakeholders in an open data governance system. As we have tried to
demonstrate with the previous discussion, there can be different levels of maturity in this
system, which correspond to different intensities of engagement for those stakeholders.
However, in a mature community, all of them must be represented and actively engaged.
According to the survey results, there is considerable awareness of the need for stakeholder
representation among both the Citadel members and the non-members who responded to the survey.
However, the underlying (common) vision is still too centred on the City's ICT Department (as a
proxy for all the technical and domain experts who are certainly required to ignite and support
the process from within the local government), while the contribution of other stakeholders
sitting at later stages of the value chain is certainly appreciated, but probably with a certain
amount of lip service paid to it. The reason for this might be that clear rules and procedures
are lacking for defining the perimeter and scope of each stakeholder typology's involvement,
and the signature of a MoU might be a good solution to this impasse.
This aspect is also worth mentioning with respect to the contribution of the Mayor (and other
policy makers) to the process. In fact, particularly if and when no formal MoU has been signed
to regulate open data governance groups, political coordination becomes essential in order to
deliver legitimization and ensure the proactive and committed behaviour of all key participants.
The following table borrows from the questionnaire results to highlight the potential
contribution of a MoU to enhancing the participation and engagement of stakeholders in the open
data governance system.
Table 12. Ecosystem role definition and potential MoU contribution
| Role | AS-IS (from the questionnaire responses) | TO-BE (possibly through formal MoUs) |
|---|---|---|
| Defining Open Data strategies (what data to publish, ownership and property rights, pricing, etc.) | ACTIVE LEADERS: Mayor/City Government; City/ICT Department; Public Data providers. PARTICIPATING IN DECISIONS: [...] INFORMED AND/OR CONSULTED: Software companies; Citizen Developers; User communities; [...] | Participation of citizens as developers and users is essential for the full realization of the vision. Information and consultation are not enough to create awareness and build consensus and engagement. A holistic strategy is more efficiently drawn up and tested with the support of the whole constituency. This can also contribute to the full adherence to the ODC concept with all its embedded functions and services. |
| Defining technical and quality standards for Open Data (platforms, security, data and semantic standards, data quality, etc.) | ACTIVE LEADERS: City/ICT [...] providers; Private Data providers. [...] Mayor/City Government; User communities. [...] Developers; Citizens and visitors | A more inclusive decision-making process in this domain might certainly be beneficial to the progress of activities and the achievement of results. In our vision, technical and quality standards for open data are strongly dependent on what happens at later stages of the public sector information use and reuse value chain. A critical role should be played here by the Mayor (or other policy makers) to create the conditions for a full and stable collaboration among all stakeholders. |
| Dataset refinement and validation of the quality of datasets | ACTIVE LEADERS: City/ICT Department; Public Data providers; Private Data providers; Citizen Developers; User communities. [...] Mayor/City Government. [...] Software companies; Citizens and visitors | No relevant change should be foreseen in this structure of responsibilities. |
| Publishing and updating open datasets | [...] providers; Citizen Developers. [...] Mayor/City Government; Software companies; User communities; Citizens and visitors | [No relevant change should be foreseen in] this structure of responsibilities (provided there is a fully developed, ODC-like framework already in place to enable those operations). |
| Design, development, and configuration of mobile applications that use Open Data | [...] Department; Software companies; Citizen Developers. [...] User communities. [...] Mayor/City Government; Public Data providers; Private Data providers; Citizens and visitors | Here a more prominent role of the City Government in its strategic policy planning function regarding service delivery and innovation would probably be required, unless it is fully delegated to the ICT Department (see the responses given to the question about dataset refinement and validation of quality of datasets; perhaps there should be coherence between the two). |
| Promotion and/or selection of apps that use a city's Open Data (for example, organizing Hackathons, selecting best picks, etc.) | ACTIVE LEADERS: Mayor/City Government; City/ICT Department; Citizen Developers; User communities. [...] Public Data providers; Private Data providers. [...] Software companies; Citizens and visitors | [No relevant change should be foreseen in] this structure of responsibilities. |
| Defining and enforcing policies related to privacy | ACTIVE LEADERS: Mayor/City Government; City/ICT [...] providers; Private Data providers. [...] Developers; User communities; [...] | Participation of all stakeholders ensures a more active and committed involvement in tackling this issue, which may imply the definition of local solutions taking extant/contingent conditions (as well as general regulations) into account, particularly in the sharing of applications that use personal data by the users/beneficiaries themselves. |
| Evaluation and impact assessment of a city's Open Data policy | ACTIVE LEADERS: All stakeholders, except Software companies. [...] Software companies | Inclusion of Software companies in the process of evaluation and impact assessment. Establishment of a shared methodology, with public evidence of interim implementation results. Formation of permanent / ad hoc committees to enable this function as an embedded (in the ODC?) service to the entire community. |
| Research and innovation activities to explore new uses and applications of Open Data. | ACTIVE LEADERS: All stakeholders | Initial identification of R&D priorities and periodic evaluation and monitoring of activities aimed at revision and enhancement of objectives according to the results available. Formation of permanent / ad hoc committees to ensure this target is reached. |
PROCESS
Historically, in the four Citadel pilots, documented progress towards the definition and
clarification of the above roles and tasks has depended not on formal agreements, but rather
on the growing maturity of the underlying Open Data Governance Groups. We can therefore
hypothesize four alternative configurations of a city/community MoU (or Open Data Charter),
depending on two main conditions:
- The initial level of maturity of the underlying Open Data Strategy;
- Its current socio-economic impact, in line with the Citadel vision (data and information that
are not published per se, but in relation to precise purposes and exploitation opportunities).
The resulting options can be depicted as follows:
Figure 23: Conditions and purposes of MoU definition
Apart from the top right quadrant, where the formalization of a MoU (or an Open Data Charter)
is not required, in the remaining three areas it is left to the local policy makers to decide
whether one is required or not. In some cases, particularly when both maturity and impact are
low, a MoU can be recommended to activate (or rather accelerate) the take-up of open data
policy: this is the case of the bottom left quadrant in the picture. In other situations, it can
well happen that, despite a good level of maturity in current open data policy, its
socio-economic impact remains negligible, presumably due to a lack of involvement and
commitment of local stakeholders. In that case, a single MoU or a set of MoUs can be designed
and implemented by a city government to attract and consolidate the participation of the market
in current and prospective open data policies: this is what we call finalization in the above
scheme. Finally, the bottom right quadrant represents the (possibly extreme, but not unlikely)
situation where the city has received clear signals from the market in terms of early impact of
open data initiatives, which now require the integration of expert and specialist knowledge to
gain momentum and become more widespread and inclusive. Again, a set of MoUs (like those signed
by the BBC, as mentioned at the beginning of this section) may be recommended here.
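The four situations discussed above can be summarised as a small decision helper. This is only an illustrative sketch under stated assumptions: reducing maturity and impact to two boolean inputs is a simplification, and the function name mou_advice and the outcome labels (other than "finalization", which is the term used in Figure 23) are hypothetical. In three of the four cases the final decision remains with local policy makers, so the result is advice, not a rule.

```python
from enum import Enum


class MouAdvice(Enum):
    """Possible outcomes; labels are illustrative paraphrases of the text."""
    NOT_REQUIRED = "no MoU (or Open Data Charter) required"
    ACTIVATION = "MoU recommended to activate or accelerate take-up"
    FINALIZATION = "MoU(s) to attract and consolidate market participation"
    EXPERT_INTEGRATION = "MoU(s) to integrate expert and specialist knowledge"


def mou_advice(maturity_high: bool, impact_high: bool) -> MouAdvice:
    """Map the two conditions (maturity, impact) to one of four options."""
    if maturity_high and impact_high:
        # Mature strategy with visible impact: formalization is not needed.
        return MouAdvice.NOT_REQUIRED
    if not maturity_high and not impact_high:
        # Both low: a MoU can kick-start the open data policy.
        return MouAdvice.ACTIVATION
    if maturity_high:
        # Mature policy but negligible impact: engage the market.
        return MouAdvice.FINALIZATION
    # Early market impact but low maturity: bring in expert knowledge.
    return MouAdvice.EXPERT_INTEGRATION


for maturity in (False, True):
    for impact in (False, True):
        print(maturity, impact, mou_advice(maturity, impact).value)
```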
For the process of MoU development and negotiation, a set of guidelines can be outlined, based
on the Citadel experience. We split these guidelines into five groups: A) Guidelines for
preparation, B) Guidelines for drafting, C) Guidelines for negotiation, D) Guidelines for
completion, and E) General purpose guidelines.
GUIDELINES FOR MOU PREPARATION
The preparatory stage begins with the realization of the need for a MoU. In a sense, it is the
step that follows the assessment of the conditions stated in Figure 23 as preliminary and
essential to the decision to have one in place. Once this assessment has been done, the
purposes of the MoU can be clarified, as well as its scope and impact. Based on this
understanding, a first draft of the MoU provisions can be obtained.
Proposed steps:
Internal discussion within the city administration, possibly by a dedicated team, to identify:
- AS-IS situation regarding open data publication;
- Needs and requirements that motivate the decision to propose a MoU;
- Targeted stakeholders in the community and ways to approach them;
- Minimum / maximum achievements expected / to be reached;
- Value / services to be negotiated in exchange;
- Financial resources available (if any);
- Strategy to be put in place for stakeholder involvement;
- Internal staff to be involved in the drafting and negotiation phases.
GUIDELINES FOR DRAFTING

A first draft of the MoU is often necessary in order to approach the potential signatories from
the local community. Examples of various kinds abound (including some that have been mentioned
at the beginning of this section). In this process, it is essential to avoid the risk of
coming up with a pre-defined text, which would undermine the credibility of a city wanting to
become more inclusive with its stakeholders.
Proposed steps:
- Nominate the signatories to be invited;
- Make sure that a representation of all key stakeholders is guaranteed;
- Keep the agreement open for additional parties in the future;
- Be realistic and transparent with the MoU goals and objectives;
- Assign clear roles and functions to all parties;
- Make sure that all prospective signatories have reason to enter the agreement;
- Keep the language short and simple;
- Do not make the MoU more complicated than necessary;
- Set periodic review dates;
- Specify procedures for amending the MoU.
GUIDELINES FOR NEGOTIATION

Ideally, any MoU should emerge as the result of a preliminary discussion with the local
stakeholders involved. This would help signatories buy into the purposes, roles and actions
foreseen in the agreement, as well as make important qualifications and additions to the text
that may not have been anticipated. It would therefore be better to run this step in
parallel with the previous one, making a plan for stakeholder meetings and other forms of
encounter (including public events for visitors or the population as a whole).
In this stage, it is important to start by identifying those individuals (from both the city
administration and the external actors) who are most knowledgeable and influential, to enlist
their early engagement and assistance during the whole process.
Proposed steps:
- Nominate the staff to involve in the process as early as possible;
- Make a plan of meetings and other events with targeted stakeholders;
- Base your discussions on the general principles and objectives of the MoU;
- Preferably avoid starting from a predefined text, unless there is value in sharing it;
- Negotiate the contribution of each stakeholder in the form of concrete tasks;
- Define the minimally acceptable standards of performance for each task;
- Conclude the meetings by assuming the responsibility for preparing the minutes;
- Publish the minutes in accessible places, where they can be read and reused.
GUIDELINES FOR COMPLETION

Ideally, during the negotiation with local stakeholders, a number of open issues should be
clarified. First of all, the very nature of the MoU in relation to its purposes: for instance,
to increase the socio-economic impact of open data policy, integrate the missing stakeholders
into a shared development vision, or simply solve specific issues (such as licensing or privacy
protection) with the contribution of domain experts. Second, the MoU objectives should be
translated into measurable outputs and outcomes, which would facilitate monitoring, evaluation,
and renegotiation later on. Third, having framed the negotiation with local actors within the
scope of the MoU purposes and with an open and inclusive approach, it is likely that the
discussions will focus on controversial or grey areas and lead to the revision of existing
drafts (if any).
It is at this stage that the advantages and disadvantages of developing a MoU should be
weighed against its objectives and the reasonably expected results. In some cases, the signature
of a formal agreement may create unnecessary bureaucracy or rigidities in the way things are
done. It can also be misunderstood and disrupt a good relationship with some stakeholders,
giving the false impression of creating unwanted differences, expressing preferences where
none existed before, etc.
If an agreed text becomes possible after the negotiation phase:
- Circulate the draft to all the other parties at the same time;
- Involve the persons with the authority to negotiate for their organization;
- Identify the immutable points and be open to changing the remaining ones;
- Try to finalize the revisions by phone or in person;
- Keep everyone informed of the latest changes;
- Conclude with a public event for approving and signing the MoU.
GENERAL PURPOSE GUIDELINES

Normally, a MoU is not legally binding, resembling more an exchange of letters of intent
among its signatories. However, some of its implications may create relationships (by law or de
facto) or modify the distribution of rights and obligations in a permanent manner. Therefore, it
has to be handled with care and typically structured in such a way as to leave little room for
interpretation.
Additional caveats include:
- Simplicity: most problems can be eliminated by avoiding legal jargon;
- Duration: a limited time for the MoU provisions to apply helps minimize errors;
- Clarity: clearly defining the roles and tasks of all actors is recommended;
- Openness: clear procedures for new entrants and exits are also recommended;
- Flexibility: there should be procedures for monitoring, revision and adaptation.
THE CITADEL GOVERNANCE TOOLKIT

The final objective of pilot management in Citadel is to define a general charter that codifies
the role and structure of the ODGGs in the pilot cities and lays the basis for cooperation in
the future. Work during the project explored different approaches to doing so, leading to the
general guidelines for developing a MoU set forth above.
This exploratory exercise, carried out over the last two and a half years, has in fact been a
cumulative effort, in the sense that each iteration has contributed new approaches which, taken
together, can be seen as a suite of tools that can be adapted to the governance of Open Data
processes in any city.
Table 13. Citadel Charter Approaches
| Year | Approach | Description | Value |
|---|---|---|---|
| 1 | Open Data Ecosystem model | Mapping roles and interactions in Open Data ecosystems | Provided the basic framework for the Open Data Governance Groups |
| | Memorandum of Understanding | Formal signed document declaring common commitment to Open Data | Not utilized as local MoU, provided basis for Palermo Guidelines |
| | On-line registration to ODGG | On-line form expressing interests and competencies for participating in ODGG | Adopted to avoid having a signed MoU, helped define roles in ODGGs, not used as such. |
| | Associate Partner | Letter of expression of interest and procedure for inclusion of Associate Partners | Associate Partner campaign accelerated with platform in place and specific outreach programme. |
| | Survey of roles | A survey on roles of stakeholders in OD governance. | Contributed to governance model, currently in distribution with Associate Cities for validation. |
| | Maturity model | A framework for defining Open Data objectives and strategies. | Used in outreach programme to guide engagement strategies. |
| | MoU framework | A modular table of contents (based on the original MoU and Palermo Guidelines) for a local MoU | Can be used for the drafting of local procedures and guidelines |
This Charter toolkit guided the pilot cities in the gradual structuring and opening up of their
local Open Data Governance Groups, and in addition provided a supporting framework for the
Outreach programme. The experience gained in the project has shown that the tools of such an
open and flexible approach can remain as the modular elements with which any city can build and
consolidate its own Open Data governance model.
The final version of the Citadel Charter therefore needed to be a statement that
pulls these elements together, drawing in addition on the extent and success of the outreach
activities and the awareness of the innovative potential of the Citadel vision. Rather than a
formal document of adherence to a network of Citadel-compliant cities (which would also risk
duplicating the efforts of EuroCities, the Connected Smart Cities Network, and others), what
appeared to be most useful and needed was a declaration of common principles that can:
• Clarify the Citadel vision for Open Data
• Call on different actors to play their part in reaching shared objectives
• Above all, recall and consolidate the Citadel Statement on which the project was originally
  based, while demonstrating the contribution to achieving the objectives of the Malmö
  Declaration of 2009.
The text of the Citadel Charter, which appears as Annex III to this book, is therefore intended as
an open document whose primary aim is to promote the Citadel vision more than the specific
tools developed within the project, as a forward-looking strategic protocol that can gain the
adherence of cities around the world.
THE ODC AND THE FUTURE OF OPEN DATA

TOWARDS THE SEMANTIC WEB
In this chapter, we look towards the possible evolution of the Citadel semantic framework
(primarily as embodied in the ODC) and how it might contribute towards the emergence of the
"Semantic Web" vision as famously described by Tim Berners-Lee. This might seem like an odd
proposition, since we instinctively think of the "five star" model as requiring sophisticated web
services to run on (as most Linked Open Data services do), together with the general impression
that Citadel deals less "seriously" with Open Data by paying more attention to Excel files than
to APIs.
THE CITADEL VISION: TERRITORIES OF DATA

Yet one of the hypotheses of Citadel is that pre-conceptions like this are actually hindering the
development of Open Data, by remaining fixed in technological daydreams rather than learning
from real people in real settings. As an example, take the evolution of the Internet. In 1993, Ed
Krol wrote the 543-page "The Whole Internet: User's Guide and Catalogue", extolled by Kevin
Kelly of Wired as an encyclopaedic "compendium of all the places to explore, the short-cuts to
get there, the reasons to linger, the treasure you might find, and the tools to make this free
world-wide service worthwhile". Today, even thinking of an Internet "compendium" is
impossible, and skimming through this book it is evident how only twenty years ago the internet
was still the domain of a small group of technical enthusiasts, well-versed in UNIX and Gopher
and extolling it as a valuable electronic tool for the free world.
Open Data today appears to be in a similar situation: there are several portals out there whose
job it is to provide a "compendium of all the places to explore", since their number is still in the
range of the countable. Public debates on Open Data [21] still ask whether "this free world-wide
service [is] worthwhile" rather than attempting to understand where it is going and what its
broader impacts might be. Yet at the same time, Krol's book appeared just before the internet
began to explode into a totally different phenomenon affecting every aspect of the way we live,
work and play. The second edition, in 1995, includes a new chapter on the World Wide Web
(suggesting it might be an interesting alternative to Gopher), and only a few years later Google
was launched, based on an algorithm that turned the tree-like catalogues of Yahoo and Altavista
on their heads.
If we are on the eve of a similar destiny for Open Data, then it can be a useful exercise to
imagine what an Open Data scenario might look like in 20 years (probably much less). If Open
Data is to explode like the Internet, then it is likely that Berners-Lee's scenario of a "Web of
data" may even appear limited, since it is likely that the web may give way to simpler
(unnoticeable?) front-ends, with the actual workings driven far more by automatic, IoT-type or
agent-driven transactions than by human intervention. One thing is for sure: the data in
question will not be limited to datasets published by public administrations, but will include
data generated by all types of human activity as well as natural and machine events [22]. This
fine-grained web of data is likely to reveal new relationships between data and the specific
place where it is, as geographical, physical, and cultural elements of context become
intertwined with ICT services [23].
[21] A good example was the Q&A at the session "Cohesion Policy and Open Data: boosting
transparency, performance and engagement" at Open Days 2014.
The Citadel project refers to this vision (described as its key value proposition) as a Territory of
Data. This concept implies that the density of information about a given territory leads to a
diffused awareness of all the features, activities, and dynamics happening there. This allows not
only governments to manage public services, but also businesses to understand market
dynamics, citizens to identify life opportunities, and so on.
Indeed, the Citadel project has been working to shift the Open Data paradigm from a finite set
of public administration portals towards a more territorially diffused data environment in three
main ways:
• Citadel has done everything possible to break out of the technological temples of Open
  Data, putting its tools in the hands of citizens.
• Citadel works with cities not just as points on the map but as places where people come
  together to give meaning to a place: witness the emergent role of the visitor as
  explored in Citadel.
• Citadel is based on an open Semantic Framework as a dynamic social construction more
  than a technical data model.
In the following, we take a look at how Citadel also aligns with some emergent trends that may
develop to unleash or at least accelerate the transition towards a Territory of Data.
FLATTENING DATASETS
A first signal we see as evidence of this transformation is what might be called the flattening of
data structures. Since the early days of information technology, data has been organized into
increasingly complex structures of inter-relationships in an attempt to more closely represent
the way data is used in a particular domain. This occurred first in nested or hierarchical
relationships, and since the 1970s in relational structures, which instead emphasize links between
simple tables, such as a listing of companies on the one hand and the addresses for each on the
other. Relational databases have since become the norm used in programs ranging from
Microsoft Access to MySQL and in fact are behind most of the web services driving many of
the open data applications we see today.
In the following diagram of a typical relational database structure, the different tables are shown
divided by logical or functional areas, with the links between specific elements in each table also
shown. These links are then used to query the database, according to different "views" onto the
information, e.g. a view to show a listing of all of a company's suppliers (with addresses), and
another view to show a listing of all outstanding invoices (with company name). This structure of
tables, relations and views onto the information is studied in depth with the client organization
in order to best represent their needs and operations.
[22] For a preview, see www.thingful.net and http://www.slideshare.net/JustinHayward1/sss14haque
[23] See the Periphèria project's "city arenas" with "people in places", validated by the emerging
trend of the "Internet of Places" (more info at: www.peripheria.eu).
Figure 24. A typical relational database structure
If we want to publish information in such a relational database as Open Data, there are basically
two choices:
• An API can be provided that essentially queries the database from the outside, with the
  result being provided to the external application in the desired format. While some
  systems publish to the web information about how to query them, use of an API
  generally requires a knowledge of the database's structure in order to extract
  information from it. In particular, it is necessary to know in advance the exact names of
  fields, or in other words the semantic structure.
• The owner of the database can make a query for some subset of the information
  contained in the database (e.g. company names and addresses but not invoices), and
  write that to a file that is then published as Open Data. This can be done either
  manually, producing an Excel or CSV file, or automatically, e.g. through an export in a
  format such as XML.
Neither of these choices, however, fully opens the database since much of it, especially its
semantic structure, remains hidden. Since the database has been structured to be a mirror of
the specific organizational context it serves (a company, a public administration, etc.), it can
never be fully adapted to a broader context nor can its data be seamlessly integrated into a
territorially defined web of data.
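The second choice described above (querying a subset of a relational database and writing it to a
flat file) can be sketched as follows. This is only an illustration using Python's standard
sqlite3 and csv modules: the table names, column names and sample rows are assumptions made up
for the example and do not come from any Citadel system.

```python
import csv
import io
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory relational database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE addresses (company_id INTEGER, street TEXT, city TEXT);
        CREATE TABLE invoices (company_id INTEGER, amount REAL);
        INSERT INTO companies VALUES (1, 'Acme'), (2, 'Beta');
        INSERT INTO addresses VALUES (1, 'Main St 1', 'Ghent'), (2, 'High St 2', 'Athens');
        INSERT INTO invoices VALUES (1, 100.0);
        """
    )
    return conn


def export_companies_csv(conn: sqlite3.Connection) -> str:
    """Export company names and addresses (but not invoices) as flat CSV text."""
    cursor = conn.execute(
        "SELECT c.name AS name, a.street AS street, a.city AS city "
        "FROM companies AS c JOIN addresses AS a ON a.company_id = c.id "
        "ORDER BY c.name"
    )
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # The header row is the only trace of the semantic structure that survives.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
    return buffer.getvalue()


if __name__ == "__main__":
    print(export_companies_csv(build_demo_db()))
```

Note how the join between the two tables disappears from the published file: the reader of the
CSV sees a single flat table and cannot tell that invoices exist at all.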
In many aspects, the LOD paradigm externalizes the relationships of such structures by
recreating links between data structures as external and publicly viewable RDF triples, as
Berners-Lee's vision in fact tends towards an evolution of the web as a "database for the whole
world". This transfer of the structure of semantic relationships from inside a relational
database to outside is driving the trend towards the "flattening" of datasets, or in other words
a preference for working with two-dimensional tabular files. Consisting of spreadsheet-like
layouts with a series of rows of information using the same column headings, tabular datasets are
far easier to read externally, especially if they are presented in an open format such as CSV.
Indeed, many on-going efforts to transform existing data structures (notably INSPIRE-based
geographical information systems) into LOD pass through the stage of first generating one or more
output datasets in tabular CSV format.
Evidence of this trend is for example the emergent standard for transport data, GTFS (General
Transit Feed Specification) [24]. GTFS is not actually the way data is held in transport
information systems, but rather a common interchange format, useful for telling an external
service such as Google Maps how a given city's transportation service is organized, independently
of the system used. Nonetheless, it contains all the information necessary, even though not in a
format that is immediately operational. As shown in the diagram below, a GTFS file for a given
city consists of a zipped collection of seven text files (actually structured as CSV), each of
which contains a certain part of the information that needs to be linked afterwards: for instance
one contains information on stops, another on transit lines, and so forth.
Figure 25. A typical GTFS folder unzipped
The interesting thing about GTFS is that while each of these seven files follows a precise
structure, that structure and the relationships between the data in each of the files are not
contained in any GTFS file instance, but rather in the description of the standard. In fact, the
links between each dataset are external, and they are not explicitly stated, mainly because they
are obvious: it is clear that the buses of a given line stop at bus stops. In sum, the GTFS standard
consists of seven tabular datasets, linked by externally (or "socially") known relationships
between the datasets.
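To make the point concrete, the sketch below links four flat GTFS tables using only knowledge that
lives outside the files, namely which column in one file refers to which column in another. The
file names and field names (route_id, trip_id, stop_id, stop_sequence, stop_name) follow the
published GTFS reference; the sample rows and the function names are invented for illustration.

```python
import csv
import io

# A miniature, made-up feed: each value is the text of one GTFS file.
SAMPLE_FEED = {
    "routes.txt": "route_id,route_short_name\nR1,1\n",
    "trips.txt": "route_id,service_id,trip_id\nR1,WEEK,T1\n",
    "stop_times.txt": (
        "trip_id,arrival_time,departure_time,stop_id,stop_sequence\n"
        "T1,08:00:00,08:00:00,S1,1\n"
        "T1,08:05:00,08:05:00,S2,2\n"
    ),
    "stops.txt": (
        "stop_id,stop_name,stop_lat,stop_lon\n"
        "S1,Station,51.03,3.71\n"
        "S2,Market,51.05,3.72\n"
        "S3,Harbour,51.07,3.73\n"
    ),
}


def read_table(feed, name):
    """Parse one flat file of the feed into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(feed[name])))


def stop_names_for_route(feed, route_id):
    """Follow the externally known links routes -> trips -> stop_times -> stops."""
    trip_ids = {t["trip_id"] for t in read_table(feed, "trips.txt") if t["route_id"] == route_id}
    names_by_id = {s["stop_id"]: s["stop_name"] for s in read_table(feed, "stops.txt")}
    visits = [v for v in read_table(feed, "stop_times.txt") if v["trip_id"] in trip_ids]
    visits.sort(key=lambda v: (v["trip_id"], int(v["stop_sequence"])))
    ordered = []
    for visit in visits:
        name = names_by_id[visit["stop_id"]]
        if name not in ordered:  # keep each stop once, in order of first visit
            ordered.append(name)
    return ordered
```

Nothing in the four files says that trip_id in stop_times.txt points at trips.txt: that knowledge
is carried entirely by the code, which is to say by the reader of the standard.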
Another example is in the CKAN data portal software [25], an open source platform that is rapidly
becoming the standard for Open Data services. CKAN primarily hosts datasets or links to data
services using a typical data portal structure (similar to the original Citadel Hub); here no
choices are made about semantic structures, only a complete listing of files based on the
relevant metadata. In addition to this "File Store" service, however, CKAN introduces a new
feature called the "Data Store", which is very relevant to this discussion.
[24] https://developers.google.com/transit/gtfs/
[25] http://ckan.org/
The Data Store exposes any tabular dataset hosted in the File Store (it can also be set up in its
own right), in such a way that it is possible to query the data inside, say, an Excel file without
having to download it, using a simple external API. As the CKAN Data Store is used ever more
widely, this tabular data format is generally gaining greater interest.
Figure 26. A typical CKAN Data Store
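As a rough illustration of what such an external query looks like, the sketch below builds a
request for CKAN's datastore_search API action and unpacks the JSON envelope that CKAN returns
(a "success" flag and a "result" object containing "records"). The portal address and resource
identifier are placeholders, not real Citadel resources, and no network call is made here; the
URL could be fetched with any HTTP client.

```python
import json
from urllib.parse import urlencode


def datastore_search_url(site_url, resource_id, limit=5, query=None):
    """Build the URL of a datastore_search request for one tabular resource."""
    params = {"resource_id": resource_id, "limit": limit}
    if query:
        params["q"] = query  # optional full-text filter
    return f"{site_url.rstrip('/')}/api/3/action/datastore_search?{urlencode(params)}"


def extract_records(response_text):
    """Return the list of row dictionaries from a datastore_search response body."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    return payload["result"]["records"]


if __name__ == "__main__":
    # Hypothetical portal and resource id, for illustration only.
    print(datastore_search_url("https://data.example.org", "abc-123", limit=2, query="parking"))
```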
The problem of course with a tabular dataset is that CSV doesn't provide any facility to store
metadata information about the dataset. It is a simple task (far simpler than with a
relational database) to read the first row of column headings and capture the semantic
structure of the dataset, but a system for storing the links between different tabular datasets,
using an RDF file or other system, has yet to be devised. To facilitate this process some have
suggested using standard column headings (more or less what Citadel is doing, as discussed in
the previous chapter), while others are highlighting the importance of identifiers as the anchors
for linking open datasets [26].
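A minimal sketch of the first step, reading the column headings of several CSV files and listing
the headings they have in common as candidate links, might look as follows. The file names and
headings in the usage example are invented; how and where such candidate links should be stored
is exactly the open question raised above, and this sketch does not attempt to answer it.

```python
import csv
import io


def column_headings(csv_text):
    """Return the first row of a CSV text, i.e. its column headings."""
    return next(csv.reader(io.StringIO(csv_text)))


def candidate_links(datasets):
    """Map each pair of dataset names to the column headings they share."""
    links = {}
    names = sorted(datasets)
    for index, left in enumerate(names):
        for right in names[index + 1:]:
            shared = sorted(
                set(column_headings(datasets[left])) & set(column_headings(datasets[right]))
            )
            if shared:
                links[(left, right)] = shared
    return links


if __name__ == "__main__":
    demo = {
        "stops.csv": "stop_id,name,latitude,longitude\nS1,Station,51.03,3.71\n",
        "lines.csv": "line_id,stop_id\nL1,S1\n",
    }
    print(candidate_links(demo))
```

A shared heading such as a common identifier column is only a hint that two datasets can be
joined; whether the join is meaningful still depends on knowledge external to the files.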
How and where to define and store information that interconnects flattened datasets is in fact a
key challenge for future research. The important point in the context of this book is that, in the
journey towards the Citadel vision of Territories of Data, the trend is to imagine a massive
number of flat, tabular datasets as the foundation.
SEMANTIC RELATIONSHIPS IN CITADEL APPS

The Citadel project finds itself part of the trend of flattening datasets not so much by design but
rather by the simple fact that the majority of the datasets provided by the pilot cities were
originally in the form of Excel spreadsheets. Even so, Citadel is aligned with this
trend, and the experimentation throughout the course of the last year can give some valuable
insights as to the paths for future developments.
[26] "Creating Value with Identifiers in an Open Data World", Open Data Institute and Thomson
Reuters, available at
http://thomsonreuters.com/corporate/pdf/creating-value-with-identifiers-in-an-open-data-world.pdf
The Citadel AGT in fact offers the possibility for any user, and in particular non-expert users, to
generate an application using more than one dataset. In the following paragraphs, we will
explore those instances where more than one dataset is used to build an application. By
examining and classifying the associations between datasets that emerge, we can gain insights
on how a bottom-up identification of RDF triples, or any other expression that captures the
relationships between two datasets in a useful way, could occur.
Figure 27. AGT Apps by Month
In the first year of operation of the Citadel toolkit (fully operational only starting in January
2014), a sufficient number of dataset couplings has occurred to be worth investigating. As the
above diagram shows, of the 567 apps generated until the end of October 2014, 138 or
approximately 24% include more than one dataset. The generation of multi-data apps more or less
parallels that of single-dataset apps, except in the final months [27].
If we eliminate the 60 apps that use multiple datasets simply because information is coming
from different cities (either the same information in two cities or apps created only to
demonstrate the tools), we still have 78 apps to examine, or about 14% of the total of apps
generated. Further eliminating apps that are clearly demonstrations (mashing up five or six
unrelated datasets from the same city in an app with a name such as "test") or that repeat the
same combination of datasets (multiple trials), we are left with 38 apps combining datasets in
an original and meaningful way.
Given the nature of the AGT as a map visualisation tool, we thus have 38 instances where users
have spontaneously created an association between two datasets as a function of their spatial
relationship. In other words, by combining multiple datasets the user is exploring some sort of
logic (of the sort that the LOD model tries to express) that is expected to emerge when shown
on a map.
[27] This can be attributed to the extensive outreach activity in fall 2014, where single-data
apps have been generated to illustrate the potential of Citadel to a new city.
A closer look at this sample reveals four meaningful classes of relationship (each
accounting for about a quarter of the total sample) that appear to motivate the couplings:
• Associations of datasets: different sources of broadly the same information are
  combined to generate a more complete representation. In these cases, we can imagine
  the user wishing to combine datasets generated by different authorities into a more
  complete picture that makes sense in practical terms.
  o Examples: Parking lots + on-street parking; Bus stops + bike stations; Childcare
    centres + schools; UNESCO Heritage sites + Tourism POIs; Historic sites +
    Abandoned villages.
• Functional relationships: two types of information are connected by their usefulness
  (often transport together with destinations). In these cases, we can imagine a user
  associating two datasets according to their purpose. Note that these relationships are
  not necessarily permanent: the fact that a parking place is near a cinema is irrelevant if
  the cinema is closed.
  o Examples: Parking + Cinema; Parking + Bars + Events + Diesel prices; Doctors +
    Health insurance; Defibrillators + Pharmacies; Hotels + Tourism POIs; Pastry
    shops + Transport stops; Markets + Parking
• Temporal relationships: this class is similar to the previous one but with a clearer
  sequence in time. Here we imagine the user thinking "after you do this you might want
  to do that". These are also relationships that are not necessarily permanent.
  o Examples: Cinemas + Bars; Schools + Markets; Voting seats + Tourism POIs;
    Meeting places + Planned visit; Tourism POIs + Restaurants; Museums + Bars
• Urban settings: these are relationships between datasets that associate related public
  or civic facilities without a specific logical, functional, or temporal relationship. Here
  we imagine the user attempting to highlight the features of a neighbourhood in a city,
  representing quality of life in spatial terms.
  o Examples: Hotels + Cinemas; Parks + Community Centres; POIs + Trees; Coffee
    shops + Parks allowing dogs; Sports facilities + POIs + Galleries + Parks
BACK TO LOD
As stated previously, one idea concerning the Citadel Open Data Commons is that it can provide
a way of constructing semantic LOD relationships in a bottom-up rather than top-down fashion.
Already in the early project stages, it emerged that this indeed could be the trend, although the
approaches mentioned there could be considered more as crowd-sourced labour than fully
bottom-up methods. Instead, the ODC concept already suggests a different possibility when, in
Figure 14 above, it is suggested that "Semantic patterns" could be identified in the "Query
recordings", considered at that stage of development as a log file of activity within the ODC
containing information about the conditions in which datasets were accessed.
What ultimately appears to be the most promising approach is in fact to infer relationships from
the combinations of datasets as discussed above. Were the kind of activity witnessed in the first
year of use of the Citadel Toolkit to reach a massive scale of the kind justifying a big data
approach, the types of relationships tentatively suggested in the previous section could be
identified with greater certainty. At that point, however, we need to ask: is the RDF framework
appropriate and sufficient to express these relationships?
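As a first approximation of how such inference could start, the sketch below simply counts how
often each pair of datasets appears together in the same app. It assumes, purely for
illustration, that each app is available as a list of dataset names; the names used in the usage
example are invented and the function is not part of the Citadel toolkit.

```python
from collections import Counter
from itertools import combinations


def count_dataset_pairs(apps):
    """Count, over all apps, how many times each unordered pair of datasets co-occurs."""
    pairs = Counter()
    for datasets in apps:
        # Sorting makes ("cinemas", "parking") and ("parking", "cinemas") the same key.
        for pair in combinations(sorted(set(datasets)), 2):
            pairs[pair] += 1
    return pairs


if __name__ == "__main__":
    demo_apps = [
        ["parking", "cinemas"],
        ["cinemas", "parking", "bars"],
        ["trees"],
    ]
    for pair, count in count_dataset_pairs(demo_apps).most_common():
        print(pair, count)
```

Frequent pairs would only indicate that an association exists; classifying it as functional,
temporal or otherwise would still require further evidence, such as the time and place of use.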
RDF, in its essence, expresses relationships in a "subject > verb > object" syntax (the verb
being known in RDF as the predicate), meaning that the relationships are not just neutral
associations, but they can have a meaning and a direction associated with them.
Figure 28. The basic RDF syntax
In this logic, we can imagine that, in the "Association of datasets" category above, the
combination of "Parking lots + on-street parking" (the first example from the list in the previous
section) can be modelled as two datasets that can be the subjects with a verb "provide" and
object "parking spaces". Indeed, this sort of descriptive relationship fits well with the LOD
scheme; the well-known example shown below in fact uses the verbs "is a", "is located at", "is
on the topic of", "depicts", "is famous for", and "discovered".
Figure 29. The LOD schema for the statue of Einstein
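The parking example can be written down as two directed statements. The sketch below uses plain
Python tuples rather than an RDF library, so it illustrates only the subject > verb > object shape
and not a real RDF serialization; the class and function names are made up for the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One directed statement: subject > verb > object."""
    subject: str
    verb: str
    obj: str


TRIPLES = [
    Triple("Parking lots", "provide", "parking spaces"),
    Triple("On-street parking", "provide", "parking spaces"),
]


def subjects_with(triples, verb, obj):
    """Return every subject that stands in the given relationship to the given object."""
    return [t.subject for t in triples if t.verb == verb and t.obj == obj]


if __name__ == "__main__":
    print(subjects_with(TRIPLES, "provide", "parking spaces"))
```

Because the statements are directed, asking which subjects "provide" "Parking lots" returns
nothing: the relationship does not hold in the opposite direction.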
The other three categories, on the other hand, introduce new elements that may not be able to
be fully captured by the RDF syntax.
• Functional relationships: as shown in the previous section, these relationships are often
  contingent on certain aspects of the context of time, place, and role of the user. RDF on
  the other hand only expresses permanent relationships; how then to situate them in
  the context for which they hold true: where, using RDF, can you express "as long as the
  cinema is open"?
• Temporal relationships: the contingency here is even more complex, since it depends
  on a sequence of events; might we then imagine algorithms that generate RDF triples as
  temporal sequences depending on what happens when? [28]
• Urban settings: here the combination of datasets seems to express spatial qualities that
  are very related to the map but not directly related to the individual datasets taken
  singly; can we imagine some new vocabulary of spatial qualities (not necessarily limited
  to urban environments), for instance capable of describing a "nice neighbourhood", a
  "city centre", or even landscapes?
These questions are by no means trivial, since they touch on the very usefulness of LOD
relationships, which, apart from some rather elementary applications, have not been tested to
date on a wide scale with citizens and businesses in city settings. From the evidence that
emerges from the Citadel experimentation, there are significant research tasks ahead in better
exploring the semantics of place, time, and space.
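To make the first of these questions tangible, the sketch below attaches a validity window to a
statement, so that "parking is useful for the cinema" holds only while the cinema is open. This
is one possible way of picturing a contingent relationship, offered as an assumption for
discussion; the class, its fields and the opening hours are invented and do not correspond to
any RDF vocabulary or Citadel component.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class ContextualTriple:
    """A subject > verb > object statement that only holds within a daily time window."""
    subject: str
    verb: str
    obj: str
    valid_from: time
    valid_to: time

    def holds_at(self, moment: time) -> bool:
        # Simplification: the window is assumed not to cross midnight.
        return self.valid_from <= moment <= self.valid_to


# Hypothetical opening hours for the cinema in the Parking + Cinema example.
PARKING_FOR_CINEMA = ContextualTriple(
    "Parking lot", "is useful for", "Cinema", time(17, 0), time(23, 30)
)

if __name__ == "__main__":
    print(PARKING_FOR_CINEMA.holds_at(time(20, 0)))
    print(PARKING_FOR_CINEMA.holds_at(time(9, 0)))
```

Even this simple case already needs a fourth and fifth element beyond the triple, which suggests
why temporal sequences and spatial qualities are harder still to capture.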
THE ODC AS A POLICY CONCEPT

THE ODC IN PRACTICE
Towards the middle of the Citadel project, feedback from the pilot cities indicated that the
Open Data Commons (ODC), which had initially been set forth as a guiding framework concept,
needed to be implemented in practice, at least in terms of some first tools for the pilot
communities. This gave rise to the development of the Citadel tools (the Converter together
with the Application Generator Tool, or AGT), but it also changed the perspective from which the
ODC concept evolved from then on. The tools under development needed to be constantly
mapped onto the original ODC principles in order to a) see whether the toolkit was actually
working in the directions proposed by the ODC concept and b) if so, determine what future
scenarios and roadmaps were needed to reach the mature vision starting from the first tools.
What both the ODC concept and the toolkit shared from the outset was the basic Citadel
objective of promoting the uptake of Open Data by making life easier for two groups:
• Make it easier for those holding data to publish it electronically and thus make it
  available for access by third-party applications
• Make it easier for developers to design applications that can move smoothly from city
  to city, allowing citizens to access and visualize datasets independently of the format
  and standards by which they were originally published.
In its starting configuration, prior to the introduction of the toolkit, the ODC was simply a
collection of static datasets published on the Citadel Platform, the first "common space". In the
initial cycle of pilot testing, these datasets were incorporated into the "Citadel templates" for
each city, as JSON files in a relatively closed client-server framework. The open and flexible ODC
scenario was thus discussed as a possible vision or way ahead for Open Data but not
implemented in practice.
[28] To some degree, one could argue that the Google Now service attempts to do this.
The first implementation of the Converter and AGT at first seemed to defeat the main principle
of the ODC, which was intended to remain as open as possible to different standards, data
models, etc. through a public collection of tools, not just one toolkit. On the other hand, this
solution could also be framed within the ODC concept as just a first instance of a more open
framework based on the same approach, recovering the original idea of having several app
templates each with its own data model in addition to the AGT.
In October 2013, in parallel with the specification of the tools, we thus considered the following
scenario as an extension of what was being developed. In this context, the ODC could be seen to
be made up of two main elements:
1. One or more servers containing live files accessible via their URL that contain data
in a format (platform and data structure) that at least one template (Citadel or third
party) or application (Citadel or third party) can access remotely without any further
conversions. (The first implementation of the toolkit being a first server with a first data
model for a first application, the AGT, but with nothing prohibiting further
development.)
2. A unique Index that includes a listing of a) converted live files as described in point 1,
including where and how they can be accessed, and b) the relevant template and
application data formats that are compatible with the files. (The first implementation on
the Citadel Platform consisting of an Index that lists all files that have been validated as
compatible with the AGT, the first of possibly many data formats.)
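The two elements described above can be pictured with a small data structure: Index entries that
record where a converted file lives and which data model it follows, and a lookup that a template
or application could use to find compatible files. This is a sketch under stated assumptions, not
the actual Citadel Index: the field names, the data model label and the URLs are all
hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class IndexEntry:
    """One converted "live" file registered in the Index (hypothetical fields)."""
    city: str
    theme: str
    url: str
    data_model: str


class Index:
    """A minimal in-memory stand-in for a unique Index of converted files."""

    def __init__(self) -> None:
        self._entries: List[IndexEntry] = []

    def register(self, entry: IndexEntry) -> None:
        self._entries.append(entry)

    def find(self, data_model: str, theme: Optional[str] = None) -> List[IndexEntry]:
        """Return entries compatible with a data model, optionally filtered by theme."""
        return [
            e for e in self._entries
            if e.data_model == data_model and (theme is None or e.theme == theme)
        ]


if __name__ == "__main__":
    index = Index()
    index.register(IndexEntry("Ghent", "parking", "https://example.org/ghent/parking.json", "agt-v1"))
    index.register(IndexEntry("Athens", "parking", "https://example.org/athens/parking.json", "agt-v1"))
    index.register(IndexEntry("Ghent", "menus", "https://example.org/ghent/menus.json", "menu-v1"))
    print([e.city for e in index.find("agt-v1", theme="parking")])
```

A template that knows only its own data model can thus discover every city that has published
compatible files, which is the interoperability the ODC scenario is aiming at.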
This broader scenario allowed us to imagine some possible use cases that have in fact arisen
throughout the pilot testing and outreach activities as concrete situations. They also illustrate
the broader potential of the ODC, taking Citadel far beyond the paradigm of cities publishing the
typical datasets into a city portal:
The local bridge club

Any local club such as a bridge club can regularly publish the list of members, together
with their addresses, onto the public ODC (of course with member approval). Besides
giving publicity to the bridge club, the AGT can easily show the location of members and
make it easier to find the address for the next game (possibly with an app that mashes
up the bridge club list with the official city dataset of parking facilities). It is likely that
the members will also be stimulated to think about other datasets they hold that could
contribute to the city's ODC.
Local City government ODCs

The publishing of Open Data with the Converter is so easy that a city can, in addition to
publishing publicly on the Citadel Platform, ask individual city departments to publish all
city datasets on an internal server that mimics the ODC's functionalities, with a closed
AGT that can be used for training and awareness raising among civil servants. In
addition, such a "private" Citadel platform can be used to develop internal services as
well as to validate the coherence and quality of the data being published. The city
government's data managers can then simply re-publish the appropriate validated
datasets onto a public Index when needed.
Restaurant menus
A city's local restaurant association might find it useful to be able to publish the menus
and special offers of member organizations on a daily or weekly basis. They could
therefore commission a special template that can display restaurant menus on the city
map, together with a special version of the converter that converts to the new data
model. Individual restaurants and pubs that publish their information according to the
agreed format will then be visible through the AGT version that incorporates the new
template.
The broader ODC scenario also allowed us to define a roadmap for development of the
Converter + AGT tools in the direction of realizing the open and interoperable vision of the ODC
in an incremental fashion. In this mature scenario, all the datasets in the ODC are published as
JSON files compatible with one or more data models, so in theory a template or application (or
enhanced future version of the AGT) only needs to know which cities have published data in the
expected format and what URLs should be used. This information is stored in the Citadel Index,
so that developers can easily configure the templates they incorporate into a given application
and be sure that it will work in the different cities [29].
In this context, the following were identified as possible areas for development in late 2013.
(Followed by a note on what actually happened.)
Index logging
This consists in logging events that happen through the Index, namely new datasets registered
(by whom and how), accesses by templates, configurations used, etc. This was thought of as
likely to yield very useful information for both the template and application developers as well
as for the data providers. In addition, further services (e.g. privacy management or semantic
tracking) could also be built into the system that manages the Citadel Index. (Basic logging has
been implemented with the Citadel Index, and the potential for both "privacy as a service" and
semantic analytics has been identified in other ODC reports. Further developments are a good
topic for future work.)
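A very small sketch of such logging is given below: each event records what happened, to which
dataset, and by whom, and a simple count of accesses per dataset is derived from the log. The
event kinds, field names and URLs are assumptions for illustration and do not describe the
logging actually implemented in the Citadel Index.

```python
from collections import Counter
from datetime import datetime, timezone


def log_event(log, kind, dataset_url, actor):
    """Append one event (e.g. "register" or "access") to the log and return it."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "kind": kind,
        "dataset_url": dataset_url,
        "actor": actor,
    }
    log.append(event)
    return event


def accesses_per_dataset(log):
    """Count how many "access" events were recorded for each dataset."""
    return Counter(e["dataset_url"] for e in log if e["kind"] == "access")


if __name__ == "__main__":
    events = []
    log_event(events, "register", "https://example.org/ghent/parking.json", "city-of-ghent")
    log_event(events, "access", "https://example.org/ghent/parking.json", "poi-template")
    log_event(events, "access", "https://example.org/ghent/parking.json", "agt")
    print(accesses_per_dataset(events))
```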
Converter enhancements
An important enhancement to the Converter would be to enable on-the-fly conversions. In this
case, rather than generate and save a new file to the ODC, the Converter would save the
configuration info only, directly accessing the original dataset (or a regularly copied data dump)
on the fly. (On the fly conversion has in fact been implemented as a proof-of-concept script with
the PHP Converter Library. A future enhancement of the Converter could include saving semantic
mappings for batch processing, although actual effectiveness in practice would need to be
tested.)
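The idea of saving only the configuration can be sketched as follows: the stored mapping says
which source column feeds which target field, and the conversion to JSON is carried out each time
the original CSV is read. The actual proof of concept was written with the PHP Converter Library;
the Python below is an independent illustration, and the target field names and the "dataset"
wrapper are assumptions, not the Citadel data model.

```python
import csv
import io
import json


def convert_on_the_fly(csv_text, mapping):
    """Convert CSV text to JSON using a saved mapping of target field -> source column."""
    rows = csv.DictReader(io.StringIO(csv_text))
    converted = [
        {target: row[source] for target, source in mapping.items()}
        for row in rows
    ]
    return json.dumps({"dataset": converted}, ensure_ascii=False)


if __name__ == "__main__":
    # Only this mapping would be stored; the source file stays where it is.
    saved_mapping = {"title": "Name", "latitude": "Lat", "longitude": "Lon"}
    source = "Name,Lat,Lon,Notes\nMuseum,51.05,3.72,closed on Mondays\n"
    print(convert_on_the_fly(source, saved_mapping))
```

Since nothing but the mapping is saved, any update to the original dataset is reflected the next
time it is accessed, at the cost of repeating the conversion on every request.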
Template developments / enhancements
As shown in the restaurant scenario above, the development of new templates was considered
an important space for the future, in order to visualize types of datasets that go beyond the
Citadel application scenarios, ie. socio-economic data. A new template would simply need to be
29
This feature of discovery of thematically compatible datasets in different cities (though using the same data
model) has already been implemented for the AGT.
registered in the (future version of the) Citadel Index, with information on where it resides and
the platform and data format it uses. (Currently, the AGT uses only one data model that includes
all fields from the POI, parking, and event templates. A more modular design for the AGT,
capable of incorporating different templates according to the selected datasets, could be a
future objective. As it stands, the additional data models that have been implemented for the
Converter are related to specific applications outside the AGT, as with the MyNeighbourhood
data model30 in the Lisbon Pilot, see below).
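A template registration of the kind envisaged (where the template resides, plus the platform and data format it uses) might be sketched as follows. The registry, its field names and the example values are assumptions for illustration and do not describe the current Citadel Index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateEntry:
    """Hypothetical Index record for one template."""
    name: str
    location: str      # where the template resides
    platform: str      # platform the template runs on
    data_format: str   # data model the template can visualise


class TemplateRegistry:
    def __init__(self):
        self._entries = []

    def register(self, entry):
        self._entries.append(entry)

    def compatible_with(self, data_format, platform=None):
        """Return templates able to visualise a dataset of the given format."""
        return [
            e for e in self._entries
            if e.data_format == data_format
            and (platform is None or e.platform == platform)
        ]
```

With such a registry, a more modular AGT could look up the templates compatible with the selected datasets instead of relying on a single data model.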
Dataset enhancements
With the ease of access to datasets through the Converter and AGT, it is possible for anyone to
publish open data to the Citadel Platform, e.g. a listing of bridge club meetings, and immediately
see them on a map on their mobile phone. This has great potential for motivating the
publication of datasets beyond those of the municipality. In parallel, by shifting the emphasis
from applications to data visualization, it can make sense for a city to enhance existing datasets
rather than developing specific apps. For instance, in order to make a reservation for a concert,
it can be possible to embed a link (which would then be visible via the App Generator) to an
external reservation service in the description text of the concert rather than building
reservation functionalities into the app. (Most of the datasets to date in Citadel contain city
information, with citizen datasets, e.g. Lisbon pastry stores, appearing only recently. In some
instances, however, enhanced datasets have been experimented with, as in the Museum tour app
in Ghent.)
Dataset refinement
As the Converter can access any dataset on the condition that it is refined, a broad, de-professionalised uptake of open data as suggested above was expected to bring this topic to
the fore very quickly. Although there exists a broad range of tools and toolkits to help refine
large datasets, there is little awareness of them and little widespread expertise in how to use them.
Those responsible for the pilots were advised to bring developers together with data owners in order to teach them
how to use these tools. The best strategy, however, is to build awareness from the start,
something which can be achieved by e.g. helping people to publish bridge club meetings and
then seeing what happens when the address format is not consistent. (Dataset refinement has
played a lesser role than expected, in part because many of the Citadel datasets were made
from scratch. Nonetheless, the toolkit has had a powerful impact in raising awareness of data
quality, and the Apps4Dummies workshops effectively put into practice the recommendation of
mixing developers with data owners.)
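The address example can be made concrete with a small consistency check. This is a sketch only, not one of the refinement tools referred to above: the two address patterns and the function names are assumptions, and real datasets would need more patterns.

```python
import re
from collections import Counter

# Two illustrative address shapes; real data will need more patterns.
PATTERNS = {
    "number_first": re.compile(r"^\d+\w?\s+\S.*$"),  # "12 High Street"
    "number_last": re.compile(r"^\D.*\s\d+\w?$"),    # "Via Roma 12"
}


def classify(address):
    """Name the first pattern the address matches, else 'unrecognised'."""
    address = address.strip()
    for name, pattern in PATTERNS.items():
        if pattern.match(address):
            return name
    return "unrecognised"


def consistency_report(addresses):
    """Find the dominant address shape and list the rows that deviate."""
    if not addresses:
        return {"counts": {}, "dominant": None, "outliers": []}
    counts = Counter(classify(a) for a in addresses)
    dominant, _ = counts.most_common(1)[0]
    outliers = [a for a in addresses if classify(a) != dominant]
    return {"counts": dict(counts), "dominant": dominant, "outliers": outliers}
```

Running the report on a freshly published list shows the data owner which rows are likely to be misplaced on the map before any conversion is attempted.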
OPEN DATA AS A PUBLIC GOOD

As the toolkit was implemented (validating rather than betraying the ODC concept in concrete
terms) and the uptake on the part of pilot cities began to accelerate, feedback from evaluation
activities helped to clarify what the potential impact of the ODC concept could be on the Open
Data paradigm overall. The following paragraphs result from reflections on the potential policy
implications of the ODC, as Citadel began to realize that a potentially new paradigm was
30
http://my-neighbourhood.eu/
emerging. This process started in early April 2014 and continues to date, feeding as well into the
public debates on Open Data mentioned above.
Indeed, one of the central and most transformative tenets of the Open Data Commons concept
is the simple idea that Open Data be considered as a common good, in a public sphere whose
stewardship is to the benefit of both public and private stakeholders as well as citizens. The
mainstream paradigm for Open Data, especially as promoted by technology providers,
essentially ignores this common space, instead identifying a two-step process:
1. Governments at all levels publish datasets, generally formed by internal administrative
processes, in machine-readable form. According to the technology providers this should
occur using increasingly sophisticated (and often unaffordable) web services, but in this
Age of Austerity most agencies are in fact publishing raw data in the format they have
and to their own semantic and syntactic standards.
2. Software developers build applications using the datasets published by governments.
Due to the above, this often requires that developers convert the data from its raw
form into something more usable by each specific application, thus posing a barrier of
time, cost and general efficiency.
Although the relatively cautious uptake of Open Data across European cities is often attributed
to a lack of a culture of transparency or concerns about privacy and sensitive information, this
is not sufficient to explain the lack of information regarding topics such as the location of
galleries, museums, and public toilets. We suggest that three other factors inherent in the
current paradigm might also be identified as creating barriers:
1. The two-step process (governments open data and then sit and wait for developers to
come along and use it) creates a discontinuity between supply and demand and a
specialisation of roles that inevitably makes it difficult to engage different points of view
in defining comprehensive Open Data strategies.
2. This separation of roles also affects the propensity of the actors involved to engage with
similar initiatives under way in related fields, such as the EU's standardisation efforts in
Spatial Data Infrastructures (INSPIRE), Sustainable Energy Technologies (SETIS), etc. This
inevitably leads to interoperability issues both along the data value chain and across
thematic sectors (a key issue to attain the Linked Open Data vision).
3. These factors together greatly limit the applicability of the Open Data paradigm as it is
today to the few cities that meet the profile of having a strong political will, a culture of
transparency, IT staff capable of managing data publishing, and ideally an active
developer community willing to develop apps.
The Open Data Commons concept directly addresses these issues, by politically identifying the
space between datasets of whatever form and applications of whatever type and declaring
that space to be within the public domain and in the public interest, following the paradigm of
the public Commons. This space belongs neither to data providers nor to data users, but is a
neutral domain containing the tools and knowledge allowing providers and users to connect in a
more nimble, efficient, and innovative way than either could achieve by themselves. The Open
Data Commons can thus be said to include any software element that is generic enough to have
relevance for more than one dataset on the one hand, while independent of the market
exploitation potential of a given application or set of applications on the other. Such elements
(while the Citadel Converter is central to defining this space, it can also include generic APIs,
converters, transformers, and tools of various kinds) collectively bridge the gap between
datasets and applications in the most dynamic and flexible way possible.
POLICY IMPLICATIONS
This has a direct impact on the three barriers identified above as follows:
By unlocking the technical paradigm, the ODC concept allows for data to be exploited
before standards have been fully defined, thus promoting demand-driven processes of
standards convergence and adoption. This not only brings forward the benefits of Open
Data, but it also opens up to greater interoperability flows with other standards
formation processes. This is particularly important in areas where standards adoption is
relatively immature, such as spatial data infrastructures, IoT sensor network
architectures, big data analytics, system dynamics modelling, etc.
By filling the gap between data supply and demand and creating a concretely testable
end-to-end process, data owners can see the purpose of publishing data and application
developers can see the need for new datasets with greater clarity. A common space for
on-going dialogue and interaction between governments and developers is created,
allowing for new dynamics to emerge such as application driven data strategies.
By allowing the introduction of simple tools such as the Citadel Converter and AGT, the
ODC substantially de-professionalises the practice of Open Data, opening up to a full
and active participation of citizens and local businesses in both data supply and demand
and a potentially massive uptake of data-based activities. This also allows for a more
diffused territorial impact of Open Data, no longer confined to large, well-to-do, and
innovative cities but opening up to wide-scale engagement and collective creativity.
At the broader policy level, the ODC concept transforms data into a key element of territorial
capital and its stewardship an essential activity in the public sphere, in an emerging policy
landscape in which the public sector is re-defining its role in a transformation from command
and control to the orchestration of collective and collaborative innovation processes. These
broader policy implications can be mapped onto the pillars of the EU 2020 strategy as follows:
Smartness: Data gains a new status as a driver of economic development, with value
streams emerging from the production, analysis, and coupling of diverse and diffused
datasets as produced by social and economic activities themselves. A data-driven Smart
City concept can lead to data-driven local and regional development strategies, that
broaden the scope of Open Data to include public, private, citizens, and businesses, as
well as nature and machines, as data producers, owners, and users.
Sustainability: Ecosystem-based management concepts for sustainable development
depend on knowledge and awareness of the current and potential dynamics embedded
in a territory and its natural and human capital. The "Open Data as a common good"
approach can interlink with paradigms such as the IoT-based "wisdom of the earth"
concept to underpin an integrated and dynamic vision of sustainability.
Inclusiveness: The ODC concept proposes data as a basis for an emerging model of
citizenship, in which data is a right and the stewardship of personal and collective data
is an activity in the public sphere. By democratising access to the operational workings
of Open Data processes, the ODC unleashes the creative potential of all parts of society
on an equal footing of opportunities.
TOWARDS A REGIONAL CLOUD FOR A TERRITORY OF DATA

This policy concept for the ODC took shape in parallel with the pilot experimentation and the
launch of the Associate outreach programme, which in turn had the effect of producing an even
more ambitious policy vision. In July 2014, the project launched the format of the
Apps4Dummies workshops, which targeted civil servants in several small and medium
municipalities, generally belonging to metropolitan areas. As the ODC concept thus gained
relevance for networks of neighbouring cities and towns, it became clear that the typical Open
Data paradigm (one administration = one portal) would be inadequate to support the concrete
needs of these more complex configurations of administrative competence.
An important gap appears in fact at the regional or sub-regional scale, for say a metropolitan
area that includes many municipalities ranging from a few thousand to hundreds of thousands
of inhabitants (with ICT budgets in a similar range)31. Common problems and issues, from
environmental monitoring to transport planning to citizens' favourite panoramic views, lead to
common data models and possible apps and services; but who is supposed to offer the platform
for this varied set of local authorities, open also to the contribution of datasets from citizens,
business associations, NGOs, etc.?
The experience of the Lisbon Pilot, with main activities in August-September 2014 (see footnote 32), offered a
possible answer to this question. The integration of the Citadel Converter involved the use of
the FI-WARE cloud33, together with one of the key Generic Enablers of that platform: CKAN,
an open source Open Data portal system developed and managed by the Open Knowledge
Foundation. With the integration of CKAN into FI-WARE, this would mean that for any territory
or cluster of territories it is in principle possible to open a FI-WARE instance and run an Open
Data service, including the Citadel toolkit, on it. Indeed, the Citadel project has been invited to
formally propose the Converter as a FI-WARE Generic Enabler itself.
Given the interoperability between the Converter, the AGT, and CKAN, this could therefore be a
possible platform supporting the kind of Territory of Data vision emerging in Citadel, providing
the flexibility of management that would allow it to be configured in the most appropriate way
for a given region, metropolitan area, or other territorial cluster of administrations. In addition,
using the same basic platform functionalities across Europe would mean ensuring an even wider
31
A good example of this is the Apps4Dummies workshop held in Palermo in July 2014, in the context of the signing
of the Ventimiglia Pact, a joint strategic agreement among 52 city governments in the area. Among the objectives
of common interest is listed Open Data and smart city services, but the question immediately arose as to who should
manage the platform.
32
As stated previously, the enhancements to the Converter in the Lisbon Pilot were funded by the FI-WARE
programme. The policy reflections related to FI-WARE and the ODC are instead in the sphere of the Citadel Project.
33
http://www.fi-ware.org/
portability of applications and tools built on the Citadel platform and thus a richer business
ecosystem for the Citadel development community.
In this way, the ODC vision of open data as a common good extends beyond the specific
platform architecture of the Citadel toolkit to include a platform vision with far greater scope.
The issue that then remains is the implementation of FI-WARE as an open public facility in a
given territory, rather than as the highly access-controlled cloud platform supporting the
European ICT industry's applications and services as it is currently conceived. This brings us to
the relevance of the Digital Agenda, the policy initiative supporting the development of
connectivity and service infrastructures across Europe, and the way it is implemented through
regional ERDF policies, where most of the funding is to be found.
In fact, in the 2007-2013 programming period, R&D&I represented some 26% of the planned
expenditures for Structural Funds, for a total of over 86 Billion Euro (more than the FP7 and CIP
programs together). With several key EU 2020 Flagship initiatives (e.g. Innovation Union, the
Digital Agenda) pinpointing regional policy as the main instrument for implementation, this
figure is likely to rise even further. The set of 271 Regional Operational Programs currently being
drafted therefore represent an important opportunity for the Citadel ODC vision.
The new conditionalities for this cycle of Regional programming (in particular the so-called Smart
Specialisation model for innovation strategies) impose certain principles and processes for
each region, such as stakeholder engagement, entrepreneurial discovery, and the integration
of social innovation. These are in turn leading to significantly new policy approaches for many
Regions, particularly in Southern Europe and the New Member States, very much in line with
the human approach to technology innovation also shared by the Citadel project and
supported by many initiatives in DG Connect. To support the new policy process, DG Regio has
engaged the Commission's IPTS in Seville to advise and coordinate individual Regions in
complying with the requirements for Smart Specialisation, but despite their efforts the Digital
Agenda has yet to appear on the top of the policy agenda with any degree of sophistication.
The Digital Agenda Toolbox, one of many instruments designed for this purpose, provides
guidelines as regards regional ICT infrastructures, services and applications, and methods for
take-up and digital literacy, and even includes a section on Living Labs, the methodology
adopted in Citadel. Cloud platforms and Open Data are also mentioned, but in relatively
traditional terms compared to the Citadel ODC concepts mentioned above. In addition, no
mention is made of either FI-WARE or the FI-PPP, despite the fact that the Commission has
already funded FI-WARE with over 800 million Euro. This is evidence of the low level of awareness
among Regional policy makers responsible for Smart Specialisation of the possibilities that could
be offered by a wide-scale uptake of the FI-WARE cloud together with the CKAN Open Data
service, the underlying infrastructure onto which the Citadel toolkit would ideally be integrated.
This situation can be attributed to barriers on both sides of the equation. On the one hand, FI-WARE and many FI-PPP services are still in an experimental stage and not ready for commercial
launch. On the other, ERDF regulations make programming and assignment of funds a long and
cumbersome process that often misses windows of opportunity in a fast-changing sector. Yet
these difficulties are perhaps hiding the real potential and benefits to be gained by
implementing the FI-PPP at the regional scale, especially as framed by the Citadel ODC vision.
While the rollout and provision of broadband can proceed within the context of existing
regional innovation strategies and traditional tenders, cloud services and the ODC concept
overall instead raise a whole series of new questions and opportunities.
With the funding available for the Digital Agenda, this is a potentially very important part of the
business opportunity for Citadel as well as FI-WARE's cloud services. Yet implementation at the
territorial scale is not only a technical issue: who should ensure data policies across
administrations, guaranteeing openness and citizen engagement through governance in the
public interest, and who should ensure quality of service, privacy and security, and
interoperability among platforms and systems? What are the potential benefits of a pan-European approach in terms of the business opportunities for local ICT SMEs working with a
common information infrastructure across and between their regions (or, why not, at the macro-regional level)? And as a consequence, what is the right approach for procurement on the
public side34?
These issues cannot be solved in the abstract but should most effectively be addressed through
a vast and diffused co-design process that brings together the regional actors and authorities working with
the Citadel toolkit and the FI-PPP as a whole. Since this is essentially an exploratory
process involving large scale (though not necessarily heavy) pilot experimentation, it could be
framed in the context of Pre-Commercial Procurement or the EU Public Private Partnership, i.e.
as a shared-cost experimentation whose principal actor can be the Commission itself (in terms
of defining the framework guidelines linking H2020 to the ERDF). If
successfully tested, the practical implementation of such an infrastructure could then be integrated into by-then
ongoing ERDF-funded activities.
At a more manageable level, the FI-WARE Accelerator programme provides the opportunity to
further develop the Citadel toolkit and enhance its integration with CKAN and other relevant FI-WARE Generic Enablers. A series of sixteen projects are launching calls for SMEs to propose ICT
services to be developed using the FI-WARE platform, and the pilot testing in these initiatives
can include scenarios of use with neighbouring municipalities, as a proof of concept of some of
the more functional aspects of the ODC vision.
In parallel, however, the bottom-up exploration of Citadel as an innovation support
infrastructure can be equally carried out from a regional policy perspective, for instance through
discussions with the IPTS and interested Regions or collaboration in the framework of on-going
European Territorial Cooperation projects. The feasibility of this is illustrated by the diffused
response to the Citadel Associate outreach program, providing a bottom-up platform of
interested cities. In addition, the CreativeMED project35 has been exploring possible areas of
concrete exchange of thematic and operational knowledge among 12 Mediterranean Regions
for the implementation and monitoring of their 2014-2020 Smart Specialisation strategies. Here,
the hypothesis of concretely experimenting with the ODC Territory of Data vision using the Citadel
toolkit on top of the FI-WARE cloud (or at least using CKAN) is being explored by those responsible for regional
programming in Portugal, Italy, Slovenia, Greece and Cyprus. Similar concepts are
34
In fact the new framework for EU public procurement creates more room for informal negotiations with
prospective awardees.
35
http://www.creativemed.eu/
also the subject of specific discussions with CORVE, the Citadel Lead Partner, as concerns the
Flanders Region.
REFERENCES
[1] All Citadel deliverables, including those mentioned in this book, are publicly available at:
http://www.citadelonthemove.eu/en-us/results/publicdeliverables.aspx
[2] Article 29 Data Protection Working Party (2013) Opinion 2/2013 on apps and smart devices, 27
February. Available online at http://www.huntonprivacyblog.com/wp-content/files/2013/03/wp202_en.pdf (last accessed: December 2014)
[3] Article 29 Data Protection Working Party (2011) Opinion 9/2011 on the revised Industry Proposal for a
Privacy and Data Protection Impact Assessment Framework for RFID Applications, 11 February. Available
online at http://cordis.europa.eu/fp7/ict/enet/documents/rfid-pia-framework-a29wp-opinion-11-022011_en.pdf (last accessed: December 2014)
[4] BEPA (Bureau of European Policy Advisers) (2011) Empowering people, driving change: Social
innovation in the European Union. Luxembourg: Publications Office of the European Union. ISBN 978-92-79-19275-3
[5] Capability Maturity Model (MATURITY MODEL),
http://en.wikipedia.org/wiki/Capability_Maturity_Model (last accessed: December 2013)
[6] Dekkers, M., Polman, F., te Velde, R. and de Vries, M. (2006) Measuring European Public Sector
Information Resources. Final Report of Study on Exploitation of public sector information benchmarking
of EU framework conditions. Executive summary and Final Report. European Commission, Directorate
General for the Information Society and Media
[7] European Commission (2009) Recommendation on the implementation of privacy and data
protection principles in applications supported by radio-frequency identification, C (2009) 3200 final,
Brussels, 12 May. Available online at: http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:32009H0387:EN:HTML (last accessed: December
2014)
[8] Ferro, E. and Osella, M. (2011) Modelli di Business nel Riuso dell'Informazione Pubblica. Studio
Esplorativo. Osservatorio ICT Piemonte, www.sistemapiemonte.it
[9] ISO/IEC WD 29134 Privacy impact assessment Methodology. Available online at
http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=62289 (last accessed:
December 2014)
[10] Itani, W., Kayssi, A. and Chehab, A. (2009) Privacy as a Service: Privacy-Aware Data Storage and
Processing in Cloud Computing Architectures. In Proceedings of the Eighth IEEE International Conference
on Dependable, Autonomic and Secure Computing (DASC '09), 12-14 December, pp. 711-716. Available
online at http://dl.acm.org/citation.cfm?id=1724449 (last accessed: December 2012)
[11] Jacobsson, S. and Bergek, A. (2007) A framework for guiding policy makers intervening in emerging
innovation systems in 'catching up' countries. European Journal of Development Research, (18), 4, 687-707
[12] Maximilien, E.M., Grandison, T., Sun, T., Richardson, D., Guo, S. and Liu, K. (2009) Privacy-as-a-Service: Models, Algorithms, and Results on the Facebook Platform. In Web 2.0 Security and Privacy
Workshop, held in conjunction with the 2009 IEEE Symposium on Security and Privacy, 21 May. Available
online at http://w2spconf.com/2009/papers/s4p2.pdf (last accessed: December 2012)
[13] OECD (2006) Digital broadband content: Public sector information and content. Paris, OECD
Publications, 31 July.
[14] Pira International Ltd., University of East Anglia and KnowledgeView Ltd. (2000) Commercial
exploitation of Europe's public sector information. Executive Summary and Final Report. Luxembourg:
Office for Official Publications of the European Communities. ISBN 92-828-9934-9
[15] Szkuta, K., Osimo, D. and Pizzicannella, R. (2012) When people meet data: Collaborative approaches
to public sector innovation. Paper presented at the 1st EIBURS-TAIPS conference on Innovation in the
public sector and the development of e-services, University of Urbino, April 19-20.
[16] Value Network Analysis, Wikipedia (accessed April 2012) http://en.wikipedia.org/wiki/Value_network_analysis
[17] Vickery G. (2011) Review of Recent Studies on PSI Re-Use and Related Market Developments.
Information Economics, Paris
[18] World Economic Forum (2011) Personal Data: The Emergence of a New Asset Class. An initiative in
collaboration with Bain & Company, Inc.
[19] Wright, D. (2011) Should Privacy Impact Assessments Be Mandatory?, in Communications of the
ACM, 54(8), pp. 121-131
[20] Wright, D., Finn, R. and Rodrigues, R. (2013) A Comparative Analysis of Privacy Impact Assessment in
Six Countries, in Journal of Contemporary European Research 9(1), pp. 160-180. Available online at
http://www.jcer.net (last accessed: December 2014)
ANNEX I: THE ODC TOOLKIT

Various aspects and elements of the ODC have been realised as software in project years 2 and
3. To the extent that the ODC is a common resource space with a range of tools, this can be
equated with the Citadel space on Github, available at https://github.com/CitadelOnTheMove,
which is used by developers in the Citadel community for both on-going and stable projects.
Among the resources there is the landing page http://citadelonthemove.github.io/ which
aims to provide an entry point for new developers with an easy overview of the Citadel toolkit
overall, including examples, tutorials etc.
In these spaces, the resources that can be said most properly to belong to the ODC (apart from
the AGT, which is discussed separately in specific reports) include:
The Citadel Converter (actually three projects)

The PHP Converter Library with geoJSON conversion
The CitySDK-Citadel script
The specific code for each of these resources is available through Github, so in the following we
simply provide an overview of how each element is structured and how it fits into the ODC
concept.
THE CITADEL CONVERTER

The Citadel Converter is the main tool to have been developed as an implementation of the
ODC concept, in that it converts from the most widely found data formats (tabular datasets in
CSV or XLS/X format) into the C-JSON format used by the Citadel templates and the AGT.
Elements of the Converter that have been developed as part of the Lisbon Pilot (namely with
external resources) and then re-integrated into the Converter available on the Citadel Platform
are marked with a double asterisk**.
The Java Converter is made up of three software components:
The Library: this is the heart of the converter, and carries out the actual conversion
function and makes it available to external parties thanks to its APIs
The GUI standalone: this is the graphical interface for those wishing to use the Library
off-line.
The Portlet: this is the portlet installed in a Liferay Portal to use the Library via web (and
integrated into the Citadel platform).
All three modules are written in Java 1.7 using Maven.
THE LIBRARY
Github: https://github.com/CitadelOnTheMove/converter-lib/
Wiki: https://github.com/CitadelOnTheMove/converter-lib/wiki/
Full API documentation: https://github.com/CitadelOnTheMove/converter-lib/wiki/APIDocumentation
The main features of the Converter library are to:
Open virtually any kind of source dataset, at the moment:

o Comma-Separated Values (.csv): stores tabular data (numbers and text) in plain-text form following RFC 4180.
o Microsoft Excel files (.xls and .xlsx): Microsoft Excel produces documents in the
generic OLE 2 Compound Document and Office Open XML (OOXML) formats.
The older OLE 2 format was introduced in Microsoft Office version 97 and was
the default format until Office version 2007 introduced the new XML-based OOXML
format.
Enter metadata for source datasets
Support semantic match using categories and contexts
Support column mapping and string operations on columns and custom text to create
target datasets with different structures and fields
Support virtually any target format of the converted dataset, at the moment the Citadel
common POI format using JSON and MyNeighbourhood** format using CSV
Validate generated target datasets, at the moment only according to the Citadel
common POI schema (C-JSON)
Support uploading the generated dataset in CKAN**
Customizable configuration to tailor CSV parsing options, contexts and categories
Multilingual, at the moment English, French and Italian are supported
Easy to plug your favourite logging framework
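To make the column mapping, string operations and validation features more tangible, the following is a minimal sketch in Python rather than the Java library itself. The mapping notation, the simplified POI structure and the validation rules are assumptions for illustration; they are not the library's API nor the actual Citadel common POI schema.

```python
import csv
import io

# Hypothetical mapping: each target field is built from a list of parts.
# ("col", name) copies a source column; ("text", s) inserts custom text.
MAPPING = {
    "title": [("col", "Name")],
    "description": [("text", "Open: "), ("col", "Hours")],
    "lat": [("col", "Latitude")],
    "lon": [("col", "Longitude")],
}
REQUIRED = ("title", "lat", "lon")


def build_field(parts, row):
    """Concatenate source columns and custom text into one target value."""
    return "".join(row[value] if kind == "col" else value for kind, value in parts)


def convert(csv_text, mapping, category):
    """Map every CSV row to a simplified POI record with a semantic category."""
    pois = []
    for i, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=1):
        poi = {name: build_field(parts, row) for name, parts in mapping.items()}
        poi["id"] = str(i)
        poi["category"] = [category]
        pois.append(poi)
    return {"poi": pois}


def validate(dataset):
    """Return a list of human-readable problems (empty when valid)."""
    errors = []
    for poi in dataset["poi"]:
        for name in REQUIRED:
            if not poi.get(name):
                errors.append(f"POI {poi['id']}: missing {name}")
        try:
            float(poi["lat"]), float(poi["lon"])
        except ValueError:
            errors.append(f"POI {poi['id']}: non-numeric coordinates")
    return errors
```

Keeping the mapping as plain data, separate from the conversion code, is what would allow the same configuration to be reused by a GUI, a portlet or a batch process.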
THE GUI STANDALONE
Github: https://github.com/CitadelOnTheMove/converter-gui/
Wiki: https://github.com/CitadelOnTheMove/converter-gui/wiki/
Step by step User Guide with pictures:
https://github.com/CitadelOnTheMove/converter-gui/wiki/User-Guide
Video Guide: https://github.com/CitadelOnTheMove/converter-gui/wiki/Video-guide
The main features of the Converter GUI include the features of the Converter library and make
it easy to:
Change the Language

Start a new conversion or cancel it through a navigable wizard
Load one or more source datasets
Specify the settings to use to load the CSV file or the Microsoft Excel spreadsheet with
automatic preview
Drag and drop to carry out the semantic match with categories and contexts
Drag and drop to map columns, add custom text and concatenate both custom text and
columns
Visual indication of different kinds of messages (notice, warning and error) on data
mapping
Display error message boxes in the conversion process or on validation
Preview the generated target dataset
Save the generated dataset locally
PORTLET FOR LIFERAY PORTAL
Github: https://github.com/CitadelOnTheMove/converter-portlet/
Wiki: https://github.com/CitadelOnTheMove/converter-portlet/wiki/
Step by step User Guide with pictures:
https://github.com/CitadelOnTheMove/converter-portlet/wiki/User-Guide
Video walk-through: http://youtu.be/oTn76MqzuG4
The main features of the Converter Portlet include the features of the Converter library and
make it easy to:
Change the Language

Start a new conversion or cancel it through a navigable wizard
Load one or more source datasets
Specify the settings to use to load the CSV file or the Microsoft Excel spreadsheet with
automatic preview
Drag and drop to carry out the semantic match with categories and contexts
Drag and drop to map columns, add custom text and concatenate both custom text and
columns
Visual indication of different kinds of messages (notice, warning and error) on data
mapping
Display error message boxes in the conversion process or on validation
Preview the generated target dataset
Save/publish the generated dataset with three options
o Download the file locally
o Save the file to a CKAN server and publish the dataset information to the Citadel
Index**
o Save the file to the Citadel Platform and publish the dataset information to the
Citadel Index**
USAGE STATISTICS
The following provide some statistics on access to and use of the Converter starting April 24,
2014 (installation of a significantly revised version following user feedback) and ending
December 12, 2014:
Total of 603 user sessions (persons initiating a conversion process for at least one
dataset), average of 2.6 sessions per day
Datasets loaded: 814 (386 CSV, 375 XLSX, and 53 XLS)
Datasets successfully converted: 627 (606 Citadel JSON and 21 MyNeighbourhood CSV)
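The figures above can be cross-checked with a few lines of arithmetic. The sketch below only uses the numbers quoted in this section; the conversion rate at the end is a derived figure that the report itself does not state, and it assumes that every converted dataset is counted among the loaded ones.

```python
from datetime import date

# Figures quoted in the usage statistics above.
sessions = 603
loaded = {"CSV": 386, "XLSX": 375, "XLS": 53}
converted = {"Citadel JSON": 606, "MyNeighbourhood CSV": 21}

# Length of the reporting period (April 24 to December 12, 2014).
days = (date(2014, 12, 12) - date(2014, 4, 24)).days

sessions_per_day = round(sessions / days, 1)
total_loaded = sum(loaded.values())
total_converted = sum(converted.values())

# Derived figure (assumption: converted datasets are a subset of loaded ones).
conversion_rate = round(100 * total_converted / total_loaded, 1)

print(days, sessions_per_day, total_loaded, total_converted, conversion_rate)
```

The totals agree with the text (814 datasets loaded, 627 converted, 2.6 sessions per day).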
THE PHP CONVERTER LIBRARY

The PHP Converter library was developed as a modification of the original PHP prototype
version of the Converter software (December 2013), as a proof of concept (more tech-friendly
than user-friendly) for the following features:
It can carry out on-the-fly CSV, geoJSON and osmJSON to Citadel JSON conversions for
mobile applications
Conversion from the osmJSON format makes it possible to get live data from OpenStreetMap
The GeoJSON export format makes it possible to use the converted data in other
GeoJSON-compatible applications (including web mapping services)
It includes a mapping template editor for easy generation of config files, which enable
live encoding from various datasets
It caches some converted data: a file is updated only when requested or when specific
criteria are met, so that converted files can be served faster while still being updated
on a regular basis
It is designed to be embedded into other Open Source products, such as CMS or data
stores, to allow them to natively provide Citadel JSON output.
Github: https://github.com/CitadelOnTheMove/converter-php-lib
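The caching behaviour described in the feature list can be illustrated with a minimal sketch. It is written in Python for brevity and is not the library's PHP code: the function name cached_conversion, the max_age_seconds criterion and the convert callback are all hypothetical, chosen only to show the idea of regenerating a converted file when it is stale or when its source has changed.

```python
import os
import time


def cached_conversion(source_path, cache_path, convert, max_age_seconds=3600, now=None):
    """Return converted text, regenerating the cache only when it is stale.

    `convert` is any function mapping source text to converted text
    (hypothetical stand-in for a CSV-to-Citadel-JSON conversion).
    """
    now = time.time() if now is None else now
    if os.path.exists(cache_path):
        cache_mtime = os.path.getmtime(cache_path)
        fresh = (now - cache_mtime) <= max_age_seconds
        source_unchanged = os.path.getmtime(source_path) <= cache_mtime
        if fresh and source_unchanged:
            # Serve the previously converted file without converting again.
            with open(cache_path, encoding="utf-8") as fh:
                return fh.read()
    # Cache missing, too old, or older than the source: convert and store.
    with open(source_path, encoding="utf-8") as fh:
        result = convert(fh.read())
    with open(cache_path, "w", encoding="utf-8") as fh:
        fh.write(result)
    return result
```

Calling the function twice in a row on an unchanged source runs the conversion only once; the second call is served from the cached file.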
The code is mainly intended as a basis for more advanced projects, with the following overall
roadmap.
Implement the complete set of data fields from the Citadel-JSON format
Add an editor feature that allows various fields to be used in the output description field
for POIs
Connect the library to data sources other than CSV files, particularly the database backends
of existing CMS.
THE CITYSDK-CITADEL CONVERSION SCRIPT**

The CitySDK-Citadel conversion script makes it possible to query a CitySDK tourism POI database
using a variation of the CitySDK API36, with the results being fed directly into the Citadel
Converter for conversion. The original script was developed37 on the fly during the
CitySDK-Citadel workshop held in the context of the Open Data Days in Ghent (February 2014).
Github: https://github.com/CitadelOnTheMove/CitySDK-Citadel-Script
36
CitySDK is a sister project to Citadel. It developed standard APIs for Open Data webservices to allow portability of
apps across Europe.
37
Development of the CitySDK Citadel conversion script was carried out without resources from Citadel. The first
script was developed with resources from the CitySDK project, while its refinement was carried out as part of the
Lisbon FI-WARE Pilot.
During the Lisbon pilot, the script was further refined to allow access to the database (with
parameters, filters, etc.) through a URL query such as
http://citysdk.ist.utl.pt:8000/?city=amsterdam&format=csv&limit=5.
Github: https://github.com/rsbarata/CitySDK-Citadel-Script
In parallel, a feature was added to the Citadel converter** that allows a file to be loaded
through a URL field rather than by browsing and selecting. This has made it possible to add
datasets with tourism POIs from Lisbon, Amsterdam, Helsinki, and Rome to the Citadel Platform
through direct queries to the CitySDK API.
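As an illustration, the URL query quoted above can be assembled programmatically. This is a minimal sketch: the endpoint and the three parameters (city, format, limit) are taken from the example URL in the text, while the helper name build_query_url and its defaults are hypothetical. No request is sent, and the service may no longer be online.

```python
from urllib.parse import urlencode

# Endpoint quoted in the text above.
BASE_URL = "http://citysdk.ist.utl.pt:8000/"


def build_query_url(city, fmt="csv", limit=5, base_url=BASE_URL):
    """Build a query URL with the parameters shown in the example URL."""
    params = {"city": city, "format": fmt, "limit": limit}
    return base_url + "?" + urlencode(params)


url = build_query_url("amsterdam")
print(url)
```

A URL built this way is the kind of value that could be pasted into the converter's URL field described above.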
ANNEX II: STANDARDS ADOPTED IN THE CITADEL PLATFORM

This Annex is based on material from Citadel Deliverable 2.3.3, New Standards and
Recommendations, authored by Florian Daniel, Julia Glidden, and Geert Mareels.
Although the Open Data Commons aims to act as an open framework, in practice it is
nonetheless necessary to identify one or more standards to adopt for use. The following
therefore details the main choices made for the Citadel Platform as currently available at
http://www.citadelonthemove.eu/.
FILE FORMATS MAPPING

From our interactions with Pilot and Associate cities, Citadel quickly understood that most data
owners in cities do not have a strong grasp of the relative benefits of different data models and
file formats. Therefore, all but the most advanced cities take the path of least resistance,
publishing data in whatever data structure they already hold using spreadsheet-friendly file
formats such as .CSV and .XLS. In only a few cases (10% of Associates) did Associate cities have
any data published in an RDF-model format.
In light of this on-the-ground reality, the Citadel team realized that it needed to create tools and
associated recommendations that balanced the simple compliance steps required to get cities
on board on the one hand with maximizing data reuse on the other. Toward this end, the team
conducted a mapping of the most appropriate file formats available:
Table 14. Citadel Common File Formats Mapping Grid
| Format | Description | Pros | Cons |
|---|---|---|---|
| .XLS/.XLSX (Excel) | Represents data as tables accepted by all spreadsheet programmes | Very accessible to people and widely used | Proprietary format. Does not use Unicode. Too simple to allow for most programmes to make use of this data directly. Does not express relations between data |
| .CSV | Represents data in flat tables easily read by people or machines | Easily understood or parsed by most programmes. Easily read by humans. Application-neutral. | Tabular format does not express relationships between data, making it less applicable for complex applications |
| .XML | Represents data as a structured tree schema that expresses relations between data | Strong schema makes it possible to attach rich information data | Schemas are complex, making it difficult to write programmes |
| .JSON | Represents data in a simple tree structure, making it programmer-friendly | Structure and links make it very easy to build services using data | Lack of schema means it only supports simpler types of data |
| RDF Formats (e.g. Turtle, N-Triples and JSON-LD) | Represents data as a network of linked points that make it easy to understand complex patterns using computer programmes | Highly-structured information makes it easy to search and retrieve information in services. Easy to visualise data relationships | Complex structure makes it difficult to parse and therefore costly to work with. Relational structure makes human reading challenging. Complex to work with in development. |
Based upon the above mapping and in alignment with the trends outlined above, Citadel
ultimately chose to use two file formats in its own work:
CSV - Recommended for publishing Open Data for the public38,
JSON - Recommended for publishing Open Data to use in mobile applications
FILE FORMATS CHOICES

Citadel's choice to use CSV as the main input data format was first and foremost led by the
project's ambition to foster easy access to open government data. As discussed, a majority of
cities around the world use spreadsheet programmes such as Excel for managing their data.
Thus, rather than fight this trend and try to impose a more technically advanced standard from
above, Citadel elected to work with the practice via a format like CSV that is easy to edit and
export39 and noted for its simplicity and accessibility40, compactness, ease and speed of
processing and scalability41.
The choice of JSON for mobile development was motivated primarily by a desire to make our
Open Datasets attractive to App developers. JSON presents data in a simple tree structure
without the need for a formal schema. JSON is consequently well-suited to building capable
apps and popular with the developer community42 because, unlike schema-based formats such as
XML, the dataset contains all the information the app needs to work well. At the same time,
JSON's lack of schema likewise makes it relatively easy to convert standard CSV files into this
developer-friendly format.
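The ease of conversion mentioned above can be seen in a minimal sketch that turns a CSV text into a plain JSON list of records. This produces generic JSON, not the Citadel JSON format; the function name csv_to_json and the sample column names are hypothetical and serve only as an illustration.

```python
import csv
import io
import json


def csv_to_json(csv_text, delimiter=","):
    """Convert CSV text (first row = column names) to a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    rows = [dict(row) for row in reader]
    # ensure_ascii=False keeps accented characters readable in the output.
    return json.dumps(rows, ensure_ascii=False, indent=2)


# Hypothetical sample with one point of interest.
sample = "name,latitude,longitude\nEuropean Parliament,50.838908,4.373942\n"
records = json.loads(csv_to_json(sample))
print(records)
```

Because no schema has to be declared, each CSV column name simply becomes a key in the resulting objects; all values remain strings unless a later step assigns types.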
DATA MODEL CHOICE

To support the release of developer-friendly JSON, Citadel created a common data model for
each dataset, which we call Citadel JSON. Citadel JSON's Data Model is based on an extension
of the W3C PoI Core Data Model43. Citadel JSON's data model has two advantages over other
commonly-used JSON data models:
1. Semantic Annotation - A Citadel JSON file tags every piece of information with a
machine-readable category. These tags are contained within the dataset, meaning any
app can easily read and make sense of the data without additional programming.
2. Cross-Border Use - A Citadel JSON file will work with any other application designed
using Citadel JSON. This means that an app developed to find art galleries in Helsinki can
also find galleries in Palermo with no need to develop and download a new service.
38
Some of the Citadel conversion tools also accept Excel, OSM JSON or geoJSON files
39
http://www.opendataimpacts.net/2014/10/data-standards-and-inclusion-in-the-network-society/
40
CSV files can be read and edited by humans, using a simple text editor
41
Huge CSV datasets can be easily handled, as the one line per entry structure enables sequential processing
42
http://blog.mongolab.com/2011/03/why-is-json-so-popular-developers-want-out-of-the-syntax-business/
43
http://www.w3.org/TR/poi-core/
From the perspective of developers, the two features above make Citadel JSON a significant
improvement over existing Open Data models. The following visual provides an overview of the
Citadel JSON data model:
Figure 30. Citadel JSON Data Model
The Citadel JSON data model was a significant extension of the W3C PoI Core Draft, currently
the global guideline for the production of PoI Data Models.
POINTS OF INTEREST (POI) STANDARDS

As noted above, Citadel's Data Model is based on the W3C POI Core draft44, which defines a data
model for a "location about which information is available". As the model was designed to be
used in mobile web applications, it was implemented in JSON, which, as discussed above, is the
most used and suitable data format45. The resulting format, called Citadel JSON, includes
data from the original dataset, information about the dataset itself (known as metadata, or
"data about data"; see below), and specific fields that describe how the data should be used
in mobile applications.
Citadel JSON can be compared with the related format GeoJSON, which is also used to describe
POI but does not include information about the dataset and is not specifically designed to
provide all required data for mobile applications. The two file formats can be merged or easily
converted.
44
http://www.w3.org/2010/POI/documents/Core/core-20111216.html
45
JSON is the native data format used by JavaScript, which is responsible for the dynamic part of the applications
GEOSPATIAL STANDARDS
Citadel JSON uses the WGS84 coordinate reference system to represent the location of points of
interest accurately. WGS84 is used by GPS providers and most well-known mapping systems
and can be considered the world standard. The coordinates of a given point on Earth are
expressed in decimal format, using the axis order latitude, then longitude, separated by a
space, e.g. "50.838908 4.373942" for the European Parliament building in Brussels. The Citadel
JSON conversion process takes separate latitude and longitude fields as input and combines
them with a separating space, resulting in a single value with latitude first, then longitude.
The Citadel format also allows other standards to be declared and used, though this is not
recommended and not handled by the mobile templates46.
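The coordinate convention described above can be sketched as follows. The first function joins separate latitude and longitude fields into the single "latitude longitude" value; the second shows the axis swap needed for GeoJSON, whose Point geometry lists longitude first. Both function names are hypothetical, and the sketch does not reproduce the actual field layout of a Citadel JSON file.

```python
def to_citadel_position(latitude, longitude):
    """Join separate latitude/longitude fields into one 'lat lon' value (WGS84)."""
    lat, lon = float(latitude), float(longitude)
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    return f"{lat} {lon}"


def to_geojson_point(position):
    """Convert a 'lat lon' value into a GeoJSON Point (longitude comes first)."""
    lat_text, lon_text = position.split()
    return {"type": "Point", "coordinates": [float(lon_text), float(lat_text)]}


# European Parliament building in Brussels, as in the example above.
position = to_citadel_position("50.838908", "4.373942")
point = to_geojson_point(position)
print(position, point)
```

The range checks catch the most common mistake, swapped axes, only when the swapped value falls outside the valid latitude range; they are a safeguard, not a full validation.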
DATE AND TIME DATA

Citadel uses date and time information both to identify the dataset itself and as part of datasets
related to events. The Events Template created by Citadel uses this information to filter the POI
based on selected dates. The dataset time metadata is formatted using the ISO 8601 standard47,
preferably with time zone shift information (e.g. 2014-10-02T15:13:19+00:00), as the
applications may be used in various time zones.
Due to the variety of input data formats in source datasets, the date format used for Events POI
is entered as free-form text in Citadel JSON. However, precautions should be taken to ensure that
the format used can be parsed by the Events Template.
In addition to ISO 8601, the Citadel Events Template notably handles the RFC 2822 Date and Time
Specification48 (e.g. Mon, 25 Dec 1995 13:30:00 GMT), as well as dates using the format
DD/MM/YYYY.
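To check in advance whether a free-form date falls into one of the three formats named above, a publisher could use a small parser such as the sketch below. It is an illustration in Python using only the standard library; it is not the Events Template's own parsing code, and the function name parse_event_date is hypothetical.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime


def parse_event_date(text):
    """Try ISO 8601, then RFC 2822, then DD/MM/YYYY; raise ValueError otherwise."""
    text = text.strip()
    try:
        return datetime.fromisoformat(text)  # e.g. 2014-10-02T15:13:19+00:00
    except ValueError:
        pass
    try:
        return parsedate_to_datetime(text)  # e.g. Mon, 25 Dec 1995 13:30:00 GMT
    except (TypeError, ValueError):
        pass
    return datetime.strptime(text, "%d/%m/%Y")  # e.g. 25/12/1995


iso_date = parse_event_date("2014-10-02T15:13:19+00:00")
rfc_date = parse_event_date("Mon, 25 Dec 1995 13:30:00 GMT")
plain_date = parse_event_date("25/12/1995")
print(iso_date, rfc_date, plain_date)
```

A value that matches none of the three formats raises a ValueError, which is a useful signal that the source column should be cleaned before conversion.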
SENSOR AND IOT

As part of Citadel's Pilot Activities with the City of Manchester, we integrated some real-time
sensor data into our App Generator. This work required a relevant standard for sensor data,
which ultimately led the project to the work of the Open Geospatial Consortium (OGC)49
on Sensor Web Enablement. OGC has developed a series of industry-leading data model
standards50 that are designed to describe, query and exchange sensor information. Of these
model standards, two proved useful for the Citadel sensor data:
1. Sensor Observation Service (SOS)51 - describes a web service to query sensor data,
2. ISO 19156:2011 Observations and measurements52 - describes the sensor data.
46
Because the underlying map uses WGS84 too
47
http://www.iso.org/iso/iso8601
48
https://www.ietf.org/rfc/rfc2822.txt
49
http://www.opengeospatial.org/
50
These standards are: SensorML (sensor model), O&M (observation data), SOS (observation service), SPS (planning
service), and SAS (alert service)
51
http://www.opengeospatial.org/standards/sos

Data produced by sensors and by Internet of Things (IoT) devices are often only available in
proprietary formats or vendor-specific data models. Such data feeds are generally accessed
through a webservice (an automatic dataset that supplies the latest information when
requested by an app or service), which most commonly exports data in JSON, XML or CSV
format. Citadel chose to use the proprietary sensor platforms already installed in Manchester to
showcase live sensor data in action53. For others exploring the use of sensor data as Open Data,
we recommend the use of a webservice with either JSON or CSV format.
METADATA
Metadata, or "data about data", is structured information that describes, explains, locates, or
otherwise makes it easier to retrieve, use or manage an information resource.54 The reference
metadata standard to describe online resources is Dublin Core Metadata,55 from which 15 core
terms56 have been normalised in ISO 15836:2009.57 Citadel chose to conform to this
widely-recognised international standard - all Citadel data therefore uses Dublin-Core Metadata.
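As an illustration of working with the 15 Dublin Core terms, the sketch below checks a metadata record against the list given in footnote 56. How Citadel JSON actually stores these terms is not shown in this Annex, so the flat dictionary layout, the function name check_metadata and the sample record (including its non-standard "updated" key) are hypothetical.

```python
# The 15 Dublin Core terms listed in footnote 56.
DUBLIN_CORE_TERMS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def check_metadata(metadata):
    """Return (unknown_keys, unused_terms) for a flat metadata mapping."""
    unknown = sorted(key for key in metadata if key not in DUBLIN_CORE_TERMS)
    unused = [term for term in DUBLIN_CORE_TERMS if term not in metadata]
    return unknown, unused


# Hypothetical record: "updated" is not a Dublin Core term, "date" would be.
example = {"title": "Art galleries", "language": "en", "updated": "2014-10-02T15:13:19+00:00"}
unknown, unused = check_metadata(example)
print(unknown, len(unused))
```

A check of this kind makes it easy to spot keys that should be renamed to a standard term before a dataset is published.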
POI DATASET CATEGORIES

The ISO 19115 standard58 describes the main themes for geographic dataset categorization. These
top-level themes have been extended by the INSPIRE directive, which recommends for each dataset:
A unique INSPIRE theme59
Additional keywords from the GEMET-Concepts,60 or a professional thesaurus or free
keywords
The Citadel Data Index61 uses a different, narrower list of categories, as it is more convenient
for general public POI categorization. This classification is available using a JSON
implementation of the RDF Data Catalogue standard (DCAT)62 through the dataset web service of
the Open Data Index.
The categories used inside the datasets themselves are free-form because they reflect the ones
used in the original data file, which may or may not be structured. While it is not enforced at
all, the use of existing categorization vocabularies can be a step forward toward better
interoperability. It would allow better dataset auto-discovery in the future, and is therefore
advised.
52
http://www.iso.org/iso/catalogue_detail.htm?csnumber=32574
53
Which uses Xively API, which was used by 2 pilot cities, fetching data as JSON: https://xively.com/dev/docs/api/
54
http://en.wikipedia.org/wiki/Metadata_standards
55
http://dublincore.org/
56
The 15 core terms are: title, creator, subject, description, publisher, contributor, date, type, format, identifier,
source, language, relation, coverage, rights
57
http://www.iso.org/iso/fr/home/store/catalogue_ics/catalogue_detail_ics.htm?csnumber=52142
58
http://www.iso.org/iso/fr/home/store/catalogue_tc/catalogue_detail.htm?csnumber=53798
59
http://inspire.ec.europa.eu/theme/
60
http://www.eionet.europa.eu/gemet/en/themes/
61
http://www.citadelonthemove.eu/en-us/opendata/opendataindex.aspx
62
http://www.w3.org/TR/vocab-dcat/
MOBILE APPLICATION TEMPLATES

Citadel's mobile application templates use the HTML5 standard63, which is mobile-platform
agnostic, rather than vendor-specific mobile platform technologies such as Google's Android or
Apple's iOS64. While HTML5 itself doesn't allow developers to build native apps (apps
developed for use on a particular platform or device) at this point in time,65 it can be embedded
into native applications for iOS and Android to offer native features to mobile users. Citadel's
HTML-based templates can be used even on the most basic web servers,66 and adapted without
a heavy development environment, using only text editors.
GAPS IN EXISTING STANDARDS

CHARACTER ENCODING ISSUES
Citadel supports the UTF-8 character encoding standard, which has existed since 1996, has been a
world standard since 2003 and supports all known alphabets on earth. Despite the benefits of
UTF-8, use of Citadel tools has identified a number of outstanding issues with files encoded
in other formats:
Some popular spreadsheet editing software, including Microsoft Excel, still uses regional
character encodings by default
This choice leads to less-interoperable files, with accents and special characters being
misinterpreted
Such regional encoding can cause challenges for the conversion process, including
unreadable accented characters in files
Citadel Recommendation:
Adoption of the UTF-8 standard should be ensured so that text data and information
can be exchanged in an interoperable manner throughout Europe and across the world.
Access to state-of-the-art standards and their implementation into varying languages
should be free wherever possible.67
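A data publisher can apply this recommendation before conversion by re-encoding files to UTF-8. The sketch below is one possible approach, not a Citadel tool: it assumes that a file which is not valid UTF-8 was saved in a single known regional encoding, here Windows-1252 as an example, and the function name to_utf8 is hypothetical.

```python
def to_utf8(raw_bytes, fallback_encoding="cp1252"):
    """Return the content as UTF-8 bytes.

    Valid UTF-8 input is passed through (a leading byte order mark is dropped);
    anything else is decoded with the assumed regional encoding.
    """
    try:
        text = raw_bytes.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumption: the file was exported with the given regional encoding.
        text = raw_bytes.decode(fallback_encoding)
    return text.encode("utf-8")


regional = "Café, Musée".encode("cp1252")  # as a spreadsheet export might produce
converted = to_utf8(regional)
print(converted.decode("utf-8"))
```

The fallback encoding has to match the software and locale that produced the file; a wrong guess yields readable but incorrect characters, so the result should be checked by eye for accented names.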
POI ISSUES
As W3C did not finalise a standard for POI, Citadel had to use the unfinished draft. Use of this
draft uncovered a number of issues:
Some fields did not suit flat files,68 and were simplified to become more user-friendly,
Citadel needed to describe the dataset itself, and extended the POI draft to add fields that
describe the dataset; these fields basically wrap the updated POI data,
Citadel needed to add fields specifically designed to be used with mobile applications,
which were implemented through an additional extensible data model (the tpl
identifiers),
The draft W3C standard did not go far enough: real-world implementation revealed a
need to build usable tools without using a full linked data infrastructure.
Citadel Recommendation:
Foster the development of proven standards that fit developer needs.
63
HTML5 standards also rely on CSS3 and JavaScript languages
64
Despite being Open Source, Android remains complex to use for non-professional developers, and iOS is
proprietary
65
A big issue with HTML5
66
The templates also use PHP, and MySQL for database-powered apps, which both power most of the world's
websites
67
E.g. ISO standards are still paid-access.
EVENTS ISSUES
The W3C POI data standard did not include calendar information by default, which made it
impossible to properly display events on the map.
Citadel added calendar information to POI data using the extensible attributes defined in the
Citadel JSON format.
GEOSPATIAL ISSUES
Geospatial standards remain a poorly defined area. Many reference systems exist, with no clear
guidance on which ones are best to use and why.
In our work, Citadel found that geographical coordinates from different countries had variations
in both axis order (whether latitude or longitude comes first when written) and the form in
which Lat/Long were written (one cell or two separate cells). Citadel also found that geographic
coordinates are based on various geospatial reference systems, which often leads to
inconsistency between datasets as the reference system is not always stated explicitly
(especially in tabular files). As an example, Barcelona, which has published many rich datasets,
offers a bus stop dataset which, even after proper conversion to the global standard for
Latitude and Longitude (WGS84), shows a small shift (about 2 blocks) for all POIs, making it
unusable in practice for Citadel apps.
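The two variations described above (axis order, and one cell versus two cells) can be normalised with a small helper such as the sketch below. It is an illustration only: the function name parse_coordinates and the axis_order labels are hypothetical, decimal commas are not handled, and the sketch does nothing about differing reference systems, which is the harder problem illustrated by the Barcelona example.

```python
def parse_coordinates(cells, axis_order="latlon"):
    """Return (latitude, longitude) from one combined cell or two separate cells.

    cells: list with either one value such as "50.838908 4.373942" (separated
    by a space, comma or semicolon) or two values, one per axis.
    axis_order: "latlon" or "lonlat", as declared by the data publisher.
    """
    if axis_order not in ("latlon", "lonlat"):
        raise ValueError(f"unknown axis order: {axis_order}")
    if len(cells) == 1:
        parts = cells[0].replace(";", " ").replace(",", " ").split()
    else:
        parts = [cell.strip() for cell in cells]
    if len(parts) != 2:
        raise ValueError(f"expected two coordinate values, got {len(parts)}")
    first, second = float(parts[0]), float(parts[1])
    lat, lon = (first, second) if axis_order == "latlon" else (second, first)
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError(f"coordinates out of range: {lat}, {lon}")
    return lat, lon


one_cell = parse_coordinates(["50.838908 4.373942"])
two_cells = parse_coordinates(["4.373942", "50.838908"], axis_order="lonlat")
print(one_cell, two_cells)
```

Both calls return the same latitude-first pair, which is the order expected by the Citadel JSON format.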
Conversion between geographic coordinate systems remains a complex issue for non-GIS
specialists. We believe this complexity has contributed to a lack of easy-to-use tools and of
best practice on which geospatial reference systems to use. Finally, the auto-discovery feature
of the Citadel AGT, which allows any app to automatically detect data corresponding to one's
current location and load it into the app, shows that there is a key need to be able to describe
the area covered by a given dataset (instead of attaching it to a central point) in order to
enable applications to get the most accurate data at different scales depending on their
completeness. A possible pre-requisite for this functionality would be a common thesaurus of
administrative boundaries and their evolution through time.
68
They are rather designed for Linked Data, using extensive URI and namespaces instead of clear text and URL, which
are more user-friendly for developers and data editors that lack the surrounding infrastructure to easily produce
these structured and linked data files.
Citadel Recommendation:
Set a central open interoperable standard for geographic coordinates based on WGS84,
with a single, fixed axis order and coordinate formatting
Provide adequate and open conversion tools to enable data publishers to publish their
data using a shared, unique coordinate system to exchange information outside the
GIS community
Establish an administrative ontology of European boundaries, at different scales and with a
historical perspective, in order to enable local data naming and discovery using both
geographical coverage and administrative entities
Where possible, cities should geocode their POI in latitude and longitude fields.
ANNEX III: THE CITADEL CHARTER

PREAMBLE
THE MALMOE DECLARATION
The Malmoe Ministerial Declaration on eGovernment69, approved on 18th November 2009, sets
forth the following joint vision for 2015:
European governments are recognised for being open, flexible and collaborative in
their relations with citizens and businesses. They use eGovernment to increase
their efficiency and effectiveness and to constantly improve public services in a way
that caters for users' different needs and maximizes public value, thus supporting
the transition of Europe to a leading knowledge-based economy.
At the time, the Malmoe Declaration seemed to be an important step forward, yet as we
approach the year 2015, Europe could not be further from that vision. Since 2009, trust in the
European Union has fallen to new lows70, from 48% to 31%, and trust in national governments
from 29% to 27%. Whatever progress has been made in eGovernment services and Open Data
strategies has not been enough to have a significant impact. Engagement, representation, and
the provision of services will have to change, and change quickly, in order to close the gap
between European governments and the citizens and businesses they serve.
THE CITADEL STATEMENT

The Citadel Statement71, signed by a group of local authorities on December 10th, 2010, a year
after Malmoe, aimed to "make Malmoe real" by identifying the key role local governments should
play in this process. While the Statement's recommendations have not been adopted at the EU
Ministerial level, they did lay the ground for the Citadel on the Move project, which
has been working since 2012 with pilot cities in four EU Member States to explore new
scenarios with local citizens and businesses. Two years of co-design and experimentation with
communities of users in over 100 towns and cities are paving the way for a massive uptake of
Open Data as the foundation of new business ecosystems, urban lifestyles, and government
services.
The focus of development in Citadel has been an integrated platform designed to engage
non-technical citizens, businesses, and civil servants in Open Data, providing simple tools with
which to convert and publish information and generate an app in only a few minutes. As part of
this effort, Citadel also explored related issues such as semantics and standards, governance and
privacy, and Living Lab engagement and evaluation methodologies, as enabling mechanisms for
local authorities to spark off diffused processes of transformative innovation. To build
momentum, a specific outreach initiative has enlisted over 100 local authorities in over 60
countries worldwide sharing the Citadel vision.
69
http://ec.europa.eu/digital-agenda/sites/digital-agenda/files/ministerial-declaration-on-egovernment-malmo.pdf
70
Standard Eurobarometer 81, http://ec.europa.eu/public_opinion/index_en.htm.
71
http://www.corve.be/docs/english/Citadel%20Statement.pdf
LEARNING FROM THE CITADEL PROJECT

In this process, specific new insights have been gained on the core issues identified in the 2010
Citadel Statement, as follows:
1. Common Architecture, Shared Services and Standards
The Citadel Statement called for a common service delivery architecture.
The Citadel project defined the similar but more open and agile concept of the Open
Data Commons, a shared space in the public domain that rather than impose standards
provides an open, multi-standard framework that promotes the convergence of
practice.
2. Open Data, Transparency and Personal Rights
The Citadel Statement called for common data models to make data consistent across
Europe, respecting personal information.
The Citadel project defined these within an open semantic framework that allows for
the integration of new data models. This allows for the emergence of privacy-as-a-service
concepts on the one hand, and for mobile applications to discover similar
datasets in new localities on the other.
3. Citizen Participation and Involvement
The Citadel Statement called for citizen participation in decision-making and service
design.
The Citadel project adopted the Living Lab method, engaged citizen developers in
co-designing the tools and services, and developed a platform that opens up Open Data
itself to non-technical citizens and businesses, broadening the scope from Public Sector
Information to include data-holders throughout the community.
4. Privacy and Identification of Individuals
The Citadel Statement called for a European framework to address privacy issues.
The Citadel project explored the implications of privacy in citizen-driven Open Data
scenarios, identified limitations in the current policy debate, and defined procedures for
individual data licensing for privacy-as-a-service in the Open Data Commons.
5. Rural inclusion
The Citadel Statement called for equality of broadband access.
The Citadel project focused on Smart City services, extending the scope beyond large
and well-funded cities to emphasize the innovation potential of small and medium-sized
towns and proposing broadband cloud platforms for Open Data as a regional public
service.
TOWARDS THE MALMOE OBJECTIVES

These elements, in line with the logic of the Citadel Statement, come together to provide a
more effective and convincing path to reaching the main objectives of the Malmoe Declaration.
Citizens and businesses are empowered by eGovernment services designed
around users' needs and developed in collaboration with third parties, as well
as by increased access to public information, strengthened transparency and
effective means for involvement of stakeholders in the policy process.
This objective speaks about Open Data without mentioning it. Citadel, together with other
projects and general trends over the last five years, demonstrates that Open Data is the
foundation for any effective eGovernment strategy based on transparency and open access to
public information. Furthermore, Citadel has demonstrated that local authorities need to go
beyond mere involvement and adopt strategies for deep engagement of citizens and
businesses in a logic of service co-design and co-production. Public sector information needs to
be considered not the end goal of an Open Data strategy but only the beginning of a path which
will need to integrate data openly provided by citizens, businesses, and any other activity or
entity in the territory.
Mobility in the Single Market is reinforced by seamless eGovernment services in
the setting up and running of a business and for studying, working, residing and
retiring anywhere in the European Union.
Citadel has explored services able to cross any administrative border, including those between
municipalities and regions in the same country. The project identified the semantic
interoperability of underlying data structures as key to the seamless, trans-European fluidity of
the services upon which they are based. This issue is addressed not by attempts to agree upon
unique standards, but rather by the definition of an open semantic framework that allows for
standards to interoperate and evolve over time, constantly interacting with end users through
the convergence of practice. Finally, mobility is an issue that needs to be also viewed in cultural
and social terms, as it is a defining feature of European citizenship. Citadel has explored these
issues by considering the status of visitor and host in the definition of Open Data governance
strategies.
Efficiency and effectiveness is enabled by a constant effort to use eGovernment to
reduce the administrative burden, improve organisational processes and promote a
sustainable low-carbon economy.
Citadel pilot cities have demonstrated that the best way to reduce administrative burdens is to
open up to citizen engagement for the co-production of ICT-based services. This leads to
institutional innovation processes that in turn demonstrate positive sustainability effects
through improved awareness of the environmental dynamics of urban systems and the impacts
of service design.
The implementation of the policy priorities is made possible by appropriate key
enablers and legal and technical preconditions.
Citadel has focused on inclusive local governance of Open Data strategies as central to
identifying the key legal and procedural enablers that city administrations are able to set in
place for diffused uptake in their territories, including awareness building, public events and
hackathons, etc. Broader engagement of citizens and businesses in debates on issues such as
privacy and security is fundamental in order to provide effective bottom-up input to national
and EU legal frameworks. Finally, the need has emerged for the provision of common Open Data
platforms and tools as an enabling public service open to all, governed by the principles of an
Open Data Commons.
Thanks to the experience gained and the lessons learned in the Citadel project, the key
elements for achieving the Malmoe Declaration's important objectives are in place and the way
forward is clear, on the eve of the target date of 2015. Now is the time for local authorities to
join forces by declaring these common principles, committing to common action, and
challenging others to play their part.
THE CITADEL MANIFESTO

VISION
Europe and the world are facing severe challenges that place an increasing burden on
governments at every level. Municipal governments, however, are those best placed to address
many of these challenges, as they are closest to the citizens and businesses who will need to be
engaged in order to find the right solutions. While local communities harbour the innovation
potential we need, the greatest barrier to engagement is the increasing lack of trust in
government experienced in recent years.
One of the key levers for building a new relationship between local governments and citizens
and businesses is the opening up of public sector information. Open Data is a concrete step
towards transparency and engagement, while it also highlights the value of the work of civil
servants, the innovation potential of local development communities, and the benefits that can
be attained when the two collaborate towards common goals. Open Data processes also need
to deeply engage local citizens and businesses, as the first step in going beyond the public sector
to involve the whole local community in publishing and using data.
The objective is to achieve the aims of the Malmoe Declaration (engagement of citizens and
businesses, mobility of people and services, and efficient and sustainable public services)
through promotion of a massive uptake of Open Data. The vision is that of a Territory of Data,
composed of networks of towns and cities with data-driven strategies that enable local
communities to co-design sustainable services based on an improved awareness of local
phenomena and activities, socio-economic and environmental dynamics, and market and
business opportunities.
Annex III: The Citadel Charter
COMMITMENTS AND CHALLENGES

To that end, we, the signatories of this Citadel Charter, commit our local governments to work
towards:
The massive opening up of all data we and our constituencies hold, with due respect for
individual and collective rights and using open platforms and standards frameworks.
The deep engagement of local development communities together with citizens and
businesses in data-driven societal innovation processes, including social and
institutional innovation as well as the development of innovative products and services.
The establishment of local Open Data governance groups that allow our administrations
to play a role of process orchestration, in order to most effectively define common
policies for privacy, security and related issues as well as business opportunities in civic
innovation environments and to ensure that access to open data is continuous and
consistent over time.
We challenge the European Commission and national and regional governments to:
Engage with our local governments as key actors for capturing bottom-up energies as
well as implementing Europe 2020 strategic objectives top down, including the Digital
Agenda for Europe and Regional Smart Specialisation Strategies.
Provide a cooperation framework that allows us to effectively work together with local
authorities across Europe to promote the institutional innovations that are key to
capturing the potential for our territories of Open Data.
Support the provision of common and open platform infrastructures and services,
including those which have already received European funding such as those developed
in the CIP Smart Cities initiative and in the Future Internet Public Private Partnership
(FI-PPP), and play their part in using these platforms to open the data they hold.
We challenge technology developers, from local tech communities to multi-national
corporations, to:
Engage directly with public authorities and citizens to discover innovation potentials and
needs, using approaches such as Living Labs to co-design more effective products and
services.
Work to jointly explore the business benefits and market potentials of technology
innovation having the public interest as the primary goal, with a particular focus on
cultural expression, public and civic participation, services for the needy, and
environmental sustainability.
Adopt open platforms, standards, and frameworks that support interoperability while
promoting openness, participation and engagement and full respect of individual and
collective rights in the conception and design of services.
We challenge citizens and businesses, both in our own local communities and throughout
Europe, to:
Engage with public innovation communities, open up to innovation, and participate in
the co-design of new public services and spaces.
Reflect, both individually and collectively, on emergent issues of personal privacy and
identity, recognizing the key role for citizen engagement in designing new societal
frameworks of entitlement and citizenship.
Demand openness and transparency from governments and businesses at all levels, as
the prerequisite for gaining the trust required to work together in addressing the key
problems society faces today.
ABOUT THE AUTHORS

JESSE MARSH
Jesse Marsh has been exploring innovation since the late 1980s, when his professional interest
shifted from industrial design to information and communication technologies and local
development. He has participated in over 35 EU projects dealing with a range of issues from
cultural identity to smart cities, and has worked as a consultant to the FAO, the European
Parliament, and the World Bank. He has been an active member of the Living Lab movement
since 2007 and is currently Special Advisor to the President of ENoLL, consultant to the City of
Palermo for its Open Data and Smart City strategies, and advisor to the Sicilian and Calabria
Regions on the role of Social Innovation in regional Smart Specialisation Strategies 2014-2020.
FRANCESCO MOLINARI
Francesco Molinari is currently research associate at Politecnico di Milano and visiting professor
at the Ulster Business School of the University of Belfast. As research and project manager he
has worked for several public and private organizations in Europe, including clients from
Belgium, Cyprus, Greece, Israel, Italy, Portugal, Slovenia and the UK. In 2008 he wrote a study
for the European Commission assessing the Living Labs approach in the EU
innovation and Future Internet scenario. He has advised several Italian Regions and central
government bodies on topics related to innovation policy, smart specialization, and
pre-commercial procurement.
RICARDO STOCCO
With a degree in Archaeology from the University of Padua, Ricardo has coordinated and
directed several archaeological excavations and surveys in Italy and abroad, with the related
activities of digital documentation. He has also designed and developed ICT solutions and
services for public dissemination and knowledge exchange for a series of archaeological
expeditions. As part of the management of the archaeological site of the Imperial Fora
(1999-2007) in Rome, he began to address issues related to the collection, management and use of
"public" data related to Cultural Heritage. This activity led him to deal specifically with the field
of Open Data, which he has continued to explore in activities related to the theme of cultural
"nomadic" tourism. Since 2011, he has coordinated the research activities of the Territorial Living
Lab Prealpe (ENoLL Member), focusing on research and development related to Open Data in
Local Governments.
