Open Data Barometer: 2013 Global Report launched

Last Thursday the study I’ve spent the last five months working on with the Web Foundation was formally launched in the Open Government Data Working Group session of the Open Government Partnership Summit. The Open Data Barometer takes a look at the context, implementation and emerging impacts of open government data in 77 countries around the world.

Last week’s launch included both an analytical report and quantitative datasets for the secondary indicator and expert survey data collected in the study. I’ll be writing more in the coming weeks here about the process of designing and carrying out the study, and reflecting on how it might evolve and be built upon in future. But for now, here’s a link to where you can download the report and data, findings from the exec summary, and a few charts pulled out from the overall report.

Executive Summary: 2013 Global Open Data Barometer Report

Open data is still in its infancy. Less than five years after the first major Open Government Data (OGD) portal went live, hundreds of national and local governments have established OGD portals, joined by international institutions, NGOs and businesses. All are exploring, in different ways, how opening data can unlock latent value, stimulate innovation and increase transparency and accountability. Against this backdrop of rapid growth of the open data field, this Open Data Barometer global report provides a snapshot of OGD practices at national level. It also outlines a country-by-country ranking. Covering a broad sample of 77 countries, it combines peer-reviewed expert survey data and secondary indicators to look at open data readiness, implementation and emerging impacts. Through this study we find that:

  • OGD policies have seen rapid diffusion over the last five years, reaching over 55% of the countries surveyed in the Barometer. The OGD initiatives launched have taken a range of different forms: from isolated open data portals launched within an e-government framework, through to ambitious government-wide OGD implementations.
  • But – there is still a long way to go: Although OGD policies have spread fast, the availability of truly open data remains low, with less than 7% of the datasets surveyed in the Barometer published both in bulk machine-readable forms, and under open licenses. This makes it unnecessarily difficult for users to access, process and work with government data, and potential entrepreneurs face significant legal uncertainty over their rights to build businesses on top of government datasets.
  • Leading countries in the ODB are investing in the creation of ‘National Data Infrastructures’ to provide a foundation for public and private innovation and efficiency. They have high-level and broad-based political backing for the OGD initiatives, and are investing in capacity building with entrepreneurs and intermediaries. They are also focussing on building communities around open data, convening government officials and outside stakeholders to understand more clearly how data can be harnessed for economic and social progress. However, no countries can yet claim to fully be ‘open by default’, and embedding OGD practices across government is a key future challenge.
  • Mid-ranking countries have put in place some of the components of an OGD initiative, such as an open data portal and competitions or events to catalyse re-use of data, but have often failed to make key datasets available, and are lacking in important foundations for effective open data re-use. Absence of strong Right to Information laws may prevent citizens from using open data to hold government to account, and weak or absent Data Protection Laws may undermine citizen confidence in OGD initiatives. In addition, limited training and support for intermediaries may mean data cannot be mobilised to generate economic and social benefits.
  • Low-ranking countries have not yet started to engage with Open Data, and many developing countries lack basic foundations such as well-managed and digitised government datasets. In these countries, interventions to support OGD may look radically different from the leading OGD initiatives surveyed in the Barometer – with opportunities for open data approaches to be used to generate, as well as use, public information.
  • The Barometer ranks the UK as the most advanced country for open data readiness, implementation and impact, scoring above the USA (2nd), Sweden (3rd), New Zealand (4th), Denmark and Norway (joint 5th). The leading developing country is Kenya (21st), ranking higher than rich countries such as Ireland (29th) and Belgium (31st). However, no country can yet claim to be fully ‘open by default’.
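The ‘truly open’ test in the second finding above (bulk, machine-readable and openly licensed) can be sketched as a simple filter over per-dataset assessments. This is only an illustration: the field names and the sample records below are hypothetical, and are not the Barometer’s actual schema or data.

```python
from dataclasses import dataclass


@dataclass
class DatasetAssessment:
    # Hypothetical fields, loosely modelled on the criteria named in the report.
    country: str
    category: str
    machine_readable: bool
    bulk_download: bool
    open_licence: bool


def is_truly_open(d: DatasetAssessment) -> bool:
    """A dataset counts only if it meets all three criteria at once."""
    return d.machine_readable and d.bulk_download and d.open_licence


def share_truly_open(assessments: list[DatasetAssessment]) -> float:
    """Fraction of assessed datasets that pass the combined test."""
    if not assessments:
        return 0.0
    return sum(is_truly_open(a) for a in assessments) / len(assessments)


# Invented sample records, for illustration only.
sample = [
    DatasetAssessment("Country A", "public transport", True, True, True),
    DatasetAssessment("Country A", "land registry", False, False, False),
    DatasetAssessment("Country B", "public transport", True, False, True),
    DatasetAssessment("Country B", "company registry", True, True, False),
]

print(f"{share_truly_open(sample):.0%} of sample datasets are truly open")
```

Requiring all three conditions together is what drives the headline figure down: a dataset that is online and machine-readable but carries an unclear licence does not count.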

Furthermore, in offering the first global snapshot covering both OGD policy and practice, the Barometer highlights:

  • Different countries and regions face different challenges in pursuing OGD – including the need to build government data collection and management capacity; the need to support and equip innovators and intermediaries to use data; and the need to secure civil society freedoms that will enable the use of open data for effective transparency and accountability. There is no one-size-fits-all approach to OGD.

  • Key datasets such as Land Registries and Company Registries are least likely to be available as open data[1], suggesting that OGD initiatives are not yet securing the release of politically important datasets that can be vital to holding governments and companies accountable.

  • In most countries, key datasets for entrepreneurship and improving policy are not available as open data, and when published are in non-standard formats. For example, even in the case of public transport, where data standards are well established, just 25% of countries surveyed have machine-readable data available. Mapping data is also often unavailable in digital forms, or only available for a fee, suggesting that inefficient charging for public data continues to be an issue in many countries.

  • Categories of data managed by statistical authorities are the most likely to be accessible online, but are often only released in very aggregated forms and with unclear or restrictive licenses. Adding a focus on open data to statistical agency capacity building may assist in making key datasets available as bulk, machine-readable open data, contributing positively to the ‘data revolution’ (UN, 2013).

  • Strong evidence on the impacts of OGD is almost universally lacking. Few OGD programmes have yet been evaluated, and the majority of discussion of impacts remains based on anecdote. The Barometer asked about six kinds of OGD impact (government efficiency, transparency and accountability, environmental sustainability, inclusion of marginalised groups, economic growth, and supporting entrepreneurs). In countries with some form of OGD policy (n = 43), no examples of impact could be found for 45% of impact questions, and on average evidence of impact was scored at just 1.7 out of 10. Scores were particularly low for inclusion and environmental impacts of OGD, suggesting an area in need of further focus.
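The two summary figures in the last point (the share of impact questions with no evidence, and the average impact score) are simple aggregates over a country-by-impact-area grid of 0–10 scores. A minimal sketch of that arithmetic follows, assuming that a score of 0 means no example of impact was found; that assumption, the function name and the sample scores are mine, not taken from the Barometer methodology.

```python
from statistics import mean

# The six impact areas named in the report.
IMPACT_AREAS = (
    "government efficiency",
    "transparency and accountability",
    "environmental sustainability",
    "inclusion of marginalised groups",
    "economic growth",
    "supporting entrepreneurs",
)


def summarise_impact(scores_by_country: dict[str, dict[str, int]]) -> tuple[float, float]:
    """Return (share of questions scored 0, mean score) across all countries.

    Assumes each score is on a 0-10 scale and that 0 means 'no example found'.
    """
    all_scores = [
        scores[area]
        for scores in scores_by_country.values()
        for area in IMPACT_AREAS
    ]
    if not all_scores:
        raise ValueError("no scores supplied")
    share_no_evidence = sum(1 for s in all_scores if s == 0) / len(all_scores)
    return share_no_evidence, mean(all_scores)


# Invented scores for two hypothetical countries, in IMPACT_AREAS order.
sample_scores = {
    "Country X": dict(zip(IMPACT_AREAS, (0, 2, 0, 0, 4, 3))),
    "Country Y": dict(zip(IMPACT_AREAS, (0, 0, 1, 0, 2, 0))),
}

share, average = summarise_impact(sample_scores)
print(f"No evidence for {share:.0%} of questions; mean score {average:.1f}/10")
```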

It remains very early days in the development of OGD practices. The World Wide Web has now been with us for almost 25 years, and, even so, many governments, businesses and civil society groups are still in the early stages of learning how to harness its potential. The open data vision is a bold one: but one that will take considerable work to make a reality. It cannot just be a case of ad-hoc dataset publication, but needs attention paid to legal, social, economic, technical, organisational and political dimensions of open data publication and re-use. This year’s Open Data Barometer provides a baseline for tracking how we collectively progress in the open data arena in years to come.

Web Observatories: The Governance Dimensions

Governance & Sustainability for a Web Observatory

I’m in a workshop at MIT today about plans to create a ‘Web Observatory’, collecting and curating vast quantities of data from across the web for research – in part to ensure that researchers can keep pace in their capacity to research the web with the companies and entrepreneurs who are already gathering terabytes of ‘traces’ of online behaviours in proprietary platforms. A lot of the discussion so far has looked at datasets for research gathered from platforms such as Twitter, curated data from platforms like Open Street Map, or data collected in focussed research projects on sensor networks and ‘humans as sensors’. However, the vision of the Web Observatory is not just about providing a catalogue of data for secondary research, but also about providing methods and tools that enable researchers to “locate, analyse, compare and interpret useful information in a consistent and reliable way … rather than drowning in a sea of data”.

As Wendy Hall noted in opening remarks, whilst the Web Observatory work begins with emphasis on academic researchers as the users of data, in the long run, Observatories could (or should) be accessible to individuals also. The growing imbalance of power created between citizens and companies through the privileged access that corporations have to information on our collective social lives is set to become an increasingly pressing social and political issue.

Now, there are clearly big technical challenges ahead in building the Web Observatory project and the many federated Web Observatories that will result, but in this post I want to briefly explore one of the organisational ones: getting the governance and sustainability of Web Observatories right.

Lessons from linked data: sustainability

If you’ve ever spent time exploring Linked Data projects you will likely have stumbled across a lot of abandoned datasets: one-off conversions of open data, or data generated through now-defunct research projects. The Web of Linked Data is far too often a web of broken links – as the funding for research projects runs out and links go dark.

The Linked Data Around the Clock programme (on a website that’s now offline) had a slide that captures the coordination dilemma at the heart of creating and sustaining good Linked Data: the value of (linked and/or open) data accrues to a range of parties, and involves input from a range of parties. When projects are sustained through short-term grant funding, which covers all the work to create, curate and make accessible a dataset, then that data is sustainable only so long as the funding continues. It could be argued that when data is open, this is not so big a problem – as someone can simply take a copy of the data and, if the original source goes dark, can bring up an alternative host for the data. But in practice, with Web Observatory datasets we’re talking big data where simply storing the datasets can require hundreds of terabytes of storage; and datasets which cannot be entirely open due to privacy concerns or Terms of Service of the source data. The data also tends to be shaped primarily by the needs of the funding project that creates it, not by the needs of the projects that want to re-use the data. Although linked data promises distributed annotation and enhancement of data, in practice to query data it needs to be aggregated together in one place – and it’s more efficient to pool resources to enhance and maintain one data store, than to try and copy, convert and enhance multiple copies of big datasets.

So: if learning from Linked Data is anything to go by, the Web Observatory needs to be thinking critically from the start about how key datasets will be sustained, and how collaboration on enhancing data will be facilitated – recognising that there is a non-zero net cost (lots of near-zero marginal costs add up quickly in big data…) to enhancing and adding data to someone else’s data store.

Ethics issues: empowering access

Many of the datasets that might be contained in Web Observatories will raise significant privacy concerns. It might be tempting to manage these by simply deferring responsibility for judging what use can be made of the data to Institutional Review Boards and ethics committees at different participating academic institutions. But if the Web Observatory programme is to be open to partners beyond academia, then ethics processes need to be placed into the heart of the Observatory governance structures, rather than managed around the edges.

A proposal: exploring co-operative ownership and governance

There are, I think, three broad governance models open:

  • Observatories hosted and held-in trust by institutions: institutions, primarily academic, use fixed-term project funds to set up Web Observatories. They let other people use these so long as their funding allows, and prioritise those requests to enhance, extract or work with the data that fit with their own research goals. At the end of the project funding, Observatories either die, or end up maintained through residual or other funds.
  • Independent foundations: the model used by large web public goods like Archive.org and Wikipedia – establishing independent legal entities that maintain an Observatory. This has the value of helping Observatories out-live the projects that start them – but makes Observatories dependent upon finding their own funding, and creates an extra organisational entity over and above the partners with an interest in the data, which either ends up with its own agenda and organisational imperatives, or which leaves a collective action problem with each of the partners waiting for the others to provide the funding to keep the lights on.
  • Data co-operatives: building on discussions convened last year in Manchester, there may be a new organisational structure the Web Observatories can build upon – that of the data cooperative. In a data cooperative, a light-weight separate entity is established, but which is constituted and jointly owned by the researchers and research institutions with a stake in the data. Cooperatives can establish rules about the resources that members should bring to the co-op, and what control they can expect over the design and maintenance of the Observatory, and can provide procedures for easy entry and exit from the co-op. In Manchester we discussed the potential for hybrid ‘workers/suppliers’ and ‘users/consumers’ co-operatives, that could give both the creators of data, and the researchers using the data, an appropriate stake in it. Co-operative membership to access data with privacy/ethical issues could also address ethics procedures.

Whilst the least developed, this third option I think holds most promise.

I don’t know yet if the Web Observatory programme will include organisational research – but I hope so…

CfP: Open Data Track – 2014 Conference for E-Democracy and Open Government

I’m co-chairing a track on ‘Open Data, Transparency and Open Innovation’ at the next CeDEM Conference for E-Democracy and Open Government, taking place at Danube University Krems in May next year.

The full call for papers and submission details can be found here, and the details of the Open Data Track are below:

Open Data, Transparency and Open Innovation

Chairs: Johann Höchtl (Danube University Krems, AT), Ina Schieferdecker (Fraunhofer, DE), Tim Davies (University of Southampton, UK)

Open data can provide a platform for many forms of democratic engagement: from enabling citizen scrutiny of  governments, to supporting co-production of public data and services, or the emergence of innovative solutions to shared problems. This track will explore the opportunities and challenges for open data production, quality assurance, supply and use across different levels of governance. Key themes include:

  • Open data policy and politics: opportunities and challenges for governments; the global spread of open data policy; transparency and accountability, economic innovation, drivers for open data; benefits and challenges for developing countries.
  • Licensing and legal issues: copyright vs. open licenses & creative commons; Freedom of Information and the ‘right to data’; information sharing and privacy.
  • Open data technologies: technical frameworks for data and meta-data; mash-ups; data formats, standards and APIs; integration into backend systems; data visualisation; data end-users and intermediaries.
  • Open innovation and co-production: open data enabled models of public service provision; government as a platform; making open data innovation sustainable; data and democracy; connecting open data and crowdsourcing; data and information literacy.
  • Evidence and impacts: costs and benefits of providing or using open data; emerging good practices; methods for open data research; empirical data measuring open data impacts.

Submissions are due by 6th December 2013.

Reflections on developing a global sectoral open data initiative: agriculture and nutrition

At the 2012 G-8 Summit leaders committed to a ‘New Alliance for Food Security and Nutrition’, and as part of the follow up to the US G8 presidency in April this year the World Bank hosted the ‘G8 Conference on Open Data for Agriculture’, exploring opportunities to create a global platform for sharing agriculturally relevant information. Initially driven by the UK and US, this initiative has developed into the ‘Global Open Data Initiative for Agriculture and Nutrition’, currently preparing for a launch at the October 2013 Open Government Partnership summit in London. As the open data concept continues to gain traction at a policy level, such sectoral open data initiatives are increasingly common, and raise a wide range of questions. This post attempts to unpack some such questions for the proposed Agriculture and Nutrition Initiative.

Sector vs. supplier-centric open data

Early open data initiatives, such as the Open Government Data initiatives of the USA, UK and Kenya, have been supplier-centric. They are essentially based on the idea that a single data holder (or, in practice, amalgamation of different departmental data holders, but all from the same overall organisation) supply the data they hold online as open datasets. An open data portal often provides a focal point for this activity.

By contrast, sectoral open data initiatives draw on data from a wide range of suppliers. Some, such as the International Aid Transparency Initiative (IATI) are primarily interested in a single flow of data (in the IATI case, standardised datasets of aid funded activities), although others, such as the renewable energy focussed Reegle project look to aggregate together and integrate a range of different open data datasets with a single sectoral focus*.

An Open Data Initiative for Agriculture and Nutrition may have both supplier-focussed, and sectoral focussed, elements to it. Some of the high-profile holders of data in the agriculture sector, such as the World Bank and Food and Agriculture Organisation already have their own open data initiatives. However, there are many more actors who might be suppliers of data when it comes to agriculture and nutrition. It is worth noting that existing sectoral open data initiatives such as IATI and the Reegle project are relatively limited in their scope and reach, and rely on a certain degree of centralisation (the IATI Registry and Standard in the former case, and a central data store for Reegle in the latter), and so an Open Data Initiative for Agriculture and Nutrition potentially represents a new level of ambition and complexity, requiring more decentralised approaches to securing a wider range of relevant open data.

(*I’ve not looked here at sectoral ‘open data’ initiatives from the sciences as I’m less familiar with these. However, emerging collaborations around genomics research, for example, which seek to pool data into a resource for answering a range of shared research questions may also be relevant points-of-reference in thinking about the shape of an agriculture and nutrition open data initiative).

Why open data?

A recognition of the need for better information and data sharing in agriculture and nutrition is nothing new. For over 100 years organisations like CABI have been producing abstract journals to more effectively transmit agriculture research to the locations on the ground where it is needed, and the agriculture and nutrition field has a well developed network of research institutions, agricultural extension services and initiatives to harmonise information, ranging from the long-established AGROVOC vocabulary, through to the more recent CIARD ‘Coherence in Information for Agricultural Research for Development’ movement, bringing together over 50 organisations to collaborate “to make agricultural research information and knowledge publicly accessible to all.”

However, an initiative for open data does have some significant differences of emphasis from one focussing on making information and knowledge publicly accessible. Open data initiatives place explicit emphasis on data over information; upon making that data machine-readable in standard formats; and requiring the use of open licenses that allow the data to be re-used by anyone. A number of arguments might be put forward to justify this specific emphasis:

  • Open data principles lead to lower transaction costs for finding and accessing data – and give re-users certainty that they can work with the data. For example, existing important datasets like AGROVOC do not use an open license, and can only be accessed in bulk behind a registration, meaning that users wanting to use AGROVOC classifications in a dataset that also contains commercial data would be prevented from doing so without negotiations with the FAO. Applying open data principles could increase use and coherence of data.

  • Machine-readable data supports efficient and innovative re-use of data. With access to data in standard formats, users can remix, re-interpret and re-present the data, offering alternative interpretations and generating new insights that may not have been contained in shared informational publications. Open data principles also aim to support the easy combination of multiple datasets to support the identification of new trends and patterns across different datasets.

  • Open data allows new actors to get involved in addressing agriculture and nutrition challenges. Unlike data-sharing initiatives, which often work to ensure an identifiable list of actors have access to data and information, open data is, to a degree, about allowing as-yet unknown parties to access and innovate with data. This has the potential to bring new researchers, entrepreneurs and policy actors into the process of providing solutions to key challenges, enabling more open forms of innovation.

  • Existing open data initiatives could do more for agriculture. With many governments already publishing open data, an agriculture and nutrition open data initiative can harness the momentum to secure new datasets that would not be provided through existing initiatives – and can work to make sure multi-purpose datasets, such as cadastral data, land ownership records, weather data and other resources are provided in ways that support agriculture and nutrition activities.

I won’t assess the validity of each of these arguments here: that is a matter for empirical research – but they do highlight the kinds of areas that classic open data initiatives may focus on, and allow an assessment of how far an open data initiative may be complementary to other existing activities, or how it might connect with these.

The scope of a global initiative

Agriculture and nutrition is a vast field, and the issues on the agenda vary wildly across the world – from securing crop production and nutritional standards in developing countries, to ensuring trustworthy supply chains in Europe, and from planning for food security, to giving consumers information to choose organic or fairly traded products. A Global Open Data Initiative on Agriculture and Nutrition emerging from the Africa-focussed New Alliance for Food Security and Nutrition could choose to look only at issues of basic food security, but this might be a missed opportunity to also consider how open data has a role to play across a wide range of agriculture and nutrition issues. For example, catalysing activity around open data on food supply chains could be driven by, and have benefits for, both food security and consumer confidence in food.

An initiative also needs to consider whether its scope is primarily around public sector data and data held by research organisations, or whether it will also look at the vast quantities of private sector held data on agriculture and nutrition. Many governments already use targeted transparency measures to require food producers to generate and publish nutritional information on their products, suggesting that further steps to require private sector publication of open data on various agriculture and nutrition issues might not be out of the question.

Self-selected commitments, or a shared agenda?

The International Aid Transparency Initiative sets out a clear standard for data that all signatories to the initiative should work towards publishing. As more signatories publish this data to the common standard, network effects kick in making the data more and more useful. By contrast, the Open Government Partnership invites countries to sign up to some broad principles and then to self-select what they will do, and (if choosing to focus on open data at all) the particular datasets they might release: with the areas of focus driven by domestic engagement and pressure. Somewhere between the two, the G8 Open Data Charter includes a list of core datasets that all signatories should work to publish, and then invites self-selected commitments with a long-list of suggested datasets to focus on.

A Global Open Data Initiative on Agriculture and Nutrition could identify a shared agenda based around a small number of datasets and issues, or could be driven by general principles, with members self-selecting their areas of focus. There are pros and cons to each approach, but they potentially lead to initiatives of very different characters, and consequences for the way in which different stakeholders might get involved.

What needs to go into an open data initiative?

I’ve written in the past about ten building blocks of an open data initiative highlighting that open data initiatives need more than datasets – also requiring explicit effort on outreach and engagement, and capacity building to enable wider use of the data that is made available. When it comes to Agriculture and Nutrition there are a wide range of actors who might need to be involved in these wider activities – from the infomediaries who translate research and data into actionable information for farmers and traders, through to the government planners or civil society activists seeking to improve the equitable and fair management of natural resources.

The Ten Building Blocks of open data listed below can take many different forms, and operate at different levels of scale – but an initiative that focusses only on one or two of these building elements to the exclusion of others is unlikely to be able to realise the potential impacts of open data.

  1. Leadership and bureaucratic support

  2. Datasets

  3. Licences

  4. Data standards

  5. Data portals

  6. Interpretations, interfaces and applications

  7. Outreach and engagement

  8. Capacity building

  9. Feedback loops

  10. Policy and legislative lock-in

Evidence and impact

There has been an interesting dialogue recently in the Open Data Innovations LinkedIn Group about “How to monitor progress of open data”. The discussion has highlighted that, as the use to which data will be put is generally left ‘open’, coming up with concrete evaluation frameworks for measuring whether open data has had the desired impact can be challenging. This is, of course, one of the big issues we’re grappling with in the Open Data in Developing Countries project – currently focussing on qualitative case studies to understand how open data interacts with existing processes of governance on the ground.

However, it is not inconsistent to set both a series of primary goals for the greater sharing of data against which an intervention can be measured, and to develop frameworks for monitoring secondary impacts resulting from leaving data open to re-use. Such frameworks must, however, be able to also capture unintended consequences of open data re-use – noting that not all results will inevitably be positive, particularly in contexts where so much is at stake, such as the agricultural domain, where the interests of communities, agri-business, governments, environmentalists and others are not always aligned.

The ODDC Conceptual Framework seeks to outline some possible directions for such a framework, as a foundation to be further revised as our 2013/14 case studies start to report later this year.

Many more questions…

The formation of larger scale sectoral open data initiatives is an emerging phenomenon, and something that will need continued practical and research attention. From a research perspective, it will be fascinating to see how plans for the Global Open Data Initiative on Agriculture and Nutrition evolve.

(Disclosure: In my role at the Web Foundation I’ve been involved in some discussions with the convening team for the Global Open Data Initiative on Agriculture and Nutrition, and this post as an open reflection is offered as an input to their ongoing dialogue, as well as a wider reflection on sectoral open data initiatives)

 

Open data and privacy

Cross-posted from the Open Data Research network site.

On 1st August two IDRC research networks came together for a web meeting to explore Open Data and Privacy. The Privacy in the Developing World network, and the Open Data in Developing Countries network set out to explore whether open data and protecting privacy are inherently in tension, or whether the two can be complementary, and to identify particular issues that might come up around privacy and open data in the developing world. This post shares and develops some of the themes discussed in the meeting.

Definitions

Open data is generally defined as data made accessible, in formats that can be manipulated by computers (allowing the creation of new interfaces, mash-ups and other data analysis), and without restrictions on how the data can be re-used. In essence, open data asks those who hold data (usually governments) to give up formal control over how it is used, with the idea that this allows greater scrutiny of governments, and unlocks potential for innovation with the data.

Privacy, by contrast, is concerned with control over information, who can access it, and how it is used. As Daniel Solove notes[1] this has many dimensions, from concerns about intrusive information collection, through to risks of exposure, increased insecurity or interference in their decisions that individuals or communities are subjected to when their ‘private’ information is widely known. Privacy is generally linked to individuals, families or community groups, and is a concept that is often used to demarcate a line between a ‘private’ and ‘public’ sphere. Article 12 of the Universal Declaration of Human Rights states “No one shall be subjected to arbitrary interference with his privacy, family, home or correspondence, nor to attacks upon his honour and reputation”. It has been argued that privacy is a western concept, only relevant to industrialised societies – yet work by Privacy International has found privacy concerns to be widespread across developing countries, and legal systems across the world tend to recognise privacy as a concern, even if the depth of legal rights to privacy and their enforcement varies. It is worth noting though that few of the countries covered by the ODDC project have strong privacy protection laws in place.

Different kinds of data

One of the starting points of discussion around open data and privacy is to work out which kinds of ‘data’ might fall within the focus of each. In the context of open government data, we might think about three broad categories:

  • Infrastructural data – data held about the state of the world – for example, describing the land, transport networks, structures of government, weather measurements and so on. There are very few privacy concerns about this data (though in some states security concerns may restrict the extent to which it is shared, such as geographic border and water flow data in the rivers of Northern India).
  • Public service data – data about the activities of government – ranging from the locations of public services and their budgets through to public registers, and detailed performance statistics on schools, hospitals and other facilities. This last set can be in a grey area – as they are often built up from the aggregation of records about individual users of public services, and it is not always clear who they are about. For example, is the medical record about an operation and its outcome data about the patient, or about the doctor?
  • Personal data – data about individuals, and usually things that an individual would have a legitimate right to manage access to – such as information on their sexuality or their health.

In the Web Meeting, Sam Smith noted that the framers of the ‘Open Definition’, taken as a basis for much open data advocacy, were focussed specifically on non-personal data, and that open data advocates tend to make clear that they are not talking about data that could reveal private information about individuals. However, as the categories above show, the dividing line between public and personal data is not always clear.

Kinds of data - infrastructure; public service; personal

This classification does make clear however that there are some kinds of data (the infrastructural data) where applying open data should be, from the privacy perspective at least, uncontroversial. The relative importance of data in the middle category to the kinds of outcomes sought from open data policy interventions then becomes an important question to ask.

It is worth noting that because of the political popularity of open data policy, there has been a tendency for other policies relating to data to be presented under an open data banner in some countries. For example, policies on the restricted sharing of medical records with pharmaceutical companies (through secure data sharing rather than as open data) were included in UK open data measures in 2011. These policies clearly need to be considered distinctly from open data policies, and their implications also weighed carefully.

Opening data disrupts past privacy practice

Steve Song offered an input into the Web Meeting focussed on the online publication of a dataset and mash-up map showing the location of registered gun owners in the wake of a school shooting in Connecticut. The register of gun ownership had long been a public document, but it had been in the form of documents that could be inspected rather than as a dataset. The conversion of this public register into open data which could be easily mapped created a strong backlash: law enforcement officials worried that their addresses had been revealed online, and those with and without guns expressed concerns that the information could be used by burglars to target particular houses. The accuracy of the record was also questioned, and it was suggested that much of the information was misleading or wrong.

This case illustrated how turning existing ‘public records’ into open data might change some of the balances around privacy that have, until now, been struck by the practical difficulty of accessing those records. Previously the ‘data’ had been hidden in plain view: but no-one had been encouraged to use it in ways that might give rise to concerns. Thought may be needed then not only when things previously secret are made public, but also when public records are turned into more easily manipulated and processed open data. Steve noted that this may be particularly important in contexts of ethnic or communal tensions: imagine for example how voter registers might be used as data where ethnicity can be inferred from a voter’s name, and where an election is contested on ethnic lines.

In the United Kingdom, the recent Shakespeare Review of Public Sector Information[2] has proposed shifting the legal responsibility for misuse of data from the person who publishes the data onto the person who abuses the data – suggesting a model in which privacy laws would control (ab)use rather than access to data. However, such a model is tricky to envisage in a world where data can cross borders easily, there is little harmonisation of privacy laws, and harms from privacy violations can also cross borders.

Privacy as an excuse? Open as a general principle?

One of the key concerns raised in the meeting was that if arguments for open data are applied as a ‘general rule’ without sensitivity to the kinds of data in question, there are significant risks that privacy rights might be undermined. Yet, transparency and open data advocates are often concerned that ‘data protection’, or ‘protecting privacy’ might be used as excuses not to release data, or to only release data in aggregated forms that don’t permit detailed analysis of what government is doing. Neither can necessarily be used as a principle that trumps the other.

In his review of open data and privacy for the UK Government, Kieron O’Hara noted[3] that, even within open data advocacy, different groups have different requirements for what counts as good quality open data for their purposes (§2.1.4). For example, transparency campaigners may be happy with crime data covering the general geographic areas that a particular official is responsible for, whereas entrepreneurial re-users of data might want data down to the individual street and house-level to feed into risk models for insurers, or to use in route-planning applications.
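To make the difference between these levels of detail concrete, here is a minimal sketch in Python. The incident records, field names and the suppression threshold are all invented for illustration, and this is not a method taken from O’Hara’s report or from any real crime dataset. It shows the same records published at district level and at street level, plus one simple way a publisher might withhold very small counts.

```python
from collections import Counter

# Hypothetical street-level incident records (invented for illustration).
incidents = [
    {"district": "North", "street": "Mill Lane", "category": "burglary"},
    {"district": "North", "street": "Mill Lane", "category": "burglary"},
    {"district": "North", "street": "High Street", "category": "vehicle theft"},
    {"district": "South", "street": "Station Road", "category": "burglary"},
]


def aggregate(records, keys):
    """Count records grouped by the given fields, dropping all other detail."""
    return Counter(tuple(record[key] for key in keys) for record in records)


def suppress_small_counts(counts, threshold):
    """Keep only cells with at least `threshold` records (assumed policy)."""
    return {cell: n for cell, n in counts.items() if n >= threshold}


# Coarse view: possibly enough for a campaigner holding an official to account.
district_level = aggregate(incidents, ["district", "category"])

# Fine view: closer to what a risk model or routing application might want.
street_level = aggregate(incidents, ["district", "street", "category"])

# One possible privacy safeguard: withhold cells describing very few incidents.
published = suppress_small_counts(district_level, threshold=2)
```

The district-level table loses the street names that might point to particular households, while the street-level table keeps them. Where to draw that line, and whether to suppress small counts at all, is exactly the kind of trade-off the meeting discussed.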

In our web meeting Steve Song suggested that by developing a clearer picture of the kinds of impacts open data can have, and the ways in which it might be used (a central theme in ODDC explorations), we will be better able to have informed debate about the trade-offs between privacy and open data. This again moves away from the simple rhetorical message of ‘open everything’, and ‘raw data now’ that many open data advocates have pushed for – and suggests that deeper debate will be needed over the sharing of datasets that fall into the grey areas between public and personal. Such a debate will need to engage with questions of whether open data is being used to support public goods or private gains, and with nationally and culturally specific judgements about how to manage trade-offs between public good and personal or community privacy. For example, in some countries, personal tax records are considered public and are published, yet in others, these are judged to be private data.

The question of corporate confidentiality was also raised in the web meeting discussions. Although corporate confidentiality is conceptually distinct from privacy, it is another principle that might sometimes be found to be in tension with a drive towards open data, and can become the grounds of excuses for not releasing data. Distinguishing when privacy or corporate confidentiality are being used as excuses for not releasing data, or when they are based in serious and valid concerns, will be important for open data advocacy.

In practice, it wasn’t clear from web meeting participants’ experience whether privacy is actually being used as grounds for restricting access to data in developing countries, or if privacy is being adequately considered in decisions about opening data. This will be a key issue to track in future research to better understand how potential tensions between open data and privacy are playing out in practice.

Open data, privacy and power

At the Asia regional meeting of the ODDC project, one participant noted the curious overlap between participants in the Data Meet community (often involved in pushing for open data), and those organising ‘Crypto Parties’, teaching each other about privacy protection software. How have these individuals reconciled campaigning for both open data and privacy? If they are pushing for a balance between the two, how is such a balance to be struck? One possible way to understand the compatibility of pushing for both privacy and open data is through the lens of power and autonomy. Activists may be interested in seeking maximum autonomy from the state through protecting their privacy, and maximum control over the state, through the ability to see what the state is doing through open data, and to work with state-collected data. Such a political position might be associated with the libertarianism of some open source geek cultures, but may also have different roots and political slants around the world.

The power-based analysis might also help in determining which kinds of entrepreneurial uses of open data are desirable or not. Cases where entrepreneurs act as intermediaries in ways that enhance the autonomy of citizens (for example, providing public transport planning applications to help citizens move more freely through space, or informational applications that help citizens to collaborate and co-create or claim access to public services) may be seen as positive, whereas commercial open data re-use that leads to interference in individuals’ decisions through targeting of advertising, or that drives discriminatory pricing of services and insurance, might be seen as having a negative impact on individual autonomy (although the negative effect may only be felt by some segments of the population such as minority or marginalised groups). The question however would remain of how such potential negative uses of open data should be governed, particularly in developing world contexts where legal frameworks vary widely. Serious abuses of open data (whether to incite community tensions, or affect individuals through discriminatory pricing) could be outlawed, but where they have not been, what should those releasing data consider?

Conclusions

By the end of the web meeting we had opened up many more issues than we had resolved, but we had established that there can be a productive dialogue between privacy and open data, and that more work is needed to explore how the two concepts together are unfolding in the developed and developing world.

If you would like to join the debate over privacy and open data, there’s a thread over on the Open Data Research network LinkedIn group.

References

[1] Solove, D. J. (2005). A Taxonomy of Privacy. University of Pennsylvania Law Review, 154(3), 477.

[2] Shakespeare, S. (2013). Shakespeare Review: An independent review of public sector information. London.

[3] O’Hara, K. (2011). Transparent government, not transparent citizens: a report on privacy and transparency for the Cabinet Office.

GIS Watch 2012 article: Who is doing what when it comes to technology for transparency, accountability and anti-corruption

The 2012 issue of Global Information Society Watch was focussed on ‘The Internet and Corruption’, exploring how online technologies are being used in the fight against corruption across the world. GIS Watch is focussed on country level analysis of information society issues around the world, but also includes a number of wider articles. I was asked to put together the ‘institutional review’ on transparency and accountability work. You can find the full GIS Watch here, or read the institutional review article below (licensed under a Creative Commons Attribution 3.0 license, please attribute to GIS Watch).

Who is doing what when it comes to technology for transparency, accountability and anti-corruption

Fighting corruption is a responsibility that all global institutions, funders and NGOs have to take seriously. Institutions are engaged with the fight against corruption on a number of fronts. Firstly, for those institutions such as the World Bank that distribute funds or loans there is a responsibility to address potential corruption in their own project portfolios through accessible and well-equipped review and inspection mechanisms. Secondly, institutions with a regulatory role need to ensure that the markets they regulate are free of corruption, and that regulations minimise the potential space for corrupt activity. And thirdly, recognising the potential of corruption to undermine development, institutions may choose to actively support local, national and international anti-corruption activities and initiatives. This report provides a critical survey of some of the areas where multilateral, intergovernmental, multi-stakeholder institutions, NGOs and community groups have engaged with the internet as a tool for driving transparency and accountability.

Although transparency is an often-cited element of the anti-corruption toolbox, technology-enabled transparency remains a relatively small part of the mainstream discourse around anti-corruption efforts in formal international institutional processes. The UN Convention Against Corruption (UNCAC),[1] adopted in 2003 and ratified by 160 countries, and the OECD Anti-Bribery Convention,[2] adopted in 1997, provide a backbone of international co-operation against corruption. Both focus heavily on legal harmonisation, improved law enforcement, criminalisation of cross-border bribery and better mechanisms for asset recovery, addressing many of the pre-conditions for being able to act on corruption when it is identified. Continued co-operation on UNCAC takes place through the UN Office on Drugs and Crime,[3] with input and advocacy from the UNCAC Civil Society Coalition. In international development, the outcomes document of the Fourth High Level Forum on Aid Effectiveness,[4] held in Korea in November 2011, notes these foundations and highlights “fiscal transparency” as a key element of the fight against corruption. However, the document more often discusses transparency as part of the aid-effectiveness agenda, rather than as part of anti-corruption. This illustrates an important point: transparency is just one element of the fight against corruption, and reduced corruption is just one of the outcomes that might be sought from transparency projects. Transparency might also be used as a tool to get policy making better aligned with the demands of citizens, or to support co-operation between different agencies.

The review below starts by looking at technology for transparency in this broader context, before briefly assessing how far efforts are contributing towards anti-corruption goals.

Transparency and open data

The last three years have seen significant interest in online open data initiatives as a tool for transparency, with over 100 now existing worldwide. Open data can be defined as the online publication of datasets in machine-readable, standardised formats that can be re-used without intellectual property or other legal restrictions.[5] A core justification put forward for opening up government or institutional data is that it leads to increased transparency as new data is being made available, and existing data on governments, institutions and companies becomes easier to search, visualise and explore.

Following high-profile Open Government Data (OGD) initiatives in the US (data.gov) and UK (data.gov.uk), in April 2010 the World Bank launched its own open data portal (data.worldbank.org), providing open access to hundreds of statistical indicators. Here is how the World Bank describes the data portal’s mission:

The World Bank recognizes that transparency and accountability are essential to the development process and central to achieving the Bank’s mission to alleviate poverty. The Bank’s commitment to openness is also driven by a desire to foster public ownership, partnership and participation in development from a wide range of stakeholders.[6]

The World Bank has also sponsored the development of Open Government Data initiatives in Kenya (opendata.go.ke) and Moldova (data.gov.md), as well as funding policy research and outreach to promote open data through the Open Development Technology Alliance (ODTA).[7]

Central to many narratives about open data is the idea that it can provide a platform on which a wide range of intermediaries can build tools and interfaces that take information closer to people who can use it. The focus is often on web and mobile application developers as the intermediaries. Many of the applications that have been built on open data are convenience tools, providing access to public transport times or weather information, but others have a transparency focus. For example, some apps visualise financial or political information from a government, seeking to give citizens the information they need to hold the state to account.

Apps alone may not be enough for transparency though. In an early case study of the Kenya open data initiative, Rahemtulla et al., writing for the ODTA, note that “the release of public sector information to promote transparency represents only the first step to a more informed citizenry…”, and that initiatives should also address digital inclusion and information literacy. This involves ensuring ICT access, and the presence of an ‘info-structure’ of intermediaries who can take data and turn it into useful information that actively supports transparency and accountability.[8] World Bank investments in Kenya linked to the open data project go some way to addressing this, seeking to stimulate and develop the skills of both journalists and technology developers to access and work with open data. However, much of the focus here is on e-government efficiency, or stimulating economic growth through creation of commercial apps with open data, rather than on transparency and accountability goals.

Open data was also a common theme in the first plenary meeting of the Open Government Partnership (OGP)[9] in Brazil in April 2012. The OGP is a new multilateral initiative run by a joint steering committee of governments and civil society. Launched in 2011 by eight governments, it now has over 55 member states. Members commit to create concrete National Action Plans that will “promote transparency, empower citizens, fight corruption, and harness new technologies to strengthen governance”.[10] The OGP has the potential to play an influential role over the next few years in networking civil society technology-for-transparency groups with each other, and with governments, and placing the internet at the centre of the open government debate.

The rapid move of open data from the fringes of policy into the mainstream for many institutions has undoubtedly been influenced by the activities of a number of emerging online networks and organisations. The Open Knowledge Foundation (OKF)[11] has played a particularly notable role through their e-mail lists, working groups and conferences in connecting up different groups pushing for access to open data. OKF was founded in 2004 as a community-based non-profit organisation in the UK and now has 15 chapters across the world. OKF explain that they ‘build tools, projects and communities’ that support anyone to “create, use and share open knowledge”.[12] The OKF paid staff and volunteer team are behind the CKAN software used to power many open data portals, and the OpenSpending.org platform that has the ambition to “track every government financial transaction across the world and present it in useful and engaging forms for everyone from a school-child to a data geek”.[13] This sort of ‘infrastructure work’ – building online platforms that bring government data into the open and seek to make it accessible for a wide range of uses – is characteristic of a number of groups, both private firms and civil society, emerging in the open data space.

Another open data actor gaining attention on the global stage has been the small company OpenCorporates.com.[14] OpenCorporates founder, Chris Taggart, describes how their goal is to gather data on every registered company in the world, providing unique identifiers that can be used to tie together information on corporations, from financial reporting, to licensing and pollution reports. Although sometimes working with open data from company registrars, much of the OpenCorporates database of over 40 million company records has been created through “screen scraping” data off official government websites. In early 2012 OpenCorporates were invited to the advisory panel of the Financial Stability Board’s[15] Global Legal Entity Identifier (LEI) project, being conducted on behalf of the G20. The LEI project aims to give a unique identifier to all financial institutions and counterparties, supporting better tracking of information and transactions. Importantly the recommendations, which have been accepted by the G20, will operate “according to the principles of open access and the nature of the LEI system as a public good… without limit on use or redistribution”.[16]
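The value of a shared identifier is easiest to see with a small example. The sketch below is illustrative only: the datasets, the `company_id` field and the identifier values are hypothetical, and it does not use the OpenCorporates or LEI data formats. It shows how records from separately published datasets can be tied together once they all carry the same company identifier.

```python
from collections import defaultdict

# Hypothetical records from two separately published datasets. The
# "company_id" values are made up and follow no real identifier scheme.
licences = [
    {"company_id": "xx/0001", "licence": "waste handling"},
]
pollution_reports = [
    {"company_id": "xx/0001", "incident": "river discharge"},
    {"company_id": "xx/0002", "incident": "air emissions"},
]


def link_by_identifier(**datasets):
    """Group rows from every named dataset under their shared company_id."""
    linked = defaultdict(dict)
    for name, rows in datasets.items():
        for row in rows:
            linked[row["company_id"]].setdefault(name, []).append(row)
    return dict(linked)


profiles = link_by_identifier(licences=licences, pollution=pollution_reports)
```

Without the common identifier, matching these records would mean comparing company names, which are often misspelled, abbreviated or reused. With it, the join is a simple lookup.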

International transparency initiatives and standards

A number of sector-specific international transparency initiatives have developed in recent years, with a greater or lesser reliance on the internet within their processes.

Online sharing of data is at the heart of the International Aid Transparency Initiative (IATI)[17] which was launched at the third High Level Forum on Aid Effectiveness in Accra, Ghana in 2008, and now has over 19 international aid donors as signatories. The initiative’s political secretariat is hosted by the UK Department for International Development (DfID),[18] and a technical secretariat, which maintains a data standard for publishing data on aid flows, is hosted by the AidInfo programme.[19] IATI sets out the sorts of information on each of their aid activities that donors should publish, and provides an XML standard for representing this as open data.[20] A catalogue of available data is then maintained at http://www.iatiregistry.org, and a number of tools have been developed to visualise and make this data more accessible. Through IATI, countries and institutions, from the Asian Development Bank (ADB), to the UN Office for Project Services (UNOPS), have made information on their aid spending or management more accessible.
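As a rough illustration of what publishing aid activities as XML makes possible, here is a minimal Python sketch. The XML fragment is heavily simplified and invented: it borrows a few element names from the IATI standard (iati-activities, iati-activity, iati-identifier, title, transaction, value) but is not a complete or valid IATI file, and the identifier, title and amounts are made up.

```python
import xml.etree.ElementTree as ET

# Simplified, invented fragment in the spirit of an IATI activity file.
# Real IATI files carry many more elements and attributes.
SAMPLE = """
<iati-activities>
  <iati-activity>
    <iati-identifier>XX-EXAMPLE-001</iati-identifier>
    <title>Rural water supply (illustrative)</title>
    <transaction>
      <value currency="USD">250000</value>
    </transaction>
    <transaction>
      <value currency="USD">100000</value>
    </transaction>
  </iati-activity>
</iati-activities>
"""


def summarise(xml_text):
    """Return one summary dict per activity, totalling transaction values."""
    root = ET.fromstring(xml_text.strip())
    summary = []
    for activity in root.findall("iati-activity"):
        values = activity.findall("transaction/value")
        summary.append({
            "id": activity.findtext("iati-identifier"),
            "title": activity.findtext("title"),
            # Assumes every value is in the same currency; real data may not be.
            "total": sum(float(value.text) for value in values),
        })
    return summary


activities = summarise(SAMPLE)
```

Because every publisher uses the same element names, a tool written once like this can in principle be pointed at files from many different donors. That is the practical benefit of a shared standard over each donor publishing in its own format.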

The Open Aid Partnership,[21] working closely with IATI, and hosted by the World Bank Institute, is focusing specifically on geodata standards for aid information, using the ‘Mapping for Results’ methodology developed with AidData[22] to geocode the location of aid projects and make this information available online. Geocoded data is seen as important to “promote ICT-enabled citizen feedback loops for reporting on development assistance”.[23]

A number of other high-profile sector transparency initiatives, such as the Extractive Industries Transparency Initiative (EITI)[24] and the Construction Sector Transparency Initiative (CoST),[25] are less open data or ICT-centred, opting instead for processes based on disclosure and audit of documents through local multi-stakeholder processes. However, the Global Initiative on Fiscal Transparency (GIFT),[26] which aims to “advance and institutionalise global norms and continuous improvement on fiscal transparency, participation and accountability in countries around the world”, has a ‘Harnessing new technologies working group’ led by the OKF, which has outlined a number of ways technology can be used for transparent and accountable finance.[27] The ‘Lead Steward’ organisations for GIFT are the International Monetary Fund, World Bank Group, Brazil Ministry of Planning, Budget and Management, the Department of Budget and Management Philippines, and the Washington-based CSO-project, the International Budget Partnership.[28]

Crowd-sourcing 

Transparency and accountability aren’t just about information and data from governments, companies or multilateral institutions. Input from citizens is crucial too. Crowd-sourcing projects such as Ushahidi,[29] first developed to monitor post-election violence in Kenya, have been deployed or replicated in a number of anti-corruption settings. Accepting submissions by SMS or online, these tools allow citizens to report problems with public services that might point to appropriation of funds, or to directly report cases of corruption. Reports are generally geocoded and the resulting maps are presented publicly online. With UN Development Programme (UNDP)[30] support, a Ushahidi-based corruption monitoring platform was established in Kosovo.[31] In India, the IPaidABribe.com platform, which was launched in 2010 by Bangalore-based non-profit Janaagraha,[32] has collected over 20,000 reports of bribery requests or payments.

UNDP analysis suggests that the success of social media and use of crowd sourcing in transparency and accountability projects relies upon transparent mechanisms for verifying reports, and the backing of institutions or systems that can convert information into action – such as ensuring corrupt tenders are cancelled.[33] In a global mapping of technology for transparency and accountability, The Transparency & Accountability Initiative[34] (a donor collaboration chaired by DfID and the Open Society Foundation),[35] found that many of the one hundred projects they reviewed were started by technology-savvy activists.[36] Where these were tailored to local context, and able to adopt a collaborative approach, involving governments and/or service providers, they were more likely to be sustainable and successful. Global Voices Online maintain a directory of over 60 case studies as part of their ‘technology for transparency network’.[37]

The internet is also being used actively by global advocacy networks such as the Land Matrix Partnership, who launched an online database of land deals at the World Bank Land and Poverty Conference in April 2012, seeking to highlight the growing issue of large scale land acquisitions across the world, particularly in Africa. This database, initially created through online collaboration of researchers, also accepts submissions through its website at http://landportal.info/landmatrix where reported data can also be visualised and explored.

Further activity and institutions

For reasons of space this report can only make passing mention of initiatives aimed at increasing parliamentary transparency through developing and implementing online tools for tracking legislative process and parliamentary debates. These have been established by civil society networks in a number of countries following models developed by the independent GovTrack in the US,[38] and the charity MySociety[39] with their TheyWorkForYou.com platform in the UK. MySociety, with support from Open Society Foundations and Omidyar Network,[40] have been focusing in 2012 on making their transparency and civic action tools easier to implement in other jurisdictions, opening up the Alaveteli code that powers the public right to information services WhatDoTheyKnow.com and AskTheEu.org, amongst others.

The funding for this work from the Omidyar Network, established by eBay founder Pierre Omidyar, draws attention to another set of important institutions and actors in the tech-for-transparency space: donors from the technology industry. Google, Omidyar Network, Cisco Foundation, and Mozilla Foundation amongst others have all been involved in sponsoring technology for transparency open source projects like Ushahidi, the work of MySociety, or data-journalism projects across the world. It is likely that without access to funding derived from internet industry profits, many of the current technology-for-transparency projects around would be far less advanced. 

This report has also not explored how institutions have responded to online leaking of information as part of transparency and accountability efforts. However, one project deserves a brief mention: the WCITLeaks website,[41] established to accept leaked documents relating to the revision of the International Telecommunications Regulations (ITRs) in response to the secrecy surrounding International Telecommunication Union (ITU) processes, and the lack of a civil society voice at the forthcoming World Conference on International Telecommunication (WCIT).

Exploring impact 

Technology for transparency is a rapidly growing field. The innovations may be emerging from civil society and internet experts (with much of the funding to scale up projects often coming ultimately from internet firms), but governments and international institutions are opting in to open data based transparency initiatives, and a number of institutions, from the World Bank, to the newly formed OGP, are active in spreading the technology for transparency message to their clients and members. However, there is little hard evidence yet of the internet becoming an integrated and core part of the global anti-corruption architecture, and many tools and platforms remain experimental, hosting just tens or hundreds of reported issues, and offering only limited stories of where crowd-sourced SMS reports, or irregularities spotted in open data, have led to corruption being challenged, and offenders being held to account.

McGee and Gaventa, in a review of general transparency and accountability initiatives funded by DfID, explain that the evidence base on their impact is limited across the field.[42] Limited evidence of the anti-corruption impacts of technology for transparency should therefore be taken as a challenge to improve the evidence base and focus on impact, rather than to step back from developing new internet-based approaches for transparency and accountability. Working out the impact of those projects that provide online information infrastructures as foundations for accountability efforts, from general open government data projects, to targeted transparency initiatives, will need particular attention if these efforts are to continue to receive institutional backing, and if the new loose-knit networks that provide many of these platforms are to continue to thrive.

(All links accessed 7th July 2012)


[3] UN Office on Drugs and Crime: http://www.unodc.org/unodc/en/corruption/

[4] Busan 2011 High Level Forum on Aid Effectiveness: http://www.aideffectiveness.org/busanhlf4/

[5] The Open Definition: http://opendefinition.org/

[6] World Bank Open Data Portal: http://data.worldbank.org/about

[7] Open Development Technology Alliance: http://www.opendta.org

[8] Rahemtulla, H., Kaplan, J., Gigler, B.-S., Cluster, S., Kiess, J., & Brigham, C. (2011). Open Data Kenya: Case study of the Underlying Drivers, Principle Objectives and Evolution of one of the first Open Data Initiatives in Africa. http://www.scribd.com/doc/75642393/Open-Data-Kenya-Long-Version

[9] Open Government Partnership: http://www.opengovpartnership.org/

[11] Open Knowledge Foundation: http://okfn.org/

[12] http://www.okfn.org/about/faq

[13] Open Spending (project) http://www.openspending.org

[14] Open Corporates: http://opencorporates.com/

[15] Financial Stability Board: http://www.financialstabilityboard.org/

[17] International Aid Transparency Initiative: http://www.aidtransparency.net

[18] UK Department for International Development: http://www.dfid.gov.uk

[21] Open Aid Partnership: http://www.openaidmap.org/

[24] Extractive Industries Transparency Initiative: http://eiti.org/

[25] Construction Sector Transparency Initiative: http://www.constructiontransparency.org/

[26] Global Initiative for Fiscal Transparency http://fiscaltransparency.net/

[28] International Budget Partnership: http://internationalbudget.org/

[29] Ushahidi: http://ushahidi.com/about-us

[30] UN Development Programme: http://www.undp.org/

[33] Tsegaye Lemma (2012), Corruption Prevention and ICT: UNDP’s Experience from the field. Presented at Joint Experts Group Meeting and Capacity Development Workshop on Preventing Corruption in Public Administration, UN DESA, New York, USA, 26 – 28 June. http://unpan1.un.org/intradoc/groups/public/documents/un-dpadm/unpan049778.pdf

[34] Transparency and Accountability Initiative: http://www.transparency-initiative.org/

[35] Open Society Foundations: http://www.soros.org/

[36] Avila, R., Feigenblatt, H., Heacock, R., & Heller, N. (2011). Global mapping of technology for transparency and accountability: New technologies. http://www.transparency-initiative.org/reports/global-mapping-of-technology-for-transparency-and-accountability

[37] Technology for Transparency Network: http://transparency.globalvoicesonline.org/

[40] Omidyar Network: http://www.omidyar.com/

[41] WCIT Leaks (project) http://wcitleaks.org/

[42] McGee, R., & Gaventa, J. (2010). Review of the Impact and Effectiveness of Transparency and Accountability Initiatives: Synthesis Report. http://www.dfid.gov.uk/R4D/Output/187208/Default.aspx. See also http://www.dfid.gov.uk/R4D/Search/SearchResults.aspx?ProjectID=60827 for other outputs of the research programme this report is taken from.

Open Data, Land, Gender

[Summary: very rough and speculative notes in response to a land coalition online dialogue]

The land coalition are hosting an online dialogue until 20th Feb looking at “using online platforms to increase access to open data and share best practices of monitoring women’s land rights”. It’s an interesting topic for a dialogue, particularly given that one of the most widely cited cases used to highlight potential downsides of open data relates to the digitisation of land records and their exploitation to the detriment of poor landholders. However, as platforms like the LandMatrix (aggregating together land investment reports from research and advocacy groups across the world) and Open Development Cambodia demonstrate, open data is also being used by citizens to monitor land rights issues.
In this post I share a few quick thoughts on the broad theme of open data, land and gender.

Open data and land

The dialogue asks about how online platforms are contributing to the opening of land data. There are three broad sources of data I can see:

Official data – where governments have well managed land ownership databases then as part of national open government data programmes citizens may be able to secure the ongoing publication of this data in open forms. In the United Kingdom we’ve recently seen the Land Registry place data online, detailing land sale transactions in CSV and linked data; and publicly owned land is a commonly featured dataset on local open data portals in the UK. However, this data itself may be tricky to use directly, and intermediaries are needed to make it accessible. In Kirklees, Who Owns My Neighbourhood presents an interesting approach to using official data, combining it with social features for citizens to input local knowledge and news about publicly owned plots of land: making official land data more ‘social’.

Crowdsourced data – in many cases there may not be an official source for the data activists want, or there may be limited prospect of getting access to the official data. Here a range of ‘crowdsourcing’ approaches exist. The LandMatrix approach uses researchers, and works to verify reports before sharing them. There may be other approaches available that use tools like pybossa to crowdsource extraction of structured information from semi-structured documents, or to split analysis of records into micro-tasks. The Open Street Map platform may also be able to act as a source of data, allowing tags to be applied to land. Tools like CrowdMap (based on the Ushahidi platform) make it possible to collate reports submitted on a range of platforms including phone, and to verify reports, although the challenge with any crowdmap project is recruiting people to submit data.

Inferred data – at one of the RHOK Hack Days I took part in at Southampton I was interested to hear about a group’s project using satellite data to work out crop types on plots of land. I suspect there are ways this data could be used to detect changes in land use that might also indicate changes in ownership, such as the conversion of land from multiple crops to large agribusiness.

Using land data

Having open data on land ownership and land rights is only one part of the story. As the Bhoomi case illustrates, the regulatory framework around the data matters: is a dataset taken as authoritative, or are documents or other customary practices able to override the descriptions held in data? Does the data model through which land ownership and rights are described capture the subtlety and nuance of land use practices (see Srinivasan’s field note for a discussion of the need to mash-up multiple schemas of data to get a view of complex land practices)? And what intermediaries are active to help citizens mobilise land records to secure their rights, rather than those records being only truly accessible to private actors with technical and financial capital?

In the ongoing Land Coalition dialogue I’m interested to learn more about the cases of how data on land rights is being mobilised to create change: whether at the level of global advocacy, where big numbers may matter most; or at the level of individual struggles over ownership, access and rights, where detailed, accurate and timely data on particular plots is likely to be most important.

Open data and women’s land rights

I will admit to knowing very little about the specific issues around women’s land rights. However, in making the connection between open data and women’s land rights I did want to briefly explore whether a focus on digital platforms and open data introduces any particular gender issues. For example, whilst statistics on mobile phone penetration in developing countries suggest widespread access to mobile devices, there is a significant gender gap in mobile ownership and access, with women much less likely to have control of a handset than men. Gender issues may also arise in relation to the culture and practices around open data.

In a recent First Monday article, Joseph Reagle suggests that the ‘free culture’ movement associated with open source software and open knowledge products like Wikipedia possesses a gender gap that is potentially even greater than that of the very gender unequal general computing culture from which it arose. Reagle argues that the ideas of ‘openness’ current in these communities can be used to dismiss concerns about gender gaps, and paint them as an issue of choice, rather than highlighting the wider structural factors that lead to the massive underrepresentation of women in online free software and open knowledge construction. For example, Reagle points to the “double shift” of women’s time, and the ways in which the ‘free time’ used to contribute to creation of open culture, whether through evenings away from work, or hack-days and other events, is unequally distributed between women and men.

Does this critique carry across to open data? It is apparent that the open data field is far from gender equal – at least in terms of advocates for open data, and the creators of tools, platforms and analysis built upon data – although whether it is male dominated to the extent that other fields such as open source contribution are is yet to be measured. In part any gender imbalance may be attributed to the connections between the open data community and the open source and free culture communities, which already have a significant gender imbalance. However, we should also be open to deeper issues of epistemology: whether the very notion of resolving questions of ownership or fact through datasets, rather than through processes of dialogue, is itself gendered. How far advocacy to open up datasets moves into advocacy for the primacy of data over other ways of knowing, and how data is used and interpreted, has a bearing on whether gendered systems of power are being reinforced or challenged.

An ongoing discussion…

The above remarks are just some first thoughts on the topic. The Land Portal dialogue is running for another week, and I’m looking forward to spending time looking at what others are saying to better understand how open data and land can connect in constructive and positive ways.

I hope we might also develop some lines of the gender discussion more in upcoming work of the Open Data in Developing Countries project.

Notes on open government data evaluation and assessment frameworks

The evaluation of open data initiatives has become an increasingly pressing concern for many. As open data initiatives have proliferated, there have been a number of attempts to develop assessment, monitoring and measurement frameworks that can inform policy, and that will support comparative assessment of different open data efforts, or that can guide the creation of new initiatives. In this post I look at a number of the frameworks that have been put forward, or are currently in development. This post is part of my thinking aloud in planning for some common research tools in the Exploring the Emerging Impacts of Open Data in Developing Countries project, and in putting together a methods section for my PhD.

My working notes for this post, with a short summary of each of the frameworks described can be found here.

What is being measured?

The frameworks I explored fall into three broad categories:

  • Readiness assessments – looking at whether the conditions exist for an open data initiative to be started or successful. This category includes the Web Foundation Open Government Data Feasibility Studies and World Bank Open Data Readiness Assessment.
  • Evaluating implementation – looking at whether existing initiatives, or organisations, meet some criteria for ‘good’ open data implementation. This was the largest group, including the Five Stars of Linked Open Data (Berners-Lee, 2010); The Open Data Census [LINK]; The Open Data Index (Farhan, D’Agostino, & Worthington, 2012); mOGD-I; MELODA (Garcia, 2011); The State of Open Data method (Braunschweig, Eberius, Thiele, & Lehner, 2012); the assessment of open budgetary data in Brazil (Craveiro, Santana, & Alburquerque, 2013); Grading Government’s Open Data Publication Practices (Harper, 2012); and the Data Openness Index and Government Data Openness Index (Murillo, 2012).
  • Impact assessment – none of the frameworks I looked at explicitly address impact (though there are a number of studies that have developed methods to try and quantify economic impacts of open data (Vickery, 2011)), but a few frameworks in development do seek to make connections between implementation and different kinds of potential open data impacts (Jetzek, Avital & Bjorn-andersen, 2012; Huber, 2012).

The frameworks I explored operate at a number of different levels. Readiness assessments tend to operate at the country level, although the World Bank suggest their Open Data Readiness Assessment can also be applied at sub-national levels.

Implementation assessments may target a variety of:

  • Individual datasets
  • Open data portals
  • Individual institutions
  • Open data initiatives
  • Whole countries

A number of frameworks generate aggregate assessments of initiatives, portals or institutions based on aggregating up numerical scores for the ‘openness’ of datasets belonging to that parent entity. For example, MELODA, and a recent implementation of the Five Stars of Open Data on Data.gov.uk assign scores to institutions based on an average of the scores assigned to their individually published datasets.
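The averaging step described above can be sketched in a few lines. This is only an illustration of the general idea, not the actual MELODA or Data.gov.uk calculation: the function name `institution_scores`, the departments and the star ratings are all invented for the example.

```python
from collections import defaultdict
from statistics import mean


def institution_scores(dataset_scores):
    """Average per-dataset openness scores up to the publishing institution.

    dataset_scores: iterable of (institution, score) pairs, where score is
    some numeric openness rating (for example a 0-5 star rating).
    Returns a dict mapping each institution to the mean of its dataset scores.
    """
    by_institution = defaultdict(list)
    for institution, score in dataset_scores:
        by_institution[institution].append(score)
    return {name: mean(scores) for name, scores in by_institution.items()}


# Hypothetical ratings, invented purely for illustration.
example = [
    ("Department A", 3),
    ("Department A", 5),
    ("Department B", 1),
    ("Department B", 2),
    ("Department B", 3),
]
scores = institution_scores(example)
```

One consequence of this design is that a simple mean says nothing about spread: an institution with one exemplary dataset and many poor ones can score the same as one whose datasets are uniformly middling.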

How does measurement take place?

There are a number of non-mutually exclusive approaches to measurement, including:

  • Survey of technical features – identifying a list of features that datasets or data portals should possess, and carrying out an automated, or manual, survey of whether these features are in place. These approaches are generally agnostic as to the subject of the data, but are interested in whether datasets are machine readable, openly licensed and well catalogued, as in Braunschweig et al. (2012), Garcia (2011) and the 5 Stars of Linked Open Data.
  • Specific dataset checklist – these approaches determine a short list of particularly important datasets and ask about whether these are available, and then conduct a technical assessment of these particular datasets. The Open Data Index, and Open Data Census both adopt this approach.
  • Domain specific assessments – Harper’s grading of US departments’ dataset publication practices identifies ideal features of specific datasets, and evaluates them against these (Harper, 2012). For example, where a standard exists for representation of a particular kind of data, it would judge a department higher where it adopts this standard.
  • Added value features – The Open Data Index, and the proposed mOGD-I model include questions on whether applications have been built on top of data, or whether there are accompanying tools around datasets. The readiness assessments also consider the capacity of states to support and stimulate activities that might increase uptake and use of open data.
  • Features of the environment – the readiness assessments major on this, describing social, technical, legal, political, economic and organisational contexts for open data.
  • Expert surveys – most assessment frameworks draw to a degree on survey methods, even though some attempt to automate elements. In most cases a single informant is used.
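To make the ‘survey of technical features’ approach in the list above concrete, here is a rough sketch of what an automated check over a dataset’s catalogue metadata might look like. Everything in it is an assumption for illustration: the field names (`format`, `licence`, `title`, `description`), the list of machine-readable formats and the licence identifiers are not taken from any of the frameworks discussed.

```python
# Assumed lists for illustration only; a real framework would define its own.
MACHINE_READABLE = {"csv", "json", "xml", "rdf"}
OPEN_LICENCES = {"ogl-uk", "cc-by", "cc0", "odc-pddl"}


def survey_dataset(record):
    """Check one catalogue record against three technical features.

    record: dict of metadata with hypothetical keys 'format', 'licence',
    'title' and 'description'. Returns a dict of feature name -> bool.
    """
    fmt = (record.get("format") or "").lower()
    licence = (record.get("licence") or "").lower()
    return {
        "machine_readable": fmt in MACHINE_READABLE,
        "openly_licensed": licence in OPEN_LICENCES,
        # 'Well catalogued' is reduced here to having a title and description.
        "catalogued": bool(record.get("title")) and bool(record.get("description")),
    }


# Hypothetical record: openly licensed, but published as PDF with no description.
record = {"title": "Spending data", "format": "PDF", "licence": "CC-BY", "description": ""}
result = survey_dataset(record)
```

A check like this is agnostic about what the data describes, which is both the strength and the limit of the approach: it can be run across a whole portal, but it cannot say whether the datasets that matter most are present.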

Some frameworks look to generate a single number that can be used to rank the subject of analysis, as in the case of the Open Data Index, MELODA, or the Data.gov.uk implementation of the 5-stars of open data model. Other frameworks present a multi-dimensional assessment of their subject, either omitting aggregation altogether, or providing aggregation along a number of dimensions such as legal, organisational, technical etc.
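The trade-off between a single ranking number and a multi-dimensional assessment can be shown with a small sketch. The dimension names follow the examples in the paragraph above; the scores, the weights and the function `composite_score` are hypothetical and do not reproduce the method of any framework named here.

```python
def composite_score(dimensions, weights=None):
    """Collapse per-dimension scores into one number via a weighted mean.

    dimensions: dict of dimension name -> numeric score.
    weights: optional dict of dimension name -> weight (defaults to equal weights).
    """
    if weights is None:
        weights = {name: 1.0 for name in dimensions}
    total_weight = sum(weights[name] for name in dimensions)
    weighted_sum = sum(dimensions[name] * weights[name] for name in dimensions)
    return weighted_sum / total_weight


# Two invented profiles: one uneven, one flat.
country_a = {"legal": 8.0, "organisational": 2.0, "technical": 5.0}
country_b = {"legal": 5.0, "organisational": 5.0, "technical": 5.0}

equal_a = composite_score(country_a)
equal_b = composite_score(country_b)

# An assumed weighting that counts the legal dimension double.
legal_heavy = {"legal": 2.0, "organisational": 1.0, "technical": 1.0}
weighted_a = composite_score(country_a, legal_heavy)
weighted_b = composite_score(country_b, legal_heavy)
```

With equal weights the two profiles receive the same composite score despite looking very different dimension by dimension, and changing the weights changes the ranking. That is the information a multi-dimensional presentation keeps visible and a single number hides.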

What does all this mean for the ODDC project?

In the Exploring the Emerging Impacts of Open Government Data in Developing Countries research project there are a number of things we want to try and understand.

  1. How does the context that an open data initiative operates within affect the use of data in governance processes?
  2. How do the technical features of an open data initiative affect the use of data in governance processes?

The first question draws upon the sort of data that might feature in a readiness assessment. The second draws upon the sort of data gathered in an implementation assessment. Like Huber (2012) and Jetzek et al. (2012), we are hypothesising that the way an open data initiative is implemented may be slanted towards particular kinds of data re-use and thus impacts. By trying to connect context, implementation and impacts, we will be looking to both draw upon, and inform the further development of, evaluation frameworks.

Within the project we need to be able to perform evaluation at two levels:

  • The macro level – as we build upon learning from the Web Index to refine methods of generating country-level indicators that can inform an assessment of the extent to which a country has capacity to benefit from open data, and the extent to which this is being realised.
  • The case level – as the individual qualitative cases in developing countries generate comparable descriptions of how open data has been used.

The development of the macro level framework will be an ongoing task over the next year, but with the individual cases kicking off very soon, there is some immediate work to be done to develop two resources: a simple contextual questionnaire for describing the environment in a country or city; and a dataset assessment tool that can be applied at the level of individual datasets, collections of datasets, or intermediary platforms.

Hopefully a further iteration of working through the frameworks listed in this post will inform the development of these. As I get started on this task I would welcome pointers to any resources I have missed.

References

Berners-Lee, T. (2010, July). Linked Data – Design Issues. Retrieved from http://www.w3.org/DesignIssues/LinkedData.html

Braunschweig, K., Eberius, J., Thiele, M., & Lehner, W. (2012). The State of Open Data Limits of Current Open Data Platforms. WWW2012. Retrieved from http://www2012.wwwconference.org/proceedings/nocompanion/wwwwebsci2012_braunschweig.pdf

Craveiro, G. da S., Santana, M. T. De, & Alburquerque, J. P. de. (2013). Assessing Open Government Budgetary Data in Brazil. ICDS 2013.

Farhan, H., D’Agostino, D., & Worthington, H. (2012). Web Index 2012. Retrieved from http://thewebindex.org/2012/09/2012-Web-Index-Key-Findings.pdf

Garcia, A. A. (2011). Methodology for Releasing Free Data (MELODA) (pp. 1–15). Retrieved from http://meloda.org/index.php/meloda/category/1-meloda

Harper, J. (2012). Grading the Government’s Data Publication Practices.

Huber, S. (2012). The fitness of OGD for the creation of public value. In P. Parycek, N. Edelmann, & M. Sachs (Eds.), CeDEM12 – Proceeding of the Conference for E-Democracy and Open Government. CeDEM.

Jetzek, T, Avital, M., & Bjorn-andersen, N. (2012). The Value of Open Government Data : A Strategic Analysis Framework. Orlando. Retrieved from http://openarchive.cbs.dk/handle/10398/8621

Murillo, M. J. (2012). Including all audiences in the government loop: From transparency to empowerment through open government data.

Vickery, G. (2011). Review of Recent Studies on PSI re-use and related market developments. Paris.

 

Exploring incentives for transparency in developing countries

[Summary: brief reflections on the dynamics of transparency in developing countries]

Doug Hadden of FreeBalance (developers of Public Financial Management software) has posed the question “What are the Incentives for Transparency in Developing Country Governments?”. Doug notes that many of their developing country customers have been interested in implementing transparency portals such as Transparency.gov.tl, and transparency has been a major topic of conversation at their annual user group meeting.

My initial draft of a comment became rather long, so here are a few reflections in reply to that question by way of a blog post.

Framing the question

First, we need to identify whether a distinction between developed and developing countries has particular relevance to this question. There are three main areas where the distinction could be being drawn: degree of political freedoms and democracy; levels of corruption; and state capacity and effectiveness. Malesky et al. comment on the fact that we might expect the dynamics of transparency initiatives to be different in more authoritarian regimes. We might anticipate both that authoritarian governments have less incentive to pursue transparency, and that if transparency is pursued, it is less likely to be effective in changing policy and implementation outcomes, further undermining the case for its adoption. A similar incentive issue may exist for regimes with high levels of corruption. If political elites are seen to be corrupt, then it may be surprising to see those elites adopt and pursue transparency policies. Lastly, on the question of state effectiveness, it might be argued that it is surprising that a democratic state with limited capacity adopts transparency as a policy instrument over other available public sector reforms. In his chapter in Corruption and Democracy in Brazil, Bruno Speck discusses the importance of empowered audit and oversight institutions to ensuring effective use of public finance. Transparency may be a means by which actors outside the state can put things on the agenda of empowered institutions, but without effective state mechanisms to enforce compliance with laws once problems are identified, it may look to be a flawed policy tool. All these distinctions (levels of freedom, corruption and state capacity) might have some degree of correlation with the development status of a country, though the line is not clear cut.
Alexandru Grigorescu’s paper on international organisations and government transparency points to one further distinction worth noting: the higher levels of involvement of international organisations in developing countries.

Secondly, we need to identify what sort of transparency we are talking about. David Heald suggests we need to distinguish four directions of transparency: upwards (hierarchical relationships, where the superior can see the actions of the subordinate); downwards (where the ruled can see the behaviour and results of their rulers, and agencies can see behaviour up the management chain); outwards (where agents inside an organisation can see what is happening outside it); and inwards (where those outside can observe what is happening inside the organisation). Using these categories we can interrogate how a particular transparency initiative is functioning. For example, a transparency portal may be giving inwards and upwards transparency to government, but it may not only be giving new insights to citizens: it may also be allowing agencies who previously struggled to get hold of information, due to bureaucratic blocks in mid-level agencies or departments, to more effectively access the information they need to do their jobs. It is also important to answer the question ‘transparency of what?’. Transparency of outdated information, or information with little political salience, is dramatically different from releasing up-to-date information on the most recent public spending, such as occurs through Brazil’s transparency portal.

With these distinctions in mind, what might some of the incentives for transparency be? All the following are hypotheses only, and more work would be needed to track down data to explore them further, or to find studies that look at these effects in more depth.

1) The figleaf
Starting with a sceptical suggestion. Publishing low-salience information with a large fanfare can be a good way to gain attention and initial credibility without actually facing high political costs. Similarly, in regimes with low state effectiveness, where corrupt activity isn’t captured in the data, or there are no balancing audit and reconciliation mechanisms such as exist in the Extractive Industries Transparency Initiative, the potential credibility gain from developing a transparency initiative outweighs the potential risks. With growing international focus on transparency initiatives, the reputational pay-off from adopting an initiative may be high right now, and may allow other more substantive reforms to be sidelined.

2) International and external pressure
Less sceptically, we might see transparency initiative adoption as a genuine measure by governments, but primarily taking place in response to international pressure or funding. This might be from international agencies, as donors fund and require transparency and governance reforms. Aid Transparency portals in particular may come down to pressure from donors to have accountability on how funds are being spent. Or it might be from business, and markets, as assessments of doing business in a country are affected by the degree of transparency.

3) Bottom up citizen and political pressure
Citizens may be demanding transparency. Certainly in the global development of Right to Information legislation, bottom up citizen pressure has played a significant role. Where democratic mechanisms are operating, then citizen pressure can provide incentives for greater transparency. Similarly, as Francis Maude often states, political parties in opposition are often advocates of transparency.

4) Improving information flow
Effective states need to process a lot of information, and transfer it between many different organisations and agencies. Doing this inside the state, in access-controlled ways, through person-to-person relationships can be complex and costly, and involve lots of interoperable IT systems. By contrast, with open data, you place data online in a standard format, and then anyone who needs it can come and take a copy (or so the theory goes; note that feedback loops from the previous person-to-person relationships fall out of the picture here). Publishing data transparently can get around bottlenecks in information exchange. This may be particularly important when public services are being delivered by lots of non-state actors who could not be brought inside government systems in any case.

This is certainly part of the idea behind the International Aid Transparency Initiative, which seeks to ensure aid receiving governments and agencies can get a view of available resources without having to spend considerable labour requesting and reconciling information from many different sources. Here, the goal is efficiency through outwards and horizontal transparency, and other forms of upwards transparency and visibility of data to citizens may be a by-product.

5) Addressing principal-agent problems
Principal-agent problems concern the challenge a principal (e.g. the government) faces in motivating an agent (e.g. a contractor) to act in the interests of the principal, rather than in the agent’s self-interest. There are all sorts of principal-agent problems at work in government. For example, the citizen as principal, trying to get government as agent to act in their interests; central government as principal, trying to get an implementing agency to act in their interest; or donor as principal, trying to get a government to act in their interest. Transparency can play a role in all of these, though the form the transparency may take can vary.

Governments are not monolithic. Corruption benefits certain actors in government, and not others. Transparency can be a policy that one area of government uses to secure the behaviour of another, through allowing parties outside of government to provide the scrutiny or political pressure needed to address an issue. The nature of transparency mandates is interesting to explore here. Transparency in one area of government can also empower another. For example, both the UK and China have sought to increase the transparency of local government. This may increase citizen oversight of government, but it can also increase upwards transparency of the periphery to the centre, strengthening central government capacity.

Exploring further
This post has taken a fairly general view of some of the dynamics that might be in play in a decision to adopt a transparency initiative. There are undoubtedly other significant dynamics I’ve missed. And, following my own point on distinguishing both the type and subject of data being made more transparent, any more detailed account is likely to need to be about transparency in particular domains rather than in general.

Of course, looking back I suspect I may have misread Doug’s question, which could have been asking more for arguments that can be used to convince governments to adopt transparency, rather than an analytical look. However, I hope some persuasive arguments in favour of transparency can also be distilled from the above.

References
Grigorescu, A. (2003). International Organizations and Government Transparency: Linking the International and Domestic Realms. International Studies Quarterly, 47, 643–667.

Malesky, E., Schuler, P., & Tran, A. (2012). The Adverse Effects of Sunshine: A Field Experiment on Legislative Transparency in an Authoritarian Assembly. American Political Science Review, 106(4). doi:10.1017/S0003055412000408

Power, T. J., & Taylor, M. M. (2011). Corruption and Democracy in Brazil: The struggle for accountability. University of Notre Dame.

Open Data in Developing Countries


The focus of my work is currently on the Exploring the Emerging Impacts of Open Data in Developing Countries (ODDC) project with the Web Foundation.

MSc – Open Data & Democracy