The UK Home Office is developing a Law Enforcement Data Service (LEDS) and Home Office Biometrics (HOB) programme that will bring together formerly separate policing databases and provide new Automated Facial Recognition (AFR) capabilities.
The Police National Computer (PNC) has existed for over 45 years, and contains criminal records. The Police National Database (PND) was created in 2007 and primarily contains 'intelligence' records used by police forces. The HOB programme provides biometrics services used in law enforcement and in immigration and asylum cases, and also manages the National DNA Database.
This is a controversial development, and has a complex data governance structure including the National Law Enforcement Data Programme (NLEDP) Senior Leadership Team, Senior Responsible Officer, NLEDP Board, NLEDP Business Design Authority and OCiP (Operational Communication in Policing) as a voice of the police service within the project.
The Home Office wanted a mechanism to engage with civil society during the development of the project. It describes the 'Open Space' as intended "to establish a productive space where the Home Office and civil society could have safe and productive conversations about the National Law Enforcement Data Programme (NLEDP)". This operates in parallel to formal consultation channels.
The process was established by the Home Office, and facilitated by Involve. 24 CSOs, together with participants from regulatory and oversight bodies, took part in the workshops.
Between the start of the project in July 2018 and the first 'annual report' in May 2020, nine full- or half-day workshop sessions took place. Background papers were prepared for each and circulated in advance.
Workshops involved presentations, plenary discussions and table discussions with Home Office representatives responsible for the areas under discussion, civil society organisations and other invited regulatory and oversight bodies.
Discussions took place around Data Protection Impact Assessments (DPIAs); Custody Image Policy; Data Sharing; Data Quality; Data Retention; Individual Rights; Access Levels and Controls; the National Register of Missing Persons; Governance, oversight & inspection; LEDS Code of Practice & training; Systems demonstrations; Audit process; and the overall Open Space process.
The first annual report notes a number of areas where the discussion had an impact on developing policy, either as a result of helping the Home Office to understand concerns, improving the ways issues would be communicated, or informing decisions. The report also notes a range of 'sticking points and outstanding issues' where concerns from civil society remained.
The report also notes that 24 civil society organisations have taken part in some or all of the Open Space workshops, but these organisations have not been identified. This was justified on the basis of allowing a frank and open exchange of views. However, it means that little is reported about the particular groups whose interests were represented in the process.
While background materials were intended to be confidential, one CSO participant (Privacy International) has published many of the documents on its website to inform independent advocacy and campaigning on the NLEDP.
The first annual report states that changes were made to the architecture for police database access to records from the Driver and Vehicle Licensing Agency (DVLA) as a result of discussions in the Open Space.
Participation in the Open Space appears to have supported one CSO, Privacy International, to pursue outside advocacy and to call for greater parliamentary scrutiny of the project.
No information after the May 2020 annual report was found.
The Open Space experience explored data governance through a collective lens by acknowledging that data processing affects specific groups differently (such as children and immigrants), a debate that was raised in many workshops and, as a result, is now part of the Home Office agenda.
Dominic Smith
Seven Shooter
Remote mobile phone-based data collection offers a key opportunity to better understand lived experience of mental health. However, participation in app-based remote studies often drops off quickly.
The Wellcome Trust commissioned Sage Bionetworks (Sage) to carry out feasibility tests and prototype a global mental health databank (GMHD) to capture rich, longitudinal, electronically-derived data from young people with a focus on mental health, and to support research into the approaches, treatments, and interventions that may be relevant to anxiety or depression in 14-24 year olds.
The project placed a focus on understanding appropriate models of data governance and addressed the question of "How do we create a data governance structure that gives real voice to youth?". Work was informed by an initial set of principles:
The results of this engagement were used to produce an assessment against four ‘Go/No-go’ criteria, resulting in a judgement that the project was viable against ‘data governance and ethics’ and ‘data specification and structure’ criteria, uncertain against criteria on engagement levels, and raising a ‘Stop’ flag against ‘funding sustainability’ due to concerns about commercialisation of young people’s mental health data.
Findings were also written up as a ‘data governance specification’ to be used in the design of any future stages of the databank development.
Data should be available to researchers globally: with a need to balance privacy concerns with open science practice.
The project needs a diversity of participants - including collecting data across geographical and cultural boundaries.
The project has set up panels of young people with lived experience of depression and anxiety in three countries.
Panels have been given formal decision making responsibilities in the project.
Eric Ward
Felicia Buitenwerf
The United Kingdom conducts a full population census every ten years. Data from the census is widely used to inform national and local policy making.
Individual census records are managed securely and the ONS produces and publishes statistical datasets broken down by geography, demography and other variables. These data extracts are carefully designed to make sure that personal data is not accidentally revealed, and, because of the cost of producing and quality controlling each extract, only a subset of the theoretically possible breakdowns is ever produced and published.
The Office for National Statistics has run a series of consultations to shape the design of Census data collection, and to determine priorities for the publication of census data and extracts. The ONS used a written consultation process to gather feedback on priority extracts to create, categories to use, and how these should be labelled.
As a result of the consultation a number of changes were made to the categories that will be used to present data and to the schedule for data release.
In a number of cases, ONS committed to carry out further research.
The categories used in the census, and the data that is made available from the census, can have major impacts on group identity and access to resources. Census data is used to make policy and allocate funding, and may feature in the training of machine-learning models.
Many of the respondents to the consultation represent groups with particular data needs, or who might be specifically affected by the choice of categories or disaggregations, or the publishing schedule, for the census.
Ryoji Iwata
Yolanda Suen
In 2017, Monash University in Australia made a commitment to achieve Net Zero carbon emissions by 2030. As part of this programme of work, the University initiated a four-year project to explore how to apply net zero principles to the precincts (local area) around University buildings. This involves identifying opportunities to apply 'smart city' technologies such as sensor networks and micro-grids to an area used by students as well as local businesses and residents.
The introduction of new urban data collection tools and platforms, even when oriented towards public benefit, can raise concerns about issues of privacy, ownership and control. The proposed technologies, whilst introduced by the University, stand to affect both university and resident communities.
Researchers identified that it was important to find ways in which local citizens might have greater oversight of, involvement in, or say over, the deployment of technologies as part of the Net Zero precinct.
Researchers adopted a two-stage process, involving idea generation through online workshops, and then a structured process to understand how ideas might be prioritised and assessed.
The Net Zero precinct project is ongoing and has not recently reported any updates.
It is not clear if any of the ideas developed through the workshops will be adopted as part of the wider Monash Net Zero project.
Participants were concerned with how data about them might be used.
Citizens were brought together to identify data governance concerns in relation to a smart city project, and to identify, prototype and evaluate future participatory tools, methods or processes that would allow a wider group of citizens to be engaged in ongoing data governance.
While a number of the prototype ideas emphasise individual models of control over data (e.g. 'Your own data dashboard'), at least one (Participatory Planning) envisions greater transparency over proposals for data use, and encouraging citizens to think together about how data might be used, including incorporating discussion and voting features.
Octavian Rosca
Yeshi Kangrang
Engineers or managers of technology projects often face decisions about how to use data in order to achieve their goals.
The data governance frameworks commonly adopted for public sector technology projects often exclusively emphasise data protection and personal data, and may have "only fuzzy, or in fact negative, protection in place for the public interest more broadly."
In two data governance clinics run by academics from Tilburg University for projects in the city of Amsterdam, leaders of projects facing data governance questions were supported to think through their projects, using a facilitated discussion.
The process helped project engineers and leaders to formalise a conception of how their work served the public interest, and to assess:
Citizen trust & decisions over whether or not to use particular data sources.
The data governance clinics frame discussions in terms of public interest and value, emphasising a collective framework.
Yeo Yonghwan
Ryoji Iwata
"The Data Assembly is an initiative from The GovLab supported by the Henry Luce Foundation to solicit diverse, actionable public input on data re-use for crisis response in the United States. The initiative began in the summer of 2020 with an initial focus on the response to the COVID-19 pandemic in New York City. The GovLab, New York Public Library, and Brooklyn Public Library co-hosted remote deliberations with three “mini-publics” featuring data holders and policymakers, representatives of civic rights and advocacy organizations, and New Yorkers from across the five boroughs."
The motivation for hosting a data assembly is described as a desire to explore the balance between under- and over-sharing of data:
The Data Assembly deliberations took place during July and August of 2020. The GovLab and its partners at the New York Public Library and Brooklyn Public Library facilitated 90-minute remote video conferences with the data holders and policymakers mini-public and the rights groups and advocacy organizations mini-public. Both of these consultations involved between 15–20 experts curated using the GovLab’s Smarter Crowdsourcing methodology. The New Yorkers Mini-Public deliberation occurred on Remesh, an online research and public engagement platform. This consultation featured 55 New York City residents, sourced through the Remesh sampling methodology, with a focus on diversity across age, gender, income, and borough of residence. The Remesh platform provided participants with the ability to respond to polling questions, free-form text prompts, and to indicate their support for the contributions of their fellow participants.
Participants in each of the three mini-publics were presented with three generalized examples of data being re-used to support the response to COVID-19.
Cross-cutting recommendations (in all three mini-publics):
Data re-use for crisis response in the United States.
The outcomes of the COVID-19 mini-publics addressed a range of collective or community benefits from data sharing, and expressed concern for communities who might be under-represented in current data. They include recommendations for ongoing mechanisms that guarantee public oversight of data sharing and re-use action, and opportunities for public input and accountability. The recommendations highlight the need for a layered approach to participation, with publics, data intermediaries and data stewards all involved in data governance.
Martin Sanchez
M ACCELERATOR
The GovLab, in partnership with UNICEF’s Health and HIV team in the Division of Data, Analysis, Planning & Monitoring and the Data for Children Collaborative, ran a rapid process to develop, refine and validate a topic map on research issues in Adolescent Mental Health, with the goal of informing the design and prioritisation of future data collaboratives.
The process used desk research, two online workshops, and an online survey inviting workshop participants to select priorities from the co-developed topic map.
A number of themes were added to the topic map as a result of workshop inputs.
Which research questions should be prioritised when planning potential data collaboratives?
Many of the topics raised in the research map have a collective aspect, such as peer-groups, youth engagement, migration and provision of services to young people.
However, the focus of the participation activity does not appear to have led to collective data governance being addressed directly. Instead, the goal has been to gather a diversity of perspectives from different communities and settings.
Photo by Rich Smith on Unsplash
Topic map developed by the project.
Administrative Data Research UK (ADR UK) was created in 2018 to support researchers to access public sector data, with the goal of improving the availability of research for policy making [1].
ADR UK describes administrative data as “an invaluable resource for public good”, and works to facilitate researcher access to data, including sensitive data containing records on individuals. This requires making, and advising on, decisions about when data should or should not be shared, and how its use should be governed.
Under the UK Digital Economy Act (2017) legal framework, ‘public good’ (sometimes referred to as ‘public interest’ or ‘public benefit’) is broadly defined. Legal public good uses of data may include: to provide evidence for public policies, services or decisions which benefit our economy, society, or quality of life; to extend understanding of social or economic trends and events; or to improve the quality or understanding of existing or proposed research [2].
In June 2022, ADR UK and the Office for Statistics Regulation, supported by Kohlrabi Consulting, carried out deliberative one-day workshops in London, Cardiff, Glasgow and Belfast, plus one online workshop for those unable to join in person, with participants from around the country, to explore what the public perceives as ‘public good’ (or ‘public interest’) uses of data. In July 2022, a follow-up workshop with 10 participants from the earlier workshops reviewed and validated the analysis of the first workshops, addressed questions raised by the Project Advisory Group, and explored how the views raised could be applied in practical guidance.
The project concluded with five main findings from the feedback of participants:
1. Public Involvement: Members of the public want to be involved in making decisions about whether public good is being served;
2. Real-World Needs: Research and statistics should aim to address real-world needs, including those that may impact future generations and those that only impact a small number of people;
3. Clear Communication: To serve the public good, there should be proactive, clear, and accessible public-facing communication about the use of data and statistics (to better communicate how evidence informs decision-making);
What ‘public good’ means to the public and, consequently, how administrative data about them should be used for research and statistics.
Through a workshop agenda that posed the question “Does data use count as ‘public good’ if some people benefit while others’ situation remains unchanged?”, participants were invited to explore situations in which the effects of data usage go beyond individual data subjects. That is exemplified by the “real-world needs” feedback from participants, in which they stated the need to use data to address needs in a way that could impact “future generations” or that only relate to a specific group (“a small number of people”).
Credit: Petr Kratochvil
Credit: Lukas
Virtual Reality and 'metaverse' platforms support the creation of public and private virtual spaces that enable rich social interaction. These spaces can support positive social interaction, but may also enable bullying, harassment and other problem behaviours.
Many virtual worlds can be accessed and used by a global community of users. Platform providers, such as Meta (formerly Facebook), are seeking to establish standard rules and procedures that can apply to all the spaces they host, including addressing the right of platforms to access data on, and intervene in, 'private' spaces.
In November 2022 Meta announced a plan to host a deliberative 'Community Forum' as a means of securing broad input into the design of governance processes for products such as Horizon Worlds.
The quantitative polling component of the project addressed seven key questions:
The Deliberative Democracy Lab at Stanford University, working with the Behavioural Insights Team (BIT), ran a global deliberative polling exercise involving more than 6000 deliberators from 28 countries, with a parallel control group who took part in polling but did not undertake deliberative activities.
Participants were recruited through a network of 14 research partners, with a sampling strategy designed to understand regional (but not country level) differences, and with results weighted to support global generalisation.
Participants were polled twice with a common set of questions: once before and once after deliberation had taken place.
Deliberation took place online through the Stanford Online Deliberation Platform, which presented all participants with common background information, managed speaker queues and questions, and captured polling responses.
Deliberation had a moderate impact on the polling outcomes.
Meta has not yet published a response to the report, but has committed to carrying out a future Community Forum on AI.
Should metaverse platforms be making recordings of 'private' online spaces? Should reports of abuse be shared with creators of these spaces, and/or with the platforms they are hosted on? What personal information from those reporting abuse should be shared?
The process appears to be framed in terms of individual participants in a virtual social space, and creators, described as individuals.
Photo by UK Black Tech on Unsplash
Credit Alestivak - Wikimedia
Building on an existing pilot in which a community struggling with noise pollution came together to gather data to take action in their local area, the project added tools to create the conditions under which data-producing citizens can make informed decisions about the data they share.
Since this type of data is very granular, the community members had concerns about the detailed information they were giving away and how this could be used, for example, by private companies to profile homes subject to certain pollution levels, with associated negative impacts on housing prices or insurance premiums.
Participation was initially encouraged through the neighbourhood community previously involved in the Making Sense project. After an open call, participants were selected to provide a geographic spread across Barcelona, as well as a mix in terms of gender and age.
Workshops were carefully designed to take users on a journey as a community using Smart Citizen Kits (an open hardware sensor) to gather data on noise and air pollution from inside and outside their homes, and collectively decide how they would share the data they gathered. These sensors were directly integrated into the city’s sensor network, Sentilo, to influence city-level decisions.
Consideration was taken to help onboard people with the technology (through step-by-step guidance and the creation of community-level indicators) and emphasize how it could be used as a tool.
The information provided by the collected data (for example, that public gathering and drinking until late at night was causing severe noise pollution) informed awareness-raising campaigns, such as the physical installation in the centre of the Plaça del Sol.
Participants were concerned about how environmental data was being collected and shared.
The pilot demonstrated the potential for community engagement to create policy-changing collective insights from data, whilst enhancing privacy by enabling individuals to have control over what they share and where it is used. Overall, the data collected outside participants' homes was more frequently shared than the data collected inside.
Having a sensor created conversations with housemates and friends, sparking discussions around privacy and the implications of data sharing, improving collective awareness about privacy. “An attitude of ‘my data are not really mine, they belong to the public’ emerged as ‘a shift from individual data ownership towards collective data ownership.’”
Source: https://media.nesta.org.uk/documents/DECODE_Common_Knowledge_Citizen_led_data_governance_for_better_cities_Jan_2020.pdf
Carol Lin
Jason Goodman