Social Listening

In the popAI project, the European Citizen Action Service (ECAS) is in charge of bringing in the citizens’ perspective: understanding how people perceive the AI tools employed by law enforcement agencies (LEAs). To achieve this goal, ECAS employs a multi-layered approach that combines proactive tools (the crowdsourcing platform, to which you can still contribute) with passive ones, such as social listening.

As mentioned, social listening is a way of monitoring web content for key themes and discursive trends. Marketing professionals use it for business purposes: drawing mainly on data from social media platforms, they can target future campaigns at users with very specific interests, or even at specific individuals.

In the context of popAI, ECAS is conducting social listening to gather and assess the diverse attitudes of citizens towards AI and policing. It should be noted that ECAS uses ethical social listening, which does not collect any data about individuals and is only interested in the content of the messages or conversations themselves. This avoids possible biases in the data and respects the privacy of the people who voiced the opinions.


To analyse data, of course, one must first get it from somewhere. For the purposes of popAI social listening, ECAS researchers use the CommonCrawl database for the years 2013-2021, which “crawls” and records the internet – blogs, publications, research papers, news articles and everything else except for social media.

They then cast a net of keywords that identify AI tools (e.g. biometric identification) and aspects of possible concern about them (e.g. privacy, accountability). Any opinion that is present in the CommonCrawl database and matches these keywords becomes part of the dataset and is filed accordingly.
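The keyword net described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the keyword lists, the function names and the rule that a text must match both a tool and a concern are hypothetical, since the article does not publish the actual popAI keywords or matching rules.

```python
import re

# Hypothetical keyword lists; the real popAI keywords are not given here.
TOOL_KEYWORDS = {
    "Biometric identifiers": ["biometric identification", "facial recognition"],
    "Predictive policing": ["predictive policing"],
}
CONCERN_KEYWORDS = {
    "Privacy": ["privacy"],
    "Discrimination": ["discrimination", "bias"],
}


def _matches(text, keywords):
    """Return True if any keyword occurs in text as a whole word or phrase."""
    return any(
        re.search(r"\b" + re.escape(kw) + r"\b", text, flags=re.IGNORECASE)
        for kw in keywords
    )


def file_document(text):
    """Return every (tool, concern) pair whose keywords both appear in text.

    Assumption: a text enters the dataset only if it mentions at least
    one tool and at least one concern.
    """
    tools = [t for t, kws in TOOL_KEYWORDS.items() if _matches(text, kws)]
    concerns = [c for c, kws in CONCERN_KEYWORDS.items() if _matches(text, kws)]
    return [(t, c) for t in tools for c in concerns]


print(file_document("Facial recognition raises privacy questions."))
```

A text that matches no tool or no concern returns an empty list and would simply be left out of the dataset.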

Up to this point, relevant results have been gathered, and their timestamps make it possible to track the volume of conversations over time. What is missing? The sentiment. How do we know whether an opinion is positive, negative or neutral, and how can we draw conclusions about the overall disposition of the public towards the topic of interest?

For this purpose, a language model was used. The model was trained on millions of texts to express sentiment as a number from -1 (the most negative) to 1 (the most positive), with decimal values in between.
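To make the numerical scale concrete, here is a minimal sketch of how a score between -1 and 1 could be turned into a label. The function name and the cut-off are assumptions: the band of -0.2 to +0.2 is borrowed from the range this article later describes as neutral for yearly averages, and the model itself may well use different thresholds.

```python
def label_sentiment(score, neutral_band=0.2):
    """Map a sentiment score in [-1, 1] to a coarse label.

    neutral_band is an assumed cut-off, not a documented popAI setting.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must lie between -1 and 1")
    if score < -neutral_band:
        return "negative"
    if score > neutral_band:
        return "positive"
    return "neutral"


for example in (-0.8, 0.05, 0.6):
    print(example, label_sentiment(example))
```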

Searched AI tools

  • Biometric identifiers
  • Cyber Operations
  • Decision making algorithms in the justice system
  • Police hacking
  • Predictive policing
For each of these tools, we wanted to find out the sentiment as it relates to possible citizen concerns.

Possible citizen concerns

  • Discrimination
  • Efficiency, Reliability, Accuracy
  • Legitimacy
  • Privacy
  • Transparency and Accountability


The social listening activity made it possible to gather a range of information about citizen perceptions and concerns related to the use of AI-based technologies in the security domain. The final report (available at the end of the project) will present all the results in a comprehensive fashion; this section reports the most significant and interesting results, arranged in four groups.

Topics generating most online conversation

When it comes to the topic generating the most conversation online, there is a clear winner: Biometric Identifiers, or biometrics.

Biometric Identifiers accounted for more than 50% of the total results, followed by police hacking and predictive policing. Cyber operations returned only 441 results, less than 1% of the total, which is too few to be statistically meaningful.

Biometric identifiers are unique and measurable characteristics that make it possible to label, identify and authenticate individuals. There are two types of biometrics:
  • Physiological biometrics employ physical, structural, and relatively static attributes of a person (e.g. fingerprints, pattern of their iris, contours of the face, or geometry of veins).
  • Behavioural biometrics establish identity by monitoring the distinctive characteristics of movements, gestures, and motor skills of individuals.

Topics generating positive sentiment

To follow the trends visually over time, a digital dashboard was created for each topic and its subtopics, showing their progression from 2013 to 2021. It charts both the number of conversations per year and the average sentiment for that year.
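The two yearly figures charted on the dashboard can be computed with a simple aggregation. The sketch below assumes each result has already been reduced to a (year, sentiment score) pair; the function name and the sample values are hypothetical and are not popAI data.

```python
from collections import defaultdict
from statistics import mean


def yearly_summary(records):
    """Return {year: (number_of_results, average_sentiment)}.

    records is an iterable of (year, score) pairs, with each score
    in the range -1 to 1.
    """
    by_year = defaultdict(list)
    for year, score in records:
        by_year[year].append(score)
    return {
        year: (len(scores), mean(scores))
        for year, scores in sorted(by_year.items())
    }


# Made-up sample values, for illustration only.
sample = [(2013, -0.5), (2013, 0.5), (2014, 0.25)]
for year, (count, average) in yearly_summary(sample).items():
    print(year, count, average)
```

Keeping the count next to the average matters when reading such a chart: an average based on a handful of results is far less reliable than one based on thousands.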

Although all topics generate a lot of neutral and negative sentiment, and much less positive sentiment, there is one subtopic where sentiment is evenly distributed, namely “Efficiency, Reliability and Accuracy”.

This shows that, while people are highly sensitive to the negative ways in which AI can affect their lives and privacy, they are also distinctly aware of the potential positive uses of such tools in police work, if wielded ethically.

The bars below show that the average sentiment score ranges between -0.2 and +0.2, which indicates neutrality.

Topics generating steadily growing interest

Another interesting trend can be found when viewing the number of conversations centered around the topics of “Predictive Policing” and “Decision making in the justice system”.

Although these are not the most popular in absolute terms, both topics show steadily growing interest.

“Predictive policing”, however, is being discussed in increasingly negative terms.

Unexpected results

A striking result emerges from the sentiment distribution for “Biometric identifiers”, subtopic “Privacy”. The research team expected this subtopic to be the most negatively discussed aspect under “Biometric Identifiers”.

While it did, indeed, generate a lot of results (more than 66,000), the vast majority of them were neutral (more than 52,000). One possible explanation is the adoption of biometric identification tools in many consumer goods, such as phones, which makes the technologies more acceptable to people when used in police work as well.

Still, there is notably negative sentiment when it comes to whether these tools lead to discrimination. Strongly negative opinions account for almost half of all results found, represented by the left-most column in the graph below.

The social listening activity gave the popAI project valuable insight into people’s feelings about the use of artificial intelligence by law enforcement agencies. With this knowledge, we can now move to the next phase: asking citizens for their ideas on how these concerns can be addressed. Based on the crowdsourcing questions and the data from the ethical social listening, we want to hear citizens’ solutions and recommendations in three key areas:

Biometric identification and privacy

How can police use biometric identification tools while maintaining citizen privacy?

Police hacking and

How can we make sure police hacking operations remain within the boundaries of the law and not gather data illegitimately?

Predictive policing and discrimination

How can we avoid discrimination and prejudice resulting from predictive policing algorithms?

This knowledge is a crucial first step towards the next phase: asking citizens for possible solutions and for their ideas on how these concerns can be addressed.