New technology has been developed that may be able to detect violent behaviour in large crowds of people, enabling law enforcement to potentially stop trouble before it gets out of hand. It could also help to prevent a number of other crimes and disturbances, including kidnappings in public places, vandalism and illegal border crossings.

The technology was developed at the University of Cambridge, in conjunction with the National Institute of Technology in India and the Indian Institute of Science. Explaining the reasoning behind using AI to spot violent behaviour in groups or crowds of people, the researchers stated in their report:

“Law enforcement agencies have been motivated to use aerial surveillance systems to surveil large areas.”

“Governments have recently deployed drones in war zones to monitor hostiles, to spy on foreign drug cartels, to conduct border control operations, and to find criminal activity in urban and rural areas.”

“One or more soldiers pilot most of these drones for long durations, which makes these systems prone to mistakes due to human fatigue.”

How does it work?

The technology uses artificial intelligence linked to sophisticated surveillance cameras. A camera mounted on a drone known as a quadcopter hovers over the scene and studies the body movements of the crowd as a whole and of individuals. It is then able to detect actions that are deemed ‘aggressive’, such as:

  • Kicking and punching
  • Strangling
  • Shooting
  • Stabbing

The system can then raise the alarm with law enforcement officials, who can investigate further.

The system is believed to be around 85% accurate, which is a staggering figure when you consider the complexity of identifying individual ‘aggressive’ actions in a crowd that could contain hundreds or thousands of people.

Now for the techy bit…

The AI surveillance technology, known by some as the Eye in the Sky, works by first detecting individual people in the camera’s images. It does this using a feature pyramid network, a type of convolutional neural network. Once individuals have been identified, the AI uses a scatternet (a convolution-based network whose filters are fixed wavelets rather than learned) combined with a regression network to estimate the pose of each person in the image.
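To make the two-stage pipeline concrete, here is a minimal Python sketch of the flow described above: a detector finds each person, then a pose network estimates their body keypoints. All class and function names are illustrative assumptions, not the researchers’ actual code, and the stubbed methods stand in for trained networks.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]          # x, y, width, height of a detection
Keypoints = List[Tuple[float, float]]    # 14 (x, y) body points per person

@dataclass
class Person:
    box: Box
    keypoints: Keypoints

class FeaturePyramidDetector:
    """Stands in for the feature pyramid network (person detection)."""
    def detect(self, frame) -> List[Box]:
        # Placeholder: a real FPN would return one box per person found.
        return [(10, 20, 40, 90)]

class ScatterNetPoseEstimator:
    """Stands in for the scatternet + regression pose network."""
    def estimate(self, frame, box: Box) -> Keypoints:
        # Placeholder: a real network would regress 14 (x, y) keypoints.
        return [(0.0, 0.0)] * 14

def analyse_frame(frame, detector, pose_net) -> List[Person]:
    """Stage 1: find every person; stage 2: estimate each one's pose."""
    people = []
    for box in detector.detect(frame):
        people.append(Person(box, pose_net.estimate(frame, box)))
    return people
```

The key design point is the hand-off: the pose network only ever sees one detected person at a time, which is why missed detections (discussed later in the article) directly lower the overall accuracy.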

If you’re still following this, it’s now time to look at how the AI identifies which poses are deemed ‘aggressive’. The system breaks the outline of the body down into 14 key points, which help it to work out where the limbs and face are in relation to each other, and therefore to recognise a pose or action that could be seen as threatening or violent.
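A toy example shows the idea of turning relative limb positions into an “aggressive pose” flag. The real system learns this mapping from data; the keypoint names, the single hand-written rule and the threshold below are all assumptions made purely for illustration.

```python
import math

# An assumed ordering for the 14 body keypoints (illustrative only).
KEYPOINT_NAMES = [
    "head", "neck",
    "r_shoulder", "r_elbow", "r_wrist",
    "l_shoulder", "l_elbow", "l_wrist",
    "r_hip", "r_knee", "r_ankle",
    "l_hip", "l_knee", "l_ankle",
]

def arm_extension(points, side):
    """Shoulder-to-wrist distance relative to shoulder-to-elbow distance."""
    shoulder = points[f"{side}_shoulder"]
    elbow = points[f"{side}_elbow"]
    wrist = points[f"{side}_wrist"]
    upper = math.dist(shoulder, elbow)
    reach = math.dist(shoulder, wrist)
    return reach / upper if upper else 0.0

def looks_like_punch(keypoints):
    """Toy rule: flag a pose when either arm is almost fully extended."""
    points = dict(zip(KEYPOINT_NAMES, keypoints))
    return any(arm_extension(points, side) > 1.8 for side in ("r", "l"))
```

A learned classifier replaces this one rule with many such geometric relationships at once, which is also why it can keep improving as it sees more examples.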

Crucially, because it is artificial intelligence, the system can learn about violent and aggressive poses as it identifies them, which helps to reduce the likelihood of mistakes. It can also do all of this in real time, as the images from the drone cameras are taken, stored and processed in the cloud.
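The real-time loop can be sketched as: take a frame, analyse it in the cloud, raise the alarm if needed, then discard the frame (the developers say each frame is deleted after processing, as noted later in the article). Function names here are illustrative assumptions, not the real system’s API.

```python
def process_stream(frames, analyse, raise_alarm):
    """Process drone frames one at a time, keeping only the results."""
    alerts = 0
    for frame in frames:
        result = analyse(frame)    # neural network runs on this frame
        if result.get("violent"):
            raise_alarm(result)    # notify law enforcement officials
            alerts += 1
        del frame                  # drop the reference; a real system
                                   # would delete the stored frame here
    return alerts
```

The design choice worth noting is that only the analysis result (not the raw footage) needs to persist, which is what limits the privacy exposure if the cloud store is breached.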

There’s still a long way to go until 100% accuracy

While at present the Eye in the Sky technology has been able to produce an 85% accuracy rate, there are a few crucial points to remember:

  1. It has only been tested in artificial settings, on volunteers who were pretending to attack each other and were likely using exaggerated poses. It has not yet been rolled out to real crowds in public places.
  2. It has so far only proved effective on smaller groups of people, not large crowds. For example, the accuracy was 94.1% when just one person was captured on camera. This fell to 84% for five people and dropped again to 79.8% when ten people were in the image. According to the inventors of the technology, this drop in accuracy occurred because the system was not recognising some of the people in the image – a fairly major flaw at this stage.
  3. The images that were fed into the Eye in the Sky technology were taken from drones that were a maximum of eight metres away. This is not only very indiscreet (quadcopter drones can be quite loud), but it would also not be feasible to capture a larger crowd from a distance of just eight metres: you simply wouldn’t be able to capture a large enough sample.

Are there ethical concerns with this technology?

With the lack of accuracy reported at this stage, some people have understandable concerns about how ethical it would be for law enforcement and government agencies to start using this kind of technology. These concerns echo worries about facial recognition technology, which is not only inaccurate on occasion but may also display gender or racial bias. Systems such as these can often suffer from high rates of false positives, which means that the wrong people or the wrong behaviours can be flagged up.

Amarjot Singh, a PhD student at the University of Cambridge and one of the authors of the report on the new system, expressed his own concerns about the potential applications of the technology in the future. He said:

“The system [could potentially] be used to identify and track individuals who the government thinks is violent but in reality might not be.”

“The designer of the [final] system decides what is ‘violent’ which is one concern I can think of.”

He also said:

“One such application of AI is in surveillance systems! AI can help develop powerful surveillance systems which can assist in identifying pernicious individuals which will make society a safer place. Therefore I think it is a good thing and is also necessary.

“That being said, I also think that AI is extremely powerful and should be regulated for specific applications like defence, similar to nuclear technology.”

There is another cause for concern, too, relating to the images that are stored and processed in the cloud in real time. Were they to be accessed by the wrong person or organisation, this could represent a major security breach, not to mention a legal headache for the operator of the Eye in the Sky system. To address this concern, the developers have confirmed that the neural network automatically deletes each frame sent by the drone surveillance camera after processing it.