The Horizon Scanner gathers information from online sources and identifies and visualizes potential trends.

Problem Context

The number of cyberattacks on organizations is growing. To increase cyber resilience, organizations need to obtain foresight to anticipate cybersecurity vulnerabilities, developments, and potential threats.


Text mining and information retrieval techniques can help experts explore relevant data sources for potential threats and trends, and can speed up the foresight process. In this project, we propose a tool, the Horizon Scanner, which uses natural language processing techniques to scrape and store data from websites, blogs and PDF articles. Additionally, the tool can search a database based on a user query, show textual entities in a graph, and identify and visualize potential trends.
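
To make the idea of surfacing trending terms concrete, here is a minimal sketch in Python. It is not the Horizon Scanner's actual implementation: the function names, the stop-word list, the monthly grouping and the TF-IDF-style score are all assumptions chosen for illustration. It groups dated texts by month and ranks the terms in each month, giving more weight to terms that appear in fewer months.

```python
import math
import re
from collections import Counter, defaultdict
from datetime import date

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "is", "are", "for", "with"}


def tokenize(text):
    """Lowercase the text and keep (optionally hyphenated) words that are not stop words."""
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def top_terms_per_period(documents, k=3):
    """Rank terms per month for an iterable of (date, text) pairs.

    Score = term count in the month * smoothed inverse period frequency,
    so terms concentrated in few months rank higher than ever-present ones.
    Returns {"YYYY-MM": [(term, score), ...]} with at most k terms per month.
    """
    counts = defaultdict(Counter)
    for day, text in documents:
        period = f"{day.year}-{day.month:02d}"
        counts[period].update(tokenize(text))

    n_periods = len(counts)
    # Number of periods in which each term occurs at least once.
    period_freq = Counter(term for c in counts.values() for term in c)

    result = {}
    for period, c in counts.items():
        scored = [
            (term, tf * (math.log((1 + n_periods) / (1 + period_freq[term])) + 1))
            for term, tf in c.items()
        ]
        # Highest score first; ties broken alphabetically for stable output.
        scored.sort(key=lambda item: (-item[1], item[0]))
        result[period] = scored[:k]
    return result


# Made-up sample texts, for illustration only.
sample_docs = [
    (date(2023, 1, 5), "Ransomware attack hits hospital. Ransomware spreads."),
    (date(2023, 1, 20), "Phishing campaign and ransomware attack."),
    (date(2023, 2, 3), "Phishing campaign targets banks. Zero-day exploit found."),
    (date(2023, 2, 17), "Zero-day exploit used in attack."),
]

for period, terms in sorted(top_terms_per_period(sample_docs).items()):
    print(period, [term for term, _ in terms])
```

In this sketch a term such as "attack", which occurs in every month, is weighted down, while a term that is frequent in only one month rises to the top. That is one simple way to approximate a weak signal; a production system would likely need better tokenization, entity recognition and a larger corpus.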


The Horizon Scanner was validated by the Dutch Defense Cyber Command for its use in the cybersecurity domain. Through an initial requirements session and a user evaluation, we explored the potential of the tool to help analysts explore the web for trending cybersecurity topics. The proof of concept was not optimized for speed, and speed was not an evaluation criterion. Even so, the results were affected by the fact that the current version of the tool cannot compete with commercial search engines in speed or in the amount of data available. The tool was perceived as useful for extracting the most important terms in a given time period, an important functionality because it helps identify weak signals of threats or vulnerabilities. Future iterations of our tool will emphasize human-centric aspects, such as workflow support and greater explainability and controllability of the analyses.

A benefit of the Horizon Scanner is that it can be adapted to scrape other sources and thereby be used in other domains. Interested? Please let us know!


