The objective of the AI Oversight Lab is to enable public entities to develop and use AI tools while adhering to ethical principles and respecting human rights, and to support oversight bodies in their auditing role.

Problem Context

AI and machine-learning-based tools are increasingly available and are used by both public and private bodies, influencing decisions and processes in many key sectors of society. It is therefore becoming important and urgent to develop policies and practices for assessing and evaluating AI systems against the principles of Trustworthy AI, so that developers, public entities, and citizens can answer questions about a tool's fairness, security, compliance with privacy rules, and social and ethical impact.


AI Oversight Lab Structure

To tackle such a complex and multidisciplinary problem, the project is organized in three main lab rooms that work in close collaboration. Lab Room 1 focuses on the regulatory and policy framework needed for effective auditing and oversight of AI use in the public sector. Lab Room 2 is dedicated to identifying development best practices and tools for complying with the principles of Trustworthy AI. Lab Room 3 aims to research and develop measures and tools to evaluate and audit AI systems. Alongside these three lines of research, the project is working to create a community of practice that different public and commercial parties (e.g. municipalities or oversight authorities) can join. Concrete problems and use cases are proposed, and all stakeholders in the community can benefit from the knowledge and solutions developed for specific scenarios.


In the first year, the research in the three lab rooms was developed alongside a concrete use case offered by the municipality of Nissewaard: TNO conducted an independent evaluation of a commercial ML tool used by the municipality to detect possible abuse of social benefits. The evaluation showed that, although the developers had clearly put effort into satisfying the principles of Trustworthy AI, the ML algorithm used in the tool was still not robust enough to comply with the European guidelines. Based on this, the municipality of Nissewaard decided to stop using the tool.

Guided by this specific use case, the research of Lab 2 has focused mainly on identifying guidelines for developing robust and reproducible AI, which are presented in a handbook. Both Lab 1 and Lab 3 have conducted an extensive review of existing literature, frameworks and tools for evaluating and auditing AI, identifying the current state of the art as well as future options and challenges.

Finally, a first prototype of a visual and interactive tool for (internal) evaluation of AI systems is proposed. The prototype is based on the literature and on the experience of different stakeholders. A list of yes/no questions guides the auditor through the analysis of the AI system and helps to spot possible problems; where available, information about quantitative measures and tools is provided. At the end of the evaluation, a report is generated that includes the auditor's answers, notes and comments, a visual representation of the auditing flow, and highlights of possible weaknesses of the system. At this stage of development, only questions about “Diversity, non-discrimination and fairness” are implemented. For more information about the prototype, contact Ioannis Tolios or Lucia Tealdi.
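To make the idea of a question-guided evaluation more concrete, the flow described above could be sketched roughly as follows. This is only an illustrative sketch, not the prototype's actual implementation: the class names, the function names `run_audit` and `render`, the three example questions and the branching between them are all hypothetical and were chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Question:
    """One yes/no audit question (hypothetical structure)."""
    key: str
    text: str
    risky_answer: bool                 # the answer that signals a possible weakness
    next_if_yes: Optional[str] = None  # key of the follow-up question, if any
    next_if_no: Optional[str] = None


@dataclass
class AuditReport:
    answers: list = field(default_factory=list)     # (key, answer, note), in the order asked
    weaknesses: list = field(default_factory=list)  # texts of questions answered riskily


def run_audit(questions, answers, notes=None, start="protected"):
    """Walk the question flow, recording answers and flagging weaknesses."""
    notes = notes or {}
    report = AuditReport()
    key = start
    while key is not None:
        q = questions[key]
        answer = answers[key]
        report.answers.append((key, answer, notes.get(key, "")))
        if answer == q.risky_answer:
            report.weaknesses.append(q.text)
        key = q.next_if_yes if answer else q.next_if_no
    return report


def render(report):
    """Produce a plain-text report: the audit flow plus flagged weaknesses."""
    lines = ["Audit flow:"]
    for key, answer, note in report.answers:
        line = f"  {key}: {'yes' if answer else 'no'}"
        if note:
            line += f" ({note})"
        lines.append(line)
    lines.append("Possible weaknesses:")
    lines += [f"  - {w}" for w in report.weaknesses] or ["  none flagged"]
    return "\n".join(lines)


# Hypothetical fairness questions; the real prototype's questions are not listed here.
QUESTIONS = {
    "protected": Question("protected",
                          "Does the system use protected attributes or proxies as input?",
                          risky_answer=True,
                          next_if_yes="measured", next_if_no="documented"),
    "measured": Question("measured",
                         "Has a quantitative fairness measure been computed across groups?",
                         risky_answer=False,
                         next_if_yes="documented", next_if_no="documented"),
    "documented": Question("documented",
                           "Is the selection of training data documented?",
                           risky_answer=False),
}

report = run_audit(
    QUESTIONS,
    answers={"protected": True, "measured": False, "documented": True},
    notes={"measured": "no group-level metrics available"},
)
print(render(report))
```

In this sketch each question names the answer that counts as risky, so the same walk through the questions produces both the record of the auditing flow and the list of possible weaknesses for the final report.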



  • Jok Tang, Sr Consultant, TNO
  • Joachim de Greeff, AI consultant, TNO