Texts from various sources, such as social media, online chats or trading platforms, are relevant for police work. The texts examined (posts, comments, chats, advertisements, articles and internal reports) differ greatly in many respects. Several data sets depicting police scenarios will therefore be created. These data sets will serve as a basis for training explainable models and, at the same time, as a reference for comparing the result quality and performance of baseline models with those of the explainable models.
Since AI models learn to solve specific tasks from training data, the quality of this data is crucial for the quality of the model's results. If, for example, the data predominantly contains people of one origin, age group or gender, the model acquires a bias that may put certain groups of people at a disadvantage. Balanced training data is necessary to prevent this. Where such data is not available, debiasing methods can be applied; these are being investigated, developed and evaluated in this project.
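One simple debiasing strategy at the data level is to rebalance the training set so that every value of a sensitive attribute occurs equally often. The sketch below illustrates this by oversampling underrepresented groups; the attribute name `gender`, the data layout and the values are illustrative assumptions, not details from the project.

```python
import random
from collections import Counter, defaultdict

def rebalance_by_attribute(samples, attribute, seed=0):
    """Oversample so that every value of a sensitive attribute
    (e.g. gender or age group) is equally represented.
    A minimal sketch; real debiasing pipelines are more involved."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for s in samples:
        groups[s[attribute]].append(s)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Duplicate random members of underrepresented groups.
        balanced.extend(rng.choices(members, k=target - len(members)))
    rng.shuffle(balanced)
    return balanced

# Hypothetical, deliberately skewed training data: 2 "f" vs. 6 "m" samples.
data = [{"text": "...", "gender": g} for g in ["f"] * 2 + ["m"] * 6]
balanced = rebalance_by_attribute(data, "gender")
print(Counter(s["gender"] for s in balanced))  # both groups now equally frequent
```

Oversampling keeps all original samples and only duplicates; alternatives such as undersampling the majority group or reweighting the loss function trade data volume against duplication.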
Furthermore, solutions are being developed for explaining the decisions made by AI models. Relevant text elements such as persons, times, places and objects can be marked automatically, and it is possible to visualize how strongly individual words influence a model's decision. Beyond explaining models and algorithms, explanatory comments can also be generated for an assessment, which can be used, for example, in investigations into incitement to hatred.
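Word-level influence can be estimated with a model-agnostic occlusion test: remove one word at a time and measure how much the classifier's score drops. The sketch below shows the idea; the keyword-based `toy_score` function merely stands in for a trained text classifier and is an assumption for illustration, as is the trigger-word list.

```python
def occlusion_saliency(tokens, score_fn):
    """Leave-one-out saliency: score drop caused by removing each word.
    score_fn is any function mapping a token list to a score; here it
    stands in for a trained model (illustrative assumption)."""
    base = score_fn(tokens)
    return {i: base - score_fn(tokens[:i] + tokens[i + 1:])
            for i in range(len(tokens))}

# Toy stand-in for a trained classifier: fraction of trigger words.
TRIGGER_WORDS = {"hate"}
def toy_score(tokens):
    return sum(t in TRIGGER_WORDS for t in tokens) / max(len(tokens), 1)

tokens = "i hate this group".split()
saliency = occlusion_saliency(tokens, toy_score)
# The word whose removal lowers the score most influenced the decision most.
top = max(saliency, key=saliency.get)
print(tokens[top])  # → "hate"
```

In a visualization, the per-word saliency values could be rendered as highlighting intensities over the original text, giving investigators a direct view of which passages drove the model's decision.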
The methods, models and algorithms developed in this sub-project will ultimately be integrated into a demonstrator developed jointly by several parties in the overall network. Police users (including the Munich Police Headquarters, the State Criminal Police Offices of Baden-Württemberg and North Rhine-Westphalia, and the Federal Criminal Police Office) can then experiment with the models and explanations so that the methods and representations can be optimized before the application is deployed.
Transferability testing and standardization activities take place throughout the project. The solutions developed for explainability, debiasing and model robustness are shared with the other project partners from the fields of face recognition, speaker recognition and object detection in order to identify and exploit potential for transfer. Standardization activities are carried out in cooperation with the German Institute for Standardization (DIN) to identify standardization needs in these areas and to pursue them at European and international level.