DC13: Predicting financial trends using text mining and NLP

Work Package	WP1
Host Institution	🇷🇴 BBU — UNIVERSITATEA BABES BOLYAI
PhD Enrolment	BBU
Recruiting Participant	BBU
Duration	M9–M45 (36 months)

Objectives

The primary objective of this DC is to improve the use of AI-based natural language processing (NLP) solutions for predicting credit risk and fraudulent fiscal behaviour from speech text in audit reports, social media, and other sources. Two further objectives follow from this. The first is to predict noncompliance from the free-text responses in which survey respondents describe their perceptions. The second is to construct attitudinal indices from free text and incorporate them into behavioural models, alongside other qualitative or quantitative factors, in order to predict the likelihood of system fraud or the level of risk associated with accreditation.

Expected Results

Large databases combining qualitative and quantitative data, for use in developing AI algorithms for both public entities (prediction of tax fraud) and private entities (banks, FinTechs offering credit services, etc.). An evaluation, using text mining and NLP, of the viability of various models that could predict the risk of fraudulent behaviour in the financial sector. Use of these models in both the public sector (public policy formulation) and the private sector (helping banks and FinTechs with credit scoring).

Secondments (2)

Institution	Supervisor	Start Month	Duration (months)	Activities
RAI (Raiffeisen)	Dr. Stefan Theußl	M15	18	Research exposure in a global business environment, trend modelling
ECB	Dr. Lukasz Kubicki	M33	12	Exposure to globally leading central bank, research training on EU principles, supervision

Recruitment & Hosting Details

Fellow	Host institution	PhD enrolment	Start date	Duration	Deliverables
DC 13	BBU (Babes-Bolyai University)	BBU	M9	36 months	D1.1, D1.2

Deliverables

Code	Name	WP	Due
D1.1	Status report on the financial data space	WP1	M24
D1.2	Final industry prototype for data quality tools	WP1	M48