Project Details
Fair Scoring: A legal framework for algorithmic credit scoring
Applicant
Professor Dr. Katja Langenbucher
Subject Area
Private Law
Term
since 2021
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 467157950
Credit scores have long assisted lenders in deciding on a potential borrower's creditworthiness. Banks have historically based their loan decisions on an applicant's credit history. Instead of estimating the probability of repayment on the basis of free income (traditional lenders) or past credit history (traditional scoring agencies), modern FinTech companies focus mainly on future income when assessing the probability of repayment. To that end, their novel scoring systems are fed by big data and based on machine learning technology, extending the "input" far beyond the approaches traditionally followed by lenders or scoring agencies (including, for instance, GPS and cell phone data, social network affiliations, taste in music, personal characteristics, race, or religion). This form of A.I. credit scoring has been hailed as inclusive, extending credit to groups of borrowers who have long been denied credit opportunities because their profile did not match the factors accounted for by traditional scoring models. At the same time, the global availability of what has been called "alternative data" and the considerable lack of transparency regarding how such data is used in credit scoring models have raised major concerns.

Data privacy is the first issue. Alternative data provides correlations between creditworthiness and characteristics not usually thought of as relevant in a financial context, raising the question of the extent to which "consent" under the GDPR covers scoring methodology.

The second issue is discrimination. The combination of innumerable data points and complex correlations produced by machine learning may lead to discriminatory results. This is clearly the case if the model is trained on protected variables such as race or gender. More realistically, the algorithm will use thousands of variables, and among those we will find protected ones. Even more complex issues of disparate impact doctrine arise if the input variables have been carefully screened to exclude protected categories, yet the algorithm relies on variables which are not themselves "suspicious" (e.g., taste in music) but which correlate significantly with a protected variable (e.g., gender); a schematic illustration of this proxy effect follows below. German scholarship has so far focused on traditional scoring methodology, working mostly on data protection. A much livelier debate on data protection as well as on the risks of discrimination related to scoring can be found in the US. China, with its "social scoring", has extended machine learning far beyond the credit context.

The project's goal is to establish the legal framework under German and European data protection and anti-discrimination law on the basis of a comparison with five countries. Against that background, the corporate law duties of regulated financial institutions as well as the regulatory framework of banking supervisory law will be spelled out.
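To make the proxy mechanism concrete, the following is a minimal, hypothetical sketch (all data synthetic, not drawn from the project or any real scoring model): a logistic regression credit model is trained without any protected attribute, yet a facially neutral feature that correlates with the protected attribute still produces disparate approval rates across groups.

# Hypothetical sketch of proxy discrimination on synthetic data.
# The model never sees the protected attribute, but a "neutral"
# feature correlated with it carries the information in anyway.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# Protected attribute (e.g. gender), excluded from the model's inputs.
protected = rng.integers(0, 2, size=n)

# A facially neutral feature (think "taste in music") that happens
# to correlate with the protected attribute.
proxy = protected + rng.normal(0, 0.5, size=n)

# A genuinely informative feature, independent of the protected attribute.
income = rng.normal(0, 1, size=n)

# Simulated repayment outcome: driven by income, but historically also
# associated with group membership (e.g. via unequal access to credit).
logit = 1.5 * income - 1.0 * protected
repaid = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# The scoring model is trained WITHOUT the protected attribute...
X = np.column_stack([income, proxy])
model = LogisticRegression().fit(X, repaid)

# ...yet approval rates differ sharply between groups, because the
# proxy feature transports the protected information into the model.
approved = model.predict_proba(X)[:, 1] > 0.5
for g in (0, 1):
    rate = approved[protected == g].mean()
    print(f"group {g}: approval rate {rate:.2f}")

The sketch is purely illustrative: screening the input variables for protected categories does not, by itself, prevent disparate impact, which is precisely the doctrinal difficulty the project addresses.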
DFG Programme
Research Grants