Designing example loans

Datasets are created as a set of example loans, each described by its risk factors and, eventually, by the subjective default risk assessment that users ascribe to it. Each loan in the dataset is referred to as a case. Risk assessments are produced by pairwise comparison of similar cases. To facilitate meaningful comparison of the potentially large number of cases in a dataset, cases are organised into groups.

Groups of cases are organised into a hierarchy where one of the cases in a parent group MUST be present in the set of cases in a child group. This requirement ensures that there is always a common point of reference across related groups of cases.
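The shared-case requirement can be illustrated with a minimal sketch. Datafactory's internal data model is not described here, so the CaseGroup class, its fields and the function name below are hypothetical and serve only to show the rule.

```python
from dataclasses import dataclass, field


@dataclass
class CaseGroup:
    # Hypothetical structure: a group name, the names of its cases,
    # and any sub-groups beneath it in the hierarchy.
    name: str
    case_ids: set
    children: list = field(default_factory=list)


def groups_missing_shared_case(group):
    """Return the names of sub-groups that share no case with their parent."""
    broken = []
    for child in group.children:
        # The rule: parent and child must have a case in common.
        if not (group.case_ids & child.case_ids):
            broken.append(child.name)
        broken.extend(groups_missing_shared_case(child))
    return broken


root = CaseGroup("risk range", {"best case", "worst case", "typical case"})
root.children.append(
    CaseGroup("leverage variations", {"typical case", "high leverage", "low leverage"})
)
root.children.append(CaseGroup("orphan", {"unrelated case"}))

print(groups_missing_shared_case(root))  # ['orphan']
```

The "orphan" group is reported because none of its cases appears in its parent group, so there is no common point of reference between the two.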

The hierarchy of case groups is presented in a tree that can be expanded or collapsed by clicking on the folder icons beside each group that has sub-groups. To see the summary information for a specific group and to modify its cases or sub-groups, click on the name of the chosen group.

In general, it is difficult to meaningfully compare two cases when they differ in regard to a large number of risk factors. For example, if two loans relate to customers that have different interest cover, leverage, liquidity, management quality, and that operate in different industries, then it will be frustrating for users to weigh up all of those differences together to arrive at a sensible difference in default risk.
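The difficulty can be made concrete by counting the risk factors on which two cases differ. The sketch below assumes that a case can be represented as a mapping from risk factor name to value. The factor names are taken from the example above, while the values and the function name are invented for illustration.

```python
def differing_factors(case_a, case_b):
    """Return the sorted names of risk factors whose values differ."""
    all_factors = case_a.keys() | case_b.keys()
    return sorted(f for f in all_factors if case_a.get(f) != case_b.get(f))


# Illustrative values only; they are not taken from any real dataset.
typical = {
    "interest cover": 3.0,
    "leverage": 0.5,
    "liquidity": 1.2,
    "management quality": "average",
    "industry": "retail",
}
one_change = dict(typical, leverage=0.8)
many_changes = dict(typical, leverage=0.8, liquidity=0.7, industry="mining")

print(differing_factors(typical, one_change))    # ['leverage']
print(differing_factors(typical, many_changes))  # ['industry', 'leverage', 'liquidity']
```

A comparison between the first pair asks the user to judge the effect of leverage alone. The second pair asks for a judgement on three differences at once, which is the situation that is hard to assess.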

To avoid this problem, Datafactory provides a guided case design process for creating new groups of cases in which every comparison involves cases that differ in the value of a single risk factor. Three types of case groups are typically created for a dataset. These are:

Risk range groups

Each dataset has exactly one risk range group. It is the top level group, having no parent groups. It is created automatically when a new dataset is defined. It contains three cases, named "best case", "worst case" and "typical case". These three cases need to have values specified for their risk factors. This should be done prior to creating any sub-groups of the top level risk range group.

When specifying the risk factor values for the best case and the worst case, it is important to choose values that are actually possible to observe in a client. These cases provide an upper and a lower bound for the default probability assessments that will be generated for all of the other cases in the dataset.

The typical case will be used as the reference case for many other case groups. Put another way, many of the sub-groups of the risk range group will contain cases that are identical to the typical case except in regard to a single risk factor. These are variation groups. The quality of the risk assessments that users can make depends on how recognisable the reference case is. If the combination of risk factor values for that case is understood by users, the risk assessments made for that case and for variations on it will be of higher quality. For this reason, the risk factor values for the typical case should be chosen carefully, drawing on whatever hard data is available to ensure that the typical case really is representative of a common type of loan. If the dataset is intended to cover a variety of different "typical" loans, then evolution groups can be used to generate new reference cases.

Once the three cases in the risk range group have been fully specified, the next step in case design is to create the variation groups for the reference case.

Variation groups

Complete analysis of a reference case, such as the "typical" case in the risk range group, requires the creation of one "variation group" for each of the risk factors defined in the dataset. Each variation group is a sub-group of the parent group and includes the reference case from the parent group. It must also contain a range of other cases that differ from the reference case only in the single risk factor that is being varied. In the summary screen for each case group, the "Add a sub-group to explore variations in a risk factor" link opens a wizard that takes users through the steps necessary to create a variation group and each of the variation cases in that group.
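The content of a variation group can be sketched as follows. This is not the wizard's implementation, which is not documented here. The function name, the representation of a case as a mapping, and the example values are all assumptions made for illustration.

```python
def make_variation_group(reference, factor, alternative_values):
    """Build the cases of a variation group for one risk factor.

    The reference case comes first, followed by one copy of it per
    alternative value, changed only in the chosen factor.
    """
    if factor not in reference:
        raise KeyError(f"unknown risk factor: {factor}")
    cases = [dict(reference)]
    for value in alternative_values:
        if value == reference[factor]:
            continue  # would duplicate the reference case
        variant = dict(reference)
        variant[factor] = value
        cases.append(variant)
    return cases


# Illustrative reference case; the values are invented.
typical = {
    "interest cover": 3.0,
    "leverage": 0.5,
    "liquidity": 1.2,
    "management quality": "average",
    "industry": "retail",
}

leverage_group = make_variation_group(typical, "leverage", [0.2, 0.5, 0.8])
for case in leverage_group:
    print(case["leverage"], case["industry"])
```

Analysing the reference case completely would mean building one such group for each of the five risk factors in this illustrative case.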

Evolution groups

In most datasets it will not be sufficient to analyse a single reference case. If the model to be generated from the dataset is to cover a broad range of client situations then several reference cases may need to be analysed using variation groups. These additional reference cases are created using the wizard that is available from the "Add sub-group that creates a new reference case" link on the group summary page.

The wizard takes users through the steps necessary to identify the case to be evolved into the new reference case, create the new evolution group containing that case, and generate the series of cases in the evolution group that vary one risk factor at a time from the original case to the new reference case. At the end of the wizard, the user will have created a new group containing a series of cases in which each risk comparison need only involve cases that differ by at most one risk factor.
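The series of cases in an evolution group can be sketched as a chain in which each case differs from the previous one in exactly one risk factor. The sketch below is an assumption-laden illustration, not the wizard's actual logic: the function name and the example values are hypothetical, and the order in which factors are changed here simply follows the order of the factors in the original case.

```python
def evolution_cases(original, new_reference):
    """Return cases leading from original to new_reference, one factor at a time."""
    if original.keys() != new_reference.keys():
        raise ValueError("both cases must define the same risk factors")
    cases = [dict(original)]
    current = dict(original)
    for factor in original:
        if current[factor] != new_reference[factor]:
            # Copy the previous case and change a single factor.
            current = dict(current)
            current[factor] = new_reference[factor]
            cases.append(current)
    return cases


# Illustrative values only.
typical = {
    "interest cover": 3.0,
    "leverage": 0.5,
    "liquidity": 1.2,
    "management quality": "average",
    "industry": "retail",
}
mining_reference = {
    "interest cover": 1.5,
    "leverage": 0.8,
    "liquidity": 1.2,
    "management quality": "average",
    "industry": "mining",
}

chain = evolution_cases(typical, mining_reference)
for case in chain:
    print(case)
```

Three factors differ between the two cases in this example, so the chain holds four cases: the original, two intermediate cases, and the new reference case.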

Once the new reference case has been created in the evolution group, it is necessary to create a variation group for each risk factor, analysing the impact of risk factor variations on default risk for the new reference case.