What is a dataset?

Datafactory produces datasets. In Datafactory terminology, a dataset is a collection of information about a set of loans that share similar customer and transaction-type characteristics. For example, one dataset may describe loans to small and medium-sized institutions, another may describe agribusiness loans, and yet another may describe a portfolio of loans to pubs and clubs.

Datasets are used to create models of default risk. Thus, they need to include information for each loan on:

Before creating a dataset, it is necessary to determine and precisely describe the set of events that constitute default. Otherwise, the risk of default is not well defined. Similarly, it is necessary to determine the set of risk factors that are expected to be predictive of default. Once the risk factors have been identified and precisely defined, each loan in the dataset must be described in terms of its value for each risk factor. Only when all of the risk factor information for the loans in the dataset has been specified is it possible to work through the process of subjectively estimating a probability of default for each loan.
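The ordering described above can be illustrated with a minimal Python sketch. It is not Datafactory's actual data model: the class names, method names, risk factor names (years_trading, gearing) and default events shown here are hypothetical, chosen only to show that a probability of default is accepted only once the default events and risk factors are defined and every loan has a value for every risk factor.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Loan:
    """One loan, described by its values for the dataset's risk factors."""
    loan_id: str
    # Maps risk factor name -> this loan's value (hypothetical layout).
    risk_factor_values: Dict[str, float] = field(default_factory=dict)
    # Subjectively estimated probability of default; unset until assigned.
    probability_of_default: Optional[float] = None


@dataclass
class Dataset:
    """A set of similar loans plus the definitions needed to model default."""
    name: str
    default_events: List[str]   # events that constitute default
    risk_factors: List[str]     # names of the risk factors
    loans: List[Loan] = field(default_factory=list)

    def missing_risk_factors(self, loan: Loan) -> List[str]:
        """Return the risk factors for which this loan has no value yet."""
        return [rf for rf in self.risk_factors
                if rf not in loan.risk_factor_values]

    def is_fully_specified(self) -> bool:
        """True when default events and risk factors are defined and
        every loan has a value for every risk factor."""
        return (bool(self.default_events)
                and bool(self.risk_factors)
                and all(not self.missing_risk_factors(loan)
                        for loan in self.loans))

    def set_probability_of_default(self, loan_id: str, pd: float) -> None:
        """Record a subjective estimate, but only for a complete dataset."""
        if not self.is_fully_specified():
            raise ValueError("risk factor information is incomplete")
        if not 0.0 <= pd <= 1.0:
            raise ValueError("a probability must lie between 0 and 1")
        for loan in self.loans:
            if loan.loan_id == loan_id:
                loan.probability_of_default = pd
                return
        raise KeyError(loan_id)


# Hypothetical example: a small portfolio of loans to pubs and clubs.
dataset = Dataset(
    name="Pubs and clubs",
    default_events=["payment 90 days past due", "loan written off"],
    risk_factors=["years_trading", "gearing"],
    loans=[
        Loan("L-001", {"years_trading": 12, "gearing": 0.4}),
        Loan("L-002", {"years_trading": 3}),  # gearing not yet specified
    ],
)

# Lists the risk factors still to be specified for the second loan.
print(dataset.missing_risk_factors(dataset.loans[1]))
```

In this sketch, calling set_probability_of_default while any loan is missing a risk factor value raises an error, which mirrors the requirement that the risk factor information be complete before probabilities of default are estimated.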
