Abstract:
Decision Support Systems (DSS) are computerized information systems that help decision
makers with decision-making activities. A DSS is an interactive computer-based system that
uses data, documents, communication technologies, knowledge, and models to support the
decision-making process. In this thesis, we consider two top data mining algorithms in the
research community: k-means and k-nearest neighbor (kNN). Let $C = \{c_1, \ldots, c_m\}$ be a set
of pre-defined categories, $Co = \{d_1, \ldots, d_s\}$ an initial corpus of documents previously categorized
under the same set of categories, and $Tr = \{d_1, \ldots, d_g\}$ a training set, that is, the set
of example documents from whose observed characteristics the classifiers for the various
categories are induced. Let $D = \{d_1, \ldots, d_n\}$ be a set of documents to be categorized. The
problem is to assign a value from $\{0, 1\}$ to each entry $a_{ij}$, where $1 \le i \le m$ and $1 \le j \le n$, of the
decision matrix. A value of 1 for $a_{ij}$ is interpreted as a decision to file $d_j$ under $c_i$, while a
value of 0 is interpreted as a decision not to file $d_j$ under $c_i$. A test set $Te = \{d_{g+1}, \ldots, d_s\}$ will be
used to test the effectiveness of the induced classifiers. Each document in
$Te$ will be fed to the classifiers, and a measure of classification effectiveness will be based on
how often the values for the $a_{ij}$'s obtained by the classifiers match the values for the $ca_{ij}$'s
provided by the experts, where $ca_{ij}$ is an element of the correct decision matrix and $1 \le i \le m$,
$1 \le j \le s$. We implemented these two algorithms and performed a cross-validation test to
measure their accuracy. The results show that kNN is more accurate than k-means.
We then develop tracing for document-driven DSS to provide explanations that improve
acceptance by decision makers, because decisions are based on both the inheritance among
documents and the decision makers' acceptance of the advice given. To this end, we develop a tracing
of the contents and the classification of interrelated documents to improve the explanation.
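
As an illustration of the effectiveness measure described in this abstract, the following minimal Python sketch counts how often the entries of a classifier's decision matrix match those of the correct decision matrix provided by the experts. The function name match_rate and the toy matrices are hypothetical and are not taken from the thesis; the accuracy measure actually used in the cross-validation test may be defined differently.

```python
def match_rate(predicted, correct):
    """Fraction of decision-matrix entries on which classifier and experts agree.

    Both arguments are lists of rows: one row per category c_i and one
    column per document d_j, with every entry equal to 0 or 1.
    """
    if len(predicted) != len(correct):
        raise ValueError("matrices must have the same number of rows")
    matches = 0
    total = 0
    for pred_row, corr_row in zip(predicted, correct):
        if len(pred_row) != len(corr_row):
            raise ValueError("rows must have the same number of columns")
        for a_ij, ca_ij in zip(pred_row, corr_row):
            total += 1
            if a_ij == ca_ij:
                matches += 1
    if total == 0:
        raise ValueError("matrices must not be empty")
    return matches / total


# Toy example (hypothetical data): 2 categories, 3 test documents.
predicted = [[1, 0, 1],
             [0, 1, 0]]
correct = [[1, 0, 0],
           [0, 1, 0]]
rate = match_rate(predicted, correct)
print(rate)
```

In this toy example the two matrices disagree on a single entry out of six, so the measure rewards the classifier for both correct filing decisions (1) and correct non-filing decisions (0).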