Institutional Repository [SANDBOX]
Technical University of Crete

Performance of multivariate clustering methods in oil families' identification

Karavoulia Christina

URI: http://purl.tuc.gr/dl/dias/05583BB2-C7C4-4BD3-AC9F-F84509207436
Year: 2017
Type of Item: Master Thesis
Bibliographic Citation: Christina Karavoulia, "Performance of multivariate clustering methods in oil families' identification", Master Thesis, School of Mineral Resources Engineering, Technical University of Crete, Chania, Greece, 2017. https://doi.org/10.26233/heallink.tuc.68453

Summary

As science progresses, the need to analyze multivariate data sets keeps growing. Many disciplines, scientific and otherwise, must examine large amounts of data in a short period of time in order to extract useful information. Over the past few decades, multivariate statistical analysis methods have been developed to serve this purpose.

This dissertation applies multivariate data analysis methods to a data set of oil family affiliations from the Williston Basin of North America. In particular, Hierarchical Clustering, k-means and Principal Component Analysis were applied to four independent models in an attempt to extract information on the oil-oil correlations among the samples under study. The models used to explore the compositional information were the Saturated Fraction Compositional Model, the Saturated Fraction Ratios Model, the Gasoline Range Compositional Model and the Biomarkers Compositional Model.

These standard statistical methods proved insufficient for classifying the sample set into distinct familial affiliations. This made it necessary to examine the nature of the data set itself. Compositional data form a category of their own: they have specific numerical properties that carry significant consequences when the data are analyzed with standard multivariate techniques. The analysis of this type of data is a distinct branch of statistics, and the need for further investigation of the matter is constantly growing.
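One of the numerical properties of compositional data mentioned above can be illustrated with a short sketch. It is not taken from the thesis: the raw amounts are synthetic, and the helper names `close` and `covariance` are hypothetical. The sketch assumes the best-known property of compositions, the constant-sum constraint, and shows that once each sample is rescaled to sum to 1, every part is forced to covary negatively with at least one other part, even though the raw amounts were generated independently.

```python
import random


def close(row, total=1.0):
    """Rescale positive amounts so the parts of one sample sum to `total`."""
    s = sum(row)
    return [total * v / s for v in row]


def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


# Synthetic, illustrative data: 200 samples with three independent
# positive components each (not the Williston Basin measurements).
rng = random.Random(0)
raw = [[rng.uniform(1.0, 10.0) for _ in range(3)] for _ in range(200)]

# Closure: express every sample as proportions that sum to 1.
closed = [close(r) for r in raw]

# Covariance matrix of the closed parts.
columns = list(zip(*closed))
cov = [[covariance(a, b) for b in columns] for a in columns]

# Because the parts of each sample sum to a constant, every row of the
# covariance matrix sums to zero (up to rounding), so each part must
# have at least one negative covariance with another part.
row_sums = [sum(r) for r in cov]
print(row_sums)
```

Since the diagonal entries (the variances) are positive, the zero row sums leave no room for all off-diagonal covariances to be non-negative. This is one reason why distance-based and covariance-based methods such as k-means, hierarchical clustering and PCA may behave unexpectedly on raw compositions.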
