Content Summary | Despite continuous and exponential technological evolution, companies in the service and retail industry often fail to recognize patterns in their datasets and are therefore unable to make predictions about business issues. This leaves them even more susceptible to uncertainty and risk, as they cannot focus on the key variables that influence their companies' attributes.
This thesis highlights three case studies of companies operating in the service and retail industry. Specifically, it focuses on a hotel company based in Chania, Crete, a multinational insurance company and a large Greek supermarket chain, all of which seek to use data analysis and data science to extract the necessary information and make the predictions needed to take effective decisions and improve their business performance.
This can be achieved with the methods of classification, clustering and association rule mining, using the machine learning software WEKA. By applying its algorithms to the data provided by the companies, it is possible to make predictions, check their accuracy, find patterns of interrelated sales/purchases and group features.
In each of these three cases, the dataset is examined and, with WEKA's assistance, analyzed to obtain results capable of supporting or improving decision-making, increasing competitiveness and possibly increasing the sales of the firms in question.
The first chapter presents the concept of data mining, the purpose it serves and the ways in which it helps solve business problems. It then presents the data mining software WEKA, which is used in each of the cases to analyze the given data and to provide meaningful patterns, rules and results for the issues addressed. Next, the chapter presents, analyzes and explains the regression and classification algorithms used in the use cases of the following chapters, together with the clusterers applied through WEKA. Finally, it presents and explains the concept of association rule mining, as well as the various metrics used to analyze and interpret WEKA’s results.
The second chapter presents the case of the Creta Palm Hotel. For this case, data on the hotel’s bookings from the different travel agencies and the different booking sources were collected for the years 2019 and 2020. These data are analyzed by applying classification algorithms and clustering methods with the assistance of the machine learning software WEKA. The aim is to generate predictions for the total bookings of the different travel agencies and the different booking pages, to check the accuracy of these predictions, and to group the characteristics of the different cooperating booking sources for the years 2019 and 2020. The total-bookings predictions concern travel agencies and booking sources with the same or similar characteristics as the agencies/sources given for analysis, that is, the training data. Agencies/sources whose data have the same or similar characteristics as the training data are expected to behave in the same way and have a similar number of sales.
The third chapter is about the multinational insurance company NN. In this case, the company created a questionnaire for its customers and collected their responses in order to examine their intentions and preferences concerning the insurance products it promotes. These responses are processed and then analyzed in order to predict the customers’ interest in insurance estimation, to test the accuracy of the predictions and to cluster the customers’ characteristics, based on the data provided by the company. This is accomplished by applying classification algorithms and clustering methods with the assistance of the WEKA machine learning software. WEKA’s predictions of customers’ interest in retirement estimation concern customers who display the same or similar characteristics as those who answered the questionnaire. Therefore, customers with the same or similar characteristics as the training data are expected to behave in the same way and give a similar response.
The fourth chapter presents the case of a large supermarket chain in Greece. For this use case, a database of transactional and demographic data was collected from the supermarket over a period of eight months during 2021. This database included the customers’ gender, age and card code, all of their purchases with their dates, the shop and area in which each purchase was made, the products each customer chose along with their product category, and the amount of money spent on each product. Collecting and analyzing these data makes it possible to find useful information about the customers through association rule mining and clustering. Using its integrated algorithms, the WEKA machine learning software transforms the dataset into meaningful patterns, aiming to find the products that are associated with each other and are usually purchased together (association rule mining), as well as to group the customers according to how frequently they purchase from the various product categories. Association rule mining, in other words market basket analysis, discovers the correlations between the different items in customers’ shopping carts, while clustering separates customers into groups with similar traits. These methods help the company understand its customers’ profiles better and thus create value for them. This leads to a better customer experience and creates stronger sentiment or loyalty towards the company. Association rule mining and clustering also help the company to better predict the results for a new, incoming dataset with characteristics similar to those of the existing dataset, and to design the necessary marketing campaigns according to the customers’ gender, age and shopping area.
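As an illustration of the kind of metrics typically used to interpret association rules, the following minimal Python sketch computes support, confidence and lift for a single rule. It is not the WEKA workflow used in the thesis; the baskets, the function names `support` and `rule_metrics`, and the example rule are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy baskets (NOT data from the thesis).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    sup_a = support(antecedent, transactions)
    sup_c = support(consequent, transactions)
    sup_ac = support(set(antecedent) | set(consequent), transactions)
    if sup_a == 0 or sup_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    confidence = sup_ac / sup_a   # estimate of P(consequent | antecedent)
    lift = confidence / sup_c     # > 1 suggests a positive association
    return sup_ac, confidence, lift


# Metrics for the illustrative rule {bread} -> {milk}.
metrics = rule_metrics({"bread"}, {"milk"}, baskets)
print(metrics)
```

A lift above 1 suggests that the two items are bought together more often than would be expected if they were purchased independently, while a lift below 1 suggests the opposite.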
| en |