Familiarise yourself with some well-known data mining techniques in order to understand their working principles;
Apply data mining techniques to domain-specific datasets;
Review cutting-edge data mining techniques to gain a good overview of current data mining technology.
The whole task of this assignment consists of the following procedural steps.
Find and download a data set on a topic that you find interesting. LearnJCU hosts several data sets that the lecturer thinks are interesting, and you are welcome to pick one of those, but thousands of other data sets are also available from web sites such as these:
The original data set often comes with a short article describing it, or at least a name. Use Google Scholar at https://scholar.google.com (or a similar academic citation index) to find a few articles that use data mining on the same data set.
If no article uses that same data set, then try looking for articles that use data mining on the same topic.
The aim is to compare your results with the results of other researchers.
• If there are many articles that use your data set, then just pick two or three articles that seem recent, popular, or otherwise interesting.
• You don’t have to read the whole article! Just read the introduction, then skip to the back, and look for a results section or results table.
Google Scholar can also do the formatting for your referencing for the articles.
Choose appropriate data mining techniques (and a few algorithms).
You can select either of two options for this assignment: the programming-intensive option or the analysis-intensive option.
- Programming-intensive option: once you have your own domain-specific dataset and a chosen data mining algorithm, you need to design and implement that algorithm in your preferred programming language.
- A series of preprocessing steps will be required at this stage. Design the preprocessing procedure carefully (consider what kind of processing is required, how it will be done, and why) to make your data ready to be fed to your program. Some parts of this preprocessing procedure can be included in your program as part of a “pre-data-mining module”.
- Your final program must be a stand-alone data-mining tool designed for your own data analysis purpose. Your program is expected to include the following modules (and may include more sub-modules if needed):
1) pre-data-mining module – designed for the necessary preprocessing and for getting the data ready to be fed to the next module (the data-mining module). You don’t need to include all required preprocessing in this module. It is assumed that some initial preprocessing (e.g. cleaning noisy data) can be done externally using other software tools (e.g. Excel or Weka).
2) data-mining module – implements the chosen data mining algorithm. You can directly borrow the algorithm from a popular existing data mining method, or you can design your own algorithm (by amending an existing one).
3) post-mining module – presents/reports the output produced by the previous modules. The result can be a simple text report, optionally supplemented by a non-text visualization (e.g. graph, chart or diagram).
- This programming-intensive assignment still requires an analysis. Try to find all the patterns you can detect with your implemented algorithm. Compare and contrast the results from your chosen preprocessing scheme and algorithm with the results from other existing algorithms or other preprocessing methods. In particular, to compare the results of your program with those of other existing algorithms, you can use existing data mining tools (e.g. Weka) to obtain the results of the other algorithms.
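As a rough illustration of the three-module structure described above, here is a minimal Python sketch that uses only the standard library. The function names (`pre_mining`, `mine`, `post_mining`), the `SAMPLE_CSV` data and the choice of k-means are assumptions made purely for this example; the brief leaves the algorithm, language and data up to you.

```python
import csv
import io
import random

# Hypothetical toy data: a header row, six complete rows, one incomplete row.
SAMPLE_CSV = """x,y
1.0,1.1
1.2,0.9
0.8,1.0
8.0,8.2
8.3,7.9
7.9,8.1
,3.0
"""

def pre_mining(csv_text):
    """Pre-data-mining module: parse CSV, drop unusable rows, min-max scale."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        if not record:
            continue  # skip blank lines
        try:
            rows.append([float(value) for value in record])
        except ValueError:
            continue  # skip the header and rows with missing or non-numeric values
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)),
            key=lambda j: squared_distance(point, centroids[j]))
        for point in points
    ]

def mine(points, k, iterations=20, seed=0):
    """Data-mining module: plain k-means (Lloyd's algorithm)."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iterations):
        labels = assign(points, centroids)
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign(points, centroids), centroids

def post_mining(labels, centroids):
    """Post-mining module: produce a simple text report."""
    lines = []
    for j, centroid in enumerate(centroids):
        centre = ", ".join(f"{value:.2f}" for value in centroid)
        lines.append(f"cluster {j}: {labels.count(j)} points, centroid ({centre})")
    return "\n".join(lines)

points = pre_mining(SAMPLE_CSV)
labels, centroids = mine(points, k=2)
print(post_mining(labels, centroids))
```

Keeping each module as a separate function makes it straightforward to swap in a different preprocessing scheme or algorithm later, which is what the comparison analysis calls for.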
- Analysis-intensive option: once you have chosen your own domain-specific dataset, you need to design your own data-mining analysis scheme. This scheme can consist of multiple steps:
1) Set up a strategy for preprocessing your data.
A series of preprocessing steps will be required and needs to be designed carefully (consider what kind of processing is required, how it will be done, and why). You may include multiple preprocessing schemes for the comparison analysis.
2) Set up a strategy for data mining.
You need to select one data mining area (clustering, classification, or association rule mining) of your choice and select AT LEAST TWO existing data mining algorithms in that area. For example, if you choose clustering as your data mining area, you can apply two algorithms, DBSCAN and k-means, and compare the two results. Alternatively, you can design a combined algorithm that applies multiple algorithms from the same or different data mining areas in series. Your strategy can also apply different parameters to one algorithm. Another possible strategy is to apply multiple preprocessing (attribute selection) schemes for one algorithm.
- You can choose one data mining tool (e.g. Weka) to analyze your chosen dataset. Apply the data-mining strategy you set up to your preprocessed data using the tool and try to find all the patterns you can detect.
- Run various comparison experiments, either by applying different data mining algorithms (or strategies) to the same dataset or by applying the same algorithm to differently preprocessed datasets.
- Critically analyze the experimental results and discuss/demonstrate why a chosen algorithm (strategy) is superior or inferior to another algorithm (strategy).
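To make the idea of a comparison experiment concrete, the sketch below applies one algorithm to the same data under two preprocessing schemes. It assumes a 1-nearest-neighbour classifier scored by leave-one-out accuracy, once on raw features and once on min-max scaled features. The data set and the function names (`min_max_scale`, `loo_accuracy_1nn`) are hypothetical, deliberately contrived so that the large-valued first feature is uninformative; a tool such as Weka could run the equivalent experiment without any code.

```python
def min_max_scale(rows):
    """Rescale every column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def loo_accuracy_1nn(features, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    correct = 0
    for i, query in enumerate(features):
        # Find the closest other point by squared Euclidean distance.
        nearest = min(
            (j for j in range(len(features)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(query, features[j])),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(features)

# Hypothetical toy data: the first feature is large but uninformative,
# the second feature is small but separates the two classes.
features = [
    [1000.0, 0.10], [3000.0, 0.20], [5000.0, 0.15],   # class "A"
    [1010.0, 0.90], [3010.0, 0.80], [5010.0, 0.85],   # class "B"
]
labels = ["A", "A", "A", "B", "B", "B"]

raw_accuracy = loo_accuracy_1nn(features, labels)
scaled_accuracy = loo_accuracy_1nn(min_max_scale(features), labels)
print(f"raw features:    {raw_accuracy:.2f}")
print(f"scaled features: {scaled_accuracy:.2f}")
```

On this toy data the raw run misclassifies every point because the unscaled first feature dominates the distance, while the scaled run classifies every point correctly. A real report would need to explain such a difference with evidence from your own data rather than a constructed example.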
- You need to give an in-class presentation (15 minutes of presentation + 5 minutes for questions) based on your chosen algorithm (strategy) and experimental tests.
- The presentation must include a good overview of your project, its aims/objectives, the reasons for your choices, a brief overview of the strategy/algorithm you chose, findings, comparison (including experimental results) and a conclusion.
- You also need to write a research report of minimum 15~20 pages (for CP5634 students) on your project, to summarise your algorithm and experimental results. The report should cover all the topics listed above for the presentation, but in more detail.
• For CP5634 students, you need to add one additional section to your report containing a brief literature review of the data mining methods (strategy, algorithm and/or preprocessing methods) you chose for your project. Please refer to the following link if you need a better idea of what a “literature review” is:
- The research paper must follow the generally accepted format of a research article, consisting of an introduction, related work (a brief review of the methodologies, i.e. the algorithm/strategy used), a summarized description of your experimental settings and procedures (description of data, justification of the chosen data mining area, justification of the chosen algorithm, preprocessing details, etc.), comparison, discussion, issues, conclusion, possible future work and a list of references. You may add more sections if needed.
- In addition to the general components listed above, the report for the “Programming-intensive option” should include a summary of your program (including the program structure, implementation details, a summarized algorithm for the main modules, etc., and code if necessary).
- For the “Analysis-intensive option”, the report must include a more in-depth analysis of the investigation and experimental comparisons made in the project.
• Report submission due: Friday 24 January 2020 (in week 9)
• Presentation: Before the report is due, usually during week 8.
• You need to submit your final report as a single document file (MS Word or PDF format) to the electronic drop box on LearnJCU.
• For the “Programming-intensive option”, you need to submit the source code and executable file of your program along with your report. Please make a zip file including all necessary files (report document and program files).