S1 Select and apply key analytical techniques (e.g. traditional and intelligent analytics) to conduct big data analysis across the whole data science lifecycle, using modern data science platforms and data science programming languages.
S2 Conduct pre-processing, data fusion and data analysis on a wide variety of data sets and report the results.
S3 Visualise, present and organise data in a variety of formats.
Part 1: Report
You are required to produce a report of up to 2,000 words on the subject of anomaly detection. Choose an anomaly detection algorithm that you find interesting and explain how this approach has been successfully deployed in data mining. You are free to choose any application area, such as finance, health, fraud, stock market, military, engineering, social media, prediction, telecoms or planning. The report should discuss your data analysis and use R outputs, R code snippets and diagrams to support your explanations.
Part 2: Practical Implementation
Based on the machine learning method selected, implement a data mining analysis using this method on your selected data set.
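As a rough indication of the scale of a minimal implementation, the sketch below flags anomalies in simulated data using Tukey's IQR fences in base R. It is illustrative only and is not a required or recommended method: the data, the variable names (normal_obs, injected, readings) and the 1.5 multiplier are all assumptions made for the example, and your own analysis should use the algorithm and data set you have chosen.

```r
# Illustrative sketch only: simulated data and a Tukey (IQR) fence rule.
# All names and values below are hypothetical, chosen for the example.
set.seed(42)                                  # make the simulation reproducible
normal_obs <- rnorm(200, mean = 10, sd = 2)   # simulated "normal" readings
injected   <- c(50, -40)                      # two deliberately extreme values
readings   <- c(normal_obs, injected)

# Tukey fences: flag points more than 1.5 * IQR beyond the quartiles
q     <- unname(quantile(readings, probs = c(0.25, 0.75)))
iqr   <- q[2] - q[1]
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr

# Indices and values of the observations flagged as anomalous
flagged <- which(readings < lower | readings > upper)
print(flagged)
print(readings[flagged])
```

The two injected values sit at positions 201 and 202 and fall well outside the fences, so they are flagged; a few of the simulated normal readings may be flagged as well. Calling boxplot(readings) on the same vector would give a matching diagram for the report.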
Place the word count on the first page.
State where you obtained your data (or how you simulated it), the R packages you have used, and any source code you have used from others. Also place a full R source listing at the back of the report, as screenshots of the code from the RStudio script editor; this listing does not count towards the word count.
You can refer to any of your course handouts and to other books, journals or online resources.