Task 2: Data Analysis (40 marks)
(Write in third person – academic style)
You will carry out a data analysis and produce visualisations. You will need a data set which can be analysed. You will also need a research question.
- Research question and hypotheses (10 marks)
Identify a research question that might be usefully answered using your analytics record. Develop hypotheses.
Evaluate the potential impact of insights that might occur following exploration of the research question.
- Dataset Generation (10 marks)
Identify data from your application that might be analysed to provide business insights and, in particular, to answer your research question above. Based on your selection, create a data set with at least 1000 rows. You can create the data set using real data (suitably anonymised) or, if this is not possible, by generating a realistic data set. To generate realistic data, you will need to specify a realistic shape for the data and realistic relationships within it. If you use real data, make sure you have permission from your organisation and that your use complies with your organisation’s data governance policy.
Briefly explain why the data chosen has been selected and the reasoning behind your design of the data shape and relationships. The data set will form your analytics record.
Include an appendix that describes the metadata of your data set, together with a sample of rows.
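If you need to generate data, the following is a minimal sketch of one way to do it in Python using only the standard library. Everything specific in it is an assumption made for illustration: the column names (record_id, channel, session_minutes, order_value), the log-normal shape chosen for session length, and the linear relationship between session length and order value. Replace these with the shape and relationships that are realistic for your own application.

```python
import csv
import random


def generate_rows(n_rows=1000, seed=42):
    """Return a list of dicts forming a synthetic analytics record.

    All columns and parameters are hypothetical placeholders.
    """
    rng = random.Random(seed)  # fixed seed so the data set is reproducible
    rows = []
    for i in range(n_rows):
        channel = rng.choices(["web", "mobile"], weights=[0.6, 0.4])[0]
        # Shape (assumed): session length is right-skewed, so draw it
        # from a log-normal distribution.
        session_minutes = round(rng.lognormvariate(2.0, 0.5), 1)
        # Relationship (assumed): order value rises with session length,
        # with an uplift for mobile and some random noise.
        uplift = 5.0 if channel == "mobile" else 0.0
        order_value = 10.0 + 1.5 * session_minutes + uplift + rng.gauss(0, 8)
        rows.append({
            "record_id": i + 1,
            "channel": channel,
            "session_minutes": session_minutes,
            "order_value": round(max(0.0, order_value), 2),  # no negative values
        })
    return rows


def write_csv(rows, path):
    """Write the rows to a CSV file that a visualisation tool can import."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


rows = generate_rows()

if __name__ == "__main__":
    write_csv(rows, "analytics_record.csv")  # hypothetical file name
```

Fixing the seed means the same data set is produced on every run, which makes it easier to describe the metadata and sample rows in your appendix.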
- Hypothesis Testing (10 marks)
Analyse the data against your hypotheses.
Carry out suitable statistical significance testing.
Evaluate the results and justify the statistical significance testing method selected.
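As an illustration of significance testing, the sketch below runs a two-sided permutation test on the difference in means between two groups, using only the Python standard library. The permutation test is simply one possible method, chosen here because it needs no extra packages; it is not a required or recommended choice, and you must still justify whichever test suits your own data and hypotheses. The group names and the sample values are made up for the example.

```python
import random


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns (observed difference, estimated p-value).
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are interchangeable,
        # so shuffle the pooled values and split them again.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add one to numerator and denominator so the estimate is never zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical sample data: order values for two channels.
demo_rng = random.Random(1)
web = [demo_rng.gauss(30, 8) for _ in range(60)]
mobile = [demo_rng.gauss(34, 8) for _ in range(60)]

diff, p = permutation_test(mobile, web)
print(f"difference in means = {diff:.2f}, p = {p:.4f}")
```

A p-value below the significance level you chose in advance (0.05 is common) would lead you to reject the null hypothesis of no difference between the groups.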
- Analysis and Visualisation (10 marks)
Using Power BI or another suitable visualisation tool, create at least three visualisations from your data set. Provide a discussion of the visualisations selected, explaining how they were created and what additional insight they bring.