Task 1: Data Architecture Analysis (25 marks)
(Write in third person – academic style)
The set text provides an end-state architecture to which organisations should aspire. It is shown below.
Snapshot from: Inmon, W. and Linstedt, D. (2019) Data Architecture: A Primer for the Data Scientist, 2nd edition, Academic Press, p. 48
Devise a big-data-oriented application relevant to your organisation (or an organisation like your own) and consider how that application could be represented using the given architecture as a foundation.
Provide a description of the application, stating the main data components, their sources and their relationships. Justify the data components included.
Amend the diagram given above to include the data components of your application.
Explain how big data techniques might be used to harness and process the data within your application.
Task 2: Data Analysis (40 marks)
(Write in third person – academic style)
You will carry out a data analysis and produce visualisations. You will need a data set which can be analysed. You will also need a research question.
Research question and hypotheses (10 marks)
Identify a research question that might be usefully answered using your analytics record. Develop hypotheses.
Evaluate the potential impact of insights that might occur following exploration of the research question.
Dataset Generation (10 marks)
Identify data from your application that might be analysed to provide business insights and, in particular, to answer the research question identified above. Based on your selection, create a data set with at least 1000 rows. You can create the data set using real data (suitably anonymised) or, if this is not possible, by generating a realistic data set. To generate realistic data, you will need to specify a realistic shape for the data and realistic relationships within it.
Briefly explain why the data has been selected and the reasoning behind your design of the data shape and relationships. The data set will form your analytics record.
Include an appendix that describes the metadata of your data set together with a sample of some rows.
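Where real data is not available, one possible way to generate a data set with a specified shape and relationship is sketched below in Python. Everything in it is an illustrative assumption rather than part of the brief: the field names (customer_id, age, loyalty_member, monthly_spend), the distributions, and the built-in relationship (loyalty members spending more on average) are hypothetical and should be replaced with components drawn from your own application.

```python
import csv
import io
import random

# Hypothetical field names, chosen only for illustration.
FIELDS = ["customer_id", "age", "loyalty_member", "monthly_spend"]


def generate_dataset(n_rows=1000, seed=42):
    """Generate synthetic rows with an assumed shape and relationship."""
    rng = random.Random(seed)  # fixed seed so the data set is reproducible
    rows = []
    for customer_id in range(1, n_rows + 1):
        # Assumed shape: about 30% of customers are loyalty members.
        loyalty_member = rng.random() < 0.30
        # Assumed shape: age is roughly normal, clipped to 18-80.
        age = min(80, max(18, round(rng.gauss(40, 12))))
        # Assumed relationship: members spend about 15 more on average.
        uplift = 15.0 if loyalty_member else 0.0
        monthly_spend = round(max(0.0, rng.gauss(50.0 + uplift, 10.0)), 2)
        rows.append({
            "customer_id": customer_id,
            "age": age,
            "loyalty_member": loyalty_member,
            "monthly_spend": monthly_spend,
        })
    return rows


def to_csv_text(rows):
    """Serialise rows to CSV text, ready to save and load into a visualisation tool."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    data = generate_dataset()
    # Print the header and a small sample of rows, as might appear in the appendix.
    print(to_csv_text(data[:5]))
```

Stating each assumption as a comment next to the line that implements it makes the reasoning behind the data shape and relationships easy to explain in the write-up.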
Hypothesis Testing (10 marks)
Analyse the data against the hypothesis.
Carry out suitable statistical significance testing.
Evaluate the results and justify the statistical significance testing method selected.
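As an illustration of what a significance test involves, the sketch below implements a two-sided permutation test for a difference in group means using only the Python standard library. It is one possible method, not a prescribed one: the appropriate test depends on the hypotheses and the data types involved, and the function name, parameters and sample values here are hypothetical.

```python
import random


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference (mean of A minus mean of B) and an
    estimated p-value under the null hypothesis that group labels are
    exchangeable.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Made-up values for illustration only; use your own analytics record.
    treated = [10, 11, 12, 13, 14, 15, 16, 17]
    control = [1, 2, 3, 4, 5, 6, 7, 8]
    difference, p = permutation_test(treated, control)
    print(f"difference in means = {difference:.2f}, p = {p:.4f}")
```

A permutation test makes few distributional assumptions, which can simplify the justification, but a parametric test such as a t-test or a chi-squared test may be more suitable depending on the data.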
Analysis and Visualisation (10 marks)
Using Power BI or another suitable visualisation tool, create at least three visualisations from your data set. Provide a discussion of the visualisations selected, explaining how they were created and what additional insight they bring.
Task 3: Data Governance (15 marks)
(Write in third person – academic style)
Outline a data governance framework suitable for the organisation. Justify the components included and outline the responsibilities of the data governance function.
You must be critical in your evaluation, contrasting both positives and negatives.
Tip: use linking words such as "however" and "on the other hand".
Task 4: Evaluation (10 marks)
(Write in first person – reflective style)
Evaluate your experience in carrying out the assignment. What went well, and how did you respond to any challenges? Briefly discuss your main points of learning. Again, criticality and looking at both sides (what went well and what did not) are very important.
Tip: use linking words such as "however" and "on the other hand".