Big data analytics project using Spark or Hive on AWS
1. Select a food security dataset and use Map Reduce or Apache Spark to evaluate industry domain-specific data analytic goals. For Map Reduce, this can be done directly, through Java or Python programs, or indirectly, through Apache Pig or Apache Hive.
2. You will also need to pose data analytic questions or goals for your project that are appropriate for the dataset you have chosen. These goals or questions can mirror or resemble the questions posed and answered in the peer-reviewed papers you researched for your literature survey paper on the domain surrounding your dataset. You can also find such domain-specific questions in published case studies or technical reports. Regardless, ensure the questions you pose are appropriate to the domain and valuable in their own right, based on your research into the Map Reduce or Apache Spark data science work conducted by the industry or institution most relevant to your dataset. If you do this, the questions will be well suited to this project paper.
3. The methodology section should describe what you intend to do to answer these project questions and how you will analyze the results in the context of the dataset-specific domain you have chosen.
4. Conduct your Map Reduce or Apache Spark job on the AWS Hadoop Cluster. You should be familiar with the specific tool you wish to use from your previous homework assignments. You will need to run the appropriate commands on the Cluster to extract insights from the specific features of the dataset provided on Canvas.
5. Analyze your results and formulate answers to those questions raised in the earlier part of your report based on the data you have extracted from the large data set as a result of your Map Reduce or Apache Spark job.
6. Write your insights into your conclusion section and incorporate the answers you have formulated and supported with the results of your work in the previous sections of this project report.
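
As an illustration of the direct Python route to Map Reduce mentioned in step 1, the sketch below shows one possible shape for a map step and a reduce step that average an indicator per country. It is only a sketch under stated assumptions: the three-column layout (country, year, value) and the names mapper, reducer, and run_local are hypothetical and are not taken from the dataset on Canvas. The run_local helper simulates the shuffle and sort phase on a single machine so the logic can be checked before anything is submitted to the cluster.

```python
import csv
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map step: emit (country, value) for each well-formed CSV row.

    Assumes a hypothetical layout of country,year,value per line.
    """
    for row in csv.reader(lines):
        if len(row) != 3:
            continue  # skip malformed rows
        country, _year, value = row
        try:
            yield country, float(value)
        except ValueError:
            continue  # skip the header row and non-numeric values


def reducer(pairs):
    """Reduce step: average the values per key; input must be sorted by key."""
    for country, group in groupby(pairs, key=itemgetter(0)):
        values = [v for _, v in group]
        yield country, sum(values) / len(values)


def run_local(lines):
    """Simulate map -> shuffle/sort -> reduce on one machine for testing."""
    shuffled = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(shuffled))


if __name__ == "__main__":
    # Made-up sample rows, used only to exercise the functions.
    sample = [
        "country,year,value",
        "A,2019,10.0",
        "B,2019,4.0",
        "A,2020,12.0",
    ]
    print(run_local(sample))
```

On the cluster, the map and reduce steps would more likely live in separate scripts that read standard input and write key and value pairs to standard output, with the framework handling the sort between them. The column positions would need to be adjusted to the actual dataset.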