After you have read this information, head over to the Assignment 2 Q&A discussion board to ask any questions and see what your peers are saying about this assignment.
Step 1: Assignment Overview
The assignment is based on the same scenario as Assignment 1: a boutique streaming service that specialises in providing its users with curated French-language movies, series, and documentaries. To complete this assignment, you'll need to apply advanced data visualisation techniques to the provided movie dataset using Python.
This assignment requires you to explore ways to create exploratory visualisations for data analysis using Python. As in Assignment 1, you will submit your work as two separate files: your Jupyter Notebook and a 1,000-word report justifying your approach to the exploratory visualisation.
This assignment supports unit learning outcomes 1, 2 and 3:
- Exhibit advanced data analysis processes using Python programming
- Articulate relevant use cases using different Python libraries
- Present exploratory visualisations using Python
Step 2: Read the scenario
The world is changing so fast that, almost without realising it, we have moved from watching movies on television channels to streaming them online. Online streaming services are growing faster than ever and have paved the way for many other providers to join the race. As competition between the major platforms intensifies, there is still a body of niche content in the entertainment and media industry whose viewers are looking for something that big names such as Netflix, Disney or Amazon Prime are not offering.
Canopy is one such boutique streaming service that plans to cater to the viewers of French-language movies. Their initial business goals, as a streaming service provider, are:
- offer curated selections of the best-rated French-language movies from the existing content
- make French-language movies available to various age groups
- identify the least-tapped genres so as to provide filmmakers with data to make original content for Canopy.
You are appointed as the data analyst for Canopy. Your key responsibilities include:
- interpreting data and analysing results using the statistical techniques you learned throughout the course.
- helping gather insights and understand trends to support decision-making by conducting data analysis using Python.
- deducing results that Canopy can use to inform their business goals.
Step 3: Access the dataset
Study the following metadata for better understanding:
Dataset-movie-details
| Column | Sample record | Interpretation of columns |
|---|---|---|
| Title | Inception | Title of the movie |
| Year | 2010 | Release year |
| Age | 13+ | Viewing age |
| IMDb | 8.8 | Rating by IMDb |
| Rotten tomatoes | 0.87 | Rating by Rotten Tomatoes |
| Directors | Christopher Nolan | Name of the director |
| Genres | Action, Adventure, Sci-Fi, Thriller | The genre of the movie |
| Country | United States, United Kingdom | Country the movie released in |
| Language | English, Japanese, French | Available languages of the movie |
| Runtime | 148 | Length of the movie |
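Several columns in the metadata (Genres, Country, Language) hold comma-separated values in a single cell, so a tidying step usually comes before any plotting. The sketch below is one possible approach, not a required one. It builds a one-row DataFrame from the sample record shown in the metadata and splits those cells into lists. The helper name `split_multi_value` and the derived columns `is_french_language` and `min_age` are illustrative assumptions. In your notebook you would load the provided file (for example with `pd.read_csv`) rather than typing a record by hand.

```python
import pandas as pd

# Sample record copied from the metadata table; in the notebook, replace this
# with the provided dataset loaded from file.
sample = {
    "Title": "Inception",
    "Year": 2010,
    "Age": "13+",
    "IMDb": 8.8,
    "Rotten tomatoes": 0.87,
    "Directors": "Christopher Nolan",
    "Genres": "Action, Adventure, Sci-Fi, Thriller",
    "Country": "United States, United Kingdom",
    "Language": "English, Japanese, French",
    "Runtime": 148,
}
movies = pd.DataFrame([sample])

# Assumption: these columns are comma-separated throughout the dataset.
MULTI_VALUE_COLUMNS = ["Genres", "Country", "Language"]


def split_multi_value(df, columns):
    """Return a copy where each listed column holds a list of stripped strings."""
    out = df.copy()
    for col in columns:
        out[col] = out[col].fillna("").apply(
            lambda cell: [part.strip() for part in cell.split(",") if part.strip()]
        )
    return out


tidy = split_multi_value(movies, MULTI_VALUE_COLUMNS)

# Flag titles available in French. Because Language lists every available
# language, this flag also picks up dubbed titles.
tidy["is_french_language"] = tidy["Language"].apply(lambda langs: "French" in langs)

# Minimum viewing age as a number, e.g. "13+" -> 13.0 (assumes an "NN+" format).
tidy["min_age"] = tidy["Age"].str.extract(r"(\d+)", expand=False).astype(float)
```

Keeping the original `movies` frame untouched and working on a copy makes it easier to re-run notebook cells without the split being applied twice.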
Step 4: Create your analysis and visualisation
Your tasks need not be limited to what is laid out in the list here. You might choose to implement more functions in order to reach concrete conclusions for Canopy. Make sure you use Matplotlib, Seaborn, and other visualisation libraries (e.g. Bokeh) for all your tasks.
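As a starting point for the Matplotlib part of that requirement, here is a minimal sketch of a reusable bar-chart helper for comparing one summary value across groups. The function name `plot_group_means` is hypothetical, and the numbers passed to it are placeholders used only to exercise the function; they are not results from the dataset. The same comparison could equally be drawn with Seaborn or Bokeh.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend for scripts; omit this line in Jupyter
import matplotlib.pyplot as plt


def plot_group_means(labels, means, ylabel, title):
    """Draw a bar chart comparing one summary value per group."""
    fig, ax = plt.subplots(figsize=(5, 4))
    ax.bar(labels, means)
    # Write each value above its bar so the chart can be read without a table.
    for x_pos, value in enumerate(means):
        ax.text(x_pos, value, f"{value:.1f}", ha="center", va="bottom")
    ax.set_ylabel(ylabel)
    ax.set_title(title)
    fig.tight_layout()
    return fig, ax


# Placeholder numbers purely to exercise the function; they are not results.
fig, ax = plot_group_means(
    ["France", "Other"], [1.0, 2.0], "Mean runtime (minutes)", "Placeholder data"
)
```

Returning `fig` and `ax` rather than calling `plt.show()` inside the helper lets you adjust or save the figure afterwards, for example for screenshots in the report.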
Canopy wants to find solutions to some broader business problems. Use this set of questions to create your visualisations.
- Does France make longer movies on average based on the runtime?
- Does France make better movies on average based on the ratings?
- Are there any French movies that claim to be of more than one genre? (For example, Avengers: Infinity War is an adventure movie but not a comedy, whereas Back to the Future is both.)
[Note: Canopy doesn’t mind if you choose to include French-dubbed movies at this stage.]
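The three questions above can each be reduced to a small pandas computation before anything is plotted. The sketch below runs on hypothetical toy rows ("Film A" to "Film D") that are invented purely for illustration and are not taken from the assignment dataset. It also assumes that a movie counts as French when "France" appears in its Country cell; you may prefer to use the Language column instead, particularly if you include dubbed titles.

```python
import pandas as pd

# Hypothetical toy rows for illustration only; these are NOT records from the
# assignment dataset and the numbers carry no meaning.
toy = pd.DataFrame(
    {
        "Title": ["Film A", "Film B", "Film C", "Film D"],
        "Country": ["France", "France, Belgium", "United States", "Japan"],
        "Genres": ["Drama", "Comedy, Romance", "Action", "Drama, Thriller"],
        "IMDb": [7.0, 6.0, 8.0, 5.0],
        "Runtime": [100, 120, 90, 110],
    }
)


def split_cell(cell):
    """Split a comma-separated cell into a list of stripped, non-empty values."""
    return [part.strip() for part in str(cell).split(",") if part.strip()]


# Assumption: "France" listed in Country identifies a French movie.
toy["is_french"] = toy["Country"].apply(lambda cell: "France" in split_cell(cell))
toy["origin"] = toy["is_french"].map({True: "France", False: "Other"})
toy["genre_count"] = toy["Genres"].apply(lambda cell: len(split_cell(cell)))

# Questions 1 and 2: mean runtime and mean rating, France versus the rest.
summary = toy.groupby("origin")[["Runtime", "IMDb"]].mean()

# Question 3: French movies listed under more than one genre.
multi_genre_french = toy[toy["is_french"] & (toy["genre_count"] > 1)]

print(summary)
print(multi_genre_french[["Title", "Genres"]])
```

A difference in means alone says little about spread, so a box plot or histogram per group is a natural follow-up to the `summary` table when you build the visualisations.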
Step 5: Complete your report
Write a 1,000-word report and submit it as a PDF file. Feel free to include screenshots of code snippets and outputs from your Jupyter Notebook that help support your explanation and rationale.
Ensure that your report has the following structure:
- Reflection on your understanding of data visualisation and Python in general (200 words)
- Justification of the approach you used to create a data visualisation dashboard to analyse and visualise the movie data in Jupyter notebook (400 words)
- Conclusion related to whether Canopy would achieve their business goals or not based on your analysis and observations (400 words)