Movie revenue is one important measure of a movie's success. It also offers important, intuitive feedback to producers, directors, and actors. It is therefore worth analyzing which factors affect revenue, so that movie makers can aim for higher revenue on their next movie by focusing on the most strongly correlated factors. Our project analyzes different kinds of factors and how they affect revenue.
CATEGORY: Data Science
ROLE: Working in a team of five, we use Jupyter Notebook to analyze open-source data from Kaggle. Through data cleaning and visualization, we try to see what affects movie revenue.
Discussion & Conclusion
After data analysis, we find that among the numeric variables (budget, popularity, runtime, vote_average, vote_count), budget has the highest correlation with revenue, vote_count has the second highest, and popularity has the third.
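The ranking above can be sketched as follows. This is a minimal illustration, not the project's notebook code: it assumes Pearson correlation was the measure used, and the helper names (pearson, rank_by_correlation) and the toy values are invented for the example rather than taken from the Kaggle data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_by_correlation(columns, target="revenue"):
    """Return (feature, r) pairs sorted by correlation with the target, highest first."""
    target_values = columns[target]
    scores = {
        name: pearson(values, target_values)
        for name, values in columns.items()
        if name != target
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up toy values for illustration only (not the Kaggle dataset).
toy = {
    "budget": [10, 20, 30, 40, 50],
    "runtime": [120, 95, 110, 100, 105],
    "revenue": [25, 45, 70, 85, 110],
}

ranking = rank_by_correlation(toy)
for name, r in ranking:
    print(f"{name}: {r:+.2f}")
```

In a notebook with the data loaded into a pandas DataFrame, DataFrame.corr() computes the same pairwise coefficients in a single call.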
For the non-numeric variables, we analyze genre, release month, and actor. Among genres, science fiction has the highest average revenue and a relatively high frequency; adventure has the second-highest average revenue and a high frequency; and drama has the lowest average revenue and also the lowest frequency. We also find almost no correlation between a film's release month and its revenue. As for actor fame, we find that popular actors who appear in more movies tend to be associated with higher revenue.
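The genre comparison rests on two numbers per genre: how often it appears and its average revenue. A rough sketch of that aggregation is shown here. It assumes each movie may carry several genres and counts the movie once under each; the function name genre_stats, the record layout, and the toy values are hypothetical, not the project's code or data.

```python
from collections import defaultdict


def genre_stats(movies):
    """Map each genre to a (movie count, average revenue) pair."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for movie in movies:
        # A movie with several genres contributes to each of them.
        for genre in movie["genres"]:
            totals[genre] += movie["revenue"]
            counts[genre] += 1
    return {genre: (counts[genre], totals[genre] / counts[genre]) for genre in counts}


# Made-up toy records for illustration only (not the Kaggle dataset).
toy_movies = [
    {"genres": ["Science Fiction", "Adventure"], "revenue": 300.0},
    {"genres": ["Adventure"], "revenue": 200.0},
    {"genres": ["Drama"], "revenue": 50.0},
    {"genres": ["Drama", "Science Fiction"], "revenue": 150.0},
]

stats = genre_stats(toy_movies)
# List genres from highest to lowest average revenue.
for genre, (count, avg) in sorted(stats.items(), key=lambda kv: kv[1][1], reverse=True):
    print(f"{genre}: n={count}, mean revenue={avg:.1f}")
```

Reporting the count next to the average matters because a high average from only a few films is weaker evidence than the same average from many.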
In conclusion, we think that to improve a movie's revenue, the company should invest a larger budget and increase advertising to raise the potential vote_count. Although specific genres tend to predict higher revenue, companies should avoid making films only in highly profitable genres. Revenue, although it represents the degree of success to some extent, is not everything after all.