III. Query questions (covering 12 items, 67 marks in total)
Table: Movies
Id  Title           Director        Year  Length_minutes
1   Toy Story       John Lasseter   1995  81
2   A Bug's Life    John            1998  95
3   Toy Story 2     John Lasseter   1999  93
4   Monsters, Inc.  Pete Docter     2001  92
5   Finding Nemo    Andrew Stanton  2003  107
7   Cars            John Lasseter   2006  117
9   WALL-E          Andrew Stanton  2008  104
10  Up              Pete Docter     2009  101
11  Toy Story 3     Lee Unkrich     2010  103
12  Cars 2          John Lasseter   2011  120
13  Brave           Brenda Chapman  2012  102
87  WALL-G          Brenda Chapman  2042  97
Table: Boxoffice
Movie_id  Rating  Domestic_sales  International_sales
5         8.2     380843261       555900000
12        6.4     191452396       368400000
3         7.9     245852179       239163000
9         8.5     223808164       297503696
11        8.4     415004880       648167031
1         8.3     191796233       170162503
7         7.2     244082982       217900167
10        8.3     293004164       438338580
4         8.1     289916256       272900000
2         7.2     162798565       200600000
13        7.2     237283207       301700000
1. Create the table Movies. Include the domain of values associated with each attribute and the integrity constraints. (7 marks)
2. Write an SQL query that finds the title of each film. (5 marks)
3. Write an SQL query that finds the movies released between 2000 and 2010. (5 marks)
4. Find all the WALL-* movies. (5 marks)
5. List the last four Pixar movies released (ordered from most recent to least recent). (5 marks)
6. Find the domestic and international sales for each movie. (6 marks)
7. List all movies that were released in even-numbered years. (5 marks)
8. Add the studio's new production, Toy Story 4, to the list of movies (you can use any director). (5 marks)
9. The director for A Bug's Life is incorrect; it was actually directed by John Lasseter. Correct it. (5 marks)
10. This database is getting too big; let's remove all movies that were released before 2005. (5 marks)
11. Write the result of the following query. (6 marks)
SELECT title, year FROM movies WHERE year < 2000 OR year > 2010;
The data is in CSV files, here Movies.csv and Ratings.csv.
1) Load these CSV files into pandas DataFrames with the read_csv command, and use the DataFrame head() command to study the data.
2) Use the merge command to merge the DataFrames.
3) Use the sort_values command to display the merged DataFrame sorted by Director.
The Python code is as follows:
import pandas as pd
df1 = pd.read_csv('Movies.csv')   # load Movies.csv as a DataFrame
df2 = pd.read_csv('Ratings.csv')  # load Ratings.csv as a DataFrame
print(df1.head())  # study the first rows of each DataFrame
print(df2.head())
# Left merge: keep every movie and attach the row whose Movie_id equals Id
result = pd.merge(df1, df2, how='left', left_on='Id', right_on='Movie_id')
print(result.head())  # display the merged DataFrame
print(result.sort_values(by='Director'))  # display it sorted by Director
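To make the effect of how='left' concrete, here is a minimal sketch that builds the two DataFrames inline instead of reading the CSV files. It assumes Ratings.csv carries the Movie_id and Rating columns of the Boxoffice table; the file contents are not shown in the question, so that layout is an assumption, and only a few rows are copied for brevity.

```python
import pandas as pd

# A few rows copied from the Movies table; stands in for Movies.csv.
movies = pd.DataFrame({
    "Id": [1, 9, 13, 87],
    "Title": ["Toy Story", "WALL-E", "Brave", "WALL-G"],
    "Director": ["John Lasseter", "Andrew Stanton", "Brenda Chapman", "Brenda Chapman"],
    "Year": [1995, 2008, 2012, 2042],
})

# Matching rows from the Boxoffice table; assumed layout of Ratings.csv.
ratings = pd.DataFrame({
    "Movie_id": [9, 1, 13],
    "Rating": [8.5, 8.3, 7.2],
})

# A left merge keeps every movie, including WALL-G, which has no ratings row.
merged = movies.merge(ratings, how="left", left_on="Id", right_on="Movie_id")

# sort_values returns a new DataFrame; merged itself keeps its original order.
by_director = merged.sort_values(by=["Director", "Year"]).reset_index(drop=True)

# Movies whose Rating is NaN are the ones missing from the ratings data.
unrated = merged.loc[merged["Rating"].isna(), "Title"].tolist()

print(by_director[["Title", "Director", "Year", "Rating"]])
print("Movies without a rating:", unrated)
```

With how='inner' the WALL-G row would be dropped instead of being kept with missing values, so the choice of join type decides whether unrated movies still appear in the sorted output.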
SQL query for question 1:
CREATE TABLE Movies (
    Id             int PRIMARY KEY,
    Title          varchar(255),
    Director       varchar(255),
    Year           int,
    Length_minutes int
);
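The remaining questions can be tried out against an in-memory SQLite database. The following is one possible sketch, not a model answer: the NOT NULL, CHECK, and foreign-key rules are assumptions added to illustrate integrity constraints, the run helper is a hypothetical convenience function, and the Id, director, year, and length used for Toy Story 4 are placeholder values. SQLite syntax is assumed; other systems may differ (for example, MOD(Year, 2) instead of Year % 2).

```python
import sqlite3

conn = sqlite3.connect(":memory:")        # throwaway in-memory database
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request

# Question 1. The NOT NULL, CHECK, and foreign-key rules are assumptions.
conn.execute("""
    CREATE TABLE Movies (
        Id             INTEGER PRIMARY KEY,
        Title          VARCHAR(255) NOT NULL,
        Director       VARCHAR(255) NOT NULL,
        Year           INTEGER NOT NULL CHECK (Year > 0),
        Length_minutes INTEGER NOT NULL CHECK (Length_minutes > 0)
    )""")
conn.execute("""
    CREATE TABLE Boxoffice (
        Movie_id            INTEGER PRIMARY KEY
                            REFERENCES Movies(Id) ON DELETE CASCADE,
        Rating              REAL,
        Domestic_sales      INTEGER,
        International_sales INTEGER
    )""")

movies = [
    (1, "Toy Story", "John Lasseter", 1995, 81),
    (2, "A Bug's Life", "John", 1998, 95),
    (3, "Toy Story 2", "John Lasseter", 1999, 93),
    (4, "Monsters, Inc.", "Pete Docter", 2001, 92),
    (5, "Finding Nemo", "Andrew Stanton", 2003, 107),
    (7, "Cars", "John Lasseter", 2006, 117),
    (9, "WALL-E", "Andrew Stanton", 2008, 104),
    (10, "Up", "Pete Docter", 2009, 101),
    (11, "Toy Story 3", "Lee Unkrich", 2010, 103),
    (12, "Cars 2", "John Lasseter", 2011, 120),
    (13, "Brave", "Brenda Chapman", 2012, 102),
    (87, "WALL-G", "Brenda Chapman", 2042, 97),
]
boxoffice = [
    (5, 8.2, 380843261, 555900000),
    (12, 6.4, 191452396, 368400000),
    (3, 7.9, 245852179, 239163000),
    (9, 8.5, 223808164, 297503696),
    (11, 8.4, 415004880, 648167031),
    (1, 8.3, 191796233, 170162503),
    (7, 7.2, 244082982, 217900167),
    (10, 8.3, 293004164, 438338580),
    (4, 8.1, 289916256, 272900000),
    (2, 7.2, 162798565, 200600000),
    (13, 7.2, 237283207, 301700000),
]
conn.executemany("INSERT INTO Movies VALUES (?, ?, ?, ?, ?)", movies)
conn.executemany("INSERT INTO Boxoffice VALUES (?, ?, ?, ?)", boxoffice)

def run(sql):
    """Run one statement and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Questions 2 to 7 and 11 only read data.
q2 = run("SELECT Title FROM Movies")
q3 = run("SELECT Title, Year FROM Movies WHERE Year BETWEEN 2000 AND 2010")
q4 = run("SELECT Title FROM Movies WHERE Title LIKE 'WALL-%'")
q5 = run("SELECT Title, Year FROM Movies ORDER BY Year DESC LIMIT 4")
q6 = run("""SELECT m.Title, b.Domestic_sales, b.International_sales
            FROM Movies m JOIN Boxoffice b ON b.Movie_id = m.Id""")
q7 = run("SELECT Title, Year FROM Movies WHERE Year % 2 = 0")
q11 = run("SELECT Title, Year FROM Movies WHERE Year < 2000 OR Year > 2010")

# Questions 8 to 10 modify the data, so they run after the read-only queries.
# The values inserted for Toy Story 4 are placeholders.
conn.execute("INSERT INTO Movies (Id, Title, Director, Year, Length_minutes) "
             "VALUES (14, 'Toy Story 4', 'Josh Cooley', 2019, 100)")
conn.execute("UPDATE Movies SET Director = ? WHERE Title = ?",
             ("John Lasseter", "A Bug's Life"))
q9 = run("SELECT Director FROM Movies WHERE Id = 2")
conn.execute("DELETE FROM Movies WHERE Year < 2005")  # cascades to Boxoffice
remaining = run("SELECT Title FROM Movies ORDER BY Year")

for label, rows in [("Q3", q3), ("Q4", q4), ("Q5", q5), ("Q7", q7), ("Q11", q11)]:
    print(label, rows)
print("After Q8-Q10:", remaining)
```

Question 6 is written as an inner join, so WALL-G, which has no Boxoffice row, is left out of that result; a LEFT JOIN would keep it with empty sales figures. The ON DELETE CASCADE rule is a design choice made here so that question 10 does not leave Boxoffice rows pointing at deleted movies.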