Air traffic landing statistics are another interesting data source. One dataset is available at
https://data.world/sanfrancisco/fpux-q53t, but other datasets may exist as well. Your
goal is to find additional datasets, identify a task that yields meaningful insight,
and prepare the data accordingly.
Instructions regarding the Projects:
a. Data collection
b. Data Preprocessing and Cleaning
c. Data Visualization
d. Data Statistics (summary statistics)
e. Hypothesis Testing
f. Prediction Task (using a machine learning model)
Interesting facts about the data (e.g., the different studies that could be done with
the collected data), innovative ideas, and research ideas (modifications or
enhancements of existing techniques for better results) will also be considered during
evaluation and are highly appreciated.
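import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import OrdinalEncoder

# Load the landings data and take a first look at its structure
df = pd.read_csv('air_traffic_landings_statistics_1.csv')
df.head()
df.info()

# Back-fill missing values in every column that contains at least one NaN
# (NaNs at the very end of a column have no later value and stay missing)
missing_col = [x for x in df.columns if df[x].isnull().any()]
for i in missing_col:
    df[i] = df[i].bfill()

# Encode text (object) columns as ordinal numeric codes
obj = [x for x in df.columns if df[x].dtype == object]
encoder = OrdinalEncoder()
df[obj] = encoder.fit_transform(df[obj])

# Summary statistics for all columns
df.describe()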
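The code above covers loading, cleaning, and summary statistics, but not the hypothesis testing step (e). The sketch below shows one possible way to approach it: a two-sided permutation test for a difference in mean landing counts between two groups. It is an illustration under assumptions, not the project's actual method. The column names "GEO Summary" and "Landing Count" are assumed to match the CSV, the data frame `demo` is synthetic with invented values, and `permutation_test` is a hypothetical helper written for this example. On the real data it would be easiest to run this before the ordinal encoding step, while the group labels are still readable text.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data. The column names are assumptions about the CSV,
# and the values are invented purely for illustration.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "GEO Summary": ["Domestic"] * 60 + ["International"] * 60,
    "Landing Count": np.concatenate([
        rng.poisson(120, size=60),
        rng.poisson(40, size=60),
    ]),
})


def permutation_test(data, group_col, value_col, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the observed difference (first label minus second label,
    in sorted order) and the estimated p-value.
    """
    labels = sorted(data[group_col].unique())
    if len(labels) != 2:
        raise ValueError("expected exactly two groups")
    values = data[value_col].to_numpy(dtype=float)
    in_first = (data[group_col] == labels[0]).to_numpy()
    observed = values[in_first].mean() - values[~in_first].mean()

    perm_rng = np.random.default_rng(seed)
    extreme = 0
    for _ in range(n_perm):
        # Shuffle group membership while keeping the group sizes fixed
        shuffled = perm_rng.permutation(in_first)
        diff = values[shuffled].mean() - values[~shuffled].mean()
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value


observed_diff, p_value = permutation_test(demo, "GEO Summary", "Landing Count")
print(f"mean difference = {observed_diff:.1f}, p = {p_value:.4f}")
```

A permutation test is used here because it makes no normality assumption about landing counts; a t-test or a rank-based test would be reasonable alternatives depending on how the real data are distributed.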