3. Create a function that concatenates two dataframes. Use the previously created function to create two dataframes and pass them as parameters. Make sure that the indexes are reset before returning.
4. Write code to load data from cars.csv into a dataframe and print its details. Details like: 'count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max'.
5. Write a method that takes a column name as an argument and returns the name of the column with which the given column has the highest correlation. The data to be used is the cars dataset. The returned value should not be the column name that was passed as the parameter.
E.g.: get_max_correlated_column('mpg')
-> should return 'drat'
import pandas as pd
# 3. Create a function that concatenates two dataframes. Use the previously created function to create two dataframes and pass them as parameters. Make sure that the indexes are reset before returning.
def con_df(df1, df2):
    # Concatenate the two arguments row-wise; ignore_index=True renumbers the rows 0..n-1.
    df = pd.concat([df1, df2], axis=0, ignore_index=True)
    return df
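To see why the index reset matters, here is a minimal sketch using two small hypothetical frames. The names `left` and `right` and their values are made up for illustration, since the "previously created function" is not shown here. Without `ignore_index=True`, the row labels of both inputs are carried over and repeat.

```python
import pandas as pd

# Hypothetical inputs; the exercise builds these with an earlier function.
left = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
right = pd.DataFrame({"a": [5, 6], "b": [7, 8]})

# Plain concatenation keeps each frame's own row labels.
kept = pd.concat([left, right], axis=0)

# ignore_index=True discards the old labels and renumbers the rows from 0.
renumbered = pd.concat([left, right], axis=0, ignore_index=True)

print(list(kept.index))        # [0, 1, 0, 1]
print(list(renumbered.index))  # [0, 1, 2, 3]
```

Calling `reset_index(drop=True)` on the concatenated frame would give the same row labels; passing `ignore_index=True` just does it in one step.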
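# 4. Write code to load data from cars.csv into a dataframe and print its details. Details like: 'count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max'.
df = pd.read_csv('cars.csv')
# describe() reports count, mean, std, min, 25%, 50%, 75% and max for each numeric column.
print(df.describe())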
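# 5. Write a method that takes a column name as an argument and returns the name of the column with which the given column has the highest correlation. The data to be used is the cars dataset. The returned value should not be the column name that was passed as the parameter.
def max_correlated_col(col, dataset):
    dic = {}
    # Consider every numeric column except the one passed in, whose self-correlation is always 1.
    list1 = [x for x in dataset.select_dtypes(include='number').columns if x != col]
    for i in list1:
        # Use the absolute value so strong negative correlations count as well.
        dic[i] = abs(dataset[col].corr(dataset[i]))
    # Return the key with the largest value; a bare max(dic) would compare the column names instead.
    return max(dic, key=dic.get)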
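The difference between taking the maximum over dictionary keys and over dictionary values is easy to miss, so here is a small sketch of it. The frame `toy` is a hypothetical stand-in for the cars data: the column names `mpg` and `drat` come from the task, while `wt` and all of the values are made up for illustration. It also uses `DataFrame.corr` with `idxmax` as an alternative to the explicit loop.

```python
import pandas as pd

# Hypothetical stand-in for the cars data; values are invented for illustration.
toy = pd.DataFrame({
    "mpg":  [1.0, 2.0, 3.0, 4.0, 5.0],
    "drat": [2.0, 4.0, 6.0, 8.0, 10.0],  # moves exactly in step with mpg
    "wt":   [5.0, 3.0, 4.0, 1.0, 2.0],   # weaker, negative relationship
})

# Absolute correlation of every column with 'mpg', excluding 'mpg' itself.
scores = toy.corr()["mpg"].abs().drop("mpg")

# max() over a dict iterates its keys, so this picks the alphabetically last name.
by_key = max(scores.to_dict())

# idxmax() returns the label of the largest value, which is what the task asks for.
by_value = scores.idxmax()

print(by_key)    # wt
print(by_value)  # drat
```

On real data the two results can coincide by chance, which is why the key-based version can appear to work on some columns and fail on others.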