Kaggle Practice 1: Titanic

    xiaoxiao2021-03-25  70

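    I have recently been planning to sharpen my practical skills by working through classic Kaggle cases, so today I am recording the whole process of my Titanic exercise.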

    Background:

    The Python code is as follows:

    # -*- coding: utf-8 -*-
    """
    Created on Fri Mar 10 12:00:46 2017

    @author: zch
    """
    import pandas as pd
    from sklearn.feature_extraction import DictVectorizer
    from sklearn.ensemble import RandomForestClassifier
    from xgboost import XGBClassifier
    # sklearn.cross_validation was removed in scikit-learn 0.20;
    # cross_val_score now lives in sklearn.model_selection
    from sklearn.model_selection import cross_val_score

    # Read the training and test datasets
    train = pd.read_csv('E://Python/data/Titanic/train.csv')
    test = pd.read_csv('E://Python/data/Titanic/test.csv')

    selected_features = ['Pclass', 'Sex', 'Age', 'Embarked', 'SibSp', 'Parch', 'Fare']
    # Copy the selections so the fills below do not trigger SettingWithCopyWarning
    X_train = train[selected_features].copy()
    X_test = test[selected_features].copy()
    y_train = train['Survived']

    # Fill missing Embarked values with 'S'
    X_train['Embarked'] = X_train['Embarked'].fillna('S')
    X_test['Embarked'] = X_test['Embarked'].fillna('S')

    # Fill missing Age values (and the missing Fare in the test set) with the column mean
    X_train['Age'] = X_train['Age'].fillna(X_train['Age'].mean())
    X_test['Age'] = X_test['Age'].fillna(X_test['Age'].mean())
    X_test['Fare'] = X_test['Fare'].fillna(X_test['Fare'].mean())

    # Vectorize the features with DictVectorizer
    # (orient must be 'records'; the abbreviation 'record' is rejected by recent pandas)
    dict_vec = DictVectorizer(sparse=False)
    X_train = dict_vec.fit_transform(X_train.to_dict(orient='records'))
    print(dict_vec.feature_names_)
    X_test = dict_vec.transform(X_test.to_dict(orient='records'))

    # Initialize RandomForestClassifier with the default configuration
    rfc = RandomForestClassifier()
    # Initialize XGBClassifier with the default configuration
    xgbc = XGBClassifier()

    # Evaluate rfc and xgbc on the training set with 5-fold cross-validation
    # and print the mean classification accuracy of each
    print(cross_val_score(rfc, X_train, y_train, cv=5).mean())
    print(cross_val_score(xgbc, X_train, y_train, cv=5).mean())

    # Predict with rfc
    rfc.fit(X_train, y_train)
    rfc_y_predict = rfc.predict(X_test)
    rfc_submission = pd.DataFrame({'PassengerId': test['PassengerId'], 'Survived': rfc_y_predict})
    # Save the predictions to rfc_sub.csv
    rfc_submission.to_csv('E:\\Python\\data\\Titanic\\rfc_sub.csv', index=False)

    # Predict with xgbc
    xgbc.fit(X_train, y_train)
    xgbc_y_predict = xgbc.predict(X_test)
    xgbc_submission = pd.DataFrame({'PassengerId': test['PassengerId'], 'Survived': xgbc_y_predict})
    # Save the predictions to xgbc_sub.csv
    xgbc_submission.to_csv('E:\\Python\\data\\Titanic\\xgbc_sub.csv', index=False)
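
    The vectorization step is the least obvious part of the script, so here is a minimal pure-Python sketch of the idea behind it. This is not scikit-learn's implementation: `one_hot_dicts` is a hypothetical helper written for illustration, and the two passenger records are made-up sample values. It assumes the same convention that DictVectorizer follows, namely that string-valued features become one `name=value` indicator column each, numeric features keep their value, and columns are ordered by sorted feature name.

```python
def one_hot_dicts(records):
    """Illustrative stand-in for the idea behind DictVectorizer (not the real thing).

    String values become one 0/1 column per "name=value" pair;
    numeric values are copied into a column named after the feature.
    """
    # Collect every column name that appears in any record.
    names = set()
    for rec in records:
        for key, value in rec.items():
            names.add(f"{key}={value}" if isinstance(value, str) else key)

    feature_names = sorted(names)
    index = {name: i for i, name in enumerate(feature_names)}

    # Build one dense row per record.
    rows = []
    for rec in records:
        row = [0.0] * len(feature_names)
        for key, value in rec.items():
            if isinstance(value, str):
                row[index[f"{key}={value}"]] = 1.0
            else:
                row[index[key]] = float(value)
        rows.append(row)
    return feature_names, rows


# Made-up sample records using a subset of the selected features.
passengers = [
    {"Pclass": 3, "Sex": "male", "Age": 22.0, "Embarked": "S"},
    {"Pclass": 1, "Sex": "female", "Age": 38.0, "Embarked": "C"},
]

feature_names, rows = one_hot_dicts(passengers)
print(feature_names)
# ['Age', 'Embarked=C', 'Embarked=S', 'Pclass', 'Sex=female', 'Sex=male']
for row in rows:
    print(row)
# [22.0, 0.0, 1.0, 3.0, 0.0, 1.0]
# [38.0, 1.0, 0.0, 1.0, 1.0, 0.0]
```

    This is why the script converts each DataFrame to a list of dicts first: the categorical columns Sex and Embarked arrive as strings and are expanded into indicator columns, while Pclass, Age, SibSp, Parch and Fare pass through as numbers.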