Python Programming Basics

xiaoxiao, 2021-03-25

Python code is relatively simple and easy to understand, and it can be run interactively.

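1. Basic syntax

1) Code indentation

a. Many blocks that C, C++, and Java delimit with {} are distinguished in Python strictly by indentation;

b. The hierarchy of the code is expressed by indenting lines to the same depth with spaces or tabs (use one of the two consistently within a block);

c. As the indentation gets deeper, the nesting level of the block deepens as well; code with no indentation is at the top level and is called the "main body" of the script;

d. Common situations that require indentation include branches, loops, and function definitions.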
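The points above can be seen together in a minimal sketch. The function name describe and its sample input are made up for illustration; each deeper level of indentation marks a more deeply nested block.

```python
def describe(numbers):
    # Function body: one level of indentation.
    labels = []
    for n in numbers:
        # Loop body: two levels of indentation.
        if n % 2 == 0:
            # Branch body: three levels of indentation.
            labels.append("even")
        else:
            labels.append("odd")
    return labels


# Unindented code belongs to the top level (the "main body") of the script.
result = describe([1, 2, 3])
print(result)
```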

2) Comments

Python uses the # symbol for comments: everything from # to the end of the line is ignored by the interpreter.

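2. Data types

Python has six commonly used built-in data types:

1) Numbers: integers, long integers (a separate type only in Python 2; Python 3 has a single int type of arbitrary size), floating-point numbers, and complex numbers

2) Booleans: True and False

Note: these are case-sensitive; true and false are not defined.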

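3) Strings: a sequence of characters enclosed in quotes.

a. Notation: single quotes, double quotes, or triple quotes (three consecutive single or double quotes);

b. Triple quotes: can denote a string that spans multiple lines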

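4) Tuples: an ordered sequence of Python values

a. Written with parentheses (), e.g. t = (1, 'a', 10);

b. The elements of a tuple do not have to be of the same type;

c. Access: an element can be retrieved directly by index, e.g. t[0] is 1.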

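5) Lists: functionally similar to tuples

a. Written with square brackets [ ], e.g. l = [1, 'a', 10].

b. Access: l[0] is 1

Note: Python lets you modify the elements of a list in place (e.g. l[0] = 2), whereas a tuple cannot be modified after it is created.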

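6) Dictionaries: hold multiple key: value pairs.

a. Written with curly braces { } around the key-value pairs, e.g. d = {1: '1', 'a': 0.1, 10: 40}.

b. Access: look up the value for a key, e.g. d['a'] is 0.1.

Note: keys are unique. They do not all have to be of the same type, but each key must be hashable (numbers, strings, and tuples of hashable values qualify; lists do not).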
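The six types above can be tried out in a short sketch. The values of t, l, and d are the ones from the text; the other variable names are chosen only for illustration.

```python
n = 42                  # integer
x = 3.14                # floating-point number
z = 1 + 2j              # complex number
flag = True             # Boolean (note the capital T)
s = """line one
line two"""             # a triple-quoted string may span several lines

t = (1, 'a', 10)                  # tuple
l = [1, 'a', 10]                  # list
d = {1: '1', 'a': 0.1, 10: 40}    # dictionary

first_of_tuple = t[0]   # index access works the same for tuples and lists
l[0] = 99               # a list element can be replaced in place
value_for_a = d['a']    # dictionary lookup by key

# Assigning to a tuple element raises TypeError because tuples are immutable.
try:
    t[0] = 99
    tuple_is_mutable = True
except TypeError:
    tuple_is_mutable = False
```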

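3. Operations on data

1) Arithmetic: addition (+), subtraction (-), multiplication (*), division (/), modulo (%), exponentiation (**)

2) Comparison: returns a Boolean result

3) Assignment: Python does not require a type to be declared when a variable is introduced.

4) Logical operations: and, or, not.

5) Membership: an operation provided for Python's more complex data structures, mainly tuples, lists, and dictionaries.

The in operator asks whether an element occurs in a list or tuple, or whether a key exists in a dictionary.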
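A few of these operators in action, as a small sketch with made-up operands. It assumes Python 3, where / is true division; note that in on a dictionary tests keys, not values.

```python
quotient = 7 / 2          # true division in Python 3
remainder = 7 % 2         # modulo
power = 2 ** 10           # exponentiation

is_greater = 7 > 2                    # comparison yields a Boolean
both = (7 > 2) and (2 > 7)            # logical and
either = (7 > 2) or (2 > 7)           # logical or
negated = not (7 > 2)                 # logical not

d = {1: '1', 'a': 0.1, 10: 40}
in_list = 'a' in [1, 'a', 10]         # membership in a list
in_tuple = 5 in (1, 'a', 10)          # membership in a tuple
key_in_dict = 'a' in d                # 'a' is a key of d
value_in_dict = 0.1 in d              # 0.1 is a value, not a key
```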

4. Flow control

1) Branch statements (if)

Common syntactic forms:

a. if-else:

b. if-elif-else:
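Both forms can be sketched as follows. The function names parity and sign_label are hypothetical and serve only to wrap each branch structure so it can be called.

```python
def parity(n):
    # if-else: exactly one of the two branches runs.
    if n % 2 == 0:
        return "even"
    else:
        return "odd"


def sign_label(x):
    # if-elif-else: conditions are tested in order; the first true one wins.
    if x > 0:
        return "positive"
    elif x < 0:
        return "negative"
    else:
        return "zero"
```

For example, parity(3) evaluates to "odd" and sign_label(0) to "zero".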

2) Loop control (for)

a. Common iteration syntax

for <temporary variable> in <iterable data structure (list, tuple, dictionary)>:

[tab] statement 1

[tab] ......

b. Example
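As an illustrative sketch (the data and variable names are invented here), the same for syntax walks over a list, a tuple, and a dictionary. Iterating over a dictionary yields its keys.

```python
# Loop over a list and accumulate a sum.
total = 0
for item in [1, 2, 3]:
    total += item

# Loop over a tuple and collect transformed elements.
letters = []
for ch in ('a', 'b', 'c'):
    letters.append(ch.upper())

# Loop over a dictionary: the loop variable takes each key in turn.
d = {'x': 1, 'y': 2}
pairs = []
for key in d:
    pairs.append((key, d[key]))
```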

5. Function (module) design

Python uses the def keyword to define a function/module.
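A minimal sketch of a definition with def follows. The function add and its default argument are hypothetical, chosen only to show the keyword, the parameter list, the indented body, and the return statement.

```python
def add(a, b=1):
    """Return the sum of a and b; b defaults to 1 when omitted."""
    return a + b


with_both = add(2, 3)     # both arguments supplied
with_default = add(2)     # b falls back to its default value
```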

6. Importing libraries (packages)

    # Import the exp function from the math package
    from math import exp
    # Call the exp function that came from math
    exp(2)
    # Import exp from math and rename it to ep
    from math import exp as ep
    ep(2)

7. A basic hands-on exercise

"Benign/malignant breast tumor prediction": partial Python code samples

1) Distribution of the test set data

    import pandas as pd
    import matplotlib.pyplot as plt  # plotting package

    # Call pandas' read_csv with the path of the training file and store the returned data in df_train
    df_train = pd.read_csv('E:/xgboost-project/other_example/breast_cancer/breast-cancer-train.csv')
    # Call pandas' read_csv with the path of the test file and store the returned data in df_test
    df_test = pd.read_csv('E:/xgboost-project/other_example/breast_cancer/breast-cancer-test.csv')

    # Use 'Clump Thickness' and 'Cell Size' as features and split the test set into negative and positive samples
    df_test_negative = df_test.loc[df_test['Type'] == 0][['Clump Thickness', 'Cell Size']]
    df_test_positive = df_test.loc[df_test['Type'] == 1][['Clump Thickness', 'Cell Size']]

    # Draw the scatter plot of the samples
    plt.scatter(df_test_negative['Clump Thickness'], df_test_negative['Cell Size'], marker='o', s=200, c='red')
    plt.scatter(df_test_positive['Clump Thickness'], df_test_positive['Cell Size'], marker='x', s=200, c='black')

    # Label the x and y axes
    plt.xlabel('Clump Thickness')
    plt.ylabel('Cell Size')

    # Show the figure
    plt.show()

Result: [figure: scatter plot of the test samples, Clump Thickness against Cell Size]

2) A binary classifier with random parameters

    import pandas as pd
    import matplotlib.pyplot as plt
    import numpy as np

    df_train = pd.read_csv('E:/xgboost-project/other_example/breast_cancer/breast-cancer-train.csv')
    df_test = pd.read_csv('E:/xgboost-project/other_example/breast_cancer/breast-cancer-test.csv')

    df_test_negative = df_test.loc[df_test['Type'] == 0][['Clump Thickness', 'Cell Size']]
    df_test_positive = df_test.loc[df_test['Type'] == 1][['Clump Thickness', 'Cell Size']]

    # Randomly sample the intercept and coefficients of a line with numpy's random function
    intercept = np.random.random([1])
    coef = np.random.random([2])
    lx = np.arange(0, 12)
    ly = (-intercept - lx * coef[0]) / coef[1]

    # Draw the random line
    plt.plot(lx, ly, c='yellow')

    plt.scatter(df_test_negative['Clump Thickness'], df_test_negative['Cell Size'], marker='o', s=200, c='red')
    plt.scatter(df_test_positive['Clump Thickness'], df_test_positive['Cell Size'], marker='x', s=200, c='black')
    plt.xlabel('Clump Thickness')
    plt.ylabel('Cell Size')
    plt.show()

Result: [figure: the same scatter plot with a randomly placed yellow line]
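The expression for ly comes from solving intercept + coef[0]*x + coef[1]*y = 0 for y, so the plotted line is where a linear classifier's score is exactly zero. The sketch below illustrates this in plain Python with hypothetical parameter values (not taken from the data set); the helpers boundary_y and classify are likewise invented for illustration.

```python
# Hypothetical parameters, chosen by hand for illustration only.
intercept = -6.0
coef = [1.0, 1.0]


def boundary_y(x):
    """Return the y for which intercept + coef[0]*x + coef[1]*y equals 0."""
    return (-intercept - x * coef[0]) / coef[1]


def classify(x, y):
    """Return 1 if the point has a positive linear score, otherwise 0."""
    score = intercept + coef[0] * x + coef[1] * y
    return 1 if score > 0 else 0


# Points on the line, for the same x range as np.arange(0, 12).
line_points = [(x, boundary_y(x)) for x in range(0, 12)]
```

With these values, a point such as (5, 5) has a positive score and is labeled 1, while (1, 1) has a negative score and is labeled 0. A randomly sampled line will generally separate the two classes poorly, which is why the parameters are learned from data in the next step.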

3) Learning the line's coefficients and intercept from the first 10 training samples

    import pandas as pd
    import matplotlib.pyplot as plt
    import numpy as np
    from sklearn.linear_model import LogisticRegression  # scikit-learn's logistic regression classifier

    df_train = pd.read_csv('E:/xgboost-project/other_example/breast_cancer/breast-cancer-train.csv')
    df_test = pd.read_csv('E:/xgboost-project/other_example/breast_cancer/breast-cancer-test.csv')

    df_test_negative = df_test.loc[df_test['Type'] == 0][['Clump Thickness', 'Cell Size']]
    df_test_positive = df_test.loc[df_test['Type'] == 1][['Clump Thickness', 'Cell Size']]

    intercept = np.random.random([1])
    coef = np.random.random([2])
    lx = np.arange(0, 12)
    ly = (-intercept - lx * coef[0]) / coef[1]

    # plt.plot(lx, ly, c='yellow')
    # plt.scatter(df_test_negative['Clump Thickness'], df_test_negative['Cell Size'], marker='o', s=200, c='red')
    # plt.scatter(df_test_positive['Clump Thickness'], df_test_positive['Cell Size'], marker='x', s=200, c='black')
    # plt.xlabel('Clump Thickness')
    # plt.ylabel('Cell Size')
    # plt.show()

    lr = LogisticRegression()
    # Learn the line's coefficients and intercept from the first 10 training samples
    lr.fit(df_train[['Clump Thickness', 'Cell Size']][:10], df_train['Type'][:10])
    # print is a function in Python 3
    print('testing accuracy (10 training samples):', lr.score(df_test[['Clump Thickness', 'Cell Size']], df_test['Type']))

Result: [printed test accuracy; output not preserved]
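For a scikit-learn classifier, score reports the mean accuracy, that is, the fraction of test samples whose predicted label matches the true label. The plain-Python sketch below shows that computation on made-up labels; the function accuracy is a hypothetical helper, not part of scikit-learn.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of positions where the two label lists agree."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("label lists must not be empty")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up labels: three of the four predictions match.
example_score = accuracy([0, 1, 1, 0], [0, 1, 0, 0])
```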

When reposting, please cite the original address: https://ju.6miu.com/read-39563.html
