Python MySQL 实作

    xiaoxiao2021-03-25  85

    这篇文章实作了一个【对一个文本数据文件分析提取有效数据,并且存储在MySQL里,其中涉及数据库创建,表创建,插入数据(浮点数)】的小例子。

    环境 虚拟机Ubuntu Python :2.7.12 MySQL :Server version: 5.7.17-0ubuntu0.16.04.1 (Ubuntu) 数据 数据以【^】分隔,并且用【~~】包裹,可能为空。点我下载 #前十行 ~01001~^~0100~^~Butter, salted~^~BUTTER,WITH SALT~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87 ~01002~^~0100~^~Butter, whipped, with salt~^~BUTTER,WHIPPED,W/ SALT~^~~^~~^~Y~^~~^0^~~^6.38^^^ ~01003~^~0100~^~Butter oil, anhydrous~^~BUTTER OIL,ANHYDROUS~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87 ~01004~^~0100~^~Cheese, blue~^~CHEESE,BLUE~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87 ~01005~^~0100~^~Cheese, brick~^~CHEESE,BRICK~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87 ~01006~^~0100~^~Cheese, brie~^~CHEESE,BRIE~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87 ~01007~^~0100~^~Cheese, camembert~^~CHEESE,CAMEMBERT~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87 ~01008~^~0100~^~Cheese, caraway~^~CHEESE,CARAWAY~^~~^~~^~~^~~^0^~~^6.38^4.27^8.79^3.87 ~01009~^~0100~^~Cheese, cheddar~^~CHEESE,CHEDDAR~^~~^~~^~Y~^~~^0^~~^^^^ ~01010~^~0100~^~Cheese, cheshire~^~CHEESE,CHESHIRE~^~~^~~^~~^~~^0^~~^6.38^4.27^8.79^3.87 获取有效数据 def convert(value): if value.startswith('~'): if value=='~~': return 0.0 else: return value.strip('~') elif not value: return 0.0 return float(value) def func(): with open('FOOD-DESC.txt', 'r') as file: for line in file.xreadlines(): line = line.strip('\n') fields = line.split('^') vals = [convert(f) for f in fields] if vals[13] == 0.0: continue else: vals[6] = 'N' if vals[6]=='0.0' else 'Y' temp = [] temp.extend(vals[:3]) temp.extend(vals[6]) temp.extend(vals[10:]) yield temp #这里是一个迭代器 代码 #!/usr/bin/env python # -*- coding:utf_8 -*- import MySQLdb def convert(value): if value.startswith('~'): if value=='~~': return 0.0 else: return value.strip('~') elif not value: return 0.0 return float(value) def func(): with open('FOOD-DESC.txt', 'r') as file: for line in file.xreadlines(): line = line.strip('\n') fields = line.split('^') vals = [convert(f) for f in fields] if vals[13] == 0.0: continue else: vals[6] = 'N' if vals[6]=='0.0' else 'Y' temp = [] temp.extend(vals[:3]) temp.extend(vals[6]) temp.extend(vals[10:]) yield temp try: print 'try to connect database...' conn = MySQLdb.connect( host='localhost', user='root', passwd='0906', port=3306) cur = conn.cursor() print 'try to create database...' cur.execute('create database if not exists mysql_test') conn.select_db('mysql_test') print 'try to create table...' cur.execute(""" create table food_desc ( id varchar(255) primary key, mark varchar(255), name varchar(255), isexists varchar(5), num1 float(10.2), num2 float(10.2), num3 float(10.2), num4 float(10.2) ) """) print 'try to insert values...' stmt = 'insert into food_desc values (%s,%s,%s,%s,%s,%s,%s,%s)' cur.executemany(stmt, [val for val in func()] ) conn.commit() except MySQLdb.Error, err: print err else: print 'Well Done.' finally: cur.close() conn.close() 运行结果: try to connect database... try to create database... try to create table... try to insert values... Well Done. [Finished in 1.9s] 数据库中的数据 mysql> use mysql_test; Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Database changed mysql> desc food_desc; +----------+--------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +----------+--------------+------+-----+---------+-------+ | id | varchar(255) | NO | PRI | NULL | | | mark | varchar(255) | YES | | NULL | | | name | varchar(255) | YES | | NULL | | | isexists | varchar(5) | YES | | NULL | | | num1 | float | YES | | NULL | | | num2 | float | YES | | NULL | | | num3 | float | YES | | NULL | | | num4 | float | YES | | NULL | | +----------+--------------+------+-----+---------+-------+ 8 rows in set (0.00 sec) mysql> select * from food_desc limit 10; +-------+------+-----------------------------------------------+----------+------+------+------+------+ | id | mark | name | isexists | num1 | num2 | num3 | num4 | +-------+------+-----------------------------------------------+----------+------+------+------+------+ | 01001 | 0100 | Butter, salted | Y | 6.38 | 4.27 | 8.79 | 3.87 | | 01003 | 0100 | Butter oil, anhydrous | Y | 6.38 | 4.27 | 8.79 | 3.87 | | 01004 | 0100 | Cheese, blue | Y | 6.38 | 4.27 | 8.79 | 3.87 | | 01005 | 0100 | Cheese, brick | Y | 6.38 | 4.27 | 8.79 | 3.87 | | 01006 | 0100 | Cheese, brie | Y | 6.38 | 4.27 | 8.79 | 3.87 | | 01007 | 0100 | Cheese, camembert | Y | 6.38 | 4.27 | 8.79 | 3.87 | | 01008 | 0100 | Cheese, caraway | Y | 6.38 | 4.27 | 8.79 | 3.87 | | 01010 | 0100 | Cheese, cheshire | Y | 6.38 | 4.27 | 8.79 | 3.87 | | 01011 | 0100 | Cheese, colby | Y | 6.38 | 4.27 | 8.79 | 3.87 | | 01012 | 0100 | Cheese, cottage, creamed, large or small curd | Y | 6.38 | 4.27 | 8.79 | 3.87 | +-------+------+-----------------------------------------------+----------+------+------+------+------+ 10 rows in set (0.00 sec) 数据分析 mysql> select avg(num1),avg(num2),avg(num3),avg(num4) from food_desc; +-------------------+-------------------+-------------------+--------------------+ | avg(num1) | avg(num2) | avg(num3) | avg(num4) | +-------------------+-------------------+-------------------+--------------------+ | 5.971495638143548 | 3.685189947381811 | 8.658753387337168 | 3.8507903562883103 | +-------------------+-------------------+-------------------+--------------------+ 1 row in set (0.05 sec) 人生苦短,我用Python stmt = 'insert into food_desc values (%s,%s,%s,%s,%s,%s,%s,%s)' cur.executemany(stmt, [val for val in func()] ) #语法就是一个批量插入,简洁,对比Java的字符串拼接而言,简直了

    建议: 不要拼接字符串,那样会遇到【特殊字符】很难处理 批量插入,这样更高效快速

    转载请注明原文地址: https://ju.6miu.com/read-15452.html

    最新回复(0)