Overview

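I have recently been studying machine learning, where Python is generally the recommended language. Compared with Java and other languages, Python makes data processing easier: you can import third-party libraries to handle matrices, other advanced mathematics, and data structures, and it is also very convenient for visualizing program data.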

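This article only lists simple usage and basic knowledge of Python. For more advanced material, please turn to Baidu, the official website (if your English is good), blogs, and so on. Out of personal interest, I am learning Python as a powerful scripting tool. If any Python experts pass by, please bear with me.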

Learning roadmap

First, I put together a mind map for learning Python. It may not cover everything, but it is organized around the topics I already know from Java. It is as follows:

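The sample file kNN.py touches on most of these topics: comments, variable definitions, data types, loops, conditionals, functions, package management, collections, third-party libraries, file operations, and so on.

Its contents are as follows: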

'''
Created on Sep 16, 2010
kNN: k Nearest Neighbors

Input:      inX: vector to compare to existing dataset (1xN)
            dataSet: size m data set of known vectors (NxM)
            labels: data set labels (1xM vector)
            k: number of neighbors to use for comparison (should be an odd number)
            
Output:     the most popular class label

@author: pbharrin
'''
from numpy import *
import operator
from os import listdir

def createDataSet():
    group = array([[1.0,1.1],[1.0,1.0],[0,0],[0,0.1]])
    labels = ['A','A','B','B']
    return group, labels

def file2matrix(filename):
    with open(filename) as fr:                  #read the file once; it is closed on exit
        lines = fr.readlines()
    numberOfLines = len(lines)                  #get the number of lines in the file
    returnMat = zeros((numberOfLines,3))        #prepare matrix to return
    classLabelVector = []                       #prepare labels return
    for index, line in enumerate(lines):
        listFromLine = line.strip().split('\t') #tab-separated fields
        returnMat[index,:] = listFromLine[0:3]  #first three fields are features
        classLabelVector.append(int(listFromLine[-1]))  #last field is the label
    return returnMat,classLabelVector

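Details:

OK, now that we have the mind map, let's get into the actual learning. As for installing Python, just search Baidu; installing software is no big deal, haha.

Python comments:

Comments come in two kinds: single-line and multi-line.

Let's look at a single-line comment first:

# This is a single-line comment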

print("Hello, World!")

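Next come multi-line comments, which can be written in two ways: with triple single quotes or with triple double quotes. Strictly speaking these are string literals that are evaluated and discarded, but they are commonly used as comments.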

#!/usr/bin/python3 
'''
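This is a multi-line comment using three single quotes
This is a multi-line comment using three single quotes
This is a multi-line comment using three single quotes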
'''

#!/usr/bin/python3 
"""
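This is a multi-line comment using three double quotes
This is a multi-line comment using three double quotes
This is a multi-line comment using three double quotes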
"""
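A triple-quoted string placed as the first statement of a function is kept as that function's docstring, which is one reason the form is so common. Below is a minimal sketch, assuming Python 3; the function name greet is made up for illustration.

```python
def greet(name):
    """Return a greeting for name.

    A triple-quoted string in this position is stored as the
    function's docstring instead of being thrown away.
    """
    # A single-line comment: ignored entirely by the interpreter.
    return "Hello, " + name + "!"


print(greet("World"))
print(greet.__doc__.splitlines()[0])   # first line of the docstring
```

Tools such as help() read the same __doc__ attribute, so a docstring doubles as documentation.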

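Variable definitions

With comments covered, let's look at how variables are defined, that is, how identifiers should be named.

Coming from Java, a name made up of lowercase and uppercase letters is generally a safe bet. Still, let's look at the actual syntax.

Identifier rules:

The first character must be a letter or an underscore _ (in my experience a leading underscore is not used much).

The rest of the identifier consists of letters, digits, and underscores.

Identifiers are case-sensitive.

Also watch out for Python's reserved words. The list below comes from Python 2; in Python 3, exec and print are no longer keywords, and others such as nonlocal have been added: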

>>> import keyword
>>> keyword.kwlist

['and', 'as', 'assert', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'exec', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'not', 'or', 'pass', 'print', 'raise', 'return', 'try', 'while', 'with', 'yield']
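The two checks above (spelling rules and reserved words) can be tried out directly. This is a small sketch assuming Python 3, where str.isidentifier() and keyword.iskeyword() are available; the candidate names are invented for the example.

```python
import keyword

candidates = ["total", "_count", "2nd_item", "class", "item_one"]

for name in candidates:
    # isidentifier() checks the spelling rules; iskeyword() checks
    # whether the name is reserved by the language.
    usable = name.isidentifier() and not keyword.iskeyword(name)
    print(name, "->", "ok" if usable else "not allowed")

usable_names = [n for n in candidates
                if n.isidentifier() and not keyword.iskeyword(n)]
```

Here "2nd_item" fails because it starts with a digit, and "class" fails because it is a keyword.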

Source file syntax

Python code is structured by indentation, and the indentation must line up exactly:

if True:
    print ("True")
else:
    print ("False")

If the indentation does not line up, a syntax error is reported, for example:

if True:
    print ("Answer")
    print ("True")
else:
    print ("Answer")
  print ("False")    # inconsistent indentation: raises IndentationError

If a single statement needs to be spread over several lines, a backslash is generally used to continue the line:

total = item_one + \
        item_two + \
        item_three

Content inside [], {}, or () does not need a backslash, as follows:

total = ['item_one', 'item_two', 'item_three',
        'item_four', 'item_five']
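Because brackets already allow a statement to span lines, wrapping a long expression in parentheses is a common alternative to backslashes. A small sketch, with made-up values for the three variables:

```python
item_one, item_two, item_three = 1, 2, 3

# Parentheses let the expression continue across lines without backslashes.
total = (item_one +
         item_two +
         item_three)

print(total)
```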

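Input and output

Here we start with the simple print() for output; other kinds of formatted output, and output built from expressions, are covered later.

Straight to the code. The session below is from Python 2, where print is a statement, which is why print('1', '2') echoes a tuple; in Python 3 it would print 1 2: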

>>> print 'HelloWorld'
HelloWorld
>>> print 'hello','world'
hello world
>>> print 1000
1000
>>> print 100 + 3000
3100
>>> print(1)
1
>>> print('1', '2')
('1', '2')
>>> print('1', '2', '3')
('1', '2', '3')
>>> x = 1
>>> print(x)
1
>>> x = 7 
>>> y = '7'
>>> print("the x is %d, the y is %s " % (x, y))
the x is 7, the y is 7 
>>>

Formatted output looks like print("the x is %d, the y is %s " % (x, y))

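The other conversion types are roughly as follows:

Conversion   Meaning
d, i         signed decimal integer
o            octal integer
u            obsolete; identical to d
x            hexadecimal integer (lowercase)
X            hexadecimal integer (uppercase)
e            floating point in exponent notation (lowercase)
E            floating point in exponent notation (uppercase)
f, F         floating point in decimal notation
g            same as e if the exponent is less than -4 or not less than the precision, otherwise same as f
G            same as E if the exponent is less than -4 or not less than the precision, otherwise same as F
c            single character (accepts an integer or a one-character string)
r            string (converts any Python object using repr)
s            string (converts any Python object using str)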
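To see several of these conversions side by side, here is a short sketch using the % operator. It assumes Python 3, and the sample values are arbitrary.

```python
x = 255
pi = 3.14159265

decimal_text = "%d" % x               # signed decimal
octal_text = "%o" % x                 # octal
hex_text = "%x / %X" % (x, x)         # hexadecimal, lower and upper case
float_text = "%.2f" % pi              # fixed-point with two decimals
sci_text = "%e" % 12345.678           # exponent notation
char_text = "%c%c" % (80, "y")        # integer code point or 1-char string
repr_text = "%r vs %s" % ("7", "7")   # repr() keeps the quotes, str() does not

for text in (decimal_text, octal_text, hex_text, float_text,
             sci_text, char_text, repr_text):
    print(text)
```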

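Data types

In Python, data types fall roughly into five categories:

Number

String

List

Tuple

Dictionary

Lists, tuples, and dictionaries are collection types.

Note that Python does not have Java's notion of an array; lists usually fill that role.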

Let's look at numbers first:

>>> var1 = 1
>>> var2 = 2
>>> var3 = 3   # var1, var2, var3 are all ints
>>> var1
1
>>> var2
2
>>> del var3   # delete the name var3
>>> var3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'var3' is not defined
>>> var3 = 1.2   # float 
>>> var3
1.2
>>> var3 = 1.20123 ## float
>>> var3  
1.20123
>>> var4=5+6j    ## complex number
>>> var4
(5+6j)
>>>

Overall, Python's numeric types are int, long (Python 2 only; Python 3 has a single int type), float, and complex (x + yj, where x and y are both real numbers).
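The numeric types can be inspected with type(). The sketch below assumes Python 3, where there is no separate long type and an int simply grows as large as needed.

```python
var1 = 1          # int
var3 = 1.20123    # float
var4 = 5 + 6j     # complex

print(type(var1).__name__, type(var3).__name__, type(var4).__name__)
print(var4.real, var4.imag)     # real and imaginary parts are floats

big = 2 ** 100                  # still an int in Python 3
print(big)
```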

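String type

A string is made up of characters such as digits, letters, underscores, and other special characters. In the Python 2 session below, a non-ASCII string is echoed as its UTF-8 bytes: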

>>> x='awlalla@@##ad;a[a]a'
>>> x
'awlalla@@##ad;a[a]a'
>>> x='草泥马'
>>> x
'\xe8\x8d\x89\xe6\xb3\xa5\xe9\xa9\xac'
>>>

Indexing can count from 0 at the front of the string or from -1 at the back:

>>> x='awlalla@@##ad;a[a]a'
>>> x[1]
'w'
>>> x[0]
'a'
>>> x[-1]
'a'
>>> 
>>> x[-2]
']'
>>> x[-3]
'a'
>>> x[-100]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
>>> x[100]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
>>>

Note that an index must stay within the length of the string; otherwise an IndexError is raised.
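Slices behave differently from single indexes at the boundaries, which is easy to miss. The following sketch reuses the same string and assumes Python 3.

```python
x = 'awlalla@@##ad;a[a]a'

first = x[0]         # indexing from the front starts at 0
last = x[-1]         # indexing from the back starts at -1
middle = x[1:4]      # a slice stops before its end index

# Slices quietly clip out-of-range bounds; single indexes do not.
clipped = x[10:100]

try:
    x[100]
except IndexError as err:
    message = str(err)

print(first, last, middle, clipped, message)
```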

List

>>> list = ['apple', 'jack', 12, 12.5, 12 - 5j]    # list 1
>>> list
['apple', 'jack', 12, 12.5, (12-5j)]
>>> otherlist=[12, 13]                             # list 2
>>> list
['apple', 'jack', 12, 12.5, (12-5j)]
>>> list[0]       # first element
'apple'
>>> list[0:0]   # from index 0 up to, but not including, index 0: empty
[]
>>> list[0:1]   # elements from index 0 up to, but not including, 1
['apple']
>>> list[0:2]   # elements from index 0 up to, but not including, 2
['apple', 'jack']
>>> list[1:3]   # elements from index 1 up to, but not including, 3
['jack', 12]
>>> list[:3]     # from the start up to, but not including, 3
['apple', 'jack', 12]
>>> list[:]    # all elements
['apple', 'jack', 12, 12.5, (12-5j)]
>>> list[0:]          # from index 0 to the end
['apple', 'jack', 12, 12.5, (12-5j)]
>>> list*2             # repeat the elements, appended in order
['apple', 'jack', 12, 12.5, (12-5j), 'apple', 'jack', 12, 12.5, (12-5j)]
>>> list + otherlist   # concatenate two lists into a new list
['apple', 'jack', 12, 12.5, (12-5j), 12, 13]
>>> list.append(otherlist)   # add otherlist to list as a single element
>>> list
['apple', 'jack', 12, 12.5, (12-5j), [12, 13]]
>>> list.extend(otherlist)  # add each element of otherlist to list in place
>>> list
['apple', 'jack', 12, 12.5, (12-5j), [12, 13], 12, 13]
>>>
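The difference between +, append, and extend is worth isolating. In this sketch the variable is called items rather than list, so the built-in list type is not shadowed; the values are illustrative.

```python
items = ['apple', 'jack', 12]
other = [12, 13]

combined = items + other      # builds a new list; items is unchanged
items.append(other)           # other becomes one nested element
length_after_append = len(items)
items.pop()                   # remove the nested list again
items.extend(other)           # each element of other is added in place

print(combined)
print(length_after_append)
print(items)
```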

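Tuple

Tuple operations are similar to list operations, and built-in functions such as min, max, cmp (Python 2 only), and len work on them. Unlike a list, a tuple cannot be modified after it is created.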

>>> tup1=(1,1,'234','dlaldlal', ';d;ad;a')
>>> tup1
(1, 1, '234', 'dlaldlal', ';d;ad;a')
>>> tup1(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object is not callable
>>> tup1[1]
1
>>> tup1[0]
1
>>> tup1[0:]
(1, 1, '234', 'dlaldlal', ';d;ad;a')
>>> tup1[2:]
('234', 'dlaldlal', ';d;ad;a')
>>> tup1[1:]
(1, '234', 'dlaldlal', ';d;ad;a')
>>> tup1[1:3]
(1, '234')
>>> tup2=(1,2)
>>> tup1 + tup2
(1, 1, '234', 'dlaldlal', ';d;ad;a', 1, 2)
>>>
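What sets a tuple apart from a list is that it cannot be changed once created. A brief sketch, assuming Python 3:

```python
tup1 = (1, 1, '234', 'dlaldlal')

try:
    tup1[0] = 99              # tuples do not support item assignment
except TypeError as err:
    error_text = str(err)

single = (1,)                 # a one-element tuple needs the trailing comma
a, b = (1, 2)                 # unpacking assigns each element to a name
count, smallest = len(tup1), min((3, 1, 2))

print(error_text, single, a, b, count, smallest)
```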

Dictionary

A dictionary is similar to a Map in Java:

Let's go straight to an example.

>>> dict = {"a" : 1, "b" : "b", 1 : 2}
>>> dict[1]  # look up an element
2
>>> dict[2.3] = 2  # add an element
>>> dict
{'a': 1, 1: 2, 'b': 'b', 2.3: 2}
>>>
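A few more everyday dictionary operations are sketched below. The variable is named scores (a made-up name) so the built-in dict is not shadowed.

```python
scores = {"a": 1, "b": "b", 1: 2}

scores[2.3] = 2                       # add a new key
missing = scores.get("zzz", "n/a")    # get() avoids a KeyError for absent keys
has_a = "a" in scores                 # membership tests look at the keys

for key, value in scores.items():     # iterate over key/value pairs
    print(key, "->", value)

del scores["b"]                       # remove a key
```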

Conditionals and loops

1. Conditionals

Syntax (the thing most often forgotten is the colon after the if condition):

if condition1:
    true1_expressions
elif condition2:
    true2_expressions
elif condition3:
    true3_expressions
elif ...
    ...
else:
    else_expressions
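A concrete instance of the template may help. The function name grade and the thresholds are invented for this sketch.

```python
def grade(score):
    # Branches are tested top to bottom; the first true one wins.
    if score >= 90:
        return "A"
    elif score >= 60:
        return "B"
    else:
        return "C"


for s in (95, 75, 40):
    print(s, grade(s))
```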

2. for loops

Straight to the examples:

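for letter in 'Python':     # first example: iterate over a string
    print('Current letter:', letter)

fruits = ['banana', 'apple',  'mango']
for fruit in fruits:        # second example: iterate over a list
    print('Current fruit:', fruit)

fruits = ['banana', 'apple',  'mango']
for index in range(len(fruits)):    # third example: iterate by index
    print('Current fruit:', fruits[index])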

3. while loops:

count = 0
while (count < 3):
    print('Repeat once more')
    count = count + 1
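Two loop idioms that come up constantly are enumerate(), for when both the index and the element are needed, and break, for leaving a loop early. A minimal sketch, assuming Python 3:

```python
fruits = ['banana', 'apple', 'mango']

# enumerate() yields the index and the element together.
indexed = []
for index, fruit in enumerate(fruits):
    indexed.append((index, fruit))

# A while loop that exits through break instead of its condition.
count = 0
while True:
    count += 1
    if count >= 3:
        break

print(indexed, count)
```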

Function definitions:

Here are two basic forms:

func(name, arg1, arg2)                      # positional arguments

func(name, key1=value1, key2=value2)        # keyword arguments
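The two forms can be combined in one definition using the def keyword. The function describe and its parameters below are hypothetical, chosen only to show positional arguments next to keyword arguments with defaults.

```python
def describe(name, age, city="unknown", job="unknown"):
    """Build a one-line description; city and job have default values."""
    return "%s, %d, %s, %s" % (name, age, city, job)


positional = describe("Tom", 30)                    # defaults fill the rest
keyword_call = describe("Tom", 30, job="engineer")  # skip city by naming job
mixed = describe("Tom", 30, "Paris", job="chef")    # positional, then keyword

print(positional)
print(keyword_call)
print(mixed)
```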

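Package management:

In Python, packages and modules can be treated as roughly the same concept at this level, although they do differ: a module is a single .py file, while a package is a directory that groups modules. I have not dug any deeper for now.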
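The common import forms are sketched below using only standard-library modules; the from os import listdir line is the same one kNN.py uses. The alias osp is an arbitrary choice for the example.

```python
import math                      # import a whole module
from os import listdir           # import a single name from a module
import os.path as osp            # import under a shorter alias

root = math.sqrt(16)
entries = listdir(".")           # names in the current directory
joined = osp.join("data", "demo.txt")

print(root, len(entries), joined)
```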

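Reading files

In machine learning, reading files is something you do all the time, so let's look at file reading in Python.

1. Getting a file object.

In Python, calling open(fileName) returns a file object.

open(filename, mode)

The mode argument selects how the file is read or written. The most common modes are "r" (read, the default), "w" (write, truncating any existing content), "a" (append), and "r+" (read and write); adding "b" opens the file in binary mode.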

2. A simple file write:

f = open("/home/demo.txt", "w")    # "w" overwrites any existing content

f.write("hello.world...\n")

f.close()

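Note that if the /home directory does not exist, an error is raised. Otherwise, given write permission, no error is raised whether or not demo.txt already exists.

3. A simple file read:

f = open("/home/demo.txt", "r")    # "r" opens the file for reading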

f.readlines()

['hello,world!!!hello,worldcat demo.txt !\n', 'hello,worldcat demo.txt !\n']
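In current Python the with statement is the usual way to make sure a file is closed. The sketch below writes, appends, and reads back; it uses a temporary directory (an assumption made so that it runs without depending on /home being writable).

```python
import os
import tempfile

# A throwaway path so the sketch does not touch /home.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "w") as f:      # "w" creates or truncates the file
    f.write("hello, world\n")

with open(path, "a") as f:      # "a" appends to the end
    f.write("second line\n")

with open(path, "r") as f:      # the file is closed when the block exits
    lines = f.readlines()

print(lines)
```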

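Collection operations

That is, operations on tuples, lists, and dictionaries, most of which were already covered in the data types section.

Class definitions

Straight to the Python code; here is an example: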

class people:
    # basic attributes
    name = ''
    age = 0
    # private attribute: cannot be accessed directly from outside the class
    __weight = 0
    # constructor
    def __init__(self,n,a,w):
        self.name = n
        self.age = a
        self.__weight = w
    def speak(self):
        print("%s says: I am %d years old." %(self.name,self.age))

# instantiate the class
p = people('runoob',10,30)
p.speak()
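Attributes whose names start with two underscores are hidden by name mangling rather than by true access control. The sketch below uses a hypothetical Person class (not the one above) to show what that means in practice.

```python
class Person:
    def __init__(self, name, weight):
        self.name = name
        self.__weight = weight      # stored as _Person__weight

    def weight(self):
        return self.__weight        # accessible inside the class


p = Person('runoob', 30)

try:
    p.__weight                      # no mangling happens outside the class
except AttributeError:
    direct_access = False
else:
    direct_access = True

mangled = p._Person__weight         # works, but bypasses the convention

print(direct_access, mangled, p.weight())
```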

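As for more advanced topics such as functional programming and third-party libraries, I am a beginner myself and do not understand them well. As I personally understand functional programming, every function call returns an object, so calls can be chained one after another, something like that. (Strictly speaking, that describes method chaining; functional programming is more about passing functions around as values.) If you have a better explanation, please leave a comment. Thank you.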
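One concrete way into the functional style is to treat functions as ordinary values that can be passed to other functions. This is only a small sketch, assuming Python 3; apply_twice is a made-up helper, while map, filter, and sorted are built-ins.

```python
def apply_twice(func, value):
    """Call func on value, then call it again on the result."""
    return func(func(value))


numbers = [3, 1, 4, 1, 5]
doubled = list(map(lambda n: n * 2, numbers))        # transform each element
odds = list(filter(lambda n: n % 2 == 1, numbers))   # keep matching elements
by_length = sorted(["pear", "fig", "banana"], key=len)
result = apply_twice(lambda n: n + 10, 1)

print(doubled, odds, by_length, result)
```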

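Third-party libraries, such as math libraries, plotting libraries, and Python crawler frameworks, are wrappers built on top of the basics. Look up the details when you need them.

Having read this far and tried things out by hand, you can consider yourself started with Python. What remains is practice: implementing common sorting algorithms and common data structures makes good exercise. I hope we can all grow together.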
