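I. Preface

In artificial intelligence fields such as machine learning and deep learning, Python is usually the first choice for development: algorithms read clearly and naturally, and the data analysis libraries are comprehensive.

Likewise, for the fundamentals of data structures and algorithms, starting with Python is a good way to master the basic algorithms.

I previously went through C -> C++ -> Java, and the experience was not good. The heavyweight syntax gets in the way and only makes things harder to understand. Python, by contrast, reads almost like pseudocode.

Reference:
Grokking Algorithms (Chinese edition: 《图解算法》)

Its shortcomings:
The code is written in Python 2
Some of the algorithms are not rigorous enough

This article:
Uses Python 3
Fixes bugs in the algorithms' edge cases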
II. Contents

1. Binary Search

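def binary_search(ordered_array, goal):
    """Return the index of goal in ordered_array, or None if it is absent."""
    low = 0
    high = len(ordered_array) - 1
    while low <= high:
        # Integer division avoids the float round trip of int((low + high) / 2)
        mid = (low + high) // 2
        guess = ordered_array[mid]
        if guess == goal:
            return mid
        if guess < goal:
            # The goal can only be to the right of mid
            low = mid + 1
        else:
            # The goal can only be to the left of mid
            high = mid - 1
    return None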


if __name__ == '__main__':
    print(binary_search([1, 2, 3, 4, 5], 2))
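
For comparison, Python's standard library already ships the same halving search in the `bisect` module. The sketch below wraps `bisect.bisect_left` so that it returns the index of the match or `None`; the helper name `binary_search_bisect` is an assumption made for this example and is not part of the original code.

```python
import bisect


def binary_search_bisect(ordered_array, goal):
    """Return the index of goal in a sorted list, or None if it is absent."""
    # bisect_left returns the leftmost position where goal could be inserted
    # while keeping the list sorted
    i = bisect.bisect_left(ordered_array, goal)
    if i < len(ordered_array) and ordered_array[i] == goal:
        return i
    return None
```

Calling `binary_search_bisect([1, 2, 3, 4, 5], 2)` returns 1, and a value that is not in the list gives `None`.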


2. Selection Sort

def find_smallest_index(arr):
    smallest = arr[0]
    smallest_index = 0
    for i in range(1, len(arr)):
        if arr[i] < smallest:
            smallest = arr[i]
            smallest_index = i
    return smallest_index


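def selection_sort(arr):
    # Work on a copy so the caller's list is not emptied by pop()
    remaining = list(arr)
    sorted_arr = []
    for _ in range(len(remaining)):
        smallest_index = find_smallest_index(remaining)
        sorted_arr.append(remaining.pop(smallest_index))
    return sorted_arr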


if __name__ == '__main__':
    print(selection_sort([3, 5, 4, 0, 7, 3, 8, 7, 8]))
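
The version above builds a second list. Selection sort is often written in place instead, swapping the smallest remaining element into position. The following is a minimal sketch of that variant; the name `selection_sort_in_place` is an assumption for this example.

```python
def selection_sort_in_place(arr):
    """Sort arr in place in ascending order and return it."""
    n = len(arr)
    for i in range(n - 1):
        # Find the index of the smallest element in arr[i:]
        smallest_index = i
        for j in range(i + 1, n):
            if arr[j] < arr[smallest_index]:
                smallest_index = j
        # Move it to the front of the unsorted part
        arr[i], arr[smallest_index] = arr[smallest_index], arr[i]
    return arr
```

Both variants do roughly n * n / 2 comparisons; the in-place one only saves the extra list.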


3. Recursion

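# Factorial
def fact(x):
    # x <= 1 also covers fact(0); testing x == 1 alone would recurse forever on 0
    if x <= 1:
        return 1
    else:
        return x * fact(x - 1)


# Sum (named recursive_sum so it does not shadow the built-in sum)
def recursive_sum(arr):
    if not arr:
        return 0
    return arr[0] + recursive_sum(arr[1:])


# Maximum (named recursive_max so it does not shadow the built-in max)
def recursive_max(arr):
    if not arr:
        raise ValueError("recursive_max() of an empty list")
    if len(arr) == 1:
        return arr[0]
    sub_max = recursive_max(arr[1:])
    return arr[0] if arr[0] > sub_max else sub_max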
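
The functions above all share one shape: a base case that stops the recursion and a recursive case that shrinks the input. As one more illustration of that shape, here is a sketch of binary search written recursively. The name `binary_search_recursive` and its `low`/`high` parameters are assumptions for this example.

```python
def binary_search_recursive(ordered_array, goal, low=0, high=None):
    """Return the index of goal in a sorted list, or None if it is absent."""
    if high is None:
        high = len(ordered_array) - 1
    # Base case: the search range is empty
    if low > high:
        return None
    mid = (low + high) // 2
    # Base case: found it
    if ordered_array[mid] == goal:
        return mid
    # Recursive case: keep only the half that can still contain goal
    if ordered_array[mid] < goal:
        return binary_search_recursive(ordered_array, goal, mid + 1, high)
    return binary_search_recursive(ordered_array, goal, low, mid - 1)
```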


4. Quick Sort

def quick_sort(arr):
    if len(arr) < 2:
        return arr
    else:
        pivot = arr[0]
        less = [i for i in arr[1:] if i <= pivot]
        greater = [i for i in arr[1:] if i > pivot]
        return quick_sort(less) + [pivot] + quick_sort(greater)


if __name__ == '__main__':
    print(quick_sort([3, 5, 4, 0, 7, 3, 8, 7, 8]))
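
Always taking the first element as the pivot has a known weak spot: on an already sorted list every split is lopsided, so the running time degrades to O(n^2) and the recursion goes about n levels deep, which can exceed CPython's default recursion limit for lists of around a thousand items. A common remedy is to pick the pivot at random. The sketch below does that and also groups elements equal to the pivot, so lists full of duplicates stay shallow; the name `quick_sort_random` is an assumption for this example.

```python
import random


def quick_sort_random(arr):
    """Return a new sorted list, choosing each pivot at random."""
    if len(arr) < 2:
        return list(arr)
    pivot = random.choice(arr)
    # Three-way partition: elements equal to the pivot need no further sorting
    less = [x for x in arr if x < pivot]
    equal = [x for x in arr if x == pivot]
    greater = [x for x in arr if x > pivot]
    return quick_sort_random(less) + equal + quick_sort_random(greater)
```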


5. Merge Sort

if __name__ == '__main__':
    pass
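
A minimal sketch of merge sort for this section, assuming the usual top-down formulation: split the list in half, sort each half recursively, then merge the two sorted halves. The helper name `merge` is an assumption for this example, not something taken from the original.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Using <= keeps equal elements in their original order (stable sort)
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these still has elements left
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(arr):
    """Return a new sorted list."""
    if len(arr) < 2:
        return list(arr)
    mid = len(arr) // 2
    return merge(merge_sort(arr[:mid]), merge_sort(arr[mid:]))
```

Unlike quick sort, the split is always down the middle, so the running time is O(n log n) regardless of the input order.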

6. Hash Table

book = {}
phone = dict()

book["think in java"] = "Java 编程思想"  # the book's Chinese title
phone["120"] = "call an ambulance"  # 120 is the ambulance number in mainland China

if __name__ == '__main__':
    pass
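
To show what a hash table is good for, here is a small sketch of two typical uses of a `dict`: rejecting duplicates and looking up a key with a fallback value. The function names `check_voter` and `lookup_phone` and the sample data are assumptions for this example.

```python
def check_voter(voted, name):
    """Return True if name has not voted yet, and record the vote."""
    # Membership tests on a dict take constant time on average
    if name in voted:
        return False
    voted[name] = True
    return True


def lookup_phone(phone, number):
    """Return the description for number, or a fallback if it is unknown."""
    # dict.get avoids a KeyError when the key is missing
    return phone.get(number, "unknown number")
```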

7. Breadth-First Search

from collections import deque


def person_is_seller(name):
    return name == 'thom'


def search(name, graph):
    # deque gives O(1) pops from the left; queue.Queue is meant for passing work between threads
    search_queue = deque(graph[name])
    # A set makes the "already checked" test O(1)
    searched = set()
    while search_queue:
        person = search_queue.popleft()
        if person not in searched:
            if person_is_seller(person):
                print(person, "--> is a mango seller")
                return True
            else:
                print(person, "--> is not a mango seller")
                search_queue.extend(graph[person])
                searched.add(person)
    return False


if __name__ == '__main__':
    graph = {}
    graph['you'] = ['alice', 'bob', 'claire']
    graph['bob'] = ['anuj', 'peggy']
    graph['alice'] = ['peggy']
    graph['claire'] = ['thom', 'jonny']
    graph['anuj'] = []
    graph['peggy'] = []
    graph['thom'] = []
    graph['jonny'] = []

    print(search('you', graph))
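
The search above only answers whether a seller is reachable. Because breadth-first search visits people in order of distance, it can also report the shortest chain of connections by remembering who introduced each person. This is a sketch under that assumption; the function name `shortest_path_to_seller` and the `is_seller` callback parameter are hypothetical.

```python
from collections import deque


def shortest_path_to_seller(start, graph, is_seller):
    """Return the shortest list of names from start to a seller, or None."""
    # parents doubles as the visited set: a name is in it once it has been queued
    parents = {start: None}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if person != start and is_seller(person):
            # Walk back through parents to rebuild the path
            path = []
            while person is not None:
                path.append(person)
                person = parents[person]
            return path[::-1]
        for neighbor in graph[person]:
            if neighbor not in parents:
                parents[neighbor] = person
                queue.append(neighbor)
    return None
```

With the graph from this section and `lambda name: name == 'thom'` as the callback, the result is `['you', 'claire', 'thom']`.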


8. Dijkstra's Algorithm

"""
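Dijkstra:
Works on weighted graphs, directed or undirected; cycles are allowed
Requires non-negative edge weights, so it does not apply to graphs with negative-weight edges
"""

# Represent the graph with hash tables (nested dicts)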
graph = {}
graph["start"] = {}
graph["start"]["a"] = 6
graph["start"]["b"] = 2
graph["a"] = {}
graph["a"]["end"] = 1
graph["b"] = {}
graph["b"]["a"] = 3
graph["b"]["end"] = 5
graph["end"] = {}

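# Cost table: a node's cost is the total weight of the cheapest known path from start to it
# Infinity
infinity = float("inf")
costs = {}
costs["a"] = 6
costs["b"] = 2
costs["end"] = infinity

# Parents hash table
parents = {}
parents["a"] = "start"
parents["b"] = "start"
parents["end"] = None

# Nodes that have already been processed need to be recorded
processed = []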


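# Find the unprocessed node with the lowest cost
def find_lowest_cost_node(costs):
    # Initialize
    lowest_cost = infinity
    lowest_cost_node = None
    # Go through every node
    for node in costs:
        # Only consider nodes that have not been processed yet
        if node not in processed:
            # If this node is cheaper than the cheapest seen so far, it becomes the lowest-cost node
            if costs[node] < lowest_cost:
                lowest_cost = costs[node]
                lowest_cost_node = node
    return lowest_cost_node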


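# Rebuild the shortest path by walking parents back from end (the result is end-first)
def get_shortest_path():
    node = "end"
    shortest_path = ["end"]
    while parents[node] != "start":
        if parents[node] is None:
            # end was never reached, so there is no path
            return []
        shortest_path.append(parents[node])
        node = parents[node]
    shortest_path.append("start")
    return shortest_path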


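# Find the shortest weighted path
def dijkstra():
    # Get the node with the lowest cost so far
    node = find_lowest_cost_node(costs)
    # Keep looping as long as there is such a node
    while node is not None:
        # Current cost of this node
        cost = costs[node]
        # Neighbors of this node
        neighbors = graph[node]
        # Go through the neighbors
        for n in neighbors:
            # Cost of reaching the neighbor through this node: this node's cost plus the edge weight
            new_cost = cost + neighbors[n]
            # If that is cheaper than the neighbor's current cost, update its cost and parent
            if new_cost < costs[n]:
                costs[n] = new_cost
                parents[n] = node
        # All neighbors have been checked, so this node is done
        processed.append(node)
        # Look for the next lowest-cost node; the loop ends when there is none
        node = find_lowest_cost_node(costs)
    # The loop is over, so every node has been processed
    shortest_path = get_shortest_path()
    shortest_path.reverse()
    print(shortest_path)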


if __name__ == '__main__':
    # Test
    dijkstra()
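
The version above scans the whole cost table each time it needs the cheapest node and relies on module-level tables. A common alternative keeps the candidates in a priority queue. The sketch below uses the standard `heapq` module and takes the graph as an argument; the function name `dijkstra_heap` and its return shape (path, cost) are assumptions for this example. It also assumes non-negative weights and node names that can be compared with each other, since equal costs are tie-broken on the node.

```python
import heapq


def dijkstra_heap(graph, start, end):
    """Return (path, cost) for the cheapest route from start to end.

    Returns (None, inf) when end cannot be reached.
    """
    costs = {start: 0}
    parents = {start: None}
    heap = [(0, start)]
    processed = set()
    while heap:
        cost, node = heapq.heappop(heap)
        if node in processed:
            # Stale entry: a cheaper route to this node was already handled
            continue
        processed.add(node)
        if node == end:
            break
        for neighbor, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < costs.get(neighbor, float("inf")):
                costs[neighbor] = new_cost
                parents[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    if end not in costs:
        return None, float("inf")
    # Walk back from end to start through the parents table
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = parents[node]
    return path[::-1], costs[end]
```

On the graph from this section, `dijkstra_heap(graph, "start", "end")` gives the path `['start', 'b', 'a', 'end']` with a total cost of 6.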


9. Greedy Algorithms

"""
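For NP-complete problems no fast exact algorithm is known, so in practice we settle for an approximate solution, and a greedy algorithm is the usual way to get one.

Examples of greedy algorithms:
① Breadth-first search
② Dijkstra's algorithm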
"""

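As an illustration of a greedy approximation, consider set cover, a classic NP-complete problem: choose as few of the given subsets as possible so that together they contain every required element. The sketch below repeatedly picks the subset that covers the most still-uncovered elements. The function name `greedy_set_cover` and the sample data in the usage note are assumptions for this example; the result is a valid cover but not necessarily the smallest one.

```python
def greedy_set_cover(universe, subsets):
    """Return a list of subset names that together cover universe.

    subsets maps a name to a set of elements.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Greedy step: take the subset that covers the most uncovered elements
        best = max(subsets, key=lambda name: len(subsets[name] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("the subsets cannot cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen
```

For example, with the universe `{1, 2, 3, 4, 5}` and subsets `{"a": {1, 2, 3}, "b": {2, 4}, "c": {3, 4}, "d": {4, 5}}`, the function picks `"a"` first and then `"d"`.
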
10. Dynamic Programming

"""
A bit of humor:
The Feynman algorithm, named after the famous physicist Richard Feynman, goes as follows.
(1) Write down the problem.
(2) Think real hard.
(3) Write down the solution.
"""

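Setting the joke aside, dynamic programming solves a problem by solving its smaller subproblems first and reusing their results. A minimal sketch is the 0/1 knapsack problem: given items with a weight and a value, find the highest total value that fits in a bag of limited capacity. The function name `knapsack` and the sample items in the usage note are assumptions for this example; weights and the capacity are assumed to be non-negative integers.

```python
def knapsack(items, capacity):
    """Return the best total value for a list of (weight, value) items."""
    # best[c] is the highest value achievable with capacity c
    # using the items seen so far
    best = [0] * (capacity + 1)
    for weight, value in items:
        # Go from high to low capacity so each item is used at most once
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]
```

With items `[(1, 1500), (4, 3000), (3, 2000)]` and a capacity of 4, the best value is 3500: the items of weight 1 and weight 3 together.
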
11. K-Nearest Neighbors

"""
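k-nearest neighbours, KNN

① Distance formula (a first intuition)
② Cosine similarity (often a better fit in machine learning, since it compares direction rather than magnitude)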
"""

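The two measures named above can be sketched in a few lines, together with a tiny classifier that votes among the k closest labeled points. All function names here (`euclidean_distance`, `cosine_similarity`, `knn_classify`) and the sample points used to try it are assumptions for this example, and the points are assumed to be equal-length tuples of numbers.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_classify(query, labeled_points, k=3):
    """Return the most common label among the k points nearest to query.

    labeled_points is a list of (point, label) pairs.
    """
    nearest = sorted(
        labeled_points,
        key=lambda pair: euclidean_distance(query, pair[0]),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Cosine similarity ignores magnitude: `(1, 0)` and `(2, 0)` point the same way, so their similarity is 1.0 even though their distance is not zero.
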
12. Big Data Algorithms

from functools import reduce

"""
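I. MapReduce, a distributed algorithm
① Map function: transforms one array into another array.
② Reduce function: reduces an array to a single item.

II. Bloom filter
A Bloom filter is a probabilistic data structure: its answers may be wrong, but they are very likely correct.
It can report an item as present when it is not (a false positive), but never reports a present item as absent.
It suits cases where the answer does not have to be exactly right.

III. HyperLogLog
Similar to a Bloom filter: a probabilistic structure that approximates the number of unique elements in a set using only a small amount of memory.

IV. SHA (Secure Hash Algorithm)
Given a string, SHA returns its hash value.

V. Diffie-Hellman key exchange
① The public key is public; anyone can use it to encrypt a message.
② An encrypted message can only be decrypted with the private key. As long as you are the only one who knows the private key, only you can decrypt the message!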
"""

# Map function
map_arr1 = [1, 2, 3]
print(map_arr1)
map_arr2 = list(map(lambda x: x * x, map_arr1))
print(map_arr2)

# Reduce function
reduce_arr = [3, 4, 5]
print(reduce_arr)
total = reduce(lambda x, y: x + y, reduce_arr)
print(total)
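
Two of the ideas listed in this section, SHA and the Bloom filter, combine naturally in a small sketch. The class below derives several bit positions for each item from SHA-256 (via the standard `hashlib` module) and sets those bits; an item counts as "probably present" only if all of its bits are set. The class name `BloomFilter`, its default sizes, and the seed-prefix trick for getting several hashes from one function are assumptions for this illustration, not a production design.

```python
import hashlib


class BloomFilter:
    """A toy Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _positions(self, item):
        # Prefixing a different seed gives num_hashes different hash values
        for seed in range(self.num_hashes):
            data = f"{seed}:{item}".encode("utf-8")
            digest = hashlib.sha256(data).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def __contains__(self, item):
        # True means "probably present"; False means "definitely absent"
        return all(self.bits[pos] for pos in self._positions(item))
```

After `seen = BloomFilter()` and `seen.add("example.com")`, the test `"example.com" in seen` is always `True`. A string that was never added is usually reported as absent, but the filter cannot guarantee it.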


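III. Miscellaneous

This article follows the outline of Grokking Algorithms (《图解算法》) and distills its main points.

When I have time, I will consult other books to round out and enrich the content.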
