PostgreSQL | Database | Generating a simple test table with 20 million rows

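Preface:

While learning databases, you will often need a fairly large table for simulated testing. Such a test table should resemble a real production environment as closely as possible.

Being able to quickly create a table with enough data is therefore very useful when studying databases.

This article explains in detail how to create a single table with roughly 20 million rows.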

I.

Functions used to build the table

generate_series(1,20000000)

lpad()

random()
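
To get a feel for these three functions before they are combined, here is a small sketch. The column aliases (n, padded, r) are only illustrative and do not appear elsewhere in this article.

```sql
-- generate_series(start, stop) returns one row per integer in the range.
select n from generate_series(1, 5) as n;

-- lpad(text, length, fill) left-pads a string to a fixed width.
select lpad('7', 2, '0') as padded;   -- '07'

-- random() returns a double precision value in the range [0, 1).
-- The cast to int rounds, so (random() * 99)::int falls between 0 and 99 inclusive.
select (random() * 99)::int as r from generate_series(1, 5);
```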

II.

A custom function that generates random ID card numbers

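create or replace function gen_id(
 a date,  -- earliest birth date
 b date   -- latest birth date
)
returns text as $$
-- Builds an 18-character string shaped like an ID card number. This is random test data:
-- neither the region code nor the final check character is validated.
select lpad((random()*99)::int::text, 2, '0') ||   -- region code, digits 1-2
    lpad((random()*99)::int::text, 2, '0') ||      -- region code, digits 3-4
    lpad((random()*99)::int::text, 2, '0') ||      -- region code, digits 5-6
    to_char(a + (random()*(b-a))::int, 'yyyymmdd') ||   -- random birth date between a and b
    lpad((random()*99)::int::text, 2, '0') ||      -- sequence code, first two digits
    random()::int ||                               -- sequence code, last digit (0 or 1)
    (case when random()*10 >9 then 'X' else (random()*9)::int::text end ) ;  -- last character: usually a digit, occasionally 'X'
$$ language sql strict;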
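
As a quick sanity check, the following sketch calls the function a few times, assuming gen_id has been created as shown above. Every piece it concatenates has a fixed width (2+2+2+8+2+1+1), so each value should come out 18 characters long. The aliases s, v and len are only illustrative.

```sql
-- Generate a handful of sample values and show their length.
select s.v as shenfenzheng, length(s.v) as len
from (
    select gen_id(date '1949-01-01', date '2023-10-16') as v
    from generate_series(1, 5)
) as s;
```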

III.

Creating the test table

create table if not exists testpg (
	"id" int,
	"shenfenzheng" VARCHAR ( 255 ) COLLATE "pg_catalog"."default"
);

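-- Or create this table instead (pick one of the two: because of IF NOT EXISTS, this statement is skipped if testpg already exists)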
CREATE SEQUENCE test START 1;
create table if not exists testpg (
	"id" int8 not null DEFAULT nextval('test'::regclass),
	CONSTRAINT "user_vendorcode_pkey" PRIMARY KEY ("id"),
	"shenfenzheng" VARCHAR ( 255 ) COLLATE "pg_catalog"."default"
);

IV.

Insert data into the test table, 20 million rows for now:

insert into testpg (id, shenfenzheng) SELECT generate_series(1,20000000) as id, gen_id('1949-01-01', '2023-10-16') as shenfenzheng;

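How fast the insert runs depends largely on how powerful the CPU is. My laptop is fairly weak, so generating the data took around ten minutes.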
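
Once the insert finishes, it may be worth checking the result before running any timing tests. The sketch below assumes the table is named testpg as above; the alias total_size is only illustrative.

```sql
-- Confirm that all rows arrived (20000000 is expected for the insert above).
select count(*) from testpg;

-- Total on-disk size of the table, including any indexes and TOAST data.
select pg_size_pretty(pg_total_relation_size('testpg')) as total_size;

-- Refresh planner statistics after the bulk load.
analyze testpg;

-- Spot-check a few rows.
select * from testpg limit 5;
```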

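V.

Simple use of the test table

##### Note: Why Navicat? Navicat usually connects to the database remotely, so it realistically simulates how a database is actually used; the same queries run locally would be much faster.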

1.

Fast query

select * from testpg where id between 10012 and 52013 limit 1000;

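Slow query

select * from testpg where id between 10012 and 52013;

Add an index on the id column, then run the query again without LIMIT:

As you can see, after adding the index the query is more than 30 times faster, dropping from about 15 seconds to about 0.3 seconds.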

create index on testpg(id);
select * from testpg where id between 10012 and 52013 ;
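
To see why the index helps, rather than relying only on the elapsed time shown by the client, the plan can be inspected with EXPLAIN. This is a sketch, not output from the author's machine: without an index you would typically expect a sequential scan over the whole table, and with the index an index scan or bitmap scan, but the exact plan depends on the PostgreSQL version and settings.

```sql
-- Runs the query and reports the chosen plan plus actual server-side timings.
-- Run it once before and once after creating the index to compare.
explain (analyze, buffers)
select * from testpg where id between 10012 and 52013;
```

The execution time reported here does not include sending the rows to the client, so it will usually be lower than what a remote tool such as Navicat displays.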

To be continued!
