不明就里的朋友,看上一篇就明白了,欠大家一份自动化赋值语句:
其文中提到了三张表的建表语句,如下:
CREATE TABLE dbo.DimUser (
UserID INT IDENTITY(1, 1)
,UserName NVARCHAR(200)
)
GO
CREATE TABLE dbo.FctOrderHeader (
OrderID BIGINT IDENTITY(1, 1)
,OrderDate DATETIME
,OrderAmount BIGINT
)
GO
CREATE TABLE dbo.FctOrderUser (
OrderID BIGINT
,UserID INT
)
GO
我们尝试着给这三张表赋值,DimUser 表附上 10000 条数据, FctOrderHeader 附上3650000 行数据, 而 FctOrderUser 则是将这些订单均分给这些用户。
一般大家的做法肯定是循环处理,使用 for..each, while 等等手段,这里介绍一种不一样的 set 手法。
1. 使用 tally table 产生一定数量的记录
原型是这样的, 利用 Join 可以产生笛卡尔积,以 3 为底数,2^5 为指数,共产生 power(3,32) 次方的数据行,再用 Row_Number() 函数给每一行排序,得到连续的 n 行指定数据:
;WITH LO AS (
SELECT c
FROM (
VALUES (1),
(2),
(3)
) AS d(c)
),
L1 AS (
SELECT a.c,
b.c AS bc
FROM LO AS a
CROSS JOIN LO AS b
),
L2 AS (
SELECT a.c,
b.c AS bc
FROM L1 AS a
CROSS JOIN L1 AS b
),
L3 AS (
SELECT a.c,
b.c AS bc
FROM L2 AS a
CROSS JOIN L2 AS b
),
L4 AS (
SELECT a.c,
b.c AS bc
FROM L3 AS a
CROSS JOIN L3 AS b
),
L5 AS (
SELECT a.c,
b.c AS bc
FROM L4 AS a
CROSS JOIN L4 AS b
)
SELECT ROW_NUMBER() OVER (
ORDER BY (
SELECT NULL
)
) AS RNK
FROM L5
ORDER BY RNK
OFFSET 0 ROWS
FETCH FIRST 1000 ROWS ONLY;
在正式的环境中,自然要将它扩展为可服用的单元代码,所以封装为表值函数是最好的:
IF EXISTS (
SELECT TOP 1 1
FROM sys.objects WITH (NOLOCK)
WHERE UPPER(name) = UPPER('GetSeqNum')
AND type_desc = 'SQL_INLINE_TABLE_VALUED_FUNCTION'
)
BEGIN
DROP FUNCTION dbo.GetSeqNum;
END
GO
CREATE FUNCTION dbo.GetSeqNum (
@BEGINNUM BIGINT,
@ENDNUM BIGINT
)
RETURNS TABLE
AS
RETURN
WITH LO AS (
SELECT c
FROM (
VALUES (1),
(2),
(3)
) AS d(c)
),
L1 AS (
SELECT a.c,
b.c AS bc
FROM LO AS a
CROSS JOIN LO AS b
),
L2 AS (
SELECT a.c,
b.c AS bc
FROM L1 AS a
CROSS JOIN L1 AS b
),
L3 AS (
SELECT a.c,
b.c AS bc
FROM L2 AS a
CROSS JOIN L2 AS b
),
L4 AS (
SELECT a.c,
b.c AS bc
FROM L3 AS a
CROSS JOIN L3 AS b
),
L5 AS (
SELECT a.c,
b.c AS bc
FROM L4 AS a
CROSS JOIN L4 AS b
)
SELECT ROW_NUMBER() OVER (
ORDER BY (
SELECT NULL
)
) AS RNK
FROM L5
ORDER BY RNK
OFFSET(@BEGINNUM - 1) ROWS
FETCH FIRST(@ENDNUM - @BEGINNUM) + 1 ROWS ONLY;
GO
在以上两段 SQL 中,我们都使用了 SQL Server 2012 以来的新语法糖, offset … fetch first rows only 来分页
2 给用户表附上一定数量(比如 10000)的记录
DECLARE @LOW INT = 1;
DECLARE @HIGH INT = 10000;
WITH LO
AS (
SELECT c
FROM (
VALUES (1),
(2),
(3)
) AS d(c)
),
L1
AS (
SELECT a.c,
b.c AS bc
FROM LO AS a
CROSS JOIN LO AS b
),
L2
AS (
SELECT a.c,
b.c AS bc
FROM L1 AS a
CROSS JOIN L1 AS b
),
L3
AS (
SELECT a.c,
b.c AS bc
FROM L2 AS a
CROSS JOIN L2 AS b
),
L4
AS (
SELECT a.c,
b.c AS bc
FROM L3 AS a
CROSS JOIN L3 AS b
),
L5
AS (
SELECT a.c,
b.c AS bc
FROM L4 AS a
CROSS JOIN L4 AS b
)
INSERT INTO dbo.DimUser (UserName)
SELECT 'WX_USER' + LTRIM(RTRIM(CONVERT(NVARCHAR, ROW_NUMBER() OVER (
ORDER BY (
SELECT NULL
)
))))
FROM L5
ORDER BY ROW_NUMBER() OVER (
ORDER BY (
SELECT NULL
)
)
OFFSET(@LOW - 1) ROWS
FETCH FIRST(@HIGH - @LOW) + 1 ROWS ONLY
GO
3 给订单表附上一定数量的记录,并按照设定日期,随机配上订单的数量(比如10年,每天 1000 笔订单)
扫描二维码关注公众号,回复:
1438719 查看本文章
DECLARE @BEGINDATE DATETIME = '2008-05-01',
@ENDDATE DATETIME = '2018-05-01';
DECLARE @LOW INT = 1;
DECLARE @HIGH INT = 1000;
WITH LO
AS (
SELECT c
FROM (
VALUES (1),
(2),
(3)
) AS d(c)
),
L1
AS (
SELECT a.c,
b.c AS bc
FROM LO AS a
CROSS JOIN LO AS b
),
L2
AS (
SELECT a.c,
b.c AS bc
FROM L1 AS a
CROSS JOIN L1 AS b
),
L3
AS (
SELECT a.c,
b.c AS bc
FROM L2 AS a
CROSS JOIN L2 AS b
),
L4
AS (
SELECT a.c,
b.c AS bc
FROM L3 AS a
CROSS JOIN L3 AS b
),
L5
AS (
SELECT a.c,
b.c AS bc
FROM L4 AS a
CROSS JOIN L4 AS b
)
INSERT INTO dbo.FctOrderHeader (
OrderDate,
OrderAmount
)
SELECT DATEADD(DAY, DT.RNK - 1, @BEGINDATE) AS OrderDate,
RAND() * DT2.RNK * 100000
FROM dbo.GetSeqNum(1, DATEDIFF(DAY, @BEGINDATE, @ENDDATE)) DT
CROSS APPLY (
SELECT ROW_NUMBER() OVER (
ORDER BY (
SELECT NULL
)
) AS RNK
FROM L5
ORDER BY RNK ASC
OFFSET(@LOW - 1) ROWS
FETCH FIRST(@HIGH - @LOW) + 1 ROWS ONLY
) DT2
ORDER BY OrderDate,
DT2.RNK
GO
4 按照模运算,把订单分配给用户,以便测试用
INSERT INTO dbo.FctOrderUser (
OrderID,
UserID
)
SELECT ord.OrderID,
Usr.UserID
FROM dbo.FctOrderHeader ord
INNER JOIN dbo.DimUser Usr ON ord.OrderID % 10000 = Usr.UserID
GO
Tally Table 作为特别有用的一个辅助手段,可以在多种场景中复用,比如数据孤岛问题,即求解非连续数字或者填补连续数字等。
欢迎关注【有关SQL】,入群讨论技术