查找以编程方式联接表所需的所有联接


8

给定一个SourceTable和TargetTable,我想以编程方式创建一个具有所有所需联接的字符串。

简而言之,我试图找到一种创建这样的字符串的方法:

FROM SourceTable t
JOIN IntermediateTable t1 on t1.keycolumn = t.keycolumn
JOIN TargetTable t2 on t2.keycolumn = t1.keycolumn

我有一个查询,该查询返回给定表的所有外键,但是在尝试以递归方式遍历所有这些以找到最佳联接路径并制成字符串时遇到了局限性。

SELECT 
    p.name AS ParentTable
    ,pc.name AS ParentColumn
    ,r.name AS ChildTable
    ,rc.name AS ChildColumn
FROM sys.foreign_key_columns fk
JOIN sys.columns pc ON pc.object_id = fk.parent_object_id AND pc.column_id = fk.parent_column_id 
JOIN sys.columns rc ON rc.object_id = fk.referenced_object_id AND rc.column_id = fk.referenced_column_id
JOIN sys.tables p ON p.object_id = fk.parent_object_id
JOIN sys.tables r ON r.object_id = fk.referenced_object_id
WHERE fk.parent_object_id = OBJECT_ID('aTable')
ORDER BY ChildTable, fk.referenced_column_id

我敢肯定这已经做过,但是我似乎找不到一个例子。


2
如果从源到目标有两条或更多条路径怎么办?
ypercubeᵀᴹ

2
是的,我会担心多个可能的路径,而且还要担心一个多于两个步骤的路径。而且,键由多于一列组成。这些方案都会在任何自动化解决方案中大放异彩。
亚伦·伯特兰

请注意,即使两个表之间只有一个外键,也将允许2条或更多条路径(实际上是无限数量的任意长度的路径)。考虑查询“查找与商品X以相同顺序至少放置过一次的所有商品”。你需要加入OrderItemsOrders和背部OrderItems
ypercubeᵀᴹ

2
@ypercube对,“最佳路径”到底是什么意思?
亚伦·伯特兰

“最佳JOIN路径”表示“将Target表连接到Source表的最短系列的连接”。如果在T2和T3中引用了T1,则在T4中引用T2,而在T4中引用T3。从T1到T3的最佳路径是T1,T2,T3。T1,T2,T4,T3路径不是最优的,因为它更长。
隐喻

Answers:


4

我有一个脚本,可以执行外键遍历的基本版本。我很快对其进行了修改(请参见下文),您也许可以将其用作起点。

在给定目标表的情况下,脚本会尝试为所有可能的源表打印最短路径的连接字符串(或在联系的情况下为其中之一),以便可以遍历单列外键到达目标表。该脚本在具有数千个表和我尝试过的许多FK连接的数据库上似乎运行良好。

正如其他人在评论中提到的那样,如果需要处理多列外键,则需要使其更加复杂。另外,请注意,这绝不是可用于生产且经过全面测试的代码。如果您决定扩展此功能,希望它对您有所帮助!

-- Drop temp tables that will be used below
IF OBJECT_ID('tempdb..#paths') IS NOT NULL
    DROP TABLE #paths
GO
IF OBJECT_ID('tempdb..#shortestPaths') IS NOT NULL
    DROP TABLE #shortestPaths
GO

-- The table (e.g. "TargetTable") to start from (or end at, depending on your point of view)
DECLARE @targetObjectName SYSNAME = 'TargetTable'

-- Identify all paths from TargetTable to any other table on the database,
-- counting all single-column foreign keys as a valid connection from one table to the next
;WITH singleColumnFkColumns AS (
    -- We limit the scope of this exercise to single column foreign keys
    -- We explicitly filter out any multi-column foreign keys to ensure that they aren't misinterpreted below
    SELECT fk1.*
    FROM sys.foreign_key_columns fk1
    LEFT JOIN sys.foreign_key_columns fk2 ON fk2.constraint_object_id = fk1.constraint_object_id AND fk2.constraint_column_id = 2
    WHERE fk1.constraint_column_id = 1
        AND fk2.constraint_object_id IS NULL
)
, parentCTE AS (
    -- Base case: Find all outgoing (pointing into another table) foreign keys for the specified table
    SELECT 
        p.object_id AS ParentId
        ,OBJECT_SCHEMA_NAME(p.object_id) + '.' + p.name AS ParentTable
        ,pc.column_id AS ParentColumnId
        ,pc.name AS ParentColumn
        ,r.object_id AS ChildId
        ,OBJECT_SCHEMA_NAME(r.object_id) + '.' + r.name AS ChildTable
        ,rc.column_id AS ChildColumnId
        ,rc.name AS ChildColumn
        ,1 AS depth
        -- Maintain the full traversal path that has been taken thus far
        -- We use "," to delimit each table, and each entry then has a
        -- "<object_id>_<parent_column_id>_<child_column_id>" format
        ,   ',' + CONVERT(VARCHAR(MAX), p.object_id) + '_NULL_' + CONVERT(VARCHAR(MAX), pc.column_id) +
            ',' + CONVERT(VARCHAR(MAX), r.object_id) + '_' + CONVERT(VARCHAR(MAX), pc.column_id) + '_' + CONVERT(VARCHAR(MAX), rc.column_id) AS TraversalPath
    FROM sys.foreign_key_columns fk
    JOIN sys.columns pc ON pc.object_id = fk.parent_object_id AND pc.column_id = fk.parent_column_id 
    JOIN sys.columns rc ON rc.object_id = fk.referenced_object_id AND rc.column_id = fk.referenced_column_id
    JOIN sys.tables p ON p.object_id = fk.parent_object_id
    JOIN sys.tables r ON r.object_id = fk.referenced_object_id
    WHERE fk.parent_object_id = OBJECT_ID(@targetObjectName)
        AND p.object_id <> r.object_id -- Ignore FKs from one column in the table to another

    UNION ALL

    -- Recursive case: Find all outgoing foreign keys for all tables
    -- on the current fringe of the recursion
    SELECT 
        p.object_id AS ParentId
        ,OBJECT_SCHEMA_NAME(p.object_id) + '.' + p.name AS ParentTable
        ,pc.column_id AS ParentColumnId
        ,pc.name AS ParentColumn
        ,r.object_id AS ChildId
        ,OBJECT_SCHEMA_NAME(r.object_id) + '.' + r.name AS ChildTable
        ,rc.column_id AS ChildColumnId
        ,rc.name AS ChildColumn
        ,cte.depth + 1 AS depth
        ,cte.TraversalPath + ',' + CONVERT(VARCHAR(MAX), r.object_id) + '_' + CONVERT(VARCHAR(MAX), pc.column_id) + '_' + CONVERT(VARCHAR(MAX), rc.column_id) AS TraversalPath
    FROM parentCTE cte
    JOIN singleColumnFkColumns fk
        ON fk.parent_object_id = cte.ChildId
        -- Optionally consider only a traversal of the same foreign key
        -- With this commented out, we can reach table A via column A1
        -- and leave table A via column A2.  If uncommented, we can only
        -- enter and leave a table via the same column
        --AND fk.parent_column_id = cte.ChildColumnId
    JOIN sys.columns pc ON pc.object_id = fk.parent_object_id AND pc.column_id = fk.parent_column_id 
    JOIN sys.columns rc ON rc.object_id = fk.referenced_object_id AND rc.column_id = fk.referenced_column_id
    JOIN sys.tables p ON p.object_id = fk.parent_object_id
    JOIN sys.tables r ON r.object_id = fk.referenced_object_id
    WHERE p.object_id <> r.object_id -- Ignore FKs from one column in the table to another
        -- If our path has already taken us to this table, avoid the cycle that would be created by returning to the same table
        AND cte.TraversalPath NOT LIKE ('%_' + CONVERT(VARCHAR(MAX), r.object_id) + '%')
)
SELECT *
INTO #paths
FROM parentCTE
ORDER BY depth, ParentTable, ChildTable
GO

-- For each distinct table that can be reached by traversing foreign keys,
-- record the shortest path to that table (or one of the shortest paths in
-- case there are multiple paths of the same length)
SELECT *
INTO #shortestPaths
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY ChildTable ORDER BY depth ASC) AS rankToThisChild
    FROM #paths
) x
WHERE rankToThisChild = 1
ORDER BY ChildTable
GO

-- Traverse the shortest path, starting from the source the full path and working backwards,
-- building up the desired join string as we go
WITH joinCTE AS (
    -- Base case: Start with the from clause to the child table at the end of the traversal
    -- Note that the first step of the recursion will re-process this same row, but adding
    -- the ParentTable => ChildTable join
    SELECT p.ChildTable
        , p.TraversalPath AS ParentTraversalPath
        , NULL AS depth
        , CONVERT(VARCHAR(MAX), 'FROM ' + p.ChildTable + ' t' + CONVERT(VARCHAR(MAX), p.depth+1)) AS JoinString
    FROM #shortestPaths p

    UNION ALL

    -- Recursive case: Process the ParentTable => ChildTable join, then recurse to the
    -- previous table in the full traversal.  We'll end once we reach the root and the
    -- "ParentTraversalPath" is the empty string
    SELECT cte.ChildTable
        , REPLACE(p.TraversalPath, ',' + CONVERT(VARCHAR, p.ChildId) + '_' + CONVERT(VARCHAR, p.ParentColumnId)+ '_' + CONVERT(VARCHAR, p.ChildColumnId), '') AS TraversalPath
        , p.depth
        , cte.JoinString + '
' + CONVERT(VARCHAR(MAX), 'JOIN ' + p.ParentTable + ' t' + CONVERT(VARCHAR(MAX), p.depth) + ' ON t' + CONVERT(VARCHAR(MAX), p.depth) + '.' + p.ParentColumn + ' = t' + CONVERT(VARCHAR(MAX), p.depth+1) + '.' + p.ChildColumn) AS JoinString
    FROM joinCTE cte
    JOIN #paths p
        ON p.TraversalPath = cte.ParentTraversalPath
)
-- Select only the fully built strings that end at the root of the traversal
-- (which should always be the specific table name, e.g. "TargetTable")
SELECT ChildTable, 'SELECT TOP 100 * 
' +JoinString
FROM joinCTE
WHERE depth = 1
ORDER BY ChildTable
GO

0

您可以为要连接的所有表添加一个包含两个字段TAB_NAME,KEY_NAME的表的键列表。

示例,用于表 City

  • 城市|城市名称
  • 城市|国家名
  • 城市|省名
  • 城市|城市代码

同样ProvinceCountry

收集表的数据并放入单个表(例如,元数据表)

现在像下面这样草拟查询

select * from
(Select Table_name,Key_name from Meta_Data 
where Table_name in ('City','Province','Country')) A,
(Select Table_name,Key_name from Meta_Data 
where Table_name in ('City','Province','Country')) B,
(Select Table_name,Key_name from Meta_Data 
where Table_name in ('City','Province','Country')) C

where

A.Table_Name <> B.Table_name and
B.Table_name <> C.Table_name and
C.Table_name <> A.Table_name and
A.Column_name = B.Column_name and
B.Column_name = C.Column_name

这将为您提供如何基于匹配键(相同键名)链接表的方法

如果您认为键名可能不匹配,则可以包括一个备用键字段,并尝试在where条件下使用它。


请注意,询问者希望使用sysSQL Server中的现有表来描述表中的列,如何将表链接在一起等。所有这些都已经存在。建立自己的表来定义表结构以满足特定需求可能是一个后备职位,但是首选答案将使用已经存在的答案,就像公认的答案一样。
RDFozz '18
By using our site, you acknowledge that you have read and understand our Cookie Policy and Privacy Policy.
Licensed under cc by-sa 3.0 with attribution required.