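First words
You can safely ignore the sections from "JOINs: Starting Off" onward (inclusive) if you just want to take a crack at the code. The background and results serve only as context. To see what the code looked like initially, check the edit history before 2015-10-06.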
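Objective
Ultimately, I want to calculate interpolated GPS coordinates for the transmitter (X or Xmit) based on the DateTime stamps of the available GPS data in SecondTable that directly flank each observation in FirstTable.
My immediate objective, on the way to that ultimate objective, is to figure out how best to join FirstTable to SecondTable to get those flanking time points. Later I can use that information to calculate intermediate GPS coordinates, assuming a linear fit along an equirectangular coordinate system (fancy words for saying I don't care that the Earth is a sphere at this scale).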
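To make the interpolation goal concrete, here is a minimal Python sketch of the linear fit between two flanking GPS fixes. It is only an illustration of the arithmetic, not the Access implementation; the function name, tuple layout, and sample coordinates are all hypothetical.

```python
from datetime import datetime

def interpolate_fix(t, before, after):
    """Linearly interpolate (lat, lon) at time t between two GPS fixes.

    before and after are hypothetical (timestamp, lat, lon) tuples with
    before[0] <= t <= after[0].
    """
    t0, lat0, lon0 = before
    t1, lat1, lon1 = after
    span = (t1 - t0).total_seconds()
    if span <= 0:
        raise ValueError("the after fix must be later than the before fix")
    # Fraction of the way from the before fix to the after fix.
    w = (t - t0).total_seconds() / span
    # Equirectangular assumption: treat decimal degrees as a flat plane.
    return (lat0 * (1 - w) + lat1 * w, lon0 * (1 - w) + lon1 * w)

# Made-up sample data: the observation falls 1 s into a 4 s gap (weight 0.25).
before = (datetime(2015, 10, 6, 12, 0, 0), 40.0, -75.0)
after = (datetime(2015, 10, 6, 12, 0, 4), 40.0004, -75.0008)
lat, lon = interpolate_fix(datetime(2015, 10, 6, 12, 0, 1), before, after)
```

The weight w plays the same role as a "fraction of the way to the after point" factor: the after fix contributes w and the before fix contributes 1 - w.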
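Questions
- Is there a more efficient way to generate the closest before and after timestamps?
  - Solved by myself: grab only the "after", then get the "before" only as it relates to that "after".
- Is there a more intuitive way that doesn't involve the (A<>B OR A=B) structure?
  - Byrdzeye provided the basic alternatives, but my "real-world" experience didn't line up with all 4 of his join strategies performing the same. Full credit to him, though, for addressing the alternative join styles.
- Any other thoughts, tricks, and advice you may have.
  - So far, both byrdzeye and Phrancis have been quite helpful in this regard. I found that Phrancis' advice was excellently laid out and provided help at a critical stage, so I'll give him the edge here.
I would still appreciate any additional help with the third question. The sub-bullets reflect who I believe helped me most on each individual question.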
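The "grab the after, then derive the before from it" idea from the first question can be sketched in Python as follows. This is a hedged illustration, not the Jet SQL itself: it assumes the GPS timestamps are sorted and that IDs were assigned in timestamp order, so list position + 1 stands in for an ID like X_ID. The function name is hypothetical.

```python
from bisect import bisect_right

def flanking_ids(rec_ts, gps_ts):
    """Return (before_id, after_id) for rec_ts, or None if either side is missing.

    gps_ts is a sorted list of GPS timestamps. The ID of an entry is its
    list position + 1 (a hypothetical stand-in for an ID seeded in
    timestamp order).
    """
    # Index of the first GPS timestamp strictly greater than rec_ts.
    pos = bisect_right(gps_ts, rec_ts)
    if pos == 0 or pos == len(gps_ts):
        return None  # no earlier fix, or no later fix
    after_id = pos + 1
    before_id = after_id - 1  # the "before" is simply the previous ID
    return before_id, after_id

gps_ts = [10, 20, 30]                   # made-up timestamps, in seconds
mid_gap = flanking_ids(15, gps_ts)      # falls between the 1st and 2nd fix
on_fix = flanking_ids(20, gps_ts)       # an exact match counts as "before"
too_early = flanking_ids(5, gps_ts)     # nothing before it
```

Only one search is needed per record; the "before" falls out of the "after" by subtracting 1 from the ID, which is valid only while IDs follow timestamp order with no gaps.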
Table Definitions
Semi-visual representation
FirstTable
Fields
RecTStamp | DateTime --can contain milliseconds via VBA code (see Ref 1)
ReceivID | LONG
XmitID | TEXT(25)
Keys and Indices
PK_DT | Primary, Unique, No Null, Compound
    XmitID | ASC
    RecTStamp | ASC
    ReceivID | ASC
UK_DRX | Unique, No Null, Compound
    RecTStamp | ASC
    ReceivID | ASC
    XmitID | ASC
SecondTable
Fields
X_ID | LONG AUTONUMBER -- seeded after main table has been created and already sorted on the primary key
XTStamp | DateTime --will not contain partial seconds
Latitude | Double --these are in decimal degrees, not degrees/minutes/seconds
Longitude | Double --this way straight decimal math can be performed
Keys and Indices
PK_D | Primary, Unique, No Null, Simple
    XTStamp | ASC
UIDX_ID | Unique, No Null, Simple
    X_ID | ASC
ReceiverDetails table
Fields
ReceivID | LONG
Receiver_Location_Description | TEXT -- NULL OK
Beginning | DateTime --no partial seconds
Ending | DateTime --no partial seconds
Lat | DOUBLE
Lon | DOUBLE
Keys and Indices
PK_RID | Primary, Unique, No Null, Simple
    ReceivID | ASC
ValidXmitters table
Field (and primary key)
XmitID | TEXT(25) -- primary, unique, no null, simple
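SQL Fiddle...
...so that you can play with the table definitions and code. This question is for MS Access, but as Phrancis pointed out, SQL Fiddle has no Access flavor. So you should be able to go here to see my table definitions and code based on Phrancis' answer (external link): http://sqlfiddle.com/#!6/e9942/4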
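JOINs: Starting Off
My current "inner guts" JOIN strategy
First, create FirstTable_rekeyed with the column order and compound primary key (RecTStamp, ReceivID, XmitID), all indexed/sorted ASC. I also created indexes on each column individually. Then fill it like so: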
INSERT INTO FirstTable_rekeyed (RecTStamp, ReceivID, XmitID)
SELECT DISTINCTROW RecTStamp, ReceivID, XmitID
FROM FirstTable
WHERE XmitID IN (SELECT XmitID from ValidXmitters)
ORDER BY RecTStamp, ReceivID, XmitID;
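The above query fills the new table with 153006 records and returns within about 10 seconds.
The following completes within a second or two when the whole query is wrapped in a "SELECT Count(*) FROM ( ... )" and the TOP 1 subquery method is used: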
SELECT
ReceiverRecord.RecTStamp,
ReceiverRecord.ReceivID,
ReceiverRecord.XmitID,
(SELECT TOP 1 XmitGPS.X_ID FROM SecondTable as XmitGPS WHERE ReceiverRecord.RecTStamp < XmitGPS.XTStamp ORDER BY XmitGPS.X_ID) AS AfterXmit_ID
FROM FirstTable_rekeyed AS ReceiverRecord
-- INNER JOIN SecondTable AS XmitGPS ON (ReceiverRecord.RecTStamp < XmitGPS.XTStamp)
GROUP BY RecTStamp, ReceivID, XmitID;
-- No separate join needed for the Top 1 method, but it would be required for the other methods.
-- Additionally no restriction of the returned set is needed if I create the _rekeyed table.
-- May not need GROUP BY either. Could try ORDER BY.
-- The three AfterXmit_ID alternatives below take longer than 3 minutes to complete (or do not ever complete).
-- FIRST(XmitGPS.X_ID)
-- MIN(XmitGPS.X_ID)
-- MIN(SWITCH(XmitGPS.XTStamp > ReceiverRecord.RecTStamp, XmitGPS.X_ID, Null))
Previous "inner guts" JOIN queries
First (fastish... but not good enough)
SELECT
A.RecTStamp,
A.ReceivID,
A.XmitID,
MAX(IIF(B.XTStamp<= A.RecTStamp,B.XTStamp,Null)) as BeforeXTStamp,
MIN(IIF(B.XTStamp > A.RecTStamp,B.XTStamp,Null)) as AfterXTStamp
FROM FirstTable as A
INNER JOIN SecondTable as B ON
(A.RecTStamp<>B.XTStamp OR A.RecTStamp=B.XTStamp)
GROUP BY A.RecTStamp, A.ReceivID, A.XmitID
-- alternative for BeforeXTStamp MAX(-(B.XTStamp<=A.RecTStamp)*B.XTStamp)
-- alternatives for AfterXTStamp (see "Aside" note below)
-- 1.0/(MAX(1.0/(-(B.XTStamp>A.RecTStamp)*B.XTStamp)))
-- -1.0/(MIN(1.0/((B.XTStamp>A.RecTStamp)*B.XTStamp)))
Second (slower)
SELECT
A.RecTStamp, AbyB1.XTStamp AS BeforeXTStamp, AbyB2.XTStamp AS AfterXTStamp
FROM (FirstTable AS A INNER JOIN
(select top 1 B1.XTStamp, A1.RecTStamp
from SecondTable as B1, FirstTable as A1
where B1.XTStamp<=A1.RecTStamp
order by B1.XTStamp DESC) AS AbyB1 --MAX (time points before)
ON A.RecTStamp = AbyB1.RecTStamp) INNER JOIN
(select top 1 B2.XTStamp, A2.RecTStamp
from SecondTable as B2, FirstTable as A2
where B2.XTStamp>A2.RecTStamp
order by B2.XTStamp ASC) AS AbyB2 --MIN (time points after)
ON A.RecTStamp = AbyB2.RecTStamp;
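Background
I have a telemetry table (aliased as A) of just under 1 million entries with a compound primary key based on a DateTime stamp, a transmitter ID, and a recording device ID. Due to circumstances beyond my control, my SQL dialect is the standard Jet DB in Microsoft Access (users will be on 2007 and later versions). Only about 200,000 of these entries are relevant to the query because of the transmitter ID.
A second telemetry table (aliased as B) holds approximately 50,000 entries with a single DateTime primary key.
For the first step, I focused on finding the timestamps in the second table that are closest to the stamps in the first table.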
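JOIN Results
Quirks that I've discovered...
...along the way during debugging
It feels really odd to write the JOIN logic as FROM FirstTable as A INNER JOIN SecondTable as B ON (A.RecTStamp<>B.XTStamp OR A.RecTStamp=B.XTStamp), which, as @byrdzeye pointed out in a comment (that has since disappeared), is a form of cross join. Note that substituting LEFT OUTER JOIN for INNER JOIN in the code above appears to have no impact on the quantity or identity of the rows returned. I also can't seem to leave off the ON clause or say ON (1=1). Just using a comma to join (rather than INNER or LEFT OUTER JOIN) results in Count(select * from A) * Count(select * from B) rows returned by this query, rather than just one row per table A record, as the explicit (A<>B OR A=B) JOIN returns. This is clearly not suitable. FIRST doesn't seem to be usable given a compound primary key type.
The second JOIN style, although arguably more legible, suffers from being slower. This may be because it requires two additional inner JOINs against the larger table, on top of the two CROSS JOINs found in both options.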
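As a rough illustration of the first quirk, the Python sketch below pairs every A value with every B value under the (A<>B OR A=B) predicate, then groups by the A-side value with conditional MAX/MIN in the spirit of the IIF aggregates. It only models the logic for non-Null values and does not attempt to reproduce how Jet plans or executes the query; the sample timestamps are made up.

```python
from itertools import product

a_rows = [5, 15, 25]        # hypothetical RecTStamp values
b_rows = [10, 20, 30, 40]   # hypothetical XTStamp values

# The predicate is true for every non-Null pair, so no pair is filtered out.
pairs = [(a, b) for a, b in product(a_rows, b_rows) if a != b or a == b]

# Group by the A-side value, keeping the latest b <= a and the earliest b > a.
result = {}
for a, b in pairs:
    before, after = result.get(a, (None, None))
    if b <= a and (before is None or b > before):
        before = b
    if b > a and (after is None or b < after):
        after = b
    result[a] = (before, after)
```

Before grouping there are len(a_rows) * len(b_rows) pairs; after grouping there is one entry per A value, with None standing in for Null where no flanking timestamp exists.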
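Aside: Replacing the IIF clause inside MIN/MAX with the arithmetic alternatives appears to return the same number of entries.
MAX(-(B.XTStamp<=A.RecTStamp)*B.XTStamp)
works for the "Before" (MAX) timestamp, but the equivalent doesn't work directly for the "After" (MIN) timestamp, as follows:
MIN(-(B.XTStamp>A.RecTStamp)*B.XTStamp)
because the minimum is always 0 for the FALSE condition. This 0 is less than any post-epoch DOUBLE (a DateTime field is a subset of DOUBLE in Access, and this calculation converts the field to one). The IIF method and the MIN/MAX alternates proposed for the AfterXTStamp value work because division by zero (the FALSE case) generates Null values, which the aggregate functions MIN and MAX skip over.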
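The arithmetic in this aside can be checked with a small Python model. It assumes, as in Access, that True evaluates to -1 (so the negated condition is 1 or 0), and it follows the text above in treating division by zero as Null. The helper names and sample values are hypothetical.

```python
def neg_bool(cond):
    """Mimic -(condition) in Access, where True is -1 and False is 0."""
    return 1 if cond else 0

def agg(func, values):
    """Apply min or max while skipping None, as SQL aggregates skip Null."""
    kept = [v for v in values if v is not None]
    return func(kept) if kept else None

def inv(x):
    """1.0/x, with division by zero mapped to None (Null, per the text above)."""
    return None if x == 0 else 1.0 / x

rec = 15.0                  # made-up RecTStamp as a DOUBLE
gps = [10.0, 20.0, 40.0]    # made-up XTStamp values as DOUBLEs

# MAX works: rows failing the condition contribute 0, below any real stamp.
before = agg(max, [neg_bool(b <= rec) * b for b in gps])

# MIN fails: the zeros from failing rows always win.
naive_after = agg(min, [neg_bool(b > rec) * b for b in gps])

# Reciprocal trick: failing rows become Null and are skipped by MAX.
after = 1.0 / agg(max, [inv(neg_bool(b > rec) * b) for b in gps])
```

The reciprocal form turns "smallest stamp after" into "largest reciprocal", so the zeros that break the plain MIN never enter the aggregate.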
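Next Steps
Taking this further, I wish to find the timestamps in the second table that directly flank the timestamps in the first table, and to perform a linear interpolation of the data values from the second table based on the time distance to those points (i.e., if the timestamp from the first table is 25% of the way between the "before" and the "after", I would like 25% of the calculated value to come from the second-table data associated with the "after" point and 75% from the "before"). Using the revised join type as part of the inner guts, and following the suggested answers below, I produce...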
SELECT
AvgGPS.XmitID,
StrDateIso8601Msec(AvgGPS.RecTStamp) AS RecTStamp_ms,
-- StrDateIso8601MSec is a VBA function returning a TEXT string in yyyy-mm-dd hh:nn:ss.lll format
AvgGPS.ReceivID,
RD.Receiver_Location_Description,
RD.Lat AS Receiver_Lat,
RD.Lon AS Receiver_Lon,
AvgGPS.Before_Lat * (1 - AvgGPS.AfterWeight) + AvgGPS.After_Lat * AvgGPS.AfterWeight AS Xmit_Lat,
AvgGPS.Before_Lon * (1 - AvgGPS.AfterWeight) + AvgGPS.After_Lon * AvgGPS.AfterWeight AS Xmit_Lon,
AvgGPS.RecTStamp AS RecTStamp_basic
FROM ( SELECT
AfterTimestampID.RecTStamp,
AfterTimestampID.XmitID,
AfterTimestampID.ReceivID,
GPSBefore.XTStamp AS BeforeXTStamp,
GPSBefore.Latitude AS Before_Lat,
GPSBefore.Longitude AS Before_Lon,
GPSAfter.XTStamp AS AfterXTStamp,
GPSAfter.Latitude AS After_Lat,
GPSAfter.Longitude AS After_Lon,
( (AfterTimestampID.RecTStamp - GPSBefore.XTStamp) / (GPSAfter.XTStamp - GPSBefore.XTStamp) ) AS AfterWeight
FROM (
(SELECT
ReceiverRecord.RecTStamp,
ReceiverRecord.ReceivID,
ReceiverRecord.XmitID,
(SELECT TOP 1 XmitGPS.X_ID FROM SecondTable as XmitGPS WHERE ReceiverRecord.RecTStamp < XmitGPS.XTStamp ORDER BY XmitGPS.X_ID) AS AfterXmit_ID
FROM FirstTable AS ReceiverRecord
-- WHERE ReceiverRecord.XmitID IN (select XmitID from ValidXmitters)
GROUP BY RecTStamp, ReceivID, XmitID
) AS AfterTimestampID INNER JOIN SecondTable AS GPSAfter ON AfterTimestampID.AfterXmit_ID = GPSAfter.X_ID
) INNER JOIN SecondTable AS GPSBefore ON AfterTimestampID.AfterXmit_ID = GPSBefore.X_ID + 1
) AS AvgGPS INNER JOIN ReceiverDetails AS RD ON (AvgGPS.ReceivID = RD.ReceivID) AND (AvgGPS.RecTStamp BETWEEN RD.Beginning AND RD.Ending)
ORDER BY AvgGPS.RecTStamp, AvgGPS.ReceivID;
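...which returns 152928 records, conforming (at least approximately) to the final number of expected records. Run time is probably 5-10 minutes on my i7-4790, 16GB RAM, no SSD, Win 8.1 Pro system.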