MySQL: loop through a table, running a stored procedure on each entry

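I have a database of "books" (short children's stories), and it would be very useful to have a count of every word in each book.

I figured out how to get the count for a single word using: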

SELECT SUM(
    ROUND(
        (LENGTH(pageText) - LENGTH(REPLACE(pageText, 'Word', '')))
        / LENGTH('Word')
    )
) FROM pages WHERE bookID = id;

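This works great for counting one word. However, it requires me to read through each book, pick out every word, and run it through that function (I have it saved as a stored procedure).

I have a table containing every word, with no duplicates.

My question: is there a way to do some sort of "for each" loop over the Words table using the stored procedure?

That is, pass the stored procedure a book ID and a word and record the result, doing every word for every book. This would save me a lot of manual time... Is this even something I should be doing on the database side? Should I use PHP instead?

Honestly, any input is greatly appreciated!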

mysql stored-procedures

— Michael MacDonald

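You can create a table of all the words by parsing the books. Counting then becomes a single SELECT that joins books to words. No loops are needed there.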

— jkavalik '16
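A rough sketch of the set-based idea from the comment above. Rather than a table parsed out of the books, it reuses the counting expression from the question and joins every page to every word. It assumes the word table is named `words` with a column `word`; both names are guesses, since the question only says such a table exists.

```sql
-- Assumed schema (hypothetical names): words(word) holds each distinct word;
-- pages(bookID, pageText) is taken from the question.
SELECT p.bookID,
       w.word,
       -- occurrences of w.word per book: removed characters divided by word length
       SUM(ROUND((LENGTH(p.pageText)
                  - LENGTH(REPLACE(p.pageText, w.word, '')))
                 / LENGTH(w.word))) AS occurrences
  FROM pages AS p
 CROSS JOIN words AS w          -- every page paired with every word
 GROUP BY p.bookID, w.word;
```

Like the original expression, this counts substring matches (so "the" is also found inside "other"), and `REPLACE` is case-sensitive. The result could feed an `INSERT ... SELECT` if the counts need to be stored.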

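Some tasks are better done in a real programming language than in SQL. In PHP, it might be something like count(explode(' ', $pageText))+1. Or something more complex to handle multiple spaces between words, possibly involving preg_replace('/\s+/', ' ', $pageText)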

— Rick James

For Perl, it might be as short as 1+split(/\s+/, $pageText). The 1 is because the count is of spaces, not words.

— Rick James

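Create a second procedure that uses two nested cursors.

Cursors in stored procedures let you do a very un-SQL-like thing: iterate through a result set one row at a time, putting the selected column values into variables and doing things with them.

They are easily misused, because SQL is declarative rather than procedural and should rarely need "for each"-type operations, but in this case it seems like a valid application.

Once you get the hang of them, cursors are easy, but they do require a structured approach in their supporting code that isn't always intuitive.

I recently provided some fairly standard "boilerplate" code for using a cursor to call a stored procedure in an answer on Stack Overflow, and I'll borrow heavily from that answer below.

Using a cursor requires some standard boilerplate code to surround it.

You SELECT the values you want to pass from wherever you're getting them (which could be a temporary table, base table, or view, and can include calls to stored functions), and then call your existing procedure with those values.

Here's a syntactically valid example of the necessary code, with comments explaining what each component does.

This example uses 2 columns to pass 2 values to the called procedure.

Note that the events here happen in a particular order, for a reason. Variables have to be declared first, cursors have to be declared before their continue handlers, and loops have to follow all of those.

You can't do things out of order, so when you nest one cursor inside another, you have to reset the procedure scope by nesting the additional code inside BEGIN ... END blocks within the procedure body. For example, if you need a second cursor inside the loop, you just declare it inside the loop, inside another BEGIN ... END block.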

DELIMITER $$

DROP PROCEDURE IF EXISTS `my_proc` $$
CREATE PROCEDURE `my_proc`(arg1 INT) -- 1 input argument; you might need more or fewer
BEGIN

-- declare the program variables where we'll hold the values we're sending into the procedure;
-- declare as many of them as there are input arguments to the second procedure,
-- with appropriate data types.

DECLARE val1 INT DEFAULT NULL;
DECLARE val2 INT DEFAULT NULL;

-- we need a boolean variable to tell us when the cursor is out of data

DECLARE done TINYINT DEFAULT FALSE;

-- declare a cursor to select the desired columns from the desired source table1
-- the input argument (which you might or might not need) is used in this example for row selection

DECLARE cursor1 -- cursor1 is an arbitrary label, an identifier for the cursor
 CURSOR FOR
 SELECT t1.c1, 
        t1.c2
   FROM table1 t1
  WHERE c3 = arg1; 

-- this fancy spacing is of course not required; all of this could go on the same line.

-- a cursor that runs out of data throws an exception; we need to catch this.
-- when the NOT FOUND condition fires, "done" -- which defaults to FALSE -- will be set to true,
-- and since this is a CONTINUE handler, execution continues with the next statement.   

DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

-- open the cursor

OPEN cursor1;

my_loop: -- loops have to have an arbitrary label; it's used to leave the loop
LOOP

  -- read the values from the next row that is available in the cursor

  FETCH NEXT FROM cursor1 INTO val1, val2;

  IF done THEN -- this will be true when we are out of rows to read, so we go to the statement after END LOOP.
    LEAVE my_loop; 
  ELSE -- val1 and val2 will be the next values from c1 and c2 in table t1, 
       -- so now we call the procedure with them for this "row"
    CALL the_other_procedure(val1,val2);
    -- maybe do more stuff here
  END IF;
END LOOP;

-- execution continues here when LEAVE my_loop is encountered;
-- you might have more things you want to do here

-- the cursor is implicitly closed when it goes out of scope, or can be explicitly closed if desired

CLOSE cursor1;

END $$

DELIMITER ;
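The example above uses a single cursor. A minimal sketch of the nested form described earlier, applied to the question's tables, might look like the following. The names `count_all_words`, `count_word`, and the `words.word` column are assumptions for illustration; `count_word(bookID, word)` stands in for the asker's existing procedure, whose real name and signature are not given.

```sql
DELIMITER $$

DROP PROCEDURE IF EXISTS `count_all_words` $$
CREATE PROCEDURE `count_all_words`()  -- hypothetical name
BEGIN
  DECLARE v_book_id INT DEFAULT NULL;
  DECLARE books_done TINYINT DEFAULT FALSE;

  -- outer cursor: one row per book
  DECLARE book_cursor CURSOR FOR
    SELECT DISTINCT bookID FROM pages;

  DECLARE CONTINUE HANDLER FOR NOT FOUND SET books_done = TRUE;

  OPEN book_cursor;

  book_loop: LOOP
    FETCH book_cursor INTO v_book_id;
    IF books_done THEN
      LEAVE book_loop;
    END IF;

    -- a nested block opens a new scope, so variables, a cursor,
    -- and a handler can be declared again, in the required order
    BEGIN
      DECLARE v_word VARCHAR(255) DEFAULT NULL;
      DECLARE words_done TINYINT DEFAULT FALSE;

      -- inner cursor: one row per distinct word (assumed table and column names)
      DECLARE word_cursor CURSOR FOR
        SELECT word FROM words;

      -- this handler only serves the inner block, so running out of words
      -- does not end the outer loop over books
      DECLARE CONTINUE HANDLER FOR NOT FOUND SET words_done = TRUE;

      OPEN word_cursor;

      word_loop: LOOP
        FETCH word_cursor INTO v_word;
        IF words_done THEN
          LEAVE word_loop;
        END IF;
        CALL count_word(v_book_id, v_word);  -- assumed existing procedure
      END LOOP word_loop;

      CLOSE word_cursor;
    END;
  END LOOP book_loop;

  CLOSE book_cursor;
END $$

DELIMITER ;
```

Each level keeps its own "done" flag and its own handler, so the end of the inner result set is not mistaken for the end of the outer one.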

— Michael-sqlbot

Fantastic answer, very helpful! I'm not done yet, but with the resources provided I'm sure I can get the cursors working. Thanks!

— Michael MacDonald

Great! Using REPEAT/WHILE made my proc fire twice for the last record, which required an additional check, but this solves that problem.

— Nick M
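For comparison, a sketch of the REPEAT form this comment mentions, using the answer's placeholder names (`table1`, `the_other_procedure`); `my_proc_repeat` is a made-up name. As far as I understand it, a FETCH that finds no row leaves the variables at their previous values, so without the inner check the body would run once more with the last row's data.

```sql
DELIMITER $$

DROP PROCEDURE IF EXISTS `my_proc_repeat` $$
CREATE PROCEDURE `my_proc_repeat`(arg1 INT)  -- hypothetical name
BEGIN
  DECLARE val1 INT DEFAULT NULL;
  DECLARE val2 INT DEFAULT NULL;
  DECLARE done TINYINT DEFAULT FALSE;

  DECLARE cursor1 CURSOR FOR
    SELECT t1.c1, t1.c2
      FROM table1 t1
     WHERE t1.c3 = arg1;

  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

  OPEN cursor1;

  REPEAT
    FETCH cursor1 INTO val1, val2;
    -- the UNTIL test is only evaluated at the bottom of the loop, so the body
    -- needs its own guard against the final FETCH that finds no row
    IF NOT done THEN
      CALL the_other_procedure(val1, val2);
    END IF;
  UNTIL done END REPEAT;

  CLOSE cursor1;
END $$

DELIMITER ;
```

The LOOP ... LEAVE form in the answer avoids the extra guard because it exits immediately after the failed FETCH.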

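CLOSE cursor1; is missing. OPEN and CLOSE go together for cursors.

— Miss Felicia A Kovacs

@MissFeliciaAKovacs A cursor can only exist within the scope of a BEGIN/END block and is implicitly closed when it goes out of scope ... so, strictly speaking, closing the cursor is not necessary. As a matter of practice, I consider it unneeded and don't include it, but for completeness I've added the CLOSE statement along with a note.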

— Michael-sqlbot