Change the last entry in a comma-separated list


8

I have a huge text file that looks like this:

36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,3
36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,8
36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,14
36,53,15596,0.58454577855,0.26119,2.24878677855,0.116147072052964,12

The desired output is this:

36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,MI-03
36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,MI-08
36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,MI-14
36,53,15596,0.58454577855,0.26119,2.24878677855,0.116147072052964,MI-12

I have tried other related posts here and in other communities, but could not quite get what I wanted.

Update

This is the cross-posted question (I wanted Unix/perl answers as well as batch/PowerShell solutions), and it has interesting answers.

Answers:


14

An awk approach using the sprintf function (to add leading zeros):

awk -F, -v OFS=',' '$8=sprintf("MI-%02d",$8);' file

Output:

36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,MI-03
36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,MI-08
36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,MI-14
36,53,15596,0.58454577855,0.26119,2.24878677855,0.116147072052964,MI-12

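-F, - sets the comma (,) as the input field separator

-v OFS=',' - sets the output field separator to a comma, so the rebuilt line keeps its commas

$8 - refers to the eighth field

%02d - formats the argument as an integer at least 2 digits wide, padded with leading zeros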
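To see what `%02d` does in isolation, here is a small sketch using the shell's own `printf`, which understands the same conversion. The values 3, 14 and 123 are made up for illustration. The width is a minimum, so a three-digit value is not truncated.

```shell
# %02d pads with zeros to a minimum width of 2; longer numbers pass through unchanged.
# The values 3, 14 and 123 are illustrative only.
padded=$(printf 'MI-%02d\n' 3 14 123)
printf '%s\n' "$padded"
```

This prints `MI-03`, `MI-14` and `MI-123`, one per line.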


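Note that the last field in a record can be referred to as $NF.

NF is a predefined variable whose value is the number of fields in the current record.

So $NF is the same as $8 (for your input):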

awk -F, -v OFS=',' '$(NF)=sprintf("MI-%02d", $(NF))' file
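Because `NF` is recomputed for every record, a `$NF`-based command keeps working when rows have different numbers of columns. A minimal sketch, assuming two made-up rows of three and five fields fed in through `printf` rather than a file:

```shell
# Two illustrative rows with different field counts (not from the question's data).
# $NF always addresses the last field of the current row, whatever its position.
result=$(printf '1,2,3\n4,5,6,7,8\n' |
  awk -F, -v OFS=',' '{ $NF = sprintf("MI-%02d", $NF) } 1')
printf '%s\n' "$result"
```

This prints `1,2,MI-03` and `4,5,6,7,MI-08`. A hard-coded `$8` would instead create an eighth field on both rows.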

1
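A caveat (not relevant in this example, but it may apply in other cases): changing the value of one of the fields (here: $8) makes awk rebuild the whole line from its fields, and that has side effects. ex1: runs of separators are lost: echo "1   2 3    4" | awk '{$2=$2;print $0}' gives 1 2 3 4 (only one space, or OFS, is left between fields). ex2: echo "1,,,2,3,,,,4" | awk -F',' '{$2=$2;print $0}' gives 1   2 3    4 (the commas become spaces, because OFS still defaults to a single space). There may be other side effects. If assigning to a field has harmful side effects, test and take another approach (for example, a gsub on a copy of $0).
Olivier Dulac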
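One way to sidestep the rebuild described in the comment above is to edit `$0` directly instead of assigning to a field. The following is a sketch, not the commenter's own code. It assumes the last field is a plain unsigned integer, and the sample row with empty fields is invented to show that the separators survive.

```shell
# Replace the trailing integer on the line without assigning to any field,
# so awk never rebuilds $0 from its fields and repeated separators are left alone.
# Assumes the last field is an unsigned integer; the input row is illustrative.
result=$(printf '1,,,2,3,,,,4\n' |
  awk -F, '{ last = $NF; sub(/[0-9]+$/, sprintf("MI-%02d", last)); print }')
printf '%s\n' "$result"
```

This prints `1,,,2,3,,,,MI-04`, with every comma intact even though OFS was never set.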

3

You can try using awk:

awk 'BEGIN { FS = OFS = "," } { $NF = sprintf("MI-%02d", $NF); } 1' file

2

Here is a perl solution:

$ perl -F',' -lane '$last=$#F;$F[$last]=sprintf("MI-%02d",$F[$last]);print join ",", @F' input.txt                                       
36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,MI-03
36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,MI-08
36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,MI-14
36,53,15596,0.58454577855,0.26119,2.24878677855,0.116147072052964,MI-12

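The -a flag lets us treat each input line as an array, split on the separator given with -F. Basically, we change the last item in that array and then rebuild the line with join.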


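Thanks for your answer. It certainly helps if someone needs perl, but sprintf is still the core idea of your answer. It is not that I dislike it or that it is inappropriate; it just does not offer anything different from the accepted answer. +1 anyway.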
M–

1
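@Masoud Well, the main reason is that sprintf() is what you generally use to write a specifically formatted string into a variable, which is why it exists in so many other languages. I could write this in Python as well. Python has no sprintf(), but the core idea would be the same anyway: write a formatted string into a variable. Alternatively, we could operate on the array items directly and just print them. There are only so many ways to solve this kind of problem, which is basically what I am saying.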
Sergiy Kolodyazhnyy

1

With the input data as follows:

36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,3  
36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,8  
36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,14  
36,53,15596,0.58454577855,0.26119,2.24878677855,0.116147072052964,12  

in text.csv

the code below

awk -F"," '{ i = 0;
  MyOutLine = "";
  j = NF - 1;
  while ( i < j ) {
    i++;
    MyOutLine = MyOutLine""$i",";
  }
  i++;
  x = sprintf( "%.2i", $i );
  y = "MI-"x;
  MyOutLine = MyOutLine""y;
  print MyOutLine; }' ./text.csv  

produces the following output:

36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,MI-03
36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,MI-08
36,53,90478,0.58699759849,0.33616,4.83449759849,0.0695335954050315,MI-14
36,53,15596,0.58454577855,0.26119,2.24878677855,0.116147072052964,MI-12

1

Tcl

Here is my solution, done in Tcl, which reads from the file input.csv and writes the result to the file output.csv

set in [open input.csv]
set out [open output.csv w]

# gets returns -1 at end of file, so the loop never tries to format a phantom empty line
while {[gets $in line] >= 0} {
   set last_comma_pos [string last , $line]
   puts $out [string range $line 0 $last_comma_pos][format MI-%02d [string range $line $last_comma_pos+1 end]]
}

close $in
close $out


Licensed under cc by-sa 3.0 with attribution required.