Create a list of words based on binary digits

12

I have a matrix like this:

Input:

A   B   C   D   E   F   G   H   I 
0   0   0   0   1   0   0   0   1
0   0   0   1   0   0   0   0   0  
0   0   0   1   0   0   0   0   0  
1   0   0   0   0   0   0   0   0  
1   0   1   0   0   0   1   0   0  
1   0   0   1   0   0   0   1   0  
1   0   0   0   1   1   1   0   0

For each row, I want to extract the list of letters whose column holds the value 1.

Output:

E,I 
D
D
A
A,C,G  
A,D,H  
A,E,F,G

I tried splitting the header and matching the letters to the numbers, but I failed.

text-processing awk

— fusion.slope

12

In awk:

NR == 1 { for(column=1; column <= NF; column++) values[column]=$column; }
NR > 1 { output=""
        for(column=1; column <= NF; column++)
                if($column) output=output ? output "," values[column] : values[column]
        print output }

— Jeff Schaller

6

You could also use NR == 1 { split($0,values) }

— Sundeep

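That skips the second line. Consider putting a next at the end of the first block, so that you don't need to test the opposite condition for the subsequent lines.

— Ed Morton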

1

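It seems the original input text had an extra blank line, which I coded for. That line has since been removed, so it is just a matter of changing NR > 2 to NR > 1.

— Jeff Schaller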

1

Thanks for the "golfing" tip, Sundeep! I think I prefer the explicit "for" loop, since it lines up visually and logically with the "for" loop in the body.

— Jeff Schaller

1

@fusion.slope, either pass the entire code to awk as a single-quoted argument, or paste the code into a file and run it with awk -f that.script.file input-file

— Jeff Schaller

6

Another one, with perl

$ perl -lane 'if($. == 1){ @h=@F }
              else{@i = grep {$F[$_]==1} (0..$#F); print join ",",@h[@i]}
             ' ip.txt
E,I
D
D
A
A,C,G
A,D,H
A,E,F,G

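-a option to split the input line on whitespace; the fields are available in the @F array
if($. == 1){ @h=@F } saves the header if this is the first line
@i = grep {$F[$_]==1} (0..$#F) saves the indexes of the fields that are equal to 1
print join ",",@h[@i] prints only those indexes from the header array, using , as the separator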
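
For readers who don't know Perl, the same three steps can be sketched in Python. This is only an illustration and not part of the answer: the function name selected_letters and the inline sample rows are made up for the example.

```python
def selected_letters(header, row):
    """Return the header entries whose matching field in row is 1."""
    fields = row.split()                                     # like perl -a filling @F
    idx = [i for i, f in enumerate(fields) if int(f) == 1]   # like grep over 0..$#F
    return ",".join(header[i] for i in idx)                  # like join ",", @h[@i]


header = "A B C D E F G H I".split()
print(selected_letters(header, "0 0 0 0 1 0 0 0 1"))  # E,I
print(selected_letters(header, "1 0 1 0 0 0 1 0 0"))  # A,C,G
```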

— Sundeep

4

Still for fun, a zsh version:

{
   read -A a  &&
   while read -A b; do
     echo ${(j<,>)${(s<>)${(j<>)a:^b}//(?0|1)}}
   done
} < file

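${a:^b} zips the two arrays, giving A 0 B 0 C 0 D 0 E 1 F 0 G 0 H 0 I 1
${(j<>)...} joins the elements with nothing in between, so it becomes A0B0C0D0E1F0G0H0I1
${...//(?0|1)} strips every ?0 and every 1 from that, so it becomes EI
${(s<>)...} splits on nothing to get an array with one element per letter: E I
${(j<,>)...} joins those with , -> E,I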
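
The chain of expansions can be followed step by step in a rough Python equivalent. It is a sketch for illustration only: the variable names are invented, and the regular expression .0|1 stands in for the zsh pattern (?0|1).

```python
import re

a = "A B C D E F G H I".split()    # header, as read by read -A a
b = "0 0 0 0 1 0 0 0 1".split()    # one data row, as read by read -A b

zipped = [x for pair in zip(a, b) for x in pair]   # ${a:^b}
joined = "".join(zipped)                           # ${(j<>)...}
stripped = re.sub(r".0|1", "", joined)             # ${...//(?0|1)}
letters = list(stripped)                           # ${(s<>)...}
result = ",".join(letters)                         # ${(j<,>)...}

print(joined)   # A0B0C0D0E1F0G0H0I1
print(result)   # E,I
```

As in the zsh version, this relies on every header name being a single character.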

— Stéphane Chazelas

That's just plain bash, right?

— fusion.slope

1

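@fusion.slope, no, that's zsh, a different shell from bash (much more powerful and, if you ask me, better designed). bash has only borrowed a small fraction of zsh's features (such as {1..4}, <<<, **/*), and not the ones used here. Most of bash's other features are borrowed from ksh.

— Stéphane Chazelas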

3

Another awk solution:

awk 'NR==1{ split($0,a); next }   # capture the `header` fields
     { for (i=1;i<=NF;i++)         # iterating through value fields `[0 1 ...]`
           if ($i) { printf "%s",(f?","a[i]:a[i]); f=1 } 
       f=0; print "" 
     }' file

Output:

E,I
D
D
A
A,C,G
A,D,H
A,E,F,G

— Roman Perekhrest

2

Here is a solution in Perl:

use strict;

my @header = split /\s+/, <>; ## Header row; the sample input has no blank line to skip after it
while (<>) {
    my @flags = split /\s+/;
    my @letters = ();
    for my $i (0 .. scalar @flags - 1) {
        push @letters, $header[$i] if $flags[$i];
    }

    print join(',', @letters), "\n";
}

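It works by reading the header columns into an array and then, for each data row, copying the column name to an output array if the matching data column evaluates to true. The column names are then printed separated by commas.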

— 达格

2

A sed one, for fun:

sed '
  s/ //g
  1{h;d;}
  G;s/^/\
/
  :1
    s/\n0\(.*\n\)./\
\1/
    s/\n1\(.*\n\)\(.\)/\2\
\1/
  t1
  s/\n.*//
  s/./&,/g;s/,$//'

With GNU sed, you can make it a bit more legible:

sed -E '
  s/ //g # strip the spaces

  1{h;d} # hold the first line

  G;s/^/\n/ # append the held line and prepend an empty line so the
            # pattern space becomes <NL>010101010<NL>ABCDEFGHI we will
            # build the translated version in the part before the first NL
            # eating one character at a time off the start of the
            # 010101010 and ABCDEFGHI parts in a loop:
  :1
    s/\n0(.*\n)./\n\1/     # ...<NL>0...<NL>CDEFGHI becomes
                           # ...<NL>...<NL>DEFGHI (0 gone along with C)

    s/\n1(.*\n)(.)/\2\n\1/ # ...<NL>1...<NL>CDEFGHI becomes
                           # ...C<NL>...<NL>DEFGHI (1 gone but C moved to 
                           #                        the translated part)
  t1 # loop as long as any of those s commands succeed

  s/\n.*// # in the end we have "ADG<NL><NL>", strip those NLs

  s/./,&/2g # insert a , before the 2nd and following characters'

A slightly shorter version, assuming every line always has the same number of digits:

sed -E '
  s/ //g
  1{H;d}
  G
  :1
    s/^0(.*\n)./\1/
    s/^1(.*\n)(.*\n)(.)/\1\3\2/
  t1
  s/\n//g
  s/./,&/2g'

Same as above, except that the translated part and the index part are swapped, which allows a few optimisations.
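
The idea behind the loop, eating one digit and one letter per iteration and keeping the letter only when the digit is 1, can be sketched in Python. This is an illustration of the technique rather than a translation of the sed script: the function name translate is made up, and plain strings stand in for the pattern space.

```python
def translate(bits, letters):
    """Mimic the sed loop: consume one digit and one letter per iteration."""
    out = ""                                       # the "translated" part
    while bits and letters:
        bit, bits = bits[0], bits[1:]              # eat the leading digit
        letter, letters = letters[0], letters[1:]  # eat the matching letter
        if bit == "1":
            out += letter                          # a 1 moves the letter across
    return ",".join(out)                           # same effect as s/./,&/2g


print(translate("000010001", "ABCDEFGHI"))  # E,I
print(translate("100011100", "ABCDEFGHI"))  # A,E,F,G
```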

— Stéphane Chazelas

It would benefit the community if you could explain. Thanks in advance.

— fusion.slope

1

@fusion.slope, see edit.

— Stéphane Chazelas

Nice looping with the t1 command!

— fusion.slope

1

python3

python3 -c '
import sys
header = next(sys.stdin).rstrip().split()
for line in sys.stdin:
  print(*(h for h, f in zip(header, line.split()) if int(f)), sep=",")

  ' <file
E,I
D
D
A
A,C,G
A,D,H
A,E,F,G

— iruvar

0

Pure bash solution:

read -a h
while read -a r
do (
    for i in ${!r[@]}
    do 
        (( r[i] == 1 )) && y[i]=${h[i]}
    done
    IFS=,
    echo "${y[*]}")
done
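
The answer itself gives no explanation. One way to read it, sketched in Python under the assumption that a dict keyed by column index is a fair stand-in for bash's sparse array y, and that creating a fresh dict per row plays the role of the ( ... ) subshell; the function name join_row is invented for the example.

```python
def join_row(header, line):
    """Collect header names at the positions where the row holds 1."""
    r = line.split()                    # read -a r
    y = {}                              # fresh per row, like the ( ... ) subshell
    for i in range(len(r)):             # for i in ${!r[@]}
        if int(r[i]) == 1:
            y[i] = header[i]            # y[i]=${h[i]}; other indexes stay unset
    return ",".join(y[i] for i in sorted(y))   # IFS=, ; echo "${y[*]}"


header = "A B C D E F G H I".split()    # read -a h
print(join_row(header, "0 0 0 1 0 0 0 0 0"))  # D
print(join_row(header, "1 0 0 1 0 0 0 1 0"))  # A,D,H
```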

— David Ongaro

3

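Please explain how this solves the problem.

— Scott

This is left as an exercise for the reader. Assuming basic bash knowledge, LESS="+/^ {3}Array" man bash should provide all the information needed about bash arrays. Feel free to edit the answer to add any helpful explanation.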

— David Ongaro '17

-1

 void Main(string[] args)
        {
            int[,] numbers = new int[,]
            {
            {0, 0, 0, 0, 1, 0, 0, 0, 1},
            {0, 0, 0, 1, 0, 0, 0, 0, 0},
            {0, 0, 0, 1, 0, 0, 0, 0, 0},
            {1, 0, 0, 0, 0, 0, 0, 0, 0},
            {1, 0, 1, 0, 0, 0, 1, 0, 0},
            {1, 0, 0, 1, 0, 0, 0, 1, 0},
            {1, 0, 0, 0, 1, 1, 1, 0, 0}
            };
            string letters = "ABCDEFGHI";
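            for (int row = 0; row < 7; row++)
            {
                bool first = true; // no comma before the first letter of a row
                for (int col = 0; col < 9; col++)
                {
                    if (numbers[row, col] == 1)
                    {
                        // Comma-separate the letters, as the requested output shows
                        if (!first) Console.Write(",");
                        Console.Write(letters[col]);
                        first = false;
                    }
                }
                Console.WriteLine();
            }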
        }

— 乔治·雷克

3

Please explain what it does and how it works.

— Scott

And which language it is.

— fusion.slope