Introspective Programming: Code that analyzes its source and its output


13

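Write a program that outputs the total number of characters, and the frequency of each character, in both its source and its output. You must follow the format illustrated in the example.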

If your code was

abb1

its output would have to be

My source has 4 characters.
1 is "a"
2 are "b"
1 is "1"
Besides unquoted numbers, my output has 383 characters.
34 are "
"
79 are " "
63 are """
2 are "'"
2 are ","
4 are "."
2 are "1"
2 are "B"
2 are "I"
2 are "M"
39 are "a"
4 are "b"
6 are "c"
4 are "d"
38 are "e"
3 are "g"
5 are "h"
4 are "i"
4 are "m"
3 are "n"
8 are "o"
3 are "p"
2 are "q"
38 are "r"
12 are "s"
8 are "t"
7 are "u"
3 are "y"
It's good to be a program.

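(Output must go to stdout.)

Notice, for example, that the output contains two capital M's: one in My and one in 2 are "M". This must hold for every character, so that the output does not contradict itself in any way.

Unquoted numbers in the output are ignored, to avoid unsatisfiable frequency sets. For example, 1 is "1" would be incorrect if both 1's were counted. It would have to read 2 are "1", but then there would be only one 1 again.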
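As a small illustration of what "ignoring unquoted numbers" means, the sketch below applies the same substitution that the format checker further down uses to build its filtered output. The helper name strip_unquoted_numbers is made up for this example, and the code is written for Python 3.

```python
import re

def strip_unquoted_numbers(output):
    # Drop every run of digits that has no double quote directly before or after it
    # (the same pattern the format checker uses for its filtered output).
    return re.sub(r'([^"])\d+([^"])', r'\1\2', output)

sample = 'x\n1 is "1"\n'
filtered = strip_unquoted_numbers(sample)
print(repr(filtered))       # the leading count is gone, the quoted "1" stays
print(filtered.count('1'))  # only the quoted digit is left to be counted
```

After filtering, the line's own count no longer contributes to the tally, which is what makes a consistent set of frequencies possible.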

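Format Clarifications

  • "is" must be used for single character occurrences.

  • "are" must be used for multiple character occurrences.

  • "is" should never appear in the list of output characters because it would be superfluous. 1 is 'Z' would refer to the Z in that line itself, so the entire line can be removed.

  • The three full-sentence phrases must appear in order, with the character frequency lists in between (as the example shows). So your output will start with My source... and end with ...be a program. Note that there is no newline at the end of the output.

  • The character frequency lists themselves may be in any order.

  • Newlines count as one character (in case they are \r\n).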
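The is/are rule above can be sketched as follows. This is only an illustration in Python 3; frequency_lines is a hypothetical helper, not part of the challenge or of the checker.

```python
from collections import Counter

def frequency_lines(text):
    # One line per distinct character, in order of first appearance
    # (Counter preserves insertion order on Python 3.7+).
    lines = []
    for char, n in Counter(text).items():
        verb = "is" if n == 1 else "are"
        lines.append('%d %s "%s"' % (n, verb, char))
    return lines

print("\n".join(frequency_lines("abb1")))
```

For the source abb1 this reproduces the three lines of the first list in the example.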

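Format Checker

The following Python script takes your code and its output as strings and asserts that the output contains no contradictions. It gives a useful error message if something is wrong. You can run it online at http://ideone.com/6H0ldu by forking it, replacing the CODE and OUTPUT strings, and then running it. It will never give false positives or false negatives (assuming it is bug-free).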

#Change the CODE and OUTPUT strings to test your program

CODE = r'''abb1'''

OUTPUT = r'''My source has 4 characters.
1 is "a"
2 are "b"
1 is "1"
Besides unquoted numbers, my output has 383 characters.
34 are "
"
79 are " "
63 are """
2 are "'"
2 are ","
4 are "."
2 are "1"
2 are "B"
2 are "I"
2 are "M"
39 are "a"
4 are "b"
6 are "c"
4 are "d"
38 are "e"
3 are "g"
5 are "h"
4 are "i"
4 are "m"
3 are "n"
8 are "o"
3 are "p"
2 are "q"
38 are "r"
12 are "s"
8 are "t"
7 are "u"
3 are "y"
It's good to be a program.'''

#######################################################

import re

amountPattern = r'(\d+) (is|are) "(.)"\n'

class IntrospectionException(Exception):
    pass

def getClaimedAmounts(string, errorOnIs):
    groups = re.findall(amountPattern, string, re.DOTALL)

    for amount, verb, char in groups:
        if verb == 'is':
            if errorOnIs:
                raise IntrospectionException('\'1 is "%s"\' is unnecessary' % char)
            elif amount != '1':
                raise IntrospectionException('At "%s", %s must use "are"' % (char, amount))
        elif verb == 'are' and amount == '1':
            raise IntrospectionException('At "%s", 1 must use "is"' % char)

    amounts = {}
    for amount, verb, char in groups:
        if char in amounts:
            raise IntrospectionException('Duplicate "%s" found' % char)
        amounts[char] = int(amount)
    return amounts

def getActualAmounts(string):
    amounts = {}
    for char in string:
        if char in amounts:
            amounts[char] += 1
        else:
            amounts[char] = 1
    return amounts

def compareAmounts(claimed, actual):
    for char in actual:
        if char not in claimed:
            raise IntrospectionException('The amounts list is missing "%s"' % char)
    for char in actual: #loop separately so missing character errors are all found first
        if claimed[char] != actual[char]:
            raise IntrospectionException('The amount of "%s" characters is %d, not %d' % (char, actual[char], claimed[char]))
    if claimed != actual:
        raise IntrospectionException('The amounts are somehow incorrect')

def isCorrect(code, output):
    p1 = r'^My source has (\d+) characters\.\n'
    p2 = r'Besides unquoted numbers, my output has (\d+) characters\.\n'
    p3 = r"It's good to be a program\.$"
    p4 = '%s(%s)*%s(%s)*%s' % (p1, amountPattern, p2, amountPattern, p3)

    for p in [p1, p2, p3, p4]:
        if re.search(p, output, re.DOTALL) is None:
            raise IntrospectionException('Did not match the regex "%s"' % p)

    claimedCodeSize = int(re.search(p1, output).groups()[0])
    actualCodeSize = len(code)
    if claimedCodeSize != actualCodeSize:
        raise IntrospectionException('The code length is %d, not %d' % (actualCodeSize, claimedCodeSize))

    filteredOutput = re.sub(r'([^"])\d+([^"])', r'\1\2', output)

    claimedOutputSize = int(re.search(p2, output).groups()[0])
    actualOutputSize = len(filteredOutput)
    if claimedOutputSize != actualOutputSize:
        raise IntrospectionException('The output length (excluding unquoted numbers) is %d, not %d' % (actualOutputSize, claimedOutputSize))

    splitIndex = re.search(p2, output).start()

    claimedCodeAmounts = getClaimedAmounts(output[:splitIndex], False)
    actualCodeAmounts = getActualAmounts(code)
    compareAmounts(claimedCodeAmounts, actualCodeAmounts)

    claimedOutputAmounts = getClaimedAmounts(output[splitIndex:], True)
    actualOutputAmounts = getActualAmounts(filteredOutput)
    compareAmounts(claimedOutputAmounts, actualOutputAmounts)

def checkCorrectness():
    try:
        isCorrect(CODE, OUTPUT)
        print('Everything is correct!')
    except IntrospectionException as e:
        print('Failed: %s.' % e)

checkCorrectness()

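Scoring

This is code-golf. The submission with the fewest characters wins. Submissions must pass the format checker to be valid. Standard loopholes apply, though you may read your own source code and/or hardcode your output.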


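Is reading your own source file allowed?
Ventero, 2014

@MrLore There may be other errors, but I just realized that triple quotes (''') still process backslash escapes. This might be related to your problem. I'm fixing it now.
Calvin's Hobbies, 2014

@Ventero Definitely!
Calvin's Hobbies, 2014

@MrLore The regexes do allow some false positives, yes. To fix the problem with backslashes inside triple quotes, use raw strings (r'''CODE''').
Ventero, 2014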
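To illustrate the point about raw strings, here is a minimal Python 3 sketch (the variable names are invented for this example). A backslash sequence inside ordinary triple quotes is turned into a single character, so the character counts seen by the checker would no longer match the real source.

```python
plain = '''a\tb'''   # \t is interpreted as one tab character
raw = r'''a\tb'''    # raw string: backslash and t stay two separate characters

print(len(plain), len(raw))
print('\\' in plain, '\\' in raw)
```

Pasting code into a raw triple-quoted string keeps every backslash as typed, which is why the checker's CODE and OUTPUT strings use the r prefix.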

1
@MrLore Fixed the unescaped dots. Thanks for noticing!
Calvin's Hobbies, 2014

Answers:


2

CJam - 189

{`"_~"+:T;"Besides unquoted numbers, my output has &It's good to be a program.&My source has & characters.
"'&/~_]:X2=T,X3=3i({T_&:B{TI/,(" are ":AM`I*N}fIXK=]o
XBA`N+f+2*+s:T,X3=}fK'q];}_~

Try it at http://cjam.aditsu.net/

Output:

My source has 189 characters.
3 are "{"
3 are "`"
6 are """
4 are "_"
3 are "~"
4 are "+"
5 are ":"
5 are "T"
2 are ";"
3 are "B"
8 are "e"
9 are "s"
2 are "i"
3 are "d"
17 are " "
6 are "u"
2 are "n"
2 are "q"
8 are "o"
6 are "t"
3 are "m"
2 are "b"
7 are "r"
4 are ","
2 are "y"
2 are "p"
3 are "h"
7 are "a"
5 are "&"
4 are "I"
3 are "'"
2 are "g"
2 are "."
2 are "M"
3 are "c"
2 are "
"
2 are "/"
3 are "]"
5 are "X"
2 are "2"
4 are "="
3 are "3"
2 are "("
2 are "A"
2 are "*"
2 are "N"
3 are "}"
3 are "f"
2 are "K"
Besides unquoted numbers, my output has 988 characters.
3 are "B"
108 are "e"
11 are "s"
3 are "i"
5 are "d"
214 are " "
8 are "u"
4 are "n"
3 are "q"
9 are "o"
9 are "t"
5 are "m"
4 are "b"
108 are "r"
3 are ","
4 are "y"
4 are "p"
6 are "h"
108 are "a"
3 are "I"
3 are "'"
4 are "g"
5 are "."
3 are "M"
7 are "c"
102 are "
"
2 are "{"
198 are """
2 are "`"
2 are "_"
2 are "~"
2 are "+"
2 are ":"
2 are "T"
2 are ";"
2 are "&"
2 are "/"
2 are "]"
2 are "X"
2 are "2"
2 are "="
2 are "3"
2 are "("
2 are "A"
2 are "*"
2 are "N"
2 are "}"
2 are "f"
2 are "K"
It's good to be a program.

11

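Ruby, 269 (311, 367) characters

I have three different solutions for this challenge. Each of them uses a different set of tricks:

"Proper" solution, 367 characters:

The longest solution is more or less a proof of concept that this challenge can be solved without any tricks, and it is nowhere near fully golfed. It is a true quine (i.e. it generates its own source code instead of reading it from a file) and actually calculates all the numbers it prints (code length, output length, character occurrences). Because of the way the quine works, all of the code has to be on a single line and inside a string literal.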

eval r="S='eval r=%p'%r;O=-~$.;q=\"My source has \#{S.size}\"+(X=' characters.\n')+S.chars.uniq.map{|c|[k=S.count(c),k>O ? :are: :is,?\"+c+?\"]*' '}*$/+'\nBesides unquoted numbers, my output has ';r=(w=q+X+s=\"It's good to be a program.\").scan(D=/\\D/).uniq;$><<q<<(w+v=r.map{|c|j=' are \"\n\"';(-~(w+j*r.size).count(c)).to_s+(j[~O]=c;j)}*$/+$/).scan(D).size<<X+v+s"

Partially hardcoded output, 311 characters:

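The next shortest solution uses two tricks, but is still a true quine:

  • No character occurs exactly once in the source code. That way, I don't need to decide whether to print is or are in the first half of the output. It also makes it a bit easier to calculate the total output size (though I don't actually need to do that).

  • The total output size is hardcoded. Since it depends only on the number of distinct characters in the source code (and, in the general case, on how many of those characters occur only once), it is easy to calculate in advance.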
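Why the output size can be worked out in advance: once unquoted numbers are dropped, every line of the second list has the fixed shape ` are "c"` plus a newline (9 characters), and each character of the output gets exactly one such line. The Python 3 sketch below uses that to predict the whole output for a given source. It is an illustration only, not the method of any answer here, and predict_output is a hypothetical name.

```python
from collections import Counter

HEAD = "My source has %s characters.\n"
MID = "Besides unquoted numbers, my output has %s characters.\n"
TAIL = "It's good to be a program."

def predict_output(code):
    # First list: one (count, rest-of-line) pair per distinct source character.
    src = [(n, ' %s "%s"\n' % ("is" if n == 1 else "are", c))
           for c, n in Counter(code).items()]
    # The output without the second list, with every unquoted number left out.
    base = HEAD % "" + "".join(rest for _, rest in src) + MID % "" + TAIL
    # Each distinct character of the final output gets one line in the second list.
    chars = list(dict.fromkeys(base + ' are "\n'))
    per_line = Counter(' are ""\n')  # fixed characters contributed by each such line
    base_counts = Counter(base)
    # Count = occurrences in base + its own quoted copy + share of the list lines.
    out = [(base_counts[c] + 1 + len(chars) * per_line[c], c) for c in chars]
    total = len(base) + 9 * len(chars)
    return (HEAD % len(code)
            + "".join("%d%s" % line for line in src)
            + MID % total
            + "".join('%d are "%s"\n' % line for line in out)
            + TAIL)

print(predict_output("abb1"))
```

For abb1 this prints the same counts as the example in the question (383 characters in total), although the second list comes out in a different order. In the special case where no source character occurs once and the source already contains every character of the fixed sentences, the total reduces to 106 + 18·d for d distinct source characters; d = 55 gives 1096, which agrees with the output listed for the 269-character solution.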

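Note that the code is preceded by two very significant newlines, which StackExchange wouldn't show in the code block. For that reason, I've added an extra line in front of those newlines; it is not part of the code.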

#


eval R="I=$/+$/+'eval R=%p'%R;?\\4>w='%d are \"%s\"';B=\"My source has \#{I.size}\#{X=\" characters.\n\"}\#{z=(m=I.chars.uniq).map{|x|w%[I.count(x),x]}*$/}\nBesides unquoted numbers, my output has 1114\"+X;$><<B+m.map{|c|w%[(B+z+$M=\"\nIt's good to be a program.\").gsub!(/\\d++(?!\")/,'').count(c),c]}*$/+$M"

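Shortest solution, 269 characters:

The shortest solution additionally hardcodes its own source length. By choosing variable names that are (or are not) already part of the source code, it is possible to find a "fixpoint" where every character in the source code (including the digits of the hardcoded lengths!) occurs at least twice.

This solution also saves a few more characters by simply reading its own source code from the code file instead of generating it. As a nice side effect, this makes the code much more "readable" (but who cares about readable code in a code-golf...), as the code no longer has to be inside a string literal.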

U='%d are "%s"'
O=IO.read$0
?\126>B="My source has 269#{X=" characters.
"}#{z=(m=O.chars.uniq).map{|c|U%[O.count(c),c]}*$/}
Besides unquoted numbers, my output has 1096"+X
$><<B+m.map{|c|U%[(B+z+$M="
It's good to be a program.").gsub!(/\d++(?!")/,"").count(c),c]}*$/+$M
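The "fixpoint" condition described for this solution can be checked mechanically. The sketch below is a Python 3 illustration under the assumption that the source contains the text "My source has N" with its length hardcoded; is_fixpoint and the toy string are invented for this example and say nothing about how the answer was actually found.

```python
import re
from collections import Counter

def is_fixpoint(source):
    # The hardcoded length must equal the real length ...
    claimed = int(re.search(r"My source has (\d+)", source).group(1))
    # ... and no character (digits included) may occur exactly once.
    singles = [c for c, n in Counter(source).items() if n == 1]
    return claimed == len(source) and not singles

toy = "My source has 32" * 2   # 16 characters repeated twice
print(is_fixpoint(toy))
```

Changing a variable name or a hardcoded digit changes both the length and the character counts, so candidates have to be re-checked after every tweak until both conditions hold at once.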

I also modified the test script a bit to reduce the copy-pasting needed to check the code. By replacing the definitions of CODE and OUTPUT with

import subprocess

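CODE = open("packed.rb").read()
# universal_newlines=True makes check_output return text rather than bytes on Python 3
OUTPUT = subprocess.check_output(["ruby", "packed.rb"], universal_newlines=True)

print(CODE)
print(len(CODE))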

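the script now automatically runs my code, reads its output, and grabs the source code from the code file.


Here's the output generated by the shortest code: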

My source has 269 characters.
3 are "U"
7 are "="
3 are "'"
4 are "%"
6 are "d"
17 are " "
11 are "a"
9 are "r"
9 are "e"
11 are """
11 are "s"
6 are "
"
4 are "O"
2 are "I"
10 are "."
6 are "$"
2 are "0"
2 are "?"
2 are "\"
2 are "1"
2 are "2"
3 are "6"
2 are ">"
4 are "B"
3 are "M"
2 are "y"
9 are "o"
10 are "u"
12 are "c"
4 are "h"
2 are "9"
2 are "#"
4 are "{"
2 are "X"
8 are "t"
4 are "}"
2 are "z"
6 are "("
7 are "m"
5 are "n"
2 are "i"
2 are "q"
6 are ")"
4 are "p"
4 are "|"
2 are "["
4 are ","
2 are "]"
2 are "*"
4 are "/"
3 are "b"
7 are "+"
2 are "<"
3 are "g"
2 are "!"
Besides unquoted numbers, my output has 1096 characters.
2 are "U"
2 are "="
3 are "'"
2 are "%"
5 are "d"
238 are " "
120 are "a"
120 are "r"
120 are "e"
222 are """
11 are "s"
114 are "
"
2 are "O"
3 are "I"
5 are "."
2 are "$"
2 are "0"
2 are "?"
2 are "\"
2 are "1"
2 are "2"
2 are "6"
2 are ">"
3 are "B"
3 are "M"
4 are "y"
9 are "o"
8 are "u"
7 are "c"
6 are "h"
2 are "9"
2 are "#"
2 are "{"
2 are "X"
9 are "t"
2 are "}"
2 are "z"
2 are "("
5 are "m"
4 are "n"
3 are "i"
3 are "q"
2 are ")"
4 are "p"
2 are "|"
2 are "["
3 are ","
2 are "]"
2 are "*"
2 are "/"
4 are "b"
2 are "+"
2 are "<"
4 are "g"
2 are "!"
It's good to be a program.

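Could you post a definitive copy of your code and output so I can easily test it? The code shouldn't output itself, and the output should end in a period, not a newline.
Calvin's Hobbies, 2014

@Calvin'sHobbies The first code block is my actual code. It does print the output with a trailing newline though, so give me a few minutes to fix that (this is something you should definitely mention in the spec).
Ventero, 2014

Sure, I just updated the spec.
Calvin's Hobbies, 2014

@Calvin'sHobbies Done. The first code block is the actual code, which is generated by the second code block (so that I don't have to take care of string escaping and everything while writing the code).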
Ventero, 2014
Licensed under cc by-sa 3.0 with attribution required.