Match Roman Numerals


19

Challenge

Given some input string, return a truthy value if it represents a correct Roman numeral between 1 (= I) and 3999 (= MMMCMXCIX), and a falsy value otherwise.

Details

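  • The input is a non-empty string that consists only of the characters IVXLCDM.
  • Roman numerals (as we use them in this challenge) are defined as follows: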

We use only the following symbols:

Symbol  I   V   X   L   C   D    M
Value   1   5  10  50 100 500 1000

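To define which strings are actually valid Roman numerals, it is probably easiest to give the conversion rule. To write a decimal number a3 a2 a1 a0 (where each ai represents one digit; for example, to represent 792 we have a3=0, a2=7, a1=9, a0=2) as a Roman numeral, we decompose it into powers of ten. The different powers of ten can be written as follows: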

      1-9: I, II, III, IV, V, VI, VII, VIII, IX
    10-90: X, XX, XXX, XL, L, LX, LXX, LXXX, XC
  100-900: C, CC, CCC, CD, D, DC, DCC, DCCC, CM
1000-3000: M, MM, MMM

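Beginning at the left side with the most significant digit, we convert the number that each digit represents separately and concatenate the results. For the example above, this looks as follows: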

Digit        a3    a2   a1   a0
Decimal       0     7    9    2
Roman             DCC   XC   II

So the Roman numeral for 792 is DCCXCII. Here is a full list of all Roman numerals that are relevant for this challenge: OEIS a006968.txt
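The conversion rule above can be sketched in Python as follows. This is only an illustration of the rule as stated in the challenge, not one of the submitted answers; the names `to_roman`, `VALID` and `is_valid` are made up for this example.

```python
# Digit tables from the challenge: index d holds the numeral for digit d
# (digit 0 contributes nothing, hence the empty string).
ONES      = ["", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]
TENS      = ["", "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"]
HUNDREDS  = ["", "C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM"]
THOUSANDS = ["", "M", "MM", "MMM"]


def to_roman(n: int) -> str:
    """Convert 1 <= n <= 3999 by translating each decimal digit separately."""
    if not 1 <= n <= 3999:
        raise ValueError("n must be in [1, 3999]")
    a3, a2, a1, a0 = n // 1000, n // 100 % 10, n // 10 % 10, n % 10
    return THOUSANDS[a3] + HUNDREDS[a2] + TENS[a1] + ONES[a0]


# Every valid numeral of the challenge, built once.
VALID = {to_roman(n) for n in range(1, 4000)}


def is_valid(s: str) -> bool:
    """A string is valid exactly when it is the numeral of some n in [1, 3999]."""
    return s in VALID
```

With these definitions, `to_roman(792)` gives `DCCXCII`, matching the worked example, and `is_valid` answers the challenge by plain set membership.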

Examples

Truthy

MCCXXXIV (1234)
CMLXXXVIII (988)
DXIV (514)
CI (101)

Falsy

MMIXVIII
IVX
IXV
MMMM
XXXVX
IVI
VIV


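I still think this does not qualify, since the set of invalid inputs is larger. This challenge only refers to the "well-defined" numerals used in OEIS A006968.
flawr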

2
Why is MMMM invalid? Is there a letter for 5000 that should be used instead, as in M<letter>?
Skyler

Check the spec, there is no such letter. The only symbols used are I,V,X,L,C,D,M.
flawr

Answers:


17

Verbose, 1362 bytes

GET A ROMAN NUMERAL TYPED IN BY THE CURRENT PERSON USING THIS PROGRAM AND PUT IT ONTO THE TOP OF THE PROGRAM STACK
PUT THE NUMBER MMMM ONTO THE TOP OF THE PROGRAM STACK
MOVE THE FIRST ELEMENT OF THE PROGRAM STACK TO THE SECOND ELEMENT'S PLACE AND THE SECOND ELEMENT OF THE STACK TO THE FIRST ELEMENT'S PLACE
DIVIDE THE FIRST ELEMENT OF THE PROGRAM STACK BY THE SECOND ELEMENT OF THE PROGRAM STACK AND PUT THE RESULT ONTO THE TOP OF THE PROGRAM STACK
PUT THE NUMBER V ONTO THE TOP OF THE PROGRAM STACK
GET THE FIRST ELEMENT OF THE PROGRAM STACK AND THE SECOND ELEMENT OF THE PROGRAM STACK AND IF THE SECOND ELEMENT OF THE PROGRAM STACK IS NOT ZERO JUMP TO THE INSTRUCTION THAT IS THE CURRENT INSTRUCTION NUMBER AND THE FIRST ELEMENT ADDED TOGETHER'S RESULT
PUT THE NUMBER I ONTO THE TOP OF THE PROGRAM STACK
GET THE TOP ELEMENT OF THE STACK AND OUTPUT IT FOR THE CURRENT PERSON USING THIS PROGRAM TO SEE
PUT THE NUMBER III ONTO THE TOP OF THE PROGRAM STACK
GET THE FIRST ELEMENT OF THE PROGRAM STACK AND THE SECOND ELEMENT OF THE PROGRAM STACK AND IF THE SECOND ELEMENT OF THE PROGRAM STACK IS NOT ZERO JUMP TO THE INSTRUCTION THAT IS THE CURRENT INSTRUCTION NUMBER AND THE FIRST ELEMENT ADDED TOGETHER'S RESULT
PUT THE NUMBER NULLA ONTO THE TOP OF THE PROGRAM STACK
GET THE TOP ELEMENT OF THE STACK AND OUTPUT IT FOR THE CURRENT PERSON USING THIS PROGRAM TO SEE

Outputs I for valid Roman numerals in the range I-MMMCMXCIX; otherwise outputs NULLA (0) or informs the user that the input is not a valid Roman numeral.


12
I can't decide whether this is the right tool for the job or not.
Vaelus

5
Is this the right tool for any job?
omzrs

8

C# (Visual C# Interactive Compiler), 79 109 bytes

This seems like a regex challenge; I'm sure a shorter solution can be found...

s=>System.Text.RegularExpressions.Regex.IsMatch(s,"^M{0,3}(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})$")

Try it online!
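For readers who want to experiment with this pattern outside C#, here is a minimal Python sketch of the same regex. It is not part of the answer; `ROMAN` and `is_roman` are names chosen for this example, and `fullmatch` plays the role of the `^...$` anchors. Note that the pattern itself also matches the empty string, which the challenge rules out by guaranteeing non-empty input.

```python
import re

# Same four groups as the C# pattern: thousands, hundreds, tens, units.
ROMAN = re.compile(r"M{0,3}(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})")


def is_roman(s: str) -> bool:
    # fullmatch requires the whole string to match, like ^...$ in the C# call.
    return ROMAN.fullmatch(s) is not None


if __name__ == "__main__":
    for case in ["MCCXXXIV", "CMLXXXVIII", "DXIV", "CI", "MMIXVIII", "IVX", "MMMM"]:
        print(case, is_roman(case))
```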


Couldn't you shorten {0,3} to {,3}?
flawr

@flawr that doesn't seem to capture anything
Innat3

1
Ah sorry, only things like {5,} work, but not {,5}.
flawr

2
You can add it as a compiler flag instead, so it's 72 bytes, and the language should be changed to C# (Visual C# Interactive Compiler) with flag /u:System.Text.RegularExpressions.Regex, like this answer :)
Kevin Cruijssen

3
Alternative regex: ^M?M?M?(C[MD]|D?C?C?C?)(X[CL]|L?X?X?X?)(I[XV]|V?I?I?I?)$. Same length, but it looks weirder (which is the goal, right?)
Embodiment of Ignorance

8

Wolfram Language (Mathematica), 35 bytes

Check[FromRomanNumeral@#<3999,1<0]&

Try it online!

Saved 5 bytes thanks to @attinat

The restriction to [1,3999] unfortunately costs 7 bytes...
Here is the code for any Roman numeral:

Wolfram Language (Mathematica), 28 bytes

Check[FromRomanNumeral@#,F]&

Try it online!

The code above works for any number, not only those in [1,3999].


2
@ExpiredData "The input is a non-empty string that consists only of the characters IVXLCDM."
mathmandan

35 bytes. Boole is also shorter (by one byte) than using If in that way.
attinat

8

CP-1610 assembly (Intellivision), 52 ... 48 47 DECLEs¹ = 59 bytes

Let's try this on a system that predates Perl by 7 years. :-)

Takes a pointer to a null-terminated string in R4. Sets the Zero flag if the input is a valid Roman numeral, or clears it otherwise.

                ROMW    10              ; use 10-bit ROM width
                ORG     $4800           ; map this program at $4800

                ;; ------------------------------------------------------------- ;;
                ;;  test code                                                    ;;
                ;; ------------------------------------------------------------- ;;
4800            EIS                     ; enable interrupts

4801            SDBD                    ; R5 = pointer into test case index
4802            MVII    #ndx,     R5
4805            MVII    #$214,    R3    ; R3 = backtab pointer
4807            MVII    #11,      R0    ; R0 = number of test cases

4809  loop      SDBD                    ; R4 = pointer to next test case
480A            MVI@    R5,       R4
480B            PSHR    R0              ; save R0, R3, R5 onto the stack
480C            PSHR    R3
480D            PSHR    R5
480E            CALL    isRoman         ; invoke our routine
4811            PULR    R5              ; restore R5 and R3
4812            PULR    R3

4813            MVII    #$1A7,    R0    ; use a white 'T' by default
4815            BEQ     disp

4817            MVII    #$137,    R0    ; or a white 'F' if the Z flag was cleared

4819  disp      MVO@    R0,       R3    ; draw it
481A            INCR    R3              ; increment the backtab pointer

481B            PULR    R0              ; restore R0
481C            DECR    R0              ; and advance to the next test case, if any
481D            BNEQ    loop

481F            DECR    R7              ; loop forever

                ;; ------------------------------------------------------------- ;;
                ;;  test cases                                                   ;;
                ;; ------------------------------------------------------------- ;;
4820  ndx       BIDECLE test0, test1, test2, test3
4828            BIDECLE test4, test5, test6, test7, test8, test9, test10

                ; truthy
4836  test0     STRING  "MCCXXXIV", 0
483F  test1     STRING  "CMLXXXVIII", 0
484A  test2     STRING  "DXIV", 0
484F  test3     STRING  "CI", 0

                ; falsy
4852  test4     STRING  "MMIXVIII", 0
485B  test5     STRING  "IVX", 0
485F  test6     STRING  "IXV", 0
4863  test7     STRING  "MMMM", 0
4868  test8     STRING  "XXXVX", 0
486E  test9     STRING  "IVI", 0
4872  test10    STRING  "VIV", 0

                ;; ------------------------------------------------------------- ;;
                ;;  routine                                                      ;;
                ;; ------------------------------------------------------------- ;;
      isRoman   PROC

4876            PSHR    R5              ; push the return address

4877            MOVR    R7,       R2    ; R2 = dummy 1st suffix
4878            MOVR    R2,       R5    ; R5 = pointer into table
4879            ADDI    #@tbl-$+1,R5

487B  @loop     MVI@    R5,       R1    ; R1 = main digit (M, C, X, I)
487C            MVI@    R5,       R3    ; R3 = prefix or 2nd suffix (-, D, L, V)

487D            MVI@    R4,       R0    ; R0 = next digit

487E            CMPR    R0,       R3    ; if this is the prefix ...
487F            BNEQ    @main

4881            COMR    R2              ; ... disable the suffixes
4882            COMR    R3              ; by setting them to invalid values
4883            MVI@    R4,       R0    ; and read R0 again

4884  @main     CMPR    R0,       R1    ; if R0 is not equal to the main digit,
4885            BNEQ    @back           ; assume that this part is over

4887            MVI@    R4,       R0    ; R0 = next digit
4888            CMPR    R0,       R1    ; if this is a 2nd occurrence
4889            BNEQ    @suffix         ; of the main digit ...

488B            CMP@    R4,       R1    ; ... it may be followed by a 3rd occurrence
488C            BNEQ    @back

488E            MOVR    R2,       R0    ; if so, force the test below to succeed

488F  @suffix   CMPR    R0,       R2    ; otherwise, it may be either the 1st suffix
4890            BEQ     @next
4892            CMPR    R0,       R3    ; or the 2nd suffix (these tests always fail
4893            BEQ     @next           ; if the suffixes were disabled above)

4895  @back     DECR    R4              ; the last digit either belongs to the next
                                        ; iteration or is invalid

4896  @next     MOVR    R1,       R2    ; use the current main digit
                                        ; as the next 1st suffix

4897            SUBI    #'I',     R1    ; was it the last iteration? ...
4899            BNEQ    @loop

489B            CMP@    R4,       R1    ; ... yes: make sure that we've also reached
                                        ; the end of the input

489C            PULR    R7              ; return

489D  @tbl      DECLE   'M', '-'        ; table format: main digit, 2nd suffix
489F            DECLE   'C', 'D'
48A1            DECLE   'X', 'L'
48A3            DECLE   'I', 'V'

                ENDP

How?

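The regular expression can be rewritten as 4 groups with the same structure, provided that # is any invalid character that is guaranteed not to be present in the input string.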

                 +-------+---> main digit
                 |       |
(M[##]|#?M{0,3})(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})
                   ||  |
                   |+--+-----> prefix or second suffix
                   |
                   +---------> first suffix

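The first suffix of group N is the main digit of group N-1. Therefore, each pattern can be stored as the pair (main_digit, second_suffix) alone.

Our routine attempts to parse the input string character by character according to these patterns and eventually checks whether the end of the string has been reached.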
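The parsing strategy described above can be sketched in Python as follows. This is a hedged illustration that follows the grouped-regex structure and the (main_digit, second_suffix) table, not a register-level transcription of the assembly; the names `TABLE` and `is_roman` and the use of a `"\0"` sentinel to mimic the null terminator are assumptions of this example.

```python
# Table format as in the routine: (main digit, second suffix).
# None stands in for the '-' placeholder of the M group.
TABLE = [("M", None), ("C", "D"), ("X", "L"), ("I", "V")]


def is_roman(s: str) -> bool:
    text = s + "\0"          # sentinel, playing the role of the null terminator
    i = 0                    # read position in the input
    first_suffix = None      # main digit of the previous group (M for the C group, ...)
    for main, second_suffix in TABLE:
        if text[i] == second_suffix:
            # Five-symbol used as a prefix (D, L, V): up to three main digits may follow.
            i += 1
            count = 0
            while count < 3 and text[i] == main:
                i += 1
                count += 1
        elif text[i] == main:
            i += 1
            if text[i] in (first_suffix, second_suffix):
                i += 1       # subtractive pair such as CM, CD, XC, XL, IX, IV
            else:
                count = 1    # plain repetition: at most three main digits in total
                while count < 3 and text[i] == main:
                    i += 1
                    count += 1
        # Otherwise this group is empty and the character belongs to a later group.
        first_suffix = main
    # Valid only if the whole (non-empty) input was consumed.
    return i > 0 and text[i] == "\0"
```

The final check mirrors the last step of the routine: once all four groups have been tried, the next character must be the terminator.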

Output

(screenshot from jzIntv)


1. A CP-1610 opcode is encoded with a 10-bit value, known as a "DECLE". This routine is 47 DECLEs long, starting at $4876 and ending at $48A4 (inclusive).


Wouldn't this be one of the few places where a fractional byte count is valid?
ASCII-only

I used to think so, @ASCII-only, but I'm not sure. See the comments on this answer for some insight about this.
Arnauld

@ASCII-only Also, I've just found this post in meta that tends to confirm that it's best to round up to whole bytes.
Arnauld

Ah, so it's only 10 bits when it's in RAM?
ASCII-only

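The program is never stored in RAM, only in ROM. So it depends on the memory chips used in the cartridge. The CPU is designed to access either 10-bit or 16-bit ROM. The "ROMW 10" directive forces the compiler to generate code in 10-bit format.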
Arnauld

7

Java 8, 70 bytes

s->s.matches("M{0,3}(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})")

Port of @Innat3's C# answer, so make sure to upvote him!

Try it online.

Explanation:

s->                // Method with String parameter and boolean return-type
  s.matches("...") //  Check if the string matches the regex fully
                   //  (which implicitly adds a leading "^" and trailing "$")

M{0,3}             // No, 1, 2, or 3 adjacent "M"
(     |        )   // Followed by either:
 C[MD]             //  A "C" with an "M" or "D" after it
      |            // or:
       D?          //  An optional "D"
         C{0,3}    //  Followed by no, 1, 2, or 3 adjacent "C"
(     |        )   // Followed by either:
 X[CL]             //  An "X" with a "C" or "L" after it
      |            // or:
       L?          //  An optional "L"
         X{0,3}    //  Followed by no, 1, 2, or 3 adjacent "X"
(     |        )   // Followed by either:
 I[XV]             //  An "I" with an "X" or "V" after it
      |            // or:
       V?          //  An optional "V"
         I{0,3}    //  Followed by no, 1, 2, or 3 adjacent "I"

5

R, 74 71 56 bytes

Thanks to @RobinRyder, @Giuseppe, and @MickyT for their suggestions on how to use grep effectively with R's built-in as.roman.

sub("^M(.+)","\\1",scan(,""))%in%paste(as.roman(1:2999))

Try it online!
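The trick in the R one-liner (strip one leading M when something follows it, then look the rest up among the numerals for 1 to 2999) can be sketched in Python. Python has no built-in counterpart to `as.roman`, so the greedy `to_roman` helper below is a stand-in written for this example; all names are hypothetical.

```python
# Value/symbol pairs for a greedy conversion (stand-in for R's as.roman).
PAIRS = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"), (90, "XC"),
         (50, "L"), (40, "XL"), (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]


def to_roman(n: int) -> str:
    out = []
    for value, symbol in PAIRS:
        count, n = divmod(n, value)   # how many times this symbol fits
        out.append(symbol * count)
    return "".join(out)


# Counterpart of paste(as.roman(1:2999)).
UP_TO_2999 = {to_roman(n) for n in range(1, 3000)}


def is_roman(s: str) -> bool:
    # Counterpart of sub("^M(.+)", "\\1", s): drop one leading M,
    # but only when at least one more character follows it.
    if s.startswith("M") and len(s) > 1:
        s = s[1:]
    return s in UP_TO_2999
```

Stripping a single M maps 3000 to 3999 onto 2000 to 2999, while MMMM becomes MMM (3000), which is deliberately outside the lookup set.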


as.roman won't work anyway, since it only works up to 3899 for some reason.
Giuseppe

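I really should read the documentation more carefully. It is probably because 4000 has no clear representation in Roman numerals, so how would one write 3900? This is similar to 390. I have also just found a problem with my grep: I have to anchor the pattern.
CT Hall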

@Giuseppe, addressed, using the same regex as the other answers.
CT Hall

2
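66 bytes using as.roman: first strip the leading M if there is one, then check whether the result is in as.roman(1:2999). This requires special handling of the case where the input is M.
Robin Ryder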

1
My final question is: who decided that Roman numerals should be put into R? They were added in 2.5.0 (April 2007)...
Giuseppe


2

Jelly, 48 47 46 44 bytes

-1 thanks to Nick Kennedy

5Żo7;“ÆæC‘ð“IVXLCDM”ṃ@3Ƥm2”MẋⱮ3¤ṭŻ€ṚŒpF€ḟ€0ċ

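A monadic Link accepting a non-empty list of characters consisting only of IVXLCDM, which yields 1 if it is a valid Roman numeral between 1 and 3999, and 0 if not.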

Try it online! Or see the test-suite.

How?

5Żo7;“ÆæC‘ð“IVXLCDM”ṃ@3Ƥm2”MẋⱮ3¤ṭŻ€ṚŒpF€ḟ€0ċ  - Main Link: list of characters S

5Żo7;“ÆæC‘  - chain 1: f(S) -> X
5Ż          - zero range of five = [0,1,2,3,4,5]
  o7        - OR seven             [7,1,2,3,4,5]
     “ÆæC‘  - list of code-page indices        [13,22,67]
    ;       - concatenate          [7,1,2,3,4,5,13,22,67]

          ð - start a new dyadic chain...

“IVXLCDM”ṃ@3Ƥm2”MẋⱮ3¤ṭŻ€ṚŒpF€ḟ€0ċ - chain 2: f(X,S) -> isValid
“IVXLCDM”                         - list of characters, IVXLCDM
           3Ƥ                     - for infixes of length three:
                                  - (i.e. IVX VXL XLC LCD CDM)
         ṃ@                       -   base decompression with swapped arguments
                                  -   (i.e. use characters as base-3 digits of X's values)
                                  -   (e.g. IVX -> VI I V IX II IV III VII VIII)
             m2                   - modulo two slice (results for IVX XLC and CDM only)
                    ¤             - nilad followed by link(s) as a nilad:
               ”M                 -   character 'M'
                  Ɱ3              -   map across [1,2,3] with:
                 ẋ                -     repeat -> M MM MMM
                     ṭ            - tack
                      Ż€          - prepend a zero to each
                        Ṛ         - reverse
                                  -   -- now we have the table: 
                                  -    0 M MM MMM
                                  -    0 DC C D CM CC CD CCC DCC DCCC
                                  -    0 LX X L XC XX XL XXX LXX LXXX
                                  -    0 VI I V IX II IV III VII VIII
                         Œp       - Cartesian product   [[0,0,0,0],...,["M","CM",0,"IV"],...]
                           F€     - flatten €ach  [[0,0,0,0],...,['M','C','M',0,'I','V'],...]
                             ḟ€0  - filter out the zeros from €ach       ["",...,"MCMIV",...]
                                ċ - count occurrences of S
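As a rough Python illustration of the two ideas in this explanation (decoding the numbers 7,1,2,3,4,5,13,22,67 into digit strings over a three-symbol alphabet, then taking a Cartesian product of the four rows), consider the sketch below. It models Jelly's base decompression only as far as the explanation above describes it (digit 0 selects the last symbol); the function and variable names are invented for this example.

```python
from itertools import product

CODES = [7, 1, 2, 3, 4, 5, 13, 22, 67]   # the list built by the first chain


def base_decompress(n: int, alphabet: str) -> str:
    """Write n in base len(alphabet); digit d picks alphabet[d-1], and 0 picks the last symbol."""
    base = len(alphabet)
    digits = []
    while n:
        n, d = divmod(n, base)
        digits.append(d)
    # alphabet[d - 1] with d == 0 is alphabet[-1], i.e. the last symbol.
    return "".join(alphabet[d - 1] for d in reversed(digits))


def build_table():
    symbols = "IVXLCDM"
    rows = []
    for start in (0, 2, 4):               # the trios IVX, XLC, CDM
        trio = symbols[start:start + 3]
        rows.append([""] + [base_decompress(c, trio) for c in CODES])
    rows.append(["", "M", "MM", "MMM"])
    return rows[::-1]                     # thousands first, units last


# Cartesian product of the four rows, joined; the empty string is dropped here
# because the challenge guarantees non-empty input.
VALID = {"".join(parts) for parts in product(*build_table())} - {""}


def is_roman(s: str) -> bool:
    return s in VALID
```

For instance, `base_decompress(7, "IVX")` yields `VI` and `base_decompress(3, "IVX")` yields `IX`, in line with the row shown for IVX in the table above.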

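There seems to be a redundant space on the first line, which is one byte. Another byte can be saved by using a simpler first line. Try it online!
Nick Kennedy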

Thanks, I've saved another one from that.
Jonathan Allan

1

Perl 5 (-p), 57 bytes

$_=/^M*(C[MD]|D?C*)(X[CL]|L?X*)(I[XV]|V?I*)$/&!/(.)\1{3}/

TIO

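  • Uses almost the same regex, except that the {0,3} quantifier was changed to *.
  • &!/(.)\1{3}/ ensures that the same character cannot occur 4 times in a row.
  • Can't be golfed with -/(.)\1{3}/ because that would give -1 for IIIIVI, for example.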
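The two-step check described in these bullets (a looser pattern with `*`, combined with a separate "no four identical characters in a row" test) can be sketched in Python as below. This is an illustration only; `LOOSE`, `FOUR_IN_A_ROW` and `is_roman` are names made up for this example.

```python
import re

# Looser structure check: any number of repeats is allowed here.
LOOSE = re.compile(r"M*(C[MD]|D?C*)(X[CL]|L?X*)(I[XV]|V?I*)")
# Backreference \1 repeated three more times: four identical characters in a row.
FOUR_IN_A_ROW = re.compile(r"(.)\1{3}")


def is_roman(s: str) -> bool:
    # Valid when the structure matches and no symbol is repeated four times.
    return LOOSE.fullmatch(s) is not None and FOUR_IN_A_ROW.search(s) is None
```

The second test restores the limit of three repeats that the `*` quantifiers gave up.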

1

Python 2, 81 bytes

import re
re.compile('M{,3}(D?C{,3}|C[DM])(L?X{,3}|X[LC])(V?I{,3}|I[VX])$').match

Try it online!

Let's look at the last part of the regex, which matches the Roman numerals up to 9 (including the empty string):

V?I{,3}|I[VX]

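This has two alternatives, separated by |:

  • V?I{,3}: an optional V followed by up to 3 I's. This matches the empty string, I, II, III, V, VI, VII, and VIII.
  • I[VX]: an I followed by a V or an X. This matches IV and IX.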

The same goes for X,L,C matching the tens and C,D,M matching the hundreds, and finally ^M{,3} allows up to 3 M's (thousands) at the start.

I tried generating the template for each trio of characters rather than writing it out 3 times, but that was a lot longer.
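To make the "template per trio" idea concrete, here is an ungolfed Python 3 sketch that builds the same pattern from one template applied to the trios CDM, XLC and IVX. It is not the author's attempted golf (which is not shown in the answer); `TEMPLATE`, `TRIOS` and `build_pattern` are hypothetical names.

```python
import re

# One group per power of ten: {five}?{one}{,3} or {one}[{five}{ten}].
# Doubled braces produce the literal {,3} quantifier after formatting.
TEMPLATE = "({five}?{one}{{,3}}|{one}[{five}{ten}])"
TRIOS = ["CDM", "XLC", "IVX"]   # (one, five, ten) symbols for hundreds, tens, units


def build_pattern() -> "re.Pattern[str]":
    groups = "".join(TEMPLATE.format(one=o, five=f, ten=t) for o, f, t in TRIOS)
    # re.match anchors at the start, so only the trailing $ is needed.
    return re.compile("M{,3}" + groups + "$")


ROMAN = build_pattern()


def is_roman(s: str) -> bool:
    return ROMAN.match(s) is not None
```

The generated pattern string is identical to the hand-written one in the answer, which is why the template only pays off when readability matters more than byte count.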


No need for the ^ anchor at the start; match already implies that it matches at the beginning of the string.
ShadowRanger

@ShadowRanger Thanks, I removed the ^.
xnor

Though I think you messed up the count in the edit; it should be 83, not 81.
ShadowRanger

@ShadowRanger The count is 81 because the f= isn't included in the code, since anonymous functions are allowed. It's only there for TIO.
xnor

1
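Ah, that makes sense. It's annoying that there's no way to organize it so that this is hidden in the header or footer, but yes, unassigned lambdas are legal, so an unassigned bound method of a compiled regex should be fine too.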
ShadowRanger

1

Retina, 56 51 bytes

(.)\1{3}
0
^M*(C[MD]|D?C*)(X[CL]|L?X*)(I[XV]|V?I*)$

Port of @NahuelFouilleul's Perl 5 answer, so make sure to upvote him!

Try it online or verify all test cases.

Explanation:

(.)\1{3}        # If four adjacent characters can be found which are the same
0               # Replace it with a 0

^...$           # Then check if the string matches the following fully:
 M*             #  No or any amount of adjacent "M"
 (     |    )   #  Followed by either:
  C[MD]         #   A "C" with an "M" or "D" after it
       |        #  or:
        D?      #   An optional "D"
          C*    #   Followed by no or any amount of adjacent "C"
 (     |    )   #  Followed by either:
  X[CL]         #   An "X" with a "C" or "L" after it
       |        #  or:
        L?      #   An optional "L"
          X*    #   Followed by no or any amount of adjacent "X"
 (     |    )   #  Followed by either:
  I[XV]         #   An "I" with an "X" or "V" after it
       |        #  or:
        V?      #   An optional "V"
          I*    #   Followed by no or any amount of adjacent "I"

1

05AB1E, 61 9 8 bytes

ŽF¯L.XIå

A whopping -52 bytes thanks to @Adnan, because apparently 05AB1E's Roman numeral builtin wasn't documented, haha.. xD

Try it online or verify all test cases.

Explanation:

ŽF¯       # Push compressed integer 3999
   L      # Create a list in the range [1,3999]
    .X    # Convert each integer in this list to a roman number string
      Iå  # Check if the input is in this list
          # (and output the result implicitly)

See this 05AB1E tip of mine (section How to compress large integers?) to understand why ŽF¯ is 3999.


Original 61-byte answer:

•1∞Γ'иÛnuÞ\₂…•Ž8вв€SÐ)v.•6#&‘нδ•u3ôNèyè}'M3L×)Rεõš}`3Fâ}€˜JIå

Try it online or verify all test cases.

Explanation:

•1∞Γ'иÛnuÞ\₂…•             '# Push compressed integer 397940501547566186191992778
              Ž8в           # Push compressed integer 2112
                 в          # Convert the integer to Base-2112 as list:
                            #  [1,11,111,12,2,21,211,2111,10]
€S                          # Convert each number to a list of digits
  Ð                         # Triplicate this list
   )                        # And wrap it into a list of lists (of lists)
    v                       # Loop `y` over each these three lists:
     .•6#&‘нδ•              #  Push compressed string "xivcxlmcd"
              u             #  Uppercased
               3ô           #  And split into parts of size 3: ["XIV","CXL","MCD"]
     Nè                     #  Use the loop index to get the current part
       yè                   #  And index the list of lists of digits into this string
    }'M                    '# After the loop: push "M"
       3L                   # Push list [1,2,3]
         ×                  # Repeat the "M" that many times: ["M","MM","MMM"]
          )                 # Wrap all lists on the stack into a list:
                            # [[["I"],["I","I"],["I","I","I"],["I","V"],["V"],["V","I"],["V","I","I"],["V","I","I","I"],["I","X"]],[["X"],["X","X"],["X","X","X"],["X","L"],["L"],["L","X"],["L","X","X"],["L","X","X","X"],["X","C"]],[["C"],["C","C"],["C","C","C"],["C","D"],["D"],["D","C"],["D","C","C"],["D","C","C","C"],["C","M"]],["M","MM","MMM"]]
           R                # Reverse this list
            εõš}            # Prepend an empty string "" before each inner list
                `           # Push the four lists onto the stack
                 3F         # Loop 3 times:
                   â        #  Take the cartesian product of the two top lists
                    }€˜     # After the loop: flatten each inner list
                       J    # Join each inner list together to a single string
                        Iå  # And check if the input is in this list
                            # (after which the result is output implicitly)

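See this 05AB1E tip of mine (sections How to compress strings not part of the dictionary?, How to compress large integers?, and How to compress integer lists?) to understand why:

  • •1∞Γ'иÛnuÞ\₂…• is 397940501547566186191992778
  • Ž8в is 2112
  • •1∞Γ'иÛnuÞ\₂…•Ž8вв is [1,11,111,12,2,21,211,2111,10]
  • .•6#&‘нδ• is "xivcxlmcd"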

1
I'm not sure why .X isn't documented, but I think this should work: 3999L.XQO
Adnan

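@Adnan Haha, -52 bytes right there. I completely forgot that you did tell us about adding a Roman numeral builtin. I'll ask @Mr.Xcoder in chat to add it to the docs. Are there any other commands missing? ;) PS: Saved another byte by compressing 3999. :)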
Kevin Cruijssen

0

perl -MRegexp::Common -pe, 34 bytes

$_=/^$RE{num}{roman}$/&!/(.)\1{3}/

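The &!/(.)\1{3}/ part is necessary because Regexp::Common allows four (but not five) of the same character in a row. That way, it matches the Roman numerals used on clock faces, where IIII is often used for 4.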


0

Python 3, 116 113 109 107 105 106 bytes

import re
lambda n:re.match(r'(M{,3}(C(M|CC?|D)?|DC{,3}))(X(C|XX?|L)?|(LX{,3}))?(I(X|II?|V)?|VI{,3})?$',n)

Try it online!

-1 byte thanks to ShadowRanger


2
As I mentioned on the Py2 answer, the leading ^ is unnecessary, since match only matches at the start of a string anyway.
ShadowRanger

@ShadowRanger I added the anchors while debugging and then never tried again without them. I'll remember that now, thanks! :)
Noodle9

Well, just to be clear, the trailing $ is necessary (only fullmatch implies anchors at both ends, and it obviously costs more than a $).
ShadowRanger

@ShadowRanger Ah! That explains why I needed anchors! I didn't realise I only needed to anchor the end. Thanks again.
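The anchoring behaviour discussed in this thread can be checked with a small Python 3 sketch. It uses only the units part of the pattern to keep the example short; the helper names are made up for illustration.

```python
import re

UNITS = r"(I[VX]|V?I{,3})"   # units group only, as a small test pattern


def search_with_dollar(s: str) -> bool:
    # re.search may start anywhere, so a leading ^ would be needed here.
    return re.search(UNITS + "$", s) is not None


def match_with_dollar(s: str) -> bool:
    # re.match is already anchored at the start; only the trailing $ is needed.
    return re.match(UNITS + "$", s) is not None


def full(s: str) -> bool:
    # re.fullmatch anchors both ends without any explicit anchor.
    return re.fullmatch(UNITS, s) is not None


if __name__ == "__main__":
    for case in ["IV", "IVI", "XIV"]:
        print(case, search_with_dollar(case), match_with_dollar(case), full(case))
```

Running it shows that `match` with a trailing `$` agrees with `fullmatch` on these inputs, while `search` with only `$` wrongly accepts `XIV` for this units-only pattern.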
Noodle9

Licensed under cc by-sa 3.0 with attribution required.