Medieval orthography


9

Task

Your task is to convert a text into medieval orthography.

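Details

  1. j is converted to i, and J to I.
  2. u and U at the beginning of a word are converted to v and V respectively.
  3. v and V anywhere except the beginning of a word are converted to u and U respectively.
  4. s is converted to ſ (U+017F) unless it is at the end of the word or preceded by another s.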

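Specs

  • A word is defined as a sequence of letters in abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.
  • All words will have at least two letters.
  • The input will consist only of printable ASCII characters (U+0020 to U+007E).
  • There will be no more than two consecutive occurrences of s. That is, sss will not be a substring of the input.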
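The four rules and the word definition above can be sketched as a plain character loop. This is only an illustrative reference, not one of the submitted answers; the names `LETTERS` and `medievalise` are made up for the example, and "preceded by another s" is read as a lowercase s immediately before the current one.

```python
import string

# A "word" is a run of ASCII letters only; digits and underscores separate words.
LETTERS = set(string.ascii_letters)

def medievalise(text):
    out = []
    for i, ch in enumerate(text):
        prev = text[i - 1] if i > 0 else ""
        nxt = text[i + 1] if i + 1 < len(text) else ""
        prev_is_letter = prev in LETTERS and prev != ""
        next_is_letter = nxt in LETTERS and nxt != ""
        if ch in "jJ":
            # Rule 1: j -> i, J -> I everywhere.
            out.append("i" if ch == "j" else "I")
        elif ch in "uU" and not prev_is_letter:
            # Rule 2: u/U at the start of a word -> v/V.
            out.append("v" if ch == "u" else "V")
        elif ch in "vV" and prev_is_letter:
            # Rule 3: v/V not at the start of a word -> u/U.
            out.append("u" if ch == "v" else "U")
        elif ch == "s" and next_is_letter and prev != "s":
            # Rule 4: s -> long s unless word-final or preceded by another s.
            out.append("\u017f")
        else:
            out.append(ch)
    return "".join(out)

print(medievalise("Christian Reader, I have for thy use"))
```

Because every decision looks only at the neighbouring characters, the loop needs no explicit word splitting.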

Test cases

Individual words:

Input       Output
------------------------
Joy         Ioy
joy         ioy
Universe    Vniuerſe
universe    vniuerſe
Success     Succeſs
successfull ſucceſsfull
Supervise   Superuiſe
supervise   ſuperuiſe
Super-vise  Super-viſe
I've        I've
majors      maiors
UNIVERSE    VNIUERSE
0universe   0vniuerſe
0verify     0verify
I0ve        I0ve
_UU_        _VU_
_VV_        _VU_
ss_         ſs_

Whole paragraph:

Input:  Christian Reader, I have for thy use collected this small Concordance, with no small labour. For being to comprise much in little roome, I was to make choyse of the most principall and usefull places, and to rank them under such words as I thought most essentiall and materiall in the sentence, because the scant roome allotted unto me, would not permit that I should expresse them under every word in the verse, as it is the manner in large Concordances.

Output: Chriſtian Reader, I haue for thy vſe collected this ſmall Concordance, with no ſmall labour. For being to compriſe much in little roome, I was to make choyſe of the moſt principall and vſefull places, and to rank them vnder ſuch words as I thought moſt eſsentiall and materiall in the ſentence, becauſe the ſcant roome allotted vnto me, would not permit that I ſhould expreſse them vnder euery word in the verſe, as it is the manner in large Concordances.

The SHA-256 hash of the output of the last test case is:

5641899e7d55e6d1fc6e9aa4804f2710e883146bac0e757308afc58521621644
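A minimal sketch of how such a hash could be checked in Python. The question does not say which encoding was hashed or whether a trailing newline was included, so UTF-8 without a trailing newline is an assumption here, and `sha256_hex` is a hypothetical helper name.

```python
import hashlib

def sha256_hex(text, encoding="utf-8"):
    # Hash the encoded bytes of the text; the encoding is an assumption.
    return hashlib.sha256(text.encode(encoding)).hexdigest()

# Paste the full expected output here to compare against the published hash.
candidate = "Chri\u017ftian Reader, I haue for thy v\u017fe"
print(sha256_hex(candidate))
```

If the digest of the full paragraph does not match, trying the text with a trailing newline is a reasonable next step.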

Disclaimer

Medievall orthographie was not that consistent. Please do not complain if you come across old books with a different orthographie.


1
"Using f instead of ſ is allowed in the output." So there is basically no incentive to use ſ, since it takes more bytes.
Fatalize

1
@Fatalize Fair point. Removed that.
Leaky Nun

@LeakyNun Then can we count ſ as 1 byte?
R. Kap

Actually, there is an incentive to use ſ rather than f: without it, some algorithms would turn ff into fs.
Destructible Lemon

1
Shouldn't Super-vise become Super-viſe?
R. Kap

Answers:


3

sed, 144 140 111 bytes

Saved 29 bytes thanks to NoOneIsHere

-r -e'y/j/i/g;y/J/I/g;s/ u/ v/g;s/ U/ V/g;s/^u/v/g;s/^U/V/g;s/([^s])s(\w)/\1ſ\2/g;s/(\w)v/\1u/g;s/(\w)V/\1U/g'

1
You brave, brave soul.
Alexander - Reinstate Monica

You can cut a lot of bytes by using only one -e and putting ; between the statements.
NoOneIsHere

I didn't know you could do that. Thanks!!
Riley

2

Python 3 (128 126 bytes)

import re;lambda k:re.sub("(?<!s)s(?=[a-zA-Z])",'ſ',re.sub("(?i)j|(?<![a-z])u|(?<=[a-z])v",lambda c:chr(ord(c.group())^3),k))

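chr(ord(c.group())^3) feels a bit excessive for XORing a single-character string, but maybe a real Pythonista can suggest a golf. Still, it is very convenient that ^3 suffices to interchange i <-> j and u <-> v.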
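The XOR trick works because each pair differs only in its two lowest bits: i is 0x69 and j is 0x6A, u is 0x75 and v is 0x76, and the same holds for the uppercase pairs. A small sketch (the helper name `flip` is made up for illustration):

```python
def flip(ch):
    # XOR with 0b11 toggles the two lowest bits, swapping ...01 and ...10.
    return chr(ord(ch) ^ 3)

for ch in "iIuU":
    print(ch, "<->", flip(ch), format(ord(ch), "08b"), format(ord(flip(ch)), "08b"))
```

Since XOR is its own inverse, the same function maps j back to i and v back to u, which is why one lambda can serve all three rules.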

Note: the only thing here that requires Python 3 is the Unicode character: Python 2 complains Non-ASCII character '\xc5' <snip> but no encoding declared
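The '\xc5' in that error message is the first byte of the UTF-8 encoding of ſ. A quick check, assuming the source file is saved as UTF-8:

```python
long_s = "\u017f"                 # LATIN SMALL LETTER LONG S
encoded = long_s.encode("utf-8")
print(encoded)                    # b'\xc5\xbf'
print(len(encoded))               # 2
```

So in a byte-counted submission stored as UTF-8, each ſ costs two bytes rather than one.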


You shouldn't use \b here, because the definition of a word that \b uses includes digits and underscores.
Leaky Nun
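The difference shows up on the test cases 0universe and _UU_. In the sketch below (function names are hypothetical), only the "u at the start of a word" rule is applied, once with \b and once with a letters-only lookbehind:

```python
import re

def swap(m):
    return chr(ord(m.group()) ^ 3)

def start_u_with_b(text):
    # \b treats digits and underscores as word characters.
    return re.sub(r"\b[uU]", swap, text)

def start_u_with_lookbehind(text):
    # Only ASCII letters count as part of a word, as the challenge specifies.
    return re.sub(r"(?<![a-zA-Z])[uU]", swap, text)

for sample in ["use", "0universe", "_UU_"]:
    print(sample, start_u_with_b(sample), start_u_with_lookbehind(sample))
```

With \b there is no boundary between 0 and u or between _ and U, so those inputs come back unchanged, while the lookbehind version converts them.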

@LeakyNun, hmm. While I look for a fix, could you please add some test cases?
Peter Taylor

@R.Kap. (?i)
Peter Taylor

@PeterTaylor Wait, what does that do?
R. Kap

@R.Kap, it makes the regex case-insensitive.
Peter Taylor
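A small sketch of what the inline (?i) flag buys in Python: with it, j, u, v and the [a-z] classes inside the lookarounds also match uppercase letters, so one alternation covers both cases. The variable names are made up for the example.

```python
import re

def swap(m):
    return chr(ord(m.group()) ^ 3)

sample = "Joy UNIVERSE universe"
without_flag = re.sub(r"j|(?<![a-z])u|(?<=[a-z])v", swap, sample)
with_flag = re.sub(r"(?i)j|(?<![a-z])u|(?<=[a-z])v", swap, sample)

print(without_flag)  # Joy UNIVERSE vniuerse
print(with_flag)     # Ioy VNIUERSE vniuerse
```

The flag has to appear at the very start of the pattern; recent Python versions reject a global inline flag placed anywhere else.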


1

Python 3.5, 124 116 111 118 125 144 142 bytes:

import re;lambda k:re.sub("J|j|(?<![a-zA-Z])[uU]|(?<=[a-zA-Z])[Vv]|(?<!s)s(?=[a-zA-Z])",lambda g:dict(zip('jJuUvVs','iIvVuUſ'))[g.group()],k)

Well, this seems like a perfect job for a regular expression!


1
You can use J|j instead of [Jj]
Leaky Nun

1

JavaScript (ES6), 154

Uses parseInt to identify alphabetic characters. Note: sloppy, but luckily parseInt('undefined',36)|0 is < 0

s=>[...s].map((c,i)=>((n=v(c))-19?n==31&p>9?'uU':n!=30|p>9?c=='s'&s[i-1]!=c&v(s[i+1])>9?'ſ':c+c:'vV':'iI')[p=n,c<'a'|0],p=0,v=c=>parseInt(c,36)|0).join``

Less golfed

s=>
  [...s].map(
  (c,i)=>
  ((n=v(c))-19
  ?n==31&p>9
    ?'uU'
    :n!=30|p>9
      ?c=='s'&s[i-1]!=c&v(s[i+1])>9
        ?'ſ'
        :c+c
      :'vV'
  :'iI')[p=n,c<'a'|0],
  p=0,
  v=c=>parseInt(c,36)|0
).join``
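The parseInt trick relies on base 36: digits map to 0..9 and letters (either case) to 10..35, so a value above 9 means "letter", and j, u, v come out as 19, 30 and 31, the constants used in the code above. A rough Python analogue for a single ASCII character (`base36` is a hypothetical helper; Python's `int` raises where JavaScript would yield NaN, so the exception is mapped to 0):

```python
def base36(ch):
    # Mimics parseInt(ch, 36) | 0 for one ASCII character.
    try:
        return int(ch, 36)
    except ValueError:
        return 0

def is_letter(ch):
    # Digits give 0..9 and everything else gives 0, so only letters exceed 9.
    return base36(ch) > 9

for ch in "jJuv9a -":
    print(repr(ch), base36(ch), is_letter(ch))
```

This sketch does not reproduce the out-of-range case s[i+1] at the end of the string, which the JavaScript answer handles through the 'undefined' note above.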

Test

F=
s=>[...s].map((c,i)=>((n=v(c))-19?n==31&p>9?'uU':n!=30|p>9?c=='s'&s[i-1]!=c&v(s[i+1])>9?'ſ':c+c:'vV':'iI')[p=n,c<'a'|0],p=0,v=c=>parseInt(c,36)|0).join``

out=(a,b,c)=>O.textContent+=a+'\n'+b+'\n'+c+'\n\n'

ti='Christian Reader, I have for thy use collected this small Concordance, with no small labour. For being to comprise much in little roome, I was to make choyse of the most principall and usefull places, and to rank them under such words as I thought most essentiall and materiall in the sentence, because the scant roome allotted unto me, would not permit that I should expresse them under every word in the verse, as it is the manner in large Concordances.'
to='Chriſtian Reader, I haue for thy vſe collected this ſmall Concordance, with no ſmall labour. For being to compriſe much in little roome, I was to make choyſe of the moſt principall and vſefull places, and to rank them vnder ſuch words as I thought moſt eſsentiall and materiall in the ſentence, becauſe the ſcant roome allotted vnto me, would not permit that I ſhould expreſse them vnder euery word in the verſe, as it is the manner in large Concordances.'
r=F(ti)
out(to==r?'OK':'KO',ti,r)

test=`Joy         Ioy
joy         ioy
Universe    Vniuerſe
universe    vniuerſe
Success     Succeſs
successfull ſucceſsfull
Supervise   Superuiſe
supervise   ſuperuiſe
Super-vise  Super-viſe
I've        I've
majors      maiors
UNIVERSE    VNIUERSE
0universe   0vniuerſe
0verify     0verify
I0ve        I0ve
_UU_          _VU_
_VV_          _VU_
ss_         ſs_`
.split('\n').map(t=>{
  var [i,o]=t.split(/\s+/),r=F(i)
  out(o==r?'OK':'KO',i,r)
})
#O {width:90%; overflow:auto; white-space: pre-wrap}
<pre id=O></pre>


1

JavaScript (ES6), 111 bytes

s=>s.replace(/[a-z]+/gi,w=>w.replace(/j|J|^u|^U|\Bv|\BV|ss|s(?!$)/g,c=>"iIvVuUſ"["jJuUvVs".search(c)]||"ſs"))

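Explanation: Since JavaScript regexp has no lookbehind, I break the string up into words, which then allows me to use ^ and \B as negative and positive letter lookbehinds. ss is handled by matching it separately; the slightly awkward replacement expression takes fewer bytes than either replacing only the first character of the two-character match c or adding an extra character to both strings and using the matched substring.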
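The same word-splitting idea can be sketched in Python for comparison. This is an illustrative port, not the author's code: the names `SINGLE`, `convert_word` and `medievalise_by_word` are made up, and a dictionary lookup with a default replaces the `||"ſs"` fallback, because Python's `str.find` returns -1 and indexing with -1 would silently pick the last character instead of failing.

```python
import re

# Single-character replacements; the two-character match "ss" is not a key.
SINGLE = dict(zip("jJuUvVs", "iIvVuU\u017f"))

def convert_word(match):
    word = match.group()
    # Inside a letters-only word, ^ means "start of word" and \B means
    # "preceded by a letter", so no lookbehind is needed.
    return re.sub(r"j|J|^u|^U|\Bv|\BV|ss|s(?!$)",
                  lambda m: SINGLE.get(m.group(), "\u017fs"),
                  word)

def medievalise_by_word(text):
    return re.sub(r"[A-Za-z]+", convert_word, text)

print(medievalise_by_word("successfull Super-vise _VV_ ss_"))
```

Matching ss as a unit is what keeps the second s from being converted, since the inner regex cannot look back at the character before it.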


c=>"iIvVuUſ"["jJuUvVs".search(c)]||"ſs" is nice 👍🏻
Jordan

0

CJam (89 88 bytes)

{32|_'`>\'{<*}:A;SqS++3ew{_1="jJuUvVs"#[-4_{_0=A!3*}_{_0=A3*}_{_)A\0='s=>268*}W]=~f^1=}%

Online demo

I've never understood why CJam doesn't have regular expressions, but since it doesn't, here's a solution that doesn't use them.


0

Ruby, 85 + 1 = 86 bytes

Run with ruby -p (+1 byte for the p flag). Takes input on stdin.

gsub(/j|(?<=^|[^a-z])u|(?<=[a-z])v|(?<=^|[^s])s(?=[a-z])/i){$&.tr"jJsUuVv","iIfVvUu"}

Run the tests on ideone (wrapped in a lambda because you can't give flags to ideone): http://ideone.com/AaZ8ya

Licensed under cc by-sa 3.0 with attribution required.