Lua: abbreviating words

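I'm looking for code that returns the abbreviation of its input, e.g. "Federal bureau of Investigation" should return FBI (preferably without the "o" from "of"), and it should also work for the lowercase "federal bureau of investigation". How can I do this? Thanks.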

User 1442917

This might work:

("Federal bureau of Investigation")
  :gsub("of","") -- remove "of"
  :gsub("(%w)%S+%s*","%1") -- keep the first character of each word
  :upper() -- convert to upper case

This returns "FBI".
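
Note that the chain above removes the letters "of" wherever they occur, including inside words such as "office", and it only catches the lowercase spelling. A minimal sketch of a variant that avoids both, assuming a Lua version that supports the `%f` frontier pattern (documented since Lua 5.2); the function name `abbreviate` is hypothetical and not part of the answer above:

```lua
-- Hypothetical helper: a whole-word, case-insensitive variant of the chain above.
local function abbreviate(phrase)
  return phrase
    :lower()                       -- normalise case so "Of" and "OF" are caught too
    :gsub("%f[%a]of%f[%A]", "")    -- drop "of" only when it stands alone as a word
    :gsub("(%a)[%a']*", "%1")      -- keep the first letter of each word
    :gsub("%A", "")                -- strip spaces, punctuation and digits
    :upper()                       -- convert to upper case
end

print(abbreviate("Federal bureau of Investigation"))  --> FBI
print(abbreviate("federal bureau of investigation"))  --> FBI
print(abbreviate("Office of Management"))             --> OM
```

Lowercasing first keeps the patterns simple; the final `:upper()` restores the conventional all-caps form.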

2017-10-31 04:48:07
User 805875

How about this?

do
   -- default list of words to drop or replace
   local stopwords = { }
   for w in ("a an and for of the to"):gmatch "%w+" do  stopwords[w] = ""  end
   -- abbreviate a phrase:
   function TLA( phrase, subst )
      subst = subst or stopwords
      -- first replace each word (including "'") by its abbreviation...
      -- (this leaves whitespace etc. in the string)
      phrase = phrase:gsub( "[%w']+", function( word )
         if not word:find "%U" then  return word  end -- OPTIONAL: keep all-caps abbreviations
         word = word:lower()
         if subst[word] then  return subst[word]  end -- from substitution list
         return word:sub( 1, 1 ):upper( )             -- others: reduce to first letter
      end )
      -- ...then remove all non-word characters
      return (phrase:gsub( "%W", "" ))
   end
end

It handles the simple cases:

TLA "Ministry Of Information"  --> "MI"
TLA "floating-point exception" --> "FPE"

It can handle some special cases:

TLA "augmented BNF" --> "ABNF"

Tweaking the substitution list, or putting non-empty strings into it, can also be useful:

TLA "one way or the other" --> "OWOO"
TLA( "one way or the other", {} ) --> "OWOTO"
TLA( "Ministry Of Information", { of = "of" } ) --> "MofI"

local custom_subst = {
   ["for"] = "4", to = "2", ["and"] = "", one = "1", two = "2", -- ...
}
TLA "Ministry for Fear, Uncertainty and Doubt" --> "MFUD"
TLA( "Ministry for Fear, Uncertainty and Doubt", custom_subst ) --> "M4FUD"
TLA( "Two-factor authentication", custom_subst ) --> "2FA"

And, as is customary:

TLA( "there ain't no such thing as a free lunch", {} ) --> "TANSTAAFL"

TLA( "There is more than one way to do it!", {} ) --> "TIMTOWTDI"

So you may well need to adjust many other aspects of the code beyond the substitution list.

2017-10-31 06:14:20
User 3735873

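Obviously, unless you use a dedicated dictionary, there is no perfect general solution, because many abbreviations do not follow consistent rules.

For example, consider BASIC = Beginner's All-Purpose Symbolic Instruction Code.

(A strict first-letter rule would give BAPSIC, not BASIC.)

So, with those limitations in mind, here is another acronym generator that seems to work for most "normal" cases.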

function acronym(s,ignore)
  ignore = ignore or
  {                                     -- default list of words to ignore
  ['a'] = true, ['an'] = true, ['and'] = true, ['in'] = true, ['for'] = true,
  ['of'] = true, ['the'] = true, ['to'] = true, ['or'] = true,
  }

  local ans = {}
  for w in s:gmatch '[%w\']+' do
    if not ignore[w:lower()] then ans[#ans+1] = w:sub(1,1):upper() end
  end
  return table.concat(ans)
end
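
As noted above, irregular acronyms such as BASIC cannot be derived by rule, so one option is to consult a small table of known exceptions before falling back to the first-letter rule. The following is only a sketch under that assumption: the names `exceptions`, `first_letters` and `acronym_with_exceptions` are hypothetical, and the fallback is a simplified first-letter rule without an ignore list.

```lua
-- Hypothetical lookup table: lowercased phrase -> preferred acronym.
local exceptions = {
  ["beginner's all-purpose symbolic instruction code"] = "BASIC",
}

-- Simplified fallback: first letter of every word, no ignore list.
local function first_letters(s)
  local ans = {}
  for w in s:gmatch("[%w']+") do
    ans[#ans + 1] = w:sub(1, 1):upper()
  end
  return table.concat(ans)
end

-- Check the exception table first, then fall back to the rule.
local function acronym_with_exceptions(s)
  return exceptions[s:lower()] or first_letters(s)
end

print(acronym_with_exceptions("Beginner's All-Purpose Symbolic Instruction Code")) --> BASIC
print(acronym_with_exceptions("Symbolic Instruction Code"))                       --> SIC
```

Keying the table on the lowercased phrase keeps the lookup case-insensitive, in line with the original question.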
2017-11-01 11:01:35