Using multiple patterns for string replacement in Lua

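I am trying to strip unwanted text from movie titles. string.gsub can replace a match with an empty string, but I have about 200 string patterns that all need to be replaced with an empty string.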

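At the moment I have to call string.gsub once for every pattern. I am wondering whether there is a way to put all the patterns into a single string.gsub line. I have searched the web for a solution but have not found anything yet.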

A movie title looks like this: B.A.Pass 2013 Hindi 720p DvDRip CROPPED AAC x264 RickyKT, and I want to get rid of the extra tokens such as 2013, Hindi, 720p, DvDRip, CROPPED, AAC, x264 and RickyKT.

user501250

Put all the patterns in a table, then iterate over the table and call string.gsub() for each pattern.

local str = "B.A.Pass 2013 Hindi 720p DvDRip CROPPED AAC x264 RickyKT"

-- Placeholder entries; each one is interpreted as a Lua pattern.
local patterns = {"pattern1", "pattern2", "pattern3"}
for _, pattern in ipairs(patterns) do
    str = string.gsub(str, pattern, "")
end

This still makes one string.gsub() call per pattern, but the code should be easier to maintain than writing out every string.gsub() call by hand.
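One thing to watch with this loop: every entry is interpreted as a Lua pattern, so plain strings that contain magic characters such as . or [ need escaping first. A minimal sketch, assuming the entries are meant as literal text; escape_pattern and strip_all are hypothetical helper names, not standard library functions:

```lua
-- Hypothetical helper: prefix every punctuation character with "%" so the
-- string is matched literally by Lua's pattern functions.
local function escape_pattern(s)
  return (s:gsub("%p", "%%%0"))
end

-- Hypothetical helper: remove every literal in the list, then tidy whitespace.
local function strip_all(str, literals)
  for _, literal in ipairs(literals) do
    str = str:gsub(escape_pattern(literal), "")
  end
  -- Collapse the runs of whitespace left behind and trim both ends.
  str = str:gsub("%s+", " "):gsub("^%s+", ""):gsub("%s+$", "")
  return str
end

local unwanted = {"2013", "Hindi", "720p", "DvDRip", "CROPPED", "AAC", "x264", "RickyKT"}
local cleaned = strip_all("B.A.Pass 2013 Hindi 720p DvDRip CROPPED AAC x264 RickyKT", unwanted)
print(cleaned)  --> B.A.Pass
```

Keep in mind that this removes a literal wherever it occurs, including inside a longer word.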

2014-08-13 07:28:59
user1009479

You can pass a table as the third argument to string.gsub, like this:

local movie = "B.A.Pass 2013 Hindi 720p DvDRip CROPPED AAC x264 RickyKT"
movie = movie:gsub("%S+", {["2013"] = "", ["Hindi"] = "", ["720p"] = "",
                       ["DvDRip"] = "", ["CROPPED"] = "", ["AAC"] = "",
                       ["x264"] = "", ["RickyKT"] = ""})

print(movie)
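The call above leaves behind the space in front of each removed token. A possible variation, assuming the tokens are always separated by whitespace, is to match the leading whitespace as well and capture only the token. When the pattern has a capture, gsub uses the first capture as the table key, and when the lookup yields nil the whole match is kept unchanged. The names unwanted and cleaned are only illustrative:

```lua
local unwanted = {
  ["2013"] = "", ["Hindi"] = "", ["720p"] = "", ["DvDRip"] = "",
  ["CROPPED"] = "", ["AAC"] = "", ["x264"] = "", ["RickyKT"] = "",
}

local movie = "B.A.Pass 2013 Hindi 720p DvDRip CROPPED AAC x264 RickyKT"
-- "%s*" swallows the separator; "(%S+)" is the key looked up in the table.
-- Tokens missing from the table keep their original text, separator included.
local cleaned = movie:gsub("%s*(%S+)", unwanted)

print(cleaned)  --> B.A.Pass
```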
2014-08-13 07:31:03
user8273880

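To avoid writing a key and a value in the table for every new entry, I would write a function that works on a numerically indexed table (with the patterns as values).

That way I do not need to write {["pattern_n"] = ""} for each new pattern.

For example: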

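PATTERNS = {"2013","Hindi","720p","DvDRip","CROPPED","AAC","x264","RickyKT"}
function replace(match)
    for _, v in ipairs(PATTERNS) do
        -- Compare literally: v:find(match) would treat the token as a Lua
        -- pattern and would also accept substrings such as "A" in "AAC".
        if v == match then
            return ""
        end
    end
    -- Returning nil makes gsub keep the original token.
    return nil
end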

local movie = "B.A.Pass 2013 Hindi 720p DvDRip CROPPED AAC x264 RickyKT"
movie = movie:gsub("%S+", replace)

print(movie)
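If the list grows long (the question mentions about 200 entries), scanning it for every token can be avoided by converting it once into a lookup table. A rough sketch, assuming tokens should be compared case-insensitively, which is an assumption of this example and not something the question asks for; build_lookup and strip_tokens are hypothetical names:

```lua
local PATTERNS = {"2013", "Hindi", "720p", "DvDRip", "CROPPED", "AAC", "x264", "RickyKT"}

-- Turn the list into a table keyed by the lowercased token.
-- The value must be a string because gsub uses it as the replacement.
local function build_lookup(list)
  local lookup = {}
  for _, word in ipairs(list) do
    lookup[word:lower()] = ""
  end
  return lookup
end

local lookup = build_lookup(PATTERNS)

local function strip_tokens(title)
  local cleaned = title:gsub("%S+", function(token)
    return lookup[token:lower()]  -- nil keeps the token unchanged
  end)
  -- Drop the trailing spaces left where tokens were removed.
  return (cleaned:gsub("%s+$", ""))
end

print(strip_tokens("B.A.Pass 2013 hindi 720P dvdrip CROPPED AAC x264 RickyKT"))  --> B.A.Pass
```

Each token is now checked with a single table lookup instead of a loop over the whole list.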
2017-12-04 02:41:54
user9155965

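You can do this with a simple function, so that you do not have to write code for every string, or you can just call string.gsub directly with a single pattern.

As a function: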

local large_name = "B.A.Pass 2013 Hindi 720p DvDRip CROPPED AAC x264 RickyKT"

function clean_name(str)
  local v = string.gsub(str, "(.-)%s([%(%[']?%d%d%d?%d?[%)%]]?)%s*(.*)", "%1")
  return v
end

print(clean_name(large_name))

Using string.gsub directly:

local large_name = "B.A.Pass 2013 Hindi 720p DvDRip CROPPED AAC x264 RickyKT"
local clean_name = string.gsub(large_name, "(.-)%s([%(%[']?%d%d%d?%d?[%)%]]?)%s*(.*)", "%1")

print(clean_name)

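The pattern takes everything before the first whitespace-separated year as the first capture (the movie title) and keeps only that. The year is recognised as the second capture, which guards against cutting the title in the wrong place. As a result there is no need to list every token that might follow the movie name, and many false matches are avoided.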
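To make the role of the captures more visible, here is a small variation that uses string.match to return the title and the year separately. It is only a sketch: split_title is a hypothetical name, and the pattern is adapted from the one above by anchoring it at the start and capturing just the digits.

```lua
-- Hypothetical helper: split a release name into title and year.
-- Returns the unchanged name and nil when no year-like token is found.
local function split_title(name)
  -- (.-)            shortest possible title
  -- %s              whitespace before the year
  -- [%(%[']?        optional opening "(", "[" or "'"
  -- (%d%d%d?%d?)    two to four digits
  -- [%)%]]?         optional closing ")" or "]"
  local title, year = name:match("^(.-)%s[%(%[']?(%d%d%d?%d?)[%)%]]?")
  if not title then
    return name, nil
  end
  return title, year
end

print(split_title("Anon (2018) [WEBRip] [1080p] [YTS.AM]"))  --> Anon    2018
print(split_title("The Shawshank Redemption '94"))           --> The Shawshank Redemption    94
```

Because the pattern stops at the first space-separated run of two to four digits, a name whose title itself contains such a number (for example a made-up "Blade Runner 2049 2017") would be cut at that number.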

I added a test loop to try the pattern against different movie names:

local testing = {"Whiplash 2014 [1080p]",
"Anon (2018) [WEBRip] [1080p] [YTS.AM]",
"Maze Runner The Death Cure 2018 [WEBRip] [1080p] [YTS.AM]",
"12 Strong [2018] [WEBRip] [1080p] [YTS.AM]",
"Kingsman The Secret Service (2014) [1080p]",
"The Equalizer [2014] [1080p]",
"Annihilation 2018 [WEBRip] [1080p] [YTS.AM]",
"The Shawshank Redemption '94",
"Assassin's Creed 2016 HC 720p HDRip 850 MB - iExTV",
"Captain Marvel (2019) [WEBRip] [1080p] [YTS.AM]",}

for _, v in ipairs(testing) do  -- ipairs keeps the list order
  local result = string.gsub(v, "(.-)%s([%(%[']?%d%d%d?%d?[%)%]]?)%s*(.*)", "%1")
  print(result)
end

Output:

Whiplash
Anon
Maze Runner The Death Cure
12 Strong
Kingsman The Secret Service
The Equalizer
Annihilation
The Shawshank Redemption
Assassin's Creed
Captain Marvel

2019-06-03 14:45:57