


      RegExp: Remove the last period in a string that can contain other periods (dig output)

      Date: 2023-08-30



                Question

                I am trying to parse the output of the Linux dig command and do several things in one shot with regular expressions.

                Say I dig the host mail.yahoo.com:

                /usr/bin/dig +nocomments +noquestion \
                    +noauthority +noadditional +nostats +nocmd \
                    mail.yahoo.com A
                

                This command outputs:

                mail.yahoo.com.                   0  IN  CNAME  login.yahoo.com.
                login.yahoo.com.                  0  IN  CNAME  ats.login.lgg1.b.yahoo.com.
                ats.login.lgg1.b.yahoo.com.       0  IN  CNAME  ats.member.g02.yahoodns.net.
                ats.member.g02.yahoodns.net.      0  IN  CNAME  any-ats.member.a02.yahoodns.net.
                any-ats.member.a02.yahoodns.net. 12  IN  A      98.139.21.169
                

                What I'd like to do is find all the <host>, <record_type> and <resolved_name> parts, without the final period, using only one regular expression.

                For this particular example with mail.yahoo.com, the result would be:

                [
                    ('mail.yahoo.com', 'CNAME', 'login.yahoo.com'),
                    ('login.yahoo.com', 'CNAME', 'ats.login.lgg1.b.yahoo.com'),
                    ('ats.login.lgg1.b.yahoo.com', 'CNAME', 'ats.member.g02.yahoodns.net'),
                    ('ats.member.g02.yahoodns.net', 'CNAME', 'any-ats.member.a02.yahoodns.net'),
                    ('any-ats.member.a02.yahoodns.net', 'A', '98.139.21.169'),
                ]
                

                But it turns out that the dig command may print a period at the end of each name:

                    mail.yahoo.com. 
                        ^     ^   ^
                        |     |   |
                  Good dot    |   |
                              |   |
                        Good dot  |
                                  |
                           (!) Baaaad dot
                

                Writing a regular expression that splits dig's output and returns the names with the final period is fairly straightforward:

                regex = re.compile(r"^(\S+).+IN\s+([A-Z]+)\s+(\S+).*\s*$", re.MULTILINE)
                

                But calling .findall with that regex returns the final period as well, because \S+ also matches the last period:

                [
                    ('mail.yahoo.com.', 'CNAME', 'login.yahoo.com.'),
                    ('login.yahoo.com.', 'CNAME', 'ats.login.lgg1.b.yahoo.com.'),
                    ('ats.login.lgg1.b.yahoo.com.', 'CNAME', 'ats.member.g02.yahoodns.net.'),
                    ('ats.member.g02.yahoodns.net.', 'CNAME', 'any-ats.member.a02.yahoodns.net.'),
                    ('any-ats.member.a02.yahoodns.net.', 'A', '98.139.21.169'),
                ]
                

                So I'd need something that matches all non-whitespace characters (\S) except a period that is followed by whitespace.

                I've made endless attempts and haven't been able to come up with a decent solution.

                Thank you in advance!

                PS: I know I can always use the "easy" regular expression and (in a second pass) remove the last dot from each string found, but I'm curious whether this can be done with a regular expression in one shot.
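For comparison, the two-pass fallback mentioned in the PS could be sketched in Python roughly as follows. This is only an illustration: `SAMPLE` is an abbreviated copy of the dig output above, and `parse_two_pass` is a hypothetical helper name that does not appear in the original post.

```python
import re

# Abbreviated sample of the dig output shown in the question.
SAMPLE = (
    "mail.yahoo.com.                   0  IN  CNAME  login.yahoo.com.\n"
    "any-ats.member.a02.yahoodns.net. 12  IN  A      98.139.21.169\n"
)

# The "easy" regex from the question: it keeps the trailing period.
SIMPLE = re.compile(r"^(\S+).+IN\s+([A-Z]+)\s+(\S+).*\s*$", re.MULTILINE)


def parse_two_pass(text):
    """First pass: capture the fields. Second pass: strip trailing periods."""
    return [
        (host.rstrip("."), record_type, target.rstrip("."))
        for host, record_type, target in SIMPLE.findall(text)
    ]


for row in parse_two_pass(SAMPLE):
    print(row)
```

Note that `rstrip(".")` removes every trailing period, not just one, which is harmless for names shaped like the ones above.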

                Recommended answer

                You can use this pattern with the multiline modifier:

                ^([^ ]+)(?<!\.)\.?[ ]+[0-9]+[ ]+IN[ ]+([^ ]+)[ ]+(.+(?<!\.))\.?$
                

                The groups are stored in $1, $2 and $3.
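Since the question uses Python, here is a minimal sketch of how the space-only pattern might be applied with `re.findall`, which returns the three groups as a tuple per line. The names `DIG_OUTPUT` and `PATTERN` are made up for this example, and the sample text is the question's output with its leading indentation removed (the pattern is anchored with `^`, so indented lines would not match).

```python
import re

# dig output from the question, one record per line, no leading indentation.
DIG_OUTPUT = """\
mail.yahoo.com.                   0  IN  CNAME  login.yahoo.com.
login.yahoo.com.                  0  IN  CNAME  ats.login.lgg1.b.yahoo.com.
ats.login.lgg1.b.yahoo.com.       0  IN  CNAME  ats.member.g02.yahoodns.net.
ats.member.g02.yahoodns.net.      0  IN  CNAME  any-ats.member.a02.yahoodns.net.
any-ats.member.a02.yahoodns.net. 12  IN  A      98.139.21.169
"""

# (?<!\.) forces each greedy capture to back off so that it does not end in a
# period; the optional \.? that follows then consumes the trailing period.
PATTERN = re.compile(
    r"^([^ ]+)(?<!\.)\.?[ ]+[0-9]+[ ]+IN[ ]+([^ ]+)[ ]+(.+(?<!\.))\.?$",
    re.MULTILINE,
)

records = PATTERN.findall(DIG_OUTPUT)
for host, record_type, resolved_name in records:
    print(host, record_type, resolved_name)
```

The negative lookbehind is what makes this work in one pass: the capture group is greedy, but it is not allowed to stop right after a period, so the final dot always falls outside the group.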

                If the fields may be separated by tabs as well as spaces, try this:

                ^([^ \t]+)(?<!\.)\.?[ \t]+[0-9]+[ \t]+IN[ \t]+([^ \t]+)[ \t]+(.+(?<!\.))\.?$
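A quick way to check the tab-aware variant is to run it on tab-separated input. The sketch below assumes a made-up `TAB_SAMPLE` string in which the same kind of records are separated by tabs instead of spaces; `TAB_AWARE` is likewise just a name chosen for this example.

```python
import re

# Hypothetical tab-separated version of two records from the question.
TAB_SAMPLE = (
    "mail.yahoo.com.\t\t0\tIN\tCNAME\tlogin.yahoo.com.\n"
    "any-ats.member.a02.yahoodns.net.\t12\tIN\tA\t98.139.21.169\n"
)

# Same structure as the space-only pattern, with every separator class
# widened to accept a space or a tab.
TAB_AWARE = re.compile(
    r"^([^ \t]+)(?<!\.)\.?[ \t]+[0-9]+[ \t]+IN[ \t]+([^ \t]+)[ \t]+(.+(?<!\.))\.?$",
    re.MULTILINE,
)

tab_records = TAB_AWARE.findall(TAB_SAMPLE)
for record in tab_records:
    print(record)
```

Because the last group uses `.+`, any trailing whitespace on a line would end up inside the resolved name, so input with trailing spaces or tabs may need stripping first.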
                

