Notepad++: extract all matching strings with a regex & remove duplicates

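I really like Notepad++: it is simple yet powerful.

Today I wanted to use it to extract, by regex, every table name from a set of SQL statements. The SQL is shown below, and I am sharing the steps in case they are useful to others.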

getEntityData.sql=select s.analysis_group_id,s.period_id,s.period_value_id,s.fiscal_year_nbr,s.period_start_dt,s.period_end_dt,o.CURRENCY_ID from dbo.statement s,dbo.organization o where s.statement_id= @statementId and o.ORGANIZATION_ID=s.ANALYSIS_GROUP_ID union select s1.analysis_group_id,s1.period_id,s1.period_value_id,s1.fiscal_year_nbr,s1.period_start_dt,s1.period_end_dt,o1.CURRENCY_ID from dbo.statement s1,dbo.organization o1 where s1.statement_id=@statementId and o1.ORGANIZATION_ID=s1.ANALYSIS_GROUP_ID

getGLwithVequation.sql=select distinct h.account_id ,e.expression_txt,a.account_ds,e.business_txt,e1.expression_txt from dbo.FINANCIAL_ACCOUNT_HIERARCHY h,dbo.FINANCIAL_ACCOUNT a,dbo.FINANCIAL_ACCOUNT_CONDITION c, dbo.EXPRESSION e,dbo.EXPRESSION_SCOPE es,dbo.EXPRESSION e1 where a.active_ind='Y' and h.account_relationship_type_id=1 and h.account_id=a.account_id and c.account_id=h.account_id and c.evaluation_expression_id=e.expression_id and e.expression_scope_id=es.expression_scope_id and es.expression_cd='VALIDATION' and c.validation_expression_id=e1.expression_id and e.expression_id < 20000

Approach: put each matched table name on its own line, wrapped in a distinctive marker before and after it. Then use the Mark feature to remove the unmarked lines, and finally remove the duplicate lines with the TextFX plugin.
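For comparison, the same end result can be sketched outside the editor. The Python below is a minimal sketch and not the method used in this post: it assumes every table is written as `dbo.<name>`, as in the SQL above, and the names `TABLE_PATTERN` and `extract_tables` are made up for illustration.

```python
import re

# Assumption: tables always appear with the "dbo." schema prefix.
TABLE_PATTERN = re.compile(r"\bdbo\.(\w+)", re.IGNORECASE)


def extract_tables(sql):
    """Return table names in order of first appearance, without duplicates."""
    seen = set()
    tables = []
    for match in TABLE_PATTERN.finditer(sql):
        name = match.group(1)
        key = name.lower()  # treat names that differ only in case as the same table
        if key not in seen:
            seen.add(key)
            tables.append(name)
    return tables


# Shortened version of the first statement, for illustration only.
SQL = (
    "select s.analysis_group_id, o.CURRENCY_ID "
    "from dbo.statement s, dbo.organization o "
    "where o.ORGANIZATION_ID = s.ANALYSIS_GROUP_ID "
    "union select s1.analysis_group_id, o1.CURRENCY_ID "
    "from dbo.statement s1, dbo.organization o1"
)

print(extract_tables(SQL))
```

Column references such as `s.statement_id` are not picked up, because the pattern only matches identifiers that directly follow `dbo.`.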

1. Use a regex Replace to put each matched string on its own line

[Screenshot: Replace dialog]

2. Mark the lines that contain the marker. Be sure to check Bookmark line.

[Screenshot: Mark dialog]

3. Search->Bookmark->Remove Unmarked Lines to delete the unmarked lines
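Steps 1 to 3 can be imitated in a few lines of Python, which may help to see why the marker is useful. This is only a sketch under assumptions: the marker `@@` and the pattern `dbo\.\w+` are hypothetical stand-ins, since the patterns actually used in the post appeared only in the screenshots.

```python
import re

MARK = "@@"  # hypothetical marker; any string that never occurs in the SQL works


def isolate_matches(text, pattern):
    """Step 1: move every match onto its own line, wrapped in the marker."""
    return re.sub(pattern, lambda m: "\n" + MARK + m.group(0) + MARK + "\n", text)


def keep_marked_lines(text):
    """Steps 2 and 3: keep only the lines containing the marker, then strip it."""
    return [line.replace(MARK, "") for line in text.splitlines() if MARK in line]


sample = "from dbo.statement s,dbo.organization o where s.statement_id=1"
marked = isolate_matches(sample, r"dbo\.\w+")
print(keep_marked_lines(marked))
```

The leftover fragments of the SQL (`from `, ` s,`, and so on) end up on lines without the marker, which is exactly what Remove Unmarked Lines discards.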


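4. Replace matches of the following regex with an empty string (in Regular expression mode with ". matches newline" checked, because the lookahead has to scan the lines that follow):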

^(.*?)$\s+?^(?=.*^\1$)
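To see what this regex does, it can be tried in Python's `re` module. This is a sketch under the assumption that Python's engine behaves like Notepad++'s for this particular pattern; `re.MULTILINE` makes `^` and `$` match at line boundaries and `re.DOTALL` stands in for the ". matches newline" option. The function name `remove_duplicate_lines` is made up for illustration.

```python
import re

# The regex from step 4: match a line (plus the whitespace after it)
# only if an identical line appears again somewhere later in the text.
DUPLICATE_LINE = re.compile(r"^(.*?)$\s+?^(?=.*^\1$)", re.MULTILINE | re.DOTALL)


def remove_duplicate_lines(text):
    """Delete every line that is repeated later, so the last copy survives."""
    return DUPLICATE_LINE.sub("", text)


lines = "statement\norganization\nstatement\nEXPRESSION\norganization\n"
print(remove_duplicate_lines(lines))
```

Because the lookahead only looks forward, the earlier copies are removed and the last occurrence of each line is kept, so the surviving lines may appear in a different order than their first appearance. On large files the repeated forward scans can be slow.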

Alternatively, install the TextFX plugin and use TextFX->TextFX Tools->Sort lines case insensitive. Be sure to check Sort output only UNIQUE lines.

At this point, all of the regex-matched data has been extracted.
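The effect of a case-insensitive sort that outputs only unique lines can be approximated as follows. This is a rough sketch, not a description of TextFX internals: it assumes that lines differing only in letter case count as duplicates and that the first spelling seen is the one kept, and `sort_unique_case_insensitive` is a hypothetical helper name.

```python
def sort_unique_case_insensitive(lines):
    """Sort lines ignoring case and drop case-insensitive duplicates."""
    unique = {}
    for line in lines:
        # Assumption: keep the first spelling seen for each lowercased key.
        unique.setdefault(line.lower(), line)
    return sorted(unique.values(), key=str.lower)


tables = ["statement", "EXPRESSION", "organization", "Statement", "EXPRESSION"]
print(sort_unique_case_insensitive(tables))
```

Unlike the regex in step 4, this approach reorders the lines, which is usually fine for a list of table names.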
