
Matching Sentences That Contain a Keyword in English Text with Python Regular Expressions

1. Problem / Requirement

Find the sentences that contain a given keyword in a multi-line English paragraph or article.
For example, match the sentences containing 'blog' in the following string:
text = '''Today I registered my personal blog in the cnblogs and wrote my first essay.
The main problem of this essay is to use python regular expression matching to filter out
sentences containing keywords in the paper. To solve this problem, I made many attempts
and finally found a regular expression matching method that could meet the requirements
through testing. So I've documented the problem and the solution in this blog post and
shared it for reference to others who are having the same problem. At the same time,
this text is also used to test the feasibility of this matching approach. Some additional
related thoughts and ideas will be added to this blog later.'''
2. Solution

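Since we want every sentence that contains the keyword, we use the findall() method from the re library. The string being searched contains newline characters '\n', so re.DOTALL is passed to re.compile() so that '.' can match any character, including '\n'. (In this particular pattern every dot sits inside a character class, where it is literal, and the negated class [^.] already matches '\n', so the flag is harmless but does not change the result.) The Pattern object is created as follows: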
import re

newre = re.compile('[A-Z][^.]*blog[^.]*[.]', re.DOTALL)
newre.findall(text)  # run the match
# Result:
# ['Today I registered my personal blog in the cnblogs and wrote my first essay.',
#  "So I've documented the problem and the solution in this blog post and \nshared it for reference to others who are having the same problem.",
#  'Some additional \nrelated thoughts and ideas will be added to this blog later.']
# The '\n' here is the newline character: it is not visible when the string is printed, but it shows up in the match results.
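
As a quick check on what re.DOTALL contributes to this pattern, the sketch below compiles the same pattern with and without the flag and compares the results. The two-line sample string is made up for illustration and is not part of the original post. Since '.' only appears inside character classes here, both versions should return the same list.

```python
import re

# Hypothetical shortened sample: the sentence containing 'blog' spans two lines.
sample = "First line has nothing. This blog post\ncontinues here. Done."

pattern = '[A-Z][^.]*blog[^.]*[.]'

# Same pattern, compiled once with the flag and once without it.
with_flag = re.compile(pattern, re.DOTALL).findall(sample)
without_flag = re.compile(pattern).findall(sample)

print(with_flag)                  # the sentence that spans the line break
print(with_flag == without_flag)  # whether the flag changed anything
```

The flag would matter if the pattern were written with a bare '.', for example '[A-Z].*?blog.*?[.]'.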
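
The same idea can be wrapped in a small helper that works for any keyword. This is only a sketch, not part of the original post: the function name find_sentences, the whole_word parameter and the demo string are all hypothetical. It escapes the keyword with re.escape(), can optionally require word boundaries so that 'blog' does not match inside 'cnblogs', and collapses line breaks in each result.

```python
import re

def find_sentences(text, keyword, whole_word=False):
    """Return sentences (a capital letter through the next '.') containing keyword."""
    kw = re.escape(keyword)          # treat the keyword as literal text
    if whole_word:
        kw = r'\b' + kw + r'\b'      # do not match inside longer words such as 'cnblogs'
    pattern = re.compile(r'[A-Z][^.]*' + kw + r'[^.]*[.]')
    # Collapse each line break (and surrounding whitespace) into a single space.
    return [re.sub(r'\s*\n\s*', ' ', s) for s in pattern.findall(text)]

# Hypothetical demo string, not from the original post.
demo = "I read cnblogs daily. My blog is\nnew. Nothing here."
print(find_sentences(demo, 'blog'))
print(find_sentences(demo, 'blog', whole_word=True))
```

Like the original pattern, this assumes every sentence starts with a capital letter and ends with '.'. Sentences ending in '?' or '!', or containing abbreviations such as 'e.g.', would need a more elaborate pattern.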
 

Source: https://www.cnblogs.com/blogLYP/p/17080272.html
