Regular expression for a class with whitespace using BeautifulSoup
I found that BeautifulSoup.find() splits the class attribute on whitespace, so I can't use a regular expression the way the code below tries to. Could somebody help me find the right way to select all the 'tree children' elements?
    import re
    from bs4 import BeautifulSoup

    r_html = "<div class='root'>" \
             "<div class='tree children1'>text children 1 </div>" \
             "<div class='tree children2'>text children 2 </div>" \
             "<div class='tree children3'>text children 3 </div>" \
             "</div>"

    bs_tab = BeautifulSoup(r_html, "html.parser")

    workspace_box_visible = bs_tab.findAll('div', {'class': 'tree children1'})
    print(workspace_box_visible)
    # result: [<div class="tree children1">text children 1 </div>]

    workspace_box_visible = bs_tab.findAll('div', {'class': re.compile(r'^tree children\d')})
    print(workspace_box_visible)
    # result: []  (empty, because the class value was split on whitespace)

    # print all element classes
    def print_class(class_):
        print(class_)
        return False

    workspace_box_visible = bs_tab.find('div', {'class': print_class})
    # expected:
    #   root
    #   tree children1
    #   tree children2
    #   tree children3
    # actual:
    #   root
    #   tree
    #   children1
    #   tree
    #   children2
    #   tree
    #   children3
Thanks in advance,
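One possible way to run the regex against the whole attribute value is to pass a function that receives the tag itself, then join the class list back into a single string before matching. This is only a sketch: the helper name `class_matches` is made up for illustration, and it assumes `bs4` is installed and the parser returns `class` as a list (the default behaviour for `html.parser`).

```python
import re
from bs4 import BeautifulSoup

r_html = (
    "<div class='root'>"
    "<div class='tree children1'>text children 1 </div>"
    "<div class='tree children2'>text children 2 </div>"
    "<div class='tree children3'>text children 3 </div>"
    "</div>"
)
soup = BeautifulSoup(r_html, "html.parser")

# Pattern is applied to the re-joined class string, e.g. "tree children1".
pattern = re.compile(r"^tree children\d$")


def class_matches(tag):
    """Hypothetical helper: match the regex against the full class string."""
    # tag.get("class", []) is a list such as ["tree", "children1"];
    # joining with a space restores the (whitespace-normalized) attribute value.
    joined = " ".join(tag.get("class", []))
    return tag.name == "div" and bool(pattern.search(joined))


matches = soup.find_all(class_matches)
for tag in matches:
    print(tag)
```

Because the function is handed the whole tag rather than one class token at a time, the splitting described in the question no longer gets in the way; the trade-off is that the tag-name check has to live inside the function too.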
==== comments ==========
Stack Overflow doesn't allow comments longer than 500 characters, so I'm adding mine here:
The example above shows how BeautifulSoup looks up the requested classes.
But suppose I have a DOM structure like this:
    r_html = "<div class='root'>" \
             "<div class='tree children'>zero</div>" \
             "<div class='tree children first'>first</div>" \
             "<div class='tree children second'>second</div>" \
             "<div class='tree children third'>third</div>" \
             "</div>"
and I need to select the controls whose class attribute is exactly 'tree children' or 'tree children first'. None of the methods described in your (Padraic Cunningham's) post work for that.
I found a solution using a regex:
    controls = bs_tab.findAll('div')
    for control in controls:
        # Each alternative is anchored at both ends; without the trailing $,
        # 'tree children second' and 'tree children third' would match as well.
        if re.search(r"^tree children$|^tree children first$",
                     " ".join(control.attrs.get('class', []))):
            print(control)
And another solution:
    bs_tab.findAll('div', class_='tree children') + bs_tab.findAll('div', class_='tree children first')
I know it's not a good solution, and I hope the BeautifulSoup module has an appropriate method for this.
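For the second case, one option that avoids both the regex and the list concatenation is to compare the re-joined class string against a set of accepted values. This is a sketch only: `wanted` and `has_wanted_class` are illustrative names, not part of BeautifulSoup, and it assumes the default list-valued `class` attribute.

```python
from bs4 import BeautifulSoup

r_html = (
    "<div class='root'>"
    "<div class='tree children'>zero</div>"
    "<div class='tree children first'>first</div>"
    "<div class='tree children second'>second</div>"
    "<div class='tree children third'>third</div>"
    "</div>"
)
soup = BeautifulSoup(r_html, "html.parser")

# Full class strings to accept (assumed from the question's requirement).
wanted = {"tree children", "tree children first"}


def has_wanted_class(tag):
    """Hypothetical helper: exact match on the whole class string."""
    joined = " ".join(tag.get("class", []))
    return tag.name == "div" and joined in wanted


controls = soup.find_all(has_wanted_class)
texts = [tag.get_text() for tag in controls]
print(texts)
```

Results come back in document order from a single pass, whereas concatenating two `findAll` calls groups them by query.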
Original: https://stackoverflow.com/questions/38824121
Best answer
You're best off having four images and masking them with a div using the overflow:hidden property.
    // Your markup
    <div id="imgMask" style="overflow:hidden; height:200px; width:200px;">
      <div id="inner" style="position:relative; left:0;">
        <img src="images/environments/img0.jpg" />
        <img src="images/environments/img1.jpg" />
        <img src="images/environments/img2.jpg" />
        <img src="images/environments/img3.jpg" />
      </div>
    </div>

    // Your js
    function slideLeft(){
      $('#inner').animate({
        left: '-200px'   // quoted: a bare -200px is not valid JavaScript
      }, 2000, function(){
        // move the first image to the end, then reset the offset
        $('#inner img').eq(0).remove().appendTo('#inner');
        $('#inner').css('left', 0);
      });
    }
This way you are only sliding one parent element instead of multiple images. Hope it helps. The above code is untested and assumes your images are 200px high and 200px wide; the styles are, of course, better off in your stylesheet than inline like this.