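Filling an 18-Year-Old Gap: My Summary of CSS Selectors
Series articles
- Web Crawlers, Lesson One: Deconstructing Basic Crawler Concepts Through a Case Study
- Filling an 18-Year-Old Gap: My Summary of CSS Selectors
- Choosing How to Persist Crawler Data
- Crawling Static Blog Pages to Analyze This Site's Topology
- Performance Testing and Bottleneck Analysis for Python Programs
- A Guide to Disciplined Development for Python Engineering Projects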
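CSS Selectors
I should have filled this gap 18 years ago.
Eighteen years ago I was an impatient kid, the kind who was ready to give up the moment things got hard. CSS selectors were one of the things I gave up on. In the years since, this gap has come back to torment me from time to time and has cost me many opportunities. Having learned that lesson the hard way, today I am going to close it for good with the help of two reference documents.
First, here is an overview of the CSS selectors summarized in this article. It really is simple~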
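mindmap
  root((CSS Selectors))
    Basic selectors
      element
      .class
      #id
      *
    Multi-element combination selectors
      element,element
      element element
      element>element
      element+element
      element1~element2
      .class1.class2
      .class1 .class2
    Attribute selectors
      element.class
      attribute
      attribute=value
      attribute~=value
      attribute|=value
      attribute^=value
      attribute$=value
      attribute*=value
    Pseudo-class selectors
      Tree-structural pseudo-classes. 12
      Location pseudo-classes. 7
      Functional pseudo-classes. 4
      Input pseudo-classes. 15
      User action pseudo-classes. 5
      Time-dimensional pseudo-classes. ?
      Element display state pseudo-classes. ?
      Linguistic pseudo-classes. ?
      Resource state pseudo-classes. ?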
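Basic selectors
| Selector | Example | Meaning |
|---|---|---|
| `element` | `p` | Type selector: selects HTML elements by tag name, e.g. all `<p>` elements. |
| `.class` | `.intro` | Class selector: selects every element whose `class` attribute contains the given class name, e.g. all elements with `class="intro"`. |
| `#id` | `#firstname` | ID selector: selects the element with the given unique identifier, e.g. the element with `id="firstname"`. |
| `*` | `*` | Universal selector: selects all elements. |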
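To make the four basic selectors concrete for crawler work, here is a minimal sketch that uses only Python's standard-library `html.parser`. The helper names (`ElementCollector`, `matches_basic`, `select_basic`) and the sample markup are illustrative assumptions, not part of any library; a real crawler would normally rely on a full selector engine instead.

```python
from html.parser import HTMLParser


class ElementCollector(HTMLParser):
    """Collect (tag, attributes) pairs for every start tag in a document."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        self.elements.append((tag, dict(attrs)))


def matches_basic(selector, tag, attrs):
    """Return True if one element matches a single basic selector."""
    if selector == "*":                      # universal selector
        return True
    if selector.startswith("#"):             # ID selector
        return attrs.get("id") == selector[1:]
    if selector.startswith("."):             # class selector
        # The class attribute is a whitespace-separated list of names.
        return selector[1:] in (attrs.get("class") or "").split()
    return tag == selector                   # type selector


def select_basic(html, selector):
    """Return the tag names of all elements matching the selector."""
    collector = ElementCollector()
    collector.feed(html)
    return [tag for tag, attrs in collector.elements
            if matches_basic(selector, tag, attrs)]


# Hypothetical sample markup for illustration only.
SAMPLE = '<div id="firstname"><p class="intro lead">Hi</p><p>Bye</p></div>'
```

With this sample, `select_basic(SAMPLE, ".intro")` returns `['p']`: only the first paragraph carries the `intro` class, even though its `class` attribute lists two names.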
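Multi-element combination selectors
| Selector | Example | Meaning |
|---|---|---|
| `element,element` | `div, p` | Selects all `<div>` elements and all `<p>` elements. |
| `element element` | `div p` | Selects all `<p>` elements that are descendants of a `<div>` element, at any depth. |
| `element>element` | `div > p` | Selects all `<p>` elements whose parent is a `<div>` element. |
| `element+element` | `div + p` | Selects each `<p>` element that immediately follows a `<div>` element under the same parent. |
| `element1~element2` | `p ~ ul` | Selects every `<ul>` element that is preceded by a `<p>` sibling under the same parent. |
| `.class1.class2` | `.name1.name2` | Selects all elements whose `class` attribute contains both `name1` and `name2`. |
| `.class1 .class2` | `.name1 .name2` | Selects all elements with class `name2` that are descendants of an element with class `name1`. |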
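The four combinators in the table differ only in which relatives of an element they inspect. The following is a rough sketch of that logic on top of the standard-library `xml.etree.ElementTree`; the function names, the tiny document, and the restriction to plain tag names on both sides are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document used only for this illustration.
DOC = ET.fromstring(
    "<body><ul id='u0'/>"
    "<div><p id='a'/><span><p id='b'/></span></div>"
    "<p id='c'/><ul id='u1'/><p id='d'/><ul id='u2'/>"
    "</body>"
)


def parent_map(root):
    """ElementTree has no parent pointers, so build a child -> parent map."""
    return {child: parent for parent in root.iter() for child in parent}


def ancestors(el, parents):
    """Yield the parent, grandparent, ... of an element."""
    while el in parents:
        el = parents[el]
        yield el


def preceding_siblings(el, parents):
    """Return the siblings that come before el, in document order."""
    parent = parents.get(el)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[:siblings.index(el)]


def select(root, left, combinator, right):
    """Return ids of <right> elements related to a <left> element."""
    parents = parent_map(root)
    result = []
    for el in root.iter(right):
        before = preceding_siblings(el, parents)
        if combinator == " ":      # descendant: any ancestor is <left>
            ok = any(a.tag == left for a in ancestors(el, parents))
        elif combinator == ">":    # child: the direct parent is <left>
            ok = el in parents and parents[el].tag == left
        elif combinator == "+":    # adjacent: previous sibling is <left>
            ok = bool(before) and before[-1].tag == left
        elif combinator == "~":    # general: some earlier sibling is <left>
            ok = any(s.tag == left for s in before)
        else:
            raise ValueError(f"unknown combinator: {combinator!r}")
        if ok:
            result.append(el.get("id"))
    return result
```

For example, `select(DOC, "div", " ", "p")` returns `['a', 'b']`, while `select(DOC, "div", ">", "p")` returns only `['a']`, because `b` sits inside a `<span>` rather than directly inside the `<div>`.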
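Attribute selectors
| Selector | Example | Meaning |
|---|---|---|
| `element.class` | `p.intro` | Selects all `<p>` elements whose `class` attribute contains `intro`. |
| `[attribute]` | `[target]` | Selects all elements that have a `target` attribute. |
| `[attribute=value]` | `[target=_blank]` | Selects all elements with `target="_blank"`. |
| `[attribute~=value]` | `[title~=flower]` | Selects all elements whose `title` attribute contains the whitespace-separated word `flower`. |
| `[attribute\|=value]` | `[lang\|=en]` | Selects all elements whose `lang` attribute value is exactly `en` or starts with `en-`. |
| `[attribute^=value]` | `a[href^="https"]` | Selects every `<a>` element whose `href` attribute value starts with `https`. |
| `[attribute$=value]` | `a[href$=".pdf"]` | Selects every `<a>` element whose `href` attribute value ends with `.pdf`. |
| `[attribute*=value]` | `a[href*="abc"]` | Selects every `<a>` element whose `href` attribute value contains the substring `abc`. |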
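The attribute operators are easy to confuse, especially `~=` (whole word) versus `*=` (substring) and `|=` (exact value or hyphen-prefixed) versus `^=` (any prefix). As a sketch of their matching rules, the hypothetical helper `attr_matches` below takes an attribute value (or `None` when the attribute is absent), an operator, and the expected string; it illustrates the semantics and is not an API of any library.

```python
def attr_matches(value, op, expected=""):
    """Sketch of CSS attribute-selector matching for a single attribute.

    value    -- the attribute's value, or None if the attribute is absent
    op       -- "", "=", "~=", "|=", "^=", "$=" or "*="
    expected -- the value written in the selector
    """
    if value is None:
        return False                       # attribute not present at all
    if op == "":
        return True                        # [attr]: presence is enough
    if op == "=":
        return value == expected           # exact match
    if op == "~=":
        return expected in value.split()   # one whitespace-separated word
    if op == "|=":
        # exact value, or the value followed by a hyphen (e.g. "en-US")
        return value == expected or value.startswith(expected + "-")
    # The substring operators never match an empty expected string.
    if op == "^=":
        return bool(expected) and value.startswith(expected)
    if op == "$=":
        return bool(expected) and value.endswith(expected)
    if op == "*=":
        return bool(expected) and expected in value
    raise ValueError(f"unknown operator: {op!r}")
```

For instance, `attr_matches("sunflower", "~=", "flower")` is `False` but `attr_matches("sunflower", "*=", "flower")` is `True`, and `attr_matches("english", "|=", "en")` is `False` while `attr_matches("en-US", "|=", "en")` is `True`.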
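Pseudo-class selectors
A CSS pseudo-class is a keyword added to a selector that specifies a special state of the selected element. For example, the pseudo-class `:hover` can be used to select a button while the user's pointer hovers over it, so that the button can be styled in that state.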
/* Any button over which the user's pointer is hovering */
button:hover { color: blue; }
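A pseudo-class consists of a colon (`:`) followed by the pseudo-class name (for example, `:hover`). A functional pseudo-class also takes a pair of parentheses that hold its arguments (for example, `:dir()`). The element that a pseudo-class is attached to is called the *anchor element* (for example, `button` in `button:hover`).
Pseudo-classes let you apply styles to an element based not only on the content of the document tree but also on external factors, such as the navigation history (for example, `:visited`), the state of the element's content (such as `:checked` on certain form elements), or the position of the mouse (such as `:hover`, which tells you whether the mouse is over an element).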
Tree-structural pseudo-classes (12)
These pseudo-classes relate to the location of an element within the document tree.
- `:root`: Represents an element that is the root of the document. In HTML this is usually the `<html>` element.
- `:empty`: Represents an element with no children other than white-space characters.
- `:nth-child`: Uses `An+B` notation to select elements from a list of sibling elements.
- `:nth-last-child`: Uses `An+B` notation to select elements from a list of sibling elements, counting backwards from the end of the list.
- `:first-child`: Matches an element that is the first of its siblings.
- `:last-child`: Matches an element that is the last of its siblings.
- `:only-child`: Matches an element that has no siblings. For example, a list item with no other list items in that list.
- `:nth-of-type`: Uses `An+B` notation to select elements that match a certain type from a list of sibling elements.
- `:nth-last-of-type`: Uses `An+B` notation to select elements that match a certain type from a list of sibling elements, counting backwards from the end of the list.
- `:first-of-type`: Matches an element that is the first of its siblings, and also matches a certain type selector.
- `:last-of-type`: Matches an element that is the last of its siblings, and also matches a certain type selector.
- `:only-of-type`: Matches an element that has no siblings of the chosen type selector.
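Several of the pseudo-classes above rely on `An+B` notation. An element at 1-based position `i` matches when `i = A*n + B` for some integer `n >= 0`. The sketch below checks that condition directly; the function names are illustrative assumptions, and parsing of the textual form (such as `2n+1`, `odd`, or `even`) is deliberately left out.

```python
def matches_an_plus_b(a, b, index):
    """Return True if index == a*n + b for some integer n >= 0.

    index is the 1-based position of the element among its siblings.
    """
    if a == 0:
        return index == b                 # plain :nth-child(B)
    n, remainder = divmod(index - b, a)
    return remainder == 0 and n >= 0


def matching_positions(total, a, b):
    """List the 1-based positions among `total` siblings that match."""
    return [i for i in range(1, total + 1) if matches_an_plus_b(a, b, i)]
```

With eight siblings, `matching_positions(8, 2, 1)` gives `[1, 3, 5, 7]` (the same positions as `odd`), and `matching_positions(8, -1, 3)` gives `[1, 2, 3]`, which is how `:nth-child(-n+3)` selects the first three children. For the `-last-` variants, the same check applies to positions counted from the end of the list.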
Location pseudo-classes (7)
These pseudo-classes relate to links, and to targeted elements within the current document.
- `:any-link`: Matches an element if the element would match either `:link` or `:visited`.
- `:link`: Matches links that have not yet been visited.
- `:visited`: Matches links that have been visited.
- `:local-link`: Matches links whose absolute URL is the same as the target URL. For example, anchor links to the same page.
- `:target`: Matches the element which is the target of the document URL.
- `:target-within`: Matches elements which are the target of the document URL, but also elements which have a descendant which is the target of the document URL.
- `:scope`: Represents elements that are a reference point for selectors to match against.
Functional pseudo-classes (4)
These pseudo-classes accept a selector list or forgiving selector list as a parameter.
- `:is()`: The matches-any pseudo-class matches any element that matches any of the selectors in the list provided. The list is forgiving.
- `:not()`: The negation, or matches-none, pseudo-class represents any element that is not represented by its argument.
- `:where()`: The specificity-adjustment pseudo-class matches any element that matches any of the selectors in the list provided without adding any specificity weight. The list is forgiving.
- `:has()`: The relational pseudo-class represents an element if any of the relative selectors match when anchored against the attached element.
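One way to read the four functional pseudo-classes is as logical operators over simpler selectors: `:is()` and `:where()` mean "any of", `:not()` means "none of", and `:has()` asks whether some related element matches. The sketch below models selectors as Python predicates over `xml.etree.ElementTree` elements. All names and the sample document are hypothetical, `:has()` is limited to the descendant relation, and specificity (the only difference between `:is()` and `:where()`) is not modeled.

```python
import xml.etree.ElementTree as ET


def tag(name):
    """Type selector as a predicate."""
    return lambda el: el.tag == name


def has_class(name):
    """Class selector as a predicate."""
    return lambda el: name in el.get("class", "").split()


def all_of(*selectors):
    """Compound selector, e.g. p.note: every part must match."""
    return lambda el: all(sel(el) for sel in selectors)


def is_(*selectors):
    """:is(a, b, ...): matches when any argument matches."""
    return lambda el: any(sel(el) for sel in selectors)


# :where() matches exactly like :is(); only its specificity differs.
where = is_


def not_(*selectors):
    """:not(a, b, ...): matches when no argument matches."""
    return lambda el: not any(sel(el) for sel in selectors)


def has(selector):
    """:has(sel), restricted here to descendants of the anchor element."""
    return lambda el: any(selector(d) for d in el.iter() if d is not el)


def select(root, selector):
    """Return the ids of all matching elements in document order."""
    return [el.get("id") for el in root.iter() if selector(el)]


# Hypothetical document used only for this illustration.
DOC = ET.fromstring(
    "<main id='m'>"
    "<section id='s1'><h2 id='t1' class='title'/><p id='p1'/></section>"
    "<section id='s2'><p id='p2' class='note'/></section>"
    "</main>"
)
```

Here `select(DOC, all_of(tag("section"), has(tag("h2"))))` plays the role of `section:has(h2)` and returns `['s1']`, while `select(DOC, all_of(tag("p"), not_(has_class("note"))))` corresponds to `p:not(.note)` and returns `['p1']`.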
Input pseudo-classes (15)
These pseudo-classes relate to form elements, and enable selecting elements based on HTML attributes and the state that the field is in before and after interaction.
- `:enabled`: Represents a user interface element that is in an enabled state.
- `:disabled`: Represents a user interface element that is in a disabled state.
- `:read-only`: Represents any element that cannot be changed by the user.
- `:read-write`: Represents any element that is user-editable.
- `:placeholder-shown`: Matches an input element that is displaying placeholder text. For example, it will match the `placeholder` attribute in the `<input>` and `<textarea>` elements.
- `:default`: Matches one or more UI elements that are the default among a set of elements.
- `:checked`: Matches when elements such as checkboxes and radio buttons are toggled on.
- `:indeterminate`: Matches UI elements when they are in an indeterminate state.
- `:blank`: Matches a user-input element which is empty, containing an empty string or other null input.
- `:valid`: Matches an element with valid contents. For example, an input element with the type 'email' that contains a validly formed email address or an empty value if the control is not required.
- `:invalid`: Matches an element with invalid contents. For example, an input element with type 'email' with a name entered.
- `:in-range`: Applies to elements with range limitations. For example, a slider control when the selected value is in the allowed range.
- `:out-of-range`: Applies to elements with range limitations. For example, a slider control when the selected value is outside the allowed range.
- `:required`: Matches when a form element is required.
- `:optional`: Matches when a form element is optional.
User action pseudo-classes (5)
- `:hover`: Matches when a user designates an item with a pointing device, such as holding the mouse pointer over the item.
- `:active`: Matches when an item is being activated by the user. For example, when the item is clicked on.
- `:focus`: Matches when an element has focus.
- `:focus-visible`: Matches when an element has focus and the user agent identifies that the element should be visibly focused.
- `:focus-within`: Matches an element to which `:focus` applies, plus any element that has a descendant to which `:focus` applies.
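Time-dimensional pseudo-classes
See [6].
Element display state pseudo-classes
See [6].
Linguistic pseudo-classes
See [6].
Resource state pseudo-classes
See [6].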
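(To be filled in when needed; writing crawlers does not require these for now. The pseudo-class content follows the format of reference [6], with the w3school content treated as the authority [7].)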
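Conclusion
I never expected CSS selectors to be this simple, yet I let this Achilles' heel torment me for more than a decade.
Whenever I think of how this failing has tormented me, not a day goes by that I don't regret it: not just this one thing, but many similar episodes. Looking back, I want to talk to the little fool who made those mistakes. To tell him how I feel now. To tell him there were other ways to solve the problem. But I no longer can. That boy disappeared long ago, and all that is left is my aging body. So I have to accept the facts.
But turn over a new leaf? What a nonsensical phrase. What difference would it make if I changed? That being so, I shouldn't care at all.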
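References and notes
- 6. The concept and kinds of pseudo-classes. English version: Pseudo-classes
- 7. The w3school CSS selector reference, treated here as the authoritative reference.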







