lxml.cssselect module

CSS Selectors based on XPath.

This module supports selecting XML/HTML elements with CSS selectors. See the CSSSelector class for details.

This is a thin wrapper around cssselect 0.7 or later.

class lxml.cssselect.CSSSelector(css, namespaces=None, translator='xml')

Bases: XPath

A CSS selector.

Usage:

>>> from lxml import etree, cssselect
>>> select = cssselect.CSSSelector("a tag > child")

>>> root = etree.XML("<a><b><c/><tag><child>TEXT</child></tag></b></a>")
>>> [ el.tag for el in select(root) ]
['child']

To use CSS namespaces, pass a prefix-to-namespace mapping as the namespaces keyword argument:

>>> rdfns = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
>>> select_ns = cssselect.CSSSelector('root > rdf|Description',
...                                   namespaces={'rdf': rdfns})

>>> rdf = etree.XML((
...     '<root xmlns:rdf="%s">'
...       '<rdf:Description>blah</rdf:Description>'
...     '</root>') % rdfns)
>>> [(el.tag, el.text) for el in select_ns(rdf)]
[('{http://www.w3.org/1999/02/22-rdf-syntax-ns#}Description', 'blah')]

evaluate(self, _eval_arg, **_variables)

Evaluate an XPath expression.

Instead of calling this method, you can also call the evaluator object itself.

Variables may be provided as keyword arguments. Note that namespaces are currently not supported for variables.

Deprecated:

Call the object itself instead of calling this method.

error_log
path

The literal XPath expression.

class lxml.cssselect.LxmlHTMLTranslator(xhtml: bool = False)

Bases: LxmlTranslator, HTMLTranslator

lxml extensions + HTML support.

xpathexpr_cls

alias of XPathExpr

css_to_xpath(css: str, prefix: str = 'descendant-or-self::') str

Translate a group of selectors to XPath.

Pseudo-elements are not supported here since XPath only knows about “real” elements.

Parameters:
  • css – A group of selectors as a string.

  • prefix – This string is prepended to the XPath expression for each selector. The default makes selectors scoped to the context node’s subtree.

Raises:

SelectorSyntaxError on invalid selectors, ExpressionError on unknown/unsupported selectors, including pseudo-elements.

Returns:

The equivalent XPath 1.0 expression as a string.

pseudo_never_matches(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

selector_to_xpath(selector: Selector, prefix: str = 'descendant-or-self::', translate_pseudo_elements: bool = False) str

Translate a parsed selector to XPath.

Parameters:
  • selector – A parsed Selector object.

  • prefix – This string is prepended to the resulting XPath expression. The default makes selectors scoped to the context node’s subtree.

  • translate_pseudo_elements – Unless this is set to True (as css_to_xpath() does), the pseudo_element attribute of the selector is ignored. It is the caller’s responsibility to reject selectors with pseudo-elements, or to account for them somehow.

Raises:

ExpressionError on unknown/unsupported selectors.

Returns:

The equivalent XPath 1.0 expression as a string.

xpath(parsed_selector: Union[Element, Hash, Class, Function, Pseudo, Attrib, Negation, Relation, Matching, SpecificityAdjustment, CombinedSelector]) XPathExpr

Translate any parsed selector object.

xpath_active_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

xpath_attrib(selector: Attrib) XPathExpr

Translate an attribute selector.

xpath_attrib_dashmatch(xpath: XPathExpr, name: str, value: Optional[str]) XPathExpr
xpath_attrib_different(xpath: XPathExpr, name: str, value: Optional[str]) XPathExpr
xpath_attrib_equals(xpath: XPathExpr, name: str, value: Optional[str]) XPathExpr
xpath_attrib_exists(xpath: XPathExpr, name: str, value: Optional[str]) XPathExpr
xpath_attrib_includes(xpath: XPathExpr, name: str, value: Optional[str]) XPathExpr
xpath_attrib_prefixmatch(xpath: XPathExpr, name: str, value: Optional[str]) XPathExpr
xpath_attrib_substringmatch(xpath: XPathExpr, name: str, value: Optional[str]) XPathExpr
xpath_attrib_suffixmatch(xpath: XPathExpr, name: str, value: Optional[str]) XPathExpr
xpath_checked_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

xpath_child_combinator(left: XPathExpr, right: XPathExpr) XPathExpr

right is an immediate child of left

xpath_class(class_selector: Class) XPathExpr

Translate a class selector.

xpath_combinedselector(combined: CombinedSelector) XPathExpr

Translate a combined selector.

xpath_contains_function(xpath, function)
xpath_descendant_combinator(left: XPathExpr, right: XPathExpr) XPathExpr

right is a child, grand-child or further descendant of left

xpath_direct_adjacent_combinator(left: XPathExpr, right: XPathExpr) XPathExpr

right is a sibling immediately after left

xpath_disabled_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

xpath_element(selector: Element) XPathExpr

Translate a type or universal selector.

xpath_empty_pseudo(xpath: XPathExpr) XPathExpr
xpath_enabled_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

xpath_first_child_pseudo(xpath: XPathExpr) XPathExpr
xpath_first_of_type_pseudo(xpath: XPathExpr) XPathExpr
xpath_focus_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

xpath_function(function: Function) XPathExpr

Translate a functional pseudo-class.

xpath_hash(id_selector: Hash) XPathExpr

Translate an ID selector.

xpath_hover_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

xpath_indirect_adjacent_combinator(left: XPathExpr, right: XPathExpr) XPathExpr

right is a sibling after left, immediately or not

xpath_lang_function(xpath: XPathExpr, function: Function) XPathExpr
xpath_last_child_pseudo(xpath: XPathExpr) XPathExpr
xpath_last_of_type_pseudo(xpath: XPathExpr) XPathExpr
xpath_link_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

static xpath_literal(s: str) str
xpath_matching(matching: Matching) XPathExpr
xpath_negation(negation: Negation) XPathExpr
xpath_nth_child_function(xpath: XPathExpr, function: Function, last: bool = False, add_name_test: bool = True) XPathExpr
xpath_nth_last_child_function(xpath: XPathExpr, function: Function) XPathExpr
xpath_nth_last_of_type_function(xpath: XPathExpr, function: Function) XPathExpr
xpath_nth_of_type_function(xpath: XPathExpr, function: Function) XPathExpr
xpath_only_child_pseudo(xpath: XPathExpr) XPathExpr
xpath_only_of_type_pseudo(xpath: XPathExpr) XPathExpr
xpath_pseudo(pseudo: Pseudo) XPathExpr

Translate a pseudo-class.

xpath_pseudo_element(xpath: XPathExpr, pseudo_element: Union[FunctionalPseudoElement, str]) XPathExpr

Translate a pseudo-element.

Defaults to not supporting pseudo-elements at all, but can be overridden by sub-classes.

xpath_relation(relation: Relation) XPathExpr
xpath_relation_child_combinator(left: XPathExpr, right: XPathExpr) XPathExpr

right is an immediate child of left; select left

xpath_relation_descendant_combinator(left: XPathExpr, right: XPathExpr) XPathExpr

right is a child, grand-child or further descendant of left; select left

xpath_relation_direct_adjacent_combinator(left: XPathExpr, right: XPathExpr) XPathExpr

right is a sibling immediately after left; select left

xpath_relation_indirect_adjacent_combinator(left: XPathExpr, right: XPathExpr) XPathExpr

right is a sibling after left, immediately or not; select left

xpath_root_pseudo(xpath: XPathExpr) XPathExpr
xpath_scope_pseudo(xpath: XPathExpr) XPathExpr
xpath_specificityadjustment(matching: SpecificityAdjustment) XPathExpr
xpath_target_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

xpath_visited_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

attribute_operator_mapping = {'!=': 'different', '$=': 'suffixmatch', '*=': 'substringmatch', '=': 'equals', '^=': 'prefixmatch', 'exists': 'exists', '|=': 'dashmatch', '~=': 'includes'}
combinator_mapping = {' ': 'descendant', '+': 'direct_adjacent', '>': 'child', '~': 'indirect_adjacent'}
id_attribute = 'id'

The attribute used for ID selectors depends on the document language: http://www.w3.org/TR/selectors/#id-selectors

lang_attribute = 'lang'

The attribute used for :lang() depends on the document language: http://www.w3.org/TR/selectors/#lang-pseudo

lower_case_attribute_names = False
lower_case_attribute_values = False
lower_case_element_names = False

The case sensitivity of document language element names, attribute names, and attribute values in selectors depends on the document language. http://www.w3.org/TR/selectors/#casesens

When a document language defines one of these as case-insensitive, cssselect assumes that the document parser makes the parsed values lower-case. Making the selector lower-case too makes the comparison case-insensitive.

In HTML, element names and attribute names (but not attribute values) are case-insensitive. All of lxml.html, html5lib, BeautifulSoup4 and HTMLParser make them lower-case in their parse result, so the assumption holds.

class lxml.cssselect.LxmlTranslator

Bases: GenericTranslator

A custom CSS selector to XPath translator with lxml-specific extensions.

xpathexpr_cls

alias of XPathExpr

css_to_xpath(css: str, prefix: str = 'descendant-or-self::') str

Translate a group of selectors to XPath.

Pseudo-elements are not supported here since XPath only knows about “real” elements.

Parameters:
  • css – A group of selectors as a string.

  • prefix – This string is prepended to the XPath expression for each selector. The default makes selectors scoped to the context node’s subtree.

Raises:

SelectorSyntaxError on invalid selectors, ExpressionError on unknown/unsupported selectors, including pseudo-elements.

Returns:

The equivalent XPath 1.0 expression as a string.

pseudo_never_matches(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

selector_to_xpath(selector: Selector, prefix: str = 'descendant-or-self::', translate_pseudo_elements: bool = False) str

Translate a parsed selector to XPath.

Parameters:
  • selector – A parsed Selector object.

  • prefix – This string is prepended to the resulting XPath expression. The default makes selectors scoped to the context node’s subtree.

  • translate_pseudo_elements – Unless this is set to True (as css_to_xpath() does), the pseudo_element attribute of the selector is ignored. It is the caller’s responsibility to reject selectors with pseudo-elements, or to account for them somehow.

Raises:

ExpressionError on unknown/unsupported selectors.

Returns:

The equivalent XPath 1.0 expression as a string.

xpath(parsed_selector: Union[Element, Hash, Class, Function, Pseudo, Attrib, Negation, Relation, Matching, SpecificityAdjustment, CombinedSelector]) XPathExpr

Translate any parsed selector object.

xpath_active_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

xpath_attrib(selector: Attrib) XPathExpr

Translate an attribute selector.

xpath_attrib_dashmatch(xpath: XPathExpr, name: str, value: Optional[str]) XPathExpr
xpath_attrib_different(xpath: XPathExpr, name: str, value: Optional[str]) XPathExpr
xpath_attrib_equals(xpath: XPathExpr, name: str, value: Optional[str]) XPathExpr
xpath_attrib_exists(xpath: XPathExpr, name: str, value: Optional[str]) XPathExpr
xpath_attrib_includes(xpath: XPathExpr, name: str, value: Optional[str]) XPathExpr
xpath_attrib_prefixmatch(xpath: XPathExpr, name: str, value: Optional[str]) XPathExpr
xpath_attrib_substringmatch(xpath: XPathExpr, name: str, value: Optional[str]) XPathExpr
xpath_attrib_suffixmatch(xpath: XPathExpr, name: str, value: Optional[str]) XPathExpr
xpath_checked_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

xpath_child_combinator(left: XPathExpr, right: XPathExpr) XPathExpr

right is an immediate child of left

xpath_class(class_selector: Class) XPathExpr

Translate a class selector.

xpath_combinedselector(combined: CombinedSelector) XPathExpr

Translate a combined selector.

xpath_contains_function(xpath, function)
xpath_descendant_combinator(left: XPathExpr, right: XPathExpr) XPathExpr

right is a child, grand-child or further descendant of left

xpath_direct_adjacent_combinator(left: XPathExpr, right: XPathExpr) XPathExpr

right is a sibling immediately after left

xpath_disabled_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

xpath_element(selector: Element) XPathExpr

Translate a type or universal selector.

xpath_empty_pseudo(xpath: XPathExpr) XPathExpr
xpath_enabled_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

xpath_first_child_pseudo(xpath: XPathExpr) XPathExpr
xpath_first_of_type_pseudo(xpath: XPathExpr) XPathExpr
xpath_focus_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

xpath_function(function: Function) XPathExpr

Translate a functional pseudo-class.

xpath_hash(id_selector: Hash) XPathExpr

Translate an ID selector.

xpath_hover_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

xpath_indirect_adjacent_combinator(left: XPathExpr, right: XPathExpr) XPathExpr

right is a sibling after left, immediately or not

xpath_lang_function(xpath: XPathExpr, function: Function) XPathExpr
xpath_last_child_pseudo(xpath: XPathExpr) XPathExpr
xpath_last_of_type_pseudo(xpath: XPathExpr) XPathExpr
xpath_link_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

static xpath_literal(s: str) str
xpath_matching(matching: Matching) XPathExpr
xpath_negation(negation: Negation) XPathExpr
xpath_nth_child_function(xpath: XPathExpr, function: Function, last: bool = False, add_name_test: bool = True) XPathExpr
xpath_nth_last_child_function(xpath: XPathExpr, function: Function) XPathExpr
xpath_nth_last_of_type_function(xpath: XPathExpr, function: Function) XPathExpr
xpath_nth_of_type_function(xpath: XPathExpr, function: Function) XPathExpr
xpath_only_child_pseudo(xpath: XPathExpr) XPathExpr
xpath_only_of_type_pseudo(xpath: XPathExpr) XPathExpr
xpath_pseudo(pseudo: Pseudo) XPathExpr

Translate a pseudo-class.

xpath_pseudo_element(xpath: XPathExpr, pseudo_element: Union[FunctionalPseudoElement, str]) XPathExpr

Translate a pseudo-element.

Defaults to not supporting pseudo-elements at all, but can be overridden by sub-classes.

xpath_relation(relation: Relation) XPathExpr
xpath_relation_child_combinator(left: XPathExpr, right: XPathExpr) XPathExpr

right is an immediate child of left; select left

xpath_relation_descendant_combinator(left: XPathExpr, right: XPathExpr) XPathExpr

right is a child, grand-child or further descendant of left; select left

xpath_relation_direct_adjacent_combinator(left: XPathExpr, right: XPathExpr) XPathExpr

right is a sibling immediately after left; select left

xpath_relation_indirect_adjacent_combinator(left: XPathExpr, right: XPathExpr) XPathExpr

right is a sibling after left, immediately or not; select left

xpath_root_pseudo(xpath: XPathExpr) XPathExpr
xpath_scope_pseudo(xpath: XPathExpr) XPathExpr
xpath_specificityadjustment(matching: SpecificityAdjustment) XPathExpr
xpath_target_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

xpath_visited_pseudo(xpath: XPathExpr) XPathExpr

Common implementation for pseudo-classes that never match.

attribute_operator_mapping = {'!=': 'different', '$=': 'suffixmatch', '*=': 'substringmatch', '=': 'equals', '^=': 'prefixmatch', 'exists': 'exists', '|=': 'dashmatch', '~=': 'includes'}
combinator_mapping = {' ': 'descendant', '+': 'direct_adjacent', '>': 'child', '~': 'indirect_adjacent'}
id_attribute = 'id'

The attribute used for ID selectors depends on the document language: http://www.w3.org/TR/selectors/#id-selectors

lang_attribute = 'xml:lang'

The attribute used for :lang() depends on the document language: http://www.w3.org/TR/selectors/#lang-pseudo

lower_case_attribute_names = False
lower_case_attribute_values = False
lower_case_element_names = False

The case sensitivity of document language element names, attribute names, and attribute values in selectors depends on the document language. http://www.w3.org/TR/selectors/#casesens

When a document language defines one of these as case-insensitive, cssselect assumes that the document parser makes the parsed values lower-case. Making the selector lower-case too makes the comparison case-insensitive.

In HTML, element names and attribute names (but not attribute values) are case-insensitive. All of lxml.html, html5lib, BeautifulSoup4 and HTMLParser make them lower-case in their parse result, so the assumption holds.

lxml.cssselect._make_lower_case(context, s)