Models overview
All AST node types inherit from BCQLNode.
Each node carries a node_type literal discriminator, making the full tree
serializable to/from JSON.
Public AST model exports for bcql_py.models.
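The role of the node_type discriminator in JSON round-tripping can be illustrated with plain dataclasses. This is a minimal sketch under stated assumptions: the classes Word and Seq, the REGISTRY mapping, and from_dict are hypothetical stand-ins, not the bcql_py models or their API.

```python
import json
from dataclasses import asdict, dataclass, fields
from typing import Any


# Hypothetical stand-ins for AST nodes; these are NOT the bcql_py classes.
@dataclass(frozen=True)
class Word:
    value: str
    node_type: str = "word"


@dataclass(frozen=True)
class Seq:
    children: tuple
    node_type: str = "seq"


# Registry mapping each discriminator string to the class it identifies.
REGISTRY = {"word": Word, "seq": Seq}


def from_dict(data: dict[str, Any]):
    """Rebuild a node by dispatching on its node_type discriminator."""
    cls = REGISTRY[data["node_type"]]
    kwargs = {}
    for f in fields(cls):
        value = data[f.name]
        if isinstance(value, (list, tuple)):
            # Child collections may hold nested nodes serialized as dicts.
            value = tuple(from_dict(v) if isinstance(v, dict) else v for v in value)
        elif isinstance(value, dict):
            value = from_dict(value)
        kwargs[f.name] = value
    return cls(**kwargs)


tree = Seq(children=(Word("the"), Word("cat")))
payload = json.dumps(asdict(tree))          # every node carries its node_type
restored = from_dict(json.loads(payload))   # dispatch on node_type to rebuild
```

Because each serialized object names its own type, the loader never has to guess which class a nested dictionary belongs to.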
BCQLNodeUnion
module-attribute
BCQLNodeUnion = Union[
StringValue,
AnnotationConstraint,
IntegerRangeConstraint,
FunctionConstraint,
NotConstraint,
BoolConstraint,
TokenQuery,
SequenceNode,
RepetitionNode,
GroupNode,
SequenceBoolNode,
NegationNode,
UnderscoreNode,
LookaheadNode,
LookbehindNode,
SpanQuery,
PositionFilterNode,
CaptureNode,
AnnotationRef,
ConstraintLiteral,
ConstraintInteger,
ConstraintBoolLiteral,
ConstraintIntegerRange,
ConstraintComparison,
ConstraintBoolean,
ConstraintNot,
ConstraintFunctionCall,
GlobalConstraintNode,
RelationOperator,
ChildConstraint,
RelationNode,
RootRelationNode,
AlignmentOperator,
AlignmentConstraint,
AlignmentNode,
FunctionCallNode,
]
Annotated union of every concrete BCQL AST node, discriminated by node_type.
Use this anywhere a field can hold any BCQL node (sub-queries, sequence children, relation targets, etc.). For fields restricted to a smaller subset of node types, prefer the narrower unions defined alongside their owners (e.g. ConstraintExpr for token-level constraints, CaptureConstraintExpr for capture constraints). Narrower unions give better validation errors and make the schema honest about which nodes are actually legal in that position.
AlignmentConstraint
Bases: BCQLNode
One alignment constraint: operator target.
Multiple alignment constraints are separated by ;.
Attributes:
| Name | Type | Description |
|---|---|---|
| operator | AlignmentOperator | The AlignmentOperator. |
| target | BCQLNodeUnion | The target sub-query. |
to_bcql
Return this alignment constraint in BCQL syntax.
View source on GitHub: src/bcql_py/models/alignment.py lines 69–71
AlignmentNode
Bases: BCQLNode
A parallel alignment query: source ==>field target [; ==>field target]*.
Attributes:
- source: The source query in the primary field.
- alignments: One or more alignment constraints.
to_bcql
Return this alignment query in BCQL syntax.
View source on GitHub: src/bcql_py/models/alignment.py lines 87–93
AlignmentOperator
Bases: BCQLNode
The operator in an alignment query: =type=>field or ==>field?.
See https://github.com/instituutnederlandsetaal/BlackLab/blob/dev/site/docs/guide/040_query-language/030_parallel.md
Attributes:
| Name | Type | Description |
|---|---|---|
| target_field | str | The target field name (e.g. |
| optional | bool | |
| relation_type | str \| None | Optional type filter (e.g. |
| capture_name | str \| None | Override for the capture group name (default |
to_bcql
Return this alignment operator in BCQL syntax.
View source on GitHub: src/bcql_py/models/alignment.py lines 47–52
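The two operator shapes given for AlignmentOperator, =type=>field and ==>field?, can be sketched as a small formatting helper. The function name, its parameters, and the field name nl used below are illustrative assumptions; the sketch leaves out capture_name and does not claim to reproduce the library's to_bcql.

```python
from typing import Optional


def alignment_operator(
    target_field: str,
    relation_type: Optional[str] = None,
    optional: bool = False,
) -> str:
    """Render an alignment operator (hypothetical helper, not bcql_py API)."""
    type_part = relation_type or ""      # empty type gives the plain ==> form
    suffix = "?" if optional else ""     # trailing ? marks an optional alignment
    return f"={type_part}=>{target_field}{suffix}"


plain = alignment_operator("nl")
typed = alignment_operator("nl", relation_type="word")
maybe = alignment_operator("nl", optional=True)
```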
BCQLNode
Bases: BaseModel, ABC
Abstract base for every node in the BCQL abstract syntax tree.
Subclasses must override to_bcql and set node_type to a unique Literal string so that discrimination works correctly.
Configuration
- frozen = True: instances are immutable after creation
- use_enum_values = True: enum fields store their .value
bcql
cached property
Convenience property to get the BCQL string representation of this node.
to_bcql
abstractmethod
Reconstruct a BCQL query string from this AST node.
The returned string is functionally equivalent to the original query but may differ in trivial whitespace and formatting.
View source on GitHub: src/bcql_py/models/base.py lines 32–38
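The two behaviours described for BCQLNode, immutability after creation and a cached bcql property, can be illustrated without Pydantic. The following is a minimal sketch that uses a frozen dataclass as a stand-in; the class Quoted is hypothetical and this is not how bcql_py implements its base class.

```python
from dataclasses import FrozenInstanceError, dataclass
from functools import cached_property


@dataclass(frozen=True)
class Quoted:
    """Hypothetical immutable node; not the bcql_py base class."""

    value: str

    def to_bcql(self) -> str:
        return f'"{self.value}"'

    @cached_property
    def bcql(self) -> str:
        # Computed on first access, then stored in the instance __dict__.
        return self.to_bcql()


node = Quoted("man")
first = node.bcql  # triggers to_bcql once and caches the result

try:
    node.value = "woman"  # frozen instances reject attribute assignment
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Caching is safe here precisely because the node cannot change after construction, so the rendered string can never go stale.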
AnnotationRef
Bases: BCQLNode
Reference to a captured token's annotation: label.annotation, or a bare capture label.
Examples:
- A.word refers to the word annotation of capture A.
- A as a bare label (typically used as a function argument, e.g. start(A)).
Attributes:
| Name | Type | Description |
|---|---|---|
| label | str | Capture group name. |
| annotation | str | Annotation name, or empty string for a bare label reference. |
to_bcql
Return this annotation reference in BCQL syntax.
View source on GitHub: src/bcql_py/models/capture.py lines 71–75
CaptureNode
Bases: BCQLNode
A capture label applied to a sub-query: label:body, e.g. A:[word="hello"].
Everything matched by body is captured under label in the match info.
Attributes:
| Name | Type | Description |
|---|---|---|
| label | str | The capture group name (e.g. |
| body | BCQLNodeUnion | The sub-query whose match is captured. |
to_bcql
Return this capture expression in BCQL syntax.
View source on GitHub: src/bcql_py/models/capture.py lines 48–50
ConstraintBoolean
Bases: BCQLNode
Boolean combination of capture constraints: left op right.
Operators: & (AND), | (OR), -> (implication). All three share the same precedence per Bcql.g4's booleanOperator rule. The -> implication operator is most commonly seen in capture constraints (e.g. A.word = "cat" -> B.word = "dog") but the grammar allows it at every level.
Attributes:
| Name | Type | Description |
|---|---|---|
| operator | Literal['&', '\|', '->'] | |
| left | CaptureConstraintExpr | Left operand. |
| right | CaptureConstraintExpr | Right operand. |
to_bcql
Return this boolean expression in BCQL syntax.
View source on GitHub: src/bcql_py/models/capture.py lines 142–144
ConstraintComparison
Bases: BCQLNode
A comparison in a capture constraint: left op right.
Supported operators: =, !=, <, <=, >, >=.
The operators are stored as plain strings rather than as a node class of their own, which should not be needed here.
Attributes:
| Name | Type | Description |
|---|---|---|
| operator | Literal['=', '!=', '<', '<=', '>', '>='] | The comparison operator. |
| left | CaptureConstraintExpr | Left-hand operand (usually an AnnotationRef). |
| right | CaptureConstraintExpr | Right-hand operand (annotation ref, literal, or function call). |
to_bcql
Return this comparison expression in BCQL syntax.
View source on GitHub: src/bcql_py/models/capture.py lines 118–120
ConstraintFunctionCall
Bases: BCQLNode
A function call in a capture constraint.
Examples: start(A) or end(B) used in expressions like start(B) < start(A).
Attributes:
| Name | Type | Description |
|---|---|---|
| name | str | Function name (e.g. |
| args | list[CaptureConstraintExpr] | Function arguments (annotation refs, literals, etc.). |
to_bcql
Return this function call in BCQL syntax.
View source on GitHub: src/bcql_py/models/capture.py lines 235–238
ConstraintInteger
Bases: BCQLNode
An integer literal in a capture constraint.
Example: the 5 in focus.pos > 5.
Attributes:
| Name | Type | Description |
|---|---|---|
| value | int | The integer value. |
to_bcql
Return this integer literal in BCQL syntax.
View source on GitHub: src/bcql_py/models/capture.py lines 174–176
ConstraintLiteral
Bases: BCQLNode
A literal string value in a capture constraint.
Example: the "over" in A.word = "over".
Attributes:
| Name | Type | Description |
|---|---|---|
| value | str | The literal string (without quotes). |
| quote_char | Literal['"', "'"] | The quote character used in the original query, either |
to_bcql
Return this literal value in BCQL syntax.
View source on GitHub: src/bcql_py/models/capture.py lines 94–96
ConstraintNot
Bases: BCQLNode
Logical NOT in a capture constraint.
Attributes:
| Name | Type | Description |
|---|---|---|
| operand | CaptureConstraintExpr | The constraint being negated. |
to_bcql
Return this negated expression in BCQL syntax.
View source on GitHub: src/bcql_py/models/capture.py lines 157–159
GlobalConstraintNode
Bases: BCQLNode
A query with a global capture constraint.
The constraint expression follows the :: operator and relates captures defined in body.
Example: A:[] "by" B:[] :: A.word = B.word where A:[] "by" B:[] is the body and A.word = B.word is the constraint expression.
Attributes:
| Name | Type | Description |
|---|---|---|
| body | BCQLNodeUnion | The main query containing captures. |
| constraint | CaptureConstraintExpr | The constraint expression relating captures. |
to_bcql
Return this global-constraint query in BCQL syntax.
View source on GitHub: src/bcql_py/models/capture.py lines 289–291
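How the capture-constraint pieces above combine into a body :: constraint query can be sketched with plain string helpers. The functions annotation_ref, comparison, and global_constraint are hypothetical names chosen for this illustration; they operate on strings and are not the bcql_py node classes, and the spacing around operators is an arbitrary choice.

```python
# Comparison operators listed for ConstraintComparison in this reference.
COMPARISON_OPERATORS = {"=", "!=", "<", "<=", ">", ">="}


def annotation_ref(label: str, annotation: str = "") -> str:
    """label.annotation, or the bare label when annotation is empty."""
    return f"{label}.{annotation}" if annotation else label


def comparison(left: str, operator: str, right: str) -> str:
    if operator not in COMPARISON_OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    return f"{left} {operator} {right}"


def global_constraint(body: str, constraint: str) -> str:
    """Join the main query and its constraint with the :: operator."""
    return f"{body} :: {constraint}"


query = global_constraint(
    'A:[] "by" B:[]',
    comparison(annotation_ref("A", "word"), "=", annotation_ref("B", "word")),
)
# Bare labels appear as function arguments, as in start(B) < start(A).
ordering = comparison(f"start({annotation_ref('B')})", "<", f"start({annotation_ref('A')})")
```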
FunctionCallNode
Bases: BCQLNode
A built-in function call at the sequence level.
Function arguments can be sub-queries (BCQLNode), string values (StringValue), or integers.
Attributes:
| Name | Type | Description |
|---|---|---|
| name | str | Function name. |
| args | list[BCQLNodeUnion \| int] | Positional arguments. |
to_bcql
Return this function call in BCQL syntax.
View source on GitHub: src/bcql_py/models/function.py lines 33–45
LookaheadNode
Bases: BCQLNode
A lookahead assertion: (?=...) (positive) or (?!...) (negative).
Matches a position only if the enclosed query matches (or doesn't match) the tokens that follow.
Attributes:
| Name | Type | Description |
|---|---|---|
| positive | bool | |
| body | BCQLNodeUnion | The sub-query that must (or must not) match ahead. |
to_bcql
Return this lookahead assertion in BCQL syntax.
View source on GitHub: src/bcql_py/models/lookaround.py lines 37–40
LookbehindNode
Bases: BCQLNode
A lookbehind assertion: (?<=...) (positive) or (?<!...) (negative).
Matches a position only if the enclosed query matches (or doesn't match) the tokens that precede.
Attributes:
| Name | Type | Description |
|---|---|---|
| positive | bool | |
| body | BCQLNodeUnion | The sub-query that must (or must not) match behind. |
to_bcql
Return this lookbehind assertion in BCQL syntax.
View source on GitHub: src/bcql_py/models/lookaround.py lines 59–62
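The four lookaround prefixes described for LookaheadNode and LookbehindNode differ only in direction and polarity, which a single helper can make explicit. The function lookaround and its keyword arguments are hypothetical and not part of bcql_py; the body is passed as an already rendered string.

```python
def lookaround(body: str, *, behind: bool, positive: bool) -> str:
    """Wrap a rendered sub-query in a lookahead or lookbehind assertion."""
    direction = "<" if behind else ""     # lookbehind adds < after the ?
    polarity = "=" if positive else "!"   # = asserts a match, ! asserts none
    return f"(?{direction}{polarity}{body})"


ahead_pos = lookaround('"cat"', behind=False, positive=True)
ahead_neg = lookaround('"cat"', behind=False, positive=False)
behind_pos = lookaround('"cat"', behind=True, positive=True)
behind_neg = lookaround('"cat"', behind=True, positive=False)
```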
ChildConstraint
Bases: BCQLNode
A single child constraint in a relation query.
Represents [-label:] -type-> target inside a relation expression.
Multiple child constraints are separated by ;. The target itself can be any BCQL sub-query, including another relation query (e.g. _ -nsubj-> (_ -amod-> _)).
Attributes:
| Name | Type | Description |
|---|---|---|
| operator | RelationOperator | The relation operator (type, negation, target field). |
| target | BCQLNodeUnion | The target sub-query. |
| label | str \| None | Optional capture label on this child relation (e.g. |
to_bcql
Return this child constraint in BCQL syntax.
View source on GitHub: src/bcql_py/models/relation.py lines 84–87
RelationNode
Bases: BCQLNode
A dependency relation query: source -type-> target [; -type-> target]*.
The source is specified once; one or more child constraints follow, separated by ;.
Attributes:
| Name | Type | Description |
|---|---|---|
| source | BCQLNodeUnion | The source of the relation. |
| children | list[ChildConstraint] | One or more target constraints. |
to_bcql
Return this relation query in BCQL syntax.
View source on GitHub: src/bcql_py/models/relation.py lines 106–112
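The source -type-> target [; -type-> target]* shape of a relation query, including a nested relation as a target, can be sketched with two string helpers. Both function names are hypothetical, children are passed as (type, negated, target) tuples for brevity, and the spacing around ; is an arbitrary choice; capture labels and cross-field targets are left out.

```python
def relation_operator(relation_type: str, negated: bool = False) -> str:
    """Render -type-> or, when negated, !-type->."""
    return ("!" if negated else "") + f"-{relation_type}->"


def relation_query(source: str, children: list) -> str:
    """Render one source followed by its child constraints, separated by ;."""
    parts = [
        f"{relation_operator(rel_type, negated)} {target}"
        for rel_type, negated, target in children
    ]
    return f"{source} " + " ; ".join(parts)


# A nested relation used as a target is parenthesized by the caller.
inner = relation_query("_", [("amod", False, "_")])
nested = relation_query("_", [("nsubj", False, f"({inner})")])
two_children = relation_query("_", [("nsubj", False, "_"), ("obj", True, '"dog"')])
```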
RelationOperator
Bases: BCQLNode
The operator in a relation query: -type-> or !-type->.
See https://github.com/instituutnederlandsetaal/BlackLab/blob/dev/site/docs/guide/040_query-language/020_relations.md#negative-child-constraints for details on negative relations.
Attributes:
| Name | Type | Description |
|---|---|---|
| relation_type | str \| None | The relation type as a string or regex pattern (e.g. |
| negated | bool | |
| target_field | str \| None | For cross-field relations (e.g. |
to_bcql
Return this relation operator in BCQL syntax.
View source on GitHub: src/bcql_py/models/relation.py lines 55–60
RootRelationNode
Bases: BCQLNode
A root relation query: ^-type-> target or label:^-type-> target.
Root relations have no source, only a target. They match the root of a dependency tree.
Usually this relation does not have a "type" (since ROOT is the dependency relation from the root), but some corpora may differ.
TODO: see if the Validator and CorpusSpec should account for "allowed root relations"
Attributes:
| Name | Type | Description |
|---|---|---|
| relation_type | str \| None | Optional relation type filter (usually |
| target | BCQLNodeUnion | The target sub-query. |
| label | str \| None | Optional capture label. |
to_bcql
Return this root-relation query in BCQL syntax.
View source on GitHub: src/bcql_py/models/relation.py lines 140–144
GroupNode
Bases: BCQLNode
A parenthesized group of sub-queries.
Groups allow applying repetition operators or capture constraints to a complex sub-expression. A group holds exactly one child node: typically a SequenceNode when it wraps several adjacent tokens, or a single token-level node otherwise.
Attributes:
| Name | Type | Description |
|---|---|---|
| child | BCQLNodeUnion | The inner sub-query. |
to_bcql
Return this parenthesized group in BCQL syntax.
View source on GitHub: src/bcql_py/models/sequence.py lines 111–113
NegationNode
Bases: BCQLNode
Sequence-level negation (!).
Negation sits at the span level in the precedence chain (above repetition), so !"man"+ parses as !("man"+) per Bcql.g4's sequencePartNoCapture rule.
The child is always a single span-level node (never a bare sequence), so to_bcql just prepends ! without extra parentheses.
Attributes:
| Name | Type | Description |
|---|---|---|
| child | BCQLNodeUnion | The sub-query being negated. |
to_bcql
Return this negated sub-query in BCQL syntax.
View source on GitHub: src/bcql_py/models/sequence.py lines 154–156
RepetitionNode
Bases: BCQLNode
A repetition quantifier applied to a sub-query.
Supports + (1+), * (0+), ? (0 or 1), {n}, {n,m}, {n,}. Note that "up to" quantifiers like {0,m} are exported as {,m} and may therefore differ in surface form from the original query.
Attributes:
| Name | Type | Description |
|---|---|---|
| child | BCQLNodeUnion | The sub-query being repeated. |
| min_count | int | Minimum number of repetitions (inclusive, min. 0). |
| max_count | int \| None | Maximum number of repetitions (inclusive), or |
to_bcql
Return this repetition expression in BCQL syntax.
View source on GitHub: src/bcql_py/models/sequence.py lines 91–93
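The quantifier forms listed for RepetitionNode can be sketched as a mapping from a (min_count, max_count) pair to a suffix string. This is an illustration only: the function quantifier is hypothetical, it assumes that a max_count of None means "no upper bound", and the choice to prefer the shorthand forms +, *, and ? is an assumption rather than documented library behaviour. The {,m} case follows the export rule stated above.

```python
from typing import Optional


def quantifier(min_count: int, max_count: Optional[int]) -> str:
    """Render a repetition suffix; None is assumed to mean unbounded."""
    if min_count < 0:
        raise ValueError("min_count must be at least 0")
    if max_count is not None and max_count < min_count:
        raise ValueError("max_count must not be below min_count")
    if (min_count, max_count) == (1, None):
        return "+"
    if (min_count, max_count) == (0, None):
        return "*"
    if (min_count, max_count) == (0, 1):
        return "?"
    if max_count is None:
        return f"{{{min_count},}}"        # {n,}
    if min_count == max_count:
        return f"{{{min_count}}}"         # {n}
    if min_count == 0:
        return f"{{,{max_count}}}"        # {0,m} is written as {,m}
    return f"{{{min_count},{max_count}}}"  # {n,m}
```

For example, a query written with {0,4} would come back as {,4} under this rule, which is the surface-form difference the note above refers to.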
SequenceBoolNode
Bases: BCQLNode
Sequence-level boolean combination (&, |, ->).
Binary, left-associative node mirroring the booleanOperator rule in Bcql.g4: all three operators share the same precedence. For example, "a" | "b" & "c" parses as ("a" | "b") & "c".
Attributes:
| Name | Type | Description |
|---|---|---|
| operator | Literal['&', '\|', '->'] | The boolean operator. |
| left | BCQLNodeUnion | The left operand. |
| right | BCQLNodeUnion | The right operand. |
to_bcql
Return this sequence-level boolean expression in BCQL syntax.
View source on GitHub: src/bcql_py/models/sequence.py lines 134–136
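Equal precedence combined with left associativity, as described for SequenceBoolNode, amounts to folding operands from left to right. The sketch below shows that fold on a flat token list; the Bool class, fold_left, and show are hypothetical helpers for illustration and not the bcql_py parser.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Bool:
    """Hypothetical binary node; not the bcql_py SequenceBoolNode."""

    operator: str
    left: "Expr"
    right: "Expr"


Expr = Union[str, Bool]


def fold_left(tokens: list) -> Expr:
    """Combine [operand, op, operand, op, ...] strictly left to right."""
    result = tokens[0]
    for i in range(1, len(tokens), 2):
        # The tree built so far becomes the left operand of the next operator.
        result = Bool(tokens[i], result, tokens[i + 1])
    return result


def show(expr: Expr) -> str:
    """Render with explicit parentheses so the grouping is visible."""
    if isinstance(expr, Bool):
        return f"({show(expr.left)} {expr.operator} {show(expr.right)})"
    return expr


tree = fold_left(['"a"', "|", '"b"', "&", '"c"'])
rendered = show(tree)
```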
SequenceNode
Bases: BCQLNode
An ordered sequence of adjacent tokens / sub-queries. A very high-level node type that can represent an entire query or a sub-sequence.
Attributes:
| Name | Type | Description |
|---|---|---|
| children | list[BCQLNodeUnion] | The ordered list of child nodes in the sequence. |
to_bcql
Return this sequence in BCQL syntax.
View source on GitHub: src/bcql_py/models/sequence.py lines 47–49
UnderscoreNode
Bases: BCQLNode
The _ wildcard used in relation queries.
Distinct from [] (match-all token): _ means "any source or target" in a relation expression without constraining token count.
to_bcql
Return the underscore wildcard in BCQL syntax.
View source on GitHub: src/bcql_py/models/sequence.py lines 168–170
PositionFilterNode
Bases: BCQLNode
A position-filter operator: within, containing, or overlap.
Example: "baker" within <person/> means find "baker" inside a <person/> span.
These operators are right-associative, so A within B within C is parsed as A within (B within C).
Attributes:
| Name | Type | Description |
|---|---|---|
| operator | Literal['within', 'containing', 'overlap'] | One of |
| left | BCQLNodeUnion | The query whose hits are filtered. |
| right | BCQLNodeUnion | The span/query that defines the positional constraint. |
to_bcql
Return this position filter in BCQL syntax.
View source on GitHub: src/bcql_py/models/span.py lines 96–98
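Right associativity, as stated for the position-filter operators, means the fold runs from the last operand backwards. The following minimal sketch builds nested tuples of the form (operator, left, right); fold_right and show are hypothetical helpers for illustration, not the bcql_py parser.

```python
def fold_right(operands: list, operators: list):
    """Group operands from the right: A op1 B op2 C -> A op1 (B op2 C)."""
    result = operands[-1]
    for operand, op in zip(reversed(operands[:-1]), reversed(operators)):
        # The tree built so far becomes the right operand of the earlier operator.
        result = (op, operand, result)
    return result


def show(expr) -> str:
    """Render with explicit parentheses so the grouping is visible."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return f"({show(left)} {op} {show(right)})"
    return expr


tree = fold_right(["A", "B", "C"], ["within", "within"])
mixed = fold_right(["A", "B", "C"], ["within", "containing"])
```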
SpanQuery
Bases: BCQLNode
A span (XML tag) query.
Three forms exist per Bcql.g4's tag rule:
- Whole span: <s/> or <ne type="PERS"/>
- Start tag: <s>
- End tag: </s>
The tag name can be a plain identifier (s, ne) or a quoted string for regex patterns (<"person|location"/>).
Attributes:
| Name | Type | Description |
|---|---|---|
| tag_name | str \| StringValue | The tag name as a plain string or |
| position | Literal['whole', 'start', 'end'] | |
| attributes | dict[str, StringValue] | XML attributes as |
to_bcql
Return this span query in BCQL syntax.
View source on GitHub: src/bcql_py/models/span.py lines 65–73
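The three tag forms listed for SpanQuery, plus the quoted-name variant for regex patterns, can be sketched as one formatting function. The helper span_tag and its quoted flag are hypothetical; attribute values are plain strings here rather than StringValue nodes, and rendering attributes on start tags while dropping them on end tags is an assumption of this sketch.

```python
from typing import Optional


def span_tag(
    tag_name: str,
    position: str = "whole",
    attributes: Optional[dict] = None,
    quoted: bool = False,
) -> str:
    """Render a span query as a whole span, start tag, or end tag."""
    name = f'"{tag_name}"' if quoted else tag_name  # quote regex tag names
    attrs = "".join(f' {key}="{value}"' for key, value in (attributes or {}).items())
    if position == "whole":
        return f"<{name}{attrs}/>"
    if position == "start":
        return f"<{name}{attrs}>"
    if position == "end":
        return f"</{name}>"
    raise ValueError(f"unknown position: {position!r}")


whole = span_tag("s")
with_attr = span_tag("ne", attributes={"type": "PERS"})
start = span_tag("s", position="start")
end = span_tag("s", position="end")
regex_name = span_tag("person|location", quoted=True)
```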
AnnotationConstraint
Bases: BCQLNode
A single annotation comparison: annotation op "value".
Typically consists of an identifier, an operator, and a string value.
Note that the identifier carries no fixed semantics here: which annotations (like word, lemma, pos) are available depends entirely on the corpus, so annotation is left underspecified as a plain string.
Example: word="man" or pos != "noun".
Attributes:
| Name | Type | Description |
|---|---|---|
| annotation | str | The annotation name (e.g. |
| operator | Literal['=', '!=', '<', '<=', '>', '>='] | |
| value | StringValue | The value being compared against. |
to_bcql
Return this annotation constraint in BCQL syntax.
View source on GitHub: src/bcql_py/models/token.py lines 89–91
BoolConstraint
Bases: BCQLNode
Boolean combination of token-level constraints: left op right.
The operator is & (AND), | (OR), or -> (implication). Per the BCQL spec / Bcql.g4, all three share identical precedence and are left-associative; see the booleanOperator rule in Bcql.g4. The name "boolean" is somewhat loose for the implication case.
Not to be confused with sequence-level boolean operators (also &, |, and ->), which combine whole sub-queries instead of token constraints. See sequence.SequenceBoolNode for those.
Attributes:
| Name | Type | Description |
|---|---|---|
| operator | Literal['&', '\|', '->'] | |
| left | ConstraintExpr | Left operand. |
| right | ConstraintExpr | Right operand. |
to_bcql
Return this boolean token constraint in BCQL syntax.
View source on GitHub: src/bcql_py/models/token.py lines 181–183
FunctionConstraint
Bases: BCQLNode
A function-call constraint inside token brackets.
TODO: check for predefined functions in blacklab?
Attributes:
| Name | Type | Description |
|---|---|---|
| name | str | The function / pseudo-annotation name. |
| args | list[StringValue] | The string arguments to the function. |
to_bcql
Return this function constraint in BCQL syntax.
View source on GitHub: src/bcql_py/models/token.py lines 131–134
IntegerRangeConstraint
Bases: BCQLNode
An integer range constraint, such as a parser's confidence: confidence=in[min,max].
Example: pos_confidence=in[50,100].
Note that both the minimum and the maximum value must be given. There are no implicit "infinite" or "zero" bounds.
Attributes:
| Name | Type | Description |
|---|---|---|
| annotation | str | The annotation name. |
| min_val | int | Inclusive lower bound. |
| max_val | int | Inclusive upper bound. |
to_bcql
Return this integer range constraint in BCQL syntax.
View source on GitHub: src/bcql_py/models/token.py lines 112–114
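The annotation=in[min,max] form, with both bounds mandatory, can be sketched as a formatter and a strict parser. Both functions and the regular expression are assumptions made for this illustration (for instance, the pattern only accepts non-negative integers and word-character annotation names); they are not the grammar used by bcql_py.

```python
import re

# Both bounds are required: there is no alternative that omits either number.
RANGE_PATTERN = re.compile(r"(\w+)=in\[(\d+),(\d+)\]")


def format_integer_range(annotation: str, min_val: int, max_val: int) -> str:
    return f"{annotation}=in[{min_val},{max_val}]"


def parse_integer_range(text: str) -> tuple:
    """Return (annotation, min_val, max_val) or raise if a bound is missing."""
    match = RANGE_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a complete integer range: {text!r}")
    annotation, low, high = match.groups()
    return annotation, int(low), int(high)


rendered = format_integer_range("pos_confidence", 50, 100)
parsed = parse_integer_range(rendered)
```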
NotConstraint
Bases: BCQLNode
Logical NOT on a token-level constraint: !expr.
Typically for a capture group: !(pos="noun" | pos="verb").
Attributes:
| Name | Type | Description |
|---|---|---|
| operand | ConstraintExpr | The constraint being negated. |
to_bcql
Return this negated token constraint in BCQL syntax.
View source on GitHub: src/bcql_py/models/token.py lines 149–155
StringValue
Bases: BCQLNode
A quoted string value inside a BCQL query.
Handles regular strings, literal strings (prefixed with l), and sensitivity flags ((?-i) for sensitive, (?i) for insensitive).
Attributes:
| Name | Type | Description |
|---|---|---|
| value | str | The raw string content (without surrounding quotes). |
| is_literal | bool | |
| sensitivity | Literal['default', 'sensitive', 'insensitive'] | |
Example::
    from bcql_py.models import StringValue
    StringValue(value="(?-i)Panama").to_bcql()
    # '"(?-i)Panama"'
to_bcql
Return this string value in BCQL syntax.
View source on GitHub: src/bcql_py/models/token.py lines 58–62
TokenQuery
Bases: BCQLNode
A single token query: [...], "string" shorthand, or [].
Attributes:
| Name | Type | Description |
|---|---|---|
| constraint | ConstraintExpr \| None | The constraint expression inside the brackets, or |
| negated | bool | |
| shorthand | StringValue \| None | When the query was written as a bare string like |
to_bcql
Return this token query in BCQL syntax.
View source on GitHub: src/bcql_py/models/token.py lines 236–244