Skip to content

Remove complicated examples from patterns #976

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 7 additions & 1 deletion core-concepts/modules/ROOT/pages/typeql/index.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -66,4 +66,10 @@ TypeQL functions bring the familiar abstraction from programming languages to da
****
Short definitions of terms used throughout the guide, with links to the section which introduces them.
****
--

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're happy with the state of it, we can include it here (and maybe in the nav)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you can include it. Any page we have i think that is 'live' should be in the nav!

You can also link to it from the query-variables-patterns page i think!

.xref:{page-version}@core-concepts::typeql/invalid-patterns.adoc[]
[.clickable]
****
A deeper look at invalid patterns involving disjoint variable reuse
****
--
198 changes: 198 additions & 0 deletions core-concepts/modules/ROOT/pages/typeql/invalid-patterns.adoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,198 @@
= Advanced: Invalid Patterns

This section dives into the semantics of TypeQL,
using the <<_disjunctive_normal_form>> to explain why <<_disjoint_variable_reuse>> makes a pattern invalid.


== Disjoint variable reuse
This section demonstrates some of the cases which can arise due to
xref:new_core_concepts::typeql/query-variables-patterns.adoc#_disjoint_variable_usage_errors[disjoint variable reuse]

[unbound negations]
=== Unbound negation inputs
Notice we can write a pattern where the input variable to a negation is not bound in the parent.
For example:
[,typeql]
----
match
$p1 isa person; $p2 isa person;
{
$emp1 isa employment, links (employer: $company, employee: $p1);
} or {
$edu1 isa education, links (institute: $institute, attendee: $p2);
};

not { $emp2 isa employment, links (employer: $company, employee: $p2); };
not { $edu2 isa employment, links (institute: $institute, attendee: $p2); };
----
At first glance, this looks like a reasonable query:
we query for persons `$p1` and `$p2` who neither worked for the same company,
nor attended the same institute.
However, you can see that the input variables for the negations (`$company` and `$institute`)
are not local to the negation, but also not bound in the parent conjunction.

==== Disjunctive Normal Form
The best way to think about these requirements is to convert the query to Disjunctive Normal Form
by rewriting the pattern using "distributivity" and examining each branch:

`A; { B; } or { C; };` becomes `{A; B} or {A; C};`

In this case, we get the pattern:
[,typeql]
----
match
{
$p1 isa person; $p2 isa person;
$emp1 isa employment, links (employer: $company, employee: $p1);
not { $emp2 isa employment, links (employer: $company, employee: $p2); };
not { $edu2 isa employment, links (institute: $institute, attendee: $p2); };
} or {
$p1 isa person; $p2 isa person;
$edu1 isa education, links (institute: $institute, attendee: $p2);
not { $emp2 isa employment, links (employer: $company, employee: $p2); };
not { $edu2 isa education, links (institute: $institute, attendee: $p2); };
};
----
Although this could now be a valid logic query,
the first branch requires that `$p2` did not attend *any* institute,
and the second branch requires that `$p2` was not employed by *any* employer.
This is clearly not what we intended to write. Hence, we flag these as invalid TypeQL queries.

[NOTE]
====
There is, of course, a way to express the intended query:
[,typeql]
.The correct query
----
$p1 isa person; $p2 isa person;
not {
$emp1 isa employment, links (employer: $company, employee: $p1);
$emp2 isa employment, links (employer: $company, employee: $p2);
};
not {
$edu1 isa education, links (institute: $institute, attendee: $p2);
$edu2 isa education, links (institute: $institute, attendee: $p2);
};
----
====

=== Reusing branch local variables in disjunctions

Consider another case of questionable query composition:
[,typeql]
----
match
$p1 isa person; $p2 isa person;
{
$emp1 isa employment, links (employer: $company, employee: $p1);
} or {
$edu1 isa education, links (institute: $institute, attendee: $p2);
};
{
$emp2 isa employment, links (employer: $company, employee: $p2);
} or {
$edu2 isa education, links (institute: $institute, attendee: $p2);
};
----
Ideally, this would be a query to find two persons `$p1` and `$p2` who
were either employed by the same company, or attended the same institute.

The DNF quickly reveals the mistake:
[,typeql]
----
match
{
$p1 isa person; $p2 isa person;
$emp1 isa employment, links (employer: $company, employee: $p1);
$emp2 isa employment, links (employer: $company, employee: $p2);
} or {
$p1 isa person; $p2 isa person;
$edu1 isa education, links (institute: $institute, attendee: $p2);
$emp2 isa employment, links (employer: $company, employee: $p2);
} or
{
$p1 isa person; $p2 isa person;
$emp1 isa employment, links (employer: $company, employee: $p1);
$edu2 isa education, links (institute: $institute, attendee: $p2);
} or {
$p1 isa person; $p2 isa person;
$edu1 isa education, links (institute: $institute, attendee: $p2);
$edu2 isa education, links (institute: $institute, attendee: $p2);
};
----

You can see the query we meant to write in two of those branches:
[,typeql]
----
match
$p1 isa person; $p2 isa person;
{
$emp1 isa employment, links (employer: $company, employee: $p1);
$emp2 isa employment, links (employer: $company, employee: $p2);
} or {
$edu1 isa education, links (institute: $institute, attendee: $p2);
$edu2 isa education, links (institute: $institute, attendee: $p2);
};
----

The problem lies in the other two branches.
[,typeql]
----
match
{
$p1 isa person; $p2 isa person;
$emp1 isa employment, links (employer: $company, employee: $p1);
$edu2 isa education, links (institute: $institute, attendee: $p2);
} or {
$p1 isa person; $p2 isa person;
$edu1 isa education, links (institute: $institute, attendee: $p2);
$emp2 isa employment, links (employer: $company, employee: $p2);
};
----
This will return any persons `$p1` & `$p2` when
either (1) `$p1` is employed by *any* and `$p2` attended *any* institute;
or (2) `$p2` is employed by *any* company and `$p1` attended *any* institute.

Notice `$company` is "internal" to both the first and second disjunctions (The same is the case for `$institute`).
TypeQL throws a "disjoint variable re-use" error for such cases.

=== Single origin for optionals
Optional variables can only be used in only one `try` block in a given match clause. Consider:
[,typeql]
----
match
try { $x isa person, has name "John"; };
try { $x isa person, has name "James"; };
----
The reason this is banned is similar to the <<_unbound_negation_inputs>> case above:
It is unclear if `$x` is an input or either of the `try` blocks, or an optional variable.

The same variable can be used in `try` blocks in the next stage.
This is because the variable is bound - either to a concept, or to `None` - and is an 'input' to the block.
The block simply fails if it tries to use a variable bound to `None` in a constraint.

[NOTE]
====
The semantics of try blocks dictate the equivalence of

`try { P };` and `{ P } or { not { P' }; };`

where, `P'` is the pattern obtained by renaming all the 'optional' variables in `P` with fresh ones.
Rewriting the above query with this equivalence (assuming `$x` to be 'optional' and renaming it, although this is ambiguous)
[,tyepql]
----
{ $x "John"; } or not { $x_1 "John"; };
{ $x "James"; } or not { $x_2 "James"; };
----

Taking the DNF gives us 4 cases, and some very unintuitive behaviour:
[,typeql]
----
{ $x "John"; $x "James"; } or # persons named both James and John.
{ $x "John"; not { $x_2 "James"; }; } or # If nobody is named James, then persons named John
{ not { $x_1 "John"; }; $x "James"; } or # If nobody is named John, then persons named James
{ not { $x_1 "John"; }; not { $x_2 "James"; }; }; # If nobody is named either, then a single answer `None`
# If there is one person named John, and a different person named James, then no answers
----

====
Original file line number Diff line number Diff line change
Expand Up @@ -329,156 +329,20 @@ Notice that a variable that is "bound" in a conjunction is guaranteed to be boun
in ALL execution paths - i.e., regardless of which branch is taken in each disjunction.
An answer to a pattern contains only those variables bound in the root conjunction.

// TODO: This could be its own page
=== Invalid patterns: unbound negation inputs
Notice we can write a pattern where the input variable to a negation is not bound in the parent.
For example:
=== Disjoint variable usage errors
Since variable names are scoped to a pipeline, it's possible to write a query in which
a certain variable is internal to several different patterns. For example:
[,typeql]
----
match
$p1 isa person; $p2 isa person;
{
$emp1 isa employment, links (employer: $company, employee: $p1);
} or {
$edu1 isa education, links (institute: $institute, attendee: $p2);
};

not { $emp2 isa employment, links (employer: $company, employee: $p2); };
not { $edu2 isa employment, links (institute: $institute, attendee: $p2); };
----
At first glance, this looks like a reasonable query:
we query for persons `$p1` and `$p2` who neither worked for the same company,
nor attended the same institute.
However, you can see that the input variables for the negations (`$company` and `$institute`)
are not bound in the parent conjunction. Hence, this is an invalid query.

==== DNF
The best way to think about these requirements is to convert the query to Disjunctive Normal Form
by rewriting the pattern using "distributivity" and examining each branch:

`A; { B; } or { C; };` becomes `{A; B} or {A; C};`

In this case, we get the pattern:
[,typeql]
----
match
{
$p1 isa person; $p2 isa person;
$emp1 isa employment, links (employer: $company, employee: $p1);
not { $emp2 isa employment, links (employer: $company, employee: $p2); };
not { $edu2 isa employment, links (institute: $institute, attendee: $p2); };
} or {
$p1 isa person; $p2 isa person;
$edu1 isa education, links (institute: $institute, attendee: $p2);
not { $emp2 isa employment, links (employer: $company, employee: $p2); };
not { $edu2 isa education, links (institute: $institute, attendee: $p2); };
};
----
Although this could now be a valid logic query,
the first branch requires that `$p2` did not attend *any* institute,
and the second branch requires that `$p2` was not employed by *any* employer.
This is clearly not what we meant. Hence, we flag these as invalid TypeQL queries.

[NOTE]
====
There is, of course, a way to express the intended query:
[,typeql]
.The correct query
----
$p1 isa person; $p2 isa person;
not {
$emp1 isa employment, links (employer: $company, employee: $p1);
$emp2 isa employment, links (employer: $company, employee: $p2);
};
not {
$edu1 isa education, links (institute: $institute, attendee: $p2);
$edu2 isa education, links (institute: $institute, attendee: $p2);
};
----
====

=== Invalid patterns: Disjoint variable re-use

Consider another case of questionable query composition:
[,typeql]
{ $x isa person, has name "John"; } or { $y isa person, has name "James"; };
{ $x isa person, has age 25; } or { $y isa person, has age 50; };
----
match
$p1 isa person; $p2 isa person;
{
$emp1 isa employment, links (employer: $company, employee: $p1);
} or {
$edu1 isa education, links (institute: $institute, attendee: $p2);
};
{
$emp2 isa employment, links (employer: $company, employee: $p2);
} or {
$edu2 isa education, links (institute: $institute, attendee: $p2);
};
----
Ideally, this would be a query to find two persons `$p1` and `$p2` who
were either employed by the same company, or attended the same institute.

The DNF quickly reveals the mistake:
[,typeql]
----
match
{
$p1 isa person; $p2 isa person;
$emp1 isa employment, links (employer: $company, employee: $p1);
$emp2 isa employment, links (employer: $company, employee: $p2);
} or {
$p1 isa person; $p2 isa person;
$edu1 isa education, links (institute: $institute, attendee: $p2);
$emp2 isa employment, links (employer: $company, employee: $p2);
} or
{
$p1 isa person; $p2 isa person;
$emp1 isa employment, links (employer: $company, employee: $p1);
$edu2 isa education, links (institute: $institute, attendee: $p2);
} or {
$p1 isa person; $p2 isa person;
$edu1 isa education, links (institute: $institute, attendee: $p2);
$edu2 isa education, links (institute: $institute, attendee: $p2);
};
----

You can see the query we meant to write in two of those branches:
[,typeql]
----
match
$p1 isa person; $p2 isa person;
{
$emp1 isa employment, links (employer: $company, employee: $p1);
$emp2 isa employment, links (employer: $company, employee: $p2);
} or {
$edu1 isa education, links (institute: $institute, attendee: $p2);
$edu2 isa education, links (institute: $institute, attendee: $p2);
};
----

The problem lies in the other two branches.
[,typeql]
----
match
{
$p1 isa person; $p2 isa person;
$emp1 isa employment, links (employer: $company, employee: $p1);
$edu2 isa education, links (institute: $institute, attendee: $p2);
} or {
$p1 isa person; $p2 isa person;
$edu1 isa education, links (institute: $institute, attendee: $p2);
$emp2 isa employment, links (employer: $company, employee: $p2);
};
----
This will return any persons `$p1` & `$p2` when
either (1) `$p1` is employed by *any* and `$p2` attended *any* institute;
or (2) `$p2` is employed by *any* company and `$p1` attended *any* institute.

Notice `$company` is "internal" to both the first and second disjunctions (The same is the case for `$institute`).
TypeQL throws a "disjoint variable re-use" error for such cases.
Such patterns are considered invalid and will throw disjoint variable usage errors.
See the xref:core-concepts::typeql/invalid-patterns.adoc[invalid patterns] page for more examples.

==== Valid variable scopes:
From these, it can be seen that a variable is either:
Rephrasing the xref:reference::typeql/patterns/index.adoc#_scopes[unique scope principle], a variable is either:

1. Internal to one branch of one disjunction.
2. Internal to one negation.
Expand Down
1 change: 1 addition & 0 deletions core-concepts/modules/ROOT/partials/nav.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@
** xref:{page-version}@core-concepts::typeql/query-variables-patterns.adoc[]
** xref:{page-version}@core-concepts::typeql/query-clauses.adoc[]
** xref:{page-version}@core-concepts::typeql/queries-as-functions.adoc[]
** xref:{page-version}@core-concepts::typeql/invalid-patterns.adoc[]
// ** xref:{page-version}@core-concepts::typeql/best-practices.adoc[]

* xref:{page-version}@core-concepts::drivers/index.adoc[]
Expand Down