
| Key: |
RDBMS-38
|
| Type: |
Improvement
|
| Status: |
Open
|
| Priority: |
Minor
|
| Assignee: |
Unassigned
|
| Reporter: |
Martin Brauner
|
| Votes: |
0
|
| Watchers: |
1
|
|
If you were logged in you would be able to see more operations.
|
|
|
|
When executing a query, the current JPOX implementation does not immediately fetch all objects B that are part of a collection field of the candidate class A of that query (so there is a A :1 --- :N B relationship), even though the user would expect this according to the given fetch plan (fetch groups and/or fetch depths). Instead, the B objects are fetched when the collection field is accessed during iteration of the A result objects. This means n+1 SQL statements which could be improved. See also thread "How to fetch objects including a 1:N relationship in 1 SQL join statement?" in the forum.
May Q be the generated SQL statement of a JDOQL query on a candidate class A and B the type of a collection field of A.
Case 1: direct join in 1 statement
Q' = select A.*, B.* from A, B where all B.fk (+) = A.pk and (WHERE clause of Q)
pros:
a) in lot of cases the most performant solution
cons:
a) would mean more than 1 candidate classes (here: A and B) to be fetched from the sql result set which probably means a lot of changes in the current implementation.
b) if not only 1 collection field of A should be fetched according to the fetch plan, but more, this could lead to a complex join over many tables with a lot of repetitive data being sent from the database in large rows and long query execution times. So this approach, good for 1 relation, can backfire with more relations.
c) a RANGE specified in Q cannot be handled in the database any longer in Q'
d) aggregations in the JDOQL query that result in a GROUP BY clause in Q will cause trouble for Q'
e) probably there can be situations in Q where also ORDER BY or DISTINCT clauses will be a problem
Case 2: separate direct joins
Q' = select B.* from A, B where all B.fk (+) = A.pk and (WHERE clause of Q)
Q' is executed after Q for each collection field of A according to fetch plan.
pros:
a) will eliminate cons a) + b) of case 1
cons:
a) cons c) to e) of case 1 remain
b) means 1+k (k = number of :N relations to navigate according fatch plan), but still far better than n+1
Case 3: separate subselect joins
Q' = select 'this'.* from B 'this', (Q) 'that' where all 'this'.fk fields = 'that'.pk fields
Q' is executed after Q for each collection field of A according to fetch plan.
pros:
a) would eliminate all cons of case 1
b) relativly easy to implement as SQL string of Q can directly be embedded into Q' (works even when recursively fetching collection fields of B)
cons:
a) can be inefficient in relation to a direct join. Don't know how good todays RDBMS query optimizers will handle these subselects. But of course
should still be better than running n statements against the database.
So, personally, I would say Case 3 could be a solution. Would be nice of you can improve JPOX with something like this.
Martin
|
|
Description
|
When executing a query, the current JPOX implementation does not immediately fetch all objects B that are part of a collection field of the candidate class A of that query (so there is a A :1 --- :N B relationship), even though the user would expect this according to the given fetch plan (fetch groups and/or fetch depths). Instead, the B objects are fetched when the collection field is accessed during iteration of the A result objects. This means n+1 SQL statements which could be improved. See also thread "How to fetch objects including a 1:N relationship in 1 SQL join statement?" in the forum.
May Q be the generated SQL statement of a JDOQL query on a candidate class A and B the type of a collection field of A.
Case 1: direct join in 1 statement
Q' = select A.*, B.* from A, B where all B.fk (+) = A.pk and (WHERE clause of Q)
pros:
a) in lot of cases the most performant solution
cons:
a) would mean more than 1 candidate classes (here: A and B) to be fetched from the sql result set which probably means a lot of changes in the current implementation.
b) if not only 1 collection field of A should be fetched according to the fetch plan, but more, this could lead to a complex join over many tables with a lot of repetitive data being sent from the database in large rows and long query execution times. So this approach, good for 1 relation, can backfire with more relations.
c) a RANGE specified in Q cannot be handled in the database any longer in Q'
d) aggregations in the JDOQL query that result in a GROUP BY clause in Q will cause trouble for Q'
e) probably there can be situations in Q where also ORDER BY or DISTINCT clauses will be a problem
Case 2: separate direct joins
Q' = select B.* from A, B where all B.fk (+) = A.pk and (WHERE clause of Q)
Q' is executed after Q for each collection field of A according to fetch plan.
pros:
a) will eliminate cons a) + b) of case 1
cons:
a) cons c) to e) of case 1 remain
b) means 1+k (k = number of :N relations to navigate according fatch plan), but still far better than n+1
Case 3: separate subselect joins
Q' = select 'this'.* from B 'this', (Q) 'that' where all 'this'.fk fields = 'that'.pk fields
Q' is executed after Q for each collection field of A according to fetch plan.
pros:
a) would eliminate all cons of case 1
b) relativly easy to implement as SQL string of Q can directly be embedded into Q' (works even when recursively fetching collection fields of B)
cons:
a) can be inefficient in relation to a direct join. Don't know how good todays RDBMS query optimizers will handle these subselects. But of course
should still be better than running n statements against the database.
So, personally, I would say Case 3 could be a solution. Would be nice of you can improve JPOX with something like this.
Martin
|
Show » |
Sort Order:
|
1- the fetched objects may already be in the cache. The cache is one the main mechanisms that improves the ORM engines performance.
2-may add more I/O for the above reason, and also for returning redundant data in the result set