Possible optimization when query candidate is a class with no base class


Page bloom
 

When a candidate class for a query is a class that has no base class then, in an RDBMS datastore at least, there should be no need to specify discriminators because they don't need to be part of any filter if you are interested in any instances of the base class or any of its subclasses/derived classes.

e.g. I have a Message class that has subclasses for each type of message that can be stored in the queue (the table). When processing messages in the queue I want the result to include all messages of any class so there should be no need for a discriminator.

A sample generated query is shown below.
Note: classid is the discriminator (it's an 'int' to avoid creating an Elizabethen novel of FQCNs in the SQL for each query and also avoids having indexes 50x the size of what they are when using int for discriminator type)

SELECT a0.creation_time AS nucorder0,a0.priority,a0.send_attempts,a0.msg_id,a0.version,a0.classid
FROM sa_msg_msg a0
WHERE
(
    (a0.classid = 3101 OR a0.classid = 469904975 OR a0.classid = 201472179 OR a0.classid = 85434732 OR a0.classid = 26141 OR
    a0.classid = 968904800 OR a0.classid = 621894233 OR a0.classid = 26132 OR a0.classid = 299345225 OR a0.classid = 486749780
    OR a0.classid = 26137 OR a0.classid = 485538021 OR a0.classid = 26144 OR a0.classid = 462802748 OR a0.classid = 82853853)
)
AND a0.priority > 0
ORDER BY nucorder0 LIMIT 0,1 

As you can see there's quite a lot of classid= in the where clause which are redundant in this case.

I'm not sure how this affects performance on most RDBMSes.

Some may be smart enough to realize that the full extent of discriminators is present in the query so it can effectively ignore that part of the where clause (just a wild guess) but most probably aren't.

If not ignored then we should probably look at creating a composite index that includes the discriminator.

I always have a simple index on discriminator but I think in this case a composite including discriminator would help.

Alternatively I propose (will look into) an optimization for this scenario where discriminators are left out of the where clause).


Andy
 
Edited

Perhaps if you look at http://www.datanucleus.org:15080/products/accessplatform_5_2/jdo/query.html#jdoql_candidate in particular the second "info" point. It specifies a query extension to not add a discriminator clause when you want to override default behaviour.

If a user was sharing a table for multiple things (or only some of the records applicable to DN usage) then they clearly would not want to include the other records in such a query ... hence why the default is to include a discriminator always.


Page bloom
 

On Fri, Jul 19, 2019 at 03:16 AM, Andy wrote:
If a user was sharing a table for multiple things (or only some of the records applicable to DN usage) then they clearly would not want to include the other records in such a query ... hence why the default is to include a discriminator always.
Ah, that reason makes sense.


Page bloom
 
Edited

In regard to the reference you gave:

If you want to override this and NOT have a discriminator restriction imposed in the SQL then you provide the query extension datanucleus.query.dontRestrictDiscriminator set to true.
This appears that it would affect all queries with any candidate class whereas I definitely still need discriminators in the case where the candidate class is anything but the ultimate base class... unless..

... When it says "if this class has a table"  - in my case, universally, the ultimate base class in any class hierarchy has a table and the descendant classes do not so maybe this would work. Given that the subclasses will not have their own table then when they are a candidate class the discriminators would be used - that is if I understand it correctly.


Page bloom
 

I tested the dontRestrictDiscriminator option but unfortunately it applies regardless of whether the class has a table (new-table in the meta data) or just uses the table of a class in its inheritance hierarchy.

So consequently, whenever a query is performed where the candidate class is a subclass and doesn't have it's own table but uses that of a superclass setting 
datanucleus.query.dontRestrictDiscriminator=true
will also prevent the SQL having a discriminator where clause even though they are definitely needed in that case otherwise it picks up every type of class in the table.

I was hoping the option would only apply to "classes with a table" as per my optimistic interpretation :)

Maybe the note could say, "classes with a table or classes extending a class with a table" or maybe clearer to say it's a global option and discriminators will never be used in any query to restrict class types to that of the candidate class or its subclasses - you would basically need to have "table per class" strategy for that to be workable.