Best practices for connection factory configurations when using connection pooling?


I'm aware this is kind of a generic question, but is there any guidance or any best practices around connection factory configuration when using connection pooling?

We use DataNucleus (DN) RDBMS with HikariCP, and I was a little bamboozled by the fact that DN creates two connection pools.
So providing these properties to the PMF:
results in two Hikari pools of up to 20 connections each being created, which was not clear to me from the docs and seems rather excessive.

The docs mention:
"The secondary connection factory is used for schema generation, and for value generation operations (unless specified to use primary)."
Based on that description it seems like the secondary factory should be fine with a smaller pool than the primary factory. Is that a valid assumption?
I see that it is possible to provide custom DataSources as connection factories. This would allow us to provide tweaked HikariCP configurations.
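To make the custom-DataSource idea concrete, here is a minimal sketch of how the two pools could be sized differently. It only builds the two property sets with the JDK, so it compiles without HikariCP or DN on the classpath. The pool names, the JDBC URL, and the sizes 20 and 2 are illustrative assumptions, and the wiring described in the comments should be checked against the HikariCP and DN docs before relying on it.

```java
import java.util.Properties;

public class PoolSizingSketch {

    /** Builds HikariCP-style properties (bean property names as keys) for one pool. */
    public static Properties hikariProps(String poolName, String jdbcUrl, int maxPoolSize) {
        Properties p = new Properties();
        p.setProperty("poolName", poolName);
        p.setProperty("jdbcUrl", jdbcUrl);
        p.setProperty("maximumPoolSize", Integer.toString(maxPoolSize));
        return p;
    }

    public static void main(String[] args) {
        // Hypothetical JDBC URL, for illustration only.
        String url = "jdbc:postgresql://localhost:5432/appdb";

        // Primary pool serves regular persistence operations; secondary pool is kept small.
        Properties primary = hikariProps("dn-primary", url, 20);
        Properties secondary = hikariProps("dn-secondary", url, 2);

        // Assumed wiring, not executed here because it needs HikariCP and DN:
        //   DataSource ds1 = new HikariDataSource(new HikariConfig(primary));
        //   DataSource ds2 = new HikariDataSource(new HikariConfig(secondary));
        //   pmfProps.put("datanucleus.ConnectionFactory", ds1);
        //   pmfProps.put("datanucleus.ConnectionFactory2", ds2);
        System.out.println(primary.getProperty("poolName") + " max=" + primary.getProperty("maximumPoolSize"));
        System.out.println(secondary.getProperty("poolName") + " max=" + secondary.getProperty("maximumPoolSize"));
    }
}
```

Keeping the two configurations as separate property sets means the secondary pool can be tuned (or dropped) without touching the primary one.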

I've been looking for examples of how others do it, and only found Apache Hive, which allocates a fixed 2 connections to the secondary pool.
Justification is "Since DataNucleus uses locks for schema generation and value generation, 2 connections should be sufficient.", which makes sense to me.
But maybe I'm overlooking something?

Is there anyone else who is willing to share some insight into how to best size the secondary connection pool?
What are the factors that would help me in figuring out an optimal size?



It is a valid assumption (secondary using smaller pool).
If you do all of your schema generation up front, you won't need any secondary connections for those ops either.
If your chosen value generation strategies don't need a DB connection (e.g. "identity", "uuid", "uuid-hex", "auid", etc.), then you won't need any secondary connections for those ops either.
In those situations you could even get to not needing a pool for the secondary factory at all.
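The reasoning above can be written down as a small check. This is only a sketch of the rule as stated in this reply, not anything DN provides: the class and method names are made up, and the strategy set contains just the four names listed above (the list ends with "etc", so it is incomplete). Any strategy not in the set is conservatively treated as needing a connection.

```java
import java.util.Set;

public class SecondaryPoolCheck {

    // Strategies named in the reply as not needing a DB connection (incomplete list).
    static final Set<String> NO_CONNECTION_STRATEGIES =
            Set.of("identity", "uuid", "uuid-hex", "auid");

    /**
     * Returns true if the secondary connection factory is likely to be used,
     * i.e. schema generation happens at runtime, or at least one value
     * generation strategy in use is not known to be connection-free.
     */
    public static boolean needsSecondaryConnections(boolean schemaGeneratedUpFront,
                                                    Set<String> strategiesInUse) {
        if (!schemaGeneratedUpFront) {
            return true; // runtime schema generation uses the secondary factory
        }
        for (String strategy : strategiesInUse) {
            if (!NO_CONNECTION_STRATEGIES.contains(strategy)) {
                return true; // unknown strategies are assumed to need a connection
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(needsSecondaryConnections(true, Set.of("identity", "uuid")));
        System.out.println(needsSecondaryConnections(true, Set.of("sequence")));
    }
}
```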


Thanks Andy, I appreciate the response. Makes sense.

On the same note, given that we use a pool of adequate size for the primary connection factory, wouldn't it also make sense to enable datanucleus.connection.singleConnectionPerExecutionContext? If the connections were not pooled, using different connection factories for tx and non-tx contexts would sound logical. But with pooling already in place, it feels a bit wasteful. I'm asking in the context of an application that performs many non-tx read operations.
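For reference, this is roughly how I would pass that setting, as a sketch only: the property name is the one quoted above, while the class and method names are made up, and whether enabling it is advisable is exactly what I'm asking.

```java
import java.util.HashMap;
import java.util.Map;

public class SingleConnectionProps {

    /** Builds the extra PMF property for the single-connection setting. */
    public static Map<String, Object> pmfOverrides(boolean singleConnection) {
        Map<String, Object> props = new HashMap<>();
        props.put("datanucleus.connection.singleConnectionPerExecutionContext",
                  Boolean.toString(singleConnection));
        return props;
    }

    public static void main(String[] args) {
        // These entries would be merged into the properties handed to the PMF.
        System.out.println(pmfOverrides(true));
    }
}
```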