[OFBiz] Users - Having a problem with findByConditionCache()
David Vediner
vedinerd at bctonline.com
Tue Sep 14 13:29:42 EDT 2004
Hi Al -
Thank you for your help with this. I forgot to mention that my
conclusions in the last email (the problem being caused by the fact that
in get(), getConditionKey() is called and in set(),
getFrozenConditionKey() is called) were supported by the fact that if I
simply change get() to also call getFrozenConditionKey(), caching in
findByConditionCache() seems to work fine. This is, of course, just a
hack, because I'm not 100% certain what the purpose of calling
getFrozenConditionKey() is, and therefore this hack may actually be
introducing extra issues I'm just not seeing yet.
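
A minimal sketch of the kind of mismatch described above. This is not the
OFBiz code: conditionKey() and frozenConditionKey() are hypothetical
stand-ins for getConditionKey() and getFrozenConditionKey(), and the sketch
simply assumes the two produce keys that are not equal to each other.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names are OFBiz APIs.
class KeyMismatchSketch {
    private final Map<String, String> cache = new HashMap<>();

    // Stand-in for getConditionKey(): uses the condition text as-is.
    static String conditionKey(String condition) {
        return condition;
    }

    // Stand-in for getFrozenConditionKey(): assumed to yield a different key.
    static String frozenConditionKey(String condition) {
        return "frozen[" + condition + "]";
    }

    // set() path: stores under the frozen key.
    void put(String condition, String value) {
        cache.put(frozenConditionKey(condition), value);
    }

    // get() path as it behaves today: looks up under the plain key.
    String getMismatched(String condition) {
        return cache.get(conditionKey(condition));
    }

    // get() path after the hack: looks up under the same key put() used.
    String getConsistent(String condition) {
        return cache.get(frozenConditionKey(condition));
    }
}
```

With this sketch, a value stored through put() is never found by
getMismatched(), but is found by getConsistent(), which matches the
behaviour I am seeing.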
Reading through Adam's email, it does seem like the issue perhaps has to
do with how different orderings of the same entity list are stored in
the cache - where the orderBy is the key to a hash that is the actual
cache value. That seems to work fine, but it seems that passing the
condition through getFrozenConditionKey() gets rid of this extra hash
layer, and therefore the key lookup fails. If I'm right, perhaps
getFrozenConditionKey just needs to be rewritten to be a little smarter
about what it does, which I'd be happy to do when I have a little more
time to actually sit down and figure out exactly what it's supposed to
do and the full consequences of its actions (I, too, am currently
slammed with a deadline).
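
As a rough sketch of the layering described in Adam's email (condition key
-> map keyed by orderBy -> list), under the assumption that keys are plain
strings and that only "asc" and "desc" orderings exist. The class and
method names are hypothetical, not the actual OFBiz classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a two-level cache; not OFBiz code.
class OrderedListCacheSketch {
    // condition key -> (orderBy key -> list already sorted in that order)
    private final Map<String, Map<String, List<String>>> cache = new HashMap<>();

    void put(String condition, String orderBy, List<String> values) {
        cache.computeIfAbsent(condition, k -> new LinkedHashMap<>())
             .put(orderBy, new ArrayList<>(values));
    }

    List<String> get(String condition, String orderBy) {
        Map<String, List<String>> byOrder = cache.get(condition);
        if (byOrder == null || byOrder.isEmpty()) {
            return null; // nothing cached for this condition
        }
        List<String> hit = byOrder.get(orderBy);
        if (hit != null) {
            return hit; // already cached in the requested order
        }
        // Take the first cached ordering, re-sort a copy, and remember it.
        List<String> resorted = new ArrayList<>(byOrder.values().iterator().next());
        resorted.sort(comparatorFor(orderBy));
        byOrder.put(orderBy, resorted);
        return resorted;
    }

    // Only "asc" and "desc" are understood in this sketch.
    private static Comparator<String> comparatorFor(String orderBy) {
        return "desc".equals(orderBy)
                ? Comparator.<String>reverseOrder()
                : Comparator.<String>naturalOrder();
    }
}
```

If the key used on lookup no longer reaches the inner orderBy map, every
get() in a structure like this falls through to the "nothing cached" branch.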
This should be easy to test and reproduce, too, because it should
mean that caching in calls to findByConditionCache() from places like
CatalogWorker.getProdCatalogCategories() in the current code from SVN
should be failing to actually cache items and then retrieve them from cache.
Thanks!
David Vediner
Al Byers wrote:
> David,
>
> I happen to be working with Caching right now and I know that David
> and Andy are slammed on a contract, so I thought I would attempt to
> help. I have been looking at the code through the lens of your
> comments. Are you aware of the work that Adam Heath did on caching? I
> am pasting in a long email from him from back on June 2 entitled,
> "EntityCache implementation(long)". I have not read through it all,
> but it seems to be dealing with code that you are. I will continue to
> study this and see if I can catch David sometime to verify whether your
> suggestions are correct. Let's see if we can determine exactly what
> code needs to change.
>
> -Al
>
> <<<<<<<<<<<<< from Adam >>>>>>>>>>>>>>>>>>>>>
>
> So, I've talked and talked about my advanced caching infrastructure, but
> I've never shown the code. I'm not yet at a point to show the code
> (*very* soon, tho). What I am going to do now is discuss the existing
> implementation of caching, its pitfalls, and my new implementation (and
> its current issues).
>
> Old Way
> =======
>
> Currently, the entity engine's cache infrastructure is rather simplistic.
> It only has findByPrimaryKeyCache and findByAndCache. Caching on primary
> key is very simple. A direct link between a pk object and the actual
> entity is very easy to maintain, even with a UtilCache and its
> soft references.
>
> And caching, however, is a tad more complex.
>
> When a list of values is cached for findByAnd, first, all the values are
> set immutable. Then, the set of fields used by the findByAnd invocation
> is converted to a GenericEntity object, which is used as the key to the
> andCache UtilCache instance. Then, the mapping of entity name to sets of
> keys (one for each findByAndCache invocation) is maintained. This map is
> *not* soft-referenced.
>
> When fetching an item from the primaryKey cache, the pk object is used
> directly as the lookup key. This is very straightforward.
>
> Fetching from the andCache is also rather simple. The fields being used
> as a match are converted to a GenericEntity object, and used as the key
> for lookups.
>
> Storing (creating or updating) is a bit complex.
>
> Whenever a store is attempted, the following occurs:
>
> 1) If the value is a primary key, then that item is removed from the
>    primaryKey cache.
> 2) If the value is not a primary key, then:
>    a) Fetch the primary key, and clear the primaryKey cache with it.
>    b) Using the fieldSet from above, get the set of keySets for this
>       entity, and for each set:
>       A) Create a new map that contains all values from the item being
>          stored, for all the field names mentioned in the current set.
>       B) Also do this for the original db values, if they exist.
>       C) Convert both the new and old values to primary keys, and
>          remove from the andCache.
>
> I have glossed over the findByAllCache, as it is rather simple as well.
> Finds are simple: each entity has a single entry in the allCache
> UtilCache instance, keyed by entity name. Clearing is also simple: any
> store for the entity clears that andCache instance for that entity.
>
> This current system has several issues:
> 1) It uses different cache algorithms for each type of caching system.
> 2) Not all findBy methods can be cached. Most notably, findByCondition,
>    the most powerful of all the findBy methods, is not cached (findByOr
>    and findByLike are the others that are not cached).
> 3) Caches on views are not cleared correctly.
> 4) If a hit occurs on a findByAndCache or findAllCache, and an orderBy
>    list is specified, then the delegator will *always* sort in memory.
>
> New Way
> =======
>
> For the new system, the first thing that was done was to attempt to
> convert all find methods to conditions. Other than primaryKey and all,
> this was straightforward. This solves issues 1 and 2 above. As a benefit
> of having everything go thru the same findByCondition code path, all
> findBy(And|Or|Like|Condition)Cache calls that happen to convert to the
> same condition can now cache together.
>
> Each model entity gets its own UtilCache instance. I do this by creating
> the UtilCache name based on the delegator name and the entity name. I
> also have separate instances for list returns and primary keys.
>
> The key in the UtilCache instance, for primary key, is the primary key
> itself. This is very close to the original system.
>
> For conditions, the condition itself is the key. I have modified the
> condition objects to implement hashCode and equals. Additionally, the
> toString method calls makeWhereString, so that the condition looks like
> sql when displayed in the util cache webtools pages.
>
> However, the value indexed by the condition key is *not* the list itself.
> It's a map, keyed by orderBy. When a cache method is called, the passed
> orderBy is looked up in this map. If found, the value is returned
> directly. If not, the first value in the map is fetched, the list is
> reordered, then stored in this sub-level map. This solves issue 4.
>
> Solving 3 was the most complex. After all model entities are read, all
> view entities register themselves with their constituent real entities.
> Then, during cache clearing on a real entity, the real entity is
> converted into a partial view entity, which is then used to further clear
> caches.
>
> The previous paragraph requires more detailed explanation, however. My
> first implementation of this real->view entity conversion was not
> full-featured. It only converted <alias>, and did not consider
> <view-link>. It also iterated over the mappings each time a cache
> clearing operation took place; the current version builds an object graph
> at model entity load time, to speed up the runtime.
>
> For <alias> tags that exist, I can figure out exactly how to change
> fields of a real entity into a view entity. I also cross-reference on
> <view-link> with <alias>, so fields in a real entity that are not
> mentioned in an <alias>, but are mentioned in <view-link>, can be
> handled.
>
> However, fields mentioned in a <view-link> that do not have any
> corresponding <alias>, or if a <constraint> tag is used, are not handled.
> Solving the former can be rather complex, as it would entail modifying
> the query sent to the database to return the 'hidden' fields, then
> storing them into the entity, into a hidden map. The latter is a problem
> I just realized while writing this email.
>
> Cache clearing is rather simple, and compares somewhat to the former. For
> each store, I attempt to clear the primaryKey cache. Then, I walk all
> stored conditions, seeing if the old and new values match the condition
> (by calling condition.mapMatches(delegator, map)). I also take this time
> to convert the incoming entity into partial views, and do the same.
>
> The initial version of the last paragraph had issues. Converting a real
> entity to a view meant that the view didn't have all its normal fields.
> On complex views, this meant that no condition ever matched. So, the
> current version added a feature to the condition logic, to support
> WILDCARD values in entities (and maps). As I convert a real entity to a
> view, any fields in the view that are not in the real entity are set to
> this special WILDCARD value. Then, the condition code, upon seeing this
> WILDCARD, returns true.
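
The WILDCARD idea in the paragraph above might be sketched as follows. This
is only an illustration under stated assumptions: entities and conditions
are reduced to plain maps, conditions are equality-only, and matches() and
toPartialView() are hypothetical helpers, not the real
condition.mapMatches(delegator, map) signature.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch of wildcard-aware matching; not OFBiz code.
class WildcardMatchSketch {
    // Sentinel meaning "value unknown here, treat as matching anything".
    static final Object WILDCARD = new Object();

    // True if every field the condition requires is equal in the candidate,
    // or the candidate holds WILDCARD for that field.
    static boolean matches(Map<String, Object> required, Map<String, Object> candidate) {
        for (Map.Entry<String, Object> e : required.entrySet()) {
            Object actual = candidate.get(e.getKey());
            if (actual == WILDCARD) {
                continue; // unknown field: assume it could match
            }
            if (!Objects.equals(e.getValue(), actual)) {
                return false;
            }
        }
        return true;
    }

    // Build a partial "view" map: copy the fields the real entity has and
    // fill every other view field with WILDCARD.
    static Map<String, Object> toPartialView(Map<String, Object> real, Iterable<String> viewFields) {
        Map<String, Object> view = new HashMap<>();
        for (String field : viewFields) {
            view.put(field, real.containsKey(field) ? real.get(field) : WILDCARD);
        }
        return view;
    }
}
```

The trade-off in a scheme like this is that a wildcard can only err towards
clearing too much, never too little.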
>
> All this cache logic is separated out into helper classes: a single Cache
> class, which acts as a dispatcher to the EntityCache, EntityListCache
> (extends AbstractEntityConditionCache), and EntityObjectCache (extends
> AbstractEntityConditionCache). The latter is something I haven't used in
> the real world yet. It allows you to cache complex object graphs based on
> entity conditions (condition -> map[name -> object]).
>
> Issues:
> 1) As mentioned on this list previously, someone was working on an ldap
>    helper. This implementation does not work with conditions. I can fix
>    this, by only converting findBy calls to conditions for the purpose
>    of saving in the cache.
> 2) Hidden fields in view links are not considered during caching. I'd
>    like to fix this, but would prefer to refactor all the sql generating
>    code first (refactoring could also mean that conditions could be used
>    for ldap lookups, as my plans involve removing the sql building code
>    from conditions).
> 3) <view-link> and <relation> entries in the model can specify
>    conditions, to further limit the returned values.
> 4) Questions about performance have been discussed here at work. With
>    large numbers of conditions, looping over all of them for a store may
>    be detrimental. I was thinking about solving this by asking each
>    condition (when storing it in the cache) for the list of fields/values
>    it is checking against, then maintaining a map that allows for easy
>    finding of conditions based on changed values from the incoming
>    entity.
>
> Also, as a way to work around this issue, I've modified UtilCache to
> support 'wildcard' entries in its properties file, so I can set the
> various cache parameters on names that are generated at runtime.
>
> I've also thought about having some sort of config that allows me to set
> the ttl and soft-reference settings for individual conditions (using the
> mapMatches code again).
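
The field/value lookup map proposed in issue 4 could look roughly like the
sketch below. It is an assumption-laden illustration: it only handles
equality tests, uses string keys of the form "field=value", and every name
in it is hypothetical rather than part of the entity engine.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a reverse index from field values to cached
// conditions; not OFBiz code.
class ConditionIndexSketch {
    // "field=value" -> keys of cached conditions that test that field/value
    private final Map<String, Set<String>> index = new HashMap<>();

    // Called when a condition is stored in the cache.
    void register(String conditionKey, Map<String, Object> fieldsChecked) {
        for (Map.Entry<String, Object> e : fieldsChecked.entrySet()) {
            index.computeIfAbsent(e.getKey() + "=" + e.getValue(), k -> new HashSet<>())
                 .add(conditionKey);
        }
    }

    // Called on store: returns only the conditions that mention one of the
    // stored field values, instead of looping over every cached condition.
    Set<String> candidatesFor(Map<String, Object> storedValues) {
        Set<String> result = new HashSet<>();
        for (Map.Entry<String, Object> e : storedValues.entrySet()) {
            result.addAll(index.getOrDefault(e.getKey() + "=" + e.getValue(),
                                             Collections.emptySet()));
        }
        return result;
    }
}
```

Conditions that are not simple equality tests (ranges, LIKE, OR) would still
need a fallback scan in a design like this.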
>
> =======
>
> My plans going forward are to fix the issues I have outlined above. Since
> the project that drove all the above features is now complete (waiting on
> the boss to return from Brazil next week for deployment), I will have
> more time to fix this. I can't give full time to it, as I of course have
> other things to do.
>
> I'd also like input on whether writing all this complex code is worth it.
> My boss (Ean Schuessler) has suggested that all this complex code should
> really be done in the database. However, most jdbc drivers I have seen
> are rather slow, mostly due to converting between internal java
> representations and their own wire formats.
>