How to efficiently invalidate cache?

What is the best practice for using a cache layer (e.g. memcached) in front of an eventually consistent datastore (e.g. SimpleDB)?

  • If you are using a read-through cache layer (memcached) in front of an eventually consistent datastore like SimpleDB, what is the best practice for avoiding the following situation? User A updates SimpleDB (a write). After the successful write, you invalidate (remove) the cached representation of the object from memcached, if it exists. User B then does a read, so the cache is checked first. The requested object is not in the cache, so it must be retrieved from SimpleDB and placed into the cache before it is returned to the client (making it available in the cache for other users as well). Because SimpleDB is eventually consistent, however, user B may be returned the older version of the attributes. Since the object was removed from the cache by the write and now has to be reloaded, the old values can end up in the cache. If, to get around this, you put the new object into the cache immediately after the write instead of invalidating it, you would be caching an object that may not be needed there (no one may request it for a while, if ever), which takes up cache space. What is the best practice for using a cache in front of an eventually consistent datastore to avoid this scenario?

  • Answer:

    In this case, if you want a guarantee that the true data is in memcached, you have to write it there at the point where you generate that data, that is, where the object is written to SimpleDB. That isn't really avoidable.

    The trick to making this work well in practice is to monitor your eviction rate and keep an eye on what is being evicted. memcached is an LRU cache. If you are evicting data that was last accessed 24 hours ago, you probably don't care. If, instead, your site is thrashing because you're evicting data you wrote just ten minutes ago, you need to expand your cache.

    While I typically prefer read-through caching, this is a case where write-through caching is probably the right choice. You will have to monitor your caching layer closely to ensure it is sized correctly, but you shouldn't pay too much of a penalty for doing it this way.
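The write-through approach and the eviction monitoring described above can be sketched roughly as follows. This is an illustration only and not part of the original answer: `LRUCache` and `StaleStore` are hypothetical in-memory stand-ins for memcached and SimpleDB, and `write_through` and `read_through` are names made up for this sketch.

```python
import time
from collections import OrderedDict


class LRUCache:
    """Hypothetical in-memory stand-in for memcached (LRU eviction)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # key -> (value, last access time)
        self.evictions = []         # (key, seconds since last access)

    def get(self, key):
        if key not in self.items:
            return None
        value, _ = self.items.pop(key)
        self.items[key] = (value, time.monotonic())  # mark as most recent
        return value

    def set(self, key, value):
        self.items.pop(key, None)
        self.items[key] = (value, time.monotonic())
        while len(self.items) > self.capacity:
            old_key, (_, last_access) = self.items.popitem(last=False)
            # Record how "cold" the evicted entry was, for monitoring.
            self.evictions.append((old_key, time.monotonic() - last_access))


class StaleStore:
    """Hypothetical eventually consistent store: reads lag until settle()."""

    def __init__(self):
        self.visible = {}
        self.pending = {}

    def put(self, key, value):
        self.pending[key] = value

    def get(self, key):
        return self.visible.get(key)  # may be an older version

    def settle(self):
        self.visible.update(self.pending)
        self.pending.clear()


def write_through(store, cache, key, value):
    """Write to the datastore, then put the same value in the cache."""
    store.put(key, value)
    cache.set(key, value)


def read_through(store, cache, key):
    """Serve from the cache; fall back to the datastore on a miss."""
    value = cache.get(key)
    if value is None:
        value = store.get(key)
        if value is not None:
            cache.set(key, value)
    return value
```

With this sketch, a read that follows a `write_through` call is served the new value from the cache even while `StaleStore.get` still returns the old one. Inspecting `cache.evictions` gives a rough picture of whether evicted entries were cold (last accessed long ago) or recently written, which is the signal the answer suggests watching when sizing the cache.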

Mark Smith at Quora

Other answers

BTW, SimpleDB does support consistent reads (http://aws.amazon.com/simpledb/#consistent), at the cost of additional latency.
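One way to use this, sketched below under stated assumptions, is to keep invalidate-on-write but request a consistent read only when the cache misses, so the extra latency is paid only on the reload path. `EventuallyConsistentStore` is a hypothetical in-memory stand-in, a plain dict stands in for memcached, and the `consistent` flag is an assumed name standing in for SimpleDB's consistent-read option.

```python
class EventuallyConsistentStore:
    """Hypothetical stand-in: eventual reads lag until settle() is called,
    while consistent reads always see the latest write."""

    def __init__(self):
        self.latest = {}
        self.replica = {}

    def put(self, key, value):
        self.latest[key] = value

    def settle(self):
        self.replica = dict(self.latest)

    def get(self, key, consistent=False):
        source = self.latest if consistent else self.replica
        return source.get(key)


def write_and_invalidate(store, cache, key, value):
    """Write to the datastore and drop any cached copy."""
    store.put(key, value)
    cache.pop(key, None)


def read_through(store, cache, key):
    """Serve from the cache; on a miss, reload with a consistent read."""
    if key in cache:
        return cache[key]
    value = store.get(key, consistent=True)  # extra latency only on a miss
    if value is not None:
        cache[key] = value
    return value
```

This avoids caching objects nobody asks for, since the cache is only filled on demand. It does not by itself handle races between a concurrent reader and writer, so it is a starting point rather than a complete solution.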

Ranjit Mavinkurve
