Since the cache has a fixed capacity, we'll use a least recently used (LRU) policy to expire older entries.
The cache uses a doubly linked list: new items are added to the head, and when an item expires it is removed from the tail. A hash table lets us quickly look up each linked-list node.
Query API Server implementation:
class QueryApi(object):

    def __init__(self, memory_cache, reverse_index_service):
        self.memory_cache = memory_cache
        self.reverse_index_service = reverse_index_service

    def parse_query(self, query):
        """Remove markup, break text into terms, deal with typos,
        normalize capitalization, convert to use boolean operations.
        """

    def process_query(self, query):
        query = self.parse_query(query)
        results = self.memory_cache.get(query)
        if results is None:
            results = self.reverse_index_service.process_search(query)
            self.memory_cache.set(query, results)
        return results
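To see this read path end to end, here is a runnable sketch. `InMemoryCache` and `FakeReverseIndexService` are stand-ins invented for this example (not part of the design), and `parse_query` is reduced to simple normalization:

```python
class InMemoryCache(object):
    """Illustrative stand-in for the cache: a plain dict with get/set."""
    def __init__(self):
        self._store = {}
    def get(self, query):
        return self._store.get(query)
    def set(self, query, results):
        self._store[query] = results

class FakeReverseIndexService(object):
    """Illustrative stand-in that counts how often the cache misses."""
    def __init__(self):
        self.calls = 0
    def process_search(self, query):
        self.calls += 1
        return ['result for %s' % query]

class QueryApi(object):
    def __init__(self, memory_cache, reverse_index_service):
        self.memory_cache = memory_cache
        self.reverse_index_service = reverse_index_service

    def parse_query(self, query):
        # Simplified: real parsing would also remove markup, fix typos, etc.
        return query.strip().lower()

    def process_query(self, query):
        query = self.parse_query(query)
        results = self.memory_cache.get(query)
        if results is None:  # cache miss: hit the reverse index, then cache
            results = self.reverse_index_service.process_search(query)
            self.memory_cache.set(query, results)
        return results

index = FakeReverseIndexService()
api = QueryApi(InMemoryCache(), index)
api.process_query('Dogs ')
api.process_query('dogs')  # normalizes to the same key, served from cache
print(index.calls)         # the reverse index was only queried once
```

Note how normalization in `parse_query` lets differently formatted queries share one cache entry.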
Node implementation:
class Node(object):

    def __init__(self, query, results):
        self.query = query
        self.results = results
LinkedList implementation:
class LinkedList(object):

    def __init__(self):
        self.head = None
        self.tail = None

    def move_to_front(self, node):
        ...

    def append_to_front(self, node):
        ...

    def remove_from_tail(self):
        ...
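The elided linked-list operations might look like the following sketch. It assumes each node also carries `prev`/`next` pointers, which the `Node` class above does not yet define, so the sketch adds them:

```python
class Node(object):
    def __init__(self, query, results):
        self.query = query
        self.results = results
        self.prev = None  # assumed pointer fields for the doubly linked list
        self.next = None

class LinkedList(object):
    def __init__(self):
        self.head = None
        self.tail = None

    def append_to_front(self, node):
        node.prev = None
        node.next = self.head
        if self.head is not None:
            self.head.prev = node
        self.head = node
        if self.tail is None:  # first node is both head and tail
            self.tail = node

    def move_to_front(self, node):
        if node is self.head:
            return
        # Unlink the node from its current position
        node.prev.next = node.next
        if node.next is not None:
            node.next.prev = node.prev
        else:  # node was the tail
            self.tail = node.prev
        self.append_to_front(node)

    def remove_from_tail(self):
        if self.tail is None:
            return
        self.tail = self.tail.prev
        if self.tail is not None:
            self.tail.next = None
        else:  # list is now empty
            self.head = None
```

Each operation is O(1), which is what makes the LRU cache's get and set constant time.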
Cache implementation:
class Cache(object):

    def __init__(self, MAX_SIZE):
        self.MAX_SIZE = MAX_SIZE
        self.size = 0
        self.lookup = {}  # key: query, value: linked list node
        self.linked_list = LinkedList()

    def get(self, query):
        """Get the stored query result from the cache.

        Accessing a node updates its position to the front of the LRU list.
        """
        node = self.lookup.get(query)
        if node is None:
            return None
        self.linked_list.move_to_front(node)
        return node.results

    def set(self, query, results):
        """Set the result for the given query key in the cache.

        When updating an entry, updates its position to the front of the LRU list.
        If the entry is new and the cache is at capacity, removes the oldest entry
        before the new entry is added.
        """
        node = self.lookup.get(query)
        if node is not None:
            # Key exists in cache, update the value
            node.results = results
            self.linked_list.move_to_front(node)
        else:
            # Key does not exist in cache
            if self.size == self.MAX_SIZE:
                # Evict the least recently used entry
                self.lookup.pop(self.linked_list.tail.query, None)
                self.linked_list.remove_from_tail()
            else:
                self.size += 1
            # Add the new key and value
            new_node = Node(query, results)
            self.linked_list.append_to_front(new_node)
            self.lookup[query] = new_node
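For comparison, the same LRU semantics can be sketched in a few lines with Python's `collections.OrderedDict`, which keeps keys in order and can move them to either end in O(1):

```python
from collections import OrderedDict

class LRUCache(object):
    """Compact LRU cache with the same get/set semantics as Cache above."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._entries = OrderedDict()  # most recently used keys at the front

    def get(self, query):
        if query not in self._entries:
            return None
        self._entries.move_to_end(query, last=False)  # mark as most recent
        return self._entries[query]

    def set(self, query, results):
        if query not in self._entries and len(self._entries) == self.max_size:
            self._entries.popitem(last=True)  # evict least recently used
        self._entries[query] = results
        self._entries.move_to_end(query, last=False)

cache = LRUCache(max_size=2)
cache.set('a', [1])
cache.set('b', [2])
cache.get('a')         # 'a' is now most recently used
cache.set('c', [3])    # evicts 'b', the least recently used entry
print(cache.get('b'))  # None
```

In an interview setting, the explicit linked-list version shows the mechanics; in production Python, a battle-tested structure like this is usually preferable.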
The cache should be updated whenever the underlying search results change. The most straightforward way to handle these cases is to simply set a maximum time that a cached entry can stay in the cache before it is updated, usually referred to as time to live (TTL).
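A TTL check can be sketched by storing a timestamp alongside each cached value and treating entries older than the TTL as misses. The class and names below are illustrative, not part of the design above:

```python
import time

class TTLCache(object):
    """Illustrative cache where entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._entries = {}  # query -> (results, stored_at)

    def set(self, query, results):
        self._entries[query] = (results, time.monotonic())

    def get(self, query):
        entry = self._entries.get(query)
        if entry is None:
            return None
        results, stored_at = entry
        if time.monotonic() - stored_at > self.ttl_seconds:
            del self._entries[query]  # expired: drop it and report a miss
            return None
        return results

cache = TTLCache(ttl_seconds=0.05)
cache.set('dogs', ['result'])
print(cache.get('dogs'))  # ['result'] while fresh
time.sleep(0.1)
print(cache.get('dogs'))  # None once the TTL has elapsed
```

A monotonic clock is used so that system clock adjustments cannot make entries expire early or live too long.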
To handle the heavy request load and the large amount of memory needed, we'll need to scale horizontally. We have three choices on how to store the data on our Memory Cache cluster; the one we'll use here is to hash each query to determine which machine holds its result:

    machine = hash(query)

We'll want to use consistent hashing, so that adding or removing a machine only remaps a small fraction of the queries.
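A minimal consistent-hash ring can be sketched with `hashlib` and `bisect`: each machine is placed on the ring at several points (virtual nodes), and a query maps to the first machine clockwise from its hash. The machine names below are placeholders:

```python
import bisect
import hashlib

class ConsistentHashRing(object):
    def __init__(self, machines, replicas=100):
        # Several virtual nodes per machine give a more even key spread
        self._ring = []  # sorted list of (hash_value, machine)
        for machine in machines:
            for i in range(replicas):
                self._ring.append((self._hash('%s:%d' % (machine, i)), machine))
        self._ring.sort()
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key):
        # A stable hash; Python's built-in hash() varies between runs
        return int(hashlib.md5(key.encode('utf-8')).hexdigest(), 16)

    def get_machine(self, query):
        # First virtual node clockwise from the query's hash, wrapping around
        index = bisect.bisect(self._keys, self._hash(query)) % len(self._keys)
        return self._ring[index][1]

ring = ConsistentHashRing(['cache-1', 'cache-2', 'cache-3'])
machine = ring.get_machine('best dog breeds')
# The same query always maps to the same machine
assert machine == ring.get_machine('best dog breeds')
```

With a simple `hash(query) % num_machines` scheme, changing the machine count remaps nearly every key; with the ring, only the keys in the affected arcs move.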