A Curated Digest of the Memcached Official Documentation

Copyright notice: this is an original post by the author; do not repost without permission. https://blog.csdn.net/bless2015/article/details/84580688

The name MemCache can be read as "Memory Cache": its job is caching. Why not call it a cache database? Because it is very lightweight and stores only plain key-value pairs. Compared with Redis, which offers many more features such as clustering, persistence, and a richer set of data structures, Memcached is much simpler: essentially an in-memory key-value store with LRU eviction.

Hardware

1. If the machine has 4 GB of RAM and your server app uses 2 GB, giving Memcached about 1.5 GB is reasonable.
2. Don't deploy Memcached alongside your database; the database should get the larger share of memory.
3. Deploy it on dedicated hardware where possible, e.g. a machine with 64 GB of RAM. That way you can grow cache capacity by adding memory rather than adding lots of servers (in other words, try not to run it on the same boxes as your web app).
Algorithm:
Clients use consistent hashing, a scheme that keeps the distribution of keys stable when servers are added or removed. With a naive hash (e.g. hash mod N), changing the number of servers remaps a large fraction of keys to different servers, causing a flood of cache misses. Consistent hashing maps keys onto the server list in a way that makes adding or removing a server move only a small fraction of keys. With a naive hash function, adding an eleventh server can cause over 40% of keys to suddenly point at different servers than before; with consistent hashing, adding that eleventh server moves fewer than 10% of the keys. In practice the exact numbers vary a little.
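The remapping numbers above are easy to verify with a toy ring. The sketch below is a minimal consistent-hash ring in pure Python (illustrative only; the server addresses, virtual-node count, and md5 hash are assumptions, not what any particular client library uses):

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring: each server owns many points on a
    ring, and a key is served by the first server point at or after the
    key's hash (wrapping around)."""

    def __init__(self, servers, points_per_server=100):
        self.ring = []  # sorted list of (hash, server) pairs
        for server in servers:
            for i in range(points_per_server):
                point = self._hash("%s#%d" % (server, i))
                bisect.insort(self.ring, (point, server))

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def server_for(self, key):
        i = bisect.bisect(self.ring, (self._hash(key), ""))
        return self.ring[i % len(self.ring)][1]  # wrap past the end

servers = ["10.0.0.%d:11211" % i for i in range(10)]
keys = ["user:%d" % i for i in range(10000)]

old = HashRing(servers)
new = HashRing(servers + ["10.0.0.10:11211"])  # add an eleventh server
moved = sum(old.server_for(k) != new.server_for(k) for k in keys)

# Naive "hash mod N" for comparison: almost every key moves.
naive_moved = sum(
    HashRing._hash(k) % 10 != HashRing._hash(k) % 11 for k in keys
)
print("consistent: %.0f%% moved, naive: %.0f%% moved"
      % (100.0 * moved / len(keys), 100.0 * naive_moved / len(keys)))
```

Running this shows roughly 1/11 of keys moving on the consistent ring versus around 90% with the modulo scheme, matching the figures quoted above.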

The memcached wiki on GitHub even includes a little story written to show off how powerful memcached is.
传送门–>https://github.com/memcached/memcached/wiki/TutorialCachingStory

Commands

Storage commands

set

The most common command. Stores the data, overwriting any existing value; the new item lands at the top of the LRU.

add

Stores the data only if no value exists for the key yet. The new item lands at the top of the LRU. If the item already exists and the add fails, the existing item is still promoted to the top of the LRU.

replace

Stores the data, but only if a value already exists for the key. Almost never used; it exists mainly for protocol completeness (alongside set, add, and so on).

append

Appends this data after the last byte of the existing item. It cannot grow an item past the item size limit. Useful for managing lists.

prepend

Same as append, but adds the new data before the existing data.
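The differences between these storage commands are easiest to see side by side. Below is a toy in-memory model of their semantics (a sketch only: it ignores expiry, flags, and the LRU, and is not the real client):

```python
class FakeCache:
    """Toy model of memcached storage-command semantics."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):      # always stores
        self.data[key] = value
        return True

    def add(self, key, value):      # only if the key does NOT exist
        if key in self.data:
            return False
        self.data[key] = value
        return True

    def replace(self, key, value):  # only if the key DOES exist
        if key not in self.data:
            return False
        self.data[key] = value
        return True

    def append(self, key, value):   # tack data onto the end
        if key not in self.data:
            return False
        self.data[key] += value
        return True

    def prepend(self, key, value):  # tack data onto the front
        if key not in self.data:
            return False
        self.data[key] = value + self.data[key]
        return True

c = FakeCache()
assert c.add("k", "b")       # succeeds: key absent
assert not c.add("k", "x")   # fails: key already present
assert c.append("k", "c")
assert c.prepend("k", "a")
assert c.data["k"] == "abc"
assert not c.replace("miss", "v")  # replace needs an existing key
```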

cas

Check And Set (or Compare And Swap). Stores the data, but only if no one else has updated it since you last read it. Useful for resolving race conditions when updating cached data.

Retrieval commands

get

The command for retrieving data. Takes one or more keys and returns all the items it finds.

gets

A variant of get meant to be used with cas. It returns the item's CAS identifier (a unique 64-bit number), which you pass back in a cas command. If the item's CAS value has changed since you fetched it, the cas store is rejected.
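The gets/cas handshake can be modeled with a dictionary that tracks a version token per key. This is a simulation of the semantics, not the wire protocol or a real client:

```python
import itertools

class CasCache:
    """Toy model of gets/cas: each stored value carries a unique CAS
    token; cas() only writes if the caller's token is still current."""

    def __init__(self):
        self._tokens = itertools.count(1)
        self.data = {}  # key -> (value, cas_token)

    def set(self, key, value):
        self.data[key] = (value, next(self._tokens))

    def gets(self, key):
        return self.data.get(key)  # (value, cas_token) or None

    def cas(self, key, value, token):
        current = self.data.get(key)
        if current is None or current[1] != token:
            return False  # someone updated the key since our gets
        self.data[key] = (value, next(self._tokens))
        return True

c = CasCache()
c.set("counter", 10)
value, token = c.gets("counter")
c.set("counter", 99)                           # another client races in
assert not c.cas("counter", value + 1, token)  # our stale write is rejected
value, token = c.gets("counter")               # re-read to get a fresh token
assert c.cas("counter", value + 1, token)      # retry succeeds
assert c.gets("counter")[0] == 100
```

The read-modify-retry loop at the bottom is the standard way clients resolve the race the text describes.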

Deletion commands

delete

Removes the item from the cache, if it exists.

Increment/Decrement

incr/decr

Increment and decrement. They operate on unsigned integer values only, and incr/decr fail if the key does not already exist.
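A small model of those rules (a sketch based on the ASCII protocol's description: incr wraps at 64 bits, decr clamps at zero, and both fail on a missing key; not the real client):

```python
class Counter:
    """Toy model of memcached incr/decr semantics."""
    MAX = 2 ** 64 - 1

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = int(value)

    def incr(self, key, delta=1):
        if key not in self.data:
            return None  # incr on a missing key fails
        self.data[key] = (self.data[key] + delta) % (self.MAX + 1)  # wraps
        return self.data[key]

    def decr(self, key, delta=1):
        if key not in self.data:
            return None  # decr on a missing key fails
        self.data[key] = max(0, self.data[key] - delta)  # clamps at zero
        return self.data[key]

c = Counter()
assert c.incr("hits") is None  # must set an initial value first
c.set("hits", 0)
assert c.incr("hits") == 1
assert c.decr("hits", 5) == 0  # decr never goes below zero
```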

Touch

touch

Updates the expiration time of an existing item.

touch <key> <exptime> [noreply]

Get And Touch

gat/gats

Updates the expiration time of an existing item and fetches it in the same command.

gat <exptime> <key>*\r\n
gats <exptime> <key>*\r\n

Slabs Reassign

slabs reassign

Reassigns memory between slab classes at runtime.

slabs reassign <source class> <dest class>\r\n

slabs automove

Lets a background thread decide whether to move slab pages (i.e. reassign memory) automatically.

slabs automove <0|1>

Miscellaneous

An example from the official docs, which they say works very well.

```
# Don't load little bobby tables
sql = "SELECT * FROM user WHERE user_id = ?"
key = 'SQL:' . user_id . ':' . md5sum(sql)
# We check if the value is 'defined', since '0' or 'FALSE'
# can be legitimate values!
if (defined result = memcli:get(key)) {
	return result
} else {
	handler = run_sql(sql, user_id)
	# Often what you get back when executing SQL is a special handler
	# object. You can't directly cache this. Stick to strings, arrays,
	# and hashes/dictionaries/tables
	rows_array = handler:turn_into_an_array
	# Cache it for five minutes
	memcli:set(key, rows_array, 5 * 60)
	return rows_array
}
```
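The same pattern translated into Python, with a plain dict standing in for the memcached client (the `run_sql` helper, its return shape, and the five-minute TTL comment are illustrative assumptions):

```python
import hashlib

cache = {}      # stand-in for the memcached client
db_calls = []   # records every trip to the "database"

def run_sql(sql, user_id):
    # Hypothetical database call; returns plain rows, safe to cache.
    db_calls.append(user_id)
    return [{"user_id": user_id, "name": "alice"}]

def get_user(user_id):
    sql = "SELECT * FROM user WHERE user_id = ?"
    # Hashing the query text keeps different queries from colliding.
    key = "SQL:%s:%s" % (user_id, hashlib.md5(sql.encode()).hexdigest())
    result = cache.get(key)
    # Test "is not None", not truthiness: 0, False, or [] can be
    # legitimate cached values.
    if result is not None:
        return result
    rows = run_sql(sql, user_id)
    cache[key] = rows  # real client would be: memcli.set(key, rows, 5 * 60)
    return rows

get_user(42)
get_user(42)
print(len(db_calls))  # the second call was served from the cache
```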

Suggested key construction:

```
key = 'SQL' . query_id . ':' . md5sum("SELECT blah blah blah")
```

Key Usage
Thinking about your keys can save you a lot of time and memory. Memcached is a hash, but it also remembers the full key internally. The longer your keys are, the more bytes memcached has to hash to look up your value, and the more memory it wastes storing a full copy of your key.
On the other hand, it should be easy to figure out exactly where in your code a key came from. Otherwise many laborious hours of debugging await you.
Avoid User Input
It's very easy to compromise memcached if you use arbitrary user input for keys. The ASCII protocol uses spaces and newlines as delimiters; ensure that neither shows up in your keys, and live long and prosper. The binary protocol does not have this issue.
Short Keys
64-bit UIDs are a clever way to identify a user, but they suck when printed out: 18446744073709551616. 20 characters! Using base64 encoding, or even just hexadecimal, you can cut that down by quite a bit.
With the binary protocol, it's possible to store anything, so you can directly pack the bytes into the key. This makes it impossible to read back via the ASCII protocol, though, so you should have tools available to quickly determine what a key is.
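A quick illustration of how much shorter the same worst-case UID gets in each encoding (the `user:` prefix is just an example):

```python
import base64
import struct

uid = 2 ** 64 - 1                      # worst-case 64-bit UID (20 digits)
decimal_key = "user:%d" % uid          # 20 decimal digits after the prefix
hex_key = "user:%x" % uid              # 16 hex characters
# Pack the UID into 8 raw bytes, then base64 it so the key stays
# printable and safe for the ASCII protocol too.
raw = struct.pack(">Q", uid)           # 8 bytes
b64_key = "user:" + base64.b64encode(raw).decode()  # 12 characters

print(len(decimal_key), len(hex_key), len(b64_key))  # prints: 25 21 17
```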
Informative Keys
key = 'SQL' . md5sum("SELECT blah blah blah")
… might be clever, but if you're looking at this key via tcpdump, strace, etc., you won't have any clue where it's coming from.
In this particular example, you may put your SQL queries into an outside file with the md5sum next to them. Or, more simply, append a unique query ID into the key:
key = 'SQL' . query_id . ':' . md5sum("SELECT blah blah blah")

Caching sessions in memcached is not recommended. Memcached's main job is reducing database I/O; if a node goes down, its cached data disappears and users will notice. Sessions are better kept in a database or in Redis.

All individual memcached operations are internally atomic.

Performance

Expected throughput

On a fast machine with very high speed networking, memcached can easily handle 200,000+ requests per second. With heavy tuning or even faster hardware it can go many times that. Hitting it a few hundred times per second, even on a slow machine, usually isn’t cause for concern.

Latency is low: after accounting for network jitter and OS/CPU scheduling delays, very few responses take more than 1-2 milliseconds.

On a good day memcached can serve requests in less than a millisecond. After accounting for outliers due to OS jitter, CPU scheduling, or network jitter, very few commands should take more than a millisecond or two to complete.

In theory, there is no built-in limit on the number of connected clients.

Since memcached uses an event based architecture, a high number of clients will not generally slow it down. Users have hundreds of thousands of connected clients, working just fine.

Hardware does impose a limit, though.

Each connected client uses some TCP memory. You can only connect as many clients as you have spare RAM

If you have many connections, consider persistent TCP connections or UDP.
You can also tune TCP, which involves OS and NIC parameters.

High connection churn requires OS tuning. You will run out of local ports, TIME_WAIT buckets, and similar. Do research on how to properly tune the TCP stack for your OS.

Maximum cluster size
From the client's point of view, the server hash table is computed once at client startup, not recomputed on every request.
If persistent TCP connections are disabled, large numbers of clients must open a TCP connection to every server on each request, and with too many servers this inevitably wastes RAM and adds overhead (the cost of repeated three-way handshakes).

Timeouts

First check listen_disabled_num: when the connection count reaches the maximum (maxconns), new connections are delayed and this counter grows.
Check whether the OS is swapping cached data out to disk.
Check CPU utilization.
Strongly prefer 64-bit machines; a 32-bit process can only address 4 GB of memory.
Use the tool below to check each instance:

http://www.memcached.org/files/mc_conn_tester.pl

Try not to put a firewall in front of memcached.
