[Distributed Caching with Memcached] Excerpts
2008-05-14 21:34:15 / Category: LAMP
Distributed Caching with Memcached
http://www.linuxjournal.com/article/7451

1, Memcached is used on LiveJournal, Slashdot, Wikipedia and other high-traffic sites.

2, [...] servers. Approximately 70 machines currently run LiveJournal.com, a blogging and social networking system with 2.5 million accounts. In addition to the typical blogging and friend/interest/profile declaration features, LiveJournal also sports forums, polls, a per-user news aggregator, audio posts by phone and other features useful for bringing people together.

3, On the contrary, one of the core factors of a computer's performance is the speed, size and depth of its memory hierarchy. Caching definitely is necessary, but only if you do it on the right medium and at the right granularity.

4, I find it best to cache each object on a page separately, rather than caching the entire page as a whole. That way you don't end up wasting space by redundantly caching objects and template elements that appear on more than one page.
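A minimal sketch of the per-object approach, assuming a plain dict as a stand-in for a Memcached client. The helper names (`load_from_db`, `cached`, `render_page`) and the `kind:id` key format are hypothetical and chosen only for illustration; they are not LiveJournal's code.

```python
# A dict stands in for the cache; a real deployment would use a Memcached client.
cache = {}
db_reads = []  # records every simulated database read


def load_from_db(kind, ident):
    """Hypothetical database read; logs the access so cache reuse is visible."""
    db_reads.append((kind, ident))
    return f"<{kind} {ident}>"


def cached(kind, ident):
    """Fetch one object through the cache, keyed per object rather than per page."""
    key = f"{kind}:{ident}"
    if key not in cache:
        cache[key] = load_from_db(kind, ident)
    return cache[key]


def render_page(user_id, post_ids):
    """Assemble a page from individually cached objects."""
    parts = [cached("user", user_id)]
    parts += [cached("post", pid) for pid in post_ids]
    return "".join(parts)


page_a = render_page(1, [10, 11])
page_b = render_page(1, [11, 12])  # reuses user 1 and post 11 from the cache
```

Because the two pages share a user and a post, the second render reads only the one object that is not yet cached, and the cache holds a single copy of each shared object.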

5, Because processors keep getting faster, I find it preferable to burn CPU cycles rather than wait for disks. Modern disks keep growing larger and cheaper, but they aren't getting much faster. Considering how slow and crash-prone they are, I try to avoid disks as much as possible. LiveJournal's Web nodes are all diskless, netbooting off a common yet redundant NFS root image. Not only is this cheaper, but it requires significantly less maintenance.

6, [...] setups. We actually have ten different database clusters, each with two or more machines. Nine of the clusters are user clusters, containing data specific to the users partitioned among them. One is our global cluster with non-user data and the table that maps users to their user clusters. The rationale for independent clusters is to spread writes. The alternative is having one big cluster with hundreds of slaves. The difficulty with such a monolithic cluster is that it only spreads reads. The problem of diminishing returns appears as each new slave is added and is increasingly consumed by the writes necessary to stay up to date.
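The routing described above might look roughly like the following sketch. The table contents, the cluster names and both function names are hypothetical; the excerpt only states that a global table maps users to one of nine user clusters.

```python
# Hypothetical stand-in for the global cluster's user -> cluster table.
GLOBAL_USER_MAP = {"alice": 3, "bob": 7}

# Hypothetical names for the nine user clusters.
USER_CLUSTERS = {n: f"usercluster{n}" for n in range(1, 10)}


def cluster_for_user(username):
    """Resolve which user cluster holds a user's data via the global table."""
    cluster_id = GLOBAL_USER_MAP[username]
    return USER_CLUSTERS[cluster_id]


def route_write(username, statement):
    """Return (cluster, statement): a write touches only the owning cluster."""
    return cluster_for_user(username), statement
```

A write for one user lands on a single cluster, so the other eight clusters never have to replay it. That is the sense in which independent clusters spread writes.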

7, At this point you can see LiveJournal's back-end philosophy:

Avoid disks: they're a pain. When necessary, use only fast, redundant I/O systems.

Scale out, not up: many little machines, not big machines.

8, The basic idea is that you run Memcached instances all over your network, wherever you have free memory, and your application uses them all. It's even useful to run multiple instances on the same machine, if that machine is 32-bit and has more total memory than the kernel makes available to a single process.
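One way an application can use all instances at once is to hash each key to pick an instance. The sketch below assumes CRC32 modulo the server count, which is an illustrative choice rather than something the excerpt specifies; the addresses and the `server_for_key` name are hypothetical.

```python
import zlib

# Hypothetical instance addresses; two instances share one host, as the text allows.
SERVERS = ["10.0.0.1:11211", "10.0.0.1:11212", "10.0.0.2:11211"]


def server_for_key(key, servers=SERVERS):
    """Map a key to one instance by hashing it (CRC32 modulo is an assumption)."""
    h = zlib.crc32(key.encode("utf-8"))
    return servers[h % len(servers)]
```

Every client that shares the same server list and hash function sends a given key to the same instance, so the instances never need to talk to each other.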

9, LiveJournal.com currently has 28 Memcached instances running on our network on ten unique hosts, caching the most popular 30GB of data. Our hit rate is around 92%, which means we're hitting our databases a lot less often than before.

10, [...] Running Memcached on the same machine as mod_perl works well, because our mod_perl code is CPU-heavy, whereas Memcached hardly touches the CPU. Certainly, we could buy machines dedicated to Memcached, but we find it more economical to throw up Memcached instances wherever we happen to have extra memory and buy extra memory for any old machine that can take it.

11, The advantage of libevent is that it picks the best available strategy for dealing with file descriptors at runtime. For example, it chooses kqueue on BSD and epoll on Linux 2.6, which are efficient when dealing with thousands of concurrent connections. On other systems, libevent falls back to the traditional poll and select methods.
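The same idea can be illustrated with Python's standard `selectors` module, used here as an analogy and not as libevent itself: `selectors.DefaultSelector` resolves at runtime to the most efficient mechanism the platform offers. The socket pair is only a small local stand-in for client connections.

```python
import selectors
import socket

# DefaultSelector is an alias for the most efficient implementation
# available on the current platform (epoll, kqueue, devpoll, poll or select).
sel = selectors.DefaultSelector()
backend = type(sel).__name__

# Register one end of a local socket pair and wait for it to become readable.
a, b = socket.socketpair()
sel.register(a, selectors.EVENT_READ)
b.send(b"ping")
events = sel.select(timeout=1)
ready = [key.fileobj for key, _mask in events]

sel.unregister(a)
sel.close()
a.close()
b.close()
```

The calling code is identical on every platform; only the value of `backend` changes.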

12, Inside Memcached, all algorithms are O(1). That is, the runtime of the algorithms and CPU used never varies with the number of concurrent clients, at least when using kqueue or epoll, or with the size of the data or any other factor.

13, Of note, Memcached uses a slab allocator for memory allocation. Early versions of Memcached used the malloc from glibc and ended up falling on their faces after about a week, eating up a lot of CPU time due to address space fragmentation. A slab allocator allocates only large chunks of memory, slicing them up into little chunks for particular classes of items, then maintaining freelists for each class whenever an object is freed. See the Bonwick paper in Resources for more details. Memcached currently generates slab classes for all power-of-two sizes from 64 bytes to 1MB, and it allocates an object of the smallest size that can hold a submitted item. As a result of using a slab allocator, we can guarantee performance over any length of time. Indeed, we've had production Memcached servers up for 4–5 months at a time, averaging 7,000 queries/second, without problems and maintaining consistently low CPU usage.
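The size-class rule can be sketched as follows. This is a simplified illustration, not Memcached's C implementation: it ignores per-item header overhead, and the `SlabClass` freelist hands out standalone buffers instead of carving chunks from large pages. The names `slab_class_size` and `SlabClass` are hypothetical.

```python
MIN_CHUNK = 64           # smallest slab class, in bytes
MAX_CHUNK = 1024 * 1024  # largest slab class (1MB)


def slab_class_size(item_size):
    """Return the smallest power-of-two chunk size that can hold item_size bytes."""
    if item_size < 0 or item_size > MAX_CHUNK:
        raise ValueError("item does not fit in any slab class")
    size = MIN_CHUNK
    while size < item_size:
        size *= 2
    return size


class SlabClass:
    """One size class with a freelist of reusable chunks (simplified)."""

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size
        self.free = []  # chunks returned by freed items

    def alloc(self):
        # Reuse a freed chunk when one exists; otherwise create a new one.
        return self.free.pop() if self.free else bytearray(self.chunk_size)

    def release(self, chunk):
        self.free.append(chunk)
```

A 100-byte item lands in the 128-byte class, and a freed chunk goes back on its class's freelist for the next item of that class. Every chunk in a class has the same size, so freeing and reallocating never fragments the address space the way general-purpose malloc can.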

14, A final optimization worth noting is that the protocol allows fetching multiple keys at once. This is useful if your application knows it needs to load a few hundred keys. Instead of retrieving them all sequentially, which would take a fraction of a second in network round-trips, the application can fetch them all in one request. When necessary, the client libraries automatically split multi-key loads from the application into separate parallel multi-key loads to the Memcached instances. Alternatively, applications can provide explicit hash values with keys to keep groups of data on the same instance. That also saves the client library a bit of CPU time by not needing to calculate hash values.
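The splitting step might be sketched as below. This is an assumption-laden illustration and not any real client library's code: the server list, the CRC32-modulo mapping, the `(explicit_hash, key)` tuple convention, and the names `pick_server` and `split_multiget` are all hypothetical.

```python
import zlib
from collections import defaultdict

SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]  # hypothetical


def pick_server(key, explicit_hash=None):
    """Choose an instance; an explicit hash value skips hashing the key."""
    if explicit_hash is not None:
        h = explicit_hash
    else:
        h = zlib.crc32(key.encode("utf-8"))
    return SERVERS[h % len(SERVERS)]


def split_multiget(keys):
    """Group keys by instance so each instance receives one multi-key request.

    Each entry is either a key string or an (explicit_hash, key) pair.
    """
    batches = defaultdict(list)
    for entry in keys:
        if isinstance(entry, tuple):
            explicit_hash, key = entry
        else:
            explicit_hash, key = None, entry
        batches[pick_server(key, explicit_hash)].append(key)
    return dict(batches)


batches = split_multiget([(7, "user:7:bio"), (7, "user:7:prefs"), "post:1"])
```

Both keys tagged with hash value 7 end up in the same batch, so one request to one instance fetches all of that user's data. A real client would then send the batches in parallel and merge the replies.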
TAG: Excerpts Caching Distributed Memcached

