Caching for Cash
There are two ways to make code faster: either delete code, or reduce the amount of work it does. But since deleting code isn't usually the most practical option, reducing work by caching is the preferred approach.
In this talk, Kent C. Dodds explores the fundamentals of caching through a series of illustrated examples.
Topics include working with cache keys, cache management, rendering strategy tradeoffs, and other best practices.
Caching is a deep area, but after this talk you'll be prepared to dig in further to determine the best course of action for your own applications!
Transcript
00:00 Hello, Epic Web Dev friends, my name is Kent C. Dodds and I am so excited to talk with you about Caching for Cash. This is more than you knew you needed to know about caching. Before we get into this, I want to make sure that your brain is prepared for the learning that's about to happen. Normally before I give my talks, I have people
00:18 stand up and we do air squats together and stretches and stuff. I'm not going to do that now, but if you haven't moved your body in a little while, you need to get that blood flowing for your brain to be operating at peak capacity. So I want you to pause the video, get your blood flowing, and then you come back and we can go through this together. You done? Alright, awesome, welcome
00:38 back. So, this talk is a deep dive on caching fundamentals with some code examples, but it stays a little surface-level in a lot of areas, because caching is a really big topic and we don't have all day. Maybe we'll have a caching workshop on Epic Web Dev in the future; take a look and see
00:57 if we've got one now. But today we're just going to give you a general sense of the problems surrounding caching. So, let's get going. There are two ways to make your code faster: you can either delete code, just eliminate it, or reduce the amount of stuff your code is doing. But we've got
01:15 problems with this. If we just delete the code, well, then what are we going to get paid to do? I kind of like eating, I like being paid for the work that I do, I like having work to do, so we can't just delete everything that we're writing. Sometimes you can delete some stuff and swap it out for
01:33 something else, but that's more like reducing the stuff. And at some level, there's only so much you can reduce as far as the stuff is concerned. Maybe that stuff is actually the stuff that you're trying to sell. So you can do these optimizations but at some point you get to this level where it's
01:52 like, well, I can't get rid of any more code and I can't reduce the stuff it's doing. But before we get into caching, I thought it was really important to talk about these two things first, because caching is a pretty significant challenge and it brings along a lot of complexity. So I wanted to call
02:12 this out, because I have this idea that I like to talk about called Don't Solve Problems, Eliminate Them. By that I mean: we've got this really, really big problem of caching, and if we just go for that solution right out of the gate, we might
02:31 miss the opportunity to change our approach so that we don't need caching at all. I have something relatively similar in this vein called Fix the Slow Render Before You Fix the Re-render. That's a React-specific post, but the idea is that if the render of your component is really slow, then reducing the
02:50 number of times it re-renders is fine, but if you speed up how fast it renders, then you won't care how many times it re-renders. So looking for ways to avoid caching is much better than figuring out how to cache. That's the point. But for a lot of
03:10 situations you can't delete it, you can't reduce it, and you can't just make it fast and so for that we're going to cache it. Unfortunately, we're not going to C-A-S-H it, we're going to C-A-C-H-E cache it, which is not nearly as fun, I promise, but it can be kind of interesting and fun, so let's talk
03:31 about this. What is caching? Wikipedia says, "In computing, a cache is a hardware or software..." No, that's ridiculous, I'm not going to read that. Let's look at examples instead; I think that will be a lot more efficient for us. So here's a function
03:47 called computePi that computes pi to a certain decimal place. If we wanted to cache this, it would actually be pretty straightforward: we add a variable outside of the function, and we have this special function that's our cached version of
04:07 computePi. We check whether pi is undefined, and if it is, we call computePi and return the result. The next time this is called, pi will already be assigned, so we'll skip computing pi and just return it right away.
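As a sketch, the pattern looks like this (a reconstruction of the idea, not the exact code from the slides; the body of computePi is a stand-in):

```js
// the cache lives outside the function
let pi;

function computePi() {
  // stand-in for an expensive computation of pi
  return 3.14159265358979;
}

function computePiCached() {
  if (pi === undefined) {
    pi = computePi(); // pay the cost only on the first call
  }
  return pi; // every later call returns the stored value
}
```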
04:25 This has two really nice benefits. One is that we don't have to pay the cost of calling computePi multiple times, which speeds things up, because we don't have to go through the whole function. The other is that if for some reason we had to pay money to call this function, we wouldn't have to pay
04:43 that money either. That may sound ridiculous, but think of it as a third-party API you need to call. If you're able to cache the result of that third-party API, then you don't need to call it again, so you don't have to pay the money it costs to call that API. Just as an example. This is a pretty simple example; in some ways
05:01 it reminds me of the singleton pattern that you might be familiar with. It's a simple example, but let's pull out what we can learn from it: the fundamental idea of what caching is. You store the result of a computation somewhere and
05:19 return that stored value instead of recomputing it. That's it, that's what caching is, that's all it is: storing the result somewhere, and then returning that stored result instead of recomputing it every time. I wish it were always that easy, though. So let's look at a
05:38 little bit more complex example. Let's add a precision argument, so people can specify how many decimal places of pi we want to compute. This adds quite a bit of complexity: before, we had a hard-coded amount, but now people can change the precision as much as they
05:57 like, so we have to change our caching strategy a little. It's not all that bad: now we have this piCache, and it's an object this time, and we check typeof piCache at that key, so we have a mapping from each precision to its
06:16 result. Then we set that property to the result of computePi. So if we're calling computePiCached with the same precision multiple times, we won't have to call computePi again for that specific precision.
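Sketched out (again a reconstruction, with computePi standing in for the real computation):

```js
const piCache = {};

function computePiCached(precision) {
  // the precision argument is our cache key
  if (typeof piCache[precision] === 'undefined') {
    piCache[precision] = computePi(precision);
  }
  return piCache[precision];
}
```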
06:34 This brings us to another concept: cache keys. The key in our case is this precision value, but the key is actually a pretty tricky beast, and I think we're going to need a different example to get an idea of why. So here we have a sum function. It's still super
06:52 simple: we have our sumCache, which is just an object, and we take two arguments, so now we have to compute the key in some way. In our case we're just going to join the first argument, a comma, and the second argument, so that given the same two inputs we always produce the same key. Everything else is the same as before.
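Something like this (a reconstruction):

```js
const sumCache = {};

function sum(a, b) {
  return a + b;
}

function sumCached(a, b) {
  // both inputs go into the key: "1,2" for sumCached(1, 2)
  const key = `${a},${b}`;
  if (sumCache[key] === undefined) {
    sumCache[key] = sum(a, b);
  }
  return sumCache[key];
}
```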
07:11 I think it would actually be useful to watch what the code execution looks like, so let's walk through it. First we call the function and we generate the key: it's "1,2". That's not in the cache, so we call sum, and then we take the
07:29 return value, which is 3, set it into the cache, and return what we just barely set. Now our cache has an entry under that "1,2" key. The next time, we compute the key and it's the same as before, so when we check the sumCache we have a value for "1,2", and we can skip
07:49 over that if statement and just return the value from the cache. That's how we're able to skip recomputing the sum: we calculate the same cache key we had before for the same inputs. So that is cache keys. But there's a pretty significant challenge with cache keys, and that
08:09 challenge is visible right here. I want you to play a little game with me: pause the video and find the bug. There's a bug in here that is really, really bad, so go ahead and take a second to find it.
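Here's a sketch of the kind of code on screen (reconstructed from the description, so the details are approximate):

```js
const addDaysCache = {};

function addDays(days) {
  return new Date(Date.now() + days * 24 * 60 * 60 * 1000);
}

function addDaysCached(days) {
  const key = `${days}`;
  if (addDaysCache[key] === undefined) {
    addDaysCache[key] = addDays(days);
  }
  return addDaysCache[key];
}
```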
08:26 Did you find it? Okay, so the bug is right here in our key: our key is incomplete. Let's look at a specific example. We call addDaysCached(3), because we want the date 3 days from today. Then we call it again, and we get 3 days from today again: that's our cache hit. Then suppose we wait 24 hours and our program is still running, so
08:45 that cache is still there, and we call it again with 3. Now we want 3 days from that moment 24 hours later, right? Unfortunately, we get 3 days from yesterday, which is 2 days from today. We got the wrong answer, because our cache key was incomplete: we got a cache hit, and we
09:05 definitely should not have gotten a cache hit. So yes, we have our first wonderful caching bug. The first of many, I'm afraid. Poor Homer, I'm so sorry. So cache keys are tricky. The cache key must account for all inputs required to
09:23 determine the result. Unfortunately, in our case one of the inputs is Date.now(), so yeah, that's not going to work. So there are caveats to cache keys, or rather, ways we end up cheating a little. First, it's easy to miss an input;
09:43 we'll look at examples of each of these. Second, you could have too many inputs. And third, computing the correct cache key can be costly (that alliteration is intentional). So let's talk about missing an input. It can definitely be easy to miss an input. If you've ever
10:01 used React with useMemo or useEffect or useCallback, those dependency arrays are inputs into a cache. It's so tricky, in fact, that we need an ESLint plugin, which doesn't always work exactly the way we would expect.
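For example, a contrived sketch (not from the talk) of a missed dependency:

```js
import { useMemo } from 'react';

function Total({ items, taxRate }) {
  // the dependency array is the cache key; taxRate is a real input,
  // so omitting it serves a "cache hit" computed with a stale taxRate
  const total = useMemo(
    () => items.reduce((acc, item) => acc + item.price, 0) * (1 + taxRate),
    [items], // bug: taxRate is missing
  );
  return <p>Total: {total.toFixed(2)}</p>;
}
```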
10:21 So yeah, it's really easy to miss inputs, and really devastating when you do, as we saw with poor Homer. When you're writing code for the first time, it can be relatively easy to miss an input, but as you make changes to existing code, you can also
10:38 add inputs without really knowing it. So: easy to miss an input. You could also have too many inputs. A good example of this is Google Flights. There is just an outrageous number of possible inputs for the results that you
10:54 get here: the origin, the destination, the ultimate destination, the dates, and then all of these filters, like which airlines you want, the stops you want, the bags, the price, the times, emissions, connecting airports, duration. All of this stuff results in just so many
11:14 inputs that you would probably never even get a cache hit, because every request is going to be different. In fact, it's very possible that Google uses my own user history as an input that I don't even see in the UI here. So yeah: too many inputs. Now, there are probably lower-level caches, because they're probably talking to third-party
11:35 APIs to get the flight information, and maybe they're caching some of that. But caching the entire page? That's not going to happen, no way. I don't have any insider view into how Google Flights works, but I'm pretty confident they do not cache this entire
11:54 page. That just would not work. Too many inputs. Another one: computing the correct cache key is costly. For my personal blog, all of my blog posts are actually MDX files on GitHub, and the compilation process can take a couple hundred
12:11 milliseconds per blog post. The input for compiling a blog post is that MDX file, but getting the MDX file's string contents requires going to GitHub. So for me to use the proper key for the compiled blog post,
12:30 the entire MDX input string would have to be the key, and that doesn't really work, because getting that string means going to GitHub, which itself takes a couple hundred milliseconds. Computing that cache key just is not reasonable. The only thing
12:48 I really have is the slug in the URL, so I don't have a great way to build a reliable cache key for compiling a blog post. That's not going to work, so we're going to cheat. We're going to start doing some cache
13:06 revalidation. We're going to acknowledge and embrace the fact that we can't account for every input in our cache keys, and we're going to revalidate instead. There are a couple of different approaches for this. One option is to proactively update the cache. For example,
13:24 on a post update, you just update the cache. I have a GitHub Action where, every time I make a change to content, it tells my actively running server, and my server says: oh great, I'll keep on serving cached versions, but while that's happening I'm going to go to GitHub, grab that MDX that was changed, and update the cache, so
13:43 that the version I'm sending is the fresh cached version. That's the approach I take, and it works really well. It can definitely fall out of date, like if the GitHub Action fails for some weird reason, so having a couple of these other strategies is still useful even if you're doing this.
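A sketch of the shape of that setup (not Kent's actual code; Express, the endpoint path, and the helpers fetchMdxFromGitHub and compileMdx are all illustrative assumptions):

```js
import express from 'express';

const app = express();
const contentCache = new Map();

// e.g. a GitHub Action POSTs here whenever content changes
app.post('/api/refresh-content', express.json(), async (req, res) => {
  const { slug } = req.body;
  // re-fetch the changed MDX and overwrite the cache entry;
  // cached responses keep being served while this happens
  const mdxSource = await fetchMdxFromGitHub(slug); // hypothetical helper
  contentCache.set(slug, await compileMdx(mdxSource)); // hypothetical helper
  res.json({ refreshed: slug });
});

app.listen(3000);
```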
14:02 Another approach is timed invalidation. A good example of this is Cache-Control headers. When a browser makes an HTTP call to a server, the server can send back a header on the response with a couple of directives for cache control. One of those is max-age: you can say how long you want this HTTP response to be
14:22 cached in the browser cache (the browser has a cache built in; it's awesome). So you can say, I want this to be assumed valid for 60 seconds, and if the browser is going to make another request for that same resource, it will first check its cache and say: oh, that was valid
14:40 for 60 seconds and it's only been 49 seconds, so instead of actually making this request, I'll just grab the thing I saved earlier. That works, and after 60 seconds it will go and revalidate: it will actually make that call to the server.
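In server code, setting that header looks something like this (an Express-style sketch; the framework and route are assumptions, the header itself is standard):

```js
import express from 'express';

const app = express();

app.get('/report', (req, res) => {
  // the browser may reuse this response for 60 seconds
  // before revalidating with the server
  res.set('Cache-Control', 'public, max-age=60');
  res.json({ generatedAt: new Date().toISOString() });
});

app.listen(3000);
```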
14:59 This works pretty well, but increasingly people are used to having the latest, greatest, most up-to-date value of everything, so it doesn't always work perfectly, though for a lot of use cases it works just fine. Another thing we can do is stale-while-revalidate, where we update the cache in the background. This
15:16 piggybacks on top of timed invalidation with Cache-Control headers: you could say, after 60 seconds consider this stale, but I'm actually okay with you serving stale values for another 120 seconds. So,
15:35 as an example, the browser goes to a CDN, and the CDN is managing this stale-while-revalidate. The CDN sees: I've got a cached copy of this HTTP resource, and it says it was valid for 60 seconds, but it's now been 65 seconds since I
15:54 last updated it. Because it has the stale-while-revalidate directive on that header, I'm going to send back the stale version, but in the background I'm going to go to the origin server. That way, the user waiting on this response doesn't have to
16:11 wait for the round trip to get the fresh version; they're just the one who triggered refreshing things in the background. This stale-while-revalidate approach is really, really effective for keeping things fast, while accepting that they're going to be a little bit stale, and sometimes that's perfectly reasonable.
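Building on the earlier header sketch, stale-while-revalidate is just another directive in the same header:

```js
app.get('/report', (req, res) => {
  // fresh for 60s; after that, a cache (e.g. a CDN) may serve the stale
  // copy for up to another 120s while it refetches from the origin
  res.set('Cache-Control', 'public, max-age=60, stale-while-revalidate=120');
  res.json({ generatedAt: new Date().toISOString() });
});
```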
16:30 Another thing you can do is force a fresh value. I actually do this on my website as well: when I'm signed in as an admin, I can add a fresh query param to any page, and it will force-refresh all of the things that are cached on
16:48 that page. That lets me deal with situations where maybe the GitHub Action didn't work, or I know that a particular third-party cache isn't going to refresh for another week or something, so I can force it to refresh if I want to. For
17:06 podcasts and stuff I've got about a week on those, so I'm not hitting their APIs all the time. So you can force and manually update things. But one of my favorites is soft purge, and I was clued into this
17:21 idea by Fastly, a CDN that, as far as I'm aware, popularized it. It's kind of like a manual stale-while-revalidate. Let's say I do have stale-while-revalidate on some HTTP resource, say a static web page like
17:40 my about page, and I've made an update to it, so I want the CDN to update its version of that page so users get the latest version. But the cache headers on there
17:58 say it's valid for, like, two days, and it can be stale for an additional three days or something like that. So if a user comes to the page, they could potentially get a five-day-old version. What we can do is say: hey, I've updated this, let's go
18:17 tell Fastly (or whatever CDN) that this is actually stale now, even though it's only been maybe one day and it wouldn't otherwise update for another day. We just proactively mark it all as stale, so it's refreshed the next time somebody requests that page.
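As a sketch (based on Fastly's documented purge API rather than anything shown in the talk, so double-check their docs):

```js
// issue a soft purge: mark the cached object stale instead of deleting it,
// so the next request serves stale and triggers a background refresh
async function softPurge(url) {
  const response = await fetch(url, {
    method: 'PURGE',
    headers: { 'Fastly-Soft-Purge': '1' },
  });
  return response.ok;
}

await softPurge('https://example.com/about');
```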
18:36 This can be really efficient for certain use cases, and it's just the reality of the world that we have to deal with revalidating things, because our cache keys are not perfect; we can't make them perfect. Since along with all this we're talking about HTTP and web servers and
18:56 stuff like that, I just want to make a little side note, because I think it's relevant here. Static site generation, SSG, was really, really popular; it's becoming less popular, for good reason. SSG is basically build-time caching with a CDN and proper cache-control headers. That's all SSG is, and
19:16 it's important to recognize that it has no performance benefit over a fully server-rendered site with a CDN and proper cache-control headers. It is exactly the same as having an origin server that generates these pages on demand and caches them with cache headers. If you're able to cache the entire thing at build
19:35 time, then you're also able to do it with server rendering, and there's no performance difference at all. This is contrary to popular belief, so I just want to clear it up. But I also have a note for the other side: SSG has severe limitations as your product evolves toward dynamic requirements. It's a
19:54 little box, and as soon as you step outside that little box, you're going to have to make some major trade-offs that you won't like. It forces you to choose between re-architecting and offering a worse UX, so my recommendation is to just start fully server-rendered. That's it, there
20:11 you go. I'm not your mom, you do what you want, but I feel like SSR is the way to go, I do not do SSG. So there you go, a little spicy, sorry. So let's talk about another caching problem. There's another bug, we're gonna play this game again. There's a
20:30 major, major problem in this caching implementation. We have this getVideoBuffer function that returns a promise, and our getVideoBufferCached is also async, so that's not the problem: you can totally do async/await with this pattern. We're not really handling errors, which is also something you'd need to consider, but just look at
20:49 the basic premise here. There's a major problem, so go ahead and pause the video, and come back when you think you've got it.
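A reconstruction of roughly what's on screen:

```js
import fs from 'node:fs/promises';

const videoBufferCache = {};

async function getVideoBuffer(filePath) {
  return fs.readFile(filePath);
}

async function getVideoBufferCached(filePath) {
  if (videoBufferCache[filePath] === undefined) {
    videoBufferCache[filePath] = await getVideoBuffer(filePath);
  }
  return videoBufferCache[filePath];
}
```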
21:07 Okay, you good? You got it? Here's the bug, right there: the videoBufferCache object. You very rarely want to use an in-memory object for something like this, and if you do, you want to think about the cache size, because in our case we're going to end up with a JavaScript heap out-of-memory error: we're loading video buffers into a cache and just keeping them around forever. Probably not the best idea. It could definitely be useful in some
21:27 scenarios, but you've just got to watch your memory. So let's talk about cache size and solutions. One solution is a least-recently-used (LRU) algorithm, where whatever was used most recently gets pushed to the back of the line of things to be proactively ejected from the cache. And
21:45 so the idea is: you set a fixed cache size, and when you're going to add something to the cache, you eject the thing that was used least recently. There's a great module for this called lru-cache. It's an in-memory cache, but it manages its size so that it never gets
22:05 too big.
22:22 The file system is also another place to cache. It wouldn't work for our video buffers, because we're literally reading them from the file system, so we wouldn't really be caching anything, but it can be really useful for a lot of different things, like compilation of JavaScript files: Babel sticks things in the node_modules/.cache directory. Using the file system to reduce the amount of computation, especially when reading from the file system is faster than whatever it is you're doing, is a pretty common solution. Relatedly, SQLite is also just a file, but
22:40 you can use SQLite as a cache, and the Epic Stack actually uses SQLite for its cache via LiteFS, so its cache is distributed to all the running instances of your app, which is actually pretty important. So yeah, you can definitely use SQLite. The Epic Stack does that, my
22:58 personal website does that, and it's great. Then Redis is a really common solution as well. It's a service running over here that you talk to, and provided the Redis instance is running close to your app, it can be very fast too. It's technically more than just a cache; I know the Redis folks are really
23:17 trying to push that it's not just a cache, but that's what a lot of people use Redis for. Cache size can still explode out of control, so keep an eye on your cache. But these are a few solutions that can alleviate some of the pain around cache size. So let's talk about
23:37 cache warming. Maybe you've got a bunch of products, hundreds or thousands of products, and each one needs some level of caching. When you first deploy that thing and you're bringing up all the caching for all those things, it can lead to a
23:56 couple of problems I want to address. First of all, if you're talking to APIs, maybe the Shopify API or a podcast API or something like that, sometimes you can get rate limited. So make sure you're aware of the limitations of the APIs you're interacting with, so that your IP doesn't get
24:14 blocked by those providers, because when you're bringing it all up, all of a sudden you're loading up the cache with tons and tons of requests. Think about that. Warming also requires a lot of resources, so it's
24:31 probably a good idea to be very thoughtful about how many of those things you're updating all at once. If nothing else, during the warm-up process maybe give yourself a bit beefier CPU, because you're probably going to be warming up the cache more than you think,
24:50 and make sure you aren't hogging all the resources of your web server as you warm up the cache. It also makes users wait for the fresh value, so it's probably a good idea to warm up the cache before you actually switch DNS over when you're deploying a new version of your app or
25:08 something like that. Warming up the cache kind of turns this quiet, nice setting into total chaos where everybody's doing everything all at once. So think about that when you've got a lot of things cached, and find ways to slowly warm up the cache rather than doing everything all at once. So yeah, soft purge
25:28 is actually a really good solution to this: you just mark everything as stale, and then as users request those resources, they get the old version and they kick off the refresh of that cache value. There's also this problem: once you put something in the cache, it can become
25:50 different from the thing that it's caching, because the thing that it's caching can change. The actual function, what it does and the values it returns, could technically be another input to your
26:09 cache key, but that's not reasonable. What I recommend instead is adding validation that checks what came out of the cache and makes sure it's what your application expects. You can use Zod for this, and it's awesome. Just something else to think about.
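For instance, a sketch (Zod is the library he names; this particular schema and cache shape are illustrative):

```js
import { z } from 'zod';

const PostSchema = z.object({
  slug: z.string(),
  title: z.string(),
  html: z.string(),
});

function getCachedPost(cache, slug) {
  // if the cached shape no longer matches what the app expects
  // (say, after a deploy changed the schema), treat it as a cache miss
  const result = PostSchema.safeParse(cache.get(slug));
  return result.success ? result.data : undefined;
}
```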
26:28 Then there's cache request deduplication. Say it's Christmas Day or Black Friday, and now everybody's going to hit your product page, all making these requests. You want to make sure that if one request is going out and its result is going to be cached, the next person
26:48 requesting it doesn't actually make that request themselves; they just wait for the result of the first one. This is non-obvious until you've thought about it. Once you have, it's like, oh yeah, of course we want to do that. This is actually pretty similar to
27:05 DataLoader, if you are familiar with that; you can take a look at DataLoader as a concept, it's pretty interesting.
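A minimal sketch of the deduplication idea (not code from the talk; fetchProduct is a hypothetical fetcher): share one in-flight promise among all concurrent callers for the same key.

```js
const inFlight = new Map();

function getProductDeduped(productId) {
  let promise = inFlight.get(productId);
  if (!promise) {
    promise = fetchProduct(productId).finally(() => {
      // once settled, forget it so later calls refetch (or hit a cache)
      inFlight.delete(productId);
    });
    inFlight.set(productId, promise);
  }
  return promise;
}

// hypothetical fetcher, for illustration only
async function fetchProduct(productId) {
  const response = await fetch(`https://example.com/api/products/${productId}`);
  return response.json();
}
```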
27:25 So really, what you're looking for is cachified. I actually built cachified on kentcdodds.com as just a module, and then I invited people to please open source it, and somebody did, and it's awesome. They did a much better job: it's tested, and they added a bunch of cool things like stale-while-revalidate, soft purge, and request deduplication, which I had not done. A bunch of
27:42 other cool things too. So give cachified a look. It is phenomenal; I strongly recommend it. It's a really, really great module for caching in Node.
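As a taste, a usage sketch based on cachified's documented API (the user-fetching endpoint here is made up; double-check the package docs for current option names):

```js
import { LRUCache } from 'lru-cache';
import { cachified } from '@epic-web/cachified';

const cache = new LRUCache({ max: 1000 });

function getUserById(userId) {
  return cachified({
    key: `user-${userId}`,
    cache,
    // only called on a cache miss (or when revalidating)
    async getFreshValue() {
      const response = await fetch(`https://example.com/api/users/${userId}`);
      return response.json();
    },
    ttl: 60_000, // fresh for a minute
    staleWhileRevalidate: 120_000, // then stale-but-servable for two more
  });
}
```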
27:58 So with all that, I just have one more thing for you, and that is: you. Hey, you are awesome. Thank you so much, see you around the internet. Bye!