Caching Jamstack Sites With GitHub Actions
with Benjamin Lannon
Did you know you can combine GitHub Actions with Netlify to build powerful integrations? Benjamin Lannon teaches us all about it in this episode!
Transcript
Captions provided by White Coat Captioning (https://whitecoatcaptioning.com/). Communication Access Realtime Translation (CART) is provided in order to facilitate communication accessibility and may not be a totally verbatim record of the proceedings.
Hello, everyone. And welcome to another episode of Learn with Jason. Today on the show, we've got party corgi, long time friend Benjamin Lannon here. How are you doing?
Good.
Good. For those of us who aren't familiar with your work, give us a little background.
My name is Benjamin Lannon, I'm a developer in upstate New York. I've really been all over the place. I've done stuff with Gatsby, VSCode. A lot of places around open source in general.
Yeah. For sure. And I've seen you across a whole bunch of open source libraries. And so, like, what kind of stuff, what are you most excited about right now? What are you having the most fun with?
Yeah, so I think the thing I've kind of thought about over, like, the past year or so, a thing I'm really interested in, is being able to take data sets and get insights into them. One example: I wanted to get an insight into VSCode. They have thousands of issues open on the VSCode repo. And a couple months, I think it was November 2018, they said, hey, we're really focusing on trying to go through them, resolve things. And I was, like, I want to see how that actually goes. And I set up a very simple automation where, I think, once an hour, I take a snapshot of how many issues are open. And I did that over the course of a month and visualized that. And saw that the number actually went down very quickly.
Yeah.
It actually did.
I remember this. This was cool. This was a project they ended up including in their change logs, I think, for a major release. But that's really cool. And I've seen you do this a couple of places where you're kind of pulling data and finding ways to make that data, I guess, human interpretable. I feel like it's really hard to analyze data in four dimensional space, if we consider through time. We don't, I remember that there were more issues a while ago. But I don't remember how many. I don't remember, like, the relative distances are really hard to track, especially over a long period of time. And so, having those types of records and visualizations is so nice, being able to see, like, oh, I really did make progress. That's, I think that's why they tell you to journal or to write down your accomplishments so that when you get further along and you look back, you can go, oh, dang, I did do a lot. Even though it won't feel like you did anything. Cool. Very cool. You've been doing a lot of work with GitHub Actions. What got you into those?
So, it was back when they first announced it. At first, I didn't quite understand it. It was some kind of CI on GitHub. But then, as I dug in, I saw this is really, across the board, a way to integrate applications into GitHub in a serverless manner across the spectrum of what GitHub provides. When you think about CI/CD, usually, you think: when I make a PR, I can run some tests on it, do a preview deploy. And then I'll actually deploy that change up to wherever it is, whether it's a website or some kind of service, those kinds of things. And what really interests me is everything else in that spectrum. What happens, maybe, if you want to watch for various issues? Like, if someone makes a comment in that issue, you may want to do some stuff based on the comment. Maybe I want to make sure that the comment threads are being inclusive. So I could maybe try to alert people if someone posts something that is going against the code of conduct. Maybe I want to do something when people star the repo, when someone forks a repo. Really, the entire spectrum for GitHub as a platform rather than just this very small subset that is CI/CD.
Yeah, yeah. That's, I feel like this is an interesting space that we're starting to see get explored in a few different places. Where it's no longer you use CI/CD, as you said, to build the site and run some tests. You can now do, like, really interesting stuff. You know, and GitHub Actions are in that space. And I've seen the build plugins are introducing a similar pattern where you have a lot of flexibility there. And now I'm seeing in other companies, as well, you get this extension. What if your CI/CD was more than that? I think that's exciting. So we're going to play with some of that today. But what are some things you've actually seen? Projects you've seen used in GitHub Actions that you're, or built that you're really excited about.
So some fun things, some very fun things I've done are making those visualizations. For those, I do all of the processing: every single time something happens, I trigger those things and save the data points up to, some of them, I'm saving down in the DD, some kind of serverless database. Other things include, one thing I had is whenever my website builds and I actually deploy it, as soon as it is successfully deployed up on to Netlify, I make my lightbulb, either the lightbulb behind my desk or the one right behind me, go green.
And if the build fails, it'll go red. I think I'm finding it interesting. Everyone did things with hardware and then everything became virtualized, it's software. What if we can take a step back and actually start throwing things back into the real world again?
Mm hmm. That's something I've really enjoyed watching lately. Lucky Number 7 was on the show. And we were talking about Twitch overlays. And one thing he's done, he has an old flapper display board. You know the train station ones that roll over to show you whatever time it is. He has one of those hooked up to Twitch. And so, in his office, like, people can run a command and show a thing. I don't know exactly how he's got it set up, but you can control the flapper board from the computer, which is really, really fun. Or, like, Michael Jolly has his Twitch controlled puppet.
Where's yours, Jason? Behind you on the wall. When the Corgis come.
That actually would be. You know what would be fun, actually, we could extend the corgi stampede that have lights flash. That sounds like a lot of fun. Speaking of the corgi stampede, I figured out why it hasn't been working, everybody, it's because I muted it. I unmuted it today.
That needs to be a sound effect. Oh, dang, all right, Chris is giving out subs. All of you subbed, send the corgis and make some noise. Yeah, this is beautiful. So thank you, thank you so, so much for that. So then what are we going to do today? What are we going to build today?
So today, actually, it's something kind of based on something you've previously built.
OK.
So you previously built a Netlify build plugin for being able to cache a Jamstack site.
It's back. In record time. Thank you, Chad, well done.
There you go. Because, like, when we think about deploying websites, when you do a build on Gatsby, you have to build the site out into files you go deploy on a CDN like Netlify. With that, usually these layers have a cache in there where you can store stuff between builds to make sure that, like, if I build a second time, if I'm only changing a couple files, it shouldn't need to recompile every single file every single time.
Mm hmm.
By default, GitHub Actions doesn't really cache; it doesn't store anything between sessions, because they treat it as an environment that you spin up for each run. The team at GitHub that built Actions actually made an action called cache so that you can add this caching support into your workflow. If I build in Gatsby, the next time I build it'll have that caching layer implemented so I can then, hopefully, get a shorter build time.
Nice. OK. That sounds awesome. I'm ready to do that. Before we switch over to start coding. Thank you Pristine 1 for the sub, that is very much appreciated. And shout out to White Coat Captioning, who is doing live captioning of the show today, you can go to lwj.dev/live to see the live captions. And I'm going to write that down in the chat so you can go look at them. I did break the Twitch embed, I'm sorry for that, but the live captions work great. So thank you to White Coat Captioning for that. Those are made possible by the sponsors Netlify, Sanity and Auth0, who are putting money in to cover the cost of captioning, which makes the show more accessible for everybody. We appreciate that very much. With that being said, let's jump over into pairing view. And let's write some code.
Oh, we would need to add code.
Indeed, we would. So let's go, definitely go follow Benjamin on Twitter. And where do you want to start? What should we do first?
I think maybe a good idea, let's go find a repo that we can use to kind of really show, like, some of these benefits. I think when I was reading through your build plugin, you have an example where you used one of your websites and it showed, like, very massive differences between a build without any cache and with cache. That site is a very image heavy site. If anyone's worked with Gatsby in the past, the way that Gatsby images are generated for you is it takes the single image and it'll create multiple different sizes for you. If you were loading it up on a phone, you don't need to have a 4 megabyte image like you would on a 27 inch flat screen.
Yeah, so this is a monster site. We might regret this. 200 plus images that are huge. What I wanted to do was to make Gatsby earn it. Gatsby's going to process every image, but it has to load the image into memory, create the different versions. And so, this is 200 plus images at average of 4 megabytes per image. And each one of them gets turned into a page. We're going to generate over 200 pages with images. We'll have to do some stand up comedy.
OK. Yeah.
Was my top search Guy Fieri chef's kiss? What did I do? Let me show you why that was my top search. I got, where is it? Where did you go? So this is the party Corgi chat. And now, we have this emoji, look at that beautiful thing. That is animated chef's kiss from Guy Fieri. And I feel like that is worth the terrible search history. OK. Back on track here. Let's dive in.
Yep.
We've got big repo. We're ready.
I'd say let's fork this over so we can have a space to work with.
OK.
Can I fork my own code? I don't think I can. But what I can do.
Try to go to Learn with Jason.
I'll do that. We'll go to Learn with Jason. All right. And then, I'm going to get terminal set up.
Oh, yep, look at that. Look how many images we're cloning right now.
There we go.
Speaking about this noise, do you know about the jeopardy npx module?
If you do jeopardy. Let me go find this.
Hold on, let me make sure my sound is piping through. I think I turned this off. All right. Let me do it.
If you do that and put a command after it. Maybe sleep 5. (Jeopardy music).
It's jeopardy.
Oh, that's amazing. I feel like that is really funny. I probably shouldn't do that very often. We'd get a DMCA takedown. But yeah, that is a good command. I like that.
So let's see. OK, so the site should be local.
Yes. Let me actually open it, as well.
Yeah.
Opening the wrong window. Over here, come on. All right.
OK. So while
Oh, crap, this is the wrong thing.
What did we call this? Image processing?
Yep.
While we're here. Can we deploy this up to Netlify and do a manual deploy? Rather than hook it up with the git instance, do a manual deploy of the site. Go to app.netlify.com. And you mean drop the folder in?
Yeah, actually we could do a build, like, behind the scenes. We could open up VSCode, do install, build and drag the public folder in. Because we're going to be looking into using, yep, we've been looking into the Netlify CLI to do our deploys. The nice thing is, you could have it where we build on Netlify's build system, or we can do a deploy from any service, whether it be on a computer, whether it be another CI provider. And those files, we can still push everything up into Netlify. If we want to push our functions, our site, all of that content, we can use Netlify as the CDN.
Mm hmm. Come on. So this may take a while. Let's run the, like run the build. And then, we're going to maybe just start working on the GitHub Action workflow for this.
Yeah, probably a good call. Because I think once we see the images start here, it is brutal. Like, I'm hammering on this computer to do this. If, like, the reason this is so slow, just to reiterate, I intentionally made this just a horribly inefficient site. The images are huge and the queries make sure to generate every version of every image. So it is, it is, yeah, you can see, like, it's rolling backwards, the progress is going backwards, that's how big this site is. All right. While we're doing that, I need to set up a folder, right?
Yeah, we're going to create a, actually, click new file, you can do this all in one swoop. If we do .github/workflows with an S, and then, let's, uh, call this build.yaml. Like that?
Yeah.
So this is where your GitHub Actions workflows are going to be. So when it comes to GitHub Actions, workflow file is kind of one entire sequence of steps you do based on some type of trigger. Where action is one piece of that workflow.
Segfault. Cool. Let's see if I run it again. I think I ran out of memory.
We can still work through this. So to start out. We wanted to kind of define the environment. When we are going to run this.
OK.
First of all, we can give it a name. So if you do name. So each of these things are going to be a top level. So we can give some string for our name.
OK. We'll call this. And this gets displayed by GitHub actions. Cache built assets or something. And I don't need these. Do I?
Oh, no.
We have that, and then, we want to have an on trigger.
If we type in the word push.
YAML YAML, JSON.
On push. So this is telling us when we're going to run.
Is that right?
We're never going to get anything done. I'm sorry.
On push is whenever we push up to the repo, this will trigger. Now, this will run on every single push. If we push to any branch, it'll run everywhere. We maybe want to narrow this down to just be on the master branch. If you push enter right after on, indent push and put a colon after it, we can have a field under push called "branches." And then, yep, we have a colon and then it's going to be an array with one item called master. Yep, like that. So this is now saying only build when you push to the master branch. This is the flow you'd want: when a PR is pushed up, you don't want to build and deploy, you only want to do it for the master branch.
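For reference, a sketch of what the top of the workflow file might look like at this point. The name string is the one suggested in the conversation; the exact layout typed on stream is an assumption.

```yaml
# .github/workflows/build.yaml (top of file)
# Name shown in the Actions tab of the repository.
name: Cache built assets

# Trigger: run only when commits are pushed to the master branch.
on:
  push:
    branches:
      - master
```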
The chat is discussing. I definitely screwed up that sound effect. Thank you for the host. Yeah, I need to go fix that. It's in the repo. Probably just typoed when I was putting the URLs in for the different sounds.
Now, we're going to work in building up jobs if you type jobs and colon. This is kind of where we can run the various things. You can run things in parallel with one workflow. But we only need one job here. So if you push enter and call it
YAML YAML YAML YAML, JSON, JSON.
The site did build. So that means Gatsby's smart enough that once it builds an image, it saves it and didn't have to build it again, which is the benefit of caching, right?
Yeah. So we'll push that up on to Netlify later. We'll need a couple of credentials to say, when we push up to Netlify, where are we pushing to? So that will be later on. So let's call this job build and then
Or build?
Yeah. This is the build job. In every single job, you need at least two parameters in here. So the first is runs-on. And this is going to say what operating system we're running on. If we're going to do it on Windows, we do Windows; if we want macOS, we can. The one I go to is Linux. It has the stuff.
Most CIs run on Ubuntu.
And ubuntu-latest.
Getting fancy over there. Next, we'll have a field called steps. These are the individual tasks.
Steps.
And this is an array where each item is a step we want to define. The first one you usually want to do on every single repo: check out the branch we're working with.
OK.
If we want to use a, a prebuilt action, you're going to type "uses" and then a repo org/repo syntax. So we're going to say actions/checkout.
OK, and this is specifically we're going to github.com/actions/checkout for the code for this action?
Yep. And we want to get the v2 tag. v2.
Like that?
Yeah.
So here we're saying, we're going to use this defined version rather than the master branch, which we could, but if you're using someone else's action, it's safer to use a tag version where you know it's not going to pull in random things and it's going to cause issues later on.
Right. Right.
This will check out the master branch and we'll have the assets on the system.
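A sketch of the job definition as described so far, assuming the job is named build as discussed. It would sit below the name and on sections of the same file.

```yaml
jobs:
  # One job, called "build"; the job name is free-form.
  build:
    # Operating system image for the runner.
    runs-on: ubuntu-latest
    steps:
      # Prebuilt action in org/repo form, pinned to the v2 tag
      # rather than tracking the action's default branch.
      - uses: actions/checkout@v2
```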
OK.
Next, we are going to start caching some stuff.
OK. And this is just the new step?
Yep. When I say caching stuff, we're going to set up a workflow. So at the end of the build, it will cache it. Now, question, should we, do you want to cache the node modules? Or is it OK to cache the Gatsby files?
You're in charge. You tell me.
So I think, I'm going to, just a comment about this, just as a reference for if people want to do this themselves: I think we're going to pass on the node modules, but you do not want to cache the node_modules folder itself. Certain packages, sometimes, when you install them on different operating systems, pull down different binaries, and you don't want that to be cached. Rather, you would pull from the npm that you have on your system.
Mm hmm.
But let's just cache the, cache the.
Public and cache folder.
Again, this is the, we're going to use the uses field, again.
OK.
And next, we'll do actions/cache@v2.
And Tony brings up a good question in the chat. Any memory or storage issues with this? Does it end up costing you money or anything if you get a large cache?
Yep, the nice thing is for public repos, you have unlimited, I'm pretty sure for caching, you have unlimited support. When you actually do workflow runs on public repos, it is completely free. On private repos, you have a set amount of minutes you can get. So on the free accounts for GitHub, I think you get 2,000 minutes. But something to be aware of, those minutes differ based on the system. So if you're working with Linux, one minute is one minute. If you're working with Windows, one minute is actually two build minutes, and if you're working with macOS, one minute is 10 build minutes.
What?
Because they have their own hardware for Windows and Linux, they can run their own stuff. But for Mac, they use a third party service, so they amp that up for private repos.
Got it.
For public repos, it's completely free. So with these, I'll go search this up, but for the first action, the cache action, we want to give it parameters.
OK.
And the way we do this is you're going to give it a "with" keyword.
OK.
And now, we're going to put in some parameters. So the two parameters we want to use are back here, "path" and "key." So path is going to be what folder we want to cache.
An array?
No, we can pass just a, actually, I think, if I'm not mistaken, the cache action actually does have support for an array. So we could cache it all in one step. To be simpler, we're going to have two actions, two cache actions, to cache each separately.
OK.
Here, type in public. And then, for this key, we want to be able to cache what's in the contents of the public directory. We want to get a hash so this key will be used in subsequent builds. So if it finds something, it'll be able to pull it down.
Sure.
Or if we break the cache, it'll rebuild.
OK. Let's call it, let's give it a, like a keyword in the front. So maybe cache public and and we're going to put an expression in here to get a hash of the contents of the directory.
So is this a Bash expression?
Yes, and if you add one more curlies around it. Two on the left side, two on the right. Yep.
Do I need the dollar sign?
Yes.
Inside those two curlies, you're going to type "hashFiles." And this is a function. And we're going to put in a string, just going to do single quotes, of public/ and then star star. OK, this is saying kind of go in, find everything that's in this folder and hash it. And this will give you back a hash to kind of map to based upon what contents are in that folder. What does that give you?
OK. And then, I assume for the cache folder, I'm doing the same thing?
Yep.
Cache, cache. Whoops. OK. Spell check. Got it, cache, cache, cache.
There you go.
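A sketch of the two cache steps as dictated, listed under the job's steps. The key prefixes (cache-public, cache-cache) are guesses at what was typed, and .cache is assumed to be the "cache folder" because that is Gatsby's cache directory.

```yaml
# These items go in the job's "steps" list, after the checkout step.
- uses: actions/cache@v2
  with:
    # Directory to save and restore between runs.
    path: public
    # Cache key: a fixed prefix plus a hash of the directory contents.
    key: cache-public-${{ hashFiles('public/**') }}
- uses: actions/cache@v2
  with:
    # Gatsby's internal cache directory (assumed to be the "cache folder").
    path: .cache
    key: cache-cache-${{ hashFiles('.cache/**') }}
```

One thing to watch with this sketch: hashFiles returns an empty string when the pattern matches no files, so if public and .cache are not committed to the repo, the key evaluated right after checkout is just the prefix.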
All right.
So we have these, well, actually, have we run npm install yet?
Yes.
Up in the workflow, I mean.
No, we haven't.
I'll say, let's do this next. So we're going to do run and then colon. And this is going to be just a normal shell script. npm install.
OK.
Nikki, we are writing custom GitHub Actions to cache a site build. So we've got a big old Gatsby site, a bunch of images. And so far, we've given it a name. We're running that on this branch and we've got one job called build. We're using Ubuntu to run this and setting up our steps now. So far, we check out the repo and tell it we're going to cache the public and cache folders, and we're telling it to use the hash of the folder contents as a key. And we set up npm install and I think we're about to get to the fun part.
Yep, so we can do just another one of those again, of npm run build.
OK.
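The two run steps just described, sketched as they might appear after the cache steps. Each run value is an ordinary shell command executed on the runner.

```yaml
# Continuing the job's "steps" list, after the two cache steps.
# Install dependencies from package.json.
- run: npm install
# Run the site's build script (for a Gatsby site, this writes to public/).
- run: npm run build
```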
And now, this is where we want to go. Let's open up, let's get this to Netlify to get some stuff. OK. Get back over there.
Why? Why are you like this? All right. So I've saved this. You want me to do the drop?
Yep.
Let me open up, get a finder window, go to Learn with Jason and it was called image processing. And that's all we need. Let's drop that sucker in. And this is a big site, it's going to take a second.
So Jason, a question for you: is the site ID OK to be public, like, exposed? Or should it not be?
Site ID's fine, it doesn't really, I maybe wouldn't want it for super important site to have, if someone was going to hammer the API or something. But I don't care about what? Is it too big? Created a few seconds ago, might have just timed out. Still uploading. The site might be too big to straight upload. OK. Maybe what we need to do instead, let's use the same command.
Actually, I think the site, if we go into deploy settings, actually, not deploy settings, but the normal settings, if we go up to settings up there. I think we technically have a quote unquote site available to be able to push.
Yeah, we do.
We're going to copy it down. And this is actually we're going to put some secrets into GitHub Actions. If we go back to the Learn with Jason fork of the image processing site.
Here?
Yep, we're going to go to settings.
OK. If you go, scroll down, on the left, there's going to be a secrets tab. Click on that, and here is where we can kind of add environment variables that have been encrypted and will only be available at run time. So.
Did everything stop working? What's up?
Oh, so we would need to add code. The Netlify CLI takes in two fields. This will be capital letters, NETLIFY_SITE_ID, with underscores between the words.
OK.
If you click add secret, that will add it there. We'll also want to add another one. So do you have a Netlify access token?
I can get one.
Yeah. So grab an access token and then maybe move this off screen. That's not actually, that actually would be dangerous to show.
Yeah, that would not be good. So I'm going into my user settings. And then, under applications I'm creating a new personal access token, and I'm generating this token. What are you doing computer, why? Why are you like this? I'm going to create a new secret and you wanted this to be Netlify?
NETLIFY underscore AUTH underscore TOKEN.
And I'm going to paste this value in here?
Yeah, so pull it off screen and we'll replace it.
OK. Now, I have let's see, it's not visible good. That's here.
OK. Now, we have.
And to show you what I did, here's
You hackers, you, you dirty hackers.
Weird. I don't know, it's like maybe the time outs are causing issues. But yeah, so this, I just created this new access token. So if you want to create another one, you go in here and it's in your user settings up here. So what we're going to do instead now, we're going to go back, here's our site.
The site so you can cancel that. But the site is technically available. So we should be able to deploy it with our, with the CLI. So if you go down, back into our workflow.
Over here somewhere, there it is.
There we are. We're going to add a new one where we are going to, let's give this one a name just so, well, you can actually give each of these a name. So just describe that this is the, like, public cache, et cetera. But if we add a new one, and this can be for getting the Netlify CLI to deploy.
OK. So this one, give it a name.
Yeah. Give it a name.
We'll call it deploy to Netlify.
Yep, and add an environment. So we will have the run, but we also want env. And then, if we add two basic parameters to this field that are the same exact name as we put up on GitHub Actions, like, in the UI, so Netlify site and Netlify
Is it like this?
Without the dash up front.
Does that work?
These are going to be the environment variables on the runner when it builds. Now, we have to tell it where to find the various keys.
Oh, I was like, this doesn't look like valid YAML to me. I understand now.
Now, we're going to do the dollar sign with the two curlies, again. Similar to how we did.
YAML YAML YAML YAML JSON, JSON.
This is going to be the rest of my life, isn't it?
So if we do the dollar sign with the two curlies, this is what GitHub treats as just an expression. Now if we do secrets dot that exact name.
OK. And then, same thing down here. Secrets dot Netlify auth token.
You hackers, you, you dirty hackers.
That is what we called it? Auth token?
Yes. These are the two environment variables that the Netlify CLI will look for when you do a deploy.
Got it. OK. Now, we want to do a run. And we are going to do npx and netlify-cli. Deploy. And we can have two flags; one is dash dash dir, set that to be public. And at least in mine I have an equal sign.
I can never remember.
Yeah. And then dash dash prod, so that it actually deploys to production. And that should be it.
OK. Let's give this thing a shot. So I'm going to, let's take a look at what we've built. So we added this GitHub Actions folder. Now we've got our build.yaml. I'm going to git commit, call this "chore: add GitHub Actions for deployment." OK. And then, let's push. All right. So then, I go back out here, I think I've already got it open. I don't. Just open it.
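A sketch of the deploy step as described, assuming the two repository secrets were saved under the names NETLIFY_SITE_ID and NETLIFY_AUTH_TOKEN.

```yaml
# Final item in the job's "steps" list, after the build step.
- name: Deploy to Netlify
  env:
    # Environment variables the Netlify CLI reads, filled from the
    # encrypted repository secrets of the same names.
    NETLIFY_SITE_ID: ${{ secrets.NETLIFY_SITE_ID }}
    NETLIFY_AUTH_TOKEN: ${{ secrets.NETLIFY_AUTH_TOKEN }}
  # Upload the built public/ folder as a production deploy.
  run: npx netlify-cli deploy --dir=public --prod
```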
OK. If we go to the Actions tab. We can see now that the file is here. If we go into the actions tab, you can see we have a new action that has, well workflow that has been triggered.
And it's already running.
Yeah. It's triggered. This front page will show if there's artifacts, any, various things. But here, we can see now each step will start triggering. This may take a little bit.
Cache not found.
The cache wasn't found. So if you look, and it's going to run npm run build. You can see there's some new kind of steps that are actually added into here, the post run actions/cache. What that means is after, like, everything's said and done, it will actually save that into the cache. And it's actually something any action can do. You can basically set up, like, after the work load's kind of done, you can set up post steps or post commands to run afterwards.
Yeah.
Currently, it's going through.
So can we, I want to look at this, actually. I'm kind of curious how this works. Let's look at github.com/actions/checkout.
Yep.
This is what we loaded. Actions checkout. And it's going to be here?
So this actually is the action. This is the action itself. If you go into source, into source. It's just a bunch of TypeScript. Now, so the first, the initial version of the checkout action was internal. But they've open sourced this as well as the cache action.
OK.
So if you want to just look at the source code. This just here.
Oh, it does so much. OK. But we could've, if we wanted to, we could've written this like a TypeScript file?
Yeah. So if you saw in the, like the root of this project, there was a dist folder. By default, GitHub Actions doesn't install node modules and all of that stuff for actions for you. It has to compile this stuff down to JavaScript. As soon as it hits the JavaScript layer, it can run that for you.
Clicking that file was a mistake.
Oh, no. Maybe we'll just go over here.
I'm looking in the comments. So Nikki pushed a question. Does workflow get canceled if another commit is made that triggers the same workflow?
So no, it will just keep on adding more; it will add a new workflow run. So if you do another push right now, it would, technically, start a new run, and since we haven't cached the stuff, it would have to go through a full build again.
If you knew you had done that, you could come in and axe this one.
Correct.
And what's interesting, we named this. So we can choose what gets communicated over here.
Yep.
If we look up at the top here, we said cache built assets. We could've said something. We can, you know, we can be way more descriptive.
Yep.
And down here
We have jobs build up top. There, jobs build.
We could've named that whatever we wanted.
Yep. Now, something that is useful. Here, we just have one job. But the nice thing, you can actually run jobs in parallel. So for instance, on my site, I do some work where I build my open graph images. So the images that appear on Twitter or any other social platform, I build the images in parallel to one of my site builds. I don't need it to be done sequentially, I can do them in parallel, merge those two assets together and then deploy everything. So it's nice that you can do multiple things at the same time if, like, if there are things that are separate and don't need to depend on one another. They can all run at the same time. And then you can have jobs that depend on those various tasks to finish to trigger.
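A sketch of the parallel-jobs pattern just described, not Benjamin's actual workflow. The job names and the build:og-images script are hypothetical. Jobs without a needs field start in parallel, and a job with needs waits for the listed jobs to finish.

```yaml
jobs:
  # These two jobs have no "needs", so they start at the same time.
  build-site:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - run: npm install
      - run: npm run build
  build-og-images:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - run: npm install
      # Hypothetical npm script that generates the Open Graph images.
      - run: npm run build:og-images
  # Runs only after both jobs above have succeeded.
  deploy:
    needs: [build-site, build-og-images]
    runs-on: ubuntu-latest
    steps:
      # Placeholder: each job runs on its own runner, so a real deploy
      # job would first have to fetch the outputs of the earlier jobs
      # (for example, as uploaded artifacts) before deploying.
      - run: echo "merge assets and deploy here"
```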
Nice. This built so much faster than I expected it to.
OK. So so 169 seconds, that's the number we're going to be trying to see if we can
And theoretically, like, this site is almost entirely image processing. There are 18 seconds of nonimage processing time in this build. So if the cache works, we should see, you know, probably a 75% reduction in build time.
That's just a couple of files. Nice thing about Netlify, too, they do this hashing for you. If I don't change any files, it would diff against what's on the CDN, and it won't do anything; it'll say, yep, we're good.
We could push an empty commit to prove that.
Yep.
So apparently we got some of the way there.
Yeah. So hmm, I wonder why this looks like Azure DevOps. This is built on the same infrastructure that DevOps is built on which is another CI/CD provider.
And that makes sense because, it would be really odd for Microsoft to purchase GitHub and continue, maintain two completely separate infrastructures. It would make sense for those two to start growing together. And hopefully, they grow toward, like, the GitHub side of the UX flow. Because GitHub has been very good at making things developer friendly.
Mm hmm.
And Azure has been very good at providing services, but hasn't always been easy to onboard.
Yep.
What I think is interesting, you can see really earnest effort from the Microsoft team to close that gap. And I really liked seeing the way their team is actively seeking out and responding to feedback as opposed to being our way is fine, you just have to learn it.
With Azure DevOps, I like how they do their side. It's a little bit more GUI based, whereas if you go back to VSCode, we wrote a bunch of YAML. If we had an incorrect tab or space somewhere, well, YAML isn't the best format. It works, but it can be very finicky. Back in the day, it had a GUI interface where you put actions onto a graph based form.
Interesting.
And actually, something to look at here: you can see right here, in the post step for the action, all we're doing to compress this is just running tar to take those directories and literally compress them up.
Side note, I feel like TAR balls are one of the best named pieces of, like, CLI software. Because when I create a TAR ball, I immediately get a visual of what I've done. I took all of this stuff and just made it into a pile and now, I have my TAR ball. I don't know, that joke was funnier to me.
Now, let's go over to Netlify and see. So supposedly, it was deployed. So if we go over to Netlify.
Yes. We should be able to open this. And here we go, there's our tons and tons of images with each one having a full size site.
Yep.
Nice.
Yeah.
I just realized I pulled all bear images.
This is the panda site.
Yeah.
Oh, this is, like, weird. I thought this was Unsplash, but these are my photos, I found out. I shouldn't say this on stream. But lighting one of these things in Thailand now is like a capital offense.
Interesting.
Every single time. This is for a party that happens every year where the whole city lights off the lanterns, and you can see, this is not a great photo, but each one of these is one of the lanterns. This is effectively the largest party of the year, where everybody gets together and does crime together.
Kind of like in the United States with the Fourth of July. I think there's fireworks blown off until 11:00 or 12:00 on the Fourth this year here.
That's a little rough.
And some of the family went to sleep around 9:30ish. And guess what the neighbors literally right next door decide? Let's start some fireworks off.
Yeah. We've been getting last night, somebody lit off one firework at like 10:00 p.m..
Just a single firework?
Just the one. It was like, oh, we forgot to light this one. Let's do it real quick.
OK.
So this worked? Right? This did what we wanted.
Deploy the image? Add another image.
Yeah.
I think. So in the source, I forgot how I did this. I think I dropped all of the images in here. So all of these are images. Let's find an image, we'll go to Unsplash. And since you're on here, we're definitely using a Corgi.
We have to get a new Corgi.
Yeah, can't use the same Corgi I always use.
There's some good ones.
Corgis going on out here. That's the one. That's the Corgi, look at those. So here is another big old image. This is, oh, this one is actually not that bad. 438 kilobytes.
If we commit that, this will, should trigger another build and it should pull down that cache that we made previously.
OK. We're going to push, and now when I go back to GitHub. (Clears throat) what? Oh, the chat's frozen?
In the overlay.
Let me refresh it. Stream, blitz, where you at? Just refresh this page. I think something went weird with my chat. I feel like it was fine and then it really wasn't. That's the wrong one anyway. This is the one I need to refresh. There we go. Should be working, again.
People, throw some Corgis in the chat, just test this.
Let's look at actions, our new Corgi image. It is building. Look at the build job. It should be looking for cache now.
It's taking longer because it has to pull down that cache.
So it shows, displays content after. It partially shows it as it's going. It's not, like, a full realtime display.
Mmm. There it is.
Yep.
Here we go.
Cache restored. And did say the cache size. The public one would be the big one. The cache size is 233 megabytes.
OK.
That makes sense, the site is almost entirely images. It wouldn't be the .cache folder. We're not using the data relationships.
Yep. OK. And then, it's going to go through the process all over, again. Hopefully, when it hits the NPM run build, it should be somewhat faster.
Mm hmm.
OK. NPM install. So caching the node modules, you made it sound like that's a whole different kind of thing?
So you can, you can cache modules, but it's a thing where you don't want to cache node modules directly. If you're running this on multiple platforms, you may cause issues where maybe you build, like, sharp for Linux, and then it brings that file over to Windows.
Right.
It's probably going to blow up. So that's a lot faster.
Way faster. And this should go away faster, too, most of the images will be identical, so the hash will eliminate the upload.
150 seconds to build all of the images versus now two to add one more.
That went about six times faster, little under. Yeah, and then we only have to upload 13 files because, again, the image, the reason it's 13 files and not one is for the thumbnails, we're getting three different resolutions. And for the full size image, the fluid image, that's the remainder of the resolutions. It generates, like, small to big so that we can do the adaptive images. That's what Gatsby images uses.
Mm hmm.
Yeah, 13 files as opposed to the 2600 that were there. This part is because we use Gatsby with a warm cache. This is what inspired me to build the Gatsby cache. I think this is kind of the whole impetus behind Gatsby Cloud, the company: how do we make sure that Gatsby runs like it does when you're developing locally? It basically comes down to keeping the cache warm. That's 99% of the battle.
It did take a little more time to pull the cache down in subsequent runs.
So look at this, it's not saving the cache.
So that's cache cache. Oh, I'm not sure if that hash we had afterwards is working.
Yeah, doesn't look like it is.
Yeah.
But in other instances, if you're working with something like caching node modules, you can use the .JSON file. You can use those to get that hash.
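A minimal sketch of the lock-file approach mentioned here, assuming the ".JSON file" is a lock file such as `package-lock.json` and that the project uses npm. It caches npm's download cache (`~/.npm`) rather than `node_modules` itself, which sidesteps the cross-platform native-module problem raised earlier. The action version tag is an assumption.

```yaml
# Sketch only: assumes npm and a committed package-lock.json.
- name: Cache npm downloads
  uses: actions/cache@v4
  with:
    # npm's download cache, not node_modules, so platform-specific
    # compiled binaries are not carried between operating systems.
    path: ~/.npm
    # The lock file exists before the build runs, so hashFiles has
    # something to hash; the key changes whenever dependencies change.
    key: ${{ runner.os }}-npm-${{ hashFiles('**/package-lock.json') }}
```

This works because the hashed file is already in the repository at checkout time, unlike a build output folder.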
Does it give us an error when we try to do that? Run actions/cache. We straight broke it. Something's not right. Oh, I know why. We're trying to hash files that don't exist yet. The public folder doesn't exist until after the build. Would we need to flip the order? That wouldn't make sense. Tony, there is a, the way Netlify builds work: each build is atomic, meaning that when we go and look at the deploy log, we'll see that each one of these builds that we've done is a separate build. So we can see 480 files were uploaded. But because we don't want you to have to wait for the same files to be uploaded, we'll just copy all of the ones that didn't change from last time. Only new files get brought over. And that's what we see in the GitHub CLI here.
I think Tony's asking specifically on GitHub's side.
Right. That's what we see here, it hashes the files and checks to see whether any of them changed.
Yeah, on the whole workflow portion, I don't believe the builds are atomic. You have to be careful if you're doing something here like a site. If you're working on a site for, maybe, your team's marketing site, most people probably aren't all pushing to master at the same time. They're pushing PRs and merging them in. So that shouldn't really be an issue.
Yeah. So one potential issue here is, like, I think on our next build, we're going to be missing the images for this because they didn't get cached.
Yeah.
So let's figure out how we can do this.
Yeah.
I think, let's just throw another one in here. And I want to, I kind of just want to test this to see if it's going to be an issue for us.
Let me go through, dig through some docs.
OK. What we need to figure out is what information do we have that we can hash? Because I'm pretty sure, like, if I was going to place bets, my bet is that this folder doesn't exist. So when it runs hashFiles, it finds nothing, no-ops, and returns an empty string. So
Can you just on a whim, can you put spaces between right there.
Here?
Yeah. At least in the docs. I don't know if this is a.
I'll try anything. Let's give it a shot. Let's see. Commit.
Oh. Oh. Oh. Stop. You see that word hash files?
Yeah.
Capitalize the F.
Oh, yes. OK. I'm still curious how it would hash it without.
Well, so I think if that function returned nothing, the key would be cache-public- with nothing else.
So the expression would return an empty string. And then, we would have cache-public- and cache-cache- without anything after them.
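To make the failure mode concrete, here is a hedged reconstruction of the kind of step being debugged. The exact glob and the action version are assumptions, since the on-screen YAML is not in the transcript. `hashFiles` returns an empty string when its pattern matches no files, and the key is evaluated when the step runs, which is before the build has created the `public` folder.

```yaml
# Reconstruction for illustration; the real step on stream may have differed.
- name: Cache Gatsby public folder
  uses: actions/cache@v4
  with:
    path: public
    # On a fresh checkout, public/ does not exist yet, so the pattern
    # matches nothing and hashFiles returns an empty string.
    # The key therefore evaluates to just "cache-public-".
    key: cache-public-${{ hashFiles('public/**') }}
```

Because every run then computes the same key, the second run finds an exact hit on it, and the action skips saving when the primary key already matched. That would explain the "not saving cache" message seen in the logs.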
Let's put that up. And what I will do, I will go to actions. I'm going to skip this one.
Yep, we can cancel that. Go ahead, if you go into load and click cancel workflow. Yeah, it takes a couple seconds, but it should just cancel. So if you go back to actions.
Then, this is the one that should be running. So we can watch this one. And let's see what happens.
Still doesn't
So those are the keys from previously. It does a thing of, hey, it does a glob rather than a full match.
I understand. So that what I was thinking, that key had to get created at the beginning. That is where it runs in the steps. That is basically treated like a callback. When it creates the cache, it's going to run. I understand. That makes more sense. What we should see in here is the post run, that's when we'll see it generate that key?
Yeah, it should be.
Got it.
Actually, scroll up to the top. We can look at it inside. Well, in the click on the run actions. Click on the first line of those, it'll show us the details.
Still shows nothing there. But I think we'll see it once we get down here. And this should happen nice and fast, hopefully. I don't think this got to the build part.
There you are. So I think it's, it should have built two images. Look in there. 18. That would be the two new images. OK. So that's good to know. Gatsby builds the images it doesn't have. I thought it was going to cache, all of the images and just wouldn't do those. That's cool. Yeah, we're in good shape here. Hashing files, 13 files, hopefully, that's, for two new images, that's concerning.
Netlify already had that one image.
Right. Right.
Kind of already has its cache. Of the build artifacts.
Got it. "Did you try camel casing your YAML?" We're not doing camel YAML. There we go, caching. Still says not saving cache. Still says not saving cache.
This is just up there.
Cool.
That's mine.
Anything else you want to try? I think I saw BW in the chat. I don't know if he's looking at the screen. Like, I know what you're doing wrong. Better fix this. Yeah. So I think. Well, let's make sure the builds are working. The biggest concern would be if we're, like, missing pieces. Those load the way we expect. Full size image loads. Are you supposed to eat that?
Is that the foreground or the background?
No he's eating those berries. Not allowed, those are probably poisonous. That was probably a messy day for that dog's owner. OK. So at the very least, it is, like, the biggest part is working. The downside is that if this was a photo sharing site, when we get another 500 images down the line, we're going to be processing the 500 images. So if we want to yeah, everything's saved. We're saved here. Up to date. So
This is a little finicky.
I don't need to do this glob?
I don't think so.
Other things that they expect people to cache by default are things, like, NPM node modules.
Generated stuff.
Stuff with a lock file of some kind.
Mm hmm. That's easy. As soon as the, it's a single file.
Yeah.
This has got me thinking about how you can do this. It would be one thing if this was just static files, right? So for this particular site, we could do something like grab the source folder and hash it and use that as our key. But, most Gatsby sites are not just using the source folder, they're also pulling in from an API from whatever. So we wouldn't be, that would give you false caches. That was my thought, too, but apparently that's not necessary. So, like, maybe we give it a shot. I'd be surprised if we have to do it like that one. Let's see what happens. Why not, right?
So another thing.
I think it was Tony asked previously, about sizes of caches.
Mm hmm. You could have a cache up to 5 gigabytes. That's the limit. And once you go over that, it would evict the least recently used cache. And if a cache isn't accessed within a week, it'll also kind of be cleared out. Objects, those are doing what we want.
All right. Get me something.
The thing that gets me most is it's dropping that expression entirely from the key generation. Which makes me worry that something we're doing is causing GitHub to ignore it. Let's go to the phones. hashFiles. Oh, wait. So for hashFiles, the path is relative to the GitHub workspace.
Yeah, that's the repo by default. Matches any package lock. Matches any sensitive lock. So is there a way to what if we use the commit and take the most recent cache and drop it in?
So
The actual commit hash. It'll pull the most recent. That's what I do in the build plugin.
Yeah, let's try that.
How do I get that?
Let me get that. Let me go get that for you. There's a context that GitHub provides you.
OK. And you are absolutely correct, Nikki, that is what all of my CI debugging commits look like.
OK. Replace hashFiles in both of those instances with github.sha.
Like that?
Yeah.
Here we go. This is going to be the one. I thought there was a CI prefix for this. I've got to figure out why some of my sound effects aren't working. It's ironic that one doesn't. Don't get it. Is it over here? All right. So we have done the thing, we're in here. I think I need adult supervision here.
Why is it working for some and not others?
I don't know what it was. All right. So now we've got the keys unique every time. So what should happen because we're basically saying, like, every build will update the cache in some way. Then, we're just I don't know how they managed it. They're going to drop the old cache and replace it with the new one so we will always have the most recent build cache stored.
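The change being described can be sketched roughly as follows, assuming the two cached folders are Gatsby's `public` and `.cache` directories and using the key prefixes read out on stream. `github.sha` is the commit SHA that triggered the run, available from the `github` context. The action version tag is an assumption.

```yaml
# Sketch: keys are now unique per commit, so every run saves a cache.
- name: Cache Gatsby public folder
  uses: actions/cache@v4
  with:
    path: public
    key: cache-public-${{ github.sha }}

- name: Cache Gatsby .cache folder
  uses: actions/cache@v4
  with:
    path: .cache
    key: cache-cache-${{ github.sha }}
```

One caveat worth hedging: as far as the cache action's behavior goes, an existing entry is not overwritten in place. Each new key creates a new entry, and older entries are removed by the eviction rules mentioned earlier. On their own, these keys also never match on restore, since no earlier run used the same commit SHA.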
Oh, and something we'll need. Let's add one more thing. We'll add the second thing in just so it'll know how to glob downward. We'll let this build in the background. If you open up our profile, again.
OK. Add a third parameter to both of these called restore keys.
Like that?
With a dash in the middle. And then put a one second. If you just put a bar and then a new line. So a
A bar.
Not helpful to italicize that.
And type in the prefix of these caches: cache-public-.
OK.
That's saying, this is like, if it can't find the full thing, it'll start working downward. We'll save this. Let's check how the
Queries should be down here in a second. Now, it did say "cache not found for input keys." So I think since we're getting a newer, fuller cache.
Oh, no.
It may be from scratch.
It definitely is rebuilding from scratch. So far that doesn't do what we want. And I bet we can't find the previous commit.
Well, actually
To drop that back in there.
No. This will be all right. Because if we keep this, it's saying the cache with the most recent creation date will be used. When it comes in here, it'll say, hey, do any of the caches have a cache-public- prefix? And by the time we do the second one, it'll be like, yeah, we found one, and then, like, we'll find some.
So this gets used as a prefix. Not like it doesn't have to be an exact match for this?
Yeah. It'll be a prefix. And it'll find the one with the most recent creation dates. That's what we want.
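Putting the pieces from this exchange together, a sketch of one cache step with a fallback. `restore-keys` takes one prefix per line (the "bar and then a new line" is YAML's `|` block scalar). If there is no exact hit on `key`, the action restores the most recently created cache whose key starts with one of the prefixes. The action version tag is an assumption.

```yaml
- name: Cache Gatsby public folder
  uses: actions/cache@v4
  with:
    path: public
    # Exact key: unique per commit, so a fresh cache is saved after each run.
    key: cache-public-${{ github.sha }}
    # Fallback: treated as a prefix, not an exact match. The newest cache
    # whose key begins with "cache-public-" is restored instead.
    restore-keys: |
      cache-public-
```

The first run after adding this still starts cold, because no cache with that prefix has been saved yet. The benefit shows up on the run after that.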
OK. So then that, so this ran, I'm actually kind of blown away by how much power these GitHub Actions boxes have.
It's actually quite impressive.
Yeah. There's a lot of compute resources.
If anyone works with Azure, these are a Standard_DS2_v2 virtual machine. What that means in reality is you have a two core CPU, 7 gigs of RAM, and a 14 gig SSD.
That's a lot.
It's a pretty beefy VM that you get access to for free.
That's wild. So this is almost done. While we're waiting for that, we can go out here and look at the other one. Which I believe is running in parallel. It is.. Come on, do what I want.
It's sitting around. I think may have found that cache. Maybe.
It hasn't saved the previous. Oh, damn it. We'll have to do one more of these. While we wait for these to build, then, let me get one more.
Maybe cancel that, the one that was
I can do that, yeah. I'll cancel this one.
OK.
That one got built. So now, there should be a cache available. And I'm going to add one more Corgi. Let's add this one. That's how I feel about. That's how I feel about caches are doing with us today. Gloomy corgo. I just realized we have used all of this photographer's. Great, topnotch Corgi photography coming out of Alvin. Let's see, commit. OK. Now we'll push, and what we should see is, when we go back to our actions, we should see a new Corgi. We'll go into the build. Fingers crossed, we should see the cache get pulled that matches the hash and not just the prefix with no hash.
Yep.
Come on.
This is our last chance. If it doesn't work, we've got to call it. I believe in us. Was it a miss?
No, so that's there's a cache. So if you try opening up the so I think.
This is our new one.
Yep. And we don't need to specify that as a glob or something?
Well, it kind of is a glob. It's saying, if any of them have, start with that, then it will go grab the newest one.
OK. I want to look at this one that worked. And so, this one add.
So it's the hash of that was 54 zero.
OK.
The first one.
Yeah.
I don't know if it says the file name, but
It does save this cache.
Yeah, saves a cache. Now, if we go back to our new run.
And our new run in the build.
We technically only should see one.
Cache restored from key.
Yeah, if we go to NPM run build, we should only see about nine new images rather than 27.
So I think it pulled the most recent actual cache-cache instead of the cache-cache with hash. Cache cache hash. So we can do digging on this and figure out what happened. And we may have messed ourselves up by creating an exact match, which would be an issue with the restore-keys in and of themselves. Unfortunately, we are out of time. But, what we can do is we can try. OK. I'm going to try one more thing because I really, really want this to work. Let's do a V2, and we don't have time to build twice, though. OK. So here's what we're going to try. We'll do this off screen and we'll commit the results. But my theory is that because we won't have a bare cache with this name, that might trigger the fallback correctly, as opposed to what we were doing before, which was hitting an exact match.
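One way the "V2" idea could look, as an untested sketch: changing the prefix means none of the earlier caches, including the bare one saved without a hash, can match either the key or the fallback. The `v2` naming and the action version tag are assumptions; the result of the off-screen attempt is not in the transcript.

```yaml
- name: Cache Gatsby public folder
  uses: actions/cache@v4
  with:
    path: public
    # New prefix: older caches saved under "cache-public-" no longer match.
    key: cache-public-v2-${{ github.sha }}
    # Fallback only considers caches saved under the new prefix.
    restore-keys: |
      cache-public-v2-
```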
Well, as they say, cache invalidation is one of the hardest things in computer science. This is why.
Cache cache invalidation. And with that, I think it's time for us to call it. So Benjamin, thank you so much for coming on the show today. This was super fun. I feel like we learned quite a bit. Like, GitHub Actions really are, they feel like a Swiss army knife. You can do a lot with them. So if somebody wants to learn more, where would you send them?
Yes, so if you go to docs.github.com, which is actually fairly new for GitHub. But if you click on that Actions tab on the left, this will give you a huge amount of documentation about GitHub Actions. And if you read on the left side, or at least up in front, there's stuff talking about workflows. It'll have stuff if you're working with specific languages. If you are coming from a platform like Travis CI, or coming from another CI vendor, they'll talk about that. It goes through all of these things, like if you want to build your own action.
Nice.
If you actually don't want to use their servers and want to use your own servers, you can do that. You can run.
Yeah. Self hosted. Yeah, yeah. Very cool. And if people want to keep up with you, where should they go?
Yeah, so. If you go to my Twitter, lannonbr. Yep.
Nice. Well, thank you, again, and another shout out to White Coat Captioning for doing the live captions today. I'm so, so happy we're able to make this show a little more accessible for everybody. And that is made possible by our sponsors. We've got Netlify, Auth0 and all kicking in to make this more accessible. And thank you, chat. That is always a great time. With that, I think we're good. So chat, stay tuned, we're going to raid. Ben, thank you so much for coming on. We'll see you next time.
You're welcome.