
About WIP's recent downtime

Just wanted to give you all a quick explanation of why WIP was down today and earlier this week.

I've been in the process of switching many of my smaller apps from Render to DigitalOcean, deployed through Kamal. While I'm still a big fan of Render and I think it's great for hosting profitable sites, its pricing structure can become problematic when you have many smaller apps, each with multiple services. All those small monthly costs start to add up.

Migrating these apps to Kamal went well. It definitely took some effort getting acquainted with Docker, etc., but I was learning fast and getting a better grasp of what it takes to run your own apps rather than relying on the abstractions of platforms like Render and Heroku. Valuable experience to have as a developer.

I've been running these apps on Kamal for a little while and all seemed okay.

That's why I decided to try and migrate WIP as well. It would allow me to reduce costs and increase CPU/memory resources, making the site faster.

Unfortunately, migrating WIP, a more complex app than the others I've migrated so far, is proving to be more difficult.

The promise of Kamal is that it lets you deploy apps easily without much devops experience, but it's still early software and I'm starting to run into the rough edges.

There is a nice community forming around it, but it's still young, so a lot of stuff you still need to figure out yourself.

Anyway, long story short, sorry for the downtime. I realize how frustrating it must be, especially considering our streak feature.

If you lost your streak because of the downtime, please add your todos now and email me a link to them with the rough time you completed them. I'll backdate them for you to fix your streak.
 
I haven't decided yet whether to continue running WIP on DigitalOcean through Kamal or revert to Render. I think I'll give it a few days, and if things stabilize, I'll keep it hosted as is. But if things keep going down, I'll bring it back to Render.

Thanks for your patience during all of this!


Thanks for the update! Since we can post updates via Telegram and streaks are secured that way, in my opinion you should keep working on the new setup to make it stable. I have no problem with some downtime if there is a chance that the speed improves while your costs are cut down.

Appreciate that! Yes, the Telegram bot is a nice backup option for those that use it.

Thanks for letting us know! And I really appreciate that you support Telegram, unlike Makerlog, which will deprecate the bot :(

Oh, that's surprising. I wonder why?

The (new) founder, John, said that the user base isn’t as active as on Slack (though I doubt it is that much higher), and he is rebuilding the project. As far as I understood, the community can make their own bots, but no official ones in Telegram.

I mean, it’s his own project, all in all, so I hope someone will make a TG bot at some point.

Again, gladly, WIP has an official bot hehe!

Loved that you wrote this down to share what happened. Appreciate it Marc 🤝

Thanks for the update. I haven't looked at Kamal in much detail, but it looks like a wrapper around Docker.

I use Docker and Compose a lot, and assuming Kamal allows you to pass through extra options, you should be able to throttle the CPU/memory to keep a process from running away with your droplet. That might be worth doing in case you keep seeing spikes. It won't prevent downtime for that service, but it'll keep the box and other containers running.

Also, if you aren't running DO's managed databases, please save yourself some grief and do that vs. trying to run it in Docker.

100% using managed databases. Don't trust myself to run those myself haha. At least with the web server, I know I can just nuke it and reset it with a few commands without any data loss.

And yes, Kamal lets you configure resource limits using Docker's config. I think it's called cap? I haven't looked into it yet, but it seems like a good idea. So maybe set each container to use a maximum of 80% CPU usage, just so it never takes down the full machine? (assuming the other containers don't use up the remaining 20% at the same time)

I have never used cap, but I have used resources with limits + reservations via Compose, which should just be passing those through as CLI options: docs.docker.com/compose/compo… - I would ChatGPT this to see what the args are 🤣

80% or 90% should be good enough to keep control of the box. I'm lazy and I have two or three dozen containers running without issues on a 4 GB droplet. Worst case, you can narrow it down to the service.
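
For anyone wondering what that cap looks like as `docker run` options, here is a minimal Python sketch. It assumes the goal is to hold one container to a fraction of the droplet. The helper name `docker_limit_flags` and the droplet size in the example are made up for illustration; `--cpus` and `--memory` are standard `docker run` flags. How Kamal forwards such options is not shown here, so check its docs before relying on this.

```python
def docker_limit_flags(cpu_fraction, mem_fraction, total_cpus, total_mem_mb):
    """Turn 'use at most X% of the box' into docker run resource flags.

    Hypothetical helper for illustration only; the fractions and machine
    size are whatever you decide, not values taken from WIP's setup.
    """
    if not (0 < cpu_fraction <= 1 and 0 < mem_fraction <= 1):
        raise ValueError("fractions must be in the range (0, 1]")

    # --cpus accepts a decimal number of CPUs, e.g. 1.6 on a 2-CPU box.
    cpus = round(total_cpus * cpu_fraction, 2)
    # --memory accepts a size with a unit suffix; 'm' means megabytes.
    mem_mb = int(total_mem_mb * mem_fraction)

    return [f"--cpus={cpus}", f"--memory={mem_mb}m"]


if __name__ == "__main__":
    # Assumed example: 80% of a 2 CPU / 4 GB droplet.
    print(" ".join(docker_limit_flags(0.8, 0.8, total_cpus=2, total_mem_mb=4096)))
```

These are hard ceilings per container, so several containers that each get 80% can still saturate the box together if they all spike at once.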

Thanks a lot for sharing, a great opportunity for all of us to learn from your experience.

And you can count on us to support and be patient too.

Sounds fun, challenging, and worth it. It's one of my favourite pastimes to tinker with configuration and DevOps. In the past I went more bare metal, with LUKS encryption and ZFS pools to be able to take snapshots, back up everything to an off-site location, and do all the updates myself.
In some cases it can provide advantages depending on the requirements, at the cost of running up hours spent on maintenance 😊

Managed VPS is a concept that can provide the sweet spot between cost and effort. It's why Laravel Vapor has been amazing in the past with its amazing scaling.

Even when starting out with a staging environment, a lot of experience will only present itself with real usage and customers.

Good luck, have fun and don't hesitate to ask, you have a community full of people with knowledge and willing to help 😅

Thank you! Appreciate it. And yes, production always ends up presenting new challenges somehow haha

Best downtime message I've ever read. Marc is showing how to be transparent

Thanks for sharing, @marc, appreciate the transparency!