Looking Back on our TechTalkThursday #11 and #12

Written by Thomas Hug | 9/22/20 1:15 PM

Besides our regular evening TechTalkThursdays, we tried two new time slots: one during lunch and one at 08:00 in the morning. Neither drew as many participants as the evening sessions, so neither proved to be a better option.

In this article we summarize the talks given by Demian Thoma and Daniel Lorch.

How a Titan empowers our Cloud Monitoring Infrastructure

Nine hosts and manages thousands of servers for its customers and recently moved to a new monitoring solution based on the open-source tools around Prometheus. In his talk, Nine’s Demian Thoma explains how Nine implemented the new monitoring solution and how it gave the team more insight into their infrastructure.

Nine used Nagios before switching to Prometheus. Changing the monitoring stack allowed them to simplify the setup, gain more insight into their services, and retire a separate analytics stack.

https://youtu.be/C4PY3ADkxaQ

Site Reliability Engineering: What you need to know about Service Level Indicators, Service Level Objectives and Error Budgets

What does reliability mean to you? In his talk, Daniel Lorch reiterates the claim that reliability is the most important feature of any system. But services only need to be reliable enough to keep their users happy: investing too much in reliability results in higher cost (engineering time and infrastructure) without added benefit. Investing too little, on the other hand, results in unhappy users.

How do you determine and agree upon what “reliable enough” means for your services and your organization? Site Reliability Engineering provides tools and concepts to formalize this discussion, notably:

Service Level Indicators (SLIs): a monitoring metric that is indicative of a user’s goal
Service Level Objectives (SLOs): a target for an SLI that, if just barely met, keeps the users happy
Error Budgets: the maximum amount of time the system can fail without contractual consequences. It is the remainder of the SLO, that is, 100% minus the SLO target
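To make the relationship between these three concepts concrete, here is a minimal Python sketch. It is not taken from the talk: the `Slo` class, the function names, the 99.9% target, the 30-day window and the request counts are all hypothetical values chosen for illustration. It treats the error budget as the remainder of the SLO, once in terms of time and once, as an assumed variant, in terms of failed requests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """Hypothetical SLO: a target ratio over a rolling window."""
    target: float      # e.g. 0.999 means 99.9% of events must be good
    window_days: int   # length of the evaluation window in days


def availability_sli(good_events: int, total_events: int) -> float:
    """SLI as the ratio of good events to all events (1.0 if no traffic)."""
    if total_events == 0:
        return 1.0
    return good_events / total_events


def error_budget_minutes(slo: Slo) -> float:
    """Time-based error budget: the share of the window not covered by the SLO."""
    window_minutes = slo.window_days * 24 * 60
    return (1.0 - slo.target) * window_minutes


def budget_remaining(slo: Slo, good_events: int, total_events: int) -> float:
    """Fraction of the request-based error budget still unspent.

    1.0 means nothing spent, 0.0 means exhausted, negative means overspent.
    """
    allowed_bad = (1.0 - slo.target) * total_events
    actual_bad = total_events - good_events
    if allowed_bad == 0:
        return 1.0 if actual_bad == 0 else 0.0
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    slo = Slo(target=0.999, window_days=30)  # assumed example values
    print(f"Error budget: {error_budget_minutes(slo):.1f} minutes per window")
    print(f"SLI: {availability_sli(999_500, 1_000_000):.4%}")
    print(f"Budget remaining: {budget_remaining(slo, 999_500, 1_000_000):.0%}")
```

With these assumed numbers, a 99.9% target over 30 days leaves roughly 43 minutes of allowed failure, and 500 failed requests out of a million would use up about half of the request-based budget.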

Watch the 30-minute talk below to learn about these concepts and to see how an example SLI/SLO is defined for a fictitious game platform. Links to further information are provided at the end of the talk.

https://youtu.be/PLc2QoYh7I0

On this occasion, we would like to once again thank our speakers for presenting!

Find future TechTalks on our Meetup page:

TechTalkThursday

Zürich, CH
