Intro
What is SmartThings
Eras of Our Architecture
What Failed for Us
Patterns Working Across Eras
Will that Monolith Ever Really Die
To the people around you:
What company or organization are you from?
What are you all looking to take away from this case study?
Jeff Beck
Software at SmartThings
@Beckje01
A platform for IoT: connect and automate all your devices.
Multiple Mobile Clients
Many Connected Devices
Open API
>150 μ-services
Java, Groovy, Kotlin, Scala, Rust, JavaScript, Swift
Grails, Ratpack, SpringBoot, Micronaut, Dropwizard, …
The architecture changed over time; the lines between each era are blurry.
One service in the cloud for all things to connect to.
First iteration: make something work with the least amount of effort.
Quick/Simple Deployment
Debugging was simple: one place to look.
Hard to solve all connectivity with one codebase
Mobile Client
Website
Hubs (Devices)
When we had lots of different things connected in different ways.
Some extra services were stood up to deal with special connections, but they could not operate without the monolith.
Needed special connections to the phone and hub, but didn’t want to split up the core business logic.
Correct Technology for each use case
Simple Deployment
Debugging business logic was always in the core, connectivity in the other services.
Harder to test a limited set of features in an environment
The team was getting bigger and starting to feel crowded in the codebase
Coupled features
Starting to hit scaling issues for the core service.
How to support a global rollout of our platform.
A small set of services available globally, with our core deployed in different geos.
Support multiple geographies at once.
Minimal change to support business needs.
Global services were new to us, so we limited how many we needed
Quick escape valve for scale issues: just add another shard.
Minimal changes to mobile clients and hubs.
Pulled out Auth / User to a Microservice
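As a rough illustration of the geo-sharded era, the sketch below shows how a small, globally deployed lookup could map an account to the geo shard of the core that owns it. The class, shard names, and URLs are hypothetical assumptions, not SmartThings code.

```java
import java.util.Map;
import java.util.Optional;

/**
 * Hypothetical sketch of the "global service, sharded core" idea:
 * a tiny globally deployed lookup that tells a client or hub which
 * geo shard of the core platform owns its account. Shard names and
 * URLs are illustrative only.
 */
public class ShardDirectory {

    // In practice this mapping would live in a replicated datastore,
    // not an in-memory map.
    private final Map<String, String> accountToShard;

    private static final Map<String, String> SHARD_BASE_URLS = Map.of(
            "na01", "https://na01.api.example.com",
            "eu01", "https://eu01.api.example.com",
            "ap01", "https://ap01.api.example.com");

    public ShardDirectory(Map<String, String> accountToShard) {
        this.accountToShard = accountToShard;
    }

    /** Returns the base URL of the core shard that owns this account. */
    public Optional<String> baseUrlFor(String accountId) {
        return Optional.ofNullable(accountToShard.get(accountId))
                .map(SHARD_BASE_URLS::get);
    }
}
```

Under this assumption, the "escape valve" is just another core deployment plus another entry in the shard table.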
Couldn’t scale different operations in different ways
Too many conflicting changes when trying to experiment.
Couldn’t scale different workloads in different geographies
Team has grown too big to share a codebase
Team is now distributed around the globe
Breaking up the Monolith is now a goal.
We created these categories of microservices so that services would share common operational patterns.
Teams were just starting to take over on call
Wanted to allow teams in a timezone to build their own service
Easy to reason about different services operationally
Creation of paved path
Reasoning about the whole system's functionality is getting harder
Fitting new workloads into the existing buckets is awkward
When is it worth creating a new "bucket"?
We outgrew the need for making services operate the same way.
Refocus on letting teams deploy and design how they want.
Flexibility to teams
Unblock teams' thinking
React to evolving regulations by changing service / feature deployment architecture
Cost optimizations
Building common tools to support most of the use cases
Observability
What should be a paved path?
We will let you know.
You should try something other than these ideas.
Too rigid
Limited engineers' thinking about solutions
Too costly
Sometimes missed the mark on primitives
Once the system got advanced enough, the automated architecture view was no longer useful.
No matter what era of architecture, these work for us.
A drillable diagram: high level, moving down into the details of functional areas.
A service that maintains a connection ideally has zero logic outside of the connection.
Allows deployments to not disconnect hubs/devices
Small and easy to maintain, hopefully mostly off the shelf.
Allows moving URLs without rewriting all destinations.
Allows telling a coordinator where to route
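To make the connection-service idea concrete, here is a minimal sketch with made-up names rather than actual SmartThings APIs: the connection service only holds the hub/device socket and asks a coordinator where each message should go, so a destination can move without every connection service being rewritten.

```java
import java.net.URI;
import java.util.Map;

/**
 * Hypothetical sketch of a coordinator that connection services consult
 * for routing decisions. Deciding where a message goes, and where that
 * destination currently lives, stays in one place, so destinations can
 * move without touching the services that hold the connections.
 */
public class RoutingCoordinator {

    // Logical destination -> current location; updated when a service moves.
    private final Map<String, URI> routes;

    public RoutingCoordinator(Map<String, URI> routes) {
        this.routes = routes;
    }

    /** Where should a message of this type be forwarded right now? */
    public URI destinationFor(String messageType) {
        URI target = routes.get(messageType);
        if (target == null) {
            throw new IllegalArgumentException("No route for " + messageType);
        }
        return target;
    }
}

/** A connection service only forwards; it holds no business logic. */
class HubConnectionHandler {
    private final RoutingCoordinator coordinator;

    HubConnectionHandler(RoutingCoordinator coordinator) {
        this.coordinator = coordinator;
    }

    void onMessage(String messageType, byte[] payload) {
        URI destination = coordinator.destinationFor(messageType);
        // forward(destination, payload): transport details omitted.
    }
}
```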
Build tooling looking only one era ahead, and plan on deprecation.
Spinnaker with 5 microservices is too heavy.
Spinnaker with 100+ works great.
Creating a documented, tooled path that covers your 80% case.
Auth systems are so important to everything you do that you should always break them out first.
Allows adding new services without interacting with the old monolith.
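A small sketch of why this pays off: once auth stands alone, a brand-new service can validate a caller's token against the auth service directly and never touch the monolith. The introspection endpoint and response handling below are illustrative assumptions, not the real SmartThings auth API.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/**
 * Hypothetical sketch: a new microservice checks a bearer token with the
 * standalone auth service instead of calling into the monolith. The URL
 * and body format are assumptions for illustration.
 */
public class TokenValidator {

    private static final URI INTROSPECT_URI =
            URI.create("https://auth.example.com/oauth/introspect");

    private final HttpClient client = HttpClient.newHttpClient();

    /** Returns true if the auth service reports the token as active. */
    public boolean isValid(String bearerToken) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(INTROSPECT_URI)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString("token=" + bearerToken))
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        // A real check would parse the JSON body; this sketch only looks at
        // the status code and an "active" flag in the raw response.
        return response.statusCode() == 200
                && response.body().contains("\"active\":true");
    }
}
```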
You have tons of new services but that old big one is still there…
Does the rest of the system grow to eclipse it?
Can we still operate it?
Can we isolate it off behind some cleaner services?
If you can still operate this system and not everything depends on it, you can effectively keep the monolith in a corner.
When you start seeing outages that don't cascade.
When deployments of the service graph don’t matter to every team.