Notes on creating microservices-based applications
This post is a collection of tips and notes I gathered while working on microservices-based applications over the last couple of months.
The notes are divided into sections that focus on different areas of developing and running your services.
I decided to keep these notes low-level and focused on specific problems. For a higher-level overview, see The Twelve-Factor App.
Project Setup
- Each service should be a self-contained project, hosted in a separate repository.
- The microservices shouldn’t have any code-level dependencies on each other.
  - For example, they shouldn’t depend on each other at build time.
- All shared dependencies should be factored into separate libraries.
  - Keep these libraries as small as possible.
- Ideally, the only dependencies you have should be the open source libraries that you use.
  - As a workaround, you can also open source your own libraries.
- The README.md should briefly describe what the project does and list the steps needed to start developing it.
  - Ideally, include instructions on how to run the project inside a Docker container.
    - This helps other developers, and it also helps down the line if you use something like Kubernetes.
- Once you adopt Docker as the main tool for deploying code, create a repository in ECR or Docker Hub to host your container images.
Specification
- Apply API-first principles.
- Use a widely supported tool like RAML or Swagger to design your API endpoints and schemas first.
- Implement new endpoints iteratively, replacing static example responses with live endpoints.
- Set up infrastructure to validate your schemas.
  - Integration testing seems like a good place for this: a “proxy” can validate traffic against your schemas while the tests run.
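As a rough illustration of checking responses against a schema during integration tests, here is a minimal Python sketch. The USER_SCHEMA shape and the schema_errors helper are hypothetical, and the check only covers required fields and their types. In practice the schema would come from your RAML or Swagger definition, and a full validator would replace this stand-in.

```python
# Hypothetical schema: field name -> expected Python type.
USER_SCHEMA = {"id": int, "name": str, "email": str}


def schema_errors(payload, schema):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for field, expected_type in schema.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            actual = type(payload[field]).__name__
            errors.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    return errors


# In an integration test, the payload would be the decoded response body.
response_body = {"id": 1, "name": "Ada", "email": "ada@example.com"}
assert schema_errors(response_body, USER_SCHEMA) == []
```

Returning a list of errors instead of raising on the first one makes test failures easier to read when several fields drift at once.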
Implementation
- Make sure that you handle error responses from other services or applications you depend on.
- Make sure that you set the correct response type (the Content-Type HTTP header).
- You should also handle API versioning; ideally this should be done at a higher level as well.
- Add support for X-Trace-Token, and make sure to pass it along when you make further HTTP requests to other services.
  - Also add X-Trace-Token to all log messages.
  - Ideally, you could set up a Zipkin-like service to help with that.
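Passing the trace token along could look roughly like the following Python sketch. The helper names (get_or_create_trace_token, outgoing_headers, TraceLoggerAdapter) are made up for illustration, and generating a fresh token when none arrives is an assumption about how you want traces to start.

```python
import logging
import uuid

TRACE_HEADER = "X-Trace-Token"


def get_or_create_trace_token(incoming_headers):
    """Reuse the caller's token if present, otherwise start a new trace."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    for name, value in incoming_headers.items():
        if name.lower() == TRACE_HEADER.lower() and value:
            return value
    return uuid.uuid4().hex  # assumption: any unique string works as a token


def outgoing_headers(trace_token, extra=None):
    """Build headers for a downstream request, always carrying the token."""
    headers = dict(extra or {})
    headers[TRACE_HEADER] = trace_token
    return headers


class TraceLoggerAdapter(logging.LoggerAdapter):
    """Prefix every log message with the trace token."""

    def process(self, msg, kwargs):
        return f"[{TRACE_HEADER}={self.extra['trace_token']}] {msg}", kwargs


token = get_or_create_trace_token({"x-trace-token": "abc123"})
log = TraceLoggerAdapter(logging.getLogger("orders"), {"trace_token": token})
log.warning("calling inventory service")
headers = outgoing_headers(token, {"Accept": "application/json"})
```

With the token in both the outgoing headers and the log prefix, you can search your logs for one token and follow a request across services.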
Monitoring
- Your services should have a standard health check endpoint.
  - Standardize what data is shown there.
  - The format should be readable by your monitoring infrastructure.
  - During a health check, the service should send ping requests to all services it depends on and report the status of those connections.
- You should also have tools for instrumentation and metric collection.
  - Tools like Prometheus, New Relic, Grafana or similar can be very helpful here.
- Logs should be written to standard output.
- Error logs should be written to standard error.
- These logs should be captured by the tooling around your Docker containers (like Kubernetes) and forwarded to Kibana or a similar tool.
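A health check that pings dependencies and reports their status could be sketched like this in Python. The report layout ("status" and "dependencies" keys, "up"/"down"/"ok"/"degraded" values) and the dependency URLs are assumptions made for the example, not a standard; what matters is that every service uses the same layout.

```python
import json
import urllib.request


def ping(url, timeout=2.0):
    """Return True if the dependency answers with a 2xx status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except Exception:
        # Any failure (timeout, refused connection, HTTP error) counts as down.
        return False


def health_report(dependencies, check=ping):
    """Ping every dependency and summarize the result as a plain dict."""
    statuses = {
        name: ("up" if check(url) else "down")
        for name, url in dependencies.items()
    }
    overall = "ok" if all(s == "up" for s in statuses.values()) else "degraded"
    return {"status": overall, "dependencies": statuses}


def health_response_body(dependencies, check=ping):
    """Serialize the report as JSON so monitoring tools can parse it."""
    return json.dumps(health_report(dependencies, check), sort_keys=True)
```

The check function is a parameter so that tests can swap in a fake, and the short timeout keeps the health endpoint itself from hanging when a dependency is slow.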
Configuration
- Make sure that you set sensible defaults for all configurable parameters.
  - For example, the defaults should allow you to run the service on localhost for development.
- Configuration that changes between environments (for example testing and production) should be read from environment variables.
  - These can be set by Kubernetes or an alternative tool.
- Configuration shouldn’t change while your service is running.
  - It’s better to design applications that can restart quickly and pick up new configuration than to have long-running processes that change their configuration on the fly.
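These three points can be combined in a small Python sketch: defaults that work on localhost, overrides read from environment variables, and a configuration object that cannot be changed after startup. The variable names (SERVICE_HOST, SERVICE_PORT, DATABASE_URL) and the default values are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class Config:
    # Defaults chosen so the service runs on localhost for development.
    host: str = "localhost"
    port: int = 8080
    database_url: str = "postgres://localhost:5432/dev"


def load_config(env=None):
    """Read configuration once at startup, falling back to the defaults."""
    env = os.environ if env is None else env
    defaults = Config()
    return Config(
        host=env.get("SERVICE_HOST", defaults.host),
        port=int(env.get("SERVICE_PORT", defaults.port)),
        database_url=env.get("DATABASE_URL", defaults.database_url),
    )


config = load_config()
```

Because the dataclass is frozen, applying a new configuration means restarting the process with different environment variables, which matches the restart-over-reload advice above.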
Resiliency
- Set a reasonable timeout for all outgoing calls you make.
  - Also consider implementing circuit breakers like Hystrix to improve resiliency further by avoiding cascading failures.
- Make sure your application can continue running while services it depends on are down.
  - Make sure it doesn’t require any manual administration when dependencies go down and later come back up.
- Your service should start up even when its dependencies are not available.
  - For example, you shouldn’t make startup depend on a check that the database is reachable.
- Make sure that increased rates or complexity of incoming requests won’t kill your application.
  - Implement measures to protect your service from abuse.
  - For example, set a maximum page[limit] to avoid making heavy database calls or to limit response size.
- Set up an error reporting service.
  - Services like Airbrake or Rollbar will notify you of errors that your service generates.
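Two of the points above lend themselves to short Python sketches: capping page[limit] and a circuit breaker. This is a simplified illustration and not how Hystrix itself is implemented. The class and function names, the thresholds, and the choice to clamp oversized limits instead of rejecting them are all assumptions.

```python
import time

MAX_PAGE_LIMIT = 100     # hypothetical upper bound
DEFAULT_PAGE_LIMIT = 20  # hypothetical default


def clamp_page_limit(raw):
    """Turn a client-supplied page[limit] into a safe value."""
    try:
        requested = int(raw)
    except (TypeError, ValueError):
        return DEFAULT_PAGE_LIMIT
    return max(1, min(requested, MAX_PAGE_LIMIT))


class CircuitOpenError(Exception):
    """Raised when calls are short-circuited instead of attempted."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to wait before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("dependency marked as down; failing fast")
            # Cool-down elapsed: let one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Failing fast while the circuit is open keeps a struggling dependency from tying up your own workers, which is how the breaker helps avoid cascading failures. The wrapped call should still carry its own timeout.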
Scaling
- Services should follow shared-nothing practices.
  - You shouldn’t directly modify the state of other services or databases that you don’t ‘own’.
  - You also shouldn’t allow other services to modify your internal state.
- Services should be effectively stateless.
  - All durable state should live in the database.
  - Caching is OK, but your service should function correctly without it.
- It should be possible to start more copies of your service without modifying existing ones.
- Prefer horizontal scalability over vertical scalability.
- Don’t use mechanisms like sticky sessions.
  - These can prevent you from spreading load evenly among instances of your service.
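The point that a service should function correctly without its cache can be sketched in Python as follows. The fetch_with_cache helper and the get/set cache interface are hypothetical; the idea is only that a cache outage degrades performance, not correctness.

```python
def fetch_with_cache(key, cache, load_from_db):
    """Return the value for key, treating the cache as strictly optional."""
    try:
        cached = cache.get(key)
    except Exception:
        cached = None  # cache outage: fall through to the database
    if cached is not None:
        return cached

    value = load_from_db(key)  # the database remains the source of truth
    try:
        cache.set(key, value)
    except Exception:
        pass  # failing to populate the cache must not fail the request
    return value


class DictCache:
    """In-memory stand-in for a real cache client, used only for the example."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


value = fetch_with_cache("user:1", DictCache(), lambda key: {"id": key})
```

Since no instance depends on what another instance has cached, any copy of the service can answer any request, which is what makes adding instances safe.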
Other
- The gap between testing and production environments should be as small as possible.
  - Ideally, these environments should differ only in environment variables and scale.
- Set up a traffic mirroring service.
  - A portion of your live production traffic can be sent to testing environments.
  - This will allow you to spot bugs more easily.
- One-off admin processes that need to run during deployment should ideally be automated.
  - Or at least those scripts should be bundled with your application.
  - For example: database schema migrations.