Below are some notes on real-life learnings from a live project where the Capgemini Cloud team built over 60 services.
Designing the microservices:
- Start with the business focus for each service
- Design these as small services with CRUD operations on a single business function or domain.
- Use an API-first design approach so that services are reusable across lines of business (LOBs), suit enterprise-wide consumption, and comply with security and governance requirements.
- Domain-driven design (aka DDD) is the de facto architecture pattern for microservices. This helps break up the complex system into data-driven microservices that reflect the business problem.
- The key challenge in DDD is deciding where to define the boundaries of each bounded context. This will require multiple iterations – you start with unique contexts and redesign to avoid chattiness between services.
- Use lightweight REST-based communication (client-to-service and service-to-service) where services need to communicate. Keep the services loosely coupled and make sure they are stateless.
- Separate services into synchronous and asynchronous, preferring async as far as the business process allows. This fits the favored philosophy of “smart endpoints and dumb pipes”: the logic lives in the services, while the communication channel only carries messages. Asynchronous communication keeps each service autonomous, so it does not depend on other services being available.
- Appropriate design patterns, such as the aggregator, proxy, and branch patterns, are commonly used.
- In unavoidable cases, there will be distributed transactions and the need to design for them. The common solution to this is the saga pattern, which helps design for failures when a distributed transaction is only partially executed.
- Go beyond the twelve-factor design for microservices. Capgemini Cloud Native uses fifteen-factor design principles to achieve agility, scalability, and operational efficiency.
- Microservices must be designed for network and system failures, for example delays, errors, or unavailability of another service or third-party system.
- As mentioned earlier, keeping communication async allows an architecture that is resilient when some services fail.
- Services must provide default behavior when a service they depend on fails. This could be an error message or, if the business case permits, a default value that is acceptable until the external service is available.
- Even if a service is built for a UI screen to consume, it must be responsible for all input data validation, for both client-to-service and service-to-service calls. Common frameworks do this using expression language and annotations rather than hand-written code.
- Centralized logging and monitoring are a must across distributed microservices, along with tracing and alerting mechanisms.
- Log events for timeouts and shutdowns.
- Logging should include the level, hostname (instance name), and message.
- Log events can be used for capacity planning and scaling, for example to see which services need more instances.
- Capture business-data metrics such as the number of bookings or the time taken to fill out a form.
- Use testing tools for service integration: unit tests for each service, plus contract tests to confirm that the APIs behave as agreed even though the service is treated as a black box.
- Automated tests that run as soon as code is committed and built give quick feedback on check-ins and on failures in the build and CI/CD pipeline.
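To make the saga bullet above more concrete, here is a minimal in-process sketch in Python. It is an illustration under assumptions, not the project's implementation: the names `SagaStep` and `run_saga` and the booking-style steps are hypothetical, and a real saga would coordinate separate services over messaging or HTTP rather than local function calls. Each step pairs an action with a compensating action; when a step fails, the compensations of the steps that already completed run in reverse order.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    """One local transaction plus the action that undoes it (hypothetical type)."""
    name: str
    action: Callable[[], None]
    compensate: Callable[[], None]


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
            completed.append(step)
        except Exception:
            # Partial execution: compensate only the steps that succeeded.
            for done in reversed(completed):
                done.compensate()
            return False
    return True


# Illustrative usage with made-up steps that record what happened.
events: List[str] = []


def fail_payment() -> None:
    # Stand-in for a remote call that fails.
    raise RuntimeError("payment service unavailable")


steps = [
    SagaStep("reserve_seat",
             lambda: events.append("seat reserved"),
             lambda: events.append("seat released")),
    SagaStep("charge_card",
             fail_payment,
             lambda: events.append("charge refunded")),
]

succeeded = run_saga(steps)
```

In this run the second step fails, so only the first step's compensation is executed. The failed step itself is not compensated because it never completed.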
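The bullets on default behavior and on logging can be sketched together. The following Python example is a hedged illustration, assuming a hypothetical helper `call_with_fallback`, a hypothetical logger name `orders`, and a made-up `fetch_exchange_rate` dependency; none of these come from the project. It returns an agreed default when a dependency times out or is unreachable, and logs the event with the level, hostname, and message.

```python
import logging
import socket
from typing import Callable, TypeVar

T = TypeVar("T")


class HostnameFilter(logging.Filter):
    """Attach the instance hostname to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.hostname = socket.gethostname()
        return True


# Logger that emits level, hostname, and message (logger name is illustrative).
logger = logging.getLogger("orders")
handler = logging.StreamHandler()
handler.addFilter(HostnameFilter())
handler.setFormatter(logging.Formatter("%(levelname)s %(hostname)s %(message)s"))
logger.addHandler(handler)
logger.setLevel(logging.INFO)


def call_with_fallback(call: Callable[[], T], default: T) -> T:
    """Return call(); on timeout or connection failure, log it and return default."""
    try:
        return call()
    except (TimeoutError, ConnectionError) as exc:
        logger.warning("dependency call failed, using default: %s", exc)
        return default


def fetch_exchange_rate() -> float:
    # Stand-in for a remote call that times out.
    raise TimeoutError("rate service timed out")


rate = call_with_fallback(fetch_exchange_rate, default=1.0)
```

Whether a default value is acceptable is a business decision; where it is not, the same helper could return an error response instead of a value.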
Would you like to learn more about the fifteen-factor design or our point of view on the containerization platform options for microservices? Contact me for more information.