Glossary

Message Broker

Enterprise Integration Patterns defines a Message Broker as an architectural building block that can receive messages from multiple destinations, determine the correct destination and route the message to the correct channel.
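
The routing behaviour described above could be sketched as follows. This is a minimal in-memory illustration, not a real broker: the `MessageBroker` class, the `bind` and `publish` methods, the `"type"` key and the `dead_letter` fallback channel are all assumptions made for the example.

```python
from collections import defaultdict


class MessageBroker:
    """Toy broker: looks up the destination of a message and routes it."""

    def __init__(self):
        self._routes = {}                  # message type -> channel name
        self.channels = defaultdict(list)  # channel name -> delivered messages

    def bind(self, message_type, channel_name):
        """Declare which channel receives messages of a given type."""
        self._routes[message_type] = channel_name

    def publish(self, message):
        """Determine the destination and append the message to that channel."""
        # Unroutable messages go to a hypothetical dead-letter channel.
        channel_name = self._routes.get(message["type"], "dead_letter")
        self.channels[channel_name].append(message)
        return channel_name


broker = MessageBroker()
broker.bind("email", "email_channel")
routed_to = broker.publish({"type": "email", "to": "user@example.com"})
unrouted_to = broker.publish({"type": "unknown"})
```

Senders only know the broker, so destinations can be rebound without touching the code that publishes.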

Message

Enterprise Integration Patterns defines a Message as a data record that the messaging system can transmit through a message channel.

Command Message

Enterprise Integration Patterns defines a Command Message as a Message which instructs a worker to execute a task.

Event Message

Enterprise Integration Patterns defines an Event Message as a Message which indicates that an event has occurred.

Document Message

Enterprise Integration Patterns defines a Document Message as a Message containing data from a data source.
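
One way to picture the difference between Command, Event and Document Messages is to model each as its own type. The class names, the `payload` field and the `describe` helper below are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """A data record that can travel through a message channel."""
    payload: dict


class CommandMessage(Message):
    """Instructs a worker to execute a task."""


class EventMessage(Message):
    """Indicates that an event has occurred."""


class DocumentMessage(Message):
    """Carries data from a data source."""


def describe(message):
    """Return what a receiver is expected to do with the message."""
    if isinstance(message, CommandMessage):
        return "execute the requested task"
    if isinstance(message, EventMessage):
        return "react to the event, if interested"
    if isinstance(message, DocumentMessage):
        return "store or process the data"
    return "unknown message type"


command = CommandMessage(payload={"task": "resize_image", "image_id": 7})
event = EventMessage(payload={"event": "image_resized", "image_id": 7})
document = DocumentMessage(payload={"image_id": 7, "width": 640, "height": 480})
```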

Service Activator

Enterprise Integration Patterns defines a Service Activator as a one-way (request only) or two-way (request-reply) adapter between the Message and the service it invokes. The service can be as simple as a method call. The activator handles all of the messaging details and invokes the service like any other client, such that the service doesn’t even know it’s being invoked through messaging.
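
A minimal sketch of the idea, assuming a message whose body holds keyword arguments and an optional reply channel modelled as a plain list. `Message`, `ServiceActivator`, `handle` and `reply_to` are illustrative names, not part of any library.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Message:
    body: dict
    # Hypothetical reply channel; None means a one-way (request only) message.
    reply_to: Optional[list] = None


def add(x, y):
    """A plain service: it knows nothing about messaging."""
    return x + y


class ServiceActivator:
    """Adapter that unpacks a message and calls the wrapped service."""

    def __init__(self, service: Callable[..., Any]) -> None:
        self._service = service

    def handle(self, message: Message) -> None:
        result = self._service(**message.body)
        if message.reply_to is not None:
            # Two-way (request-reply): wrap the result in a reply message.
            message.reply_to.append(Message(body={"result": result}))


replies = []
activator = ServiceActivator(add)
activator.handle(Message(body={"x": 2, "y": 3}, reply_to=replies))  # two-way
activator.handle(Message(body={"x": 1, "y": 1}))                    # one-way
```

The `add` function stays an ordinary callable, so it can be tested and reused without any messaging infrastructure.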

Idempotent Receiver

Enterprise Integration Patterns defines an Idempotent Receiver as a component that can safely receive the same message multiple times: processing a duplicate produces the same side effects as processing the message once.
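
One common way to achieve this is to remember the identifiers of messages that were already processed. The sketch below assumes every message carries a unique `message_id`; the class, its `receive` method and the in-memory set are illustrative, and a real receiver would need durable storage for the identifiers.

```python
class IdempotentReceiver:
    """Applies each message at most once, keyed by its message id."""

    def __init__(self):
        self._seen = set()  # ids of messages that were already applied
        self.balance = 0    # the side effect we want to protect

    def receive(self, message_id, amount):
        """Apply the message; return False if it was a duplicate."""
        if message_id in self._seen:
            return False    # duplicate: no additional side effect
        self._seen.add(message_id)
        self.balance += amount
        return True


receiver = IdempotentReceiver()
first = receiver.receive("m1", 10)
duplicate = receiver.receive("m1", 10)  # redelivery of the same message
```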

Message Dispatcher

Enterprise Integration Patterns defines a Message Dispatcher as a component that will consume messages from a channel and distribute them to performers.
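
As a rough sketch, a dispatcher can drain a channel and hand each message to one of several performers. The round-robin policy, the `dispatch` function and the use of `queue.Queue` as the channel are assumptions made for illustration; the pattern itself does not prescribe a distribution strategy.

```python
import queue
from itertools import cycle


def dispatch(channel, performers):
    """Consume every pending message and hand it to the next performer."""
    targets = cycle(performers)  # round-robin over the performers
    dispatched = 0
    while True:
        try:
            message = channel.get_nowait()
        except queue.Empty:
            return dispatched    # channel drained
        next(targets)(message)
        dispatched += 1


channel = queue.Queue()
for number in range(4):
    channel.put(number)

handled_by_a, handled_by_b = [], []
count = dispatch(channel, [handled_by_a.append, handled_by_b.append])
```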

Process Manager

Enterprise Integration Patterns defines a Process Manager as a component that maintains the state of the sequence and determines the next processing step based on intermediate results.
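
The definition above can be illustrated with a small state machine that picks the next step from the outcome of the previous one. The order workflow, its step names and the `OrderProcessManager` class are hypothetical.

```python
class OrderProcessManager:
    """Tracks where a process is and decides what happens next."""

    # Hypothetical workflow: (current step, step succeeded?) -> next step.
    TRANSITIONS = {
        ("reserve_stock", True): "charge_card",
        ("reserve_stock", False): "cancelled",
        ("charge_card", True): "completed",
        ("charge_card", False): "release_stock",
        ("release_stock", True): "cancelled",
        ("release_stock", False): "cancelled",
    }

    def __init__(self):
        self.state = "reserve_stock"
        self.history = []  # (step, succeeded) pairs seen so far

    def handle_result(self, succeeded):
        """Record an intermediate result and move to the next step."""
        self.history.append((self.state, succeeded))
        # Terminal states have no transition, so the state stays unchanged.
        self.state = self.TRANSITIONS.get((self.state, succeeded), self.state)
        return self.state


manager = OrderProcessManager()
after_reserve = manager.handle_result(True)   # stock reserved
after_charge = manager.handle_result(False)   # card declined
after_release = manager.handle_result(True)   # stock released again
```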

Event Driven Consumer

Enterprise Integration Patterns defines an Event Driven Consumer as a component that consumes a message as soon as it is delivered.

Circuit Breaker

Martin Fowler defines a Circuit Breaker in the following fashion:

The basic idea behind the circuit breaker is very simple. You wrap a protected function call in a circuit breaker object, which monitors for failures. Once the failures reach a certain threshold, the circuit breaker trips, and all further calls to the circuit breaker return with an error, without the protected call being made at all. Usually you’ll also want some kind of monitor alert if the circuit breaker trips.
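
The behaviour in the quote could be sketched roughly as below. This is a simplified illustration, not Fowler's implementation: `CircuitBreaker`, `CircuitOpenError` and `failure_threshold` are names chosen for the example, and the sketch omits the reset timeout and monitoring alert that a production breaker would usually have.

```python
class CircuitOpenError(RuntimeError):
    """Raised when the breaker has tripped and the call is not attempted."""


class CircuitBreaker:
    """Wraps a callable and stops calling it after repeated failures."""

    def __init__(self, func, failure_threshold=3):
        self._func = func
        self._failure_threshold = failure_threshold
        self._failures = 0  # consecutive failures observed so far

    @property
    def is_open(self):
        return self._failures >= self._failure_threshold

    def __call__(self, *args, **kwargs):
        if self.is_open:
            # Tripped: fail fast without making the protected call.
            raise CircuitOpenError("circuit is open; call not attempted")
        try:
            result = self._func(*args, **kwargs)
        except Exception:
            self._failures += 1
            raise
        self._failures = 0  # a success resets the failure count
        return result


attempts = []


def unreliable_service():
    """Hypothetical protected call that always fails."""
    attempts.append(1)
    raise ConnectionError("service unavailable")


breaker = CircuitBreaker(unreliable_service, failure_threshold=2)
for _ in range(2):
    try:
        breaker()
    except ConnectionError:
        pass  # the failure is counted by the breaker
```

After the second failure, `breaker()` raises `CircuitOpenError` and `unreliable_service` is no longer invoked.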

CAP Theorem

The CAP theorem categorizes systems into three categories:

  • CP (Consistent and Partition Tolerant) — At first glance, the CP category is confusing, i.e., a system that is consistent and partition tolerant but never available. CP is referring to a category of systems where availability is sacrificed only in the case of a network partition.

  • CA (Consistent and Available) — CA systems are consistent and available in the absence of any network partition. Single-node DB servers are often categorized as CA systems because they do not need to deal with partition tolerance. The only hole in this theory is that a single-node DB system is not a network of shared data systems and thus does not fall under the purview of CAP.

  • AP (Available and Partition Tolerant) — These are systems that are available and partition tolerant but cannot guarantee consistency.

Consistency

A guarantee that every node in a distributed cluster returns the same, most recent, successful write. Consistency refers to every client having the same view of the data. There are various types of consistency models. Consistency in CAP (used to prove the theorem) refers to linearizability or sequential consistency, a very strong form of consistency.

Availability

Every non-failing node returns a response for all read and write requests in a reasonable amount of time. The key word here is every. To be available, every node (on either side of a network partition) must be able to respond in a reasonable amount of time.

Partition Tolerant

The system continues to function and upholds its consistency guarantees in spite of network partitions. Network partitions are a fact of life. Distributed systems guaranteeing partition tolerance can gracefully recover from partitions once the partition heals.

Fault Tolerance

TODO

Network Resilience

According to Wikipedia, Network Resilience is:

In computer networking: resilience is the ability to provide and maintain an acceptable level of service in the face of faults and challenges to normal operation. Threats and challenges for services can range from simple misconfiguration over large scale natural disasters to targeted attacks. As such, network resilience touches a very wide range of topics. In order to increase the resilience of a given communication network, the probable challenges and risks have to be identified and appropriate resilience metrics have to be defined for the service to be protected.

Monitoring

According to Fastly, monitoring is:

The activity of observing the state of a system over time. It uses instrumentation for problem detection, resolution, and continuous improvement. Monitoring alerts are reactive–they tell you when a known issue has already occurred (i.e. maybe your available memory is too low or you need more compute). Monitoring provides automated checks that you can execute against a distributed system to make sure that none of the things you predicted signify any trouble. While monitoring these known quantities is important, the practice also has limitations, including the fact that you are only looking for known issues. Which begs an important question, “what about the problems that you didn’t predict?”

Observability

According to Wikipedia, in the context of control theory, Observability is:

In control theory, observability is a measure of how well internal states of a system can be inferred from knowledge of its external outputs.

In the context of distributed systems, observability is a superset of monitoring.

According to Fastly, the three pillars of observability are:

  • Logs: Logs are a verbose representation of events that have happened. Logs tell a linear story about an event using string processing and regular expressions. A common challenge with logs is that if you haven’t properly indexed something, it will be difficult to find due to the sheer volume of log data.

  • Traces: A trace captures a user’s journey through your application. Traces provide end-to-end visibility and are useful when you need to identify which components cause system errors, find performance bottlenecks, or monitor flow through modules.

  • Metrics: Metrics can be either a point in time or monitored over intervals. These data points could be counters, gauges, etc. They typically represent data over intervals, but sometimes sacrifice details of an event in order to present data that is easier to assimilate.

Structured Logging

Structured Logging is a method to make log messages easy for machines to process. A typical log message consists of a timestamp, a level and a message string, with the context describing the logged event embedded inside the message string. Structured log messages instead store their context in a predetermined message format, which allows machines to parse them more easily.
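
A minimal sketch using the standard library `logging` module, assuming JSON as the predetermined format. The `JsonFormatter` class, the `context` key passed through `extra`, and the logger name are choices made for this example. The handler writes to an in-memory stream here; passing `sys.stdout` instead would send the records to standard output.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Renders each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
            # "context" is a hypothetical attribute supplied via `extra`.
            "context": getattr(record, "context", {}),
        }
        return json.dumps(payload)


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("structured_example")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the example output out of the root logger

logger.info("task finished", extra={"context": {"task_id": "abc", "runtime_ms": 12}})
parsed = json.loads(stream.getvalue())
```

Because the context lives in its own field, a consumer can filter on `parsed["context"]["task_id"]` without parsing the message string.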

JSON

JSON stands for JavaScript Object Notation, which is a way to format data so that it can be transmitted from one place to another, most commonly between a server and a Web application.

stdout

Stdout, also known as standard output, is the default file descriptor where a process can write output.

Service Locator

Martin Fowler defines a Service Locator in the following fashion:

The basic idea behind a service locator is to have an object that knows how to get hold of all of the services that an application might need. So a service locator for this application would have a method that returns a movie finder when one is needed.
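
Following the movie finder example from the quote, a service locator might look roughly like this in Python. The class and method names and the sample data are illustrative and are not taken from Fowler's article.

```python
class MovieFinder:
    """Service that knows how to find movies."""

    def __init__(self, movies):
        self._movies = list(movies)  # (title, director) pairs

    def find_all(self):
        return list(self._movies)


class ServiceLocator:
    """Knows how to get hold of the services the application needs."""

    def __init__(self):
        self._services = {}

    def register(self, name, service):
        self._services[name] = service

    def movie_finder(self):
        """Return a movie finder when one is needed."""
        return self._services["movie_finder"]


class MovieLister:
    """Client that asks the locator for its dependency."""

    def __init__(self, locator):
        self._finder = locator.movie_finder()

    def movies_directed_by(self, director):
        return [title for title, who in self._finder.find_all() if who == director]


locator = ServiceLocator()
# Hypothetical sample data.
locator.register("movie_finder", MovieFinder([("Movie A", "Director X"),
                                              ("Movie B", "Director Y")]))
lister = MovieLister(locator)
titles = lister.movies_directed_by("Director X")
```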

GIL

The Global Interpreter Lock, abbreviated as the GIL, is a mutex that prevents threads from executing Python bytecode in parallel.

This is by design since Python has many atomic operations and maintaining individual locks on each object results in slower execution.

Depending on the implementation, a thread may be forced to release the GIL when a condition is met. In CPython’s implementation of Python 3, a thread is forced to release the GIL after it executes for a period of time.

A thread may also release the GIL voluntarily when it uses a system call or when a C extension instructs it to do so.
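
In CPython 3.2 and later, the period after which a running thread is asked to release the GIL is exposed as the switch interval. The short sketch below reads it, changes it and restores it; the value of 0.001 seconds is arbitrary and chosen only for illustration.

```python
import sys

# Interval, in seconds, after which CPython asks the running thread
# to release the GIL so that another waiting thread can run.
original_interval = sys.getswitchinterval()

# Request more frequent switches (arbitrary example value).
sys.setswitchinterval(0.001)
changed_interval = sys.getswitchinterval()

# Restore the previous setting so the rest of the program is unaffected.
sys.setswitchinterval(original_interval)
```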

IPC

According to Wikipedia, Inter-process Communication:

refers specifically to the mechanisms an operating system provides to allow the processes to manage shared data. Typically, applications can use IPC, categorized as clients and servers, where the client requests data and the server responds to client requests. Many applications are both clients and servers, as commonly seen in distributed computing.

There are many approaches to IPC. Some of them are available in all operating systems, some are only available in specific operating systems.

Task

A task is a unit of business logic that is completely independent and can be executed regardless of the execution platform.

Domain Model

Martin Fowler defines a Domain Model in the following fashion:

An object model of the domain that incorporates both behavior and data.

Domain Event

Martin Fowler defines a Domain Event in the following fashion:

I go to Babur’s for a meal on Tuesday, and pay by credit card. This might be modeled as an event, whose event type is ‘make purchase’, whose subject is my credit card, and whose occurred date is Tuesday. If Babur’s uses an old manual system and doesn’t transmit the transaction until Friday, the noticed date would be Friday.

Things happen. Not all of them are interesting, some may be worth recording but don’t provoke a reaction. The most interesting ones cause a reaction. Many systems need to react to interesting events. Often you need to know why a system reacts in the way it did.

By funneling inputs to a system into streams of Domain Event you can keep a record of all the inputs to a system. This helps you organize your processing logic, and also allows you to keep an audit log of the inputs to the system.
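
The restaurant example above can be modelled as a small immutable record appended to an event log. The field names follow the quote; the `DomainEvent` class, the `record` helper, the subject string and the concrete dates (a Tuesday and the following Friday) are assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DomainEvent:
    event_type: str
    subject: str
    occurred: date  # when the event actually happened
    noticed: date   # when the system learned about it


event_log = []  # audit log of every input to the system


def record(event):
    """Funnel an input into the stream of domain events."""
    event_log.append(event)


purchase = DomainEvent(
    event_type="make purchase",
    subject="my credit card",
    occurred=date(2024, 1, 2),  # hypothetical Tuesday
    noticed=date(2024, 1, 5),   # hypothetical Friday
)
record(purchase)
delay_in_days = (purchase.noticed - purchase.occurred).days
```

Keeping both dates makes it possible to explain later why the system reacted on Friday to something that happened on Tuesday.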

Serverless Computing

TODO

Ubiquitous Language

TODO

Result Backend

TODO

Celery Master

TODO

Celery Worker

TODO

Celery Multi

TODO

Celery Beat

TODO

Flower

TODO

Cell

TODO

ETL

TODO

Data Integration

TODO

Python

Python is an easy to learn, powerful programming language. It has efficient high-level data structures and a simple but effective approach to object-oriented programming. Python’s elegant syntax and dynamic typing, together with its interpreted nature, make it an ideal language for scripting and rapid application development in many areas on most platforms.

CPython

CPython is the reference implementation of the Python programming language. Written in C and Python, CPython is the default and most widely used implementation of the language.

PyPy

PyPy is a replacement for CPython. It is built using the RPython language that was co-developed with it. The main reason to use it instead of CPython is speed: it generally runs faster.