🔑 Key Systems Engineering Concepts
Performance Vs. Scalability
A service is scalable if adding resources results in a proportional increase in performance. Generally, increasing performance means serving more units of work, but it can also mean handling larger units of work, such as when datasets grow.
Latency Vs. Throughput
Latency and throughput are two important measures of a system’s performance. Latency refers to the time it takes for a system to respond to a single request. Throughput refers to the number of requests a system can handle in a given period of time. Generally, you should aim for maximal throughput with acceptable latency.
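To make the distinction concrete, here is a minimal sketch; handle_request is a hypothetical stand-in for a real handler:

```python
import time

def handle_request() -> None:
    """Hypothetical stand-in for a real request handler."""
    time.sleep(0.01)  # pretend each request takes ~10 ms to serve

latencies = []
start = time.perf_counter()
for _ in range(100):
    t0 = time.perf_counter()
    handle_request()
    latencies.append(time.perf_counter() - t0)  # latency: time per request
elapsed = time.perf_counter() - start

print(f"avg latency: {sum(latencies) / len(latencies) * 1000:.1f} ms")
print(f"throughput:  {len(latencies) / elapsed:.1f} requests/sec")  # work per unit time
```

When requests are served serially, throughput is roughly the inverse of latency; with concurrency, throughput can grow while per-request latency stays flat, which is why the two are measured separately.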
Availability Vs. Consistency
Availability
Availability refers to the ability of a system to provide its services to clients even in the presence of failures. This is often measured in terms of the percentage of time that the system is up and running, also known as its uptime.
Consistency
Consistency, on the other hand, refers to the property that all clients see the same data at the same time. This is important for maintaining the integrity of the data stored in the system. In distributed systems, there is often a trade-off between availability and consistency: systems that prioritize high availability may sacrifice consistency, while systems that prioritize consistency may sacrifice availability. Different distributed systems balance this trade-off in different ways, for example through replication or consensus algorithms.
CAP Theorem
The CAP theorem states that a distributed system can provide only two of the following three guarantees: consistency, availability, and partition tolerance. Because network partitions cannot be ruled out in practice, the real choice is between:
- CP (Consistency + Partition Tolerance) - Waiting for a response from a partitioned node might result in a timeout error. A good choice if your application requires atomic reads and writes.
- AP (Availability + Partition Tolerance) - Responses return the most readily available version of the data, which might not be the latest. A good choice if your application can tolerate eventual consistency.
Consistency Patterns
Weak Consistency
Weak consistency means that after a write, subsequent reads may or may not see it; the system makes only a best-effort attempt to propagate updates. These patterns prioritize availability and partition tolerance over immediate consistency, allowing temporary discrepancies between nodes. This approach suits systems where losing or delaying some updates is acceptable, such as caches, VoIP, or live video streams. Weak consistency can lead to better performance and scalability, as it reduces the overhead of maintaining strict synchronization across distributed components.
Key Characteristics:
- Prioritizes availability and partition tolerance.
- Allows temporary discrepancies between nodes.
- Makes only a best-effort attempt to propagate updates; convergence is not guaranteed.
- Reduces synchronization overhead.
- Improves performance and scalability.
- Suitable for applications where immediate consistency is not critical.
- Can result in stale reads and temporary data inconsistency.
Eventual Consistency
Eventual consistency is a consistency model used in distributed systems where updates to a data item will eventually propagate to all nodes, ensuring that all replicas converge to the same state over time. Unlike strong consistency models that require immediate synchronization, eventual consistency allows for temporary inconsistencies between nodes. This model is ideal for systems that prioritize availability and partition tolerance, as it permits operations to continue even during network partitions or node failures. Eventual consistency is commonly employed in large-scale, distributed databases and systems where immediate accuracy is not critical.
Key Characteristics:
- Ensures all replicas converge to the same state over time.
- Allows temporary inconsistencies between nodes.
- Prioritizes availability and partition tolerance.
- Suitable for large-scale distributed systems.
- Permits continued operation during network partitions or node failures.
- Reduces synchronization overhead.
- Common in NoSQL databases and large-scale cloud systems.
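A toy sketch of eventual consistency using only the standard library: a write is acknowledged as soon as one replica accepts it and propagates to the other replicas in the background, so a read from another replica may briefly return stale data before all replicas converge.

```python
import threading
import time

class EventuallyConsistentStore:
    """Toy replicated KV store: writes propagate to replicas asynchronously."""

    def __init__(self, replica_count: int = 3, propagation_delay: float = 0.1):
        self.replicas = [{} for _ in range(replica_count)]
        self.delay = propagation_delay

    def write(self, key, value):
        self.replicas[0][key] = value  # acknowledged after one replica accepts
        threading.Thread(target=self._propagate, args=(key, value)).start()

    def _propagate(self, key, value):
        time.sleep(self.delay)  # simulated replication lag
        for replica in self.replicas[1:]:
            replica[key] = value

    def read(self, key, replica_index):
        return self.replicas[replica_index].get(key)

store = EventuallyConsistentStore()
store.write("x", 1)
print(store.read("x", replica_index=2))  # likely None: stale read during the lag window
time.sleep(0.2)
print(store.read("x", replica_index=2))  # 1: the replicas have converged
```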
Strong Consistency
Strong consistency is a consistency model in distributed systems where any read operation returns the most recent write for a given piece of data, ensuring immediate synchronization across all nodes. This model guarantees that once a write operation is confirmed, all subsequent read operations will reflect that write, providing a uniform view of the data across the system. Strong consistency is critical for applications requiring precise and immediate data accuracy, such as financial transactions or inventory management. However, achieving strong consistency often involves higher latency and reduced availability due to the overhead of coordinating updates across multiple nodes.
Key Characteristics:
- Guarantees immediate synchronization across all nodes.
- Ensures read operations return the most recent write.
- Provides a uniform view of data.
- Critical for applications needing precise and immediate data accuracy.
- Involves higher latency and reduced availability.
- Requires coordination of updates across multiple nodes.
- Common in systems where data integrity is paramount.
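For contrast, a strongly consistent version of the same toy store, under the simplifying assumption that every replica is reachable: a write does not return until all replicas have applied it, so any subsequent read sees the latest value, at the cost of blocking on the slowest replica.

```python
import threading

class StronglyConsistentStore:
    """Toy store: a write returns only after every replica has applied it."""

    def __init__(self, replica_count: int = 3):
        self.replicas = [{} for _ in range(replica_count)]
        self.lock = threading.Lock()  # serializes operations across replicas

    def write(self, key, value):
        with self.lock:
            for replica in self.replicas:  # synchronous update of every copy
                replica[key] = value

    def read(self, key, replica_index=0):
        with self.lock:  # reads observe only fully completed writes
            return self.replicas[replica_index].get(key)

store = StronglyConsistentStore()
store.write("x", 1)
print(store.read("x", replica_index=2))  # always 1: no stale reads
```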
Availability Patterns
Failover
Failover is a process in which a system automatically switches to a standby or backup component when the primary component fails or becomes unavailable. This mechanism is designed to ensure the continued availability and reliability of services, minimizing downtime and maintaining operational continuity. Failover can be implemented at various levels, including hardware, software, or network systems, and typically involves monitoring the health of the primary component, detecting failures, and seamlessly transferring operations to a backup. The goal is to provide high availability and prevent disruption in critical services or applications.
Active-Active
Active-active failover is a high-availability configuration where multiple instances of a system are active and handle traffic simultaneously. In this setup, all nodes process requests and share the load, providing redundancy and ensuring continuous operation even if one or more instances fail. This approach improves performance and resilience by distributing the workload across multiple active instances and enabling automatic failover without significant downtime. It contrasts with active-passive patterns, where only one instance is active at a time while others remain on standby.
Active-Passive
Active-passive failover is a high-availability configuration where one instance of a system is active and handles all traffic, while one or more passive instances remain on standby, ready to take over in case of a failure. The active instance performs all processing tasks, and the passive instances are kept synchronized with it but do not handle traffic unless needed. This ensures that a backup is available to take over if the active instance fails, although failover may involve some downtime. Active-passive patterns are simpler to implement than active-active configurations but may not utilize resources as efficiently.
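A sketch of active-passive failover with a simple health-check monitor; the endpoints and the /health path are hypothetical, and a real implementation would also need to handle split-brain and fail-back.

```python
import time
import urllib.request

SERVERS = ["http://primary.internal:8080", "http://standby.internal:8080"]  # hypothetical

def healthy(url: str) -> bool:
    """Best-effort health check against the server's /health endpoint."""
    try:
        with urllib.request.urlopen(f"{url}/health", timeout=2) as resp:
            return resp.status == 200
    except OSError:  # connection refused, timeout, DNS failure, ...
        return False

def monitor(interval: float = 5.0) -> None:
    active = SERVERS[0]  # active-passive: one instance serves all traffic
    while True:
        if not healthy(active):
            standbys = [s for s in SERVERS if s != active and healthy(s)]
            if standbys:
                active = standbys[0]  # promote the first healthy standby
                print(f"failover: routing traffic to {active}")
        time.sleep(interval)
```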
Replication
Replication is the process of copying and maintaining data across multiple systems or locations to ensure consistency, reliability, and availability. By duplicating data, replication allows for data redundancy, which enhances fault tolerance and enables data recovery in case of hardware failures or data corruption. This process can be implemented at various levels, such as database, file systems, or network services, and can be configured to occur synchronously or asynchronously. Replication helps distribute load, improve performance, and ensure that data remains accessible even if one or more nodes fail.
Master-Slave
Master-slave replication is a data replication model where one node, known as the master, is responsible for handling all write operations, while one or more nodes, known as slaves, handle read operations and replicate the data from the master. In this setup, the master node processes and updates data, which is then propagated to the slave nodes. This model allows for load balancing of read requests and provides redundancy, as the slave nodes can serve as backups if the master fails. However, it also introduces a single point of failure and potential replication lag.
Key Characteristics
- One master node handles all write operations.
- One or more slave nodes handle read operations.
- Data is replicated from the master to the slaves.
- Allows load balancing of read requests.
- Provides redundancy and backup through slave nodes.
- May introduce replication lag and a single point of failure.
- Suitable for read-heavy workloads with occasional write operations.
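A sketch of how an application might route queries under master-slave replication, assuming hypothetical connection objects that expose an execute method: writes always go to the master, while reads are spread round-robin across the slaves.

```python
import itertools

class ReplicationRouter:
    """Routes writes to the master and load-balances reads across slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self.read_pool = itertools.cycle(slaves)  # round-robin over replicas

    def execute_write(self, query, *params):
        return self.master.execute(query, *params)  # all writes hit the master

    def execute_read(self, query, *params):
        replica = next(self.read_pool)          # pick the next slave
        return replica.execute(query, *params)  # may lag behind the master
```

Because of replication lag, a read issued immediately after a write may hit a replica that has not yet applied it; applications that need read-your-writes semantics often pin such reads to the master.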
Master-Master
Master-master replication, also known as multi-master replication, is a data replication model where multiple nodes act as masters, each capable of handling both read and write operations. In this setup, changes made to any master node are replicated to all other master nodes, allowing for high availability and load distribution. This model supports bi-directional synchronization and enables fault tolerance, as any node can take over if another fails. However, it introduces complexities such as conflict resolution and data consistency challenges due to simultaneous updates on multiple nodes.
Key Characteristics
- Multiple nodes act as masters, handling both read and write operations.
- Changes are replicated across all master nodes.
- Supports bi-directional synchronization.
- Enhances high availability and load distribution.
- Provides fault tolerance with any node able to take over.
- Requires conflict resolution mechanisms for simultaneous updates.
- Can introduce complexity in maintaining data consistency.
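Conflict resolution is the hard part of master-master replication. One common, and lossy, strategy is last-write-wins, sketched below; real systems may instead use vector clocks or application-level merge logic.

```python
from dataclasses import dataclass

@dataclass
class VersionedValue:
    value: str
    timestamp: float  # wall-clock write time (assumes reasonably synchronized clocks)

def resolve_conflict(a: VersionedValue, b: VersionedValue) -> VersionedValue:
    """Last-write-wins: keep the later write; the earlier one is silently lost."""
    return a if a.timestamp >= b.timestamp else b
```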
Background Jobs
Background jobs are tasks or processes that run asynchronously and independently of the main application workflow. They are typically used for long-running, resource-intensive operations that do not need to be completed immediately, such as data processing, email notifications, or periodic tasks. By offloading these tasks to background jobs, applications can maintain responsiveness and avoid blocking user interactions. Background jobs are managed by job queues and workers, which handle the execution of tasks outside of the main application threads, often allowing for retry mechanisms, scheduling, and monitoring.
Key Characteristics
- Run asynchronously and independently of the main application workflow.
- Used for long-running or resource-intensive tasks.
- Offload tasks to maintain application responsiveness.
- Managed by job queues and workers.
- Support retry mechanisms and scheduling.
- Allow monitoring and logging of task execution.
- Improve overall performance and user experience.
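A minimal job queue and worker pool using only the standard library; production systems typically use frameworks such as Celery or Sidekiq, but the shape is the same: producers enqueue tasks and return immediately, while workers execute them off the main thread with a simple retry mechanism.

```python
import queue
import threading

jobs = queue.Queue()

def worker() -> None:
    while True:
        func, args, retries_left = jobs.get()  # block until a job arrives
        try:
            func(*args)                        # run the job off the main thread
        except Exception as exc:
            if retries_left > 0:
                jobs.put((func, args, retries_left - 1))  # simple retry
            else:
                print(f"job failed permanently: {exc}")
        finally:
            jobs.task_done()

for _ in range(4):  # small pool of daemon workers
    threading.Thread(target=worker, daemon=True).start()

def send_email(address: str) -> None:  # hypothetical long-running task
    print(f"sending email to {address}")

jobs.put((send_email, ("user@example.com",), 3))  # enqueue and return immediately
jobs.join()  # here only so the demo waits for completion
```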
Event Driven Background Jobs
Event-driven background jobs are tasks that are triggered and executed based on specific events or changes in the system rather than being scheduled at fixed intervals. This approach allows jobs to respond dynamically to events such as user actions, system updates, or external messages. By using an event-driven model, applications can process tasks more efficiently and in real time, as jobs are only executed when relevant events occur. This can improve resource utilization and reduce unnecessary processing by ensuring that background jobs are only performed when needed.
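A sketch of the event-driven pattern: handlers subscribe to named events, and a job runs only when its event fires. The event name and handler below are hypothetical, and a real system would enqueue the handler as a background job rather than call it inline.

```python
from collections import defaultdict

handlers = defaultdict(list)  # event name -> registered background handlers

def on(event: str):
    """Decorator: register a handler to run when `event` is emitted."""
    def register(func):
        handlers[event].append(func)
        return func
    return register

def emit(event: str, payload) -> None:
    for func in handlers[event]:
        func(payload)  # in a real system: enqueue as a background job

@on("user.signed_up")
def send_welcome_email(address):
    print(f"sending welcome email to {address}")

emit("user.signed_up", "alice@example.com")  # triggers the job on demand
```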
Schedule Driven Background Jobs
Schedule-driven background jobs are tasks that are executed based on a predetermined schedule or time intervals, rather than being triggered by specific events. These jobs are planned to run at regular intervals, such as hourly, daily, or weekly, and are useful for repetitive tasks that need to occur at specific times or frequencies. Scheduling allows for systematic and predictable execution of tasks, such as batch processing, data backups, or periodic maintenance. This approach ensures that jobs are performed consistently according to the defined schedule, regardless of other system activities.
Key Characteristics
- Executed based on a predetermined schedule or time intervals.
- Useful for repetitive tasks with regular timing requirements.
- Ensures systematic and predictable execution of tasks.
- Commonly used for batch processing, backups, and maintenance.
- Allows for planning and automation of routine operations.
- Can be managed by scheduling systems or cron jobs.
- Ensures tasks are performed consistently according to the defined schedule.
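A sketch of a schedule-driven job using the standard-library sched module; in production this is more often a cron entry or a dedicated scheduler, but the mechanics are the same: each run re-registers the job for the next interval.

```python
import sched
import time

scheduler = sched.scheduler(time.time, time.sleep)
INTERVAL = 24 * 60 * 60  # run once a day

def nightly_backup() -> None:
    print("running backup at", time.ctime())      # stand-in for the real task
    scheduler.enter(INTERVAL, 1, nightly_backup)  # register the next run

scheduler.enter(0, 1, nightly_backup)  # first run immediately
scheduler.run()                        # blocks, executing jobs on schedule
```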
Returning Results from Background Jobs
Returning results refers to the process of delivering the output or response from background jobs back to the requesting system or user after the job has been completed. This involves capturing the results of the background task, such as processed data or computed values, and ensuring they are accessible or communicated to the appropriate component. This step is crucial for integrating the outcomes of asynchronous operations into the main application workflow, enabling users or systems to make use of the results as needed. Effective result handling ensures that background jobs provide meaningful and timely output for further processing or user interaction.
Key Characteristics
- Involves delivering output or response after background job completion.
- Captures results like processed data or computed values.
- Ensures results are communicated to the requesting system or user.
- Integrates outcomes into the main application workflow.
- Enables further processing or user interaction with the results.
- Can involve result storage, notifications, or direct updates.
- Important for completing the feedback loop of asynchronous operations.
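One standard-library way to return results is a future: the caller gets a handle immediately and collects the output once the background job completes. A minimal sketch:

```python
from concurrent.futures import ThreadPoolExecutor

def crunch_numbers(n: int) -> int:
    """Stand-in for a long-running computation."""
    return sum(i * i for i in range(n))

with ThreadPoolExecutor(max_workers=2) as pool:
    future = pool.submit(crunch_numbers, 1_000_000)  # returns immediately
    # ... the main workflow continues while the job runs ...
    print(future.result())  # blocks only when the result is actually needed
```

Other common patterns store the result under a job ID in a database or cache for later polling, or push a notification to the client when the job finishes.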
Domain Name System
A Domain Name System (DNS) translates a domain name such as www.example.com to an IP address.
DNS is hierarchical, with a few authoritative servers at the top level. Your router or ISP provides information about which DNS server(s) to contact when doing a lookup. Lower-level DNS servers cache mappings, which could become stale due to DNS propagation delays. DNS results can also be cached by your browser or OS for a certain period of time, determined by the time to live (TTL). Common record types include:
- NS record (name server) - Specifies the DNS servers for your domain/subdomain.
- MX record (mail exchange) - Specifies the mail servers for accepting messages.
- A record (address) - Points a name to an IP address.
- CNAME (canonical) - Points a name to another name or CNAME (www.example.com to example.com) or to an A record.
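Programmatic lookups go through this same resolution machinery; a quick sketch using the standard library:

```python
import socket

# Resolve A/AAAA records for a host; the OS resolver consults its cache
# first, then the configured DNS servers.
for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
    "www.example.com", 443, proto=socket.IPPROTO_TCP
):
    print(family.name, sockaddr[0])  # e.g. AF_INET <ipv4-address>
```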
Content Delivery Networks
A content delivery network (CDN) is a globally distributed network of proxy servers that serves content from locations close to the user, reducing latency and offloading traffic from origin servers.
Push CDNs
Push CDNs receive new content whenever you upload it: you take responsibility for providing content to the CDN and keeping it updated. This works well for sites with a small amount of content that changes infrequently.
Pull CDNs
Pull CDNs grab content from your server when the first user requests it, then serve it from cache until its TTL expires. This minimizes storage on the CDN, at the cost of redundant origin traffic and a slower first request; it works well for heavily trafficked sites.
Load Balancer
Load balancers distribute incoming client requests to computing resources such as application servers and databases. In each case, the load balancer returns the response from the computing resource to the appropriate client. Load balancers are effective at:
- Preventing requests from going to unhealthy servers
- Preventing overloading resources
- Helping to eliminate a single point of failure
Load Balancer vs. Reverse Proxy
Load balancers and reverse proxies serve distinct purposes in network management. A load balancer distributes incoming network traffic across multiple servers to optimize resource utilization, ensure high availability, and improve response times; it focuses on efficiently managing the workload among several servers. A reverse proxy, by contrast, acts as a server-side intermediary between clients and servers, forwarding client requests to a backend server and returning the server's response to the client. It is primarily used for security, caching, SSL termination, and access control, and is useful even with a single server. While both can enhance performance and security, their roles and functionalities differ significantly.
Key Differences:
- Purpose - Load balancer: distributes traffic across multiple servers. Reverse proxy: acts as a server-side intermediary that forwards requests and responses.
- Primary goal - Load balancer: optimize resource use and availability. Reverse proxy: enhance security, caching, and access control.
- Implementation level - Load balancer: operates at the transport (Layer 4) or application (Layer 7) level. Reverse proxy: operates at the application level.
- Use cases - Load balancer: high-traffic websites, distributed systems. Reverse proxy: SSL termination, content caching, hiding and protecting origin servers.
Load Balancing Algorithms
Load balancing algorithms are methods used to efficiently distribute incoming network traffic across multiple servers or resources. These algorithms determine the most suitable server to handle each request, optimizing resource use, improving response times, and ensuring no single server becomes overwhelmed. By implementing effective load balancing algorithms, systems can achieve higher availability, reliability, and performance. Different algorithms offer varying strategies for distribution, catering to specific requirements and workload patterns.
Popular Load Balancing Algorithms:
- Round Robin
- Least Connections
- Least Response Time
- Source IP Hash
- Weighted Round Robin
- Weighted Least Connections
- Random
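Three of these algorithms are simple enough to sketch in a few lines; the backend names are placeholders:

```python
import hashlib
import itertools

SERVERS = ["app1", "app2", "app3"]  # placeholder backends

# Round robin: hand out servers in a fixed rotation.
rotation = itertools.cycle(SERVERS)
def round_robin() -> str:
    return next(rotation)

# Least connections: pick the server with the fewest open connections.
open_connections = {s: 0 for s in SERVERS}
def least_connections() -> str:
    server = min(open_connections, key=open_connections.get)
    open_connections[server] += 1  # caller decrements when the request ends
    return server

# Source IP hash: the same client deterministically maps to the same server.
def source_ip_hash(client_ip: str) -> str:
    digest = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return SERVERS[digest % len(SERVERS)]
```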
Layer 7 Load Balancing
Layer 7 load balancing operates at the application layer of the OSI model, distributing network traffic based on application-level data such as HTTP headers, URLs, or cookies. This type of load balancing enables more intelligent routing decisions by inspecting the content of the requests and directing them to the appropriate servers based on specific rules. It supports advanced features like SSL termination, content-based routing, and user session persistence, making it ideal for web applications that require detailed traffic management and the ability to deliver personalized user experiences.
Layer 4 Load Balancing
Layer 4 load balancing operates at the transport layer of the OSI model, directing network traffic based on data from the transport layer protocol, such as TCP or UDP. It uses information like source and destination IP addresses and ports to distribute incoming traffic across multiple servers, ensuring optimal resource utilization, high availability, and redundancy. By balancing traffic at this layer, it can handle a large volume of requests with minimal processing overhead, making it suitable for scenarios where quick and efficient distribution of traffic is required without the need for deep inspection of the packet content.
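The difference comes down to what each layer can inspect; a sketch contrasting the two, with placeholder pool names:

```python
def route_l7(request_path: str) -> str:
    """Layer 7: decisions can use application data such as the URL path."""
    if request_path.startswith("/api/"):
        return "api-pool"
    if request_path.startswith("/static/"):
        return "static-pool"
    return "web-pool"

def route_l4(dest_port: int) -> str:
    """Layer 4: only transport-level facts (IPs, ports) are visible."""
    return "tls-pool" if dest_port == 443 else "plain-pool"
```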
Horizontal Scaling
Horizontal scaling means adding more machines to a resource pool (scaling out), as opposed to vertical scaling, which adds more power (CPU, RAM) to a single machine. Scaling horizontally with commodity machines is cost-efficient and improves availability, but it adds complexity: servers should be stateless, with state such as sessions and user data pushed out to centralized stores like databases or caches.
https://youtu.be/dvRFHG2-uYs
Microservices
Microservices are an architectural style in software development where an application is structured as a collection of loosely coupled, independently deployable services. Each service is fine-grained and focuses on a specific business function, allowing for better scalability, flexibility, and maintenance. This approach contrasts with traditional monolithic architectures, where all functionalities are tightly integrated into a single system. By using microservices, organizations can adopt a more agile development process, enabling faster updates and improvements to individual components without affecting the entire application.
Service Discovery
Service discovery is a key component in microservices architectures that automates the detection of services in a network. It enables services to find and communicate with each other dynamically without needing hard-coded IP addresses or endpoints. This is essential for managing microservices at scale, as it ensures that services can locate each other, even as they scale up or down, move across hosts, or change due to updates. Service discovery typically involves a service registry, where service instances register their availability, and clients query this registry to find the locations of services they need to interact with.
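Real registries include Consul, etcd, and ZooKeeper; below is a toy sketch of the core idea, where instances register with a heartbeat and lookups return only instances whose heartbeat is still fresh.

```python
import time

class ServiceRegistry:
    """Toy registry: instances heartbeat in; stale entries are filtered out."""

    def __init__(self, ttl_seconds: float = 30.0):
        self.ttl = ttl_seconds
        self.instances = {}  # service name -> {endpoint: last_seen timestamp}

    def register(self, service: str, endpoint: str) -> None:
        self.instances.setdefault(service, {})[endpoint] = time.time()

    heartbeat = register  # a heartbeat is simply a re-registration

    def lookup(self, service: str) -> list:
        now = time.time()
        live = {
            ep: seen
            for ep, seen in self.instances.get(service, {}).items()
            if now - seen <= self.ttl  # drop instances that stopped heartbeating
        }
        self.instances[service] = live
        return list(live)

registry = ServiceRegistry()
registry.register("orders", "10.0.0.5:8080")
print(registry.lookup("orders"))  # ['10.0.0.5:8080']
```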
...