Scalable System Design Patterns
https://dzone.com/articles/scalable-system-design
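Load balancing
A dispatcher distributes requests to worker instances according to a chosen policy.
This requires that no worker stores state information locally.
State must be fetched from an external source. For example, if a web application server keeps user login sessions in redis, the web application servers can act as workers and be scaled out.
Examples:
nginx in master/worker mode.
uwsgi + django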
Load Balancer
In this model, a dispatcher decides which worker instance handles each request, based on a chosen policy. The application should ideally be "stateless" so that any worker instance can handle any request.
This pattern is deployed in almost every medium to large web site setup.
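The dispatching step can be sketched in a few lines. This is only an illustration, assuming a round-robin policy (one of many possible policies); `RoundRobinDispatcher` and `make_worker` are hypothetical names, and the workers are plain in-process functions rather than real servers.

```python
from itertools import cycle


class RoundRobinDispatcher:
    """Hands each request to the next worker in a fixed rotation."""

    def __init__(self, workers):
        self._rotation = cycle(workers)

    def dispatch(self, request):
        worker = next(self._rotation)
        return worker(request)


def make_worker(name):
    # A stateless worker: the response depends only on the request,
    # so any worker could have served it.
    def handle(request):
        return f"{name} handled {request}"
    return handle


dispatcher = RoundRobinDispatcher([make_worker("w1"), make_worker("w2")])
responses = [dispatcher.dispatch(f"req{i}") for i in range(3)]
```

Because the workers hold no state, the third request can safely wrap around to the first worker again.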
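Scatter-gather pattern
The dispatcher splits a request into subtasks and assigns them to the workers; each worker's result is returned to the dispatcher,
which aggregates all the results into the response for the client.
Examples:
The ray framework
Search engines: the search keywords are scattered to different workers, and each worker searches its own portion of the data.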
Scatter and Gather
In this model, the dispatcher multicasts the request to all workers in the pool. Each worker computes a local result and sends it back to the dispatcher, which consolidates the results into a single response and sends it back to the client.
This pattern is used by search engines such as Yahoo and Google to handle users' keyword search requests, among other things.
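A minimal sketch of the scatter and gather steps, assuming each worker owns one shard of documents and runs as a thread in the same process. `SHARDS`, `search_shard`, and `scatter_gather` are hypothetical names chosen for the illustration; a real search engine distributes the shards across machines.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each shard is the slice of documents one worker owns.
SHARDS = [
    ["scalable system design", "load balancer basics"],
    ["search engine internals", "design of a cache"],
    ["pipe and filter", "system monitoring"],
]


def search_shard(shard, keyword):
    """Worker: compute a local result from its own shard only."""
    return [doc for doc in shard if keyword in doc]


def scatter_gather(keyword):
    """Dispatcher: send the request to every worker, then merge the results."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = pool.map(search_shard, SHARDS, [keyword] * len(SHARDS))
        merged = [doc for partial in partials for doc in partial]
    return sorted(merged)


hits = scatter_gather("design")
```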
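Result cache pattern
Lookup-table pattern
The result of the first request is cached on an in-memory server; for later identical requests, the in-memory server is checked first and the stored result is returned directly to the client.
This pattern trades memory for response time and saves computing resources.
Suitable for:
Idempotent requests and requests that are expensive to compute.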
Result Cache
In this model, the dispatcher first looks up whether the same request has been made before and, if so, returns the previous result, saving the actual execution.
This pattern is commonly used in large enterprise applications. Memcached is a very commonly deployed cache server.
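The lookup-before-execute step might look like the sketch below. It assumes the request is hashable and the computation is idempotent; the in-process dictionary is only a stand-in for a cache server such as Memcached, and `CachingDispatcher` and `slow_square` are hypothetical names.

```python
class CachingDispatcher:
    """Looks up a previous result before running the real computation."""

    def __init__(self, compute):
        self._compute = compute
        self._cache = {}      # stand-in for an external cache server
        self.executions = 0   # counts how often the real work actually ran

    def handle(self, request):
        if request in self._cache:
            return self._cache[request]
        self.executions += 1
        result = self._compute(request)
        self._cache[request] = result
        return result


def slow_square(n):
    # Placeholder for an expensive, idempotent computation.
    return n * n


dispatcher = CachingDispatcher(slow_square)
first = dispatcher.handle(12)
second = dispatcher.handle(12)  # served from the cache, no recomputation
```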
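Shared space
Blackboard pattern:
the teacher (client) posts a problem on the blackboard,
and every student (worker) contributes the part of the answer that lies within their own knowledge; once a complete solution has taken shape, it is returned to the client.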
Shared Space
This model is also known as "Blackboard"; all workers monitor information in the shared space and contribute partial knowledge back to the blackboard. The information is continuously enriched until a solution is reached.
This pattern is used in JavaSpace and in the commercial product GigaSpace.
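As a toy illustration of the blackboard idea, the sketch below uses a dictionary as the shared space and two hypothetical workers (`parse_worker`, `sum_worker`) that each add one piece of knowledge when the information they need is present. It runs in a single process and is not how JavaSpace or GigaSpace are implemented.

```python
def parse_worker(board):
    # Contributes the parsed numbers once the raw problem is posted.
    if "problem" in board and "numbers" not in board:
        board["numbers"] = [int(x) for x in board["problem"].split("+")]
        return True
    return False


def sum_worker(board):
    # Contributes the total once the numbers are available.
    if "numbers" in board and "solution" not in board:
        board["solution"] = sum(board["numbers"])
        return True
    return False


def solve(problem, workers):
    board = {"problem": problem}  # the shared space
    while "solution" not in board:
        # Every worker inspects the board and may enrich it.
        progressed = [worker(board) for worker in workers]
        if not any(progressed):
            raise RuntimeError("no worker can contribute further")
    return board["solution"]


answer = solve("1+2+3", [sum_worker, parse_worker])
```

The workers are listed in the "wrong" order on purpose: neither calls the other, and the solution still emerges because each one reacts only to what is on the board.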
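Pipes and filters
Dataflow programming
All workers are connected by pipes, and data flows through the workers.
Message processing systems are an example of this.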
EAI : https://www.enterpriseintegrationpatterns.com/patterns/messaging/
Pipe and Filter
This model is also known as "Data Flow Programming"; all workers are connected by pipes through which the data flows.
This pattern is a very common EAI pattern.
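A small sketch of the idea using Python generators as the pipes: each filter consumes the stream produced by the previous one. The filter names (`strip_blank`, `to_upper`, `number`) and the `pipeline` helper are hypothetical; in a messaging system the pipes would be queues or channels between separate processes.

```python
def strip_blank(lines):
    # Filter 1: drop empty lines and trim whitespace.
    for line in lines:
        if line.strip():
            yield line.strip()


def to_upper(lines):
    # Filter 2: transform each item.
    for line in lines:
        yield line.upper()


def number(lines):
    # Filter 3: annotate each item with its position in the stream.
    for index, line in enumerate(lines, start=1):
        yield f"{index}: {line}"


def pipeline(source, *filters):
    # Connect the filters so the output of one feeds the next.
    stream = source
    for stage in filters:
        stream = stage(stream)
    return stream


output = list(pipeline(["  alpha ", "", "beta"], strip_blank, to_upper, number))
```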
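MapReduce pattern
Aimed at batch jobs, where disk I/O is the bottleneck.
For big-data workloads, a distributed architecture performs the disk I/O in parallel, which yields both computing performance and the ability to store very large amounts of data.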
Map Reduce
This model targets batch jobs where disk I/O is the major bottleneck. It uses a distributed file system so that disk I/O can be done in parallel.
This pattern is used in many of Google's internal applications and is implemented in the open source Hadoop parallel processing framework. I also find that this pattern applies to many other application design scenarios.
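The map, shuffle, and reduce phases can be illustrated with a word count. This sketch runs sequentially in one process and keeps everything in memory, so it shows only the data flow, not the parallel disk I/O that motivates the pattern; the function names are hypothetical and unrelated to the Hadoop API.

```python
from collections import defaultdict


def map_phase(chunk):
    # Map: emit a (key, value) pair for every word in one input chunk.
    return [(word, 1) for word in chunk.split()]


def shuffle(pairs):
    # Shuffle: group all values that share the same key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce: combine the values of one key into a single result.
    return key, sum(values)


def map_reduce(chunks):
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


counts = map_reduce(["a b a", "b c"])
```

In a distributed setting each chunk would live on a different node, and `map_phase` would run where the chunk is stored.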
Bulk synchronous parallel pattern
Bulk Synchronous Parallel
This model is based on lock-step execution across all workers, coordinated by a master. Each worker repeats the following steps until the exit condition is reached, which is when there are no more active workers.
- Each worker reads data from its input queue
- Each worker performs local processing based on the data it read
- Each worker pushes its local result along its direct connections
This pattern has been used in Google's Pregel graph processing model as well as the Apache Hama project.
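The three steps above can be sketched with a maximum-value propagation over a small graph, a common introductory example for this style of processing. The code simulates the workers and the barrier in a single loop; `bsp_max`, `values`, and `neighbors` are hypothetical names, and this is not the Pregel or Hama API.

```python
def bsp_max(values, neighbors):
    """Spread the largest value through the graph, one superstep at a time."""
    values = dict(values)
    # Initial superstep: every worker sends its value to its direct neighbors.
    inbox = {node: [] for node in values}
    for node, value in values.items():
        for peer in neighbors[node]:
            inbox[peer].append(value)

    # Exit condition: no messages in flight means no worker is active.
    while any(inbox.values()):
        outbox = {node: [] for node in values}
        for node, messages in inbox.items():
            if not messages:
                continue                      # inactive in this superstep
            best = max(messages)              # read from the input queue
            if best > values[node]:           # local processing
                values[node] = best
                for peer in neighbors[node]:  # push along direct connections
                    outbox[peer].append(best)
        inbox = outbox                        # barrier: next superstep begins
    return values


graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
final = bsp_max({"a": 3, "b": 1, "c": 2}, graph)
```

Swapping `inbox` for `outbox` only after every worker has run plays the role of the synchronization barrier: messages sent in one superstep are visible only in the next.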
Executor orchestration model
A DAG is introduced into the scheduler, which dispatches tasks to the nodes of the cluster.
Execution Orchestrator
This model is based on an intelligent scheduler / orchestrator that schedules ready-to-run tasks (based on a dependency graph) across a cluster of dumb workers.
This pattern is used in Microsoft's Dryad project.
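A minimal sketch of scheduling from a dependency graph, using `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The task names and the `run_dag` helper are hypothetical, and the tasks run one after another in the same process instead of being shipped to cluster nodes; this is not how Dryad itself is implemented.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(dependencies, tasks):
    """Run each task only after all of its dependencies have finished."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    order = []
    while sorter.is_active():
        # get_ready() returns every task whose dependencies are all done;
        # a real orchestrator would hand these to idle workers in parallel.
        for name in sorted(sorter.get_ready()):
            tasks[name]()
            order.append(name)
            sorter.done(name)
    return order


# Hypothetical job: each key maps to the set of tasks it depends on.
dependencies = {
    "load": set(),
    "clean": {"load"},
    "stats": {"clean"},
    "plot": {"clean"},
    "report": {"stats", "plot"},
}
log = []
tasks = {name: (lambda n=name: log.append(n)) for name in dependencies}
order = run_dag(dependencies, tasks)
```

The workers stay "dumb" here in the sense that each task is just a callable; all knowledge about ordering lives in the scheduler.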