ArthurChiao's Blog

Recent Posts

  • 2023-03-19

    TCP Retransmission May Be Misleading (2023)

    TL; DR Modern kernels by default enable a TCP option called Tail Loss Probe (TLP), which actively sends so-called “probe” packets to achieve TCP fast recovery. A side effect is that a large part of those probe packets is classified into TCP retransmissions (in good q...

  • 2023-03-02

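    [Translation] Borg, Omega, K8s: Design and Reflections on Google's Three Generations of Container-Management Systems over a Decade (ACM, 2016)

    Translator's preface: This post is a translation of Borg, Omega, and Kubernetes, acmqueue Volume 14, issue 1 (2016), originally subtitled Lessons learned from three container-management systems over a decade. The authors, Brendan Burns, Brian Grant, David Oppenheimer, Eric Brewer, and John Wilkes, are all from Google. The article describes the three successive generations of container... that Google designed and used over the past ten-plus years...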

  • 2023-02-05

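    Linux CFS Scheduler: Principles, Design, and Kernel Implementation (2023)

    Some notes on CFS, the default Linux scheduler. CFS, cgroup, and other kernel technologies together implement CPU resource limits for processes (CPU bandwidth control), which is one of the foundations of containers. 1 Concepts and relationships 1.1 CFS: fair scheduling of processes (tasks) 1.2 CFS extensions 1.2.1 Prerequisite: CONFIG_CGROUPS 1.2.2 Prerequisite: CONFIG_CGROUP_SCHED 1.2.3 Extension: support for real-time process groups (CONFIG_RT_GROUP_SCHED) ...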

  • 2023-01-25

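    k8s cgroup-Based Resource Limits (Capacity Enforcement): Model Design and Code Implementation (2023)

    1 Introduction 2 k8s resource model 2.1 Node resource abstraction 2.1.1 Capacity 2.1.2 Allocatable 2.1.3 Allocated 2.2 Node resource partitioning (reservation) 2.2.1 SystemReserved 2.2.2 KubeReserved 2.2.3 EvictionThreshold (eviction threshold)...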

  • 2022-12-11

    Pidfd and Socket-lookup BPF (SK_LOOKUP) Illustrated (2022)

    TL; DR Most Unix programming textbooks, as well as common practice, hold the following statements to be true: one socket can be opened by one and only one process (application); one socket can listen/serve on one and only one port. Recall the bind system call int...

  • 2022-12-11

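    [Translation] The Need for Sockets to Listen on Multiple Addresses and the Birth of SK_LOOKUP BPF (LPC, 2019)

    Translator's preface: This post combines translations of several Cloudflare talks and posts, covering the unique networking requirements they faced, how their solutions evolved, and the birth of the final solution, SK_LOOKUP BPF: Programming socket lookup with BPF, LPC, 2019; It’s crowded in here, Cloudflare blog, 2019; Steering connections to sockets with BPF socket lookup hook, eBPF Summit, 2020. Given the translator's limited expertise, this post may contain omissions or errors...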

  • 2022-11-12

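    [Translation] Cilium's Future Data Plane: Powering 100Gbit/s k8s Clusters (KubeCon, 2022)

    Translator's preface: This post is a translation of a talk from KubeCon+CloudNativeCon North America 2022: 100 Gbit/s Clusters with Cilium: Building Tomorrow’s Networking Data Plane. The authors, Daniel Borkmann, Nikolay Aleksandrov, and Nico Vibert, are all from Isovalent (the company behind Cilium). Some background, code snippets, and links were added during translation to aid understanding. The translation is authorized by Daniel. Given the translator's limited expertise, this post may...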

  • 2022-10-30

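    [Translation] Cilium: Better Bandwidth Management with BPF+EDT+FQ+BBR (KubeCon, 2022)

    Translator's preface: This post is a translation of a talk from KubeCon+CloudNativeCon Europe 2022: Better Bandwidth Management with eBPF. The authors, Daniel Borkmann, Christopher, and Nikolay, are all from Isovalent (the company behind Cilium). Some background, code snippets, and links were added during translation to aid understanding. The translation is authorized by Daniel. Given the translator's limited expertise, this post may contain omissions or errors. If in doubt, please consult the original. The translation follows. Translator's preface 1 Problem description 1.1 Container...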

  • 2022-10-07

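    [Translation] Fifty Years of Traffic Control (TC): The Evolution from Buffer Queues to Time-Based Scheduling (EDT) (Google, 2018)

    Translator's preface: This post combines translations of the technical parts of two Google talks from 2018. Both cover the same topic, but at different levels and with different emphases: Netdev 2018: Evolving from AFAP: Teaching NICs about time, which takes a broader view and explains the causality and historical evolution well; OCT 2018: From Queues to Earliest Departure Time, which is more technical and detailed. Some content related to Linux/Cilium/BPF was also added during translation. Given the translator's limited expertise, this post may contain omissions or errors. If in doubt, please consult the original...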

  • 2022-09-28

    Trip.com: Large Scale Cloud Native Networking & Security with Cilium/eBPF (eBPFSummit, 2022)

    This is an extended version of my talk at eBPF Summit 2022: Large scale cloud native networking and security with Cilium/eBPF: 4 years production experiences from Trip.com. This version covers more content and details that are missing from the talk (due to time limitations)...