YouTube is a video-sharing platform that allows users to upload, view, and interact with video content. As of this writing, it is the second most visited website in the world 🤯.
There's some conceptual overlap between this question and designing Dropbox. If you're less familiar with system design principles for file upload / download designs, I'd recommend reading that guide first.
Users can view information about a video, such as view counts.
Users can search for videos.
Users can comment on videos.
Users can see recommended videos.
Users can create and manage their own channel.
Users can subscribe to channels.
It's worth noting that this question is mostly focused on the video-sharing aspects of YouTube. If you're unsure which features to focus on for a feature-rich app like YouTube or similar, have some brief back-and-forth with the interviewer to figure out which part of the system they care about most.
The system should be highly available (prioritizing availability over consistency).
The system should support uploading and streaming large videos (10s of GBs).
The system should allow for low latency streaming of videos, even in low bandwidth environments.
The system should scale to a high number of videos uploaded and watched per day (~1M videos uploaded per day, 100M videos watched per day).
The system should support resumable uploads.
Below the line (out of scope)
The system should protect against bad content in videos.
The system should protect against bots or fake accounts uploading or consuming videos.
The system should have monitoring / alerting.
Here's how it might look on a whiteboard:
Non-Functional Requirements
For this question, given the small number of functional requirements, the non-functional requirements are even more important to pin down. They characterize the complexity of these deceptively simple "upload" and "watch" interactions. Enumerating these challenges is important, as it will deeply affect your design.
The Set Up
Planning the Approach
Before you move on to designing the system, it's important to start by taking a moment to plan your strategy. Generally, we recommend building your design up sequentially, going one by one through your functional requirements. This will help you stay focused and ensure you don't get lost in the weeds as you go. Once you've satisfied the functional requirements, you'll rely on your non-functional requirements to guide you through the deep dives.
I like to start with a broad overview of the primary entities. At this stage, it is not necessary to know every specific column or detail. We will focus on these intricacies later, when we have a clearer grasp of the system (during the high-level design). Initially, establishing these key entities will guide our thought process and lay a solid foundation as we progress towards defining the API.
For YouTube, the primary entities are pretty straightforward:
User: A user of the system, either an uploader or a viewer.
Video: A video that is uploaded / watched.
VideoMetadata: Metadata associated with the video, such as the uploading user, a URL reference to a transcript, etc. We'll go into more detail later about what specifically we'll store here.
In the actual interview, this can be as simple as a short list like this. Just make sure you talk through the entities with your interviewer to ensure you're on the same page.
The API is the primary interface that users will interact with. It's important to define the API early on, as it will guide your high-level design. We just need to define an endpoint for each of our functional requirements.
Let's start with an endpoint to upload a video. We might have an endpoint like this:
POST /upload
Request:
{
Video,
VideoMetadata
}
To stream a video, our endpoint might retrieve the video data to play it on device:
GET /videos/{videoId} -> Video & VideoMetadata
Be aware that your APIs may change or evolve as you progress. In this case, our upload and stream APIs actually evolve significantly as we weigh the trade-offs of various approaches in our high-level design (more on this later). You can proactively communicate this to your interviewer by saying, "I am going to outline some simple APIs, but I may come back and improve them as we delve deeper into the design."
Before jumping into each requirement, it's worth laying out some fundamentals about video storage and streaming.
To succeed in designing YouTube, you don't need to be an expert on video streaming or video storage; that would be an unreasonable ask. However, it's worth understanding the fundamentals at a high level to be able to successfully navigate this question.
Video Codec - A video codec compresses and decompresses digital video, making it more efficient for storage and transmission. Codec is an abbreviation for "encoder/decoder." Codecs attempt to reduce the size of the video while preserving quality. Codecs usually trade off on the following: 1) time required to compress a file, 2) support on different platforms, 3) compression efficiency (a.k.a. how much the original file is reduced in size), and 4) compression quality (lossy or not). Some popular codecs include: H.264, H.265 (HEVC), VP9, AV1, MPEG-2, and MPEG-4.
Video Container - A video container is a file format that stores video data (frames, audio) and metadata. A container might house information like video transcripts as well. It differs from a codec in that a codec determines how a video is compressed / decompressed, whereas a container dictates the file format in which the video is stored. Support for video containers varies by device / OS.
Bitrate - The bitrate of a video is the number of bits transmitted over a period of time, typically measured in kilobits per second (Kbps) or megabits per second (Mbps). The size and quality of the video affect the bitrate. High resolution videos with higher framerates (measured in FPS) have significantly higher bitrates than low resolution videos at lower framerates. This is because there's literally more data that needs to be transferred in order for the video to play. Compression via codecs also affects bitrate, as more efficient compression can shrink a large video to a much smaller size prior to transmission.
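To build intuition for how bitrate, duration, and file size relate, here's a quick back-of-the-envelope sketch. The bitrates used are illustrative ballpark figures, not exact numbers for any codec:

```python
def file_size_gb(bitrate_mbps: float, duration_min: float) -> float:
    """Approximate file size for a video encoded at a constant bitrate."""
    bits = bitrate_mbps * 1_000_000 * duration_min * 60
    return bits / 8 / 1_000_000_000  # bits -> bytes -> GB

# A 10-minute 1080p video at ~5 Mbps (a rough H.264 ballpark):
hd = file_size_gb(5, 10)   # ~0.375 GB
# The same video at ~1 Mbps (a lower-resolution rendition) is ~5x smaller:
sd = file_size_gb(1, 10)   # ~0.075 GB
```

This is why adaptive bitrate streaming (covered later) matters so much: stepping down a rendition cuts the bytes on the wire by multiples, not percentages.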
Manifest Files - Manifest files are text-based documents that give details about video streams. There are typically two types: primary manifests and media manifests. A primary manifest lists all the available versions of a video (the different formats). The primary is the "root" file and points to media manifest files, each representing a different version of the video. A video version is typically split into small segments, each a few seconds long. Media manifest files list out the links to these segment files and are used by video players to stream video, serving as an "index" to these segments.
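As a concrete illustration, here is roughly what these two files look like in HLS's M3U8 format (the paths, bandwidths, and segment durations below are made up for illustration):

```
# Primary manifest: lists the available renditions of the video
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=1000000,RESOLUTION=640x360
360p/media.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/media.m3u8

# Media manifest (e.g. 1080p/media.m3u8): an index of ~4s segments
#EXTM3U
#EXT-X-TARGETDURATION:4
#EXTINF:4.0,
segment_000.ts
#EXTINF:4.0,
segment_001.ts
#EXT-X-ENDLIST
```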
Throughout this write-up, "video format" will be shorthand for a container and codec combination. Now, let's jump into the functional requirements!
1) Users can upload videos
One of the main requirements of YouTube is to allow users to upload videos that are eventually viewed by other users. When we upload a video, we need to consider a few fundamental things:
Where do we store the video metadata (name, description, etc.)?
Where do we store the video data (frames, audio, etc.)?
What do we store for video data?
For the video metadata, we are assuming an upload rate of ~1M videos/day. This means that over the course of a year, we'll have ~365M records in the database representing videos. As a result, we should consider storing video metadata in a database that can be horizontally partitioned, such as Cassandra. Cassandra offers high availability and enables us to choose a partition key for our data. We can partition on videoId, since we aren't really worried about bulk-accessing videos in this design, just querying individual videos.
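A sketch of what this table might look like in CQL. The column names here are illustrative assumptions, not a prescribed schema:

```
CREATE TABLE video_metadata (
    video_id      uuid,       -- partition key: all reads are point look-ups
    uploader_id   uuid,
    title         text,
    description   text,
    status        text,       -- e.g. 'uploading', 'processing', 'complete'
    manifest_url  text,       -- populated once post-processing finishes
    created_at    timestamp,
    PRIMARY KEY (video_id)
);
```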
When designing a data storage solution that needs to scale, it's worthwhile to think about partitioning. Some systems necessitate careful partitioning such that data can be read from a single or a handful of nodes. Other systems require consistency within some scoped domain, necessitating a relational DB (with ACID guarantees) sharded by a key that represents that domain; e.g. Ticketmaster might shard by concert ID to ensure consistency of ticket purchases. However, there are some systems that don't require careful partitioning at all. For this system, we can shard by videoId because we'd only ever do a point look-up by videoId.
For storing video data, there's some overlap between this problem and the Dropbox interview question. The TL;DR is that it's most efficient to upload data directly to a blob store like S3 via a presigned URL with multi-part upload. Feel free to read our write-up on Dropbox for an analysis of that part of the system.
The decision to upload directly to S3 means we have to change our POST /upload API to a POST /presigned_url API. The server will create a presigned URL to enable the client to upload directly to S3. The request payload will just be the video metadata, rather than the metadata and the video file.
At this point, the system will look like this:
Users can upload videos
Finally, when it comes to storing video, it's worthwhile to consider what exactly we'll be storing. Understanding this will inform the deep dives we'll need to do later to clarify how we'll process videos so that our system can satisfy our functional and non-functional requirements. Let's look at some options.
Approach
This approach basically ignores the fact that we'll need to do any video post-processing. We store just the file the user provides and don't perform any post-processing.
Challenges
This is a naive design and one that won't work in practice, because different devices require different video formats in order to play back video. We need to be more sophisticated about how we store videos.
Approach
This approach involves doing some sort of post-processing of a video to ensure we convert the video into different formats playable on different devices. Once the user uploads a video, S3 will fire an event notification to a video processing service. This service will do the work to convert the original video into different formats. It will store each format as a file in S3. It will also update the video metadata record with the file URLs representing the different formats.
Store different video formats
Challenges
This approach fails to anticipate the need to store small segments of video for streaming later. If we store the entire video, there's no way for the client to download "part" of a video. As we will see later, downloading "part" of a video is really important for streaming for various reasons.
Approach
This approach post-processes videos by splitting them into small segments (each a playable unit that's a few seconds in length) and then converts each segment into different formats playable on different devices. This is a strictly better design than the previous design because it enables more efficient video streaming later, which we will clarify as we design our system.
Store different video formats as segments
Challenges
This approach introduces some complexity. Firstly, it makes our post-processing service more complex, turning it into a "pipeline." It first must split up the video into segments, and then generate video formats per segment. In addition, the system needs to store references to these segments in a sane way and use them downstream in our streaming flow effectively. This will be a key system to clarify when we get to the "deep dive" portion of the interview.
2) Users can watch videos
Now that we're storing videos, users should be able to watch them. When initially watching a video, we can assume the system fetches the VideoMetadata from the video metadata DB. Given that we'll be storing video content in S3, we'll want to modify our GET /video endpoint to return just the VideoMetadata, as that record will contain the URL(s) necessary to watch the video.
Let's look at the options we have when it comes to enabling users to watch videos.
Approach
This approach involves the client downloading the whole video to play it. Technically, this isn't "streaming", but rather video download and playback. The video would be downloaded from S3 via a URL.
Challenges
In order to play the video, the client needs to download the video in its entirety. Even if the video is extensively compressed and has a low resolution, it can still be large, leading to a long download time. This can be problematic for 2 reasons:
If the entire video needs to be downloaded before playback, the user could be waiting to watch the video for a long time. For example, a 10GB video would take 13+ minutes to download on 100 Mbps internet, which is an unreasonable amount of time.
If the client requests the video in a single HTTP request and experiences a network disruption during that request, the download could fail and any download progress would be lost, resulting in a lot of time wasted on a re-attempt.
In general, this approach is not viable when building a video streaming service.
Approach
Rather than forcing a full video download all at once, the system can instead download video segments to properly "stream" the video.
The client would choose a video format based on the user's device, bandwidth, and preferences (e.g. if the user specified HD video, the client would stream 1080p video). The client would then load the first segment for the video, which would be a few seconds in length. This would allow the user to start watching the video quickly without excess loading. In the background, the client would start loading more segments so that it could continue playing the video seamlessly.
Download segments incrementally
Challenges
This approach is more complex and relies on the uploaded videos being stored as segments (which was a previous design decision). Additionally, this approach doesn't take into account fluctuating network constraints while a user is watching a video. If a 1080p video is streamed and network conditions get worse, loading 1080p segments might get slower, resulting in buffering for the user.
On the surface, the "good" approach may seem like just a 'chunked' download of the video, but it is actually distinct and strictly better / clearer. Some video formats are stored in order, meaning that downloading the file in chunks (e.g. 5MB at a time) results in a file stream that is actually playable. However, this isn't the case for all video formats, and it might be unclear to the interviewer that this approach actually yields data that can be played back to the user with a quick turnaround time. Because the "good" solution divides a video into playable segments, it more clearly conveys how the video can be quickly and incrementally played, and it's strictly closer to the "great" solution below.
Approach
Adaptive bitrate streaming relies on having stored segments of videos in different formats. It also relies on a manifest file being created during video upload time, which references all the video segments that are available in different formats. Think of a manifest file as an "index" for all the different video segments with different formats; it will be used by the client to stream segments of video as network conditions vary (see an example here).
The client will execute the following logic when streaming the video:
The client will fetch the VideoMetadata, which will have a URL pointing to the manifest file in S3.
The client will download the manifest file.
The client will choose a format based on network conditions / user settings. The client retrieves the URL for this segment in its chosen format from the manifest file. The client will download the first segment.
The client will play that segment and begin downloading more segments.
If the client detects that network conditions are slowing down (or improving), it will vary the format of the video segments it is downloading. If network conditions get worse (e.g. the bitrate is lower), the client will attempt to download more compressed, lower resolution segments to avoid any interruption in streaming.
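The rendition-selection step can be sketched minimally as follows. The rendition table and the throughput-headroom heuristic are assumptions for illustration; real players (e.g. hls.js) use more sophisticated ABR rules:

```python
# Renditions we assume the primary manifest advertises, as (name, required_mbps),
# sorted from highest to lowest quality.
RENDITIONS = [("1080p", 5.0), ("720p", 2.5), ("480p", 1.0), ("360p", 0.5)]

def choose_rendition(measured_mbps: float, headroom: float = 0.8) -> str:
    """Pick the highest rendition whose bitrate fits within the measured
    bandwidth, scaled by a safety headroom to absorb fluctuations."""
    budget = measured_mbps * headroom
    for name, required in RENDITIONS:
        if required <= budget:
            return name
    return RENDITIONS[-1][0]  # fall back to the lowest rendition

# As measured throughput drops between segment downloads, the client steps down:
# choose_rendition(10.0) -> "1080p", choose_rendition(2.0) -> "480p"
```

The client would re-run this check between segment downloads, which is what makes the bitrate "adaptive."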
Adaptive bitrate streaming
Challenges
This approach is the most complex and involves the client being a more active participant in the system (which isn't a bad thing). It also relies on upstream design decisions involving video splitting by segments, variance of format storage, and the creation of manifest files.
1) How can we handle processing a video to support adaptive bitrate streaming?
Smooth video playback is key to the user experience of this system. In order to support it, we need adaptive bitrate streaming, so the client can incrementally download segments of video in varying formats to adapt to fluctuating network conditions. To support such a design, it is important to dig into the details of how video data is processed and stored.
When a video is uploaded in its original format, it needs to be post-processed to make it available as a streamable video to a wide range of devices. As indicated previously, post-processing a video ends up being a "pipeline". The output of this pipeline is:
Video segment files in different formats (codec and container combinations) stored in S3.
Manifest files (a primary manifest file and several media manifest files) stored in S3. The media manifest files will reference segment files in S3.
In order to generate the segments and manifest files, the stepwise order of operations will be:
Split up the original file into segments (using a tool like ffmpeg or similar). These segments will be transcoded (converted from one encoding to another) and used to generate different video containers.
Transcode (convert from one encoding to another) each segment and process other aspects of the segments (audio, transcript generation).
Create manifest files referencing the different segments in different video formats.
Mark the upload as "complete".
Of note, this design assumes that we upload the original video in full first, before we start processing / splitting. Some video services start processing earlier if the client splits the video into segments on upload. This enables a "pipeline" where the client uploads part of the video and the downstream work can happen before the client is even done uploading the full original format. For the sake of simplicity, we avoid this approach, but it's an additional optimization we can consider if we want to speed up the upload experience for the user.
This series of operations can be thought of as a "graph" of work to be done. Each operation is a step with some fan-out / fan-in of work based on what "dependencies" exist. The segment processing (transcoding, audio processing, transcription) can be done in parallel on different worker nodes, since there are no dependencies between segments once they have been split up. Given the graph of work and the one-way dependencies, the work to be done here can be considered a directed acyclic graph, or a "DAG."
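The fan-out / fan-in shape of this pipeline can be sketched as follows. In practice an orchestrator like Temporal would schedule each step on separate worker nodes; the step functions below are stand-ins, and the S3 paths are made up:

```python
from concurrent.futures import ThreadPoolExecutor

def split(video: str, n: int = 4) -> list[str]:
    # Stand-in for ffmpeg splitting the original upload into segments.
    return [f"{video}/seg_{i}" for i in range(n)]

def transcode(segment: str, fmt: str) -> str:
    # Stand-in for CPU-bound transcoding; runs independently per segment.
    return f"{segment}.{fmt}"

def build_manifest(files: list[str]) -> str:
    # Fan-in: reference every transcoded segment file.
    return "\n".join(files)

segments = split("s3://videos/raw/abc")                  # 1. split
with ThreadPoolExecutor() as pool:                       # 2. fan-out transcodes
    files = list(pool.map(lambda s: transcode(s, "h264"), segments))
manifest = build_manifest(files)                         # 3. fan-in manifests
```

Each rung of the fan-out is independent, which is exactly what lets the transcoding step scale horizontally across many machines.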
For this DAG of work, the most expensive computation is the video segment transcoding. Ideally, this work is done with extreme parallelism across many different machines (and cores), given the CPU-bound nature of converting a video from one codec to another.
Additionally, to actually orchestrate the work of this DAG, an orchestrator system can be used to build the graph of work and assign tasks to worker nodes at the right time. We can assume we're using an existing technology for this, such as Temporal.
Finally, for any temporary data produced in this pipeline (segments, audio files, etc.), S3 can be used to avoid passing files between workers; URLs can be passed around to reference those files.
Below is how our system might look with our post-processor DAG drawn out:
How can we handle processing a video to support adaptive bitrate streaming?
We don't need to draw out a full DAG with exact steps and precise examples of transcoding, audio processing, etc. Rather, it's important to dive into the explicit inputs and outputs of video post-processing, and understand how we can process videos in a scalable and efficient way. Given the parallelism and orchestration required to efficiently process a video, detailing a DAG solution makes sense here.
2) How do we support resumable uploads?
In order to support resumable uploads for larger videos, we'll need to consider how we'll track progress for the video upload flow. Of note, this refers to tracking progress of the original upload.
If you've read our Dropbox guide, you'll recognize that this deep dive strongly overlaps with the Dropbox deep dive on supporting large file uploads. The TL;DR is that:
The client would divide the video file into chunks, each with a fingerprint hash. A chunk would be small, ~5-10MB in size.
VideoMetadata would have a field called chunks, which would be a list of chunk JSONs, each with a fingerprint and a status field.
The client would POST to the backend to update the VideoMetadata with the list of chunks, each with status NotUploaded.
The client would upload each chunk to S3.
S3 would fire S3 event notifications to an AWS Lambda that would update the VideoMetadata, marking the chunk (identified by its fingerprint) as Uploaded.
If the client stopped uploading, it could resume by fetching the VideoMetadata to see which chunks had already been uploaded and skip them.
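The client-side bookkeeping above can be sketched as follows. SHA-256 is used here as the fingerprint, and the metadata shape (a list of `{"fingerprint", "status"}` dicts) is an assumption matching the description above:

```python
import hashlib

CHUNK_SIZE = 5 * 1024 * 1024  # ~5MB per chunk

def fingerprint_chunks(data: bytes) -> list[dict]:
    """Split a file into chunks and fingerprint each one, producing the
    `chunks` list we'd POST to the backend before uploading."""
    return [
        {"fingerprint": hashlib.sha256(data[i:i + CHUNK_SIZE]).hexdigest(),
         "status": "NotUploaded"}
        for i in range(0, len(data), CHUNK_SIZE)
    ]

def chunks_to_resume(metadata_chunks: list[dict]) -> list[int]:
    """On resume, fetch VideoMetadata and skip anything already Uploaded."""
    return [i for i, c in enumerate(metadata_chunks)
            if c["status"] != "Uploaded"]
```

On resume, the client re-chunks the local file, fetches the server-side `chunks` list, and only uploads the indices returned by `chunks_to_resume`.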
Below is how the system looks after we implement resumable uploads:
How do we support resumable uploads?
The above, in practice, is handled by AWS multipart upload. However, diving into the details in an interview like this is important to demonstrate your depth of understanding of how file uploads occur in practice.
3) How do we scale to a large number of videos uploaded / watched a day?
Our system assumes that ~1M videos are uploaded per day and that 100M videos are watched per day. This is a lot of traffic and necessitates that all of our systems scale to ensure a solid experience for end users.
Let's walk through each major system component to analyze how it will scale:
Video Service - This service is stateless and is responsible for responding to HTTP requests for presigned URLs and video metadata point queries. It can be horizontally scaled, with a load balancer proxying it.
Video Metadata - This is a Cassandra DB and will horizontally scale efficiently due to Cassandra's leaderless replication and internal consistent hashing. Videos will be uniformly distributed, as the data will be partitioned by videoId. Of note, the node that houses a popular video might become "hot" if that video is popular, which could be a bottleneck.
Video Processing Service - This service can scale to a high number of videos and can coordinate internally on how it distributes DAG work across worker nodes. Of note, this service will likely have some internal queuing mechanism that we don't visualize. This queue will allow it to handle bursts in video uploads. Additionally, the number of jobs in the queue might be a trigger for the system to elastically scale up more worker nodes.
S3 - S3 scales extremely well to high traffic / high file volumes. It is multi-region and can elastically scale. However, the data center housing the S3 data might be far from some percentage of users if the video is streamed by a wide audience, which might slow down the initial loading of a video or cause buffering for those users.
Based on the above analysis, we identified a few opportunities to improve our design's ability to scale and provide a great user experience to a wide userbase.
To address the "hot" video problem, we can consider tuning Cassandra to replicate data to a few nodes that can share the burden of storing video metadata. This means several nodes can service queries for video data. Additionally, we can add a cache for video metadata. This cache can store popular video metadata to avoid having to query the DB for it. The cache can be distributed, use a least-recently-used (LRU) eviction strategy, and be partitioned on videoId as well. The cache offers a faster way to retrieve data for popular videos and insulates the DB a bit.
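The cache described above can be as simple as an LRU map keyed on videoId. Here's a minimal in-process sketch; a real deployment would use a distributed cache such as Redis with LRU eviction configured:

```python
from collections import OrderedDict

class LruCache:
    """Tiny LRU cache for video metadata, keyed on videoId."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items: OrderedDict = OrderedDict()

    def get(self, video_id):
        if video_id not in self.items:
            return None  # cache miss: caller falls back to Cassandra
        self.items.move_to_end(video_id)   # mark as recently used
        return self.items[video_id]

    def put(self, video_id, metadata):
        self.items[video_id] = metadata
        self.items.move_to_end(video_id)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used

cache = LruCache(capacity=2)
cache.put("v1", {"title": "a"})
cache.put("v2", {"title": "b"})
cache.get("v1")                  # touch v1, so v2 is now least recent
cache.put("v3", {"title": "c"})  # capacity exceeded: evicts v2
```

Popular ("hot") videos naturally stay resident under this policy, which is exactly the access pattern that was overloading the DB.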
To address the streaming issues for users who might be far away from data centers with S3, we can consider employing CDNs. CDNs could cache popular video files (both segments and manifest files) and would be geographically proximate to many users via a network of edge servers. This would mean that video data would have to travel a significantly shorter distance to reach users, reducing slowness / buffering. Also, if all the data (manifest files and segments) was in the CDN, then the service would never need to interact with the backend at all to continue streaming a video. 为了应对可能远离数据中心的 S3 用户在流媒体方面的问题,我们可以考虑使用 CDN。CDN 可以缓存热门视频文件(包括片段和清单文件),并通过边缘服务器网络,使这些文件地理位置上接近许多用户。这意味着视频数据需要传输的距离会大大缩短,从而减少延迟/缓冲。此外,如果所有的数据(清单文件和片段)都在 CDN 中,那么服务在继续播放视频时就完全不需要与后端进行交互了。
Below is our final system, with the above enhancements applied: 下面是我们的最终系统,应用了上述改进:
How do we scale to a large number of videos uploaded / watched a day? 我们如何扩展以应对每天上传和观看的大量视频?
Some additional deep dives you might consider 你可能还需要进行一些更深入的研究
YouTube is a complex and interesting application, and it's hard to cover every possible consideration in this guide. Here are a few additional deep dives you might consider: YouTube 是一个复杂而有趣的应用程序,很难在这个指南中涵盖每一个可能的考虑因素。你可能还需要进行一些更深入的研究:
Speeding up uploads: Our design above assumes that the client uploads the entire video file first, before it's post-processed (segmented, transcoded, manifest files generated). To speed up uploads, we might consider "pipelining" the upload and post-processing. The client can segment the video and upload segments, and the backend can take those segments and immediately operate on them. This requires the client to play a role in video processing and could create "garbage" video segments if a video isn't fully uploaded, but on average, this would help improve the user experience and make uploads faster. 加速上传:我们的设计假设客户端会先上传整个视频文件,然后再进行后期处理(分割、转码、生成播放列表文件)。为了加速上传,我们可以考虑“流水线”处理,即客户端可以分割视频并上传片段,后端可以立即处理这些片段。这需要客户端参与视频处理,如果视频未完全上传,可能会产生“无效”的视频片段,但总体上这将有助于改善用户体验并加快上传速度。
Resume streaming where the user left off: Many applications let users resume a video where they previously left off. This requires storing a playback position per user per video.
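Mapping a saved position back to a starting point in the stream is mostly arithmetic once you know the segment duration from the manifest. The 6-second segment duration and the shape of the stored progress row below are illustrative assumptions, not part of the original design.

```python
SEGMENT_DURATION_S = 6.0  # illustrative; the real value comes from the manifest

# Hypothetical row from a WatchProgress table keyed by (user_id, video_id).
progress = {"user_id": "u1", "video_id": "v123", "position_s": 95.0}

def resume_segment(position_s, segment_duration_s=SEGMENT_DURATION_S):
    """Return the index of the segment containing the saved position."""
    return int(position_s // segment_duration_s)

seg = resume_segment(progress["position_s"])
offset = progress["position_s"] - seg * SEGMENT_DURATION_S
print(seg, offset)  # segment 15, 5.0 seconds into it
```

The player then requests segment 15 first (at whatever resolution the adaptive logic picks) and seeks 5 seconds into it, rather than streaming from segment 0.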
View counts: If the interviewer wishes to include more features in scope for the system, you might consider discussing the different options for maintaining view counts, either exact or estimated. This can easily be a dedicated deep dive on its own.
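One common middle ground in that discussion is sharded counters: writes are spread across N counter shards so a viral video doesn't hammer a single hot row, and reads sum the shards (often periodically, with the result cached), so the displayed count can lag slightly. The in-memory list below stands in for per-shard rows in a database or Redis; the shard count is an illustrative assumption.

```python
import random

NUM_SHARDS = 16
shards = [0] * NUM_SHARDS  # stand-in for per-shard counter rows (DB / Redis)

def record_view():
    """Increment one randomly chosen shard; this spreads write load so no
    single row becomes a hot spot for a popular video."""
    shards[random.randrange(NUM_SHARDS)] += 1

def view_count():
    """Sum the shards on read. In practice this aggregation is done
    periodically and cached, trading freshness for cheap reads."""
    return sum(shards)

for _ in range(1000):
    record_view()
print(view_count())  # → 1000
```

If even approximate ordering is acceptable, probabilistic structures (e.g. HyperLogLog for unique viewers) push the trade-off further toward cheap writes.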
Ok, that was a lot. You may be thinking, "how much of that is actually required from me in an interview?" Let's break it down.
Mid-level
Breadth vs. Depth: A mid-level candidate will be mostly focused on breadth (80% vs 20%). You should be able to craft a high-level design that meets the functional requirements you've defined, but many of the components will be abstractions with which you only have surface-level familiarity.
Probing the Basics: Your interviewer will spend some time probing the basics to confirm that you know what each component in your system does. For example, if you add an API Gateway, expect that they may ask what it does and how it works (at a high level). In short, the interviewer is not taking anything for granted with respect to your knowledge.
Mixture of Driving and Taking the Backseat: You should drive the early stages of the interview in particular, but the interviewer doesn't expect you to proactively recognize problems in your design with high precision. Because of this, it's reasonable that they will take over and drive the later stages of the interview while probing your design.
The Bar for YouTube: For this question, a mid-level candidate will have clearly defined the API endpoints and data model, and landed on a high-level design that works for video upload / playback. I don't expect candidates to know in-depth information about video streaming, but they should converge on ideas involving multipart upload and segment-based streaming. I also expect the candidate to understand the need to interface with a system like S3 directly for upload / streaming, and to drive some clarity on at least one relevant "deep dive" topic.
Senior
Depth of Expertise: As a senior candidate, expectations shift towards more in-depth knowledge — about 60% breadth and 40% depth. This means you should be able to go into technical details in areas where you have hands-on experience. It's crucial that you demonstrate a deep understanding of key concepts and technologies relevant to the task at hand.
Advanced System Design: You should be familiar with advanced system design principles (different technologies, their use-cases, how they fit together). Your ability to navigate these advanced topics with confidence and clarity is key.
Articulating Architectural Decisions: You should be able to clearly articulate the pros and cons of different architectural choices, especially how they impact scalability, performance, and maintainability. You justify your decisions and explain the trade-offs involved in your design choices.
Problem-Solving and Proactivity: You should demonstrate strong problem-solving skills and a proactive approach. This includes anticipating potential challenges in your designs and suggesting improvements. You need to be adept at identifying and addressing bottlenecks, optimizing performance, and ensuring system reliability.
The Bar for YouTube: For this question, senior candidates are expected to quickly go through the initial high-level design so that they can spend time discussing, in detail, how to handle video post-processing and the details of uploads. I expect the candidate to know about multipart upload and discuss how it would be used to handle resumable uploads. I also expect the candidate to know how a video would be post-processed efficiently to create files that enable streaming in an adaptive way.
Staff+
Emphasis on Depth: As a staff+ candidate, the expectation is a deep dive into the nuances of system design — I'm looking for about 40% breadth and 60% depth in your understanding. This level is all about demonstrating that, while you may not have solved this particular problem before, you have solved enough problems in the real world to be able to confidently design a solution backed by your experience.
You should know which technologies to use, not just in theory but in practice, and be able to draw from your past experiences to explain how they'd be applied to solve specific problems effectively. The interviewer knows you know the small stuff (REST API, data normalization, etc.) so you can breeze through that at a high level so you have time to get into what is interesting.
High Degree of Proactivity: At this level, an exceptional degree of proactivity is expected. You should be able to identify and solve issues independently, demonstrating a strong ability to recognize and address the core challenges in system design. This involves not just responding to problems as they arise but anticipating them and implementing preemptive solutions. Your interviewer should intervene only to focus, not to steer.
Practical Application of Technology: You should be well-versed in the practical application of various technologies. Your experience should guide the conversation, showing a clear understanding of how different tools and systems can be configured in real-world scenarios to meet specific requirements.
Complex Problem-Solving and Decision-Making: Your problem-solving skills should be top-notch. This means not only being able to tackle complex technical challenges but also making informed decisions that consider various factors such as scalability, performance, reliability, and maintenance.
Advanced System Design and Scalability: Your approach to system design should be advanced, focusing on scalability and reliability, especially under high load conditions. This includes a thorough understanding of distributed systems, load balancing, caching strategies, and other advanced concepts necessary for building robust, scalable systems.
The Bar for YouTube: For a staff-level candidate, expectations are high regarding the depth and quality of solutions, especially for the complex scenarios discussed earlier. Exceptional candidates delve deeply into each of the topics mentioned above and may even steer the conversation in a different direction, focusing extensively on a topic they find particularly interesting or relevant. They are also expected to possess a solid understanding of the trade-offs between various solutions and to be able to articulate them clearly, treating the interviewer as a peer.
CommonSapphireShrew524
Awesome, thank you! I'm excited for the Auction Bidding one (presumably next)!
Evan King
Yup! Seems a lot of interest in that one.
AppropriateGreenHoverfly664
It would be great if we get the Auction one :). Thanks a lot for all your detailed videos
Viresh
Thanks for the detailed writeup. Some companies (like Google), expect candidates to not rely too much on external solutions, S3 multipart upload in this case. It'd be great to have the writeup without too much reliance on the S3 multipart specific functionality (like events notifications and Lambda) and just call out what functional assumptions are we making on this file uploading framework.
It'd also be ideal to explore 1 more alternative to S3 since they could ask about why S3 and not something else. A reasonable choice there could be tus.io
Love the callout to Temporal as the workflow orchestration choice.
Nit: there's a typo in "handleful of nodes" -> "handful of nodes"
Evan King
Good call out! Fixing the typo now, thanks for flagging
DevelopedWhiteSwift800
Should we also consider geographical sharding of the database since there could be videos which are more popular in some regions as compared to others?
krishan gopal
We're already partially doing that via the CDN.
Manav Mittal
How would you handle these two cases?
Resume streaming where the user left off -- would it only involve having a DB table where we keep a relationship between the user, the video id, and the timestamp they left off at, and then, based on that timestamp and the video segments created, calculate the segment # and the resolution to start streaming from using the manifest files?
Serve a video catalog, searchable by title, last added date, and view counts -- how do we approach this? Do we really need to use ElasticSearch here?
core2extremist
looks very reasonable, I think there's a lot of different solutions. One extreme is what you suggest: the service should do an excellent job of tracking watch progress across the user's clients. That would argue for storage in a DB with a very long or infinite TTL. In the other extreme the client-side front end can store video->time pairs in the browser's cache, and if it gets lost, then oh well. A middle ground could be browser cache plus a Redis-like k-v store with a TTL on the order of days, and cleared when done watching. In other words do a simple best-effort approach to sync across devices assuming that people will finish watching within a day or two. Empirically YT doesn't have any recording of watch progress so doing something quick and dirty is probably okay.
I'm not strong on search but my guess is we'd indeed really need ElasticSearch or similar. Google has order 10B videos so making it all searchable at scale seems like a significant undertaking. I don't think there's time to do a deep dive into an ElasticSearch-like service in addition to all the other things that need to be discussed, so name-dropping "to make it searchable we'd probably want a separate search subsystem for indexing and serving queries, such as ElasticSearch" is probably the "right" answer.
In practice I think YT/Google have their own search infra for YT videos... but then again Google has its own infra and software for pretty much everything!
FunnyPeachHare909
how does the video metadata have a videoId when we initially try to upload the video?
Evan King
It's common to generate the id client side and send it within the POST request. This also ensures idempotency. It's just a random uuid, or cuid, etc.
SharpLimePiranha849
Why are "Users" called out as a core entity in this case, when in other problems that's been called out as something that doesn't matter? It doesn't seem to fundamentally affect the design of the system, and we never end up seeing a data model for it
Evan King
Not important either way to be honest. Include or don't, it's not what matters in these designs. Just don't spend a lot of time detailing an obvious user model.
PositiveSilverFinch306
Thanks for the detailed design, it is really helpful. Would it be possible to also add some reference links to the actual designs of these products, like YouTube and Google Dropbox? That would help us understand, relate, and map how they actually handle these non-functional requirements.
Evan King
Google might be your best bet there :)
ExtensiveCoffeeSnipe503
For the sake of completeness and clarity, could you please write down the revised/improved APIs? Particularly for the video upload API: is it two roundtrips (get the presigned URL first, perform chunking and upload, then inform the backend of the chunking metadata) or one roundtrip (chunking and all metadata ready before calling the API to get the presigned URL)?
John Doe
For resumable uploads it's assumed that the Client is doing the chunking akin to the Dropbox post. In the additional deep dives it's stated that the "client is uploading the entire video at once". I thought we would incorporate chunked uploads in the final design since it's a non-functional requirement.
Additionally, ~10GB+ payloads may not be supported across the upload stack. (or a pain regardless due to network / client failures)
DustyTurquoiseCardinal816
yeah, those two seem to be conflicting with each other.
ValidLavenderDragon603
When the author says "client is uploading the entire video at once" during deep dives, I believe they are referring to the fact that the post-processing step occurs after the video upload is fully completed. However, the video upload process itself still uses the chunked method.
AYUSH GUPTA
Hi, can anyone tell me which website is being used to create the flowchart above?
Evan King
excalidraw