Translation - Introducing the S3A Committers

Original article:

https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.1/bk_cloud-data-access/content/ch03s08s01.html

As covered in the S3Guard section, Amazon's S3 Object Store is not a filesystem: some expected behaviors of one are missing. Specifically:

  • Directory listings are only eventually consistent.

  • File overwrites and deletes are only eventually consistent: readers may get old data.

  • There is no rename operation; it is mimicked in the S3A client by listing a directory and copying each file underneath, one-by-one.

Because directory rename is mimicked by listing and then copying files, eventual-consistency failures in both the listing and the reads may result in incorrect data. And, because of the copying, it is slow.

S3Guard addresses the listing inconsistency problem. However, it does not address the update consistency or performance.

The normal means by which Hadoop MapReduce and Apache Spark commit work from multiple tasks is through renaming the output. Each task attempt writes to a private task attempt directory; when the task is given permission to commit by the MapReduce Application Master or Spark Driver, this task attempt directory is renamed into the job attempt directory. When the job is ready to commit, the output of all the tasks is merged into the final output directory, again by renaming files and directories.

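The rename-based protocol above can be sketched with plain directory renames. This is a simplified, hypothetical layout (the real FileOutputCommitter uses `_temporary/<appAttempt>/...` paths and more bookkeeping), shown only to illustrate why the scheme is cheap on a real filesystem:

```python
import os
import shutil
import tempfile

def commit_by_rename(base: str) -> None:
    # Hypothetical simplified layout, not the real FileOutputCommitter scheme.
    task_attempt = os.path.join(base, "_temporary", "task_00000_attempt_0")
    job_attempt = os.path.join(base, "_temporary", "job_committed")
    final_out = os.path.join(base, "output")

    # 1. The task attempt writes into its private directory.
    os.makedirs(task_attempt)
    with open(os.path.join(task_attempt, "part-00000"), "w") as f:
        f.write("task output data\n")

    # 2. Task commit: rename the task attempt directory into the job
    #    attempt directory. On HDFS this is a fast metadata-only operation.
    os.rename(task_attempt, job_attempt)

    # 3. Job commit: merge committed task output into the final output
    #    directory, again by renaming.
    os.makedirs(final_out)
    for name in os.listdir(job_attempt):
        os.rename(os.path.join(job_attempt, name),
                  os.path.join(final_out, name))
    shutil.rmtree(os.path.join(base, "_temporary"))

base = tempfile.mkdtemp()
commit_by_rename(base)
```

On HDFS each rename is an O(1) metadata operation; on S3 the same renames become listings plus per-file copies, which is exactly the slow and unreliable step the S3A committers are designed to avoid.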
This is fast and safe on HDFS and similar filesystems, and on object stores with rename operations, such as Azure WASB. On S3, it is unreliable without S3Guard, and even with S3Guard, the time to commit work to S3 is proportional to the amount of data written. An operation which may work well during development can turn out to be unusable in production.

To address this the S3A committers were developed. They allow the output of MapReduce and Spark jobs to be written directly to S3, with a time to commit the job independent of the amount of data created.

What Are the S3A Committers?

The S3A committers are three different committers which can be used to commit work from MapReduce and Spark directly to S3, without renaming. They differ in how they deal with conflict and how they upload data to the destination bucket, but underneath they all share much of the same code.

They rely on a specific S3 feature: multipart upload of large files.

When writing a large object to S3, S3A and other S3 clients use a mechanism called “Multipart Upload”.

  1. The caller initiates a “multipart upload request”, listing the destination path and receiving an upload ID to use in the upload operations.

  2. The caller then uploads the data in a series of HTTP POST requests, closing with a final POST listing the blocks uploaded.

  3. The uploaded data is not visible until that final POST request is made, at which point it is published in a single atomic operation.

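The three-step flow can be modeled with a tiny in-memory stand-in for the store. The method names mirror the S3 REST operations, but this is an illustration of the visibility semantics, not a real S3 client:

```python
import uuid

class FakeObjectStore:
    """In-memory stand-in for S3 multipart upload: uploaded parts are
    invisible until the final 'complete' call publishes them atomically."""
    def __init__(self):
        self.objects = {}   # key -> bytes (visible, published data)
        self.uploads = {}   # upload_id -> (key, {part_number: bytes})

    def create_multipart_upload(self, key):
        upload_id = str(uuid.uuid4())        # step 1: receive an upload ID
        self.uploads[upload_id] = (key, {})
        return upload_id

    def upload_part(self, upload_id, part_number, data):
        self.uploads[upload_id][1][part_number] = data   # step 2: POST a part

    def complete_multipart_upload(self, upload_id, part_numbers):
        key, parts = self.uploads.pop(upload_id)         # step 3: final POST
        self.objects[key] = b"".join(parts[n] for n in part_numbers)

store = FakeObjectStore()
uid = store.create_multipart_upload("data/part-00000")
store.upload_part(uid, 1, b"hello ")
store.upload_part(uid, 2, b"world")
assert "data/part-00000" not in store.objects  # invisible before final POST
store.complete_multipart_upload(uid, [1, 2])
assert store.objects["data/part-00000"] == b"hello world"
```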
This mechanism is always used by S3A whenever it writes large files; the size of each part is set to the value of fs.s3a.multipart.size.

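For example, the part size can be set in core-site.xml (the value below is illustrative, not a recommended default):

```xml
<property>
  <name>fs.s3a.multipart.size</name>
  <value>64M</value>
</property>
```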
The S3A Committers use the same multipart upload process, but convert it into a mechanism for committing the work of tasks because of a special feature of the mechanism: The final POST request does not need to be issued by the same process or host which uploaded the data.

The output of each worker task can be uploaded to S3 as a multipart upload, but without issuing the final POST request to complete the upload. Instead, all the information needed to issue that request is saved to a cluster-wide filesystem (HDFS or potentially S3 itself).

When a job is committed, this information is loaded, and the upload completed. If a task is aborted or fails, the upload is not completed, so the output does not become visible. If the entire job fails, none of its output becomes visible.

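Deferring the final POST is what turns multipart upload into a commit protocol. A minimal sketch, with a hypothetical file layout (the real committers serialize richer "pendingset" metadata than this):

```python
import json
import os
import tempfile

# Stands in for a cluster-wide filesystem such as HDFS.
pending_dir = tempfile.mkdtemp()

def task_commit(task_id, upload_id, dest_key, part_etags):
    # Save "how to finish this upload" instead of finishing it.
    pending = {"upload_id": upload_id, "key": dest_key, "etags": part_etags}
    with open(os.path.join(pending_dir, f"{task_id}.pending"), "w") as f:
        json.dump(pending, f)

def job_commit(complete_upload):
    # Load every successful task's pending info and issue its final POST.
    for name in sorted(os.listdir(pending_dir)):
        if name.endswith(".pending"):
            with open(os.path.join(pending_dir, name)) as f:
                complete_upload(json.load(f))

completed = []
task_commit("task_0", "upload-123", "out/part-00000", ["etag1", "etag2"])
# task_1 failed before committing: it wrote no .pending file, so its
# upload is never completed and its output never becomes visible.
job_commit(completed.append)
assert [p["key"] for p in completed] == ["out/part-00000"]
```

Note that `job_commit` can run in a different process, on a different host, from the tasks that uploaded the data: that is the S3 feature the committers rely on.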
Why Three Committers?

Why three different committers? Work was underway addressing the S3 commit problem with the “magic committer” when Netflix donated their own committers to this work. Their directory and partitioned committers were in production use, and the contribution was gratefully accepted.

Directory Committer: Buffers working data to the local disk, uses HDFS to propagate commit information from tasks to job committer, and manages conflict across the entire destination directory tree.

Partitioned Committer: Identical to the Directory committer except that conflict is managed on a partition-by-partition basis. This allows it to be used for in-place updates of existing datasets. It is only suitable for use with Spark.

Magic Committer: Data is written directly to S3, but “magically” retargeted at the final destination. Conflict is managed across the directory tree. It requires a consistent S3 object store, which means S3Guard is a mandatory pre-requisite.

We currently recommend use of the “Directory” committer: it is the simplest of the set, and by using HDFS to propagate data from workers to the job committer, does not directly require S3Guard – this makes it easier to set up. And, as Netflix have been using its predecessor in production, it can be considered more mature.

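Selecting the Directory committer is a matter of configuration. The Hadoop S3A documentation describes properties along these lines; check your distribution's docs for the exact names and any additional settings, such as the local staging buffer directory:

```xml
<property>
  <name>fs.s3a.committer.name</name>
  <value>directory</value>
</property>
<property>
  <name>mapreduce.outputcommitter.factory.scheme.s3a</name>
  <value>org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory</value>
</property>
```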
The rest of the documentation only covers the Directory Committer: to explore the other options, consult the Apache documentation.

Reposted from blog.csdn.net/tianyeshiye/article/details/86624866