

LocalBulkWriter

A LocalBulkWriter instance rewrites your raw data locally in a format that Milvus understands.

class pymilvus.LocalBulkWriter

Constructor

Constructs a LocalBulkWriter object from a schema, output path, segment size, and file type.

notes

A LocalBulkWriter object is intended to rewrite your raw data locally in a format that Milvus understands.

from pymilvus import CollectionSchema
from pymilvus.bulk_writer import LocalBulkWriter, BulkFileType

writer = LocalBulkWriter(
    schema=CollectionSchema(),
    local_path="string",
    chunk_size=512*1024*1024,
    file_type=BulkFileType.PARQUET
)

PARAMETERS:

  • schema (CollectionSchema) -

    [REQUIRED]

    The schema of the target collection into which the rewritten data is to be imported.

  • local_path (str) -

    [REQUIRED]

    The path to the directory that will hold the rewritten data.

  • chunk_size (int) -

    The maximum size of a file segment.

    While rewriting your raw data, BulkWriter splits it into segments.

    The value defaults to 536,870,912 bytes, which is 512 MB.

    How does BulkWriter segment my data?

    The way BulkWriter segments your data varies with the target file type.

    If the generated file exceeds the specified segment size, BulkWriter creates multiple files and names them with sequence numbers, each no larger than the segment size.

  • file_type (BulkFileType) - 

    The type of the output file. 

    The value defaults to BulkFileType.PARQUET. 

    Possible options are BulkFileType.JSON and BulkFileType.PARQUET. 
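
The chunk_size arithmetic above can be checked with a short sketch. The helper estimate_file_count is hypothetical and not part of pymilvus; it assumes every output file is filled right up to the segment size, so the real count may differ depending on the file type and how the rows are encoded.

```python
MB = 1024 * 1024
DEFAULT_CHUNK_SIZE = 512 * MB  # the default segment size stated in this section


def estimate_file_count(total_bytes: int, chunk_size: int = DEFAULT_CHUNK_SIZE) -> int:
    """Roughly estimate how many segment files total_bytes of output would need.

    Hypothetical helper: assumes each file is filled up to chunk_size.
    """
    if total_bytes < 0 or chunk_size <= 0:
        raise ValueError("total_bytes must be >= 0 and chunk_size must be > 0")
    # Integer ceiling division avoids float rounding on large byte counts.
    return -(-total_bytes // chunk_size)


print(DEFAULT_CHUNK_SIZE)               # 536870912
print(estimate_file_count(1300 * MB))   # 3
```

Under this assumption, about 1,300 MB of rewritten data would be split into three sequentially numbered files, none larger than 512 MB.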

RETURN TYPE: 

LocalBulkWriter 

RETURNS: 

A LocalBulkWriter object. 

EXCEPTIONS: 

  • SchemaNotReadyException 

    This exception will be raised when the provided schema is invalid. 

Properties 

  • uuid (str) - 

    A randomly generated UUID used to name the output file or directory. JSON, Parquet, and NumPy formats are supported.

  • data_path (pathlib.PosixPath) - 

    The path to the output directory. 

  • batch_files (list) -

    A list of the generated file names. 
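
A minimal sketch of how the constructor and these properties might be used together. The function write_demo_rows and its two-field schema (id, vector) are hypothetical and exist only for illustration. The sketch also calls append_row() and commit(), which are LocalBulkWriter methods not described in this section, and assumes pymilvus 2.4.x is installed.

```python
import random


def write_demo_rows(output_dir: str, num_rows: int = 100, dim: int = 8) -> dict:
    """Write random rows with LocalBulkWriter and return its properties."""
    # Imported inside the function so the snippet loads without pymilvus installed.
    from pymilvus import CollectionSchema, DataType, FieldSchema
    from pymilvus.bulk_writer import BulkFileType, LocalBulkWriter

    # Hypothetical schema: an INT64 primary key plus a float vector field.
    schema = CollectionSchema(fields=[
        FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=False),
        FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=dim),
    ])

    writer = LocalBulkWriter(
        schema=schema,
        local_path=output_dir,
        chunk_size=512 * 1024 * 1024,
        file_type=BulkFileType.JSON,
    )

    for i in range(num_rows):
        writer.append_row({"id": i, "vector": [random.random() for _ in range(dim)]})
    writer.commit()  # flush the buffered rows to disk

    return {
        "uuid": writer.uuid,
        "data_path": writer.data_path,
        "batch_files": writer.batch_files,
    }
```

Calling write_demo_rows("./bulk_data") returns the three properties listed above. data_path is where the output was written, and batch_files lists the files generated by the commit.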

Methods 

The following are the methods of the LocalBulkWriter class: 
