read_avro_schema
jetliner.read_avro_schema(source, *, storage_options=None)
Read the schema from an Avro file without reading data.
Returns the Polars schema that would result from reading the file. Avro types are mapped to Polars types (e.g., Avro records become Structs, Avro arrays become Lists, Avro enums become Categorical).
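As a rough illustration of the mapping described above, the sketch below walks a parsed Avro schema and produces Polars-style type names as plain strings. It is not jetliner's implementation: the `describe` helper, the `PRIMITIVES` table, and the sample schema are hypothetical, and the primitive mappings (for example `long` to `Int64`) are assumptions. Only the record, array, and enum cases come from the description above. Unions, maps, and logical types are left out.

```python
import json

# Hypothetical lookup, for illustration only. jetliner's actual primitive
# mapping is not documented here and may differ.
PRIMITIVES = {
    "boolean": "Boolean",
    "int": "Int32",
    "long": "Int64",
    "float": "Float32",
    "double": "Float64",
    "string": "String",
    "bytes": "Binary",
}


def describe(avro_type):
    """Return a Polars-style type name for a parsed Avro schema node."""
    if isinstance(avro_type, str):
        # Primitive name; names missing from the table are returned unchanged.
        return PRIMITIVES.get(avro_type, avro_type)
    kind = avro_type["type"]
    if kind == "record":
        # Avro records become Structs.
        fields = ", ".join(
            f"{f['name']}: {describe(f['type'])}" for f in avro_type["fields"]
        )
        return f"Struct({fields})"
    if kind == "array":
        # Avro arrays become Lists.
        return f"List({describe(avro_type['items'])})"
    if kind == "enum":
        # Avro enums become Categorical.
        return "Categorical"
    return describe(kind)


# Hypothetical schema. The top-level record's fields are assumed to become
# the columns of the resulting frame.
schema = json.loads("""
{"type": "record", "name": "User", "fields": [
  {"name": "id", "type": "long"},
  {"name": "tags", "type": {"type": "array", "items": "string"}},
  {"name": "status", "type": {"type": "enum", "name": "Status", "symbols": ["A", "B"]}},
  {"name": "address", "type": {"type": "record", "name": "Address", "fields": [
    {"name": "street", "type": "string"}
  ]}}
]}
""")

columns = {f["name"]: describe(f["type"]) for f in schema["fields"]}
for name, dtype in columns.items():
    print(f"{name}: {dtype}")
```

The authoritative result is whatever `read_avro_schema` returns for a given file. The sketch only shows the shape of the nesting.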
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source | str \| Path \| Sequence[str] \| Sequence[Path] | Path to Avro file. If multiple files are provided, reads schema from the first file only. | required |
| storage_options | dict[str, str] \| None | Configuration for S3 connections. Supported keys: Values here take precedence over environment variables. | None |
Returns:
| Type | Description |
|---|---|
| Schema | The Polars schema corresponding to the Avro file's schema. |
Examples:
>>> import jetliner
>>>
>>> # Read schema from local file
>>> schema = jetliner.read_avro_schema("data.avro")
>>> print(schema)
>>>
>>> # Read schema from S3
>>> schema = jetliner.read_avro_schema(
... "s3://bucket/data.avro",
... storage_options={"region": "us-west-2"}
... )
>>> print(schema.names()) # Column names
>>> print(schema.dtypes()) # Column types
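
Because the schema is available without reading any data, one possible use is to check that expected columns exist before starting a full read. The sketch below assumes that workflow. The `missing_columns` helper and the column names are hypothetical, and a literal list stands in for `schema.names()` so that the block runs without jetliner installed.

```python
from typing import Iterable, List


def missing_columns(available: Iterable[str], required: Iterable[str]) -> List[str]:
    """Return the required column names that are absent, in the order requested."""
    present = set(available)
    return [name for name in required if name not in present]


# With jetliner installed, `available` could come from the call shown above:
#     schema = jetliner.read_avro_schema("data.avro")
#     available = schema.names()
# A literal list (hypothetical column names) stands in for it here.
available = ["id", "name", "created_at"]

missing = missing_columns(available, ["id", "email"])
if missing:
    print(f"Missing columns: {missing}")
```

An empty result means every required column is present and the full read can proceed.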