read_avro_schema

jetliner.read_avro_schema(source, *, storage_options=None)

Read the schema from an Avro file without reading data.

Returns the Polars schema that reading the file would produce. Avro types are mapped to Polars types: for example, Avro records become Structs, Avro arrays become Lists, and Avro enums become Categorical.

Parameters:

Name: source
Type: str | Path | Sequence[str] | Sequence[Path]
Default: required
Description: Path to the Avro file, or a sequence of paths. If multiple files are provided, the schema is read from the first file only.

Name: storage_options
Type: dict[str, str] | None
Default: None
Description: Configuration for S3 connections. Supported keys:

  • endpoint: Custom S3 endpoint (for MinIO, LocalStack, R2, etc.)
  • aws_access_key_id: AWS access key (overrides environment)
  • aws_secret_access_key: AWS secret key (overrides environment)
  • region: AWS region (overrides environment)

Values here take precedence over environment variables.
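The supported keys for storage_options can be assembled programmatically. The following is a minimal sketch, not part of jetliner: `build_storage_options` is a hypothetical helper that accepts only the four documented keys and drops unset values, so that anything omitted is left to the environment. The endpoint URL and region are placeholder values.

```python
from typing import Optional

# Keys documented for storage_options in this reference.
SUPPORTED_KEYS = ("endpoint", "aws_access_key_id", "aws_secret_access_key", "region")


def build_storage_options(**overrides: Optional[str]) -> Optional[dict[str, str]]:
    """Hypothetical helper (not part of jetliner): build a storage_options dict.

    Unknown keys raise ValueError; keys set to None are omitted.
    Returns None when nothing is overridden, matching the parameter's default.
    """
    unknown = sorted(set(overrides) - set(SUPPORTED_KEYS))
    if unknown:
        raise ValueError(f"unsupported storage_options keys: {unknown}")
    options = {key: value for key, value in overrides.items() if value is not None}
    return options or None


# Example: target a local S3-compatible endpoint (placeholder URL) and pin the
# region, leaving the access key unset.
options = build_storage_options(
    endpoint="http://localhost:9000",
    region="us-east-1",
    aws_access_key_id=None,
)
print(options)  # {'endpoint': 'http://localhost:9000', 'region': 'us-east-1'}
```

The resulting dictionary (or None) can then be passed as `storage_options=options` to `jetliner.read_avro_schema`.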

Returns:

Type: Schema
Description: The Polars schema corresponding to the Avro file's schema.

Examples:

>>> import jetliner
>>>
>>> # Read schema from local file
>>> schema = jetliner.read_avro_schema("data.avro")
>>> print(schema)
>>>
>>> # Read schema from S3
>>> schema = jetliner.read_avro_schema(
...     "s3://bucket/data.avro",
...     storage_options={"region": "us-west-2"}
... )
>>> print(schema.names())  # Column names
>>> print(schema.dtypes())  # Column types
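
Because only the first file's schema is read when several paths are given, it can be useful to confirm that the remaining files agree with it. The sketch below shows one way to do that, assuming that calling the reader once per path is acceptable. `find_schema_mismatches` is a hypothetical helper, not part of jetliner. The reader is passed in as a callable, so `jetliner.read_avro_schema` can be supplied in real use while a stand-in is used here.

```python
from typing import Any, Callable, Sequence


def find_schema_mismatches(
    paths: Sequence[str],
    read_schema: Callable[[str], Any],
) -> list[str]:
    """Hypothetical helper: return the paths whose schema differs from the first file's.

    read_schema is any callable that maps a path to a schema object
    supporting ==, for example jetliner.read_avro_schema.
    """
    if not paths:
        return []
    reference = read_schema(paths[0])
    return [path for path in paths[1:] if read_schema(path) != reference]


# Stand-in reader so the sketch runs without any Avro files; the file names
# and column types below are made up for illustration.
fake_schemas = {
    "a.avro": {"id": "Int64", "name": "String"},
    "b.avro": {"id": "Int64", "name": "String"},
    "c.avro": {"id": "Int64"},
}
mismatched = find_schema_mismatches(
    ["a.avro", "b.avro", "c.avro"],
    fake_schemas.__getitem__,
)
print(mismatched)  # ['c.avro']
```

In real use the call would look like `find_schema_mismatches(paths, jetliner.read_avro_schema)`. For S3 paths, `functools.partial` can bind `storage_options` before passing the reader in.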