Iceberg value_schema_latest mode #1068

kbatuigas · 2025-04-08T13:31:42Z

Description

PR to add to Cloud docs: redpanda-data/cloud-docs#260

This pull request introduces significant enhancements to Iceberg integration in Redpanda, including new documentation on supported Iceberg modes, updates to existing Iceberg-related pages, and improvements to the Schema Registry documentation. The changes aim to provide clearer guidance on configuring and using Iceberg modes, enhance usability, and ensure consistency across documentation.

Iceberg Integration Enhancements:

Added a new page, choose-iceberg-mode.adoc, detailing supported Iceberg modes (key_value, value_schema_id_prefix, value_schema_latest, and disabled), their configurations, and how they translate to table formats. This page provides examples and explains schema translation for Avro and Protobuf data.
Updated navigation in nav.adoc to include a link to the new "Choose Iceberg Mode" page.

Documentation Updates:

Revised the release notes in redpanda.adoc to list new features for Iceberg-enabled topics, such as custom partitioning, snapshot expiry, dead-letter queues, schema evolution, and structured Iceberg tables for Avro/Protobuf data without Schema Registry wire format.
Updated about-iceberg-topics.adoc to reflect changes in supported Iceberg modes and removed outdated details about custom partitioning. Added a cross-reference to the new "Choose Iceberg Mode" page.
Modified query-iceberg-topics.adoc to reference the new "Choose Iceberg Mode" page for clarity on consuming Iceberg topics.

Schema Registry Documentation:

Expanded schema-reg-overview.adoc with a new section on serialization and deserialization, explaining the Schema Registry wire format and its role in message processing.

Resolves https://redpandadata.atlassian.net/browse/
Review deadline: 10 April

Page previews

Choose an Iceberg Mode

Checks

New feature
Content gap
Support Follow-up
Small fix (typos, links, copyedits, etc)

Summary by CodeRabbit

New Features
- Added documentation for a new Iceberg integration mode, value_schema_latest, enabling Iceberg table creation from the latest schema in the Schema Registry without requiring the wire format.
- Expanded documentation on handling Avro and Protobuf data in Iceberg topics, including support for structured tables without Schema Registry wire format or SerDes.

Documentation
- Updated navigation and added a new page detailing all supported Iceberg integration modes and their configurations.
- Improved and clarified release notes and topic property documentation to reflect new Iceberg features and modes.
- Enhanced Schema Registry documentation with a new section explaining the wire format for serialization and deserialization.
- Streamlined and clarified Iceberg documentation, removing redundant environment-specific instructions and improving references and formatting throughout.

netlify · 2025-04-08T13:31:58Z

✅ Deploy Preview for redpanda-docs-preview ready!

Name	Link
🔨 Latest commit	`b0d933c`
🔍 Latest deploy log	https://app.netlify.com/sites/redpanda-docs-preview/deploys/6802abb0eac6bb0008ea4057
😎 Deploy Preview	https://deploy-preview-1068--redpanda-docs-preview.netlify.app

hyperlint-ai · 2025-04-08T13:32:21Z

PR Change Summary

Enhanced Iceberg integration documentation in Redpanda with a focus on the new value_subject_latest mode and schema integration usage.

Expanded the list of features supported by Iceberg-enabled topics, including custom partitioning and schema evolution.
Introduced detailed descriptions and examples for the new value_subject_latest mode.
Updated guidance on using value_schema_id_prefix and value_subject_latest modes with Spark SQL queries.
Clarified syntax and usage for the value_subject_latest mode, including optional key-value pairs.

Modified Files

modules/get-started/pages/release-notes/redpanda.adoc
modules/manage/partials/iceberg/about-iceberg-topics.adoc
modules/manage/partials/iceberg/query-iceberg-topics.adoc
modules/reference/pages/properties/topic-properties.adoc

kbatuigas · 2025-04-08T14:26:38Z

modules/manage/partials/iceberg/query-iceberg-topics.adoc

@@ -43,7 +43,7 @@ endif::[]
 {"user_id": 2324, "event_type": "BUTTON_CLICK", "ts": "2024-11-25T20:23:59.380Z"}
 ----

-=== Topic with schema (`value_schema_id_prefix` mode)
+=== Topic with schema (`value_schema_id_prefix` or `value_schema_latest` mode)


Do we need a separate section for value_schema_latest? See https://deploy-preview-1068--redpanda-docs-preview.netlify.app/current/manage/iceberg/query-iceberg-topics/#topic-with-schema-value_schema_id_prefix-or-value_schema_latest-mode

Yes, because rpk only produces using the Schema Registry wire format, and `value_schema_latest` is the mode for producing without the wire format.

Discussed with Tyler last week and agreed that a new section for value_schema_latest would be a nice-to-have for later, if we want to demonstrate producing to a topic without using rpk.

modules/get-started/pages/release-notes/redpanda.adoc

modules/manage/pages/iceberg/choose-iceberg-mode.adoc

rockwotj

LGTM, a couple of small suggestions.

rockwotj · 2025-04-15T20:58:27Z

modules/manage/pages/iceberg/choose-iceberg-mode.adoc

+
+=== value_schema_latest
+
+Creates an Iceberg table whose structure matches the latest schema registered for the subject in the Schema Registry. You must register a schema in the xref:manage:schema-reg/schema-reg-overview.adoc[Schema Registry]. Unlike the `value_schema_id_prefix` mode,  `value_schema_latest` does not require that producers use the wire format.


The latest schema is cached periodically. The cache period is defined by the cluster config `iceberg_latest_schema_cache_ttl_ms`, which defaults to 5 minutes.

We don't have this config in our docs yet - we'll have to re-run our config script and double check that it gets pulled in.
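To illustrate what the TTL means for readers of this thread, here is a minimal Python sketch of a time-based cache for the latest schema per subject. It is not Redpanda's implementation: the class `LatestSchemaCache`, the `fetch_latest` callback, and the injectable `clock` are hypothetical names for illustration. Only the property name and the 5-minute default come from the comment above.

```python
import time
from typing import Callable, Dict, Tuple

# Default taken from the review comment: iceberg_latest_schema_cache_ttl_ms = 5 minutes.
DEFAULT_TTL_MS = 5 * 60 * 1000


class LatestSchemaCache:
    """Hypothetical TTL cache: re-fetches a subject's latest schema once the TTL elapses."""

    def __init__(
        self,
        fetch_latest: Callable[[str], str],
        ttl_ms: int = DEFAULT_TTL_MS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_latest = fetch_latest  # assumed lookup against a schema registry
        self._ttl_s = ttl_ms / 1000
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, subject: str) -> str:
        now = self._clock()
        entry = self._entries.get(subject)
        # Serve the cached schema while it is younger than the TTL.
        if entry is not None and now - entry[0] < self._ttl_s:
            return entry[1]
        # Otherwise fetch again and remember when we did so.
        schema = self._fetch_latest(subject)
        self._entries[subject] = (now, schema)
        return schema
```

Under this model, a newly registered schema version may not be picked up until the cached entry expires, which is the practical consequence worth documenting alongside the property.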

rockwotj · 2025-04-15T20:59:50Z

modules/manage/pages/iceberg/choose-iceberg-mode.adoc

+[[override-value-schema-latest-default]]
+=== Override `value_schema_latest` default
+
+In `value_schema_latest` mode, only the string `value_schema_latest` is required in the property value. This sets `value_schema_latest` mode to its default behavior, which derives the subject for the topic using xref:manage:schema-reg/schema-id-validation.adoc#set-subject-name-strategy-per-topic[TopicNameStrategy]. For Protobuf data, the default behavior also deserializes records using the first message within the corresponding Protobuf schema in the Schema Registry.


Worthwhile to give an example of TopicNameStrategy: if your topic is named `foo`, the schema is looked up in `foo-value`.
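The naming rule in this suggestion can be sketched in a few lines of Python. The function name `topic_name_strategy_subject` is made up for illustration. The `-value` suffix comes from the comment above, and the `-key` suffix is the usual TopicNameStrategy counterpart for record keys, included here as an assumption.

```python
def topic_name_strategy_subject(topic: str, is_key: bool = False) -> str:
    """Derive the Schema Registry subject for a topic under TopicNameStrategy."""
    # Values use "<topic>-value"; keys conventionally use "<topic>-key".
    suffix = "key" if is_key else "value"
    return f"{topic}-{suffix}"


print(topic_name_strategy_subject("foo"))
```

For a topic named `foo`, this yields the subject `foo-value`, which is where the default `value_schema_latest` behavior would look up the schema.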

modules/manage/pages/iceberg/choose-iceberg-mode.adoc

rockwotj · 2025-04-15T21:04:20Z

modules/manage/partials/iceberg/query-iceberg-topics.adoc

@@ -76,8 +76,7 @@ rpk registry schema create ClickEvent-value --schema path/to/schema.avsc --type
 echo '"key1" {"user_id":2324,"event_type":"BUTTON_CLICK","ts":"2024-11-25T20:23:59.380Z"}' | rpk topic produce ClickEvent --format='%k %v\n' --schema-id=topic
 ----
 +
-The `value_schema_id_prefix` requires that you produce to a topic using the Schema Registry wire format, which includes the magic byte and schema ID in the prefix of the message payload. This allows Redpanda to identify the correct schema version in the Schema Registry for a record. See the https://www.redpanda.com/blog/schema-registry-kafka-streaming#how-does-serialization-work-with-schema-registry-in-kafka[Understanding Apache Kafka Schema Registry^] blog post to learn more.
-
+The `value_schema_id_prefix` mode requires that you produce to a topic using the Schema Registry wire format, which includes the magic byte and schema ID in the prefix of the message payload. This allows Redpanda to identify the correct schema version in the Schema Registry for a record. 


link to examples like in the modes doc?

Added link to new section on wire format

Co-authored-by: Joyce Fee <[email protected]>

kbatuigas · 2025-04-16T22:28:34Z

modules/manage/pages/schema-reg/schema-reg-overview.adoc

-Redpanda Schema Registry uses the default port 8081. 
+Redpanda Schema Registry uses the default port 8081.
+
+== Serialization and deserialization


@rockwotj @mattschumpert Does this subheading make sense or does it need to specifically mention the wire format?

Calling it the wire format makes sense, because you can serialize/deserialize without it by having another mechanism to map a topic record to a schema: static mapping of topic to latest schema in your producer/consumer, communicating the schema ID using some other out of band mechanism (message header, control messages, etc).

Generally, this is the "ecosystem standard" way of doing it.

modules/manage/pages/schema-reg/schema-reg-overview.adoc

rockwotj · 2025-04-17T05:27:25Z

modules/manage/pages/schema-reg/schema-reg-overview.adoc

+The wire format is a sequence of bytes consisting of the following:
+
+. The "magic byte," a single byte that always contains the value of 0.
+. A four-byte integer containing the schema ID.


Technically, for Protobuf there is additionally a series of varints encoding which message in the Protobuf schema was used. I don't feel strongly about whether we need to call that out, however.
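For reference, the two-part prefix in the quoted hunk can be sketched in Python as follows. This is an illustration only, under stated assumptions: the function names are hypothetical, the schema ID is assumed to be encoded big-endian (the usual convention for this wire format, not stated in the hunk), and the extra Protobuf message-index varints mentioned above are not handled.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0  # the hunk states the magic byte always has the value 0
PREFIX_LEN = 5  # 1 magic byte + 4-byte schema ID


def add_wire_format_prefix(schema_id: int, payload: bytes) -> bytes:
    """Prepend the magic byte and a 4-byte schema ID to a serialized payload."""
    # ">bI" = big-endian (assumed), one signed byte, one unsigned 4-byte integer.
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + payload


def split_wire_format(message: bytes) -> Tuple[int, bytes]:
    """Return (schema_id, payload) for a message that carries the prefix."""
    if len(message) < PREFIX_LEN:
        raise ValueError("message is shorter than the 5-byte prefix")
    magic, schema_id = struct.unpack(">bI", message[:PREFIX_LEN])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[PREFIX_LEN:]
```

A consumer of such a message would use the recovered schema ID to look up the schema before deserializing the remaining payload bytes.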

modules/manage/pages/schema-reg/schema-reg-overview.adoc

Feediver1 · 2025-04-18T16:35:31Z

modules/manage/pages/iceberg/choose-iceberg-mode.adoc

+
+Creates an Iceberg table whose structure matches the Redpanda schema for the topic, with columns corresponding to each field. You must register a schema in the xref:manage:schema-reg/schema-reg-overview.adoc[Schema Registry] and producers must write to the topic using the Schema Registry wire format.
+
+In the xref:manage:schema-reg/schema-reg-overview.adoc#serialization-and-deserialization[Schema Registry wire format], a "magic byte" and schema ID are embedded in the message payload header. Producers to the topic must use the wire format in the serialization process so Redpanda can determine the schema used for each record, use the schema to define the Iceberg table, and store the topic values in the corresponding table columns.


no def/link for "magic byte"?

Feediver1 · 2025-04-18T16:40:16Z

modules/manage/pages/iceberg/choose-iceberg-mode.adoc

+)
+----
+
+Use `key_value` mode if the topic data is in JSON or if you can use the Iceberg data in its semi-structured format.


I had to read this sentence 3-4 times, and am not 100% clear on its meaning.
Use key_value mode if the topic data is in JSON, or if you can, use the Iceberg data in its semi-structured format.
Use key_value mode if the topic data is in JSON, or the Iceberg data in its semi-structured format.

Feediver1 · 2025-04-18T16:43:37Z

modules/manage/pages/schema-reg/schema-reg-overview.adoc

+
+The wire format is a sequence of bytes consisting of the following:
+
+. The "magic byte," a single byte that always contains the value of 0.


oh good--you defined it here. thx

Feediver1

Nice job Kat.

coderabbitai · 2025-04-18T18:09:29Z

Walkthrough

This update introduces a new documentation page detailing the supported Iceberg integration modes in Redpanda, updates navigation and cross-references to include this new content, and refines existing documentation to clarify the configuration and schema translation for Iceberg-enabled topics. The release notes and topic property references are expanded to enumerate new features and modes, including support for a new value_schema_latest mode. The Schema Registry documentation is enhanced with a thorough explanation of the wire format. Several sections are streamlined for clarity, removing redundant environment-specific instructions and reorganizing content for better readability.

Changes

File(s)	Change Summary
modules/manage/pages/iceberg/choose-iceberg-mode.adoc	Added a new documentation page explaining Iceberg integration modes, configuration, schema translation, and table format mappings for Redpanda topics.
modules/ROOT/nav.adoc	Added a navigation entry under "Iceberg" linking to the new "Choose Iceberg Mode" documentation.
modules/reference/pages/properties/topic-properties.adoc	Updated `redpanda.iceberg.mode` topic property documentation: expanded mode descriptions, added new `value_schema_latest` mode, clarified wire format requirements, and added visual separators.
modules/get-started/pages/release-notes/redpanda.adoc	Rewrote and expanded the Iceberg improvements section in release notes to enumerate features and add Avro/Protobuf support details.
modules/manage/pages/iceberg/query-iceberg-topics.adoc	Updated cross-reference to point to the new Iceberg mode documentation; removed a phrase for clarity.
modules/manage/pages/schema-reg/schema-reg-overview.adoc	Added a new "Wire format" section explaining the Schema Registry message format, serialization/deserialization process, and schema cache interactions.
modules/manage/partials/iceberg/about-iceberg-topics.adoc	Consolidated environment-specific instructions, removed the partitioning section, added the new Iceberg mode, and replaced schema translation details with a reference to the new documentation.
modules/manage/partials/iceberg/query-iceberg-topics.adoc	Clarified Iceberg table creation, improved explanation of schema registry wire format, updated references, and improved formatting.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Redpanda
    participant SchemaRegistry

    User->>Redpanda: Create or alter topic with redpanda.iceberg.mode
    alt value_schema_id_prefix mode
        User->>SchemaRegistry: Register schema (if needed)
        User->>Redpanda: Produce message with Schema Registry wire format
        Redpanda->>Redpanda: Parse message using schema ID from header
        Redpanda->>Iceberg: Map fields to table columns
    else value_schema_latest mode
        User->>SchemaRegistry: Register schema (if needed)
        User->>Redpanda: Produce message (no wire format required)
        Redpanda->>SchemaRegistry: Fetch latest schema for subject
        Redpanda->>Iceberg: Map fields to table columns
    else key_value mode
        User->>Redpanda: Produce message
        Redpanda->>Iceberg: Store key and value as columns
    else disabled mode
        User->>Redpanda: Produce message
        Redpanda->>Iceberg: Iceberg integration disabled
    end
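
The branches of the diagram above can also be summarized as a small lookup table. The Python sketch below is only a reading aid, not product code: `ModeRequirements` and `requirements_for` are hypothetical names, and the flags restate what this PR's text says about each mode (whether a table is written, whether a registered schema is needed, and whether producers must use the wire format). It matches bare mode names only and does not parse any optional settings appended to `value_schema_latest`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeRequirements:
    writes_iceberg_table: bool
    needs_registered_schema: bool
    needs_wire_format: bool


# One entry per value of redpanda.iceberg.mode discussed in this PR.
MODES = {
    "disabled": ModeRequirements(False, False, False),
    "key_value": ModeRequirements(True, False, False),
    "value_schema_id_prefix": ModeRequirements(True, True, True),
    "value_schema_latest": ModeRequirements(True, True, False),
}


def requirements_for(mode: str) -> ModeRequirements:
    """Look up what a mode requires; reject anything that is not a bare mode name."""
    try:
        return MODES[mode]
    except KeyError:
        raise ValueError(f"unknown Iceberg mode: {mode!r}") from None
```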

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1136e9e and b0d933c.

📒 Files selected for processing (2)

modules/get-started/pages/release-notes/redpanda.adoc (1 hunks)
modules/manage/partials/iceberg/query-iceberg-topics.adoc (3 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

modules/get-started/pages/release-notes/redpanda.adoc
modules/manage/partials/iceberg/query-iceberg-topics.adoc

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: Redirect rules - redpanda-docs-preview
GitHub Check: Header rules - redpanda-docs-preview
GitHub Check: Pages changed - redpanda-docs-preview

coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (11)

modules/manage/partials/iceberg/query-iceberg-topics.adoc (1)
4-5: Simplify table naming description.

The sentence "Redpanda generates an Iceberg table that has the same name as the topic name." is wordy and repetitive. Consider refactoring to:
Redpanda generates an Iceberg table with the same name as the topic.
This improves readability.

modules/manage/partials/iceberg/about-iceberg-topics.adoc (2)

121-122: Link new modes to the detailed mode guide.

You’ve added value_schema_id_prefix and value_schema_latest modes here, but these entries lack cross‑references to the more detailed configuration and schema‑translation guidance on the new “Choose an Iceberg Mode” page. Consider xref‑linking each mode name to that page (e.g., xref:manage/iceberg/choose-iceberg-mode.adoc[value_schema_latest]).

139-139: Include link to Schema Registry doc.

The step to register a schema is clear, but you may want to xref the exact Schema Registry API or UI page (e.g., xref:manage:schema-reg/schema-reg-overview.adoc[Schema Registry wire format]) so users know where to go next.

modules/manage/pages/schema-reg/schema-reg-overview.adoc (4)

7-7: Consider relocating this paragraph.

The new sentence on message exchange sits just above the design overview. It might fit more naturally under the “Serialization format” section to maintain topical flow.

36-36: Clarify default port note.

You’ve added “Redpanda Schema Registry uses the default port 8081.” To highlight this, consider wrapping it in an AsciiDoc [NOTE] block for greater visibility.

50-56: Unify conditional serialization blocks.

The non‑cloud and cloud variants for the serializer description are identical except for minor naming differences. Consider merging them into one block using xref macros or a single conditional, to reduce duplication.

60-61: Use precise terminology for prefixing.

Instead of “pads the beginning of the message,” you may want to say “prepends the magic byte and schema ID to the message payload” to avoid ambiguity.

modules/manage/pages/iceberg/choose-iceberg-mode.adoc (4)

3-3: Trim page categories.

You’ve listed six categories—consider narrowing this to the most relevant (e.g., Iceberg and Integration) to avoid over‑categorization.

36-37: Clarify “message payload header.”

There’s no separate header wrapper—this is simply prefixed data. Consider rephrasing to “embedded at the start of the message payload” for accuracy.

42-43: Link the TTL configuration.

You mention iceberg_latest_schema_cache_ttl_ms—xref the cluster property reference (e.g., xref:reference/cluster-properties.adoc#iceberg_latest_schema_cache_ttl_ms) so users can find details on adjusting this TTL.

67-73: Merge override blocks for clarity.

The ifndef::env-cloud[] and ifdef::env-cloud[] sections are identical. Combining them—or moving the shared content outside the conditional—would simplify maintenance.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3f3bed1 and 4e3fca9.

⛔ Files ignored due to path filters (1)

modules/shared/images/schema-registry-wire-format.png is excluded by !**/*.png

📒 Files selected for processing (8)

modules/ROOT/nav.adoc (1 hunks)
modules/get-started/pages/release-notes/redpanda.adoc (1 hunks)
modules/manage/pages/iceberg/choose-iceberg-mode.adoc (1 hunks)
modules/manage/pages/iceberg/query-iceberg-topics.adoc (1 hunks)
modules/manage/pages/schema-reg/schema-reg-overview.adoc (2 hunks)
modules/manage/partials/iceberg/about-iceberg-topics.adoc (3 hunks)
modules/manage/partials/iceberg/query-iceberg-topics.adoc (2 hunks)
modules/reference/pages/properties/topic-properties.adoc (4 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: Redirect rules - redpanda-docs-preview
GitHub Check: Header rules - redpanda-docs-preview
GitHub Check: Pages changed - redpanda-docs-preview

🔇 Additional comments (10)

modules/ROOT/nav.adoc (1)

184-184: Added Iceberg mode navigation entry looks correct.

The new Choose Iceberg Mode page link is properly placed under the Iceberg section and follows the existing indentation and naming conventions.

modules/manage/pages/iceberg/query-iceberg-topics.adoc (1)

10-10: Updated cross-reference to the new Iceberg mode page is accurate.

The paragraph now points to the choose-iceberg-mode.adoc page for mode details, which improves discoverability. The removal of the redundant phrase enhances clarity.

modules/manage/partials/iceberg/query-iceberg-topics.adoc (1)

79-80:
✅ Verification successful

Verify anchor for the Schema Registry wire format link.

Please confirm that the target anchor #wire-format exists in modules/manage/pages/schema-reg/schema-reg-overview.adoc. If the actual anchor differs (e.g., #serialization-and-deserialization), update the xref accordingly.

🏁 Script executed:
#!/bin/bash
# Verify the anchor for wire format in the Schema Registry overview
grep -R "wire-format" modules/manage/pages/schema-reg/schema-reg-overview.adoc
Length of output: 159

Continuing verification to locate the actual section anchor:

🏁 Script executed:
#!/bin/bash
# List all explicit Asciidoc anchors
grep -R "\[\[" -n modules/manage/pages/schema-reg/schema-reg-overview.adoc

# List all section headings
grep -R "^=+" -n modules/manage/pages/schema-reg/schema-reg-overview.adoc

# Search for serialization-related headings
grep -R "[sS]erialization" -n modules/manage/pages/schema-reg/schema-reg-overview.adoc

# Search for any occurrence of “format”
grep -R "format" -n modules/manage/pages/schema-reg/schema-reg-overview.adoc
Length of output: 5430

Anchor confirmed, no change required

The `== Wire format` heading in modules/manage/pages/schema-reg/schema-reg-overview.adoc (line 38) auto-generates the `wire-format` anchor. The existing xref:manage:schema-reg/schema-reg-overview.adoc#wire-format[Schema Registry wire format] is therefore correct.

modules/manage/partials/iceberg/about-iceberg-topics.adoc (3)

22-22: Verify the manifest file format.

You’ve streamlined the manifest files section to state they are in JSON format. Please confirm that Redpanda’s Iceberg integration indeed emits JSON manifest files (the Iceberg spec defaults to Avro manifests).

136-137: Good addition: cross‑reference to the new guide.

Linking out to choose-iceberg-mode.adoc here helps users find the in‑depth mode explanations.

166-166: Validate the schema‑translation cross‑reference.

Ensure that the anchor #schema-types-translation exists in the target page (choose-iceberg-mode.adoc) so the link resolves correctly.

modules/manage/pages/schema-reg/schema-reg-overview.adoc (1)

38-47: Great addition of the wire-format section.

This new “Wire format” section clearly defines the magic byte and schema ID prefix. It fills the previous documentation gap regarding message framing.

modules/manage/pages/iceberg/choose-iceberg-mode.adoc (3)

1-5: File header looks good.

The title, description, and single-source tag are correctly set for this new page.

14-16: Clear introduction.

The link to the about-iceberg-topics page and the property description provides good context.

52-56: Verify CLI syntax.

Please confirm that the --topic-config= and --set flags shown here match the latest rpk version’s syntax, as these occasionally change between releases.

modules/get-started/pages/release-notes/redpanda.adoc

coderabbitai · 2025-04-18T18:14:53Z

modules/get-started/pages/release-notes/redpanda.adoc

+Iceberg-enabled topics now support the following:
+
+- xref:manage:iceberg/about-iceberg-topics.adoc#use-custom-partitioning[Custom partitioning] for improved query performance.
+- Snapshot expiry.
+- xref:manage:iceberg/about-iceberg-topics.adoc#manage-dead-letter-queue[Dead-letter queue] for invalid records.


🛠️ Refactor suggestion

Link snapshot expiry to its documentation anchor.

The bullet for Snapshot expiry is missing a cross-reference. Please add an xref to the corresponding section in about-iceberg-topics.adoc (e.g., #snapshot-expiry) for consistency with the other bullets.

modules/reference/pages/properties/topic-properties.adoc

coderabbitai · 2025-04-18T18:14:53Z

modules/manage/pages/iceberg/choose-iceberg-mode.adoc

+| sfixed64 | int
+| string | string


coderabbitai · 2025-04-18T18:14:53Z

modules/manage/pages/iceberg/choose-iceberg-mode.adoc

+| string | string
+| record | struct
+| array | list
+| maps | list


⚠️ Potential issue

Fix Avro map mapping.

The table lists Avro maps → Iceberg list, but Iceberg supports native map types. Please update to:

-| maps | list
+| maps | map

mattschumpert · 2025-04-18T18:25:20Z

modules/manage/pages/iceberg/query-iceberg-topics.adoc

@@ -7,7 +7,7 @@
 include::shared:partial$enterprise-license.adoc[]
 ====

-When you access Iceberg topics from a data lakehouse or other Iceberg-compatible tools, how you consume the data depends on the topic xref:manage:iceberg/about-iceberg-topics.adoc#enable-iceberg-integration[Iceberg mode] and whether you've registered a schema for the topic in the xref:manage:schema-reg/schema-reg-overview.adoc[Redpanda Schema Registry]. In either mode, you do not need to rely on complex ETL jobs or pipelines to access real-time data from Redpanda.
+When you access Iceberg topics from a data lakehouse or other Iceberg-compatible tools, how you consume the data depends on the topic xref:manage:iceberg/choose-iceberg-mode.adoc[Iceberg mode] and whether you've registered a schema for the topic in the xref:manage:schema-reg/schema-reg-overview.adoc[Redpanda Schema Registry]. You do not need to rely on complex ETL jobs or pipelines to access real-time data from Redpanda.


@kbatuigas this page has examples using the other mode but no mention of "value_schema_latest" mode at all. Even if we don't have an example it's probably worth a mention how querying works in this mode (essentially the same as with value_schema_id_prefix)

kbatuigas requested a review from a team as a code owner April 8, 2025 13:31

hyperlint-ai bot reviewed Apr 8, 2025

modules/reference/pages/properties/topic-properties.adoc (outdated, resolved)

kbatuigas commented Apr 8, 2025

modules/manage/partials/iceberg/about-iceberg-topics.adoc (outdated, resolved)

kbatuigas commented Apr 8, 2025

modules/manage/partials/iceberg/about-iceberg-topics.adoc (outdated, resolved)

kbatuigas commented Apr 8, 2025

modules/reference/pages/properties/topic-properties.adoc (outdated, resolved)

kbatuigas requested review from rockwotj and mattschumpert April 8, 2025 13:51

kbatuigas commented Apr 8, 2025

modules/reference/pages/properties/topic-properties.adoc (outdated, resolved)

kbatuigas commented Apr 8, 2025

modules/manage/partials/iceberg/query-iceberg-topics.adoc (outdated, resolved)

kbatuigas changed the title ~~Iceberg value_subject_latest mode~~ Iceberg value_schema_latest mode Apr 8, 2025

kbatuigas commented Apr 8, 2025

modules/reference/pages/properties/topic-properties.adoc (outdated, resolved)

kbatuigas commented Apr 8, 2025

modules/manage/partials/iceberg/about-iceberg-topics.adoc (outdated, resolved)

kbatuigas commented Apr 8, 2025

kbatuigas added 13 commits April 9, 2025 15:29

Add value_subject_latest mode

98ea0e3

Standardize Iceberg topic properties with the rest of the page

7efa089

Fix xref

edea27d

Clarify that key_value doesn't use a schema

b1cd821

Style edit per automated review

a6519d0

Name change

bb13c76

Refactor Iceberg mode content

bc50015

Cross-reference new page

bdebe7e

Refactor Iceberg mode and partitioning property

7a9d836

Add standalone doc to nav tree

b6914f5

Edits for clarity

a29f2c4

Rephrase explicit table creation

b5e26af

Merge branch 'main' into DOC-1140-Document-value_latest_schema-mode

1b50f17

Feediver1 reviewed Apr 14, 2025

modules/get-started/pages/release-notes/redpanda.adoc (outdated, resolved)

Feediver1 reviewed Apr 14, 2025

modules/manage/pages/iceberg/choose-iceberg-mode.adoc (outdated, resolved)

rockwotj reviewed Apr 15, 2025

kbatuigas and others added 4 commits April 16, 2025 18:01

Add wire format to Schema Reg docs and apply review suggestions

172eebf

Move schema types translation to new subsection in Modes doc

756eda7

Add more cross references

3e4e2ef

Update modules/manage/pages/iceberg/choose-iceberg-mode.adoc

9759e8d

Co-authored-by: Joyce Fee <[email protected]>

kbatuigas commented Apr 16, 2025

modules/manage/pages/schema-reg/schema-reg-overview.adoc (outdated, resolved)

kbatuigas requested a review from rockwotj April 16, 2025 22:32

kbatuigas mentioned this pull request Apr 16, 2025

Add new Iceberg modes doc in Cloud redpanda-data/cloud-docs#260

Merged

4 tasks

Example of TopicNameStrategy

e9c6743

kbatuigas commented Apr 16, 2025

modules/manage/pages/schema-reg/schema-reg-overview.adoc (resolved)

rockwotj approved these changes Apr 17, 2025

Additional edits per SME review

f8b6619

kbatuigas requested a review from Feediver1 April 18, 2025 15:16

Feediver1 reviewed Apr 18, 2025

Feediver1 approved these changes Apr 18, 2025

Apply suggestions from review

4e3fca9

coderabbitai bot reviewed Apr 18, 2025

mattschumpert approved these changes Apr 18, 2025

kbatuigas added 2 commits April 18, 2025 15:32

Add note about querying in value_schema_latest mode

1136e9e

Apply suggestions from automated review

b0d933c

kbatuigas merged commit e68ee61 into main Apr 18, 2025
8 checks passed

kbatuigas deleted the DOC-1140-Document-value_latest_schema-mode branch April 18, 2025 19:49

coderabbitai bot mentioned this pull request Jun 13, 2025

Partition evolution supported in Unity Catalog #1163

Merged

4 tasks

coderabbitai bot mentioned this pull request Jul 24, 2025

Iceberg updates for REST catalog and wire format compatible modes #1238

Merged

4 tasks

coderabbitai bot mentioned this pull request Aug 6, 2025

docs: fix property descritption #1289

Merged

4 tasks

coderabbitai bot mentioned this pull request Aug 22, 2025

Topic props #1337

Open

4 tasks


		=== value_schema_latest

		Creates an Iceberg table whose structure matches the latest schema registered for the subject in the Schema Registry. You must register a schema in the xref:manage:schema-reg/schema-reg-overview.adoc[Schema Registry]. Unlike the `value_schema_id_prefix` mode, `value_schema_latest` does not require that producers use the wire format.
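As a rough illustration of the "register a schema" prerequisite, the sketch below only builds (and does not send) a Schema Registry request that registers an Avro schema for a topic's value subject. It assumes the subject follows TopicNameStrategy (`<topic>-value`) and uses the commonly documented `POST /subjects/{subject}/versions` endpoint. The base URL, the `orders` topic, and the `Order` schema are hypothetical placeholders, not values from this PR.

```python
import json
import urllib.request


def build_register_request(base_url: str, topic: str, schema: dict) -> urllib.request.Request:
    """Build (but do not send) a request that registers a value schema for a topic."""
    # Assumption: the subject name follows TopicNameStrategy, i.e. "<topic>-value".
    subject = f"{topic}-value"
    # The registry expects the schema itself as a JSON-encoded string inside the body.
    body = json.dumps({"schema": json.dumps(schema), "schemaType": "AVRO"}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/subjects/{subject}/versions",
        data=body,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )


# Hypothetical schema and endpoint, for illustration only.
order_schema = {
    "type": "record",
    "name": "Order",
    "fields": [{"name": "id", "type": "long"}],
}
request = build_register_request("http://localhost:8081", "orders", order_schema)
```

Passing `request` to `urllib.request.urlopen` would submit it. Once a schema is registered, producers in `value_schema_latest` mode would not need to frame records in the wire format.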


		Creates an Iceberg table whose structure matches the Redpanda schema for the topic, with columns corresponding to each field. You must register a schema in the xref:manage:schema-reg/schema-reg-overview.adoc[Schema Registry] and producers must write to the topic using the Schema Registry wire format.

		In the xref:manage:schema-reg/schema-reg-overview.adoc#serialization-and-deserialization[Schema Registry wire format], a "magic byte" and schema ID are embedded in the message payload header. Producers to the topic must use the wire format in the serialization process so Redpanda can determine the schema used for each record, use the schema to define the Iceberg table, and store the topic values in the corresponding table columns.


		The wire format is a sequence of bytes consisting of the following:

		. The "magic byte," a single byte that always contains the value of 0.
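The header described above can be sketched in a few lines of Python. This is a minimal illustration, not Redpanda's implementation. It assumes the schema ID is a 4-byte big-endian integer immediately following the magic byte, as in the commonly documented Schema Registry wire format. The function names are hypothetical.

```python
import struct

MAGIC_BYTE = 0   # per the text above: a single byte that always contains 0
HEADER_SIZE = 5  # assumption: 1 magic byte + 4-byte big-endian schema ID


def encode_wire_format(schema_id: int, payload: bytes) -> bytes:
    """Prefix a serialized record with the magic byte and schema ID."""
    return struct.pack(">BI", MAGIC_BYTE, schema_id) + payload


def decode_wire_format(message: bytes) -> tuple[int, bytes]:
    """Split a framed message into (schema_id, serialized_record)."""
    if len(message) < HEADER_SIZE:
        raise ValueError("message too short to carry a wire-format header")
    magic, schema_id = struct.unpack(">BI", message[:HEADER_SIZE])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[HEADER_SIZE:]


framed = encode_wire_format(42, b"data")
schema_id, record = decode_wire_format(framed)
```

A consumer of the header would use the decoded schema ID to look up the schema for each record, which is the step `value_schema_id_prefix` mode depends on.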
