sam: Clarify the usage of UTF-8 characters in header #719

@zaeleus

This is in regard to Sequence Alignment/Map Format Specification (2022-08-22).

§ 1.3 "The header section" defines patterns for header lines:

Thus header lines match /^@(HD|SQ|RG|PG)(\t[A-Za-z][A-Za-z0-9]:[ -~]+)+$/ or /^@CO\t.*/.

This invalidates the following test examples:

The text "UTF-8 encoding may be used" for the CL and DS fields does not remove the character set constraint imposed by this pattern. It is also unclear why only some fields are given this allowance.
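To illustrate the conflict, here is a minimal Python sketch that applies the two patterns quoted above. The helper name `matches_spec_pattern` and the sample header lines are hypothetical, made up for this illustration, and are not taken from the test suite.

```python
import re

# Patterns as quoted from § 1.3 of the specification.
HEADER_RE = re.compile(r"^@(HD|SQ|RG|PG)(\t[A-Za-z][A-Za-z0-9]:[ -~]+)+$")
COMMENT_RE = re.compile(r"^@CO\t.*")


def matches_spec_pattern(line: str) -> bool:
    """Return True if the line matches either quoted header pattern."""
    return bool(HEADER_RE.match(line) or COMMENT_RE.match(line))


# Hypothetical header lines, for illustration only.
ascii_line = "@PG\tID:aligner\tCL:aligner --in reads.fq"
utf8_line = "@RG\tID:rg0\tDS:caf\u00e9 sample"  # DS value contains U+00E9
comment_line = "@CO\tcaf\u00e9"

print(matches_spec_pattern(ascii_line))    # printable ASCII only
print(matches_spec_pattern(utf8_line))     # non-ASCII character in DS
print(matches_spec_pattern(comment_line))  # @CO accepts any characters
```

Under these assumptions, the DS value with a non-ASCII character falls outside `[ -~]` and is rejected, while the same character in an `@CO` line is accepted.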
