Ticket #8081 (closed defect: wontfix)

Opened 2 years ago

Last modified 2 years ago

UUID identifiers from calibre don't follow URN standards

Reported by: StrahinjaMarkovic Owned by: kovidgoyal
Priority: default Milestone:
Component: Default Version: trunk
Keywords: Cc: kevin@…

Description

Calibre creates UUID identifiers without the "urn:uuid:" prefix that the spec requires. This is a problem when those same identifiers are used as keys in Adobe's font obfuscation method.

The method expects the identifier to be specified in URN syntax:  http://www.adobe.com/content/dam/Adobe/en/devnet/digitalpublishing/pdfs/content_protection.pdf

RFC 2141 specifies URN syntax:  http://www.ietf.org/rfc/rfc2141.txt

Here's a highlight:  http://www.itu.int/ITU-T/asn1/uuid.html

At the bottom of that page: A URN (see IETF RFC 2141) formed using a UUID shall be the string "urn:uuid:" followed by the hexadecimal representation of a UUID. Example: The following is an example of the string representation of a UUID as a URN: urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6

The key itself is still the hex string without the prefix, but the prefix needs to be present in the OPF identifier.

Bottom line: Calibre creates, for instance, this: <dc:identifier id="uuid_id" opf:scheme="uuid">7b5969f9-295c-4840-a560-68e6dc0c7f31</dc:identifier>

instead of this: <dc:identifier id="uuid_id" opf:scheme="uuid">urn:uuid:7b5969f9-295c-4840-a560-68e6dc0c7f31</dc:identifier>
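As a rough illustration of the difference between the two forms, here is a minimal Python sketch using only the standard library's uuid module. The helper name to_urn is made up for this example and is not taken from calibre or from the attached patch.

```python
import uuid

def to_urn(identifier: str) -> str:
    """Return the "urn:uuid:" form of a UUID string (hypothetical helper)."""
    # uuid.UUID() accepts both the bare string and the "urn:uuid:"-prefixed one,
    # so the same call normalizes either input.
    return uuid.UUID(identifier.strip()).urn

bare = "7b5969f9-295c-4840-a560-68e6dc0c7f31"
print(to_urn(bare))  # urn:uuid:7b5969f9-295c-4840-a560-68e6dc0c7f31
```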

Attachments

8081.patch (2.7 KB) - added by lacqui 2 years ago.
Patch to fix

Change History

Changed 2 years ago by lacqui

Patch to fix

comment:1 Changed 2 years ago by lacqui

  • Cc kevin@… added

comment:2 Changed 2 years ago by kovidgoyal

I'm not actually convinced of the validity of this bug. The identifier is not a URN, it's just a UUID. As per the OPF spec  http://www.idpf.org/2007/opf/OPF_2.0_final_spec.html#Section2.2.10 there is no requirement that it be a URN. Indeed, given that the scheme specifically says it is a uuid and not a urn, I would say putting a URN into the field would be wrong.

Can you elaborate on the actual problem with the Adobe font obfuscation? As far as I recall, epub files with obfuscated fonts generated by calibre work with ADE.

comment:3 Changed 2 years ago by StrahinjaMarkovic

The identifier is not a URN, it's just a UUID.

But the Adobe font obfuscation method expects a URN, not a UUID. From the Adobe PDF I linked: "Mangling key is a big-endian binary form (16 bytes) of the first UUID URN-based unique identifier [3] in the publication’s OPF file." It says URN. It also links to RFC 4122, “A Universally Unique Identifier (UUID) URN Namespace”, in the References section. It's right there, in black and white.

The OPF couldn't care less what you put in the <identifier> elements, but the Adobe font obfuscation method is quite clear on it. I don't see how "the first UUID URN-based unique identifier" could be interpreted as anything other than requiring a URN UUID.
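To make the quoted sentence concrete, the key derivation alone might look like the sketch below. It covers only the "big-endian binary form (16 bytes)" step; how the key is then applied to the font data is not shown, and the function name mangling_key is an assumption made for this example.

```python
import uuid

def mangling_key(identifier: str) -> bytes:
    """Return the 16-byte big-endian form of a UUID identifier (hypothetical helper)."""
    # uuid.UUID() tolerates an optional "urn:uuid:" prefix; .bytes is the
    # 128-bit value as 16 bytes in big-endian order.
    return uuid.UUID(identifier.strip()).bytes

key = mangling_key("urn:uuid:7b5969f9-295c-4840-a560-68e6dc0c7f31")
print(len(key), key.hex())
```

The same bytes come out whether or not the prefix is present, which matches the point that the prefix is not part of the key.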

comment:4 Changed 2 years ago by kovidgoyal

Yes, but does it actually need the urn prefix? As I said before, IIRC calibre-generated epubs with obfuscated fonts work fine with ADE.

comment:5 Changed 2 years ago by StrahinjaMarkovic

If by "need" you mean "does it use the prefix as part of the key", then no, it does not use the prefix. The prefix is just a prefix; it is there to denote that what comes after it is a valid UUID. That UUID is then used as the key.

The prefix is important because it identifies the UUID as such; otherwise any random junk that happens to match the format of a UUID could be interpreted as one. Also, it removes the need to pattern match strings against the UUID spec; if it says it's a URN UUID, then it is one.

It also leaves room for future expansion/modification of the spec where future UUIDs of a possibly different format would use a different prefix.

comment:6 Changed 2 years ago by kovidgoyal

  • Status changed from new to closed
  • Resolution set to wontfix

Wontfix, as IMO putting a urn into an identifier whose scheme is uuid is just wrong. And I am not going to change that to comply with guidelines from a company whose own software does not follow those guidelines.

Making this change would mean having to pattern match scheme="uuid" identifiers to check if they are actually URNs.
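For illustration, the kind of pattern match being referred to could be sketched as follows. This is only an assumption about what such a check might look like, not calibre code; UUID_RE and classify are names invented for the example.

```python
import re

# Optional "urn:uuid:" prefix followed by the canonical 8-4-4-4-12 hex layout.
UUID_RE = re.compile(
    r"^(?P<prefix>urn:uuid:)?"
    r"(?P<uuid>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})$",
    re.IGNORECASE,
)

def classify(text: str) -> str:
    """Classify an identifier value as "urn", "bare", or "not-a-uuid"."""
    match = UUID_RE.match(text.strip())
    if match is None:
        return "not-a-uuid"
    return "urn" if match.group("prefix") else "bare"

print(classify("7b5969f9-295c-4840-a560-68e6dc0c7f31"))
print(classify("urn:uuid:7b5969f9-295c-4840-a560-68e6dc0c7f31"))
```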

comment:7 Changed 2 years ago by StrahinjaMarkovic

IMO putting a urn into an identifier whose scheme is uuid is just wrong.

Opinions are one thing, published RFCs are another.

And I am not going to change that to comply with guidelines from a company whose own software does not follow those guidelines.

I don't know about previous versions of InDesign, but InDesign CS5 creates identifiers with the urn:uuid prefix.

In the end, I really don't care whether you fix this or not. I got a bug report about it on the Sigil tracker since Sigil was doing what the specs say and expecting a URN UUID (which doesn't exist in Calibre-generated epubs), but I've added it to my list of internal Calibre workarounds so to me it doesn't matter anymore.

I reported it because it was the right thing to do. You are of course free to do with this information as you wish. My conscience is clear.
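A reader-side workaround of the kind mentioned above might be sketched like this: look for the first "urn:uuid:" identifier in the OPF, and only if none exists fall back to a bare UUID in an identifier with opf:scheme="uuid". This is a hypothetical illustration in Python, not Sigil's actual code; first_uuid and SAMPLE are names made up for the example.

```python
import uuid
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"
OPF = "{http://www.idpf.org/2007/opf}"

SAMPLE = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="uuid_id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:identifier id="uuid_id" opf:scheme="uuid">7b5969f9-295c-4840-a560-68e6dc0c7f31</dc:identifier>
  </metadata>
</package>"""

def first_uuid(opf_xml: str):
    """Return the first usable UUID in an OPF document, or None."""
    identifiers = list(ET.fromstring(opf_xml).iter(DC + "identifier"))
    # Pass 1: strict reading, only values that carry the "urn:uuid:" prefix.
    for element in identifiers:
        text = (element.text or "").strip()
        if text.lower().startswith("urn:uuid:"):
            try:
                return uuid.UUID(text)
            except ValueError:
                continue
    # Pass 2: fallback for files that store a bare UUID under scheme="uuid".
    for element in identifiers:
        if (element.get(OPF + "scheme") or "").lower() == "uuid":
            try:
                return uuid.UUID((element.text or "").strip())
            except ValueError:
                continue
    return None

print(first_uuid(SAMPLE))
```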
