rfc9627v1.txt | rfc9627.txt | |||
---|---|---|---|---|
Internet Engineering Task Force (IETF) J. Lennox | Internet Engineering Task Force (IETF) J. Lennox | |||
Request for Comments: 9627 D. Hong | Request for Comments: 9627 8x8 / Jitsi | |||
Category: Standards Track Vidyo | Category: Standards Track D. Hong | |||
ISSN: 2070-1721 J. Uberti | ISSN: 2070-1721 Google | |||
J. Uberti | ||||
OpenAI | ||||
S. Holmer | S. Holmer | |||
M. Flodman | M. Flodman | |||
August 2024 | February 2025 | |||
The Layer Refresh Request (LRR) RTCP Feedback Message | The Layer Refresh Request (LRR) RTCP Feedback Message | |||
Abstract | Abstract | |||
This memo describes the RTCP Payload-Specific Feedback Message Layer | This memo describes the RTCP Payload-Specific Feedback Message Layer | |||
Refresh Request (LRR), which can be used to request a state refresh | Refresh Request (LRR), which can be used to request a state refresh | |||
of one or more substreams of a layered media stream. It also defines | of one or more substreams of a layered media stream. This document | |||
its use with several RTP payloads for scalable media formats. | also defines its use with several RTP payloads for scalable media | |||
formats. | ||||
Status of This Memo | Status of This Memo | |||
This is an Internet Standards Track document. | This is an Internet Standards Track document. | |||
This document is a product of the Internet Engineering Task Force | This document is a product of the Internet Engineering Task Force | |||
(IETF). It represents the consensus of the IETF community. It has | (IETF). It represents the consensus of the IETF community. It has | |||
received public review and has been approved for publication by the | received public review and has been approved for publication by the | |||
Internet Engineering Steering Group (IESG). Further information on | Internet Engineering Steering Group (IESG). Further information on | |||
Internet Standards is available in Section 2 of RFC 7841. | Internet Standards is available in Section 2 of RFC 7841. | |||
Information about the current status of this document, any errata, | Information about the current status of this document, any errata, | |||
and how to provide feedback on it may be obtained at | and how to provide feedback on it may be obtained at | |||
https://www.rfc-editor.org/info/rfc9627. | https://www.rfc-editor.org/info/rfc9627. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2024 IETF Trust and the persons identified as the | Copyright (c) 2025 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Revised BSD License text as described in Section 4.e of the | include Revised BSD License text as described in Section 4.e of the | |||
Trust Legal Provisions and are provided without warranty as described | Trust Legal Provisions and are provided without warranty as described | |||
skipping to change at line 58 | skipping to change at line 61 | |||
Table of Contents | Table of Contents | |||
1. Introduction | 1. Introduction | |||
2. Conventions and Terminology | 2. Conventions and Terminology | |||
2.1. Terminology | 2.1. Terminology | |||
3. Layer Refresh Request | 3. Layer Refresh Request | |||
3.1. Message Format | 3.1. Message Format | |||
3.2. Semantics | 3.2. Semantics | |||
4. Usage with Specific Codecs | 4. Usage with Specific Codecs | |||
4.1. H264 SVC | 4.1. H.264 SVC | |||
4.2. VP8 | 4.2. VP8 | |||
4.3. H265 | 4.3. H.265 | |||
5. Usage with Different Scalability Transmission Mechanisms | 5. Usage with Different Scalability Transmission Mechanisms | |||
6. SDP Definitions | 6. SDP Definitions | |||
7. Security Considerations | 7. Security Considerations | |||
8. IANA Considerations | 8. IANA Considerations | |||
9. References | 9. References | |||
9.1. Normative References | 9.1. Normative References | |||
9.2. Informative References | 9.2. Informative References | |||
Authors' Addresses | Authors' Addresses | |||
1. Introduction | 1. Introduction | |||
This memo describes an RTCP [RFC3550] Payload-Specific Feedback | This memo describes an RTCP [RFC3550] Payload-Specific Feedback | |||
Message [RFC4585] Layer Refresh Request (LRR). It is designed to | Message [RFC4585] "Layer Refresh Request" (LRR), which is designed to | |||
allow a receiver of a layered media stream to request that one or | allow a receiver of a layered media stream to request that one or | |||
more of its substreams be refreshed such that it can then be decoded | more of its substreams be refreshed. The stream can then be decoded | |||
by an endpoint that previously was not receiving those layers, | by an endpoint that previously was not receiving those layers, | |||
without requiring that the entire stream be refreshed (as it would be | without requiring that the entire stream be refreshed (as it would be | |||
if the receiver sent a Full Intra Request (FIR) [RFC5104]; see also | if the receiver sent a Full Intra Request (FIR) [RFC5104]; see also | |||
[RFC8082]). | [RFC8082]). | |||
The feedback message is applicable to both temporally and spatially | The feedback message is applicable to both temporally and spatially | |||
scaled streams and to both single-stream and multi-stream scalability | scaled streams and to both single-stream and multi-stream scalability | |||
modes. | modes. | |||
2. Conventions and Terminology | 2. Conventions and Terminology | |||
skipping to change at line 103 | skipping to change at line 106 | |||
2.1. Terminology | 2.1. Terminology | |||
A "layer refresh point" is a point in a scalable stream after which a | A "layer refresh point" is a point in a scalable stream after which a | |||
decoder, which previously had been able to decode only some (possibly | decoder, which previously had been able to decode only some (possibly | |||
none) of the available layers of stream, is able to decode a greater | none) of the available layers of stream, is able to decode a greater | |||
number of the layers. | number of the layers. | |||
For spatial (or quality) layers, in normal encoding, a subpicture can | For spatial (or quality) layers, in normal encoding, a subpicture can | |||
depend both on earlier pictures of that spatial layer and also on | depend both on earlier pictures of that spatial layer and also on | |||
lower-layer pictures of the current picture. However, a layer | lower-layer pictures of the current picture. However, a layer | |||
refresh typically requires that a spatial layer picture be encoded in | refresh typically requires that a spatial-layer picture be encoded in | |||
a way that references only the lower-layer subpictures of the current | a way that references only the lower-layer subpictures of the current | |||
picture, not any earlier pictures of that spatial layer. | picture, not any earlier pictures of that spatial layer. | |||
Additionally, the encoder must promise that no earlier pictures of | Additionally, the encoder must promise that no earlier pictures of | |||
that spatial layer will be used as reference in the future. | that spatial layer will be used as reference in the future. | |||
However, even in a layer refresh, layers other than the ones being | However, even in a layer refresh, layers other than the ones being | |||
refreshed may still maintain dependency on earlier content of the | refreshed may still maintain dependency on earlier content of the | |||
stream. This is the difference between a layer refresh and a FIR | stream. This is the difference between a layer refresh and an FIR | |||
[RFC5104]. This minimizes the coding overhead of refresh to only | [RFC5104]. This minimizes the coding overhead of refresh to only | |||
those parts of the stream that actually need to be refreshed at any | those parts of the stream that actually need to be refreshed at any | |||
given time. | given time. | |||
The spatial layer refresh of an enhancement layer is shown below. | The spatial-layer refresh of an enhancement layer is shown below. | |||
The "<--" indicates a coding dependency. | The "<--" indicates a coding dependency. | |||
... <-- S1 <-- S1 S1 <-- S1 <-- ... | ... <-- S1 <-- S1 S1 <-- S1 <-- ... | |||
| | | | | | | | | | |||
\/ \/ \/ \/ | \/ \/ \/ \/ | |||
... <-- S0 <-- S0 <-- S0 <-- S0 <-- ... | ... <-- S0 <-- S0 <-- S0 <-- S0 <-- ... | |||
1 2 3 4 | 1 2 3 4 | |||
Figure 1 | Figure 1: Refresh of a Spatial Enhancement Layer | |||
In Figure 1, frame 3 is a layer refresh point for spatial layer S1; a | In Figure 1, frame 3 is a layer refresh point for spatial layer S1; a | |||
decoder that had previously only been decoding spatial layer S0 would | decoder that had previously only been decoding spatial layer S0 would | |||
be able to decode layer S1 starting at frame 3. | be able to decode layer S1 starting at frame 3. | |||
The spatial layer refresh of a base layer is shown below. The "<--" | The spatial-layer refresh of a base layer is shown below. The "<--" | |||
indicates a coding dependency. | indicates a coding dependency. | |||
... <-- S1 <-- S1 <-- S1 <-- S1 <-- ... | ... <-- S1 <-- S1 <-- S1 <-- S1 <-- ... | |||
| | | | | | | | | | |||
\/ \/ \/ \/ | \/ \/ \/ \/ | |||
... <-- S0 <-- S0 S0 <-- S0 <-- ... | ... <-- S0 <-- S0 S0 <-- S0 <-- ... | |||
1 2 3 4 | 1 2 3 4 | |||
Figure 2 | Figure 2: Refresh of a Spatial Base Layer | |||
In Figure 2, frame 3 is a layer refresh point for spatial layer S0; a | In Figure 2, frame 3 is a layer refresh point for spatial layer S0; a | |||
decoder that had previously not been decoding the stream at all could | decoder that had previously not been decoding the stream at all could | |||
decode layer S0 starting at frame 3. | decode layer S0 starting at frame 3. | |||
For temporal layers, while normal encoding allows frames to depend on | For temporal layers, while normal encoding allows frames to depend on | |||
earlier frames of the same temporal layer, layer refresh requires | earlier frames of the same temporal layer, layer refresh requires | |||
that the layer be "temporally nested", i.e., use as reference only | that the layer be "temporally nested", i.e., use as reference only | |||
earlier frames of a lower temporal layer, not any earlier frames of | earlier frames of a lower temporal layer, not any earlier frames of | |||
this temporal layer and promise that no future frames of this | this temporal layer and promise that no future frames of this | |||
skipping to change at line 168 | skipping to change at line 171 | |||
The temporal layer refresh is shown below. The "<--" indicates a | The temporal layer refresh is shown below. The "<--" indicates a | |||
coding dependency. | coding dependency. | |||
... <----- T1 <------ T1 T1 <------ ... | ... <----- T1 <------ T1 T1 <------ ... | |||
/ / / | / / / | |||
|_ |_ |_ | |_ |_ |_ | |||
... <-- T0 <------ T0 <------ T0 <------ T0 <--- ... | ... <-- T0 <------ T0 <------ T0 <------ T0 <--- ... | |||
1 2 3 4 5 6 7 | 1 2 3 4 5 6 7 | |||
Figure 3 | Figure 3: Refresh of a Temporal Layer | |||
In Figure 3, frame 6 is a layer refresh point for temporal layer T1; | In Figure 3, frame 6 is a layer refresh point for temporal layer T1; | |||
a decoder that had previously only been decoding temporal layer T0 | a decoder that had previously only been decoding temporal layer T0 | |||
would be able to decode layer T1 starting at frame 6. | would be able to decode layer T1 starting at frame 6. | |||
An inherently temporally nested stream is shown below. The "<--" | An inherently temporally nested stream is shown below. The "<--" | |||
indicates a coding dependency. | indicates a coding dependency. | |||
T1 T1 T1 | T1 T1 T1 | |||
/ / / | / / / | |||
|_ |_ |_ | |_ |_ |_ | |||
... <-- T0 <------ T0 <------ T0 <------ T0 <--- ... | ... <-- T0 <------ T0 <------ T0 <------ T0 <--- ... | |||
1 2 3 4 5 6 7 | 1 2 3 4 5 6 7 | |||
Figure 4 | Figure 4: An Inherently Temporally Nested Stream | |||
In Figure 4, the stream is temporally nested in its ordinary | In Figure 4, the stream is temporally nested in its ordinary | |||
structure; a decoder receiving layer T0 can begin decoding layer T1 | structure; a decoder receiving layer T0 can begin decoding layer T1 | |||
at any point. | at any point. | |||
A "layer index" is a numeric label for a specific spatial and | A "layer index" is a numeric label for a specific spatial and | |||
temporal layer of a scalable stream. It consists of both a "temporal | temporal layer of a scalable stream. It consists of both a | |||
ID" identifying the temporal layer and a "layer ID" identifying the | "temporal-layer ID" identifying the temporal layer and a "layer ID" | |||
spatial or quality layer. The details of how layers of a scalable | identifying the spatial or quality layer. The details of how layers | |||
stream are labeled are codec specific. Details for several codecs | of a scalable stream are labeled are codec specific. Details for | |||
are defined in Section 4. | several codecs are defined in Section 4. | |||
3. Layer Refresh Request | 3. Layer Refresh Request | |||
A layer refresh frame can be requested by sending a Layer Refresh | A layer refresh frame can be requested by sending a Layer Refresh | |||
Request (LRR), which is an RTCP [RFC3550] payload-specific feedback | Request (LRR), which is an RTCP [RFC3550] payload-specific feedback | |||
message [RFC4585] asking the encoder to encode a frame that makes it | message [RFC4585] asking the encoder to encode a frame that makes it | |||
possible to upgrade to a higher layer. The LRR contains one or two | possible to upgrade to a higher layer. The LRR contains one or two | |||
tuples, indicating the temporal and spatial layer the decoder wants | tuples, indicating the temporal and spatial layer the decoder wants | |||
to upgrade to and (optionally) the currently highest temporal and | to upgrade to and (optionally) the currently highest temporal and | |||
spatial layer the decoder can decode. | spatial layer the decoder can decode. | |||
The specific format of the tuples, and the mechanism by which a | The specific format of the tuples, and the mechanism by which a | |||
receiver recognizes a refresh frame, is codec dependent. Usage for | receiver recognizes a refresh frame, is codec dependent. Usage for | |||
several codecs is discussed in Section 4. | several codecs is discussed in Section 4. | |||
An LRR follows the FIR model (Section 3.5.1 of [RFC5104]) for its | The design of LRR follows the FIR model (Section 3.5.1 of [RFC5104]) | |||
retransmission, reliability, and use in multipoint conferences. | for its retransmission, reliability, and use in multipoint | |||
conferences. | ||||
The LRR message is identified by RTCP packet type value PT=PSFB and | The LRR message is identified by RTCP packet type value PT=PSFB and | |||
FMT=10. The Feedback Control Information (FCI) field MUST contain | FMT=10. The Feedback Control Information (FCI) field MUST contain | |||
one or more LRR entries. Each entry applies to a different media | one or more LRR entries. Each entry applies to a different media | |||
sender, identified by its Synchronization Source (SSRC). | sender, identified by its Synchronization Source (SSRC). | |||
3.1. Message Format | 3.1. Message Format | |||
The FCI for the Layer Refresh Request consists of one or more FCI | The FCI for the Layer Refresh Request consists of one or more FCI | |||
entries, the content of which is depicted in Figure 5. The length of | entries, the content of which is depicted in Figure 5. The length of | |||
skipping to change at line 236 | skipping to change at line 240 | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| SSRC | | | SSRC | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Seq nr. |C| Payload Type| Reserved | | | Seq nr. |C| Payload Type| Reserved | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| RES | TTID| TLID | RES | CTID| CLID | | | RES | TTID| TLID | RES | CTID| CLID | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
Figure 5 | Figure 5: Layer Refresh Request FCI Format | |||
Synchronization Source (SSRC) (32 bits): | Synchronization Source (SSRC) (32 bits): | |||
The SSRC value of the media sender that is requested to send a | The SSRC value of the media sender that is requested to send a | |||
layer refresh point. | layer refresh point. | |||
Seq nr. (8 bits): | Seq nr. (8 bits): | |||
The command sequence number. The sequence number space is unique | The command sequence number. The sequence number space is unique | |||
for each pairing of the SSRC of command source and the SSRC of the | for each pairing of the SSRC of command source and the SSRC of the | |||
command target. The sequence number SHALL be increased by 1 for | command target. The sequence number SHALL be increased by 1 for | |||
each new command (modulo 256, so the value after 255 is 0). A | each new command (modulo 256, so the value after 255 is 0). A | |||
repetition SHALL NOT increase the sequence number. The initial | repetition SHALL NOT increase the sequence number. The initial | |||
value is arbitrary. | value is arbitrary. | |||
C (1 bit): | C (1 bit): | |||
A flag bit indicating whether the Current Temporal Layer ID (CTID) | A flag bit indicating whether the Current Temporal-layer ID (CTID) | |||
and Current Layer ID (CLID) fields are present in the FCI. If | and Current Layer ID (CLID) fields are present in the FCI. If | |||
this bit is 0, the sender of the LRR message is requesting refresh | this bit is 0, the sender of the LRR message is requesting refresh | |||
of all layers up to and including the target layer. | of all layers up to and including the target layer. | |||
Payload Type (7 bits): | Payload Type (7 bits): | |||
The RTP payload type for which the LRR is being requested. This | The RTP payload type for which the LRR is being requested. This | |||
gives the context in which the target layer index is to be | gives the context in which the target layer index is to be | |||
interpreted. | interpreted. | |||
Reserved (RES) (three separate fields of 16 bits / 5 bits / 5 | Reserved (RES) (three separate fields of 16 bits / 5 bits / 5 | |||
bits): | bits): | |||
All bits SHALL be set to 0 by the sender and SHALL be ignored on | All bits SHALL be set to zero by the sender and SHALL be ignored | |||
reception. | on reception. | |||
Target Temporal Layer ID (TTID) (3 bits): | Target Temporal-layer ID (TTID) (3 bits): | |||
The temporal ID of the target layer for which the receiver wishes | The temporal-layer ID of the target layer for which the receiver | |||
a refresh point. | wishes a refresh point. | |||
Target Layer ID (TLID) (8 bits): | Target Layer ID (TLID) (8 bits): | |||
The layer ID of the target spatial or quality layer for which the | The layer ID of the target spatial or quality layer for which the | |||
receiver wishes a refresh point. Its format is dependent on the | receiver wishes a refresh point. Its format is dependent on the | |||
payload type field. | payload type field. | |||
Current Temporal Layer ID (CTID) (3 bits): | Current Temporal-layer ID (CTID) (3 bits): | |||
If C is 1, the ID of the current temporal layer being decoded by | If C is 1, the ID of the current temporal layer being decoded by | |||
the receiver. This message is not requesting refresh of layers at | the receiver. This message is not requesting refresh of layers at | |||
or below this layer. If C is 0, this field SHALL be set to 0 by | or below this layer. If C is 0, this field SHALL be set to zero | |||
the sender and SHALL be ignored on reception. | by the sender and SHALL be ignored on reception. | |||
Current Layer ID (CLID) (8 bits): | Current Layer ID (CLID) (8 bits): | |||
If C is 1, the layer ID of the current spatial or quality layer | If C is 1, the layer ID of the current spatial or quality layer | |||
being decoded by the receiver. This message is not requesting | being decoded by the receiver. This message is not requesting | |||
refresh of layers at or below this layer. If C is 0, this field | refresh of layers at or below this layer. If C is 0, this field | |||
SHALL be set to 0 by the sender and SHALL be ignored on reception. | SHALL be set to zero by the sender and SHALL be ignored on | |||
reception. | ||||
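
The field list above maps onto three 32-bit words per FCI entry. The following is a minimal Python sketch of that layout, not a reference implementation; the function names `pack_lrr_fci` and `unpack_lrr_fci` are hypothetical, and the bit positions are read from Figure 5 and the stated field widths.

```python
import struct


def pack_lrr_fci(ssrc, seq_nr, payload_type, ttid, tlid, ctid=None, clid=None):
    """Pack one 12-byte LRR FCI entry in network byte order (hypothetical helper)."""
    if (ctid is None) != (clid is None):
        raise ValueError("CTID and CLID must be given together")
    # C is 1 only when the current layer index is present.
    c = 0 if ctid is None else 1
    # With C = 0, CTID and CLID are sent as zero.
    ctid = ctid or 0
    clid = clid or 0
    widths = (
        ("ssrc", ssrc, 32), ("seq_nr", seq_nr, 8),
        ("payload_type", payload_type, 7),
        ("ttid", ttid, 3), ("tlid", tlid, 8),
        ("ctid", ctid, 3), ("clid", clid, 8),
    )
    for name, value, bits in widths:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    # Word 2: Seq nr. (8) | C (1) | Payload Type (7) | Reserved (16, zero).
    word2 = (seq_nr << 24) | (c << 23) | (payload_type << 16)
    # Word 3: RES (5) | TTID (3) | TLID (8) | RES (5) | CTID (3) | CLID (8).
    word3 = (ttid << 24) | (tlid << 16) | (ctid << 8) | clid
    return struct.pack("!III", ssrc, word2, word3)


def unpack_lrr_fci(entry):
    """Parse one 12-byte FCI entry; reserved bits are masked out and ignored."""
    ssrc, word2, word3 = struct.unpack("!III", entry)
    c = (word2 >> 23) & 0x1
    return {
        "ssrc": ssrc,
        "seq_nr": word2 >> 24,
        "c": c,
        "payload_type": (word2 >> 16) & 0x7F,
        "ttid": (word3 >> 24) & 0x7,
        "tlid": (word3 >> 16) & 0xFF,
        # When C is 0 the current-layer fields are ignored on reception.
        "ctid": (word3 >> 8) & 0x7 if c else None,
        "clid": word3 & 0xFF if c else None,
    }
```

Omitting `ctid` and `clid` produces an entry with C = 0, which requests a refresh of all layers up to and including the target layer.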
When C is 1, TTID MUST NOT be less than CTID, and TLID MUST NOT be | When C is 1, TTID MUST NOT be less than CTID, and TLID MUST NOT be | |||
less than CLID; at least one of either TTID or TLID MUST be greater | less than CLID; at least one of either TTID or TLID MUST be greater | |||
than CTID or CLID, respectively. That is to say, the target layer | than CTID or CLID, respectively. That is to say, the target layer | |||
index <TTID, TLID> MUST be a layer upgrade from the current layer | index <TTID, TLID> MUST be a layer upgrade from the current layer | |||
index <CTID, CLID>. A sender MAY request an upgrade in both temporal | index <CTID, CLID>. A sender MAY request an upgrade in both temporal | |||
and spatial/quality layers simultaneously. | and spatial or quality layers simultaneously. | |||
A receiver receiving an LRR feedback packet that does not satisfy the | A receiver receiving an LRR feedback packet that does not satisfy the | |||
requirements of the previous paragraph, i.e., one where the C bit is | requirements of the previous paragraph, i.e., one where the C bit is | |||
present but the TTID is less than the CTID or the TLID is less than | present but the TTID is less than the CTID or the TLID is less than | |||
the CLID, MUST discard the request. | the CLID, MUST discard the request. | |||
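
The upgrade rule for C = 1 can be written as a single predicate. This is a sketch only; the name `is_valid_layer_upgrade` is hypothetical, and it applies the full requirement of the preceding paragraph, so it also rejects a request whose target and current layer indices are equal.

```python
def is_valid_layer_upgrade(ttid, tlid, ctid, clid):
    """Return True if <TTID, TLID> is a layer upgrade from <CTID, CLID>."""
    # Neither component of the target may be below the current value.
    not_lower = ttid >= ctid and tlid >= clid
    # At least one component must be strictly greater.
    strictly_higher = ttid > ctid or tlid > clid
    return not_lower and strictly_higher
```

A receiver of the feedback packet would evaluate this only when the C bit is set, and discard the request when it returns False.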
Note: the syntax of the TTID, TLID, CTID, and CLID fields match, by | | Note: The syntax of the TTID, TLID, CTID, and CLID fields | |||
design, the TID and LID fields in [RFC9626]. | | match, by design, the TID and LID fields in [RFC9626]. | |||
3.2. Semantics | 3.2. Semantics | |||
Within the common packet header for feedback messages (as defined in | Within the common packet header for feedback messages (as defined in | |||
Section 6.1 of [RFC4585]), the "SSRC of packet sender" field | Section 6.1 of [RFC4585]), the "SSRC of packet sender" field | |||
indicates the source of the request, and the "SSRC of media source" | indicates the source of the request, and the "SSRC of media source" | |||
is not used and SHALL be set to 0. The SSRCs of the media senders to | is not used and SHALL be set to zero. The SSRCs of the media senders | |||
which the LRR command applies are in the corresponding FCI entries. | to which the LRR command applies are in the corresponding FCI | |||
An LRR message MAY contain requests to multiple media senders, using | entries. An LRR message MAY contain requests to multiple media | |||
one FCI entry per target media sender. | senders, using one FCI entry per target media sender. | |||
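
As an illustration of the header usage described above, the following Python sketch assembles a complete LRR feedback packet from already-packed 12-byte FCI entries. The function name `build_lrr_packet` is hypothetical. The values V=2 and PT=206 for PSFB, and the length field counted in 32-bit words minus one, come from [RFC3550] and [RFC4585] rather than from this document; padding and compound-packet rules are not handled.

```python
import struct

RTCP_PT_PSFB = 206  # payload-specific feedback packet type, per RFC 4585
LRR_FMT = 10        # FMT value for LRR


def build_lrr_packet(sender_ssrc, fci_entries):
    """Build an RTCP PSFB packet carrying one or more 12-byte LRR FCI entries."""
    if not fci_entries:
        raise ValueError("the FCI must contain at least one LRR entry")
    if any(len(entry) != 12 for entry in fci_entries):
        raise ValueError("each LRR FCI entry is exactly 12 bytes")
    fci = b"".join(fci_entries)
    # Common feedback header is 12 bytes: 4 fixed + two 32-bit SSRC fields.
    total_len = 12 + len(fci)
    length_words = total_len // 4 - 1
    first_byte = (2 << 6) | LRR_FMT  # V=2, P=0, FMT=10
    # "SSRC of media source" is unused for LRR and is set to zero.
    header = struct.pack("!BBHII", first_byte, RTCP_PT_PSFB,
                         length_words, sender_ssrc, 0)
    return header + fci
```

Each additional target media sender adds one 12-byte entry and increases the length field by three.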
Upon reception of an LRR, the encoder MUST send a decoder refresh | Upon reception of an LRR, the encoder MUST send a decoder refresh | |||
point (see Section 2.1) as soon as possible. | point (see Section 2.1) as soon as possible. | |||
The sender MUST respect bandwidth limits provided by the application | The sender MUST respect bandwidth limits provided by the application | |||
of congestion control, as described in Section 5 of [RFC5104]. As | of congestion control, as described in Section 5 of [RFC5104]. As | |||
layer refresh points will often be larger than non-refreshing frames, | layer refresh points will often be larger than non-refreshing frames, | |||
this may restrict a sender's ability to send a layer refresh point | this may restrict a sender's ability to send a layer refresh point | |||
quickly. | quickly. | |||
An LRR MUST NOT be sent as a reaction to picture losses due to packet | An LRR MUST NOT be sent as a reaction to picture losses due to packet | |||
loss or corruption; it is RECOMMENDED to use a PLI (Picture Loss | loss or corruption; it is RECOMMENDED to use a PLI (Picture Loss | |||
Indication) [RFC4585] instead. An LRR SHOULD be used only in | Indication) [RFC4585] instead. An LRR SHOULD be used only in | |||
situations where there is an explicit change in a decoders' behavior: | situations where there is an explicit change in a decoder's behavior: | |||
for example, when a receiver will start decoding a layer that it | for example, when a receiver will start decoding a layer that it | |||
previously had been discarding. | previously had been discarding. | |||
4. Usage with Specific Codecs | 4. Usage with Specific Codecs | |||
In order for an LRR to be used with a scalable codec, the format of | In order for an LRR to be used with a scalable codec, the format of | |||
the temporal and layer ID fields (for both the target and current | the temporal and layer ID fields (for both the target and current | |||
layer indices) needs to be specified for that codec's RTP | layer indices) needs to be specified for that codec's RTP | |||
packetization. New RTP packetization specifications for scalable | packetization. New RTP packetization specifications for scalable | |||
codecs SHOULD define how this is done. (The VP9 payload [RFC9628], | codecs SHOULD define how this is done. (The VP9 payload [RFC9628], | |||
for instance, has done so.) If the payload also specifies how it is | for instance, has done so.) If the payload also specifies how it is | |||
used with the Frame Marking RTP Header Extension [RFC9626], the | used with the Video Frame Marking RTP Header Extension described in | |||
syntax MUST be defined in the same manner as the TID and LID fields | [RFC9626], the syntax MUST be defined in the same manner as the TID | |||
in that header. | and LID fields in that header. | |||
4.1. H264 SVC | 4.1. H.264 SVC | |||
H.264 SVC [RFC6190] defines temporal, dependency (spatial), and | H.264 SVC [RFC6190] defines temporal, dependency (spatial), and | |||
quality scalability modes. | quality scalability modes. | |||
+---------------+---------------+ | +---------------+---------------+ | |||
|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7| | |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7| | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| RES | TID |R| DID | QID | | | RES | TID |R| DID | QID | | |||
+---------------+---------------+ | +---------------+---------------+ | |||
Figure 6 | Figure 6: H.264 SVC Layer Index Fields Format | |||
Figure 6 shows the format of the layer index fields for H.264 SVC | Figure 6 shows the format of the layer index fields for H.264 SVC | |||
streams. The "R" and "RES" fields MUST be set to 0 on transmission | streams. The "R" and "RES" fields MUST be set to zero on | |||
and ignored on reception. See Section 1.1.3 of [RFC6190] for details | transmission and ignored on reception. See Section 1.1.3 of | |||
on the dependency_id (DID), quality_id (QID), and temporal_id (TID) | [RFC6190] for details on the dependency_id (DID), quality_id (QID), | |||
fields. | and temporal_id (TID) fields. | |||
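
A small Python sketch of how the H.264 SVC identifiers could be folded into the temporal-layer ID and layer ID values of an LRR entry follows. The helper name is hypothetical, and the widths used here (TID 3 bits, R 1 bit, DID 3 bits, QID 4 bits) are an assumption based on Figure 6 and the sizes of the corresponding H.264 SVC syntax elements.

```python
def h264_svc_layer_index(tid, did, qid):
    """Return (temporal-layer ID, layer ID byte) for an H.264 SVC layer."""
    if not 0 <= tid < 8:
        raise ValueError("temporal_id must fit in 3 bits")
    if not 0 <= did < 8:
        raise ValueError("dependency_id must fit in 3 bits")
    if not 0 <= qid < 16:
        raise ValueError("quality_id must fit in 4 bits")
    # Layer ID byte: R (1 bit, zero) | DID (3 bits) | QID (4 bits).
    layer_id = (did << 4) | qid
    return tid, layer_id
```

The two returned values would populate TTID/TLID for the target layer, or CTID/CLID for the current layer.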
A dependency or quality layer refresh of a given layer in H.264 SVC | A dependency or quality layer refresh of a given layer in H.264 SVC | |||
can be identified by the "I" bit (idr_flag) in the extended Network | can be identified by the I bit (idr_flag) in the extended Network | |||
Abstraction Layer (NAL) unit header, present in NAL unit types 14 | Abstraction Layer (NAL) unit header, present in NAL unit types 14 | |||
(prefix NAL unit) and 20 (coded scalable slice). Layer refresh of | (prefix NAL unit) and 20 (coded scalable slice). Layer refresh of | |||
the base layer can also be identified by its NAL unit type of its | the base layer can also be identified by its NAL unit type of its | |||
coded slices, which is "5" rather than "1". A dependency or quality | coded slices, which is "5" rather than "1". A dependency or quality | |||
layer refresh is complete once this bit has been seen on all the | layer refresh is complete once this bit has been seen on all the | |||
appropriate layers (in decoding order) above the current layer index | appropriate layers (in decoding order) above the current layer index | |||
(if any, or beginning from the base layer if not) through the target | (if any, or beginning from the base layer if not) through the target | |||
layer index. | layer index. | |||
Note that as the "I" bit in a Payload Content Scalability Information | Note that as the I bit in a Payload Content Scalability Information | |||
(PACSI) header is set if the corresponding bit is set in any of the | (PACSI) header is set if the corresponding bit is set in any of the | |||
aggregated NAL units it describes; thus, it is not sufficient to | aggregated NAL units it describes; thus, it is not sufficient to | |||
identify layer refresh when NAL units of multiple dependency or | identify layer refresh when NAL units of multiple dependency or | |||
quality layers are aggregated. | quality layers are aggregated. | |||
In H.264 SVC, temporal layer refresh information can be determined | In H.264 SVC, temporal layer refresh information can be determined | |||
from various Supplemental Encoding Information (SEI) messages in the | from various Supplemental Encoding Information (SEI) messages in the | |||
bitstream. | bitstream. | |||
Whether an H.264 SVC stream is scalably nested can be determined from | Whether an H.264 SVC stream is scalably nested can be determined from | |||
skipping to change at line 411 | skipping to change at line 416 | |||
The VP8 RTP payload format [RFC7741] defines temporal scalability | The VP8 RTP payload format [RFC7741] defines temporal scalability | |||
modes. It does not support spatial scalability. | modes. It does not support spatial scalability. | |||
+---------------+---------------+ | +---------------+---------------+ | |||
|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7| | |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7| | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| RES | TID | RES | | | RES | TID | RES | | |||
+---------------+---------------+ | +---------------+---------------+ | |||
Figure 7 | Figure 7: VP8 Layer Index Field Format | |||
Figure 7 shows the format of the layer index field for VP8 streams. | Figure 7 shows the format of the layer index field for VP8 streams. | |||
The "RES" fields MUST be set to 0 on transmission and be ignored on | The "RES" fields MUST be set to zero on transmission and be ignored | |||
reception. See Section 4.2 of [RFC7741] for details on the TID | on reception. See Section 4.2 of [RFC7741] for details on the TID | |||
field. | field. | |||
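
For VP8 the mapping is simpler, because only the temporal field carries information. The sketch below uses a hypothetical helper name and assumes that the VP8 TID is a 2-bit value, as in the VP8 payload descriptor of [RFC7741], so that it occupies the low bits of the 3-bit temporal field.

```python
def vp8_layer_index(tid):
    """Return (temporal-layer ID, layer ID byte) for a VP8 temporal layer."""
    # Assumption: the VP8 TID ranges over 0..3 (2 bits).
    if not 0 <= tid < 4:
        raise ValueError("VP8 TID is assumed to fit in 2 bits")
    # The whole layer ID byte is reserved for VP8 and is sent as zero.
    return tid, 0
```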
A VP8 layer refresh point can be identified by the presence of the | A VP8 layer refresh point can be identified by the presence of the Y | |||
"Y" bit in the VP8 payload header. When this bit is set, this and | bit (see [RFC7741]) in the VP8 payload header. When this bit is set, | |||
all subsequent frames depend only on the current base temporal layer. | this and all subsequent frames depend only on the current base | |||
On receipt of an LRR for a VP8 stream, a sender that supports LRRs | temporal layer. On receipt of an LRR for a VP8 stream, a sender that | |||
MUST encode the stream so it can set the Y bit in a packet whose | supports LRRs MUST encode the stream so it can set the Y bit in a | |||
temporal layer is at or below the target layer index. | packet whose temporal layer is at or below the target layer index. | |||
Note that in VP8, not every layer switch point can be identified by | Note that in VP8, not every layer switch point can be identified by | |||
the Y bit since the Y bit implies layer switch of all layers, not | the Y bit since the Y bit implies layer switch of all layers, not | |||
just the layer in which it is sent. Thus, the use of an LRR with VP8 | just the layer in which it is sent. Thus, the use of an LRR with VP8 | |||
can result in some inefficiency in transmission. However, this is | can result in some inefficiency in transmission. However, this is | |||
not expected to be a major issue for temporal structures in normal | not expected to be a major issue for temporal structures in normal | |||
use. | use. | |||
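The Y-bit check described above can be sketched as follows. The parser walks the VP8 payload descriptor laid out in Section 4.2 of [RFC7741] (X, I, L, T, and K bits, optional PictureID and TL0PICIDX, then the TID/Y/KEYIDX octet). The names Vp8LayerInfo, parse_vp8_descriptor, and satisfies_vp8_lrr are hypothetical, and truncated payloads are not handled beyond the IndexError Python raises.

```python
from typing import NamedTuple, Optional


class Vp8LayerInfo(NamedTuple):
    tid: Optional[int]  # None when the T bit is not set
    y: bool             # layer sync bit; meaningful only with a TID


def parse_vp8_descriptor(payload: bytes) -> Vp8LayerInfo:
    """Extract TID and Y from a VP8 payload descriptor."""
    pos = 0
    first = payload[pos]
    pos += 1
    if not first & 0x80:          # X bit clear: no extension octet
        return Vp8LayerInfo(None, False)
    ext = payload[pos]
    pos += 1
    if ext & 0x80:                # I bit: PictureID present
        # M bit selects a two-octet PictureID over a one-octet one.
        pos += 2 if payload[pos] & 0x80 else 1
    if ext & 0x40:                # L bit: TL0PICIDX present
        pos += 1
    if not ext & 0x20:            # T bit clear: TID and Y carry no meaning
        return Vp8LayerInfo(None, False)
    tk = payload[pos]             # TID (2 bits) | Y (1 bit) | KEYIDX (5 bits)
    return Vp8LayerInfo(tk >> 6, bool(tk & 0x20))


def satisfies_vp8_lrr(payload: bytes, target_tid: int) -> bool:
    """True if the packet has Y set at or below the target temporal layer."""
    info = parse_vp8_descriptor(payload)
    return info.tid is not None and info.y and info.tid <= target_tid
```

A receiver or middlebox could apply satisfies_vp8_lrr to each incoming packet after sending an LRR and stop waiting once it returns True.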
4.3. H265 | 4.3. H.265 | |||
The initial version of the H.265 payload format [RFC7798] defines | The initial version of the H.265 payload format [RFC7798] defines | |||
temporal scalability, with protocol elements reserved for spatial or | temporal scalability, with protocol elements reserved for spatial or | |||
other scalability modes (which are expected to be defined in a future | other scalability modes (which are expected to be defined in a future | |||
version of the specification). | version of the specification). | |||
+---------------+---------------+ | +---------------+---------------+ | |||
|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7| | |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7| | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| RES | TID |RES| LayerId | | | RES | TID |RES| layer ID | | |||
+---------------+---------------+ | +---------------+---------------+ | |||
Figure 8 | Figure 8: H.265 Layer Index Fields Format | |||
Figure 8 shows the format of the layer index field for H.265 streams. | Figure 8 shows the format of the layer index field for H.265 streams. | |||
The "RES" fields MUST be set to 0 on transmission and ignored on | The "RES" fields MUST be set to zero on transmission and ignored on | |||
reception. See Section 1.1.4 of [RFC7798] for details on the LayerId | reception. See Section 1.1.4 of [RFC7798] for details on the layer | |||
and TID fields. | ID and TID fields. | |||
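As with VP8, the H.265 layer index field can be sketched in a few lines. The column widths of Figure 8 are not recoverable from this rendering, so the layout is an assumption based on the field sizes in [RFC7798]: a 3-bit TID in the low-order bits of the first octet and a 6-bit layer ID in the low-order bits of the second octet, with the remaining bits treated as RES. Function names are illustrative.

```python
from typing import Tuple

# Assumed layout: octet 0 = RES (5 bits) | TID (3 bits)
#                 octet 1 = RES (2 bits) | layer ID (6 bits)
H265_TID_MASK = 0x07
H265_LAYER_ID_MASK = 0x3F


def pack_h265_layer_index(tid: int, layer_id: int) -> bytes:
    """Return the two-octet field with all RES bits set to zero."""
    if not 0 <= tid <= H265_TID_MASK:
        raise ValueError("TID must fit in 3 bits")
    if not 0 <= layer_id <= H265_LAYER_ID_MASK:
        raise ValueError("layer ID must fit in 6 bits")
    return bytes([tid, layer_id])


def parse_h265_layer_index(field: bytes) -> Tuple[int, int]:
    """Return (tid, layer_id), ignoring RES bits on reception."""
    if len(field) != 2:
        raise ValueError("layer index field is exactly 2 octets")
    return field[0] & H265_TID_MASK, field[1] & H265_LAYER_ID_MASK
```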
H.265 streams signal whether they are temporally nested by using the | H.265 streams signal whether they are temporally nested by using the | |||
vps_temporal_id_nesting_flag in the Video Parameter Set (VPS) and the | vps_temporal_id_nesting_flag in the Video Parameter Set (VPS) and the | |||
sps_temporal_id_nesting_flag in the Sequence Parameter Set (SPS). If | sps_temporal_id_nesting_flag in the Sequence Parameter Set (SPS). If | |||
this flag is set in a stream's currently applicable VPS or SPS, | this flag is set in a stream's currently applicable VPS or SPS, | |||
receivers SHOULD NOT send temporal LRR messages for that stream, as | receivers SHOULD NOT send temporal LRR messages for that stream, as | |||
every frame is implicitly a temporal layer refresh point. | every frame is implicitly a temporal layer refresh point. | |||
If a stream's sps_temporal_id_nesting_flag is not set, the NAL unit | If a stream's sps_temporal_id_nesting_flag is not set, the NAL unit | |||
types 2 to 5 inclusively identify temporal layer switching points. A | types 2 to 5 inclusively identify temporal layer switching points. A | |||
layer refresh to any higher target temporal layer is satisfied when a | layer refresh to any higher target temporal layer is satisfied when a | |||
NAL unit type of 4 or 5 with TID equal to 1 more than current TID is | NAL unit type of 4 or 5 with TID equal to 1 more than current TID is | |||
seen. Alternatively, layer refresh to a target temporal layer can be | seen. Alternatively, layer refresh to a target temporal layer can be | |||
incrementally satisfied with a NAL unit type of 2 or 3. In this | incrementally satisfied with a NAL unit type of 2 or 3. In this | |||
case, given current TID = TO and target TID = TN, layer refresh to TN | case, given current TID = TO and target TID = TN, layer refresh to TN | |||
is satisfied when a NAL unit type of 2 or 3 is seen for TID = T1, | is satisfied when a NAL unit type of 2 or 3 is seen for TID = T1, | |||
then TID = T2, all the way up to TID = TN. During this incremental | then TID = T2, all the way up to TID = TN (note that TN and TO refer | |||
to nonce variables in this instance). During this incremental | ||||
process, layer refresh to TN can be completely satisfied as soon as a | process, layer refresh to TN can be completely satisfied as soon as a | |||
NAL unit type of 2 or 3 is seen. | NAL unit type of 2 or 3 is seen. | |||
Of course, temporal layer refresh can also be satisfied whenever any | Of course, temporal layer refresh can also be satisfied whenever any | |||
Intra-Random Access Point (IRAP) NAL unit type (with values 16-23, | Intra-Random Access Point (IRAP) NAL unit type (with values 16-23, | |||
inclusively) is seen. An IRAP picture is similar to an IDR picture | inclusively) is seen. An IRAP picture is similar to an IDR picture | |||
in H.264 (NAL unit type of 5 in H.264) where decoding of the picture | in H.264 (NAL unit type of 5 in H.264) where decoding of the picture | |||
can start without any older pictures. | can start without any older pictures. | |||
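The NAL unit type ranges named above can be read directly from the two-octet H.265 NAL unit header (F, Type, LayerId, TID). The sketch below only classifies a single NAL unit; it assumes the caller has already unwrapped aggregation and fragmentation packets, and it does not track the step-by-step state needed to decide when a particular target layer has been reached. The function names and the string labels are hypothetical.

```python
from typing import Tuple


def parse_h265_nal_header(nal: bytes) -> Tuple[int, int, int]:
    """Return (nal_type, layer_id, tid) from the 2-octet NAL unit header.

    The tid value is the raw 3-bit TID header field.
    """
    if len(nal) < 2:
        raise ValueError("H.265 NAL unit header is 2 octets")
    nal_type = (nal[0] >> 1) & 0x3F
    layer_id = ((nal[0] & 0x01) << 5) | (nal[1] >> 3)
    tid = nal[1] & 0x07
    return nal_type, layer_id, tid


def classify_refresh(nal_type: int) -> str:
    """Map a NAL unit type to the refresh categories used in the text."""
    if 16 <= nal_type <= 23:
        return "irap"             # decoding can start here
    if 2 <= nal_type <= 5:
        return "temporal-switch"  # temporal layer switching point
    return "none"
```

For example, a header of 0x26 0x01 parses as type 19 on layer 0, which classify_refresh reports as an IRAP.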
In the (future) H.265 payloads that support spatial scalability, a | In the (future) H.265 payloads that support spatial scalability, a | |||
spatial layer refresh of a specific layer can be identified by NAL | spatial-layer refresh of a specific layer can be identified by NAL | |||
units with the requested layer ID and NAL unit types between 16 and | units with the requested layer ID and NAL unit types between 16 and | |||
21, inclusive. A dependency or quality layer refresh is complete | 21, inclusive. A dependency or quality layer refresh is complete | |||
once NAL units of this type have been seen on all the appropriate | once NAL units of this type have been seen on all the appropriate | |||
layers (in decoding order) above the current layer index (if any, or | layers (in decoding order) above the current layer index (if any, or | |||
beginning from the base layer if not) through the target layer index. | beginning from the base layer if not) through the target layer index. | |||
5. Usage with Different Scalability Transmission Mechanisms | 5. Usage with Different Scalability Transmission Mechanisms | |||
Several different mechanisms are defined for how scalable streams can | Several different mechanisms are defined for how scalable streams can | |||
be transmitted in RTP. The RTP Taxonomy (Section 3.7 of [RFC7656]) | be transmitted in RTP. Section 3.7 of "A Taxonomy of Semantics and | |||
Mechanisms for Real-Time Transport Protocol (RTP) Sources" [RFC7656] | ||||
defines three mechanisms: Single RTP stream on a Single media | defines three mechanisms: Single RTP stream on a Single media | |||
Transport (SRST), Multiple RTP streams on a Single media Transport | Transport (SRST), Multiple RTP streams on a Single media Transport | |||
(MRST), and Multiple RTP streams on Multiple media Transports (MRMT). | (MRST), and Multiple RTP streams on Multiple media Transports (MRMT). | |||
The LRR message is applicable to all these mechanisms. For MRST and | The LRR message is applicable to all these mechanisms. For MRST and | |||
MRMT mechanisms, the "media source" field of the LRR FCI is set to | MRMT mechanisms, the "media source" field of the LRR FCI is set to | |||
the SSRC of the RTP stream containing the layer indicated by the | the SSRC of the RTP stream containing the layer indicated by the | |||
Current Layer Index (if "C" is 1) or the stream containing the base | Current Layer Index (if "C" is 1) or the stream containing the base | |||
encoded stream (if "C" is 0). For MRMT, it is sent on the RTP | encoded stream (if "C" is 0). For MRMT, the LRR message is sent on | |||
session on which this stream is sent. On receipt, the sender MUST | the RTP session on which this stream is sent. On receipt, the sender | |||
refresh all the layers requested in the stream, simultaneously in | MUST refresh all the layers requested in the stream, simultaneously | |||
decode order. | in decode order. | |||
6. SDP Definitions | 6. SDP Definitions | |||
Section 7 of [RFC5104] defines Session Description Protocol (SDP) | Section 7 of [RFC5104] defines Session Description Protocol (SDP) | |||
procedures for indicating and negotiating support for Codec Control | procedures for indicating and negotiating support for Codec Control | |||
Messages (CCM) in SDP. This document extends this with a new codec | Messages (CCM) in SDP. This document extends this with a new codec | |||
control command, "lrr", which indicates support of the LRR. | control command, "lrr", which indicates support of the LRR. | |||
Figure 9 gives a formal Augmented Backus-Naur Form (ABNF) [RFC5234] | Figure 9 gives a formal Augmented Backus-Naur Form (ABNF) [RFC5234] | |||
showing this grammar extension, extending the grammar defined in | showing this grammar extension, extending the grammar defined in | |||
skipping to change at line 531 | skipping to change at line 538 | |||
All the security considerations of FIR feedback packets [RFC5104] | All the security considerations of FIR feedback packets [RFC5104] | |||
apply to LRR feedback packets as well. Additionally, media senders | apply to LRR feedback packets as well. Additionally, media senders | |||
receiving LRR feedback packets MUST validate that the payload types | receiving LRR feedback packets MUST validate that the payload types | |||
and layer indices they are receiving are valid for the stream they | and layer indices they are receiving are valid for the stream they | |||
are currently sending, and discard the requests if not. | are currently sending, and discard the requests if not. | |||
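The validation requirement above might be implemented along these lines. This is a sketch under stated assumptions: the StreamState record, its fields, and accept_lrr are hypothetical names, and representing the valid layers as simple upper bounds is a simplification that a real sender would replace with its actual layer structure.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class StreamState:
    """Hypothetical summary of what the sender is currently sending."""
    payload_types: FrozenSet[int]  # payload types in use on the stream
    max_tid: int                   # highest temporal layer being sent
    max_layer_id: int              # highest spatial/quality layer being sent


def accept_lrr(state: StreamState, payload_type: int,
               target_tid: int, target_layer_id: int) -> bool:
    """Return False for requests that must be discarded."""
    if payload_type not in state.payload_types:
        return False
    if not 0 <= target_tid <= state.max_tid:
        return False
    if not 0 <= target_layer_id <= state.max_layer_id:
        return False
    return True
```

Requests for which accept_lrr returns False are dropped silently rather than triggering any encoder action.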
8. IANA Considerations | 8. IANA Considerations | |||
This document defines a new entry to the "Codec Control Messages" | This document defines a new entry to the "Codec Control Messages" | |||
subregistry of the "Session Description Protocol (SDP) Parameters" | registry of the "Session Description Protocol (SDP) Parameters" | |||
registry, according to the following data: | registry group, according to the following data: | |||
Value Name: lrr | Value Name: lrr | |||
Long Name: Layer Refresh Request Command | Long Name: Layer Refresh Request Command | |||
Usable with: ccm | Usable with: ccm | |||
Mux: IDENTICAL-PER-PT | Mux: IDENTICAL-PER-PT | |||
Reference: RFC 9627 | Reference: RFC 9627 | |||
This document also defines a new entry to the "FMT Values for PSFB | This document also defines a new entry to the "FMT Values for PSFB | |||
Payload Types" subregistry of the "Real-Time Transport Protocol (RTP) | Payload Types" registry of the "Real-Time Transport Protocol (RTP) | |||
Parameters" registry, according to the following data: | Parameters" registry group, according to the following data: | |||
Name: LRR | Name: LRR | |||
Long Name: Layer Refresh Request Command | Long Name: Layer Refresh Request Command | |||
Value: 10 | Value: 10 | |||
Reference: RFC 9627 | Reference: RFC 9627 | |||
9. References | 9. References | |||
9.1. Normative References | 9.1. Normative References | |||
skipping to change at line 599 | skipping to change at line 606 | |||
[RFC7798] Wang, Y.-K., Sanchez, Y., Schierl, T., Wenger, S., and M. | [RFC7798] Wang, Y.-K., Sanchez, Y., Schierl, T., Wenger, S., and M. | |||
M. Hannuksela, "RTP Payload Format for High Efficiency | M. Hannuksela, "RTP Payload Format for High Efficiency | |||
Video Coding (HEVC)", RFC 7798, DOI 10.17487/RFC7798, | Video Coding (HEVC)", RFC 7798, DOI 10.17487/RFC7798, | |||
March 2016, <https://www.rfc-editor.org/info/rfc7798>. | March 2016, <https://www.rfc-editor.org/info/rfc7798>. | |||
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
[RFC9626] Zanaty, M., Berger, E., and S. Nandakumar, "Video Frame | [RFC9626] Zanaty, M., Berger, E., and S. Nandakumar, "Video Frame | |||
Marking RTP Header Extension", RFC 9621, | Marking RTP Header Extension", RFC 9626, | |||
DOI 10.17487/RFC9621, August 2024, | DOI 10.17487/RFC9626, February 2025, | |||
<https://www.rfc-editor.org/info/rfc9626>. | <https://www.rfc-editor.org/info/rfc9626>. | |||
9.2. Informative References | 9.2. Informative References | |||
[RFC7656] Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and | [RFC7656] Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and | |||
B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms | B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms | |||
for Real-Time Transport Protocol (RTP) Sources", RFC 7656, | for Real-Time Transport Protocol (RTP) Sources", RFC 7656, | |||
DOI 10.17487/RFC7656, November 2015, | DOI 10.17487/RFC7656, November 2015, | |||
<https://www.rfc-editor.org/info/rfc7656>. | <https://www.rfc-editor.org/info/rfc7656>. | |||
[RFC8082] Wenger, S., Lennox, J., Burman, B., and M. Westerlund, | [RFC8082] Wenger, S., Lennox, J., Burman, B., and M. Westerlund, | |||
"Using Codec Control Messages in the RTP Audio-Visual | "Using Codec Control Messages in the RTP Audio-Visual | |||
Profile with Feedback with Layered Codecs", RFC 8082, | Profile with Feedback with Layered Codecs", RFC 8082, | |||
DOI 10.17487/RFC8082, March 2017, | DOI 10.17487/RFC8082, March 2017, | |||
<https://www.rfc-editor.org/info/rfc8082>. | <https://www.rfc-editor.org/info/rfc8082>. | |||
[RFC9628] Lennox, J., Hong, D., Uberti, J., Holmer, S., and M. | [RFC9628] Lennox, J., Hong, D., Uberti, J., Holmer, S., and M. | |||
Flodman, "The Layer Refresh Request (LRR) RTCP Feedback | Flodman, "The Layer Refresh Request (LRR) RTCP Feedback | |||
Message", RFC 9628, DOI 10.17487/RFC9628, August 2024, | Message", RFC 9628, DOI 10.17487/RFC9628, February 2025, | |||
<https://www.rfc-editor.org/info/rfc9628>. | <https://www.rfc-editor.org/info/rfc9628>. | |||
Authors' Addresses | Authors' Addresses | |||
Jonathan Lennox | Jonathan Lennox | |||
Vidyo, Inc. | 8x8, Inc. / Jitsi | |||
433 Hackensack Avenue | Jersey City, NJ 07302 | |||
Seventh Floor | ||||
Hackensack, NJ 07601 | ||||
United States of America | United States of America | |||
Email: jonathan@vidyo.com | Email: jonathan.lennox@8x8.com | |||
Danny Hong | Danny Hong | |||
Vidyo, Inc. | Google, Inc. | |||
433 Hackensack Avenue | 315 Hudson St. | |||
Seventh Floor | New York, NY 10013 | |||
Hackensack, NJ 07601 | ||||
United States of America | United States of America | |||
Email: danny@vidyo.com | Email: dannyhong@google.com | |||
Justin Uberti | Justin Uberti | |||
Google, Inc. | OpenAI | |||
747 6th Street South | 747 6th Street South | |||
Kirkland, WA 98033 | Kirkland, WA 98033 | |||
United States of America | United States of America | |||
Email: justin@uberti.name | Email: justin@uberti.name | |||
Stefan Holmer | Stefan Holmer | |||
Google, Inc. | Google, Inc. | |||
Kungsbron 2 | Kungsbron 2 | |||
SE-111 22 Stockholm | SE-111 22 Stockholm | |||
Sweden | Sweden | |||
End of changes. 55 change blocks. | ||||
96 lines changed or deleted | 100 lines changed or added | |||