Skip to content

Latest commit

 

History

History
84 lines (70 loc) · 4.72 KB

README.md

File metadata and controls

84 lines (70 loc) · 4.72 KB

Transcoding

CI

A simple Swift package for video encoding and decoding with Annex-B adaptors optimized for transferring video over a network.

Transcoding is used for video encoding and decoding in Castaway, an app that streams HDMI capture devices from an iPad or Mac to a nearby Vision Pro.

Example data flow

Usage

VideoEncoder

VideoEncoder is an object that takes CMSampleBuffers containing CVPixelBuffers and outputs a stream of CMSampleBuffers containing CMBlockBuffers containing compressed H264/HEVC data. VideoEncoder is initialized with a Config, with presets for live capture, active transcoding, background transcoding, and ultra-low latency following Apple's recommendations.

Usage

let videoEncoder = VideoEncoder(config: .ultraLowLatency)
encoderStreamTask = Task {
    for await encodedSampleBuffer in videoEncoder.encodedSampleBuffers {
        // encodedSampleBuffer: CMSampleBuffer > CMBlockBuffer
    }
}
videoEncoder.encode(sampleBuffer)

VideoDecoder

VideoDecoder is an object that takes CMSampleBuffers containing CMBlockBufferss containing compressed H264/HEVC data and outputs a stream of CMSampleBuffers containing CVPixelBuffers. VideoDecoder is initialized with a Config containing various optional decompression settings.

Usage

let videoDecoder = VideoDecoder(config: .init(realTime: true))
decoderStreamTask = Task {
    for await decodedSampleBuffer in videoDecoder.decodedSampleBuffers {
        // decodedSampleBuffer: CMSampleBuffer > CVPixelBuffer
    }
}
videoDecoder.decode(sampleBuffer)

Annex B

VideoEncoderAnnexBAdaptor and VideoDecoderAnnexBAdaptor can be used to convert compressed CMSampleBuffers to and from an Annex B (ITU-T-REC-H.265) byte stream. This is ideal for sending compressed video data over a network.

Example Pipeline

In this example, video frames from a capture device are encoded and decoded as an Annex-B data stream optimized for low latency.

let videoEncoder = VideoEncoder(config: .ultraLowLatency)
let videoEncoderAnnexBAdaptor = VideoEncoderAnnexBAdaptor(
    videoEncoder: videoEncoder
)
let videoDecoder = VideoDecoder(config: .init(realTime: true))
let videoDecoderAnnexBAdaptor = VideoDecoderAnnexBAdaptor(
    videoDecoder: videoDecoder,
    codec: .hevc
)

videoEncoderTask = Task {
    for await data in videoEncoderAnnexBAdaptor.annexBData {
        // send data over network or whatever
    }
}

videoDecoderTask = Task {
    for await decodedSampleBuffer in videoDecoder.decodedSampleBuffers {
        // here you have a received decoded sample buffer with image buffer
    }
}

receivedMessageTask = Task {
    // Replace `realtimeStreaming.receivedMessages` with however you receive encoded data packets 
    for await (data, _) in realtimeStreaming.receivedMessages {
        videoDecoderAnnexBAdaptor.decode(data)
    }
}

captureSessionTask = Task {
    // Replace `captureSession.pixelBuffers` with your video data source
    for await pixelBuffer in captureSession.pixelBuffers {
        videoEncoder.encode(pixelBuffer)
    }
}

Important Notes

  • Currently VideoDecoderAnnexBAdaptor only supports decoding full NALU's, meaning if you are passing data over the network or some stream you must ensure you receive full video frame packets, not just an arbitrarily sized data stream. When using Network.framework, for example, you would use a custom NWProtocolFramerImplementation to receive individual messages.
  • There are a number of instances where either the encoder or decoder needs to be reset during an application lifecycle. The encoder and decoder automatically handle resetting after an iOS app has been backgrounded, but you may need to handle other cases by calling encoder/decoder.invalidate(). For example, if you maintain one encoder and disconnect then reconnect from a peer/decoder, the encoder will need to be invalidated to ensure it sends over the H264/HEVC parameter sets again. Otherwise the decoder will not be able to decode frames, as the encoder is optimized to only send SPS/PPS/VPS when necessary.