How To Implement Screen Sharing in iOS App using ReplayKit and App Extension

Intro

Screen sharing means capturing a user's display and showing it to peers during a video call.

There are two ways to implement screen sharing in your iOS app:

  1. Screen sharing in app. The user can share the screen only from within one particular app. If they minimize the app, broadcasting stops. It's quite easy to implement.
  2. Screen sharing with extensions. This approach enables screen sharing from almost anywhere in the OS, e.g. the Home Screen, other apps, or system settings. The implementation, however, can be quite time-consuming.

In this article, we’ll share guides on both.

Screen sharing in app

Let's start with the easy part: screen sharing within an app. We'll use Apple's ReplayKit framework.

import ReplayKit

class ScreenShareViewController: UIViewController {

    lazy var startScreenShareButton: UIButton = {
        let button = UIButton()
        button.setTitle("Start screen share", for: .normal)
        button.setTitleColor(.systemGreen, for: .normal)
        return button
    }()
    
    lazy var stopScreenShareButton: UIButton = {
        let button = UIButton()
        button.setTitle("Stop screen share", for: .normal)
        button.setTitleColor(.systemRed, for: .normal)
        return button
    }()

    lazy var changeBgColorButton: UIButton = {
        let button = UIButton()
        button.setTitle("Change background color", for: .normal)
        button.setTitleColor(.gray, for: .normal)
        return button
    }()
    
    lazy var videoImageView: UIImageView = {
        let imageView = UIImageView()
        imageView.image = UIImage(systemName: "rectangle.slash")
        imageView.contentMode = .scaleAspectFit
        return imageView
    }()
}

Here we import ReplayKit into the view controller that holds the start and stop buttons, the background color button, and the image view where the captured video will appear later.

the ViewController

To capture the screen, we take the shared recorder instance with RPScreenRecorder.shared() and call its startCapture(handler:completionHandler:) method.

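@objc func startScreenShareButtonTapped() {
    RPScreenRecorder.shared().startCapture { [weak self] sampleBuffer, sampleBufferType, error in
        // Report capture errors and skip the buffer that came with them
        if let error = error {
            print(error.localizedDescription)
            return
        }
        self?.handleSampleBuffer(sampleBuffer, sampleType: sampleBufferType)
    } completionHandler: { error in
        // Called once the capture has started (or failed to start)
        if let error = error {
            print(error.localizedDescription)
        }
    }
}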
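
The stop button needs a counterpart to the call above. A minimal sketch, assuming a free function with the hypothetical name stopScreenCapture that the stop button's action would call; stopCapture(handler:) is the ReplayKit method that ends a capture started with startCapture:

```swift
import ReplayKit

// Hypothetical helper (the name is an assumption, not taken from the article).
// Stops a capture that was started with startCapture(handler:completionHandler:).
func stopScreenCapture() {
    RPScreenRecorder.shared().stopCapture { error in
        // The handler receives an error if the capture could not be stopped cleanly
        if let error = error {
            print(error.localizedDescription)
        }
    }
}
```

After the capture stops, no more sample buffers are delivered, so the preview keeps showing the last rendered frame until it is cleared.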

The system then asks the user for permission to capture the screen:

the permission pop-up

ReplayKit then starts delivering CMSampleBuffer objects for each media type: video, app audio, and microphone audio. Each buffer contains the media data itself (for example, a captured video frame) along with the information needed to process it, such as timing and format.

func handleSampleBuffer(_ sampleBuffer: CMSampleBuffer, sampleType: RPSampleBufferType) {
    switch sampleType {
    case .video:
        handleVideoFrame(sampleBuffer: sampleBuffer)
    case .audioApp:
        // handle audio app
        break
    case .audioMic:
        // handle audio mic
        break
    @unknown default:
        // ignore sample types added in future SDK versions
        break
    }
}

The following function then converts each captured video frame into a UIImage and displays it on the screen.

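func handleVideoFrame(sampleBuffer: CMSampleBuffer) {
    // Skip buffers that carry no image data instead of force-unwrapping
    guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    let ciimage = CIImage(cvPixelBuffer: imageBuffer)

    // Creating a CIContext for every frame is costly; consider reusing one instance
    let context = CIContext(options: nil)
    guard let cgImage = context.createCGImage(ciimage, from: ciimage.extent) else { return }
    let image = UIImage(cgImage: cgImage)

    // The capture handler may run off the main thread, so update the UI on the main queue
    DispatchQueue.main.async { [weak self] in
        self?.render(image: image)
    }
}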

Here’s what it looks like:

generated frames

Captured screen broadcasting in WebRTC 

A typical scenario: during a video call, one peer wants to show the other what's on their screen. WebRTC is a great pick for this.

WebRTC connects two clients so that video data flows directly between them, without a media server in between. This is a peer-to-peer (p2p) connection, although a signaling server is still needed to set it up. Check out this article to learn about it in detail.

Data streams that clients exchange are media streams that contain audio and video streams. A video stream might be a camera image or a screen image.

To establish a p2p connection successfully, configure a local media stream that will later be described in the session description. To do that, take an RTCPeerConnectionFactory object, use it to create the media stream, and add audio and video tracks to the stream.

func start(peerConnectionFactory: RTCPeerConnectionFactory) {
    self.peerConnectionFactory = peerConnectionFactory
    if self.localMediaStream != nil {
        // The stream is already configured, so only restart the broadcast
        self.startBroadcast()
    } else {
        let streamLabel = UUID().uuidString.replacingOccurrences(of: "-", with: "")
        self.localMediaStream = peerConnectionFactory.mediaStream(withStreamId: streamLabel)

        let audioTrack = peerConnectionFactory.audioTrack(withTrackId: "\(streamLabel)a0")
        self.localMediaStream?.addAudioTrack(audioTrack)

        // The video source receives frames from the capturer and feeds the video track
        let videoSource = peerConnectionFactory.videoSource()
        self.videoSource = videoSource
        self.screenVideoCapturer = RTCVideoCapturer(delegate: videoSource)
        self.startBroadcast()

        let videoTrack = peerConnectionFactory.videoTrack(with: videoSource, trackId: "\(streamLabel)v0")
        self.localVideoTrack = videoTrack
        self.localMediaStream?.addVideoTrack(videoTrack)
        self.configureScreenCapturerPreview()
    }
}

Pay attention to the video track configuration:

func handleSampleBuffer(sampleBuffer: CMSampleBuffer, type: RPSampleBufferType) {
    // Only video buffers are forwarded to the WebRTC video source
    guard type == .video,
          let videoSource = videoSource,
          let screenVideoCapturer = screenVideoCapturer,
          let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    videoSource.adaptOutputFormat(toWidth: Int32(width), height: Int32(height), fps: 24)

    let rtcpixelBuffer = RTCCVPixelBuffer(pixelBuffer: pixelBuffer)
    // timeStampNs expects nanoseconds, so convert seconds to nanoseconds
    let timestamp = Date().timeIntervalSince1970 * 1_000_000_000

    let videoFrame = RTCVideoFrame(buffer: rtcpixelBuffer, rotation: RTCVideoRotation._0, timeStampNs: Int64(timestamp))
    videoSource.capturer(screenVideoCapturer, didCapture: videoFrame)
}
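
The timeStampNs parameter of RTCVideoFrame is expressed in nanoseconds, so a time interval in seconds has to be multiplied by 1,000,000,000. A minimal sketch of that conversion, assuming a helper with the hypothetical name nanoseconds(fromSeconds:):

```swift
import Foundation

/// Hypothetical helper (not part of WebRTC or the article's code):
/// converts a time interval in seconds to whole nanoseconds.
func nanoseconds(fromSeconds seconds: TimeInterval) -> Int64 {
    // 1 second = 1_000_000_000 nanoseconds
    Int64(seconds * 1_000_000_000)
}

// Example: a wall-clock timestamp suitable for a nanosecond-based field
let timestampNs = nanoseconds(fromSeconds: Date().timeIntervalSince1970)
```

Keeping the conversion in one place makes it harder to mix up microseconds and nanoseconds when frames are created in several places.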

Screen sharing with App Extension

Since iOS is a fairly closed and highly protected OS, an app can't easily reach anything outside its own sandbox. To let developers access certain features outside an app, Apple created App Extensions: separate executables that plug into specific system features. What an extension can do depends on its type. App Extensions and the main app (let's call it the Containing App) don't interact with each other directly but can share a data container. To set that up, create an App Group on the Apple Developer website, then link the group with both the Containing App and the App Extension.

containing app and extension relation
Scheme of data exchange between entities

Now let's build the App Extension. Create a new Target and select Broadcast Upload Extension. It has access to the recording stream and can process it further. Create and set up the App Group between the targets. You will now see a new folder for the App Extension. It contains Info.plist, the extension file, and the SampleHandler Swift file. SampleHandler declares a class with the same name, and this class will process the recorded stream.

The methods we can operate with are already written in this class as well: 

override func broadcastStarted(withSetupInfo setupInfo: [String : NSObject]?)
override func broadcastPaused() 
override func broadcastResumed() 
override func broadcastFinished()
override func processSampleBuffer(_ sampleBuffer: CMSampleBuffer, with sampleBufferType: RPSampleBufferType)

Their names tell us what they are responsible for, except perhaps the last one. It receives each new CMSampleBuffer together with its type. If the buffer type is .video, the buffer holds the latest frame.

Now let's implement screen sharing by launching an iOS broadcast. To start, we display the RPSystemBroadcastPickerView and set the extension it should launch.

let frame = CGRect(x: 0, y: 0, width: 60, height: 60)
let systemBroadcastPicker = RPSystemBroadcastPickerView(frame: frame)
systemBroadcastPicker.autoresizingMask = [.flexibleTopMargin, .flexibleRightMargin]
// Look up the embedded extension bundle to preselect it in the picker
if let url = Bundle.main.url(forResource: "<OurName>BroadcastExtension", withExtension: "appex", subdirectory: "PlugIns"),
   let bundle = Bundle(url: url) {
    systemBroadcastPicker.preferredExtension = bundle.bundleIdentifier
}
view.addSubview(systemBroadcastPicker)

Once the user taps "Start broadcast", the broadcast starts and the selected extension handles both its state and the stream itself. But how will the Containing App know this? Since the storage container is shared, we can exchange data through it, e.g. with UserDefaults(suiteName:) and FileManager. We can set up a timer, check the state at regular intervals, and write and read data at a known path. An alternative is to launch a local WebSocket server and connect to it. But in this article we'll only cover exchanging data via files.

Write the BroadcastStatusManagerImpl class that stores the current broadcast status and notifies its subscriber when the status changes. We'll poll for updates with a timer that fires every 0.5 seconds.

protocol BroadcastStatusSubscriber: AnyObject {
    func onChange(status: Bool)
}

protocol BroadcastStatusManager: AnyObject {
    func start()
    func stop()
    func subscribe(_ subscriber: BroadcastStatusSubscriber)
}

final class BroadcastStatusManagerImpl: BroadcastStatusManager {

    // MARK: Private properties

    private let suiteName = "group.com.<YourOrganizationName>.<>"
    private let forKey = "broadcastIsActive"

    private weak var subscriber: BroadcastStatusSubscriber?
    private var isActiveTimer: DispatchTimer?
    private var isActive = false

    deinit {
        isActiveTimer = nil
    }

    // MARK: Public methods

    func start() {
        setStatus(true)
    }

    func stop() {
        setStatus(false)
    }

    func subscribe(_ subscriber: BroadcastStatusSubscriber) {
        self.subscriber = subscriber
        isActive = getStatus()

        isActiveTimer = DispatchTimer(timeout: 0.5, repeat: true, completion: { [weak self] in
            guard let self = self else { return }

            let newStatus = self.getStatus()

            guard self.isActive != newStatus else { return }

            self.isActive = newStatus
            self.subscriber?.onChange(status: newStatus)
        }, queue: DispatchQueue.main)

        isActiveTimer?.start()
    }

    // MARK: Private methods

    private func setStatus(_ status: Bool) {
        UserDefaults(suiteName: suiteName)?.set(status, forKey: forKey)
    }

    private func getStatus() -> Bool {
        UserDefaults(suiteName: suiteName)?.bool(forKey: forKey) ?? false
    }
}
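
The DispatchTimer type used in the class above is not part of Foundation or Dispatch, and its implementation is not shown here. A minimal sketch of what it might look like, assuming it wraps DispatchSourceTimer and exposes the same initializer and start() method that the class above calls:

```swift
import Foundation

/// Hypothetical stand-in for the article's DispatchTimer (an assumption,
/// not the original implementation), built on DispatchSourceTimer.
final class DispatchTimer {
    private let timer: DispatchSourceTimer
    private var isRunning = false  // not thread-safe; call start() from one thread

    init(timeout: TimeInterval, repeat repeats: Bool,
         completion: @escaping () -> Void, queue: DispatchQueue) {
        timer = DispatchSource.makeTimerSource(queue: queue)
        // Fire after `timeout` seconds, then every `timeout` seconds if repeating
        let interval: DispatchTimeInterval =
            repeats ? .nanoseconds(Int(timeout * 1_000_000_000)) : .never
        timer.schedule(deadline: .now() + timeout, repeating: interval)
        timer.setEventHandler(handler: completion)
    }

    func start() {
        guard !isRunning else { return }
        isRunning = true
        timer.resume()
    }

    deinit {
        timer.cancel()
        // A dispatch source that was never resumed must be resumed before release
        if !isRunning { timer.resume() }
    }
}
```

Because the timer is cancelled in deinit, setting the owning property to nil, as the class above does, is enough to stop the polling.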

Now we create instances of BroadcastStatusManagerImpl in both the App Extension and the Containing App, so that each side can read the broadcast state and update it. The Containing App can't stop the broadcast directly. This is why the extension subscribes to the state: when it reports false, the App Extension terminates the broadcast using the finishBroadcastWithError method. Even though we actually end it without an error, this is the only method the Apple SDK provides for terminating a broadcast programmatically.

extension SampleHandler: BroadcastStatusSubscriber {
    func onChange(status: Bool) {
        if status == false {
            finishBroadcastWithError(NSError(domain: "<YourName>BroadcastExtension", code: 1, userInfo: [
                NSLocalizedDescriptionKey: "Broadcast completed"
            ]))
        }
    }
}

Now both apps know when the broadcast started and ended. Next, we need to deliver the data of the latest frame. To do that, we create the PixelBufferSerializer class, where we declare the serializing and deserializing methods. In SampleHandler's processSampleBuffer method we convert the CMSampleBuffer to a CVPixelBuffer and then serialize it to Data. When serializing to Data, it's important to record the pixel format as well as the height, width, and bytes per row (stride) of each plane. In this particular case there are two planes, luminance and chrominance, each with its own data. To get the buffer data, use the CVPixelBuffer family of functions.
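
The metadata part of such a serializer can be sketched without any CoreVideo calls. The following is an illustration only: the type names FrameHeader and PlaneInfo, the field order, and the little-endian byte layout are all assumptions, not the article's PixelBufferSerializer.

```swift
import Foundation

/// Hypothetical description of one plane (luminance or chrominance).
struct PlaneInfo: Equatable {
    var width: UInt32
    var height: UInt32
    var bytesPerRow: UInt32
}

/// Hypothetical header written in front of the raw plane data.
struct FrameHeader: Equatable {
    var pixelFormat: UInt32
    var planes: [PlaneInfo]

    /// Layout (assumed): format, plane count, then width/height/stride per plane.
    func serialized() -> Data {
        var data = Data()
        func append(_ value: UInt32) {
            withUnsafeBytes(of: value.littleEndian) { data.append(contentsOf: $0) }
        }
        append(pixelFormat)
        append(UInt32(planes.count))
        for plane in planes {
            append(plane.width)
            append(plane.height)
            append(plane.bytesPerRow)
        }
        return data
    }
}

extension FrameHeader {
    /// Reads one little-endian UInt32, or returns nil if the data is too short.
    private static func readUInt32(_ data: Data, at offset: Int) -> UInt32? {
        guard offset >= 0, offset + 4 <= data.count else { return nil }
        var value: UInt32 = 0
        for i in 0..<4 {
            value |= UInt32(data[data.startIndex + offset + i]) << UInt32(8 * i)
        }
        return value
    }

    /// Rebuilds the header from bytes produced by serialized().
    init?(data: Data) {
        guard let format = Self.readUInt32(data, at: 0),
              let count = Self.readUInt32(data, at: 4) else { return nil }
        var planes: [PlaneInfo] = []
        for index in 0..<Int(count) {
            let base = 8 + index * 12
            guard let width = Self.readUInt32(data, at: base),
                  let height = Self.readUInt32(data, at: base + 4),
                  let stride = Self.readUInt32(data, at: base + 8) else { return nil }
            planes.append(PlaneInfo(width: width, height: height, bytesPerRow: stride))
        }
        self.init(pixelFormat: format, planes: planes)
    }
}
```

Storing the stride separately from the width matters because rows in a pixel buffer are often padded, so the bytes per row can be larger than the visible width.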

While testing screen sharing from iOS to Android, we faced a problem: the Android device just wouldn't display the shared screen. The cause was that Android didn't support the irregular resolution of the video. We solved it by converting the frames to 1080×720.

Once the frame is serialized into Data, copy its bytes into the memory-mapped file.

memcpy(mappedFile.memory, baseAddress, data.count)

Then create the BroadcastBufferContext class in the Containing App. Its logic is similar to that of BroadcastStatusManagerImpl: it reads the file on every timer tick and passes the data on for further processing. The stream itself comes in at 60 FPS, but it's better to read it at 30 FPS, since the system doesn't perform well when processing 60 FPS due to a lack of resources.

func subscribe(_ subscriber: BroadcastBufferContextSubscriber) {
        self.subscriber = subscriber

        framePollTimer = DispatchTimer(timeout: 1.0 / 30.0, repeat: true, completion: { [weak self] in
            guard let mappedFile = self?.mappedFile else {
                return
            }

            var orientationValue: Int32 = 0
            mappedFile.read(at: 0 ..< 4, to: &orientationValue)
            self?.subscriber?.newFrame(Data(
                bytesNoCopy: mappedFile.memory.advanced(by: 4),
                count: mappedFile.size - 4,
                deallocator: .none
            ))
        }, queue: DispatchQueue.main)
        framePollTimer?.start()
    }
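
The polling code above implies a simple layout for the shared file: a 4-byte orientation value followed by the serialized frame. A sketch of that layout using plain Data, where the helper names packFrame and unpackFrame and the little-endian byte order are assumptions made for illustration:

```swift
import Foundation

/// Hypothetical writer side: 4 bytes of orientation, then the frame bytes.
func packFrame(orientation: Int32, payload: Data) -> Data {
    var data = Data(capacity: 4 + payload.count)
    withUnsafeBytes(of: orientation.littleEndian) { data.append(contentsOf: $0) }
    data.append(payload)
    return data
}

/// Hypothetical reader side: splits the buffer back into orientation and frame bytes.
/// Returns nil if the buffer is too short to contain the orientation prefix.
func unpackFrame(_ data: Data) -> (orientation: Int32, payload: Data)? {
    guard data.count >= 4 else { return nil }
    var raw: UInt32 = 0
    for (index, byte) in data.prefix(4).enumerated() {
        raw |= UInt32(byte) << UInt32(8 * index)
    }
    // Copy the remainder so the payload starts at index 0
    return (Int32(bitPattern: raw), Data(data.dropFirst(4)))
}
```

Unlike this sketch, the polling code avoids copying the frame by wrapping the mapped memory with bytesNoCopy, which matters when a frame is read 30 times per second.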

Deserialize it all back into a CVPixelBuffer by reversing the serialization steps. Then configure the video track by setting the resolution and FPS.

videoSource.adaptOutputFormat(toWidth: Int32(width), height: Int32(height), fps: 60)

Now create the frame with RTCVideoFrame(buffer: rtcpixelBuffer, rotation: RTCVideoRotation._0, timeStampNs: Int64(timestamp)) and pass it to the video source. The video track then goes into the local stream.

localMediaStream.addVideoTrack(videoTrack)

Conclusion 

Implementing screen sharing in iOS is not as easy as it may seem. The closed nature and security of the OS force developers to look for workarounds for such tasks. We've found some: check out the result in our Fora Soft Video Calls app. Download on AppStore.