1. Project introduction
In the mobile Internet era, real-time video calls have become a standard feature in scenarios such as social networking, collaboration, education, and telemedicine. Building a high-quality Android video call feature means solving a series of hard problems: video capture, encoding and decoding, network transmission, signaling negotiation, echo cancellation, and network jitter control. This project builds a WebRTC-based Android video call example from scratch, with the following capabilities:
- Cross-end interoperability: Android ↔ Android, Android ↔ Web (or iOS)
- Video capture and rendering: capture with the Camera2 API and render the local image with OpenGL
- Audio processing: acoustic echo cancellation (AEC), automatic gain control (AGC), noise suppression (NS)
- Network transmission: UDP-based SRTP-encrypted channel with STUN/TURN NAT traversal
- Signaling exchange: SDP negotiation and ICE candidate exchange over WebSocket
- Network adaptation: monitor packet loss and round-trip time in real time, dynamically adjusting transmit resolution and bitrate
- Optional third-party integration: Agora, Tencent Cloud TRTC, Alibaba Cloud RTC, and other commercial SDKs
2. Related knowledge
1. WebRTC overview
- PeerConnection: Core interface, responsible for SDP negotiation, ICE connection, SRTP encryption and decryption
- MediaStream: Manage a set of audio and video tracks (VideoTrack, AudioTrack)
- SurfaceViewRenderer / GLSurfaceView: video rendering widgets
2. Video capture
- Camera2 API: supports high resolutions and manual focus, but its callback flow is complex
- WebRTC's CameraCapturer: wraps both the legacy Camera API and Camera2, and supports front/rear camera switching
3. Audio processing
- WebRTC has AEC, AGC, and NS built in; no extra integration is required
- Parameters can be adjusted through the AudioProcessing interface (see the sketch below)
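One way to pass such parameters is through MediaConstraints when creating the audio source. A minimal sketch, assuming the legacy goog* constraint keys (version-dependent, not a stable public API) and reusing the peerFactory built in section 5:

// The goog* keys below are legacy, version-dependent names, not a stable public API
val audioConstraints = MediaConstraints().apply {
    mandatory.add(MediaConstraints.KeyValuePair("googEchoCancellation", "true")) // AEC
    mandatory.add(MediaConstraints.KeyValuePair("googAutoGainControl", "true"))  // AGC
    mandatory.add(MediaConstraints.KeyValuePair("googNoiseSuppression", "true")) // NS
}
val audioSource = peerFactory.createAudioSource(audioConstraints)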
4. Signaling and NAT traversal
- SDP Offer/Answer: Describe audio and video capabilities and network parameters
- ICE Candidate: exchange candidate addresses to establish the P2P connection
- STUN/TURN: configure IceServers so peers behind NATs and private networks can still connect (see the sketch below)
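As a concrete example, an IceServer list with both STUN and TURN might look like the sketch below; the TURN host and credentials are placeholders for your own deployment:

val iceServers = listOf(
    // Google's public STUN server discovers each peer's public address
    PeerConnection.IceServer.builder("stun:stun.l.google.com:19302").createIceServer(),
    // TURN relays media when no direct P2P path exists (placeholder credentials)
    PeerConnection.IceServer.builder("turn:turn.example.com:3478")
        .setUsername("demo")
        .setPassword("secret")
        .createIceServer()
)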
5. Network adaptation
- Monitor network status through BitrateObserver and connection-state-change callbacks
- Adjust the VideoEncoder's target bitrate and resolution in real time (a stats-polling sketch follows)
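A minimal monitoring sketch built on the standard getStats API, reusing MainActivity's fields from section 5 (the report's member names vary between WebRTC versions, so treat the parsing as an assumption):

// Poll outbound RTP statistics every two seconds; a real implementation
// would parse packet loss / RTT and feed the encoder's target bitrate.
coroutineScope.launch {
    while (isActive) {
        peerConnection?.getStats { report ->
            report.statsMap.values
                .filter { it.type == "outbound-rtp" }
                .forEach { stats -> Log.d("Stats", stats.toString()) }
        }
        delay(2_000)
    }
}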
6. Third-party SDK comparison
- Agora/Tencent Cloud/Alibaba Cloud: higher-level packaging with built-in signaling and cross-platform adaptation
- Native WebRTC: free and deeply customizable, but you must build signaling and TURN services yourself
3. Implementation approach
1. Integrate WebRTC Native
- Add the WebRTC source or use a precompiled AAR
- Initialize PeerConnectionFactory, enable hardware encoding/decoding
2. UI design
- Two SurfaceViewRenderers: local preview and remote view
- Control buttons: call, hang up, switch camera, mute, mirror toggle
3. Signaling module
- Use WebSocket to communicate with signaling servers
- Define a simple protocol: {"type":"offer","sdp":...}, {"type":"answer",...}, {"type":"candidate",...}
4. P2P connection process
- Side A clicks "Call" → creates an Offer → sends it to side B
- Side B receives it → sets the remote description → creates an Answer → sends it back to A
- Both sides exchange ICE candidates → onIceConnectionChange fires with CONNECTED
5. Audio/video capture and rendering
- Initialize VideoCapturer with Camera2Enumerator and create VideoSource
- peerConnection.addTrack() adds the video and audio tracks
- The remote track is rendered via videoTrack.addSink(remoteRenderer)
6. Network optimization
- Observe available bandwidth and connection state through the PeerConnection's stats and observer callbacks
- Dynamically adjust the video sender's RtpParameters, e.g. via peerConnection.senders.find { it.track() is VideoTrack } (full snippet in section 7)
7. Server-side setup
- Node.js + the ws library implement signaling forwarding (a Kotlin relay sketch follows this list)
- STUN: the public stun:stun.l.google.com:19302 server; TURN: deploy your own or rent one
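Since this article's code is Kotlin throughout, here is a relay sketch in Kotlin using the org.java-websocket:Java-WebSocket library instead of Node.js; the class name and the broadcast-to-everyone-else policy are illustrative assumptions, good enough for a two-party demo:

import java.net.InetSocketAddress
import org.java_websocket.WebSocket
import org.java_websocket.handshake.ClientHandshake
import org.java_websocket.server.WebSocketServer

// Minimal signaling relay: forwards every message to all other clients.
// Real deployments need rooms, authentication, and reconnection handling.
class SignalingRelay(port: Int) : WebSocketServer(InetSocketAddress(port)) {
    override fun onOpen(conn: WebSocket, handshake: ClientHandshake) =
        println("client connected: ${conn.remoteSocketAddress}")
    override fun onClose(conn: WebSocket, code: Int, reason: String, remote: Boolean) =
        println("client left: $reason")
    override fun onMessage(conn: WebSocket, message: String) {
        connections.filter { it != conn }.forEach { it.send(message) }
    }
    override fun onError(conn: WebSocket?, ex: Exception) = ex.printStackTrace()
    override fun onStart() = println("signaling relay started")
}

fun main() {
    SignalingRelay(8080).start()
}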
4. Environment and dependencies
// app/build.gradle
plugins {
    id 'com.android.application'
    id 'kotlin-android'
}

android {
    compileSdkVersion 34
    defaultConfig {
        applicationId "com.example.videocall"   // placeholder application id
        minSdkVersion 21
        targetSdkVersion 34
        // Camera and microphone permissions are requested at runtime
    }
    buildFeatures { viewBinding true }
    kotlinOptions { jvmTarget = "1.8" }
}

dependencies {
    implementation 'org.webrtc:google-webrtc:1.0.32006'                      // official AAR
    implementation 'org.jetbrains.kotlinx:kotlinx-coroutines-android:1.6.4'
    implementation 'com.squareup.okhttp3:okhttp:4.10.0'                      // WebSocket
}
5. Integration code
// =======================================================
// File: AndroidManifest.xml
// Description: camera and microphone permissions
// =======================================================
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.videocall"> <!-- placeholder package name -->

    <uses-permission android:name="android.permission.CAMERA"/>
    <uses-permission android:name="android.permission.RECORD_AUDIO"/>
    <uses-permission android:name="android.permission.INTERNET"/>

    <application>
        <activity
            android:name=".MainActivity"
            android:theme="@style/Theme.AppCompat"
            android:exported="true">
            <intent-filter>
                <action android:name="android.intent.action.MAIN"/>
                <category android:name="android.intent.category.LAUNCHER"/>
            </intent-filter>
        </activity>
    </application>
</manifest>

// =======================================================
// File: res/layout/activity_main.xml
// Description: local and remote video views + control buttons
// =======================================================
<?xml version="1.0" encoding="utf-8"?>
<FrameLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    android:layout_width="match_parent"
    android:layout_height="match_parent">

    <!-- Remote video -->
    <org.webrtc.SurfaceViewRenderer
        android:id="@+id/remoteRenderer"
        android:layout_width="match_parent"
        android:layout_height="match_parent"/>

    <!-- Local preview (small window in the upper-right corner) -->
    <org.webrtc.SurfaceViewRenderer
        android:id="@+id/localRenderer"
        android:layout_width="120dp"
        android:layout_height="160dp"
        android:layout_margin="16dp"
        android:layout_gravity="top|end"/>

    <!-- Button bar -->
    <LinearLayout
        android:orientation="horizontal"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_gravity="bottom|center"
        android:gravity="center"
        android:padding="16dp">

        <Button
            android:id="@+id/btnCall"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:text="Call"/>

        <Button
            android:id="@+id/btnHangUp"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:text="Hang up"
            android:layout_marginStart="16dp"/>

        <Button
            android:id="@+id/btnSwitchCamera"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:text="Switch camera"
            android:layout_marginStart="16dp"/>
    </LinearLayout>
</FrameLayout>

// =======================================================
// File: SignalingClient.kt
// Description: WebSocket signaling client
// =======================================================
package com.example.videocall

import java.util.concurrent.TimeUnit
import okhttp3.*
import org.json.JSONObject
import org.webrtc.IceCandidate

class SignalingClient(
    private val serverUrl: String,
    private val listener: Listener
) : WebSocketListener() {

    interface Listener {
        fun onOffer(sdp: String)
        fun onAnswer(sdp: String)
        fun onCandidate(sdpMid: String, sdpMLineIndex: Int, candidate: String)
    }

    private val client = OkHttpClient.Builder()
        .connectTimeout(10, TimeUnit.SECONDS)
        .build()

    private var ws: WebSocket? = null

    fun connect() {
        val req = Request.Builder().url(serverUrl).build()
        ws = client.newWebSocket(req, this)
    }

    fun close() { ws?.close(1000, "bye") }

    fun sendOffer(sdp: String) {
        val obj = JSONObject().apply { put("type", "offer"); put("sdp", sdp) }
        ws?.send(obj.toString())
    }

    fun sendAnswer(sdp: String) {
        val obj = JSONObject().apply { put("type", "answer"); put("sdp", sdp) }
        ws?.send(obj.toString())
    }

    fun sendCandidate(c: IceCandidate) {
        val obj = JSONObject().apply {
            put("type", "candidate")
            put("sdpMid", c.sdpMid); put("sdpMLineIndex", c.sdpMLineIndex)
            put("candidate", c.sdp)
        }
        ws?.send(obj.toString())
    }

    override fun onMessage(webSocket: WebSocket, text: String) {
        val obj = JSONObject(text)
        when (obj.getString("type")) {
            "offer" -> listener.onOffer(obj.getString("sdp"))
            "answer" -> listener.onAnswer(obj.getString("sdp"))
            "candidate" -> listener.onCandidate(
                obj.getString("sdpMid"),
                obj.getInt("sdpMLineIndex"),
                obj.getString("candidate")
            )
        }
    }
}

// =======================================================
// File: MainActivity.kt
// Description: core video call logic
// =======================================================
package com.example.videocall

import android.Manifest
import android.os.Bundle
import android.util.Log
import androidx.appcompat.app.AppCompatActivity
import androidx.core.app.ActivityCompat
import com.example.videocall.databinding.ActivityMainBinding
import kotlinx.coroutines.*
import org.webrtc.*

class MainActivity : AppCompatActivity(), SignalingClient.Listener {

    private lateinit var binding: ActivityMainBinding

    // WebRTC
    private lateinit var peerFactory: PeerConnectionFactory
    private var peerConnection: PeerConnection? = null
    private lateinit var localVideoSource: VideoSource
    private lateinit var localAudioSource: AudioSource
    private lateinit var localVideoTrack: VideoTrack
    private lateinit var localAudioTrack: AudioTrack
    private lateinit var videoCapturer: VideoCapturer
    private lateinit var signalingClient: SignalingClient
    private val eglBase: EglBase = EglBase.create()
    private val coroutineScope = CoroutineScope(Dispatchers.Main)

    override fun onCreate(s: Bundle?) {
        super.onCreate(s)
        binding = ActivityMainBinding.inflate(layoutInflater)
        setContentView(binding.root)

        // 1. Request runtime permissions
        ActivityCompat.requestPermissions(
            this,
            arrayOf(Manifest.permission.CAMERA, Manifest.permission.RECORD_AUDIO),
            1
        )

        // 2. Initialize PeerConnectionFactory
        PeerConnectionFactory.initialize(
            PeerConnectionFactory.InitializationOptions.builder(this)
                .createInitializationOptions()
        )
        peerFactory = PeerConnectionFactory.builder().createPeerConnectionFactory()

        // 3. Initialize local capture and rendering
        initLocalMedia()

        // 4. Initialize signaling (replace the URL with your own server)
        signalingClient = SignalingClient("wss://your.signaling.server/ws", this)
        signalingClient.connect()

        // 5. Button events
        binding.btnCall.setOnClickListener { startCall() }
        binding.btnHangUp.setOnClickListener { hangUp() }
        binding.btnSwitchCamera.setOnClickListener { switchCamera() }
    }

    private fun initLocalMedia() {
        // SurfaceViewRenderer initialization
        binding.localRenderer.init(eglBase.eglBaseContext, null)
        binding.remoteRenderer.init(eglBase.eglBaseContext, null)

        // Camera capture
        val enumerator = Camera2Enumerator(this)
        val camName = enumerator.deviceNames[0]
        videoCapturer = enumerator.createCapturer(camName, null)

        val surfaceTextureHelper =
            SurfaceTextureHelper.create("CaptureThread", eglBase.eglBaseContext)
        localVideoSource = peerFactory.createVideoSource(videoCapturer.isScreencast)
        videoCapturer.initialize(surfaceTextureHelper, this, localVideoSource.capturerObserver)
        videoCapturer.startCapture(1280, 720, 30)

        localVideoTrack = peerFactory.createVideoTrack("ARDAMSv0", localVideoSource)
        localVideoTrack.addSink(binding.localRenderer)

        localAudioSource = peerFactory.createAudioSource(MediaConstraints())
        localAudioTrack = peerFactory.createAudioTrack("ARDAMSa0", localAudioSource)
    }

    private fun createPeerConnection() {
        val iceServers = listOf(
            PeerConnection.IceServer.builder("stun:stun.l.google.com:19302").createIceServer()
        )
        val rtcConfig = PeerConnection.RTCConfiguration(iceServers).apply {
            continualGatheringPolicy =
                PeerConnection.ContinualGatheringPolicy.GATHER_CONTINUALLY
        }
        peerConnection = peerFactory.createPeerConnection(rtcConfig, object : PeerConnection.Observer {
            override fun onIceCandidate(c: IceCandidate) {
                signalingClient.sendCandidate(c)
            }
            override fun onAddStream(stream: MediaStream) {
                runOnUiThread { stream.videoTracks[0].addSink(binding.remoteRenderer) }
            }
            override fun onConnectionChange(newState: PeerConnection.PeerConnectionState) {
                Log.d("PC", "State = $newState")
            }
            // Other callbacks are no-ops
            override fun onIceConnectionChange(state: PeerConnection.IceConnectionState) {}
            override fun onIceConnectionReceivingChange(receiving: Boolean) {}
            override fun onIceGatheringChange(state: PeerConnection.IceGatheringState) {}
            override fun onSignalingChange(state: PeerConnection.SignalingState) {}
            override fun onIceCandidatesRemoved(candidates: Array<out IceCandidate>?) {}
            override fun onRemoveStream(stream: MediaStream?) {}
            override fun onDataChannel(dc: DataChannel?) {}
            override fun onRenegotiationNeeded() {}
            override fun onAddTrack(receiver: RtpReceiver?, streams: Array<out MediaStream>?) {}
            override fun onTrack(transceiver: RtpTransceiver?) {}
        })

        // Add audio and video tracks
        peerConnection?.addTrack(localVideoTrack)
        peerConnection?.addTrack(localAudioTrack)
    }

    private fun startCall() {
        createPeerConnection()
        peerConnection?.createOffer(object : SdpObserver {
            override fun onCreateSuccess(desc: SessionDescription) {
                peerConnection?.setLocalDescription(this, desc)
                signalingClient.sendOffer(desc.description)
            }
            override fun onSetSuccess() {}
            override fun onCreateFailure(e: String) {}
            override fun onSetFailure(e: String) {}
        }, MediaConstraints())
    }

    private fun hangUp() {
        peerConnection?.close(); peerConnection = null
        signalingClient.close()
    }

    private fun switchCamera() {
        (videoCapturer as CameraVideoCapturer).switchCamera(null)
    }

    // ===== Signaling callbacks =====
    override fun onOffer(sdp: String) {
        if (peerConnection == null) createPeerConnection()
        val offer = SessionDescription(SessionDescription.Type.OFFER, sdp)
        peerConnection?.setRemoteDescription(object : SdpObserver {
            override fun onSetSuccess() {
                peerConnection?.createAnswer(object : SdpObserver {
                    override fun onCreateSuccess(desc: SessionDescription) {
                        peerConnection?.setLocalDescription(this, desc)
                        signalingClient.sendAnswer(desc.description)
                    }
                    override fun onSetSuccess() {}
                    override fun onCreateFailure(e: String) {}
                    override fun onSetFailure(e: String) {}
                }, MediaConstraints())
            }
            override fun onCreateSuccess(p0: SessionDescription?) {}
            override fun onCreateFailure(p0: String?) {}
            override fun onSetFailure(p0: String?) {}
        }, offer)
    }

    override fun onAnswer(sdp: String) {
        val answer = SessionDescription(SessionDescription.Type.ANSWER, sdp)
        peerConnection?.setRemoteDescription(object : SdpObserver {
            override fun onSetSuccess() {}
            override fun onCreateSuccess(p0: SessionDescription?) {}
            override fun onCreateFailure(p0: String?) {}
            override fun onSetFailure(p0: String?) {}
        }, answer)
    }

    override fun onCandidate(sdpMid: String, sdpMLineIndex: Int, cand: String) {
        val candidate = IceCandidate(sdpMid, sdpMLineIndex, cand)
        peerConnection?.addIceCandidate(candidate)
    }
}

// =======================================================
// File: SimpleSdpObserver.kt
// Description: simplified SdpObserver
// =======================================================
package com.example.videocall

import org.webrtc.SdpObserver
import org.webrtc.SessionDescription

abstract class SimpleSdpObserver : SdpObserver {
    override fun onCreateSuccess(desc: SessionDescription?) {}
    override fun onSetSuccess() {}
    override fun onCreateFailure(error: String?) {}
    override fun onSetFailure(error: String?) {}
}
6. Code walkthrough
1. Permission request
Dynamically request camera and microphone permissions; initialize WebRTC only after they are granted.
2. PeerConnectionFactory initialization
- PeerConnectionFactory.initialize configures the global environment;
- createPeerConnectionFactory builds the factory that manages audio/video sources and the underlying network stack.
3. Local capture and rendering
- Camera2Enumerator is used first; fall back to the legacy Camera API for compatibility with older devices;
- SurfaceViewRenderer.init must run after the EGLContext has been created;
- startCapture begins real-time capture and pushes frames into the VideoSource.
4. Signaling interaction
- A simple JSON protocol over a single WebSocket channel, suitable for small demos;
- For production, add authentication, reconnection, message queuing, and other reliability measures.
5. P2P and NAT traversal
- STUN alone cannot connect two peers that are both behind NATs with no direct path; a TURN server must relay the traffic;
- Multiple IceServers can be added to rtcConfig.
6. Call control
- "Call" creates a PeerConnection and creates an Offer;
- "Hang up" requires closing PeerConnection and signaling channels at the same time, and releasing local resources.
7. Performance and Optimization
1. Hardware encoding/decoding
WebRTC enables hardware encoding/decoding by default; you can adjust this through options when building the PeerConnectionFactory, as sketched below.
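A sketch of passing explicit codec factories when building the factory; these are standard org.webrtc classes, and the two boolean flags on DefaultVideoEncoderFactory enable the Intel VP8 encoder and H.264 high profile:

val encoderFactory = DefaultVideoEncoderFactory(
    eglBase.eglBaseContext,
    /* enableIntelVp8Encoder = */ true,
    /* enableH264HighProfile = */ true
)
val decoderFactory = DefaultVideoDecoderFactory(eglBase.eglBaseContext)

peerFactory = PeerConnectionFactory.builder()
    .setVideoEncoderFactory(encoderFactory)
    .setVideoDecoderFactory(decoderFactory)
    .createPeerConnectionFactory()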
2. Adaptive bitrate
Watch the available send bandwidth (e.g. googAvailableSendBandwidth in the legacy StatsObserver report) and apply it dynamically:

val sender = peerConnection!!.senders.first { it.track() is VideoTrack }
val parameters = sender.parameters
parameters.encodings[0].maxBitrateBps = newRate  // newRate comes from your bandwidth estimate
sender.parameters = parameters
3. Multi-channel video
Multiple streams can be sent at the same time (e.g. screen sharing + camera); each additional stream needs its own RtpSender, as sketched below.
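A sketch of a second sender for screen sharing, reusing MainActivity's fields; screenIntent stands for the (hypothetical) result Intent returned by the MediaProjection permission dialog:

// Screen capturer built on MediaProjection (import android.media.projection.MediaProjection)
val screenCapturer = ScreenCapturerAndroid(screenIntent, object : MediaProjection.Callback() {
    override fun onStop() { Log.d("Screen", "projection stopped") }
})
val screenSource = peerFactory.createVideoSource(/* isScreencast = */ true)
screenCapturer.initialize(
    SurfaceTextureHelper.create("ScreenThread", eglBase.eglBaseContext),
    this,
    screenSource.capturerObserver
)
screenCapturer.startCapture(1280, 720, 15)

// The second addTrack call creates a second RtpSender on the same connection
val screenTrack = peerFactory.createVideoTrack("ARDAMSv1", screenSource)
peerConnection?.addTrack(screenTrack)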
4. Echo cancellation and volume balancing
Use WebRTC's default AEC and AGC; for special scenarios, the software echo canceller can be enabled instead, as sketched below.
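A sketch of steering between hardware and software audio effects via JavaAudioDeviceModule (org.webrtc.audio), wired in when the factory is built:

val audioDeviceModule = JavaAudioDeviceModule.builder(this)
    // Disable the device's hardware effects so WebRTC's software AEC/NS run instead
    .setUseHardwareAcousticEchoCanceler(false)
    .setUseHardwareNoiseSuppressor(false)
    .createAudioDeviceModule()

peerFactory = PeerConnectionFactory.builder()
    .setAudioDeviceModule(audioDeviceModule)
    .createPeerConnectionFactory()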
5. Traffic encryption
SRTP is enabled by default; if you need higher security, you can layer a TLS tunnel on top of the UDP transport.
8. Project Summary and Expansion
This article used a native WebRTC example to walk through the entire Android real-time video call flow: permissions, factory initialization, camera capture, signaling, P2P connection setup, and dynamic network optimization. You can extend it further:
Screen sharing: push the in-app screen through ScreenCapturerAndroid, built on the MediaProjection API
Multi-person calls: introduce multi-stream mixing or an SFU (such as Janus, Jitsi, MediaSoup)
Visual statistics: display packet loss, frame rate, round-trip time, and bitrate curves in the UI
Third-party SDK integration: combine WebRTC with Agora/Tencent TRTC for more complete commercial features
Compose refactoring: migrate the rendering views and controls to Jetpack Compose
9. Frequently Asked Questions
Q1: How do I integrate the WebRTC AAR?
A1: Add implementation 'org.webrtc:google-webrtc:1.0.32006' directly in Gradle; there is no need to compile WebRTC yourself.
Q2: Can Socket.IO be used for the signaling server?
A2: Yes; communicate with the server via socket.io-client, and watch out for cross-origin restrictions and binary message formats.
Q3: How do I avoid camera conflicts?
A3: Check videoCapturer != null before starting capture, and call stopCapture() and dispose() in onDestroy (see the sketch below).
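A cleanup sketch matching this article's MainActivity fields:

override fun onDestroy() {
    // Stop the capturer before releasing the renderers that consume its frames
    runCatching { videoCapturer.stopCapture() }
    videoCapturer.dispose()
    binding.localRenderer.release()
    binding.remoteRenderer.release()
    peerConnection?.close()
    super.onDestroy()
}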
Q4: What if call quality is poor?
A4: Enable adaptive bitrate, lower the encoding resolution, or add TURN servers to reduce packet loss.
Q5: How do I achieve cross-platform interoperability?
A5: The Web side uses the browser's built-in WebRTC API and iOS uses the native WebRTC framework; with unified signaling and ICE configuration, all ends can interoperate.