1. Project introduction
In the mobile Internet era, real-time video calls have become a standard feature in scenarios such as social networking, collaboration, education, and telemedicine. Building a high-quality Android video call feature means solving a series of hard problems: video capture, encoding and decoding, network transmission, signaling negotiation, echo cancellation, and network jitter control. This project builds a WebRTC-based Android video call example from scratch, with the following capabilities:
- Cross-end interoperability: Android ↔ Android, Android ↔ Web (or iOS)
- Video capture and rendering: capture with the Camera2 API and render the local image with OpenGL
- Audio processing: acoustic echo cancellation (AEC), automatic gain control (AGC), noise suppression (NS)
- Network transmission: UDP-based SRTP-encrypted channel with STUN/TURN NAT traversal
- Signaling exchange: SDP negotiation and ICE candidate exchange over WebSocket
- Network adaptation: monitor packet loss and round-trip time in real time, dynamically adjusting transmit resolution and bitrate
- Optional third-party integration: Agora, Tencent Cloud TRTC, Alibaba Cloud RTC, and other commercial SDKs
2. Related knowledge
1. WebRTC overview
- PeerConnection: Core interface, responsible for SDP negotiation, ICE connection, SRTP encryption and decryption
- MediaStream: Manage a set of audio and video tracks (VideoTrack, AudioTrack)
- SurfaceViewRenderer / GLSurfaceView: video rendering widgets
2. Video capture
- Camera2 API: supports high resolutions and manual focus, but its callback flow is complex
- WebRTC's CameraCapturer: wraps both the legacy Camera API and Camera2, and supports front/rear camera switching
3. Audio processing
- WebRTC has AEC, AGC, and NS built in; no extra integration is required
- Parameters can be adjusted through the AudioProcessing interface (see the sketch below)
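One way to pass such parameters is through MediaConstraints when creating the audio source. A minimal sketch, assuming the legacy goog* constraint keys (version-dependent, not a stable public API) and reusing the peerFactory built in section 5:

// The goog* keys below are legacy, version-dependent names, not a stable public API
val audioConstraints = MediaConstraints().apply {
    mandatory.add(MediaConstraints.KeyValuePair("googEchoCancellation", "true")) // AEC
    mandatory.add(MediaConstraints.KeyValuePair("googAutoGainControl", "true"))  // AGC
    mandatory.add(MediaConstraints.KeyValuePair("googNoiseSuppression", "true")) // NS
}
val audioSource = peerFactory.createAudioSource(audioConstraints)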
4. Signaling and NAT traversal
- SDP Offer/Answer: Describe audio and video capabilities and network parameters
- ICE Candidate: exchange candidate addresses to establish the P2P connection
- STUN/TURN: configure IceServers so peers behind NATs and private networks can still connect (see the sketch below)
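As a concrete example, an IceServer list with both STUN and TURN might look like the sketch below; the TURN host and credentials are placeholders for your own deployment:

val iceServers = listOf(
    // Google's public STUN server discovers each peer's public address
    PeerConnection.IceServer.builder("stun:stun.l.google.com:19302").createIceServer(),
    // TURN relays media when no direct P2P path exists (placeholder credentials)
    PeerConnection.IceServer.builder("turn:turn.example.com:3478")
        .setUsername("demo")
        .setPassword("secret")
        .createIceServer()
)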
5. Network adaptation
- Monitor network status through BitrateObserver and connection-state-change callbacks
- Adjust the VideoEncoder's target bitrate and resolution in real time (a stats-polling sketch follows)
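A minimal monitoring sketch built on the standard getStats API, reusing MainActivity's fields from section 5 (the report's member names vary between WebRTC versions, so treat the parsing as an assumption):

// Poll outbound RTP statistics every two seconds; a real implementation
// would parse packet loss / RTT and feed the encoder's target bitrate.
coroutineScope.launch {
    while (isActive) {
        peerConnection?.getStats { report ->
            report.statsMap.values
                .filter { it.type == "outbound-rtp" }
                .forEach { stats -> Log.d("Stats", stats.toString()) }
        }
        delay(2_000)
    }
}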
6. Third-party SDK comparison
- Agora/Tencent Cloud/Alibaba Cloud: higher-level packaging with built-in signaling and cross-platform adaptation
- Native WebRTC: free and deeply customizable, but you must build signaling and TURN services yourself
3. Implementation approach
1. Integrate WebRTC Native
- Add the WebRTC source or use a precompiled AAR
- Initialize PeerConnectionFactory, enable hardware encoding/decoding
2. UI design
- Two SurfaceViewRenderers: local preview and remote view
- Control buttons: call, hang up, switch camera, mute, mirror toggle
3. Signaling module
- Use WebSocket to communicate with signaling servers
- Define a simple protocol: {"type":"offer","sdp":...}, {"type":"answer",...}, {"type":"candidate",...}
4. P2P connection process
- Side A clicks "Call" → creates an Offer → sends it to side B
- Side B receives it → sets the remote description → creates an Answer → sends it back to A
- Both sides exchange ICE candidates → onIceConnectionChange fires with CONNECTED
5. Audio/video capture and rendering
- Initialize VideoCapturer with Camera2Enumerator and create VideoSource
- peerConnection.addTrack() adds the video and audio tracks
- The remote track is rendered via videoTrack.addSink(remoteRenderer)
6. Network optimization
- Observe available bandwidth and connection state through the PeerConnection's stats and observer callbacks
- Dynamically adjust the video sender's RtpParameters, e.g. via peerConnection.senders.find { it.track() is VideoTrack } (full snippet in section 7)
7. Server-side setup
- Node.js + the ws library implement signaling forwarding (a Kotlin relay sketch follows this list)
- STUN: the public stun:stun.l.google.com:19302 server; TURN: deploy your own or rent one
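Since this article's code is Kotlin throughout, here is a relay sketch in Kotlin using the org.java-websocket:Java-WebSocket library instead of Node.js; the class name and the broadcast-to-everyone-else policy are illustrative assumptions, good enough for a two-party demo:

import java.net.InetSocketAddress
import org.java_websocket.WebSocket
import org.java_websocket.handshake.ClientHandshake
import org.java_websocket.server.WebSocketServer

// Minimal signaling relay: forwards every message to all other clients.
// Real deployments need rooms, authentication, and reconnection handling.
class SignalingRelay(port: Int) : WebSocketServer(InetSocketAddress(port)) {
    override fun onOpen(conn: WebSocket, handshake: ClientHandshake) =
        println("client connected: ${conn.remoteSocketAddress}")
    override fun onClose(conn: WebSocket, code: Int, reason: String, remote: Boolean) =
        println("client left: $reason")
    override fun onMessage(conn: WebSocket, message: String) {
        connections.filter { it != conn }.forEach { it.send(message) }
    }
    override fun onError(conn: WebSocket?, ex: Exception) = ex.printStackTrace()
    override fun onStart() = println("signaling relay started")
}

fun main() {
    SignalingRelay(8080).start()
}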
4. Environment and dependencies
// app/build.gradle
plugins {
    id 'com.android.application'
    id 'kotlin-android'
}

android {
    compileSdkVersion 34
    defaultConfig {
        applicationId "com.example.videocall"   // placeholder application id
        minSdkVersion 21
        targetSdkVersion 34
        // Camera and microphone permissions are requested at runtime
    }
    buildFeatures { viewBinding true }
    kotlinOptions { jvmTarget = "1.8" }
}

dependencies {
    implementation 'org.webrtc:google-webrtc:1.0.32006'                      // official AAR
    implementation 'org.jetbrains.kotlinx:kotlinx-coroutines-android:1.6.4'
    implementation 'com.squareup.okhttp3:okhttp:4.10.0'                      // WebSocket
}
5. Integration code
// =======================================================
// File: AndroidManifest.xml
// Description: camera and microphone permissions
// =======================================================
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.videocall"> <!-- placeholder package name -->

    <uses-permission android:name="android.permission.CAMERA"/>
    <uses-permission android:name="android.permission.RECORD_AUDIO"/>
    <uses-permission android:name="android.permission.INTERNET"/>

    <application>
        <activity
            android:name=".MainActivity"
            android:theme="@style/Theme.AppCompat"
            android:exported="true">
            <intent-filter>
                <action android:name="android.intent.action.MAIN"/>
                <category android:name="android.intent.category.LAUNCHER"/>
            </intent-filter>
        </activity>
    </application>
</manifest>

// =======================================================
// File: res/layout/activity_main.xml
// Description: local and remote video views + control buttons
// =======================================================
<?xml version="1.0" encoding="utf-8"?>
<FrameLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    android:layout_width="match_parent"
    android:layout_height="match_parent">

    <!-- Remote video -->
    <org.webrtc.SurfaceViewRenderer
        android:id="@+id/remoteRenderer"
        android:layout_width="match_parent"
        android:layout_height="match_parent"/>

    <!-- Local preview (small window in the upper-right corner) -->
    <org.webrtc.SurfaceViewRenderer
        android:id="@+id/localRenderer"
        android:layout_width="120dp"
        android:layout_height="160dp"
        android:layout_margin="16dp"
        android:layout_gravity="top|end"/>

    <!-- Button bar -->
    <LinearLayout
        android:orientation="horizontal"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_gravity="bottom|center"
        android:gravity="center"
        android:padding="16dp">

        <Button
            android:id="@+id/btnCall"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:text="Call"/>

        <Button
            android:id="@+id/btnHangUp"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:text="Hang up"
            android:layout_marginStart="16dp"/>

        <Button
            android:id="@+id/btnSwitchCamera"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:text="Switch camera"
            android:layout_marginStart="16dp"/>
    </LinearLayout>
</FrameLayout>

// =======================================================
// File: SignalingClient.kt
// Description: WebSocket signaling client
// =======================================================
package com.example.videocall

import java.util.concurrent.TimeUnit
import okhttp3.*
import org.json.JSONObject
import org.webrtc.IceCandidate

class SignalingClient(
    private val serverUrl: String,
    private val listener: Listener
) : WebSocketListener() {

    interface Listener {
        fun onOffer(sdp: String)
        fun onAnswer(sdp: String)
        fun onCandidate(sdpMid: String, sdpMLineIndex: Int, candidate: String)
    }

    private val client = OkHttpClient.Builder()
        .connectTimeout(10, TimeUnit.SECONDS)
        .build()

    private var ws: WebSocket? = null

    fun connect() {
        val req = Request.Builder().url(serverUrl).build()
        ws = client.newWebSocket(req, this)
    }

    fun close() { ws?.close(1000, "bye") }

    fun sendOffer(sdp: String) {
        val obj = JSONObject().apply { put("type", "offer"); put("sdp", sdp) }
        ws?.send(obj.toString())
    }

    fun sendAnswer(sdp: String) {
        val obj = JSONObject().apply { put("type", "answer"); put("sdp", sdp) }
        ws?.send(obj.toString())
    }

    fun sendCandidate(c: IceCandidate) {
        val obj = JSONObject().apply {
            put("type", "candidate")
            put("sdpMid", c.sdpMid); put("sdpMLineIndex", c.sdpMLineIndex)
            put("candidate", c.sdp)
        }
        ws?.send(obj.toString())
    }

    override fun onMessage(webSocket: WebSocket, text: String) {
        val obj = JSONObject(text)
        when (obj.getString("type")) {
            "offer" -> listener.onOffer(obj.getString("sdp"))
            "answer" -> listener.onAnswer(obj.getString("sdp"))
            "candidate" -> listener.onCandidate(
                obj.getString("sdpMid"),
                obj.getInt("sdpMLineIndex"),
                obj.getString("candidate")
            )
        }
    }
}

// =======================================================
// File: MainActivity.kt
// Description: core video call logic
// =======================================================
package com.example.videocall

import android.Manifest
import android.os.Bundle
import android.util.Log
import androidx.appcompat.app.AppCompatActivity
import androidx.core.app.ActivityCompat
import com.example.videocall.databinding.ActivityMainBinding
import kotlinx.coroutines.*
import org.webrtc.*

class MainActivity : AppCompatActivity(), SignalingClient.Listener {

    private lateinit var binding: ActivityMainBinding

    // WebRTC
    private lateinit var peerFactory: PeerConnectionFactory
    private var peerConnection: PeerConnection? = null
    private lateinit var localVideoSource: VideoSource
    private lateinit var localAudioSource: AudioSource
    private lateinit var localVideoTrack: VideoTrack
    private lateinit var localAudioTrack: AudioTrack
    private lateinit var videoCapturer: VideoCapturer
    private lateinit var signalingClient: SignalingClient
    private val eglBase: EglBase = EglBase.create()
    private val coroutineScope = CoroutineScope(Dispatchers.Main)

    override fun onCreate(s: Bundle?) {
        super.onCreate(s)
        binding = ActivityMainBinding.inflate(layoutInflater)
        setContentView(binding.root)

        // 1. Request runtime permissions
        ActivityCompat.requestPermissions(
            this,
            arrayOf(Manifest.permission.CAMERA, Manifest.permission.RECORD_AUDIO),
            1
        )

        // 2. Initialize PeerConnectionFactory
        PeerConnectionFactory.initialize(
            PeerConnectionFactory.InitializationOptions.builder(this)
                .createInitializationOptions()
        )
        peerFactory = PeerConnectionFactory.builder().createPeerConnectionFactory()

        // 3. Initialize local capture and rendering
        initLocalMedia()

        // 4. Initialize signaling (replace the URL with your own server)
        signalingClient = SignalingClient("wss://your.signaling.server/ws", this)
        signalingClient.connect()

        // 5. Button events
        binding.btnCall.setOnClickListener { startCall() }
        binding.btnHangUp.setOnClickListener { hangUp() }
        binding.btnSwitchCamera.setOnClickListener { switchCamera() }
    }

    private fun initLocalMedia() {
        // SurfaceViewRenderer initialization
        binding.localRenderer.init(eglBase.eglBaseContext, null)
        binding.remoteRenderer.init(eglBase.eglBaseContext, null)

        // Camera capture
        val enumerator = Camera2Enumerator(this)
        val camName = enumerator.deviceNames[0]
        videoCapturer = enumerator.createCapturer(camName, null)

        val surfaceTextureHelper =
            SurfaceTextureHelper.create("CaptureThread", eglBase.eglBaseContext)
        localVideoSource = peerFactory.createVideoSource(videoCapturer.isScreencast)
        videoCapturer.initialize(surfaceTextureHelper, this, localVideoSource.capturerObserver)
        videoCapturer.startCapture(1280, 720, 30)

        localVideoTrack = peerFactory.createVideoTrack("ARDAMSv0", localVideoSource)
        localVideoTrack.addSink(binding.localRenderer)

        localAudioSource = peerFactory.createAudioSource(MediaConstraints())
        localAudioTrack = peerFactory.createAudioTrack("ARDAMSa0", localAudioSource)
    }

    private fun createPeerConnection() {
        val iceServers = listOf(
            PeerConnection.IceServer.builder("stun:stun.l.google.com:19302").createIceServer()
        )
        val rtcConfig = PeerConnection.RTCConfiguration(iceServers).apply {
            continualGatheringPolicy =
                PeerConnection.ContinualGatheringPolicy.GATHER_CONTINUALLY
        }
        peerConnection = peerFactory.createPeerConnection(rtcConfig, object : PeerConnection.Observer {
            override fun onIceCandidate(c: IceCandidate) {
                signalingClient.sendCandidate(c)
            }
            override fun onAddStream(stream: MediaStream) {
                runOnUiThread { stream.videoTracks[0].addSink(binding.remoteRenderer) }
            }
            override fun onConnectionChange(newState: PeerConnection.PeerConnectionState) {
                Log.d("PC", "State = $newState")
            }
            // Other callbacks are no-ops
            override fun onIceConnectionChange(state: PeerConnection.IceConnectionState) {}
            override fun onIceConnectionReceivingChange(receiving: Boolean) {}
            override fun onIceGatheringChange(state: PeerConnection.IceGatheringState) {}
            override fun onSignalingChange(state: PeerConnection.SignalingState) {}
            override fun onIceCandidatesRemoved(candidates: Array<out IceCandidate>?) {}
            override fun onRemoveStream(stream: MediaStream?) {}
            override fun onDataChannel(dc: DataChannel?) {}
            override fun onRenegotiationNeeded() {}
            override fun onAddTrack(receiver: RtpReceiver?, streams: Array<out MediaStream>?) {}
            override fun onTrack(transceiver: RtpTransceiver?) {}
        })

        // Add audio and video tracks
        peerConnection?.addTrack(localVideoTrack)
        peerConnection?.addTrack(localAudioTrack)
    }

    private fun startCall() {
        createPeerConnection()
        peerConnection?.createOffer(object : SdpObserver {
            override fun onCreateSuccess(desc: SessionDescription) {
                peerConnection?.setLocalDescription(this, desc)
                signalingClient.sendOffer(desc.description)
            }
            override fun onSetSuccess() {}
            override fun onCreateFailure(e: String) {}
            override fun onSetFailure(e: String) {}
        }, MediaConstraints())
    }

    private fun hangUp() {
        peerConnection?.close(); peerConnection = null
        signalingClient.close()
    }

    private fun switchCamera() {
        (videoCapturer as CameraVideoCapturer).switchCamera(null)
    }

    // ===== Signaling callbacks =====
    override fun onOffer(sdp: String) {
        if (peerConnection == null) createPeerConnection()
        val offer = SessionDescription(SessionDescription.Type.OFFER, sdp)
        peerConnection?.setRemoteDescription(object : SdpObserver {
            override fun onSetSuccess() {
                peerConnection?.createAnswer(object : SdpObserver {
                    override fun onCreateSuccess(desc: SessionDescription) {
                        peerConnection?.setLocalDescription(this, desc)
                        signalingClient.sendAnswer(desc.description)
                    }
                    override fun onSetSuccess() {}
                    override fun onCreateFailure(e: String) {}
                    override fun onSetFailure(e: String) {}
                }, MediaConstraints())
            }
            override fun onCreateSuccess(p0: SessionDescription?) {}
            override fun onCreateFailure(p0: String?) {}
            override fun onSetFailure(p0: String?) {}
        }, offer)
    }

    override fun onAnswer(sdp: String) {
        val answer = SessionDescription(SessionDescription.Type.ANSWER, sdp)
        peerConnection?.setRemoteDescription(object : SdpObserver {
            override fun onSetSuccess() {}
            override fun onCreateSuccess(p0: SessionDescription?) {}
            override fun onCreateFailure(p0: String?) {}
            override fun onSetFailure(p0: String?) {}
        }, answer)
    }

    override fun onCandidate(sdpMid: String, sdpMLineIndex: Int, cand: String) {
        val candidate = IceCandidate(sdpMid, sdpMLineIndex, cand)
        peerConnection?.addIceCandidate(candidate)
    }
}

// =======================================================
// File: SimpleSdpObserver.kt
// Description: simplified SdpObserver
// =======================================================
package com.example.videocall

import org.webrtc.SdpObserver
import org.webrtc.SessionDescription

abstract class SimpleSdpObserver : SdpObserver {
    override fun onCreateSuccess(desc: SessionDescription?) {}
    override fun onSetSuccess() {}
    override fun onCreateFailure(error: String?) {}
    override fun onSetFailure(error: String?) {}
}
6. Code walkthrough
1. Permission request
Dynamically request camera and microphone permissions; initialize WebRTC only after they are granted.
2. PeerConnectionFactory initialization
- PeerConnectionFactory.initialize configures the global environment;
- createPeerConnectionFactory builds the factory that manages audio/video sources and the underlying network stack.
3. Local capture and rendering
- Camera2Enumerator is used first; fall back to the legacy Camera API for compatibility with older devices;
- SurfaceViewRenderer.init must run after the EGLContext has been created;
- startCapture begins real-time capture and pushes frames into the VideoSource.
4. Signaling interaction
- A simple JSON protocol over a single WebSocket channel, suitable for small demos;
- For production, add authentication, reconnection, message queuing, and other reliability measures.
5. P2P and NAT traversal
- STUN alone cannot connect two peers that are both behind NATs with no direct path; a TURN server must relay the traffic;
- Multiple IceServers can be added to rtcConfig.
6. Call control
- "Call" creates a PeerConnection and creates an Offer;
- "Hang up" requires closing PeerConnection and signaling channels at the same time, and releasing local resources.
7. Performance and Optimization
1. Hardware encoding/decoding
WebRTC enables hardware encoding/decoding by default; you can adjust this through options when building the PeerConnectionFactory, as sketched below.
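A sketch of passing explicit codec factories when building the factory; these are standard org.webrtc classes, and the two boolean flags on DefaultVideoEncoderFactory enable the Intel VP8 encoder and H.264 high profile:

val encoderFactory = DefaultVideoEncoderFactory(
    eglBase.eglBaseContext,
    /* enableIntelVp8Encoder = */ true,
    /* enableH264HighProfile = */ true
)
val decoderFactory = DefaultVideoDecoderFactory(eglBase.eglBaseContext)

peerFactory = PeerConnectionFactory.builder()
    .setVideoEncoderFactory(encoderFactory)
    .setVideoDecoderFactory(decoderFactory)
    .createPeerConnectionFactory()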
2. Adaptive bitrate
Watch the available send bandwidth (e.g. googAvailableSendBandwidth in the legacy StatsObserver report) and apply it dynamically:

val sender = peerConnection!!.senders.first { it.track() is VideoTrack }
val parameters = sender.parameters
parameters.encodings[0].maxBitrateBps = newRate  // newRate comes from your bandwidth estimate
sender.parameters = parameters
3. Multi-channel video
Multiple streams can be sent at the same time (e.g. screen sharing + camera); each additional stream needs its own RtpSender, as sketched below.
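A sketch of a second sender for screen sharing, reusing MainActivity's fields; screenIntent stands for the (hypothetical) result Intent returned by the MediaProjection permission dialog:

// Screen capturer built on MediaProjection (import android.media.projection.MediaProjection)
val screenCapturer = ScreenCapturerAndroid(screenIntent, object : MediaProjection.Callback() {
    override fun onStop() { Log.d("Screen", "projection stopped") }
})
val screenSource = peerFactory.createVideoSource(/* isScreencast = */ true)
screenCapturer.initialize(
    SurfaceTextureHelper.create("ScreenThread", eglBase.eglBaseContext),
    this,
    screenSource.capturerObserver
)
screenCapturer.startCapture(1280, 720, 15)

// The second addTrack call creates a second RtpSender on the same connection
val screenTrack = peerFactory.createVideoTrack("ARDAMSv1", screenSource)
peerConnection?.addTrack(screenTrack)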
4. Echo cancellation and volume balancing
Use WebRTC's default AEC and AGC; for special scenarios, the software echo canceller can be enabled instead, as sketched below.
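A sketch of steering between hardware and software audio effects via JavaAudioDeviceModule (org.webrtc.audio), wired in when the factory is built:

val audioDeviceModule = JavaAudioDeviceModule.builder(this)
    // Disable the device's hardware effects so WebRTC's software AEC/NS run instead
    .setUseHardwareAcousticEchoCanceler(false)
    .setUseHardwareNoiseSuppressor(false)
    .createAudioDeviceModule()

peerFactory = PeerConnectionFactory.builder()
    .setAudioDeviceModule(audioDeviceModule)
    .createPeerConnectionFactory()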
5. Traffic encryption
SRTP is enabled by default; if you need higher security, you can layer a TLS tunnel on top of the UDP transport.
8. Project Summary and Expansion
This article used a native WebRTC example to walk through the entire Android real-time video call flow: permissions, factory initialization, camera capture, signaling, P2P connection setup, and dynamic network optimization. You can extend it further:
Screen sharing: push the in-app screen through ScreenCapturerAndroid, built on the MediaProjection API
Multi-person calls: introduce multi-stream mixing or an SFU (such as Janus, Jitsi, MediaSoup)
Visual statistics: display packet loss, frame rate, round-trip time, and bitrate curves in the UI
Third-party SDK integration: combine WebRTC with Agora/Tencent TRTC for more complete commercial features
Compose refactoring: migrate the rendering views and controls to Jetpack Compose
9. Frequently Asked Questions
Q1: How do I integrate the WebRTC AAR?
A1: Add implementation 'org.webrtc:google-webrtc:1.0.32006' directly in Gradle; there is no need to compile WebRTC yourself.
Q2: Can Socket.IO be used for the signaling server?
A2: Yes; communicate with the server via socket.io-client, and watch out for cross-origin restrictions and binary message formats.
Q3: How do I avoid camera conflicts?
A3: Check videoCapturer != null before starting capture, and call stopCapture() and dispose() in onDestroy (see the sketch below).
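A cleanup sketch matching this article's MainActivity fields:

override fun onDestroy() {
    // Stop the capturer before releasing the renderers that consume its frames
    runCatching { videoCapturer.stopCapture() }
    videoCapturer.dispose()
    binding.localRenderer.release()
    binding.remoteRenderer.release()
    peerConnection?.close()
    super.onDestroy()
}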
Q4: What if call quality is poor?
A4: Enable adaptive bitrate, lower the encoding resolution, or add TURN servers to reduce packet loss.
Q5: How do I achieve cross-platform interoperability?
A5: The Web side uses the browser's built-in WebRTC API and iOS uses the native WebRTC framework; with unified signaling and ICE configuration, all ends can interoperate.