Implementing real-time walkie-talkie voice chat in Android audio development (one-on-one live-streaming source code)
Preparation
At first I planned to build the client on the Web, but due to limited expertise I switched midway to an Android (Kotlin) client, with a Spring Boot backend. The front end and back end exchange real-time data over WebSocket. After settling on the technical approach, the next step was understanding the relevant audio formats.
PCM
PCM (Pulse Code Modulation) is the binary sequence produced directly by analog-to-digital (A/D) conversion of an analog audio signal. A PCM file has no file header and no end-of-file marker, and its sound data is uncompressed; in a mono file, samples are stored sequentially in time order. However, a bare digitized audio stream cannot be played on its own: no player can know what channel count, sample rate, or sample size to use, because the binary sequence carries no self-description.
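As a quick sanity check of what uncompressed PCM means at the settings used later in this article (8 kHz sample rate, 16-bit samples, mono), the raw data rate works out as follows. A minimal sketch; the class and method names are my own:

```java
// A quick sanity check of the raw PCM data rate used in this article:
// 8 kHz sample rate, 16-bit samples, mono.
public class PcmRate {
    static int pcmBytesPerSecond(int sampleRate, int bitsPerSample, int channels) {
        return sampleRate * bitsPerSample / 8 * channels;
    }

    public static void main(String[] args) {
        // 8000 samples/s * 2 bytes/sample * 1 channel = 16000 bytes/s
        System.out.println(pcmBytesPerSecond(8000, 16, 1)); // 16000
    }
}
```

So each second of speech produces 16,000 raw bytes, which is the stream the relay has to move in real time.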
WAV
WAVE (Waveform Audio File Format), better known by its extension WAV, is also a lossless audio format. A WAV file can be regarded as a wrapper around a PCM file: comparing the hex dumps of a PCM file and its corresponding WAV file shows that the WAV file simply has 44 extra bytes at the front describing the channel count, sample rate, sample size, and so on. Because it is self-describing, a WAV file can be played by virtually any audio player. It follows naturally that to play a raw PCM stream on the web, we should only need to prepend those 44 bytes to turn it into a WAV file.
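The 44-byte figure can be verified by summing the sizes of the header's fields. A small sketch (the class name is illustrative):

```java
// The fixed PCM WAV header really is 44 bytes; summing its field sizes:
public class WavHeaderSize {
    static final int[] FIELD_SIZES = {
        4, // "RIFF"
        4, // total size minus 8
        4, // "WAVE"
        4, // "fmt "
        4, // fmt chunk size (16 for PCM)
        2, // audio format tag (1 = PCM)
        2, // channel count
        4, // sample rate
        4, // byte rate
        2, // block align
        2, // bits per sample
        4, // "data"
        4  // PCM data length
    };

    static int headerSize() {
        int sum = 0;
        for (int s : FIELD_SIZES) sum += s;
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(headerSize()); // 44
    }
}
```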
GSM
GSM 06.10 is a lossy speech compression format used in the Global System for Mobile Communications (GSM) telephony standard. Its purpose is to shrink audio data, but it introduces significant noise when a given signal is encoded and decoded repeatedly. The format is used by some voicemail applications, and encoding and decoding are CPU-intensive.
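To get a feel for the compression, GSM 06.10 encodes each 20 ms of 8 kHz speech (160 samples) into one 33-byte frame, which matches the frame size and frame rate used in the backend's gsmFormat later in this article. A quick sketch of the arithmetic (class name is my own):

```java
// GSM 06.10 encodes each 20 ms of 8 kHz speech (160 samples) into one 33-byte frame.
public class GsmRate {
    static final int GSM_FRAME_BYTES = 33;
    static final int GSM_FRAMES_PER_SECOND = 50;      // 1000 ms / 20 ms
    static final int PCM_BYTES_PER_SECOND = 8000 * 2; // 16-bit mono PCM

    public static void main(String[] args) {
        int gsmBytesPerSecond = GSM_FRAME_BYTES * GSM_FRAMES_PER_SECOND;
        System.out.println(gsmBytesPerSecond);                        // 1650 bytes/s (13.2 kbit/s)
        System.out.println(PCM_BYTES_PER_SECOND / gsmBytesPerSecond); // ~9x smaller than raw PCM
    }
}
```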
Basic flow
- The intercom initiates a call (or the Android client starts a conversation directly).
- The Android client answers and sends a network request.
- The backend receives the request and invokes the intercom's answer method.
- The backend acts as a relay, forwarding the Android client's audio data and the intercom's callback audio data.
- The Android client and the backend exchange real-time data (raw bytes) over WebSocket.
Note
- The audio data the intercom sends back is GSM.
- The audio data recorded on the Android side is PCM.
Enough talk, on to the code.
Android client
Add the dependencies in the app-level build.gradle:
```groovy
// WebSocket
api 'org.java-websocket:Java-WebSocket:1.3.6'
api 'com.github.tbruyelle:rxpermissions:0.10.2'

// Retrofit
String retrofit_version = '2.4.0'
api "com.squareup.retrofit2:retrofit:$retrofit_version"
api "com.squareup.retrofit2:converter-gson:${retrofit_version}"
api "com.squareup.retrofit2:adapter-rxjava2:${retrofit_version}"

// OkHttp
String okhttp_version = '3.4.1'
api "com.squareup.okhttp3:okhttp:${okhttp_version}"
api "com.squareup.okhttp3:logging-interceptor:${okhttp_version}"

// RxKotlin and RxAndroid 2.x
api 'io.reactivex.rxjava2:rxkotlin:2.3.0'
api 'io.reactivex.rxjava2:rxandroid:2.1.0'
```
Create a JWebSocketClient class that extends WebSocketClient:
```kotlin
class JWebSocketClient(
    serverUri: URI,
    private val callback: (data: ByteBuffer?) -> Unit
) : WebSocketClient(serverUri) {

    override fun onOpen(handshakedata: ServerHandshake?) {
        Log.d("LLLLLLLLLLLL", "onOpen")
    }

    override fun onClose(code: Int, reason: String?, remote: Boolean) {
        Log.d("LLLLLLLLLLLL", "code = $code, onClose = $reason")
    }

    override fun onMessage(message: String?) {
        //Log.d("LLLLLLLLLLLL", "onMessage = $message")
    }

    override fun onMessage(bytes: ByteBuffer?) {
        super.onMessage(bytes)
        // Binary audio frames from the server land here
        callback.invoke(bytes)
    }

    override fun onError(ex: Exception?) {
        Log.d("LLLLLLLLLLLL", "onError = $ex")
    }
}
```
In the onMessage methods, data pushed from the backend is received and handed to the Activity via callback. The relevant MainActivity code:
```kotlin
class MainActivity : AppCompatActivity() {

    private lateinit var client: WebSocketClient
    private var isGranted = false
    private var isRecording = true
    private var disposable: Disposable? = null
    private val service by lazy { RetrofitFactory.newInstance.create(ApiService::class.java) }

    private val sampleRate = 8000
    private val channelIn = AudioFormat.CHANNEL_IN_MONO
    private val channelOut = AudioFormat.CHANNEL_OUT_MONO
    private val audioFormat = AudioFormat.ENCODING_PCM_16BIT

    private val trackBufferSize by lazy {
        AudioTrack.getMinBufferSize(sampleRate, channelOut, audioFormat)
    }
    private val recordBufferSize by lazy {
        // Use AudioRecord.getMinBufferSize with the *input* channel config here
        AudioRecord.getMinBufferSize(sampleRate, channelIn, audioFormat)
    }

    private val audioTrack by lazy {
        AudioTrack(AudioManager.STREAM_MUSIC, sampleRate, channelOut, audioFormat,
            trackBufferSize, AudioTrack.MODE_STREAM)
    }

    /**
     * MediaRecorder.AudioSource.MIC is the microphone
     */
    private val audioRecord by lazy {
        AudioRecord(MediaRecorder.AudioSource.MIC, sampleRate, channelIn, audioFormat,
            recordBufferSize)
    }

    private val pcm2WavUtil by lazy { FileUtils(sampleRate, channelIn, audioFormat) }

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
        // Request permissions
        requestPermission()
        initWebSocket()

        btnReceive.setOnClickListener {
            if (client.readyState == WebSocket.READYSTATE.NOT_YET_CONNECTED) {
                client.connect()
            }
            audioTrack.play()
            // Pass in the device id
            service.talkIntercom(IdModel(10))
                .observeOn(AndroidSchedulers.mainThread())
                .subscribeOn(Schedulers.io())
                .subscribe({
                    if (!isGranted) {
                        toast("Permission denied; recording is unavailable")
                        return@subscribe
                    }
                    // Check that AudioRecord initialized successfully
                    if (audioRecord.state != AudioRecord.STATE_INITIALIZED) {
                        toast("Recorder initialization failed")
                        return@subscribe
                    }
                    audioRecord.startRecording()
                    isRecording = true
                    thread {
                        val data = ByteArray(recordBufferSize)
                        while (isRecording) {
                            val readSize = audioRecord.read(data, 0, recordBufferSize)
                            if (readSize >= AudioRecord.SUCCESS) {
                                // Transcode: wrap the PCM in a WAV header
                                client.send(pcm2WavUtil.pcm2wav(data))
                            } else {
                                "read failed".showLog()
                            }
                        }
                    }
                }, {
                    "error = $it".showLog()
                })
        }

        btnHangup.setOnClickListener {
            isRecording = false
            // Stop recording
            audioRecord.stop()
            // Stop playback
            audioTrack.stop()
            service.hangupIntercom(IdModel(10))
                .observeOn(AndroidSchedulers.mainThread())
                .subscribeOn(Schedulers.io())
                .subscribe { toast("Hung up") }
        }
    }

    private fun initWebSocket() {
        val uri = URI.create("ws://192.168.1.140:3014/websocket/16502")
        client = JWebSocketClient(uri) {
            val buffer = ByteArray(trackBufferSize)
            it?.let { byteBuffer ->
                val inputStream = ByteArrayInputStream(byteBuffer.array())
                while (inputStream.available() > 0) {
                    val readCount = inputStream.read(buffer)
                    if (readCount == -1) {
                        "no more data to read".showLog()
                        break
                    }
                    audioTrack.write(buffer, 0, readCount)
                }
            }
        }
    }

    private fun requestPermission() {
        disposable = RxPermissions(this)
            .request(
                android.Manifest.permission.RECORD_AUDIO,
                android.Manifest.permission.WRITE_EXTERNAL_STORAGE
            )
            .subscribe { granted ->
                if (!granted) {
                    toast("Permission denied; recording is unavailable")
                    return@subscribe
                }
                isGranted = true
            }
    }

    override fun onDestroy() {
        super.onDestroy()
        client.close()
        disposable?.dispose()
        audioRecord.stop()
        audioRecord.release()
    }
}
```
Because recording is involved, the app must request the recording permission, and the WebSocket is initialized up front. When the answer button is tapped, a request is sent; on success the subscribe block runs: it reads the recorded PCM data, converts it to WAV format, and sends it to the backend. The conversion code is as follows:
```kotlin
fun pcm2wav(data: ByteArray): ByteArray {
    val sampleRate = 8000
    val channels = 1
    val byteRate = (16 * sampleRate * channels / 8).toLong()
    val totalAudioLen = data.size
    val totalDataLen = totalAudioLen + 36
    val header = ByteArray(44 + data.size)

    // RIFF/WAVE header
    header[0] = 'R'.toByte()
    header[1] = 'I'.toByte()
    header[2] = 'F'.toByte()
    header[3] = 'F'.toByte()
    header[4] = (totalDataLen and 0xff).toByte()
    header[5] = (totalDataLen shr 8 and 0xff).toByte()
    header[6] = (totalDataLen shr 16 and 0xff).toByte()
    header[7] = (totalDataLen shr 24 and 0xff).toByte()
    // WAVE
    header[8] = 'W'.toByte()
    header[9] = 'A'.toByte()
    header[10] = 'V'.toByte()
    header[11] = 'E'.toByte()
    // 'fmt ' chunk
    header[12] = 'f'.toByte()
    header[13] = 'm'.toByte()
    header[14] = 't'.toByte()
    header[15] = ' '.toByte()
    // 4 bytes: size of 'fmt ' chunk
    header[16] = 16
    header[17] = 0
    header[18] = 0
    header[19] = 0
    // format = 1 (PCM)
    header[20] = 1
    header[21] = 0
    header[22] = channels.toByte()
    header[23] = 0
    header[24] = (sampleRate and 0xff).toByte()
    header[25] = (sampleRate shr 8 and 0xff).toByte()
    header[26] = (sampleRate shr 16 and 0xff).toByte()
    header[27] = (sampleRate shr 24 and 0xff).toByte()
    header[28] = (byteRate and 0xff).toByte()
    header[29] = (byteRate shr 8 and 0xff).toByte()
    header[30] = (byteRate shr 16 and 0xff).toByte()
    header[31] = (byteRate shr 24 and 0xff).toByte()
    // block align = channels * bitsPerSample / 8 (2 for 16-bit mono)
    header[32] = (channels * 16 / 8).toByte()
    header[33] = 0
    // bits per sample
    header[34] = 16
    header[35] = 0
    // data chunk
    header[36] = 'd'.toByte()
    header[37] = 'a'.toByte()
    header[38] = 't'.toByte()
    header[39] = 'a'.toByte()
    header[40] = (totalAudioLen and 0xff).toByte()
    header[41] = (totalAudioLen shr 8 and 0xff).toByte()
    header[42] = (totalAudioLen shr 16 and 0xff).toByte()
    header[43] = (totalAudioLen shr 24 and 0xff).toByte()
    // Append the raw PCM data after the header
    data.forEachIndexed { index, byte ->
        header[44 + index] = byte
    }
    return header
}
```
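As an aside, the same 44-byte header can be built with less byte twiddling using java.nio.ByteBuffer in little-endian mode. This is a sketch under the article's assumptions (8 kHz, 16-bit, mono); the WavHeader class name is my own, not part of the project:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// A minimal alternative sketch that builds the same 44-byte WAV header
// with java.nio.ByteBuffer instead of manual byte manipulation.
public class WavHeader {
    static byte[] wavHeader(int pcmLen, int sampleRate, int channels, int bits) {
        int byteRate = sampleRate * channels * bits / 8;
        int blockAlign = channels * bits / 8;
        ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes()).putInt(36 + pcmLen);
        b.put("WAVE".getBytes());
        b.put("fmt ".getBytes()).putInt(16);   // fmt chunk size for PCM
        b.putShort((short) 1);                 // format tag: 1 = PCM
        b.putShort((short) channels);
        b.putInt(sampleRate).putInt(byteRate);
        b.putShort((short) blockAlign).putShort((short) bits);
        b.put("data".getBytes()).putInt(pcmLen);
        return b.array();
    }

    public static void main(String[] args) {
        byte[] header = wavHeader(320, 8000, 1, 16); // 320 bytes = 20 ms of audio
        System.out.println(header.length);           // 44
        System.out.println(new String(header, 0, 4)); // RIFF
    }
}
```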
Backend
Add the WebSocket dependency to the Spring Boot project:
```xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-websocket</artifactId>
</dependency>
```
Then create a WebSocket endpoint class as follows:
```java
import com.kapark.cloud.context.AudioSender;
import com.xiaoleilu.hutool.log.Log;
import com.xiaoleilu.hutool.log.LogFactory;
import org.springframework.stereotype.Component;

import javax.websocket.*;
import javax.websocket.server.PathParam;
import javax.websocket.server.ServerEndpoint;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentHashMap;

/**
 * @author: hyzhan
 * @date: 2019/6/14
 */
@Component
@ServerEndpoint("/websocket/{devId}")
public class AudioSocket {

    private static Log log = LogFactory.get(AudioSocket.class);

    // Current number of online connections; access is synchronized below
    private static int onlineCount = 0;

    // Thread-safe map of devId -> AudioSocket, one entry per connected client
    private static ConcurrentHashMap<Integer, AudioSocket> webSocketMap = new ConcurrentHashMap<>();

    // The session for this client connection, used to push data back to it
    private Session session;

    // Device id received from the path
    private int devId;

    /**
     * Called when a connection is established
     */
    @OnOpen
    public void onOpen(Session session, @PathParam("devId") int devId) {
        this.session = session;
        this.devId = devId;
        webSocketMap.put(devId, this);
        addOnlineCount(); // online count +1
        log.info("New client listening: " + devId + ", online count: " + getOnlineCount());
    }

    /**
     * Called when a connection is closed
     */
    @OnClose
    public void onClose() {
        webSocketMap.remove(devId, this); // remove from the map
        subOnlineCount();                 // online count -1
        log.info("A connection closed; online count: " + getOnlineCount());
    }

    @OnMessage
    public void onMessage(String message, Session session) {
        log.info("Received String: " + message);
    }

    /**
     * Called when a binary message arrives from the client
     *
     * @param message audio bytes sent by the client
     */
    @OnMessage
    public void onMessage(byte[] message, Session session) {
        log.info("Received byte length: " + message.length);
        AudioSender.send2Intercom(devId, message);
    }

    @OnError
    public void onError(Session session, Throwable error) {
        log.error("WebSocket error");
        error.printStackTrace();
    }

    /**
     * Push binary data to the client with the given device id
     */
    public static void send2Client(int devId, byte[] data, int len) {
        AudioSocket audioSocket = webSocketMap.get(devId);
        if (audioSocket != null) {
            try {
                synchronized (audioSocket.session) {
                    audioSocket.session.getBasicRemote()
                            .sendBinary(ByteBuffer.wrap(data, 0, len));
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    private static synchronized int getOnlineCount() {
        return onlineCount;
    }

    private static synchronized void addOnlineCount() {
        AudioSocket.onlineCount++;
    }

    private static synchronized void subOnlineCount() {
        AudioSocket.onlineCount--;
    }
}
```
The code above is commented inline; the methods to focus on are onMessage() and send2Client(). onMessage() receives a client message and, based on the device id, calls send2Intercom to forward the audio to the matching intercom. send2Intercom looks like this:
```java
public static void send2Intercom(int devId, byte[] data) {
    try {
        // Wrap the incoming audio bytes in an InputStream
        InputStream inputStream = new ByteArrayInputStream(data);
        // Turn it into an AudioInputStream (audio input stream)
        AudioInputStream pcmInputStream = AudioSystem.getAudioInputStream(inputStream);
        // Transcode: PCM data to GSM
        AudioInputStream gsmInputStream = AudioSystem.getAudioInputStream(gsmFormat, pcmInputStream);
        // Adjust this buffer size to suit your needs
        byte[] tempBytes = new byte[50];
        int len;
        while ((len = gsmInputStream.read(tempBytes)) != -1) {
            // Forward to the intercom via its SDK method (requestSendAudioData)
            DongSDKProxy.requestSendAudioData(devId, tempBytes, len);
        }
    } catch (UnsupportedAudioFileException | IOException e) {
        e.printStackTrace();
    }
}
```
The gsmFormat definition is as follows:
```java
AudioFormat gsmFormat = new AudioFormat(
        org.tritonus.share.sampled.Encodings.getEncoding("GSM0610"),
        8000.0F, // sampleRate
        -1,      // sampleSizeInBits
        1,       // channels
        33,      // frameSize
        50.0F,   // frameRate
        false);  // bigEndian
```
Note
Because real-time voice pushes a lot of data and WebSocket messages have a default size limit, the backend WebSocket needs its buffer sizes increased. Reference code:
```java
@Configuration
public class WebSocketConfig {

    @Bean
    public ServerEndpointExporter serverEndpointExporter() {
        return new ServerEndpointExporter();
    }

    @Bean
    public ServletServerContainerFactoryBean createWebSocketContainer() {
        ServletServerContainerFactoryBean container = new ServletServerContainerFactoryBean();
        container.setMaxTextMessageBufferSize(500000);
        container.setMaxBinaryMessageBufferSize(500000);
        return container;
    }
}
```
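As a rough sanity check on the 500000-byte figure: at the PCM rate used here (8 kHz, 16-bit, mono), a single message of that size could hold about half a minute of audio, so it comfortably covers the small frames this pipeline actually sends. A sketch of the arithmetic (class name is my own):

```java
// How many seconds of 8 kHz / 16-bit / mono PCM fit in one 500000-byte message.
public class BufferBudget {
    static int secondsBuffered(int bufferBytes, int bytesPerSecond) {
        return bufferBytes / bytesPerSecond;
    }

    public static void main(String[] args) {
        // 500000 bytes / 16000 bytes per second = 31 seconds
        System.out.println(secondsBuffered(500_000, 8000 * 2)); // 31
    }
}
```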
When the backend receives the intercom's onAudioData callback, it calls audioSender.send2Client, which transcodes the data and sends it to the Android client.
```java
/**
 * Audio data callback from the intercom
 */
@Override
public int onAudioData(int dwDeviceID, InfoMediaData audioData) {
    audioSender.send2Client(dwDeviceID, audioData.pRawData, audioData.nRawLen);
    return 0;
}
```
```java
public void send2Client(int devId, byte[] data, long total) {
    /*
     * 1. The intercom callback delivers GSM-encoded data.
     * 2. Wrap the incoming audio bytes (GSM) in an InputStream.
     * 3. Turn the InputStream into an AudioInputStream.
     * 4. Transcode it to a PCM stream using pcmFormat.
     * 5. Read the PCM stream and push the audio to the Android client.
     */
    try (InputStream inputStream = new ByteArrayInputStream(data);
         AudioInputStream gsmInputStream = AudioSystem.getAudioInputStream(inputStream);
         AudioInputStream pcmInputStream = AudioSystem.getAudioInputStream(pcmFormat, gsmInputStream)) {
        // Adjust this buffer size to suit your needs
        byte[] tempBytes = new byte[50];
        int len;
        while ((len = pcmInputStream.read(tempBytes)) != -1) {
            // Push to the client over WebSocket
            AudioSocket.send2Client(devId, tempBytes, len);
        }
    } catch (UnsupportedAudioFileException | IOException e) {
        e.printStackTrace();
    }
}
```
The pcmFormat definition is as follows:
```java
// PCM_SIGNED 8000.0 Hz, 16 bit, mono, 2 bytes/frame, little-endian
pcmFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
        8000f, 16, 1, 2, 8000f, false);
```
Now look back at how the Android client handles incoming data:
```kotlin
private fun initWebSocket() {
    val uri = URI.create("ws://192.168.1.140:3014/websocket/16502")
    client = JWebSocketClient(uri) {
        val buffer = ByteArray(trackBufferSize)
        it?.let { byteBuffer ->
            val inputStream = ByteArrayInputStream(byteBuffer.array())
            while (inputStream.available() > 0) {
                val readCount = inputStream.read(buffer)
                if (readCount == -1) {
                    "no more data to read".showLog()
                    break
                }
                audioTrack.write(buffer, 0, readCount)
            }
        }
    }
}
```
The lambda passed to JWebSocketClient here corresponds to the WebSocket onMessage callback; the bytes it receives are written straight into audioTrack. AudioTrack can only play PCM data, but the backend has already transcoded the stream to PCM, so it plays directly.