SMI (SAMI) subtitles are widely used in South Korea.

For background on the SMI subtitle format, see http://en.wikipedia.org/wiki/SAMI

The basic format looks like this:

<SAMI>
<HEAD>
<TITLE>SAMI Example</TITLE>
<SAMIParam>
  Media {cheap44.wav}
  Metrics {time:ms;}
  Spec {MSFT:1.0;}
</SAMIParam>
<STYLE TYPE="text/css">
<!--
  P { font-family: Arial; font-weight: normal; color: white; background-color: black; text-align: center; }
  #Source {color: red; background-color: blue; font-family: Courier; font-size: 12pt; font-weight: normal; text-align: left; }
  .ENUSCC { name: English; lang: en-US ; SAMIType: CC ; }
  .FRFRCC { name: French; lang: fr-FR ; SAMIType: CC ; }
-->
</STYLE>
</HEAD>
<BODY>
<!-- Open play menu, choose Captions and Subtitles, On if available -->
<!-- Open tools menu, Security, Show local captions when present -->
<SYNC Start=0>
  <P Class=ENUSCC ID=Source>The Speaker</P>
  <P Class=ENUSCC>SAMI 0000 text</P>
  <P Class=FRFRCC ID=Source>French The Speaker</P>
  <P Class=FRFRCC>French SAMI 0000 text</P>
</SYNC>
<SYNC Start=1000>
  <P Class=ENUSCC>SAMI 1000 text</P>
  <P Class=FRFRCC>French SAMI 1000 text</P>
</SYNC>
<SYNC Start=2000>
  <P Class=ENUSCC>SAMI 2000 text</P>
  <P Class=FRFRCC>French SAMI 2000 text</P>
</SYNC>
<SYNC Start=3000>
  <P Class=ENUSCC>SAMI 3000 text</P>
  <P Class=FRFRCC>French SAMI 3000 text</P>
</SYNC>
</BODY>
</SAMI>

Because SAMI supports most of HTML, I parse the subtitle text directly as HTML. The key line is: smiTemp = smiTemp + Html.fromHtml(data).toString();

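A typical SMI subtitle file is around 200 KB to 300 KB, so parsing everything in it would take too long. I therefore parse only the timestamps and the subtitle text, and store them in two ArrayLists, one for the text and one for the start times: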

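// Start time of each subtitle, in milliseconds
ArrayList<Integer> timeMills = new ArrayList<Integer>();
// Subtitle text, one entry per start time
ArrayList<String> messages = new ArrayList<String>();
// Result holder: index 0 is timeMills, index 1 is messages
ArrayList<ArrayList> arrays = new ArrayList<ArrayList>();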
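
Once the two lists are filled, a player needs to find the subtitle that belongs to the current playback position. The following is only a sketch of one way to do that, not part of the original code. It assumes that timeMills is sorted in ascending order and that entry i of messages belongs to entry i of timeMills. The class name SubtitleLookup and the method subtitleAt are hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SubtitleLookup {

    /**
     * Returns the subtitle active at positionMs, or "" before the first one.
     * Assumption: timeMills is sorted ascending and lines up index by index
     * with messages.
     */
    static String subtitleAt(List<Integer> timeMills, List<String> messages, int positionMs) {
        int idx = Collections.binarySearch(timeMills, positionMs);
        if (idx < 0) {
            // No exact hit: binarySearch returned -(insertionPoint) - 1.
            // The active subtitle is the one just before the insertion point.
            idx = -idx - 2;
        }
        if (idx < 0 || idx >= messages.size()) {
            return "";
        }
        return messages.get(idx);
    }

    public static void main(String[] args) {
        List<Integer> times = Arrays.asList(0, 1000, 2000, 3000);
        List<String> texts = Arrays.asList(
                "SAMI 0000 text", "SAMI 1000 text", "SAMI 2000 text", "SAMI 3000 text");
        // Playback position 1500 ms falls inside the subtitle that starts at 1000 ms.
        System.out.println(subtitleAt(times, texts, 1500));
    }
}
```

A binary search keeps each lookup at O(log n), which matters if the lookup runs on every playback tick.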

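Regular expressions are used as well. The complete SmiProcessor class is listed below. Even so, parsing a subtitle file of about 200 KB still takes around 15 s, and I have not found a better approach yet.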

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import android.text.Html;

public class SmiProcessor {
    // Start time of each subtitle, in milliseconds
    ArrayList<Integer> timeMills = new ArrayList<Integer>();
    // Subtitle text, one entry per start time
    ArrayList<String> messages = new ArrayList<String>();
    // Result holder: index 0 is timeMills, index 1 is messages
    ArrayList<ArrayList> arrays = new ArrayList<ArrayList>();

    String data = "";
    boolean started = false;
    String smiTemp = "";

    // Line that opens a subtitle block, e.g. <SYNC Start=1000><P Class=KRCC>
    Pattern patternstart = Pattern.compile("<SYNC Start=(.*?)><P Class=(.*?)>",
            Pattern.CASE_INSENSITIVE);
    Matcher matcher = null;
    // Strict form of the same line; the escapes must be \\d and \\w (not //d and //w)
    Pattern spaceP = Pattern.compile("<SYNC Start=\\d+><P Class=\\w+>",
            Pattern.CASE_INSENSITIVE);
    Matcher spacematcher = null;

    public ArrayList<ArrayList> process(InputStream inputStream) {
        System.out.println("SmiProcessor");
        try {
            // Korean SMI files are usually encoded as EUC-KR
            InputStreamReader inputReader = new InputStreamReader(inputStream, "EUC-KR");
            BufferedReader br = new BufferedReader(inputReader);
            while ((data = br.readLine()) != null) {
                matcher = patternstart.matcher(data);
                if (matcher.find()) {
                    // The first SYNC line ends the header; everything before it is skipped
                    started = true;
                    // A new block begins: store the text collected for the previous one.
                    // Compare with isEmpty(), not with != "", which compares references.
                    if (!smiTemp.isEmpty()) {
                        messages.add(smiTemp);
                        smiTemp = "";
                    }
                    // Start time of the new block
                    timeMills.add(Integer.parseInt(matcher.group(1).trim()));
                    // Seed the text with a space so that an empty block still
                    // produces an entry and both lists stay aligned
                    spacematcher = spaceP.matcher(data);
                    if (spacematcher.find()) {
                        smiTemp = smiTemp + " ";
                    }
                } else if (started) {
                    // Subtitle text: let Html.fromHtml strip the tags and decode entities
                    smiTemp = smiTemp + Html.fromHtml(data).toString();
                }
            }
            // Store the text of the last block, which no later SYNC line flushes
            if (!smiTemp.isEmpty()) {
                messages.add(smiTemp);
                smiTemp = "";
            }
            br.close();
            arrays.add(timeMills);
            arrays.add(messages);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return arrays;
    }
}
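
As a possible direction for the speed problem, here is a plain-Java sketch that avoids calling Html.fromHtml on every line and collects text in a StringBuilder. It is an assumption that these two points contribute to the 15 s; this was not measured on a device. The sketch is also not equivalent to Html.fromHtml: it only turns <br> into a line break, strips other tags with a simple regex, and replaces &nbsp;, so other HTML entities are left as they are. The names SmiSketch, Cue, parse and clean are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SmiSketch {

    // One subtitle: start time in milliseconds plus its text without tags.
    static final class Cue {
        final int startMs;
        final String text;

        Cue(int startMs, String text) {
            this.startMs = startMs;
            this.text = text;
        }
    }

    // <SYNC Start=1000>, with optional quotes around the number.
    private static final Pattern SYNC = Pattern.compile(
            "<SYNC\\s+Start\\s*=\\s*\"?(\\d+)\"?[^>]*>", Pattern.CASE_INSENSITIVE);
    private static final Pattern BR = Pattern.compile("<br\\s*/?>", Pattern.CASE_INSENSITIVE);
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // Simplified cleanup (assumption: <br>, plain tags and &nbsp; are all we need).
    static String clean(CharSequence raw) {
        String s = BR.matcher(raw).replaceAll("\n");
        s = TAG.matcher(s).replaceAll("");
        return s.replace("&nbsp;", " ").trim();
    }

    static List<Cue> parse(BufferedReader reader) {
        List<Cue> cues = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        int currentStart = -1; // -1 means: still inside the header
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                Matcher m = SYNC.matcher(line);
                int pos = 0;
                // A line may hold several SYNC tags, so loop over all of them.
                while (m.find()) {
                    text.append(line, pos, m.start());
                    if (currentStart >= 0) {
                        cues.add(new Cue(currentStart, clean(text)));
                    }
                    text.setLength(0);
                    currentStart = Integer.parseInt(m.group(1));
                    pos = m.end();
                }
                if (currentStart >= 0) {
                    text.append(line, pos, line.length()).append('\n');
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (currentStart >= 0) {
            cues.add(new Cue(currentStart, clean(text))); // last subtitle
        }
        return cues;
    }

    public static void main(String[] args) {
        String smi = "<SAMI>\n<BODY>\n"
                + "<SYNC Start=1000><P Class=KRCC>\nHello<br>world\n"
                + "<SYNC Start=2500><P Class=KRCC>&nbsp;\n"
                + "</BODY>\n</SAMI>\n";
        for (Cue cue : parse(new BufferedReader(new StringReader(smi)))) {
            System.out.println(cue.startMs + " -> [" + cue.text + "]");
        }
    }
}
```

For a real file, the StringReader would be replaced by an InputStreamReader with the "EUC-KR" charset, as in SmiProcessor above. Each subtitle keeps its start time and text together in one Cue, so the two cannot drift out of alignment.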
