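Following up on the previous post introducing WTD (the Android Watchdog), this post looks at how WTD behaves in a real deadlock and how it can be modified.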

I recently hit a case where Android stayed on the boot animation indefinitely. The trace file showed a deadlock; the relevant part is:

    "main" prio=5 tid=1 MONITOR
      | group="main" sCount=1 dsCount=0 obj=0x4c20f360 self=0x71e1ade0
      | sysTid=519 nice=-2 sched=0/0 cgrp=apps handle=1878216768
      | state=S schedstat=( 736667963 56924727 1529 ) utm=62 stm=11 core=0
      at com.android.server.am.ActivityManagerService.registerReceiver(ActivityManagerService.java:~13326)
      - waiting to lock <0x4c6b2630> (a com.android.server.am.ActivityManagerService) held by tid=27 (InputDispatcher)
      at android.app.ContextImpl.registerReceiverInternal(ContextImpl.java:1473)
      at android.app.ContextImpl.registerReceiver(ContextImpl.java:1441)
      at com.android.server.power.PowerManagerService.systemReady(PowerManagerService.java:494)
      at com.android.server.ServerThread.initAndLoop(SystemServer.java:1050)
      at com.android.server.SystemServer.main(SystemServer.java:1371)
      at java.lang.reflect.Method.invokeNative(Native Method)
      at java.lang.reflect.Method.invoke(Method.java:515)
      at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:794)
      at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:610)
      at dalvik.system.NativeStart.main(Native Method)

    "InputDispatcher" prio=10 tid=27 MONITOR
      | group="main" sCount=1 dsCount=0 obj=0x4c9c7d60 self=0x72010e50
      | sysTid=554 nice=-8 sched=0/0 cgrp=apps handle=1912287104
      | state=S schedstat=( 1007065539 96683590 71214 ) utm=22 stm=78 core=0
      at com.android.server.power.PowerManagerService.setScreenBrightnessOverrideFromWindowManagerInternal(PowerManagerService.java:~2206)
      - waiting to lock <0x4c6a8af0> (a java.lang.Object) held by tid=1 (main)
      at com.android.server.power.PowerManagerService.setScreenBrightnessOverrideFromWindowManager(PowerManagerService.java:2199)
      at com.android.server.wm.WindowManagerService.performLayoutAndPlaceSurfacesLockedInner(WindowManagerService.java:9818)
      at com.android.server.wm.WindowManagerService.performLayoutAndPlaceSurfacesLockedLoop(WindowManagerService.java:8566)
      at com.android.server.wm.WindowManagerService.performLayoutAndPlaceSurfacesLocked(WindowManagerService.java:8508)
      at com.android.server.wm.WindowManagerService.setNewConfiguration(WindowManagerService.java:3847)
      at com.android.server.am.ActivityManagerService.updateConfigurationLocked(ActivityManagerService.java:14490)
      at com.android.server.am.ActivityManagerService.updateConfiguration(ActivityManagerService.java:14375)
      at com.android.server.wm.WindowManagerService.sendNewConfiguration(WindowManagerService.java:6725)
      at com.android.server.wm.InputMonitor.notifyConfigurationChanged(InputMonitor.java:325)
      at com.android.server.input.InputManagerService.notifyConfigurationChanged(InputManagerService.java:1275)
      at dalvik.system.NativeStart.run(Native Method)

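The trace clearly shows that the main and InputDispatcher threads are deadlocked on each other: main holds the PMS (PowerManagerService) lock and waits for the AMS (ActivityManagerService) lock, while InputDispatcher holds the AMS lock and waits for the PMS lock. According to the previous post, both AMS and PMS are already registered with WTD for monitoring, so why did WTD not detect the deadlock?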
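The lock-order inversion seen in the trace can be reproduced in plain Java. The following is a minimal sketch, not framework code: the class name, the thread names and the two `Object` locks are hypothetical stand-ins for the AMS and PMS monitors, and detection uses the standard `ThreadMXBean.findDeadlockedThreads()` call rather than anything from WTD.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class LockOrderDeadlockDemo {
    /** Takes 'first', waits until the other thread holds its first lock, then takes 'second'. */
    static Thread lockInOrder(String name, Object first, Object second, CountDownLatch bothReady) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothReady.countDown();
                try {
                    bothReady.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Never reached: the other thread holds 'second' and wants 'first'.
                }
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit although both threads stay blocked
        t.start();
        return t;
    }

    static boolean contains(long[] ids, long id) {
        for (long candidate : ids) {
            if (candidate == id) return true;
        }
        return false;
    }

    /** Returns true once the JVM reports both threads as deadlocked. */
    static boolean reproduceDeadlock() {
        Object amsLock = new Object(); // stand-in for the ActivityManagerService monitor
        Object pmsLock = new Object(); // stand-in for the PowerManagerService lock object
        CountDownLatch bothReady = new CountDownLatch(2);
        // Same order as in the trace: main takes PMS then AMS, the dispatcher AMS then PMS.
        Thread mainLike = lockInOrder("main-like", pmsLock, amsLock, bothReady);
        Thread dispatcherLike = lockInOrder("dispatcher-like", amsLock, pmsLock, bothReady);

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] ids = mx.findDeadlockedThreads(); // null when nothing is deadlocked
            if (ids != null && contains(ids, mainLike.getId())
                    && contains(ids, dispatcherLike.getId())) {
                return true;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + reproduceDeadlock());
    }
}
```

The latch only makes the interleaving deterministic; on a device the same interleaving happens by chance, which is why the hang shows up only on some boots.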

Going back to the previous post, look at the startup flow of AMS and PMS and at the point where WTD is started:

    public void initAndLoop() {
        try {
            // Wait for installd to finished starting up so that it has a chance to
            // create critical directories such as /data/user with the appropriate
            // permissions.  We need this to complete before we initialize other services.
            Slog.i(TAG, "Waiting for installd to be ready.");
            installer = new Installer();
            installer.ping();

            Slog.i(TAG, "Power Manager");
            power = new PowerManagerService();
            ServiceManager.addService(Context.POWER_SERVICE, power);

            Slog.i(TAG, "Activity Manager");
            context = ActivityManagerService.main(factoryTest);
        } catch (RuntimeException e) {
            Slog.e("System", "******************************************");
            Slog.e("System", "************ Failure starting bootstrap service", e);
        }
        // ...
            // only initialize the power service after we have started the
            // lights service, content providers and the battery service.
            power.init(context, lights, ActivityManagerService.self(), battery,
                    BatteryStatsService.getService(),
                    ActivityManagerService.self().getAppOpsService(), display);

            Slog.i(TAG, "Init Watchdog");
            Watchdog.getInstance().init(context, battery, power, alarm,
                    ActivityManagerService.self());
            Watchdog.getInstance().addThread(wmHandler, "WindowManager thread");
        // ...
        try {
            power.systemReady(twilight, dreamy);  // <-- the main thread is blocked in here
        } catch (Throwable e) {
            reportWtf("making Power Manager Service ready", e);
        }
        // ...
        ActivityManagerService.self().systemReady(new Runnable() {
            public void run() {
                // ...
                Watchdog.getInstance().start();  // <-- WTD is only started here
                // ... (excerpt ends here)

SystemServer.java shows that the WTD thread is started only after many services have been registered. If a deadlock occurs while services are still being registered, WTD is not running yet and cannot detect it. That explains why the deadlock in the trace above went unnoticed. The next question is how to fix it. I can think of roughly three approaches:
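To make the dependency on the start time concrete, here is a simplified sketch of the kind of check a watchdog performs: each monitor briefly acquires its service lock, and a monitor that does not return within a timeout indicates a hang. This is an illustration under my own assumptions, not the Android `Watchdog` source; `MonitorCheckSketch`, `Monitor` and `checkOnce` are hypothetical names. The point is that a hang is reported only if a check is actually running at that moment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class MonitorCheckSketch {
    /** Hypothetical counterpart of a service's monitor callback. */
    interface Monitor { void monitor(); }

    private final List<Monitor> monitors = new ArrayList<>();

    // Monitors run on a separate daemon thread so a blocked monitor cannot block the caller.
    private final ExecutorService checker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "checker");
        t.setDaemon(true);
        return t;
    });

    void addMonitor(Monitor m) { monitors.add(m); }

    /** Returns true if every monitor returned within timeoutMs, false if one is stuck. */
    boolean checkOnce(long timeoutMs) {
        List<Monitor> snapshot = new ArrayList<>(monitors);
        Future<?> done = checker.submit(() -> snapshot.forEach(Monitor::monitor));
        try {
            done.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Checks a healthy service, then the same service after its lock has been taken for good. */
    static boolean demo() {
        Object serviceLock = new Object();
        MonitorCheckSketch watchdog = new MonitorCheckSketch();
        watchdog.addMonitor(() -> { synchronized (serviceLock) { /* lock is free */ } });
        boolean healthyBefore = watchdog.checkOnce(500);

        // Simulate a hung service: a daemon thread takes the lock and never releases it.
        CountDownLatch lockHeld = new CountDownLatch(1);
        Thread stuck = new Thread(() -> {
            synchronized (serviceLock) {
                lockHeld.countDown();
                try {
                    Thread.sleep(Long.MAX_VALUE);
                } catch (InterruptedException ignored) {
                    // exit quietly
                }
            }
        }, "stuck-service");
        stuck.setDaemon(true);
        stuck.start();
        try {
            lockHeld.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        boolean healthyAfter = watchdog.checkOnce(500);
        return healthyBefore && !healthyAfter;
    }

    public static void main(String[] args) {
        System.out.println("hang detected only by a running check: " + demo());
    }
}
```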

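Approach 1: Start WTD earlier, that is, start it right after it has been instantiated and initialised. When the deadlock described above occurs, WTD can then detect it and kill the hung system_server process so that it restarts.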

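Approach 2: Add a ReentrantLock to AMS and PMS and, based on where the trace shows the deadlock, guard the affected functions with it. While PMS systemReady holds the lock, setScreenBrightnessOverrideFromWindowManager does not try to acquire it, which avoids the deadlock.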

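Approach 3: Block InputManagerService.notifyConfigurationChanged while services are being registered. I find this less appropriate than approach 2. The deadlock occurred because a USB input device was attached to the system; USB devices are hot-pluggable, so the timing of their registration cannot be controlled, and that is what led to the deadlock above.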
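Approach 2 could look roughly like the sketch below. It assumes a non-blocking `ReentrantLock.tryLock()` on the brightness path; the class `PowerServiceSketch`, its fields and the "caller retries later" policy are hypothetical choices made for illustration and are not taken from PowerManagerService.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

class PowerServiceSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private int brightnessOverride = -1; // guarded by lock

    /** Applies the override if the lock is free; returns false (caller retries later) if it is busy. */
    boolean setScreenBrightnessOverride(int brightness) {
        if (!lock.tryLock()) {
            return false; // systemReady() holds the lock: do not block, so no cycle can form
        }
        try {
            brightnessOverride = brightness;
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Holds the lock while running boot-time work that may block on another service. */
    void systemReady(Runnable registerReceivers) {
        lock.lock();
        try {
            registerReceivers.run();
        } finally {
            lock.unlock();
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Calls the setter while another thread is inside systemReady(), then again afterwards. */
    static boolean demo() {
        PowerServiceSketch pms = new PowerServiceSketch();
        CountDownLatch insideSystemReady = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread boot = new Thread(() -> pms.systemReady(() -> {
            insideSystemReady.countDown();
            awaitQuietly(release); // simulates waiting on another service
        }), "boot");
        boot.start();
        awaitQuietly(insideSystemReady);

        boolean appliedWhileBusy = pms.setScreenBrightnessOverride(200); // returns without blocking
        release.countDown();
        try {
            boot.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        boolean appliedAfterwards = pms.setScreenBrightnessOverride(200);
        return !appliedWhileBusy && appliedAfterwards;
    }

    public static void main(String[] args) {
        System.out.println("non-blocking path works: " + demo());
    }
}
```

The trade-off is that a request arriving during systemReady is dropped unless the caller retries, so a real change would need a policy for deferred requests.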


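The rest of this post focuses on approach 1, starting WTD earlier. The patch below implements that idea. Judging from the WTD source, the first thing to consider is system stability: in particular, whether running WTD earlier affects how services started later use WTD, and whether WTD can safely access the resources it needs during this period.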

    --- a/frameworks/base/services/java/com/android/server/SystemServer.java
    +++ b/frameworks/base/services/java/com/android/server/SystemServer.java
    @@ -351,7 +351,9 @@ class ServerThread {
                 Watchdog.getInstance().init(context, battery, power, alarm,
                         ActivityManagerService.self());
                 Watchdog.getInstance().addThread(wmHandler, "WindowManager thread");
    -
    +               Watchdog.getInstance().start();
    +
                 Slog.i(TAG, "Input Manager");
    @@ -1165,8 +1167,8 @@ class ServerThread {
                     } catch (Throwable e) {
                         reportWtf("making Recognition Service ready", e);
                     }
    -                Watchdog.getInstance().start();
    -
    +                //Watchdog.getInstance().start();
                     // It is now okay to let the various system services start their
                     // third party code...

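Having analysed these concerns, I believe the problems can be avoided. On top of the patch above, Watchdog.java needs a few additional changes. I only describe them briefly here, since they are simple to implement.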

1. Remove the thread-state check in addMonitor and addThread; otherwise monitors can no longer be added to WTD once it has started.

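2. Once WTD is running, run() competes with addMonitor and addThread for the same lock, and run() has a long cycle time, so the cycle time of run() needs to be adjusted while the system is booting.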
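The two points above can be illustrated with a small stand-alone sketch. It is not the modified Watchdog.java: `EarlyWatchdogSketch` and its methods are hypothetical, a `CopyOnWriteArrayList` is my own substitute for removing the lock contention, and timeout handling is omitted for brevity. It only shows monitors being accepted after the thread has started and a check interval that can be kept short during boot and lengthened later.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

class EarlyWatchdogSketch extends Thread {
    // Copy-on-write list: adding a monitor never waits for a running check round.
    private final List<Runnable> monitors = new CopyOnWriteArrayList<>();
    private volatile long checkIntervalMs;

    EarlyWatchdogSketch(long bootIntervalMs) {
        super("watchdog-sketch");
        setDaemon(true);
        checkIntervalMs = bootIntervalMs;
    }

    /** Deliberately has no "already running" check, so services started later can still register. */
    void addMonitor(Runnable monitor) {
        monitors.add(monitor);
    }

    /** Takes effect from the next sleep, e.g. short during boot and longer afterwards. */
    void setCheckIntervalMs(long intervalMs) {
        checkIntervalMs = intervalMs;
    }

    @Override
    public void run() {
        while (!isInterrupted()) {
            for (Runnable monitor : monitors) { // iterates over a snapshot of the list
                monitor.run();
            }
            try {
                Thread.sleep(checkIntervalMs);
            } catch (InterruptedException e) {
                return;
            }
        }
    }

    /** Starts the watchdog first and registers a monitor afterwards. */
    static boolean demo() {
        EarlyWatchdogSketch watchdog = new EarlyWatchdogSketch(10);
        watchdog.start(); // running before any monitor exists
        AtomicInteger calls = new AtomicInteger();
        watchdog.addMonitor(calls::incrementAndGet); // late registration

        long deadline = System.currentTimeMillis() + 2000;
        while (calls.get() == 0 && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        watchdog.setCheckIntervalMs(30_000); // boot finished: relax the interval
        watchdog.interrupt();
        return calls.get() > 0;
    }

    public static void main(String[] args) {
        System.out.println("late monitor was checked: " + demo());
    }
}
```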

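After reworking the WTD startup sequence with these points in mind, the system boots and runs normally and WTD works as expected. I ran 1,000 reboot tests and have seen no adverse effects so far.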



