1. Content Preview

[Figure: startup of the SystemServer process (SystemServer进程的启动.png)]

2. Overview

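The first two articles in this process series are already out. This article (based on the Android O source) covers the first half of the story, how the SystemServer process is created. The second half will walk through the startup phases that follow creation and the core services SystemServer runs.
Android Process Series, Part 1: Process Basics
Android Process Series, Part 2: How the Zygote Process Is Created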

A brief recap of the key points from the previous article:

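  • The Zygote process is essentially a client/server design: Zygote is the server and handles process-creation requests that clients send over a socket;
  • We summarized the socket communication framework: the init process adds the socket's fd, and the Zygote process gets that fd and creates a LocalServerSocket from it;
  • We summarized why Zygote serves as the parent of all application processes;
  • We summarized how Zygote preloads resources, and why Zygote cannot load process resources on a child thread.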

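This article focuses on the creation of the SystemServer process. SystemServer is Zygote's eldest disciple, the first process Zygote forks. Together, Zygote and SystemServer hold up half the sky of the Java world: if either of them dies, the Java world collapses. Most of the freeze-and-reboot problems we see also originate in the SystemServer process. SystemServer runs dozens of core services. To keep application processes from damaging the system, applications have no permission to access system resources directly and must go through SystemServer as a proxy. All of this shows how important the SystemServer process is.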

3. How SystemServer Is Created

[Figure: sequence diagram of SystemServer process creation (SystemServer进程的创建.png)]
3.1 The main method of ZygoteInit

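The figure above is the sequence diagram for SystemServer creation. We again start from the main method of ZygoteInit, so here is the same "template" code once more.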

  frameworks/base/core/java/com/android/internal/os/ZygoteInit.java

    public static void main(String argv[]) {
        // 1. Create the ZygoteServer
        ZygoteServer zygoteServer = new ZygoteServer();
        try {
            // 2. Create the server-side socket
            zygoteServer.registerServerSocket(socketName);
            // 3. Preload classes and resources for the process
            preload(bootTimingsTraceLog);
            if (startSystemServer) {
                // 4. Start the SystemServer process: the first division of the zygote
                startSystemServer(abiList, socketName, zygoteServer);
            }
            // 5. Enter an endless loop that listens for messages from clients
            zygoteServer.runSelectLoop(abiList);
            // 6. Close the Zygote server socket
            zygoteServer.closeServerSocket();
        } catch (Zygote.MethodAndArgsCaller caller) {
            // 7. Catch this exception and call MethodAndArgsCaller's run method
            caller.run();
        } catch (Throwable ex) {
            Log.e(TAG, "System zygote died with exception", ex);
            zygoteServer.closeServerSocket();
            throw ex;
        }
    }

The main method of ZygoteInit has 7 key points. Points 1, 2 and 3 were covered in the previous article, so the analysis here starts from point 4.

    /**
     * Prepare the arguments and fork for the system server process.
     */
    private static boolean startSystemServer(String abiList, String socketName, ZygoteServer zygoteServer)
            throws Zygote.MethodAndArgsCaller, RuntimeException {
        ......
        /* Hardcoded command line to start the system server */
        String args[] = {
            "--setuid=1000",
            "--setgid=1000",
            "--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1023,1032,3001,3002,3003,3006,3007,3009,3010",
            "--capabilities=" + capabilities + "," + capabilities,
            "--nice-name=system_server",
            "--runtime-args",
            "com.android.server.SystemServer",
        };
        ZygoteConnection.Arguments parsedArgs = null;

        int pid;

        try {
            parsedArgs = new ZygoteConnection.Arguments(args);
            ZygoteConnection.applyDebuggerSystemProperty(parsedArgs);
            ZygoteConnection.applyInvokeWithSystemProperty(parsedArgs);

            // Create the system process; fork is called underneath, see 3.2
            pid = Zygote.forkSystemServer(
                    parsedArgs.uid, parsedArgs.gid,
                    parsedArgs.gids,
                    parsedArgs.debugFlags,
                    null,
                    parsedArgs.permittedCapabilities,
                    parsedArgs.effectiveCapabilities);
        } catch (IllegalArgumentException ex) {
            throw new RuntimeException(ex);
        }

        // fork returns twice; pid == 0 means we are running in the new child process
        if (pid == 0) {
            // If the device has a second Zygote (e.g. for 32-bit apps), wait until it can be reached
            if (hasSecondZygote(abiList)) {
                waitForSecondaryZygote(socketName);
            }
            // Close the socket inherited from the Zygote process
            zygoteServer.closeServerSocket();
            // Handle the rest of SystemServer's startup, see 3.4
            handleSystemServerProcess(parsedArgs);
        }

        return true;
    }
  • 1. Convert the args array into a ZygoteConnection.Arguments object, which really just assigns values to the member variables of ZygoteConnection.Arguments. So what do these arguments mean?
        String args[] = {
            "--setuid=1000",
            "--setgid=1000",
            "--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1023,1032,3001,3002,3003,3006,3007,3009,3010",
            "--capabilities=" + capabilities + "," + capabilities,
            "--nice-name=system_server",
            "--runtime-args",
            "com.android.server.SystemServer",
        };

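SystemServer's uid and gid are both set to 1000; setgroups specifies the supplementary groups the process belongs to; capabilities sets the process's capabilities; nice-name is the process name; and the class to run is com.android.server.SystemServer.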

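  • 2. Call forkSystemServer to fork the system process. Underneath, this still calls the C-level fork function (which relies on copy-on-write). fork returns twice: the branch that sees pid == 0 is running inside the newly forked system process, while the parent (Zygote) receives the child's pid.
  • 3. When Zygote forks a new process, the child inherits copies of Zygote's open file descriptors, including the server socket created in Zygote. The new process has no use for that socket, so it calls zygoteServer.closeServerSocket() to close it.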
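To make item 1 concrete, here is a minimal sketch of how a hard-coded argument array like the one above could be turned into fields. ForkArgsSketch and its fields are hypothetical names for this illustration only; the real ZygoteConnection.Arguments handles many more options (capabilities, invoke-with, and so on), which this sketch simply skips.

```java
import java.util.Arrays;

// Illustrative stand-in for ZygoteConnection.Arguments (names are assumptions).
class ForkArgsSketch {
    int uid = -1;
    int gid = -1;
    int[] gids = new int[0];
    String niceName;
    String[] remainingArgs = new String[0];

    static ForkArgsSketch parse(String[] args) {
        ForkArgsSketch parsed = new ForkArgsSketch();
        int i = 0;
        for (; i < args.length; i++) {
            String arg = args[i];
            if (arg.startsWith("--setuid=")) {
                parsed.uid = Integer.parseInt(arg.substring(arg.indexOf('=') + 1));
            } else if (arg.startsWith("--setgid=")) {
                parsed.gid = Integer.parseInt(arg.substring(arg.indexOf('=') + 1));
            } else if (arg.startsWith("--setgroups=")) {
                String[] parts = arg.substring(arg.indexOf('=') + 1).split(",");
                parsed.gids = new int[parts.length];
                for (int j = 0; j < parts.length; j++) {
                    parsed.gids[j] = Integer.parseInt(parts[j]);
                }
            } else if (arg.startsWith("--nice-name=")) {
                parsed.niceName = arg.substring(arg.indexOf('=') + 1);
            } else if (arg.startsWith("--")) {
                // Options this sketch does not model (e.g. --capabilities, --runtime-args).
            } else {
                break; // First non-option argument: the class to run.
            }
        }
        // Everything from the class name onward is passed along untouched.
        parsed.remainingArgs = Arrays.copyOfRange(args, i, args.length);
        return parsed;
    }

    public static void main(String[] ignored) {
        ForkArgsSketch p = parse(new String[] {
            "--setuid=1000", "--setgid=1000", "--setgroups=1001,1002,1003",
            "--nice-name=system_server", "--runtime-args",
            "com.android.server.SystemServer",
        });
        System.out.println(p.uid + " " + p.gid + " " + p.niceName + " " + p.remainingArgs[0]);
    }
}
```

The point of the sketch is only the shape of the data: uid, gid, gids and niceName feed the fork, and the leftover arguments (the class name) are what later reach zygoteInit as remainingArgs.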
3.2 Zygote's forkSystemServer method
/frameworks/base/core/java/com/android/internal/os/Zygote.java

    public static int forkSystemServer(int uid, int gid, int[] gids, int debugFlags,
            int[][] rlimits, long permittedCapabilities, long effectiveCapabilities) {
        VM_HOOKS.preFork();
        // Resets nice priority for zygote process.
        resetNicePriority();
        int pid = nativeForkSystemServer(
                uid, gid, gids, debugFlags, rlimits, permittedCapabilities, effectiveCapabilities);
        // Enable tracing as soon as we enter the system_server.
        if (pid == 0) {
            Trace.setTracingEnabled(true);
        }
        VM_HOOKS.postForkCommon();
        return pid;
    }

nativeForkSystemServer is a JNI method. It is registered from AndroidRuntime.cpp, which calls register_com_android_internal_os_Zygote() in com_android_internal_os_Zygote.cpp to set up the mapping between the Java declarations and the native functions.

/frameworks/base/core/jni/com_android_internal_os_Zygote.cpp

static jint com_android_internal_os_Zygote_nativeForkSystemServer(
        JNIEnv* env, jclass, uid_t uid, gid_t gid, jintArray gids,
        jint debug_flags, jobjectArray rlimits, jlong permittedCapabilities,
        jlong effectiveCapabilities) {
  pid_t pid = ForkAndSpecializeCommon(env, uid, gid, gids,
                                      debug_flags, rlimits,
                                      permittedCapabilities, effectiveCapabilities,
                                      MOUNT_EXTERNAL_DEFAULT, NULL, NULL, true, NULL,
                                      NULL, NULL, NULL);
  if (pid > 0) {
      // The zygote process checks whether the child process has died or not.
      ALOGI("System server process %d has been created", pid);
      gSystemServerPid = pid;
      // There is a slight window that the system server process has crashed
      // but it went unnoticed because we haven't published its pid yet. So
      // we recheck here just to make sure that all is well.
      int status;
      if (waitpid(pid, &status, WNOHANG) == pid) {
          ALOGE("System server process %d has died. Restarting Zygote!", pid);
          RuntimeAbort(env, __LINE__, "System server process has died. Restarting Zygote!");
      }
  }
  return pid;
}

The waitpid function deserves some explanation here:

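  • If the child being waited for has already stopped or exited when waitpid() is called, waitpid() returns immediately; if it has not, the calling parent process blocks by default until it does.

  • The status argument receives the child's status information, which lets the parent find out why the child exited: normally, or because of some error. The status is written only if status is not a null pointer.

  • The third argument of waitpid() takes options, two of which are commonly used. With WNOHANG, if the child specified by pid has not exited, waitpid() returns 0 immediately instead of blocking; if it has exited, waitpid() returns that child's pid. With WUNTRACED, waitpid() also returns as soon as the child enters the stopped state.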

So when waitpid(pid, &status, WNOHANG) == pid holds, the SystemServer process has died and Zygote needs to be restarted. Next, look at the ForkAndSpecializeCommon function.

// Utility routine to fork zygote and specialize the child process.
static pid_t ForkAndSpecializeCommon(JNIEnv* env, uid_t uid, gid_t gid, jintArray javaGids,
                                     jint debug_flags, jobjectArray javaRlimits,
                                     jlong permittedCapabilities, jlong effectiveCapabilities,
                                     jint mount_external,
                                     jstring java_se_info, jstring java_se_name,
                                     bool is_system_server, jintArray fdsToClose,
                                     jintArray fdsToIgnore,
                                     jstring instructionSet, jstring dataDir) {
  // Install the handler for child-process signals (SIGCHLD), see 3.3
  SetSigChldHandler();
  ......
  // Fork the child process
  pid_t pid = fork();

  if (pid == 0) {
    // The child process.
    ......
    if (!is_system_server) {
        int rc = createProcessGroup(uid, getpid());
        if (rc != 0) {
            if (rc == -EROFS) {
                ALOGW("createProcessGroup failed, kernel missing CONFIG_CGROUP_CPUACCT?");
            } else {
                ALOGE("createProcessGroup(%d, %d) failed: %s", uid, pid, strerror(-rc));
            }
        }
    }

    SetGids(env, javaGids);  // set the supplementary groups

    SetRLimits(env, javaRlimits);  // set the resource limits

    int rc = setresgid(gid, gid, gid);
    if (rc == -1) {
      ALOGE("setresgid(%d) failed: %s", gid, strerror(errno));
      RuntimeAbort(env, __LINE__, "setresgid failed");
    }

    rc = setresuid(uid, uid, uid);  // set the uid
    ......

    SetCapabilities(env, permittedCapabilities, effectiveCapabilities, permittedCapabilities);

    SetSchedulerPolicy(env);  // set the scheduling policy
    ......
    // Set the SELinux context
    rc = selinux_android_setcontext(uid, is_system_server, se_info_c_str, se_name_c_str);
    ......
  } else if (pid > 0) {
    ......
    }
  }
  return pid;
}
}  // anonymous namespace

Note that SetSigChldHandler is called before fork. It installs SigChldHandler as the signal handler, so when a SIGCHLD signal arrives, execution enters the handler shown in 3.3.

3.3 SystemServer and Zygote live and die together

// Configures the SIGCHLD handler for the zygote process. This is configured
// very late, because earlier in the runtime we may fork() and exec()
// other processes, and we want to waitpid() for those rather than
// have them be harvested immediately.
//
// This ends up being called repeatedly before each fork(), but there's
// no real harm in that.
static void SetSigChldHandler() {
  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = SigChldHandler;

  int err = sigaction(SIGCHLD, &sa, NULL);
  if (err < 0) {
    ALOGW("Error setting SIGCHLD handler: %s", strerror(errno));
  }
}

// This signal handler is for zygote mode, since the zygote must reap its children
static void SigChldHandler(int /*signal_number*/) {
  pid_t pid;
  int status;

  // It's necessary to save and restore the errno during this function.
  // Since errno is stored per thread, changing it here modifies the errno
  // on the thread on which this signal handler executes. If a signal occurs
  // between a call and an errno check, it's possible to get the errno set
  // here.
  // See b/23572286 for extra information.
  int saved_errno = errno;

  while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
     // Log process-death status that we care about.  In general it is
     // not safe to call LOG(...) from a signal handler because of
     // possible reentrancy.  However, we know a priori that the
     // current implementation of LOG() is safe to call from a SIGCHLD
     // handler in the zygote process.  If the LOG() implementation
     // changes its locking strategy or its use of syscalls within the
     // lazy-init critical section, its use here may become unsafe.
    if (WIFEXITED(status)) {
      if (WEXITSTATUS(status)) {
        ALOGI("Process %d exited cleanly (%d)", pid, WEXITSTATUS(status));
      }
    } else if (WIFSIGNALED(status)) {
      if (WTERMSIG(status) != SIGKILL) {
        ALOGI("Process %d exited due to signal (%d)", pid, WTERMSIG(status));
      }
      if (WCOREDUMP(status)) {
        ALOGI("Process %d dumped core.", pid);
      }
    }

    // If the just-crashed process is the system_server, bring down zygote
    // so that it is restarted by init and system server will be restarted
    // from there.
    if (pid == gSystemServerPid) {
      ALOGE("Exit zygote because system server (%d) has terminated", pid);
      kill(getpid(), SIGKILL);
    }
  }

  // Note that we shouldn't consider ECHILD an error because
  // the secondary zygote might have no children left to wait for.
  if (pid < 0 && errno != ECHILD) {
    ALOGW("Zygote SIGCHLD error in waitpid: %s", strerror(errno));
  }

  errno = saved_errno;
}

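The system_server process is zygote's eldest disciple, the first process zygote forks. Together, zygote and system_server hold up half the sky of the Java world, and the death of either one brings the Java world down. So if the child process SystemServer dies, Zygote kills itself, and init then restarts Zygote. In other words, Zygote and SystemServer live and die together.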
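The decision logic inside SigChldHandler can be sketched in plain Java. This is only an illustration: WaitStatusSketch is a hypothetical class, and the bit layout it decodes (low 7 bits for the terminating signal, next byte for the exit code) is the conventional Linux encoding of a wait status, assumed here rather than taken from the article's source.

```java
// Hypothetical helper that mimics the WIFEXITED/WEXITSTATUS/WIFSIGNALED/WTERMSIG
// macros under the usual Linux wait-status encoding (an assumption of this sketch).
class WaitStatusSketch {
    static final int SIGKILL = 9;

    static boolean exited(int status) {
        return (status & 0x7f) == 0;
    }

    static int exitStatus(int status) {
        return (status >> 8) & 0xff;
    }

    static boolean signaled(int status) {
        int sig = status & 0x7f;
        return sig != 0 && sig != 0x7f; // 0x7f marks a stopped child, not a killed one
    }

    static int termSignal(int status) {
        return status & 0x7f;
    }

    // Mirrors the handler's logging: returns the message it would log, or null.
    static String describe(int pid, int status) {
        if (exited(status)) {
            return exitStatus(status) != 0
                    ? "Process " + pid + " exited cleanly (" + exitStatus(status) + ")"
                    : null;
        }
        if (signaled(status) && termSignal(status) != SIGKILL) {
            return "Process " + pid + " exited due to signal (" + termSignal(status) + ")";
        }
        return null;
    }

    // The "live and die together" rule: zygote goes down only for system_server.
    static boolean zygoteMustExit(int reapedPid, int systemServerPid) {
        return reapedPid == systemServerPid;
    }

    public static void main(String[] args) {
        System.out.println(describe(2999, 11));       // child killed by signal 11
        System.out.println(zygoteMustExit(2999, 2999));
    }
}
```

Whatever the reason for the child's death, the last check is the one that matters for this section: any reaped pid equal to gSystemServerPid takes Zygote down with it.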

3.4 handleSystemServerProcess handles the newly forked process
/frameworks/base/core/java/com/android/internal/os/ZygoteInit.java

    /**
     * Finish remaining work for the newly forked system server process.
     */
    private static void handleSystemServerProcess(
            ZygoteConnection.Arguments parsedArgs)
            throws Zygote.MethodAndArgsCaller {

        // set umask to 0077 so new files and directories will default to owner-only permissions.
        Os.umask(S_IRWXG | S_IRWXO);
        // Set the name of the new process
        if (parsedArgs.niceName != null) {
            Process.setArgV0(parsedArgs.niceName);
        }
        // Get systemServerClasspath
        final String systemServerClasspath = Os.getenv("SYSTEMSERVERCLASSPATH");
        if (systemServerClasspath != null) {
            // Optimize the dex files under systemServerClasspath, see the further reading below
            performSystemServerDexOpt(systemServerClasspath);
            // Capturing profiles is only supported for debug or eng builds since selinux normally
            // prevents it.
            boolean profileSystemServer = SystemProperties.getBoolean(
                    "dalvik.vm.profilesystemserver", false);
            if (profileSystemServer && (Build.IS_USERDEBUG || Build.IS_ENG)) {
                try {
                    File profileDir = Environment.getDataProfilesDePackageDirectory(
                            Process.SYSTEM_UID, "system_server");
                    File profile = new File(profileDir, "primary.prof");
                    profile.getParentFile().mkdirs();
                    profile.createNewFile();
                    String[] codePaths = systemServerClasspath.split(":");
                    VMRuntime.registerAppInfo(profile.getPath(), codePaths);
                } catch (Exception e) {
                    Log.wtf(TAG, "Failed to set up system server profile", e);
                }
            }
        }
        // invokeWith is null here, so the else branch is taken
        if (parsedArgs.invokeWith != null) {
            String[] args = parsedArgs.remainingArgs;
            // If we have a non-null system server class path, we'll have to duplicate the
            // existing arguments and append the classpath to it. ART will handle the classpath
            // correctly when we exec a new process.
            if (systemServerClasspath != null) {
                String[] amendedArgs = new String[args.length + 2];
                amendedArgs[0] = "-cp";
                amendedArgs[1] = systemServerClasspath;
                System.arraycopy(args, 0, amendedArgs, 2, args.length);
                args = amendedArgs;
            }

            WrapperInit.execApplication(parsedArgs.invokeWith,
                    parsedArgs.niceName, parsedArgs.targetSdkVersion,
                    VMRuntime.getCurrentInstructionSet(), null, args);
        } else {
            ClassLoader cl = null;
            if (systemServerClasspath != null) {
                cl = createPathClassLoader(systemServerClasspath, parsedArgs.targetSdkVersion);

                Thread.currentThread().setContextClassLoader(cl);
            }

            /*
             * Pass the remaining arguments to SystemServer. See 3.5
             */
            ZygoteInit.zygoteInit(parsedArgs.targetSdkVersion, parsedArgs.remainingArgs, cl);
        }

        /* should never reach here */
    }

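Further reading:
In Android, an app's code traditionally lives in a single Dex file. Dex is an archive-like file, similar to a Jar, that stores all of the app's compiled Java bytecode. Because Android uses the Dalvik virtual machine, the class files produced by the Java compiler have to be converted into a form that Dalvik can execute. The point to stress is that a Dex file, like a Jar, is a container whose contents are still the bytecode corresponding to the Java code. When Android launches an app, one of the steps is to optimize the Dex; a dedicated tool called DexOpt handles this. DexOpt runs the first time the Dex file is loaded and produces an ODEX file, that is, an Optimized Dex. Executing the ODEX is considerably more efficient than executing the Dex file directly. Early versions of Android had a problem here, however: DexOpt collects the method ids of every class into a linked-list structure whose length is stored in a short, so the number of method ids cannot exceed 65536. For a large enough project that limit is clearly too low. Newer versions of Android have fixed this in DexOpt, but older systems still have to be supported.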

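Android provides dexopt, a tool dedicated to verifying and optimizing dex files; its source lives under dalvik/dexopt in the Android source tree. The classpath contains the following: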

systemServerClasspath = /system/framework/services.jar:/system/framework/ethernet-service.jar:/system/framework/wifi-service.jar

These three jars are then extracted from the path, and each one is checked to see whether it needs dexopt. If it does, installer is called to optimize it.
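As a small illustration of the first step, the colon-separated value can be split into individual jar entries the same way the source splits it with split(":"). ClasspathSketch is a hypothetical helper; the decision about whether each jar needs dexopt is not modeled here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: breaks a SYSTEMSERVERCLASSPATH-style value into jar names.
class ClasspathSketch {
    static List<String> jarNames(String classpath) {
        List<String> names = new ArrayList<>();
        for (String path : classpath.split(":")) {
            if (path.isEmpty()) {
                continue; // tolerate stray separators
            }
            // Keep only the file name after the last '/'.
            names.add(path.substring(path.lastIndexOf('/') + 1));
        }
        return names;
    }

    public static void main(String[] args) {
        String cp = "/system/framework/services.jar:"
                + "/system/framework/ethernet-service.jar:"
                + "/system/framework/wifi-service.jar";
        System.out.println(jarNames(cp));
    }
}
```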

3.5 The zygoteInit method
/frameworks/base/core/java/com/android/internal/os/ZygoteInit.java

    /**
     * The main function called when started through the zygote process. This
     * could be unified with main(), if the native code in nativeFinishInit()
     * were rationalized with Zygote startup.
     *
     * Current recognized args:
     *   [--] <start class name> <args>
     *
     * @param targetSdkVersion target SDK version
     * @param argv arg strings
     */
    public static final void zygoteInit(int targetSdkVersion, String[] argv,
            ClassLoader classLoader) throws Zygote.MethodAndArgsCaller {
        if (RuntimeInit.DEBUG) {
            Slog.d(RuntimeInit.TAG, "RuntimeInit: Starting application from zygote");
        }

        Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "ZygoteInit");
        // See 3.5.1
        RuntimeInit.redirectLogStreams();
        // See 3.5.2
        RuntimeInit.commonInit();
        // See 3.5.3
        ZygoteInit.nativeZygoteInit();
        // See 3.5.4
        RuntimeInit.applicationInit(targetSdkVersion, argv, classLoader);
    }
3.5.1 RuntimeInit's redirectLogStreams method
/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java

    /**
     * Redirect System.out and System.err to the Android log.
     */
    public static void redirectLogStreams() {
        System.out.close();
        System.setOut(new AndroidPrintStream(Log.INFO, "System.out"));
        System.err.close();
        System.setErr(new AndroidPrintStream(Log.WARN, "System.err"));
    }

This initializes the Android log output streams: System.out and System.err are closed, and both are redirected to the Android log.
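The same idea can be tried on a desktop JVM. The sketch below is an assumption-laden stand-in: RedirectSketch and its in-memory LOG list play the role of AndroidPrintStream and the Android log, and, unlike the framework code, it restores the original streams afterwards so the demo has no lasting effect.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for AndroidPrintStream: collects tagged lines in memory.
class RedirectSketch {
    static final List<String> LOG = new ArrayList<>();

    // Builds a stream that forwards each completed line to LOG with a level and tag.
    static PrintStream taggedStream(String level, String tag) {
        OutputStream sink = new OutputStream() {
            private final StringBuilder line = new StringBuilder();

            @Override
            public void write(int b) {
                if (b == '\n') {
                    LOG.add(level + "/" + tag + ": " + line);
                    line.setLength(0);
                } else if (b != '\r') {
                    line.append((char) b);
                }
            }
        };
        return new PrintStream(sink, true);
    }

    static List<String> demo() {
        PrintStream oldOut = System.out;
        PrintStream oldErr = System.err;
        LOG.clear();
        System.setOut(taggedStream("I", "System.out"));
        System.setErr(taggedStream("W", "System.err"));
        try {
            System.out.println("hello");
            System.err.println("oops");
        } finally {
            // Restore the real streams; the framework code does not need this step.
            System.setOut(oldOut);
            System.setErr(oldErr);
        }
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```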

3.5.2 RuntimeInit's commonInit method
/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java

    protected static final void commonInit() {
        if (DEBUG) Slog.d(TAG, "Entered RuntimeInit!");

        /*
         * set handlers; these apply to all threads in the VM. Apps can replace
         * the default handler, but not the pre handler.
         */
        // Set how the process handles uncaught exceptions; LoggingHandler is installed
        // by default and prints the stack trace of the failure. See 3.5.2.1
        Thread.setUncaughtExceptionPreHandler(new LoggingHandler());
        // Enter the crash-handling flow and ask AMS to show a dialog. See 3.5.2.2
        Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler());

        /*
         * Install a TimezoneGetter subclass for ZoneInfo.db (sets the time zone)
         */
        TimezoneGetter.setInstance(new TimezoneGetter() {
            @Override
            public String getId() {
                return SystemProperties.get("persist.sys.timezone");
            }
        });
        TimeZone.setDefault(null);

        /*
         * Sets handler for java.util.logging to use Android log facilities.
         * The odd "new instance-and-then-throw-away" is a mirror of how
         * the "java.util.logging.config.class" system property works. We
         * can't use the system property here since the logger has almost
         * certainly already been initialized.
         */
        LogManager.getLogManager().reset();
        new AndroidConfig();

        /*
         * Sets the default HTTP User-Agent used by HttpURLConnection.
         */
        String userAgent = getDefaultUserAgent();
        System.setProperty("http.agent", userAgent);

        /*
         * Wire socket tagging to traffic stats.
         */
        NetworkManagementSocketTagger.install();

        /*
         * If we're running in an emulator launched with "-trace", put the
         * VM into emulator trace profiling mode so that the user can hit
         * F9/F10 at any time to capture traces.  This has performance
         * consequences, so it's not something you want to do always.
         */
        String trace = SystemProperties.get("ro.kernel.android.tracing");
        if (trace.equals("1")) {
            Slog.i(TAG, "NOTE: emulator trace profiling enabled");
            Debug.enableEmulatorTraceOutput();
        }

        initialized = true;
    }
3.5.2.1 How the process's crash stack is captured
/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java

    /**
     * Logs a message when a thread encounters an uncaught exception. By
     * default, {@link KillApplicationHandler} will terminate this process later,
     * but apps can override that behavior.
     */
    private static class LoggingHandler implements Thread.UncaughtExceptionHandler {
        @Override
        public void uncaughtException(Thread t, Throwable e) {
            // Don't re-enter if KillApplicationHandler has already run
            if (mCrashing) return;
            if (mApplicationObject == null) {
                // The "FATAL EXCEPTION" string is still used on Android even though
                // apps can set a custom UncaughtExceptionHandler that renders uncaught
                // exceptions non-fatal.
                Clog_e(TAG, "*** FATAL EXCEPTION IN SYSTEM PROCESS: " + t.getName(), e);
            } else {
                StringBuilder message = new StringBuilder();
                // The "FATAL EXCEPTION" string is still used on Android even though
                // apps can set a custom UncaughtExceptionHandler that renders uncaught
                // exceptions non-fatal.
                message.append("FATAL EXCEPTION: ").append(t.getName()).append("\n");
                final String processName = ActivityThread.currentProcessName();
                if (processName != null) {
                    message.append("Process: ").append(processName).append(", ");
                }
                message.append("PID: ").append(Process.myPid());
                Clog_e(TAG, message.toString(), e);
            }
        }
    }

Java crashes in applications start with FATAL EXCEPTION, for example:

01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: FATAL EXCEPTION: main
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: Process: com.xiaomi.scanner, PID: 17635
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: java.lang.IllegalArgumentException: [email protected][] not attached to window manager
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at android.view.WindowManagerGlobal.findViewLocked(WindowManagerGlobal.java:491)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at android.view.WindowManagerGlobal.removeView(WindowManagerGlobal.java:400)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at android.view.WindowManagerImpl.removeViewImmediate(WindowManagerImpl.java:125)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at android.app.Dialog.dismissDialog(Dialog.java:374)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at android.app.Dialog.dismiss(Dialog.java:357)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.ui.SearchResultActivity.b(Unknown Source:14)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.a.d.c(Unknown Source:39)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.a.d.a(Unknown Source:53)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.ui.SearchResultActivity.b(Unknown Source:30)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.ui.SearchResultActivity.b(Unknown Source:0)
01-16 11:48:50.525 10026 17635 17635 E AndroidRuntime: at com.alibaba.imagesearch.ui.SearchResultActivity$6.onJsPrompt(Unknown

Java crashes in the system process start with *** FATAL EXCEPTION IN SYSTEM PROCESS, for example:

logcat.log.01:2211: 08-27 16:41:16.664  2999  3026 E AndroidRuntime: *** FATAL EXCEPTION IN SYSTEM PROCESS: android.bg
logcat.log.01:2212: 08-27 16:41:16.664  2999  3026 E AndroidRuntime: java.lang.NullPointerException: Attempt to get length of null array
logcat.log.01:2213: 08-27 16:41:16.664  2999  3026 E AndroidRuntime:     at com.android.server.net.NetworkPolicyManagerService.isUidIdle(NetworkPolicyManagerService.java:2318)
logcat.log.01:2214: 08-27 16:41:16.664  2999  3026 E AndroidRuntime:     at com.android.server.net.NetworkPolicyManagerService.updateRuleForAppIdleLocked(NetworkPolicyManagerService.java:2244)
logcat.log.01:2215: 08-27 16:41:16.664  2999  3026 E AndroidRuntime:     at com.android.server.net.NetworkPolicyManagerService.updateRulesForTempWhitelistChangeLocked(NetworkPolicyManagerService.java:2298)
logcat.log.01:2216: 08-27 16:41:16.664  2999  3026 E AndroidRuntime:     at com.android.server.net.NetworkPolicyManagerService$3.run(NetworkPolicyManagerService.java:572)
logcat.log.01:2217: 08-27 16:41:16.664  2999  3026 E AndroidRuntime:     at android.os.Handler.handleCallback(Handler.java:739)
logcat.log.01:2218: 08-27 16:41:16.664  2999  3026 E AndroidRuntime:     at android.os.Handler.dispatchMessage(Handler.java:95)
logcat.log.01:2219: 08-27 16:41:16.664  2999  3026 E AndroidRuntime:     at android.os.Looper.loop(Looper.java:148)
logcat.log.01:2220: 08-27 16:41:16.664  2999  3026 E AndroidRuntime:     at android.os.HandlerThread.run(HandlerThread.java:61)
logcat.log.01:2221: 08-27 16:41:16.665  2999  3026 I am_crash: [2999,0,system_server,-1,java.lang.NullPointerException,Attempt to get length of null array,NetworkPolicyManagerService.java,2318]
logcat.log.01:2224: 08-27 16:41:16.696  2999  3026 I MitvActivityManagerService: handleApplicationCrash, processName: system_server
logcat.log.01:2225: 08-27 16:41:16.696  2999  3026 I Process : Sending signal. PID: 2999 SIG: 9
3.5.2.2 When a Java exception (JE) occurs, a dialog notifies the user

    private static class KillApplicationHandler implements Thread.UncaughtExceptionHandler {
        public void uncaughtException(Thread t, Throwable e) {
            try {
                // Don't re-enter -- avoid infinite loops if crash-reporting crashes.
                if (mCrashing) return;
                mCrashing = true;

                // Try to end profiling. If a profiler is running at this point, and we kill the
                // process (below), the in-memory buffer will be lost. So try to stop, which will
                // flush the buffer. (This makes method trace profiling useful to debug crashes.)
                if (ActivityThread.currentActivityThread() != null) {
                    ActivityThread.currentActivityThread().stopProfiling();
                }

                // Bring up crash dialog, wait for it to be dismissed (this notifies AMS to show the dialog)
                ActivityManager.getService().handleApplicationCrash(
                        mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
            } catch (Throwable t2) {
                if (t2 instanceof DeadObjectException) {
                    // System process is dead; ignore
                } else {
                    try {
                        Clog_e(TAG, "Error reporting crash", t2);
                    } catch (Throwable t3) {
                        // Even Clog_e() fails!  Oh well.
                    }
                }
            } finally {
                // Try everything to make sure this process goes away.
                Process.killProcess(Process.myPid());
                System.exit(10);
            }
        }
    }
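The "log first, then report and kill" ordering of the two handlers installed by commonInit can be imitated with the standard Thread.UncaughtExceptionHandler API. This is a sketch under assumptions: CrashHandlerSketch is hypothetical, Thread.setUncaughtExceptionPreHandler is not available on a desktop JVM, so the two stages are chained by hand on a single thread, and the "kill" stage only records an event instead of ending the process.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo: chains a logging stage and a "kill" stage on one thread.
class CrashHandlerSketch {
    static List<String> runCrashingThread() {
        List<String> events = Collections.synchronizedList(new ArrayList<>());

        // Stage 1: plays the role of LoggingHandler.
        Thread.UncaughtExceptionHandler logging = (t, e) ->
                events.add("FATAL EXCEPTION: " + t.getName() + ": " + e.getMessage());
        // Stage 2: plays the role of KillApplicationHandler (recorded, not executed).
        Thread.UncaughtExceptionHandler kill = (t, e) ->
                events.add("report crash, then kill process");

        Thread worker = new Thread(() -> {
            throw new IllegalStateException("boom");
        }, "worker");
        worker.setUncaughtExceptionHandler((t, e) -> {
            logging.uncaughtException(t, e);
            kill.uncaughtException(t, e);
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(runCrashingThread());
    }
}
```

In the framework the second stage really does end the process (Process.killProcess followed by System.exit), which is why the stack trace has to be logged before it runs.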
3.5.3 ZygoteInit's nativeZygoteInit method

nativeZygoteInit is a JNI method, registered in AndroidRuntime.cpp.

/frameworks/base/core/jni/AndroidRuntime.cpp

static const RegJNIRec gRegJNI[] = {
    REG_JNI(register_com_android_internal_os_RuntimeInit),
    REG_JNI(register_com_android_internal_os_ZygoteInit),
    .....

/frameworks/base/core/jni/AndroidRuntime.cpp

int register_com_android_internal_os_ZygoteInit(JNIEnv* env)
{
    const JNINativeMethod methods[] = {
        { "nativeZygoteInit", "()V",
            (void*) com_android_internal_os_ZygoteInit_nativeZygoteInit },
    };
    return jniRegisterNativeMethods(env, "com/android/internal/os/ZygoteInit",
        methods, NELEM(methods));
}

So the function that actually gets called is com_android_internal_os_ZygoteInit_nativeZygoteInit.

/frameworks/base/core/jni/AndroidRuntime.cpp

static void com_android_internal_os_ZygoteInit_nativeZygoteInit(JNIEnv* env, jobject clazz)
{
    gCurRuntime->onZygoteInit();
}

com_android_internal_os_ZygoteInit_nativeZygoteInit calls AndroidRuntime's onZygoteInit function. onZygoteInit is a virtual function, however, and its implementation lives in app_main.cpp.

/frameworks/base/cmds/app_process/app_main.cpp

    virtual void onZygoteInit()
    {
        sp<ProcessState> proc = ProcessState::self();
        ALOGV("App process: starting thread pool.\n");
        // Start the Binder thread pool
        proc->startThreadPool();
    }

/frameworks/native/libs/binder/ProcessState.cpp

void ProcessState::startThreadPool()
{
    AutoMutex _l(mLock);
    if (!mThreadPoolStarted) {
        mThreadPoolStarted = true;
        spawnPooledThread(true);
    }
}

/frameworks/native/libs/binder/ProcessState.cpp

void ProcessState::spawnPooledThread(bool isMain)
{
    if (mThreadPoolStarted) {
        String8 name = makeBinderThreadName();
        ALOGV("Spawning new pooled thread, name=%s\n", name.string());
        sp<Thread> t = new PoolThread(isMain);
        t->run(name.string());
    }
}

/frameworks/native/libs/binder/ProcessState.cpp

String8 ProcessState::makeBinderThreadName() {
    int32_t s = android_atomic_add(1, &mThreadPoolSeq);
    pid_t pid = getpid();
    String8 name;
    name.appendFormat("Binder:%d_%X", pid, s);
    return name;
}
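The naming scheme in makeBinderThreadName explains the Binder:<pid>_<n> thread names seen in traces. A minimal Java rendering of the same format string follows; BinderThreadNameSketch is hypothetical, and both the pid and the starting sequence value are passed in as parameters because they are assumptions of the sketch rather than values taken from the source above.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical Java rendering of the "Binder:%d_%X" naming scheme.
class BinderThreadNameSketch {
    private final long pid;
    private final AtomicInteger seq;

    BinderThreadNameSketch(long pid, int firstSeq) {
        this.pid = pid;
        this.seq = new AtomicInteger(firstSeq);
    }

    String next() {
        // Like android_atomic_add, getAndAdd returns the value before the increment.
        int s = seq.getAndAdd(1);
        // %X prints the sequence number in upper-case hexadecimal.
        return String.format(Locale.ROOT, "Binder:%d_%X", pid, s);
    }

    public static void main(String[] args) {
        BinderThreadNameSketch names = new BinderThreadNameSketch(2999, 1);
        System.out.println(names.next());
        System.out.println(names.next());
    }
}
```

Because the sequence number is printed in hexadecimal, the tenth thread of a pool that starts counting at 1 ends in _A rather than _10.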
3.5.4 RuntimeInit's applicationInit method
/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java

    protected static void applicationInit(int targetSdkVersion, String[] argv, ClassLoader classLoader)
            throws Zygote.MethodAndArgsCaller {
        // If the application calls System.exit(), terminate the process
        // immediately without running any shutdown hooks.  It is not possible to
        // shutdown an Android application gracefully.  Among other things, the
        // Android runtime shutdown hooks close the Binder driver, which can cause
        // leftover running threads to crash before the process actually exits.
        nativeSetExitWithoutCleanup(true);

        // We want to be fairly aggressive about heap utilization, to avoid
        // holding on to a lot of memory that isn't needed.
        VMRuntime.getRuntime().setTargetHeapUtilization(0.75f);
        VMRuntime.getRuntime().setTargetSdkVersion(targetSdkVersion);

        final Arguments args;
        try {
            // Assigns com.android.server.SystemServer to startClass
            args = new Arguments(argv);
        } catch (IllegalArgumentException ex) {
            Slog.e(TAG, ex.getMessage());
            // let the process exit
            return;
        }

        // The end of of the RuntimeInit event (see #zygoteInit).
        Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);

        // Remaining arguments are passed to the start class's static main
        invokeStaticMain(args.startClass, args.startArgs, classLoader);
    }

After the Arguments constructor in applicationInit has run, args.startClass holds com.android.server.SystemServer.

/frameworks/base/core/java/com/android/internal/os/RuntimeInit.java

    private static void invokeStaticMain(String className, String[] argv, ClassLoader classLoader)
            throws Zygote.MethodAndArgsCaller {
        Class<?> cl;

        try {
            cl = Class.forName(className, true, classLoader);
        } catch (ClassNotFoundException ex) {
            throw new RuntimeException(
                    "Missing class when invoking static main " + className,
                    ex);
        }

        Method m;
        try {
            m = cl.getMethod("main", new Class[] { String[].class });
        } catch (NoSuchMethodException ex) {
            throw new RuntimeException(
                    "Missing static main on " + className, ex);
        } catch (SecurityException ex) {
            throw new RuntimeException(
                    "Problem getting static main on " + className, ex);
        }

        int modifiers = m.getModifiers();
        if (! (Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
            throw new RuntimeException(
                    "Main method is not public and static on " + className);
        }

        /*
         * This throw gets caught in ZygoteInit.main(), which responds
         * by invoking the exception's run() method. This arrangement
         * clears up all the stack frames that were required in setting
         * up the process.
         */
        throw new Zygote.MethodAndArgsCaller(m, argv);
    }

invokeStaticMain loads the bytecode of com.android.server.SystemServer, looks up the class's main method via reflection to obtain a Method object, and throws a Zygote.MethodAndArgsCaller exception. That takes us back to ZygoteInit.main, where we started. The full call chain is: ZygoteInit.main-->ZygoteInit.startSystemServer-->Zygote.forkSystemServer-->com_android_internal_os_Zygote_nativeForkSystemServer-->ForkAndSpecializeCommon-->fork-->ZygoteInit.handleSystemServerProcess-->ZygoteInit.zygoteInit-->RuntimeInit.applicationInit-->RuntimeInit.invokeStaticMain. The chain ends in invokeStaticMain, whose Zygote.MethodAndArgsCaller exception is caught by ZygoteInit.main.

frameworks/base/core/java/com/android/internal/os/ZygoteInit.java

    public static void main(String argv[]) {
        // 1. Create the ZygoteServer
        ZygoteServer zygoteServer = new ZygoteServer();
        try {
            // 2. Create the server-side socket
            zygoteServer.registerServerSocket(socketName);
            // 3. Preload the process's resources and classes
            preload(bootTimingsTraceLog);
            if (startSystemServer) {
                // 4. Start the SystemServer process: the first division of the zygote process
                startSystemServer(abiList, socketName, zygoteServer);
            }
            // 5. Enter an endless loop that listens for messages from clients
            zygoteServer.runSelectLoop(abiList);
            // 6. Close the Zygote server socket
            zygoteServer.closeServerSocket();
        } catch (Zygote.MethodAndArgsCaller caller) {
            // 7. Catch the exception and call MethodAndArgsCaller's run method
            caller.run();
        } catch (Throwable ex) {
            Log.e(TAG, "System zygote died with exception", ex);
            zygoteServer.closeServerSocket();
            throw ex;
        }
    }

/frameworks/base/core/java/com/android/internal/os/Zygote.java

    public static class MethodAndArgsCaller extends Exception
            implements Runnable {
        /** method to call */
        private final Method mMethod;

        /** argument array */
        private final String[] mArgs;

        public MethodAndArgsCaller(Method method, String[] args) {
            // Constructor: stores SystemServer's main method in mMethod
            mMethod = method;
            mArgs = args;
        }

        public void run() {
            try {
                // Invokes the stored method, which enters SystemServer's main
                mMethod.invoke(null, new Object[] { mArgs });
            } catch (IllegalAccessException ex) {
                throw new RuntimeException(ex);
            } catch (InvocationTargetException ex) {
                Throwable cause = ex.getCause();
                if (cause instanceof RuntimeException) {
                    throw (RuntimeException) cause;
                } else if (cause instanceof Error) {
                    throw (Error) cause;
                }
                throw new RuntimeException(ex);
            }
        }
    }

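
The catch block for InvocationTargetException in run() is worth a closer look: a reflective call wraps whatever the target throws, so the code unwraps the cause to surface the original failure. The following is a small stand-alone sketch of the same pattern in plain Java. UnwrapDemo, Failing and invokeMain are hypothetical names made up for illustration and have nothing to do with the framework.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

final class UnwrapDemo {
    /** Hypothetical target whose main always fails. */
    public static final class Failing {
        public static void main(String[] args) {
            throw new IllegalStateException("boom");
        }
    }

    /** Invokes a static main reflectively and rethrows the original failure. */
    static void invokeMain(Class<?> cl, String[] args) {
        try {
            Method m = cl.getMethod("main", String[].class);
            // Wrap args in Object[] so the String[] is passed as ONE parameter.
            m.invoke(null, new Object[] { args });
        } catch (NoSuchMethodException | IllegalAccessException ex) {
            throw new RuntimeException(ex);
        } catch (InvocationTargetException ex) {
            // The reflective call wrapped the real exception; unwrap it.
            Throwable cause = ex.getCause();
            if (cause instanceof RuntimeException) {
                throw (RuntimeException) cause;
            } else if (cause instanceof Error) {
                throw (Error) cause;
            }
            throw new RuntimeException(ex);
        }
    }

    public static void main(String[] unused) {
        try {
            invokeMain(Failing.class, new String[0]);
        } catch (IllegalStateException ex) {
            System.out.println("caught original: " + ex.getMessage());
            // prints "caught original: boom"
        }
    }
}
```

Without the unwrapping step the caller would see an InvocationTargetException instead of the IllegalStateException that the target actually threw.
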
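  • Question: why is SystemServer's main method invoked by throwing an exception?
    Forking the process from ZygoteInit.main goes through many nested calls, so a good number of stack frames have piled up by the time invokeStaticMain is reached. To give the new process a clean stack, those frames have to be cleared. Throwing the exception does exactly that: unwinding to the catch block in ZygoteInit.main discards the setup frames, and caller.run() then invokes SystemServer's main from there.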
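
The effect on the stack can be observed in plain Java. The sketch below is only an illustration under simplifying assumptions: TrampolineDemo, Caller, setup and launch are hypothetical names, and a recursive method stands in for the nested setup calls. It records the stack depth once when the target is called directly from the bottom of the call chain, and once when it is carried up by an exception and run from the catch block.

```java
final class TrampolineDemo {
    /** Hypothetical counterpart of Zygote.MethodAndArgsCaller. */
    static final class Caller extends Exception implements Runnable {
        private final Runnable target;

        Caller(Runnable target) {
            this.target = target;
        }

        @Override
        public void run() {
            target.run();
        }
    }

    static int depthDirect;
    static int depthViaThrow;

    /** Number of frames currently on this thread's stack. */
    static int stackDepth() {
        return Thread.currentThread().getStackTrace().length;
    }

    /** Simulates nested setup calls; at the bottom, measure directly or throw. */
    static void setup(int levels, boolean useThrow) throws Caller {
        if (levels > 0) {
            setup(levels - 1, useThrow);
            return;
        }
        if (useThrow) {
            // Hand the work to the top-level catch block instead of doing it here.
            throw new Caller(() -> depthViaThrow = stackDepth());
        }
        depthDirect = stackDepth();
    }

    static void launch(boolean useThrow) {
        try {
            setup(10, useThrow);
        } catch (Caller caller) {
            // The setup frames have been unwound by the time we get here.
            caller.run();
        }
    }

    public static void main(String[] args) {
        launch(false);
        launch(true);
        System.out.println("direct=" + depthDirect + " viaThrow=" + depthViaThrow);
    }
}
```

The exact numbers depend on the JVM, but the depth measured via the throw should be clearly smaller, because the frames of the nested setup calls no longer sit underneath the target.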

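4. Summary

This article walked through how the SystemServer process is started, which is the first division of the zygote process. The key points to take away are:

  • 1. The special use of the waitpid function
  • 2. SystemServer and Zygote live and die together
  • 3. How the stack trace of a crashed process is printed, and how the error dialog is shown
  • 4. Why SystemServer's main method is invoked by throwing an exception

The next article will go through what SystemServer's main method does.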

References
https://www.cnblogs.com/rainey-forrest/p/5509215.html
