Android从7.0开始引入新的OTA升级方式,A/B System Updates,这里将其叫做A/B系统,涉及的内容较多,分多篇对A/B系统的各个方面进行分析。本文为第三篇,主系统和bootloader的通信。

本文基于AOSP 7.1.1_r23 (NMF27D)代码进行分析。

1. 传统系统的沟通机制

传统方式中,Android主系统同bootloaderrecovery系统通过存放于misc分区的bootloader_message结构进行通信。

struct bootloader_message {    char command[32];    char status[32];    char recovery[768];    /* The 'recovery' field used to be 1024 bytes.  It has only ever     * been used to store the recovery command line, so 768 bytes     * should be plenty.  We carve off the last 256 bytes to store the     * stage string (for multistage packages) and possible future     * expansion.*/    char stage[32];    char reserved[224];};
  1. android系统或recovery系统根据操作更新bootloader_messagecommand成员,并写入misc分区;
  2. bootloader启动后读取misc分区并解析得到bootloader_message,根据command内容选择相应的操作,command可能的内容包括:

    • "update-radio/hboot"
      • 指示bootloader更新firmware
    • "boot-recovery"
      • 指示bootloader加载recovery系统,进入recovery mode

2. A/B系统的沟通机制

2.1 boot_control的接口定义

A/B系统中,指定了用于通信的HAL层boot_control功能的定义,但没有指定通信数据具体的存储实现,这点有别于传统方式下AOSP定义的存储于misc分区的bootloader_message结构。

HAL层的boot_control,其定义位于文件中hardware/libhardware/include/hardware/boot_control.h

/** * Every hardware module must have a data structure named HAL_MODULE_INFO_SYM * and the fields of this data structure must begin with hw_module_t * followed by module specific information. */typedef struct boot_control_module {    struct hw_module_t common;    /*     * (*init)() perform any initialization tasks needed for the HAL.     * This is called only once.     */    /* init 用于HAL初始化调用,仅启动时调用一次 */    void (*init)(struct boot_control_module *module);    /*     * (*getNumberSlots)() returns the number of available slots.     * For instance, a system with a single set of partitions would return     * 1, a system with A/B would return 2, A/B/C -> 3...     */    /* 返回系统slot分区套数,1套slot包含boot, system和vendor分区 */    unsigned (*getNumberSlots)(struct boot_control_module *module);    /*     * (*getCurrentSlot)() returns the value letting the system know     * whether the current slot is A or B. The meaning of A and B is     * left up to the implementer. It is assumed that if the current slot     * is A, then the block devices underlying B can be accessed directly     * without any risk of corruption.     * The returned value is always guaranteed to be strictly less than the     * value returned by getNumberSlots. Slots start at 0 and     * finish at getNumberSlots() - 1     */    /* 返回系统当前所在的slot位置 */    unsigned (*getCurrentSlot)(struct boot_control_module *module);    /*     * (*markBootSuccessful)() marks the current slot     * as having booted successfully     *     * Returns 0 on success, -errno on error.     */    /* 标记当前slot为已经成功启动 */    int (*markBootSuccessful)(struct boot_control_module *module);    /*     * (*setActiveBootSlot)() marks the slot passed in parameter as     * the active boot slot (see getCurrentSlot for an explanation     * of the "slot" parameter). This overrides any previous call to     * setSlotAsUnbootable.     * Returns 0 on success, -errno on error.     */    /* 标记指定slot为可启动 */    int (*setActiveBootSlot)(struct boot_control_module *module, unsigned slot);    /*     * (*setSlotAsUnbootable)() marks the slot passed in parameter as     * an unbootable. This can be used while updating the contents of the slot's     * partitions, so that the system will not attempt to boot a known bad set up.     * Returns 0 on success, -errno on error.     */    /* 标记指定slot为不可启动 */    int (*setSlotAsUnbootable)(struct boot_control_module *module, unsigned slot);    /*     * (*isSlotBootable)() returns if the slot passed in parameter is     * bootable. Note that slots can be made unbootable by both the     * bootloader and by the OS using setSlotAsUnbootable.     * Returns 1 if the slot is bootable, 0 if it's not, and -errno on     * error.     */    /* 返回指定slot是否可启动 */    int (*isSlotBootable)(struct boot_control_module *module, unsigned slot);    /*     * (*getSuffix)() returns the string suffix used by partitions that     * correspond to the slot number passed in parameter. The returned string     * is expected to be statically allocated and not need to be freed.     * Returns NULL if slot does not match an existing slot.     */    /* 返回指定slot的系统分区后缀,例如“_a”/“_b”等 */    const char* (*getSuffix)(struct boot_control_module *module, unsigned slot);    /*     * (*isSlotMarkedSucessful)() returns if the slot passed in parameter has     * been marked as successful using markBootSuccessful.     * Returns 1 if the slot has been marked as successful, 0 if it's     * not the case, and -errno on error.     */    /* 返回指定slot是否已经标记为成功启动 */    int (*isSlotMarkedSuccessful)(struct boot_control_module *module, unsigned slot);    void* reserved[31];} boot_control_module_t;

2.2 boot_control的存储和功能实现

对于boot_controlAOSP仅定义了其功能接口,并没有提供具体的代码实现,各厂家根据这个头文件,自定义其存储和功能实现。

使用grep工具搜索代码中的boot_control关键字,可以发现AOSP代码里面包含了三个平台的boot_control实现:

  • Google平台的Brillo
  • Intel平台的edison
  • QualComm

2.2.1 Google平台Brillo的实现

AOSP代码中,system\extra\boot_control_copy 定义了bootctrl.default实现:

$ ls -lh system/extras/boot_control_copy/total 36K-rw-r--r-- 1 ygu users  458 Mar 31 08:50 Android.mk-rw-r--r-- 1 ygu users  11K Mar 31 08:50 NOTICE-rw-r--r-- 1 ygu users 7.7K Mar 31 08:50 boot_control_copy.c-rw-r--r-- 1 ygu users 5.1K Mar 31 08:50 bootinfo.c-rw-r--r-- 1 ygu users 2.0K Mar 31 08:50 bootinfo.h

各文件的内容如下:

  • bootinfo.h定义了结构体BrilloSlotInfoBrilloBootInfo

    BrilloBootInfo包含结构体BrilloBootInfo,作为boot_control的私有数据实现,定义如下:

    typedef struct BrilloSlotInfo {  uint8_t bootable : 1;  uint8_t reserved[3];} BrilloSlotInfo;typedef struct BrilloBootInfo {  // Used by fs_mgr. Must be NUL terminated.  char bootctrl_suffix[4];  // Magic for identification - must be 'B', 'C', 'c' (short for  // "boot_control copy" implementation).  uint8_t magic[3];  // Version of BrilloBootInfo struct, must be 0 or larger.  uint8_t version;  // Currently active slot.  uint8_t active_slot;  // Information about each slot.  BrilloSlotInfo slot_info[2];  uint8_t reserved[15];} BrilloBootInfo;

    结构体BrilloBootInfo占用32字节,系统复用misc分区的bootloader_message结构体,将BrilloBootInfo存放在偏移量为864字节的成员slot_suffix[32]中,整个misc分区数据结构的框图如下:

  • bootinfo.c实现了对BrilloBootInfo进行存取操作的接口

    • 存取操作
      • bool boot_info_load(BrilloBootInfo *out_info)
      • bool boot_info_save(BrilloBootInfo *info)
    • 校验和复位操作
      • bool boot_info_validate(BrilloBootInfo* info)
      • void boot_info_reset(BrilloBootInfo* info)
    • 指定分区的打开操作
      • int boot_info_open_partition(const char *name, uint64_t *out_size, int flags)
  • boot_control_copy.c实现了boot_control模块的功能

    /* This boot_control HAL implementation emulates A/B by copying the * contents of the boot partition of the requested slot to the boot * partition. It hence works with bootloaders that are not yet aware * of A/B. This code is only intended to be used for development. */boot_control_module_t HAL_MODULE_INFO_SYM = {  .common = {    .tag                 = HARDWARE_MODULE_TAG,    .module_api_version  = BOOT_CONTROL_MODULE_API_VERSION_0_1,    .hal_api_version     = HARDWARE_HAL_API_VERSION,    .id                  = BOOT_CONTROL_HARDWARE_MODULE_ID,    .name                = "Copy Implementation of boot_control HAL",    .author              = "The Android Open Source Project",    .methods             = &module_methods,  },  .init                 = module_init,  .getNumberSlots       = module_getNumberSlots,  .getCurrentSlot       = module_getCurrentSlot,  .markBootSuccessful   = module_markBootSuccessful,  .setActiveBootSlot    = module_setActiveBootSlot,  .setSlotAsUnbootable  = module_setSlotAsUnbootable,  .isSlotBootable       = module_isSlotBootable,  .getSuffix            = module_getSuffix,};

代码实现了boot_control_module_t模块接口的功能,这里不再对每一个函数实现进行注释,但需要特别指出的是,函数module_setActiveBootSlot内部会根据传入的slot参数将对应分区boot_X内容复制到boot分区(系统上应该存在三个分区,如bootboot_aboot_b),bootloader不需要改动代码去检查到底是从哪个分区启动,只管加载boot分区就好了,带来的问题就是,一旦启动失败(例如,kernel挂载system分区失败,根本没有进入Android环境),bootloader无法切换到另外一个slot。注释中也提到,这种方式不需要修改bootloader,其代码实现只是用于开发目的,最终产品不应该是这样的。

2.2.2 Intel平台edison的实现

AOSP代码中,hardware\bsp\intel\soc\common\bootctrl定义了bootctrl.edison的实现:

$ ls -lh hardware/bsp/intel/soc/common/bootctrl/total 20K-rw-r--r-- 1 ygu users  860 Mar 31 08:47 Android.mk-rw-r--r-- 1 ygu users 9.1K Mar 31 08:47 bootctrl.c-rw-r--r-- 1 ygu users 1.5K Mar 31 08:47 bootctrl.h

各文件的内容如下:

  • bootctrl.h定义了结构体slot_metadata_tboot_ctrl_t

    boot_ctrl_t包含结构体slot_metadata_t,作为boot_control的私有数据实现,定义如下:

    #define BOOT_CONTROL_VERSION    1typedef struct slot_metadata {    uint8_t priority : 4;    uint8_t tries_remaining : 3;    uint8_t successful_boot : 1;} slot_metadata_t;typedef struct boot_ctrl {    /* Magic for identification - '\0ABB' (Boot Contrl Magic) */    uint32_t magic;    /* Version of struct. */    uint8_t version;    /* Information about each slot. */    slot_metadata_t slot_info[2];    uint8_t recovery_tries_remaining;} boot_ctrl_t;

    Brillo类似,系统复用misc分区的bootloader_message结构体,将boot_ctrl_t存放在偏移量为864字节的成员slot_suffix[32]中,整个misc分区数据结构的框图如下:

  • bootctrl.c实现了boot_ctrl_t存取操作和boot_control的模块功能

    • boot_ctrl_t存取操作
      • int bootctrl_read_metadata(boot_ctrl_t *bctrl)
      • int bootctrl_write_metadata(boot_ctrl_t *bctrl)
    • boot_control模块功能

      /* Boot Control Module implementation */boot_control_module_t HAL_MODULE_INFO_SYM = {    .common = {        .tag                 = HARDWARE_MODULE_TAG,        .module_api_version  = BOOT_CONTROL_MODULE_API_VERSION_0_1,        .hal_api_version     = HARDWARE_HAL_API_VERSION,        .id                  = BOOT_CONTROL_HARDWARE_MODULE_ID,        .name                = "boot_control HAL",        .author              = "Intel Corporation",        .methods             = &bootctrl_methods,    },    .init                 = bootctrl_init,    .getNumberSlots       = bootctrl_get_number_slots,    .getCurrentSlot       = bootctrl_get_current_slot,    .markBootSuccessful   = bootctrl_mark_boot_successful,    .setActiveBootSlot    = bootctrl_set_active_boot_slot,    .setSlotAsUnbootable  = bootctrl_set_slot_as_unbootable,    .isSlotBootable       = bootctrl_is_slot_bootable,    .getSuffix            = bootctrl_get_suffix,};

由于没有bootloader的代码,所以对于如何通过结构体slot_metadata_t的成员prioritypriority来选择启动哪一个slot并不清楚,无法对结构体成员的作用有更详细的说明。

值得一提的是,通过读取linux命令行参数androidboot.slot_suffix=来确定当前系统在哪一个slot上运行(见bootctrl_get_active_slot函数)。

2.2.3 QualComm平台的实现

AOSP代码中,hardware\qcom\bootctrl定义了bootctrl.$(TARGET_BOARD_PLATFORM)的实现(具体名字依赖于TARGET_BOARD_PLATFORM变量设定):

$ ls -lh hardware/qcom/bootctrl/total 28K-rw-r--r-- 1 ygu users  944 Mar 31 08:47 Android.mk-rw-r--r-- 1 ygu users 1.5K Mar 31 08:47 NOTICE-rw-r--r-- 1 ygu users  19K Mar 31 08:47 boot_control.cpp

QualComm平台的实现比较特别,没有单独定义boot_control的私有数据,而是将A/B系统相关信息存放到gpt表上。
GPT内容的第3个逻辑块LBA 2开始,依次存放的是每个GPT分区的详细信息Partition Entry,单个Partition Entry占用128个字节,从其第48个字节开始存放的是分区属性(Attribute flags)。A/B系统将每个slot分区的信息,存放到分区属性的Bit 48开始的位置上。

QualComm平台详细的A/B系统分区属性如下:

关于GPT分区的详细信息,可以参考另外一篇文章:<<博通机顶盒平台GPT分区和制作工具>>的第1部分,关于GPT的介绍。

在代码实现中比较特别的是:

  • 统计系统中boot开头的分区数作为slot总数(见get_number_slots函数)
  • 访问系统的属性ro.boot.slot_suffix来确定当前系统在哪一个slot上运行(见get_current_slot函数)

2.3.4 Broadcom机顶盒平台的实现

Broadcom单独提供的代码中(非AOSP代码),vendor/broadcom/bcm_platform/hals/boot_control定义了bootctrl.$(TARGET_BOARD_PLATFORM)的实现(如bootctrl.bcm7252ssffdr4):

$ ls -lh vendor/broadcom/bcm_platform/hals/boot_control/total 20K-rw-r--r-- 1 ygu users 1.3K Mar 30 16:09 Android.mk-rw-r--r-- 1 ygu users  11K May  6 16:26 boot_control.cpp-rw-r--r-- 1 ygu users 1.1K Mar 30 16:09 eio_boot.h
  • eio_boot.h定义了结构体eio_boot_sloteio_boot

    eio_boot包含结构体eio_boot_slot,作为boot_control的私有数据实现,定义如下:

    struct eio_boot_slot {   char suffix[8];   int  valid;   int  boot_try;   int  boot_ok;   int  boot_fail;};struct eio_boot {   int    magic;   int    current;   struct eio_boot_slot slot[2];};

    结构体eio_boot的数据存放在名为eio的分区上。
    Broadcom机顶盒平台eio_boot结构框图如下:

  • boot_control.cpp实现了eio_boot存取操作和boot_control的模块功能

    struct boot_control_module HAL_MODULE_INFO_SYM = {   .common = {      .tag                = HARDWARE_MODULE_TAG,      .module_api_version = BOOT_CONTROL_MODULE_API_VERSION_0_1,      .hal_api_version    = HARDWARE_HAL_API_VERSION,      .id                 = BOOT_CONTROL_HARDWARE_MODULE_ID,      .name               = "boot control hal for bcm platform",      .author             = "Broadcom",      .methods            = &boot_control_module_methods,      .dso                = 0,      .reserved           = {0}    },    .init                    = init,    .getNumberSlots          = getNumberSlots,    .getCurrentSlot          = getCurrentSlot,    .markBootSuccessful      = markBootSuccessful,    .setActiveBootSlot       = setActiveBootSlot,    .setSlotAsUnbootable     = setSlotAsUnbootable,    .isSlotBootable          = isSlotBootable,    .getSuffix               = getSuffix,    .isSlotMarkedSuccessful  = isSlotMarkedSuccessful,};

    Broadcom平台的分区后缀名不同于常见的_a/_b,而是采用_i/_e,这里略去对函数内容的注释。

2.3 boot_control的测试工具

除了定义HAL层的接口外,AOSP也提供了boot_control模块调用的工具bootctl,位于:
system/extras/bootctl/bootctl.c

默认情况下,bootctl不会参与编译,可以在包含update_engine是将其添加到PRODUCT_PACKAGES,如下:

PRODUCT_PACKAGES += \  update_engine \  update_verifier \  bootctl

bootctl工具很简单,通过命令行调用boot_control的功能接口,以下是在Broadcom参考平台上运行bootctl的例子:

bcm7252ssffdr4:/ $ subcm7252ssffdr4:/ # which bootctl/system/bin/bootctlbcm7252ssffdr4:/ # bootctl --helpbootctl - command-line wrapper for the boot_control HAL.Usage:  bootctl COMMANDCommands:  bootctl hal-info                       - Show info about boot_control HAL used.  bootctl get-number-slots               - Prints number of slots.  bootctl get-current-slot               - Prints currently running SLOT.  bootctl mark-boot-successful           - Mark current slot as GOOD.  bootctl set-active-boot-slot SLOT      - On next boot, load and execute SLOT.  bootctl set-slot-as-unbootable SLOT    - Mark SLOT as invalid.  bootctl is-slot-bootable SLOT          - Returns 0 only if SLOT is bootable.  bootctl is-slot-marked-successful SLOT - Returns 0 only if SLOT is marked GOOD.  bootctl get-suffix SLOT                - Prints suffix for SLOT.SLOT parameter is the zero-based slot-number.64|bcm7252ssffdr4:/ # 64|bcm7252ssffdr4:/ # bootctl hal-infoHAL name:            boot control hal for bcm platformHAL author:          BroadcomHAL module version:  0.1bcm7252ssffdr4:/ # bootctl get-number-slots2bcm7252ssffdr4:/ # bootctl get-current-slot0bcm7252ssffdr4:/ # bootctl get-suffix 0_ibcm7252ssffdr4:/ # bootctl get-suffix 1_e

最后的bootctl get-suffix调用可以看到,在我的测试平台上,slot Aslot B的分区命名后缀分别为_i_e

基于bootctl的基础上,Android系统提供了两个基于Brillo平台的测试代码,分别位于以下路径:

  • system/extras/tests/bootloader
  • external/autotest/server/site_tests/brillo_BootLoader

后续打算写一篇博客来单独介绍如何在Android下运行这些测试例子进行单元测试。

2.4 boot_control的调用

2.4.1 bootloader读取boot_control私有实现的数据

设备启动后bootloader会读取boot_control私有实现的数据,来判断从哪一个slot启动,由于各家实现的私有数据结构不一样,所以无法详细说明如何解析和处理的过程。

2.4.1 boot_control_android调用boot_control

文件system/update_engine/boot_control_android.cc中,类BootControlAndroid有一个私有成员module_

// The Android implementation of the BootControlInterface. This implementation// uses the libhardware's boot_control HAL to access the bootloader.class BootControlAndroid : public BootControlInterface { ... private:  // NOTE: There is no way to release/unload HAL implementations so  // this is essentially leaked on object destruction.  boot_control_module_t* module_;  ...};

BootControlAndroidInit方法内,获取boot_control_module_t模块指针并赋值给module_成员,然后调用module_->init进行boot_control的初始化,如下:

bool BootControlAndroid::Init() {  const hw_module_t* hw_module;  int ret;#ifdef _UE_SIDELOAD  // For update_engine_sideload, we simulate the hw_get_module() by accessing it  // from the current process directly.  # 对于update_engine_sideload应用,直接将HAL_MODULE_INFO_SYM转换为hw_module  hw_module = &HAL_MODULE_INFO_SYM;  ret = 0;  if (!hw_module ||      strcmp(BOOT_CONTROL_HARDWARE_MODULE_ID, hw_module->id) != 0) {    ret = -EINVAL;  }#else  // !_UE_SIDELOAD  # 对于update_engine应用,通过BOOT_CONTROL_HARDWARE_MODULE_ID获取hw_module  ret = hw_get_module(BOOT_CONTROL_HARDWARE_MODULE_ID, &hw_module);#endif  // _UE_SIDELOAD  if (ret != 0) {    LOG(ERROR) << "Error loading boot_control HAL implementation.";    return false;  }  # 通过hw_module得到boot_control_module_t,从而后面可以愉快地调用其各种功能实现函数  module_ = reinterpret_cast(const_cast(hw_module));  # 调用boot_control的init函数  module_->init(module_);  LOG(INFO) << "Loaded boot_control HAL "            << "'" << hw_module->name << "' "            << "version " << (hw_module->module_api_version>>8) << "."            << (hw_module->module_api_version&0xff) << " "            << "authored by '" << hw_module->author << "'.";  return true;}

初始化完成后,就可以通过module_成员来调用各种boot_control的操作了。

2.4.2 update_verifier调用boot_control

文件bootable/recovery/update_verifier/update_verifier.cpp中,获取boot_control_module_t指针,检查当前slot分区是否已经标记为successful,如果没有,则尝试verify_image并将当前slot标记为successful,具体代码如下:

int main(int argc, char** argv) {  ...  # 直接根据名称"bootctrl"获取模块  const hw_module_t* hw_module;  if (hw_get_module("bootctrl", &hw_module) != 0) {    SLOGE("Error getting bootctrl module.\n");    return -1;  }  # 将"bootctrl"模块转化为"boot_control_module_t"结构体  boot_control_module_t* module = reinterpret_cast(      const_cast(hw_module));  # 调用init  module->init(module);  # 获取当前slot  unsigned current_slot = module->getCurrentSlot(module);  # 检查当前slot是否标记为successful  int is_successful= module->isSlotMarkedSuccessful(module, current_slot);  SLOGI("Booting slot %u: isSlotMarkedSuccessful=%d\n", current_slot, is_successful);  # 如果当前slot没有标记为successful,说明当前启动可能存在问题  if (is_successful == 0) {    // The current slot has not booted successfully.    # 检查"ro.boot.verifymode",是否其它原因导致失败    # 不是其它原因导致失败的情况下,重新调用verify_image验证    ...    # verify_image验证成功,尝试标记当前slot为successful    int ret = module->markBootSuccessful(module);    if (ret != 0) {      SLOGE("Error marking booted successfully: %s\n", strerror(-ret));      return -1;    }    SLOGI("Marked slot %u as booted successfully.\n", current_slot);  }  # 完成操作,退出update_verifier  SLOGI("Leaving update_verifier.\n");  return 0;}

整个A/B系统中,基于boot_control的上层应用操作已经实现了,各家需要单独实现boot_control的底层操作,同时bootloader也需要配合解析boot_control的私有数据,从而选择相应的slot来启动Android系统。

3. 联系和福利

  • 关注微信公众号“洛奇看世界”

    • 回复关键词“Android电子书”,获取超过150本Android相关的电子书和文档。电子书包含了Android开发相关的方方面面,从此你再也不需要到处找Android开发的电子书了。

    Android A/B System OTA分析(三)主系统和bootloader的通信_第1张图片


  • 个人微信号,添加请备注“微信公众号”。

    Android A/B System OTA分析(三)主系统和bootloader的通信_第2张图片

更多相关文章

  1. android 查看系统数据库
  2. 从Handler+Message+Looper源码带你分析Android系统的消息处理机
  3. Android swap分区作用及swapper软件设置方法
  4. 从Android到React Native开发(三、自定义原生控件支持)
  5. 如今的移动操作系统,写在2013年——android篇 by 伊一线天
  6. 自定义RatingBar的显示(评分组件)

随机推荐

  1. android创建googlemap基础教程和画导航线
  2. Android 简单例子以及入门学习资料链接
  3. Android——studio入门
  4. Android数据库应用(《Android开发入门与
  5. android Tabhost部件
  6. appwidget 续
  7. Android UsageStatsService:要点解析
  8. Android getResources().getConfiguratio
  9. Android-----js和android的互调
  10. RxJava(十三):RxAndroid 2.x 和 Retrofit