Qualcomm's approach to resolving crash and reboot issues on the 8916/8939 platforms

Question: When a device crashes but the dump gives no clue, for example an unknown reset, watchdog bite, memory corruption, or bit-flip crash, how should we handle it?

Answer: Many crashes show up as an unknown watchdog bite or reset, as apparently random memory corruption or a bit flip, or as a strange kernel panic that does not look like a software logic bug. For such crashes we need to check the hardware side: memory configuration, PVS, CPR, voltage, clock settings, and so on.
This kind of issue usually takes many tests to narrow down before we can find the right direction for resolving the unknown reset.

Normally we can run the following tests:

  1. Check the PDN report.
    The customer's hardware team raises a case for PDN simulation. Check that report to confirm that every power rail meets Qualcomm's requirements.
    The rails to check are normally vdd_cx, vdd_mx, and vdd_apc (the power rail for the apps cores).
    If any of them is outside Qualcomm's requirements, boost it for testing; the methods for boosting each rail are listed below.

  2. Run the DDR QBlizzard test to test the DDR. If you do not know how to use QBlizzard, please refer to doc 80-NH759-1, QBlizzard_2_10_UG.pdf.

  3. Disable CPR. If the issue cannot be reproduced with CPR disabled, then adjust your CPR settings to boost vdd_apc step by step.
    In arch/arm/boot/dts/qcom/msm8916-regulator.dtsi, remove the qcom,cpr-enable property:
        <0 0 2 4 8>,
        <1 0 2 4 7>;
        qcom,cpr-quot-adjust-scaling-factor-max = <650>;
    -   qcom,cpr-enable;
    };
  4. Raise the vdd_apc voltage
    qcom,cpr-voltage-ceiling = <1050000 1150000 1350000>;
    qcom,cpr-voltage-floor = <1050000 1050000 1137500>;
    Change to:
    qcom,cpr-voltage-ceiling = <1350000 1350000 1350000>;
    qcom,cpr-voltage-floor = <1350000 1350000 1350000>;
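
To see why setting the floor equal to the ceiling pins the rail, here is a minimal sketch of the per-corner clamping that a closed-loop regulator typically applies. The function `clamp_uv` is a hypothetical illustration, not code from the kernel CPR driver.

```c
/* Hypothetical helper: limit a requested voltage (in uV) to [floor_uv, ceiling_uv]. */
static unsigned clamp_uv(unsigned request_uv, unsigned floor_uv, unsigned ceiling_uv)
{
    if (request_uv < floor_uv)
        return floor_uv;
    if (request_uv > ceiling_uv)
        return ceiling_uv;
    return request_uv;
}
```

With floor and ceiling both at 1350000, any request comes back as 1350000, so the rail stays at the maximum for the duration of the test.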

  5. Raise the cx voltage
    Bump up the retention voltage by 50 mV.
    /rpm_proc/core/power/sleep/src/8916/sleep_target_config.c
    // retention programmed in uV ( 600000uV = 0.6V )
    static const uint32 vddcx_pvs_retention_data[8] =
    {
      /* 000 */ 650000+50000,
      /* 001 */ 500000+50000,
      /* 010 */ 650000+50000,
      /* 011 */ 650000+50000,
      /* 100 */ 650000+50000,
      /* 101 */ 650000+50000,
      /* 110 */ 650000+50000,
      /* 111 */ 650000+50000
    };
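
Writing the margin as a named constant makes it easier to revert once the experiment is over. The sketch below assumes a hypothetical macro `RETENTION_MARGIN_UV` and a renamed array, and uses `uint32_t` in place of the RPM build's `uint32` type; the base values are the ones from the table above.

```c
#include <stdint.h>

/* Hypothetical debug margin added to every retention entry, in uV. */
#define RETENTION_MARGIN_UV 50000u
#define RET_UV(base_uv) ((uint32_t)(base_uv) + RETENTION_MARGIN_UV)

/* Entries are ordered 000..111, as in the original table. */
static const uint32_t vddcx_pvs_retention_data_sketch[8] = {
    RET_UV(650000), RET_UV(500000), RET_UV(650000), RET_UV(650000),
    RET_UV(650000), RET_UV(650000), RET_UV(650000), RET_UV(650000),
};
```

Setting `RETENTION_MARGIN_UV` back to 0 restores the original retention voltages without touching the individual entries.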

    Bump up vdd_cx by 50 mV.
    \rpm_proc\core\power\railway_v2\src\8916\railway_config.c
    // VDDCX
    {
        .vreg_name      = "vddcx",

        .vreg_type      = RPM_SMPS_A_REQ,
        .vreg_num       = 1,

        .pm_rail_id     = PM_RAILWAY_CX,
        .pmic_step_size = 12500,     // not used

        .initial_corner = RAILWAY_NOMINAL,

        .supports_explicit_voltage_requests = true,

        .default_uvs = (const unsigned[])
        {
            0,                      // RAILWAY_NO_REQUEST
            700000+50000,                 // RAILWAY_RETENTION
            1050000+50000,                // RAILWAY_SVS_KRAIT
            1050000+50000,                // RAILWAY_SVS_SOC
            1150000+50000,                // RAILWAY_NOMINAL
            1287500+50000,                // RAILWAY_TURBO
            1287500+50000,                // RAILWAY_TURBO_HIGH
            1287500+50000,                // RAILWAY_SUPER_TURBO
            1287500+50000,                // RAILWAY_SUPER_TURBO_NO_CPR
        },
    },
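
Before building, it can be worth confirming that every bumped value is still a multiple of the regulator step; 50000 uV is exactly four steps of 12500 uV. The following is a minimal host-side sketch, assuming the 12500 uV `pmic_step_size` from the config is the step that matters. The helper `all_step_aligned` and the array name are hypothetical and not part of the RPM sources.

```c
#include <stddef.h>

/* Returns 1 if every entry is a multiple of step_uv, 0 otherwise. */
static int all_step_aligned(const unsigned *uvs, size_t count, unsigned step_uv)
{
    for (size_t i = 0; i < count; i++) {
        if (uvs[i] % step_uv != 0)
            return 0;
    }
    return 1;
}

/* The bumped vdd_cx corner voltages from the config above, in uV. */
static const unsigned vddcx_bumped_uvs[] = {
    0, 700000+50000, 1050000+50000, 1050000+50000, 1150000+50000,
    1287500+50000, 1287500+50000, 1287500+50000, 1287500+50000,
};
```

Calling `all_step_aligned(vddcx_bumped_uvs, 9, 12500)` on this table reports that all entries are aligned; a margin that is not a multiple of the step, such as 5000 uV, would fail the check.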

  6. Boost mx
    Bump vdd_mx up to the maximum value.

(1) SBL change:
The following code changes the range of LDO3 (vdd_mx) in the PMIC registers:
pm_device_post_init(void)
{
    pm_err_flag_type err = PM_ERR_FLAG__SUCCESS;
    // LDO_UL_LL_CONFIG:
+   err |= pm_spmi_lite_write_byte(1, 0x42d0, 0xA5, 0);
+   err |= pm_spmi_lite_write_byte(1, 0x426a, 0x2, 0);
+   err |= pm_spmi_lite_write_byte(1, 0x42d0, 0xA5, 0);
+   err |= pm_spmi_lite_write_byte(1, 0x426b, 0x34, 0);
+   err |= pm_spmi_lite_write_byte(1, 0x4240, 0x2, 0);
+   err |= pm_spmi_lite_write_byte(1, 0x4241, 0x34, 0);
}
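
The write sequence above can also be expressed as a table, which makes the order and values easier to review. This is a sketch only: `spmi_write_stub` stands in for `pm_spmi_lite_write_byte` with an assumed signature (slave id, address, data, priority) inferred from the calls above, and a plain `unsigned` replaces `pm_err_flag_type`. The real SBL types and return values may differ.

```c
#include <stddef.h>
#include <stdint.h>

typedef unsigned pm_err_sketch_t;   /* stand-in for pm_err_flag_type */

/* Stub state: remembers the last write and counts calls. */
static uint16_t last_addr;
static uint8_t  last_data;
static size_t   write_count;

/* Assumed signature; the stub always reports success (0). */
static pm_err_sketch_t spmi_write_stub(uint32_t slave, uint16_t addr,
                                       uint8_t data, uint8_t prio)
{
    (void)slave;
    (void)prio;
    last_addr = addr;
    last_data = data;
    write_count++;
    return 0;
}

struct reg_write { uint16_t addr; uint8_t data; };

/* Same addresses, values, and order as the SBL change above. */
static const struct reg_write ldo3_writes[] = {
    {0x42d0, 0xA5}, {0x426a, 0x02},
    {0x42d0, 0xA5}, {0x426b, 0x34},
    {0x4240, 0x02},
    {0x4241, 0x34},
};

/* OR all return codes together, as the original code does with err |= ... */
static pm_err_sketch_t apply_ldo3_writes(void)
{
    pm_err_sketch_t err = 0;
    for (size_t i = 0; i < sizeof ldo3_writes / sizeof ldo3_writes[0]; i++)
        err |= spmi_write_stub(1, ldo3_writes[i].addr, ldo3_writes[i].data, 0);
    return err;
}
```

Accumulating the error with `|=` means one failed write is still visible at the end without stopping the sequence early, which matches the style of the original snippet.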

(2) RPM change:
\rpm_proc\core\power\railway_v2\src\8916\railway_config.c
    // Must init VDDMX first, as voting on the other rails will cause Mx changes to occur.
    {
        .vreg_name      = "vddmx",
        .vreg_type      = RPM_LDO_A_REQ,
        .vreg_num       = 3,
        .pm_rail_id     = PM_RAILWAY_MX,
        .pmic_step_size = 12500,     // not used
        .initial_corner = RAILWAY_NOMINAL,
        .supports_explicit_voltage_requests = true,
        .default_uvs = (const unsigned[])
        {
            0,                      // RAILWAY_NO_REQUEST
            750000+50000,                 // RAILWAY_RETENTION
            1050000+50000,                // RAILWAY_SVS_KRAIT
            1050000+50000,                // RAILWAY_SVS_SOC
            1150000+50000,                // RAILWAY_NOMINAL
            1287500+50000,                // RAILWAY_TURBO
            1287500+50000,                // RAILWAY_TURBO_HIGH
            1287500+50000,                // RAILWAY_SUPER_TURBO
            1287500+50000,                // RAILWAY_SUPER_TURBO_NO_CPR
        },
    },

(3) RPM PMIC config change:
rpm_proc\core\systemdrivers\pmic\config\msm8916\pm_config_target.c
The following change raises the upper limit of the LDO3 (vdd_mx) range in software.
In pm_rpm_ldo_rail_info_type ldo_rail_a[NUM_OF_LDO_A], change from:
{5, 62.5, 0, PM_ACCESS_ALLOWED, PM_ALWAYS_ON, PM_NPA_SW_MODE_LDO__IPEAK, PM_NPA_BYPASS_DISALLOWED, 750, 1350}, // LDO3 N1200_Stepper
to:
{5, 62.5, 0, PM_ACCESS_ALLOWED, PM_ALWAYS_ON, PM_NPA_SW_MODE_LDO__IPEAK, PM_NPA_BYPASS_DISALLOWED, 750, 1400}, // LDO3 N1200_Stepper
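
The last two fields of the LDO3 entry appear to be the allowed minimum and maximum in mV, while the railway table is written in uV. That reading is an assumption based on the values shown. Under it, a minimal sketch of the unit conversion and range check looks like this, with `ldo_uv_in_range` as a hypothetical helper:

```c
/* Returns 1 when request_uv lies within [min_mv, max_mv] (limits given in mV), else 0. */
static int ldo_uv_in_range(unsigned request_uv, unsigned min_mv, unsigned max_mv)
{
    return request_uv >= min_mv * 1000u && request_uv <= max_mv * 1000u;
}
```

For example, a request of 1400000 uV falls outside a 750 to 1350 mV range but inside a 750 to 1400 mV range.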

  7. Disable RPM CPR
    /rpm_proc/core/power/rbcpr/src/target/8916/rbcpr_bsp.c
    Apply this change to all variables (gf_tn1_cpr_settings / gf_tn3_cpr_settings / tsmc_tn1_cpr_settings / tsmc_tn3_cpr_settings):
    -   .rbcpr_enablement=RBCPR_ENABLED_CLOSED_LOOP,
    +   .rbcpr_enablement=RBCPR_DISABLED,
  8. If the issue still happens after all the tests above, do a swap test of the MSM chip.
  9. If the issue follows the MSM chip, we need to RMA it.

Note: some of these code changes will differ depending on the build; the code listed here is for reference only.
