Question: When a crash occurs but the dump gives no clue about the cause, how should it be handled?
For example: unknown reset, watchdog bite, memory corruption, or bit-flip crashes.
Answer: Many crashes show up as an unknown watchdog bite or reset, as seemingly random memory corruption or a bit flip, or as a strange kernel panic that does not look like a software logic bug. For such crashes, check the hardware side: memory configuration, PVS, CPR, voltage, clock settings, and so on.
This kind of issue usually needs a lot of testing to narrow it down and find the right direction for resolving the unknown reset.
The following tests are normally used:
- Check the PDN report
The customer's hardware team raises a case for PDN simulation. Check the resulting report to confirm that every power rail meets the Qualcomm requirement.
The rails to check are normally vdd_cx, vdd_mx, and vdd_apc (the power rail for the apps cores).
If any of them is outside the Qualcomm requirement, boost it for testing; the methods for boosting each rail are listed below.
- Run the DDR QBlizzard test to test the DDR. If you do not know how to use QBlizzard, refer to doc 80-NH759-1 QBlizzard_2_10_UG.pdf.
- Disable CPR. If the issue can no longer be reproduced with CPR disabled, adjust your CPR settings to boost vdd_apc step by step.
arch/arm/boot/dts/qcom/msm8916-regulator.dtsi
<0 0 2 4 8>,
<1 0 2 4 7>;
qcom,cpr-quot-adjust-scaling-factor-max = <650>;
- qcom,cpr-enable;
};
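One way to raise vdd_apc "step by step" is to move the floor value up in fixed increments until the issue stops reproducing. The sketch below only illustrates that arithmetic: the helper name `next_apc_floor_uv` and the 12500 uV step are assumptions, not part of the kernel source, and each resulting value would still be entered by hand in the device tree for the next test build.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical increment per test iteration, in microvolts. */
#define APC_STEP_UV 12500u

/*
 * Return the next floor voltage to try: one step above the current
 * floor, but never above the ceiling of that corner.
 */
uint32_t next_apc_floor_uv(uint32_t current_uv, uint32_t ceiling_uv)
{
    assert(current_uv <= ceiling_uv);
    uint32_t next = current_uv + APC_STEP_UV;
    return (next > ceiling_uv) ? ceiling_uv : next;
}
```

For example, with a 1050000 uV floor and a 1350000 uV ceiling, the first step gives 1062500 uV; once the floor reaches the ceiling, the helper keeps returning the ceiling.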
- Raise the vdd_apc voltage
qcom,cpr-voltage-ceiling = <1050000 1150000 1350000>;
qcom,cpr-voltage-floor = <1050000 1050000 1137500>;
-->
qcom,cpr-voltage-ceiling = <1350000 1350000 1350000>;
qcom,cpr-voltage-floor = <1350000 1350000 1350000>;
- Raise the cx voltage
Bump up the retention voltage by 50 mV.
/rpm_proc/core/power/sleep/src/8916/sleep_target_config.c
// retention programmed in uV ( 600000uV = 0.6V )
static const uint32 vddcx_pvs_retention_data[8] =
{
/* 000 */ 650000+50000,
/* 001 */ 500000+50000,
/* 010 */ 650000+50000,
/* 011 */ 650000+50000,
/* 100 */ 650000+50000,
/* 101 */ 650000+50000,
/* 110 */ 650000+50000,
/* 111 */ 650000+50000
};
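The table above appears to be indexed by a 3-bit value (000 to 111). A minimal host-side sketch of such a lookup follows, with the 50 mV margin kept in a single macro so it can be changed between test builds. The function name `cx_retention_uv`, the macro `RETENTION_BUMP_UV`, and the way the 3-bit index is obtained are assumptions for illustration, not taken from the RPM source.

```c
#include <assert.h>
#include <stdint.h>

/* Test margin added to every retention entry, in microvolts. */
#define RETENTION_BUMP_UV 50000u

/* Base retention voltages from the snippet above, before the bump. */
static const uint32_t vddcx_pvs_retention_base[8] = {
    /* 000 */ 650000u,
    /* 001 */ 500000u,
    /* 010 */ 650000u,
    /* 011 */ 650000u,
    /* 100 */ 650000u,
    /* 101 */ 650000u,
    /* 110 */ 650000u,
    /* 111 */ 650000u
};

/* Look up the retention voltage for a 3-bit index and add the margin. */
uint32_t cx_retention_uv(uint32_t pvs_bits)
{
    assert(pvs_bits < 8u);
    return vddcx_pvs_retention_base[pvs_bits] + RETENTION_BUMP_UV;
}
```

With these values, index 000 yields 700000 uV and index 001 yields 550000 uV.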
Bump up vdd_cx by 50 mV.
\rpm_proc\core\power\railway_v2\src\8916\railway_config.c
// VDDCX
{
.vreg_name = "vddcx",
.vreg_type = RPM_SMPS_A_REQ,
.vreg_num = 1,
.pm_rail_id = PM_RAILWAY_CX,
.pmic_step_size = 12500, // not used
.initial_corner = RAILWAY_NOMINAL,
.supports_explicit_voltage_requests = true,
.default_uvs = (const unsigned[])
{
0, // RAILWAY_NO_REQUEST
700000+50000, // RAILWAY_RETENTION
1050000+50000, // RAILWAY_SVS_KRAIT
1050000+50000, // RAILWAY_SVS_SOC
1150000+50000, // RAILWAY_NOMINAL
1287500+50000, // RAILWAY_TURBO
1287500+50000, // RAILWAY_TURBO_HIGH
1287500+50000, // RAILWAY_SUPER_TURBO
1287500+50000, // RAILWAY_SUPER_TURBO_NO_CPR
},
…
},
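In the corner table above, every entry gets the same 50000 uV offset except RAILWAY_NO_REQUEST, which stays at 0. That rule can be sketched as a small helper; the name `bump_corners` is hypothetical and the code is a host-side illustration, not part of the RPM build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Add bump_uv to every corner voltage, leaving 0 entries untouched
 * (0 is the value used for RAILWAY_NO_REQUEST in the table above).
 */
void bump_corners(const uint32_t *base_uv, uint32_t *out_uv,
                  size_t count, uint32_t bump_uv)
{
    assert(base_uv != NULL && out_uv != NULL);
    for (size_t i = 0; i < count; i++) {
        out_uv[i] = (base_uv[i] == 0u) ? 0u : base_uv[i] + bump_uv;
    }
}
```

Applied to the vddcx base values with a 50000 uV bump, RETENTION becomes 750000 uV, NOMINAL becomes 1200000 uV, and the TURBO corners become 1337500 uV.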
- Boost mx
Bump up vdd_mx to the maximum value.
(1) SBL change:
The following code changes the range of LDO3 (vdd_mx) in the PMIC register:
pm_device_post_init(void)
{
pm_err_flag_type err = PM_ERR_FLAG__SUCCESS;
//LDO_UL_LL_CONFIG:
+ err |= pm_spmi_lite_write_byte(1, 0x42d0, 0xA5, 0);
+ err |= pm_spmi_lite_write_byte(1, 0x426a, 0x2, 0);
+ err |= pm_spmi_lite_write_byte(1, 0x42d0, 0xA5, 0);
+ err |= pm_spmi_lite_write_byte(1, 0x426b, 0x34, 0);
+ err |= pm_spmi_lite_write_byte(1, 0x4240, 0x2, 0);
+ err |= pm_spmi_lite_write_byte(1, 0x4241, 0x34, 0);
}
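The six writes above are order-sensitive: each write to 0x42d0 is immediately followed by a write to 0x426a or 0x426b, which looks like an unlock-then-write sequence (an interpretation, not something this note states). A table-driven sketch makes the order explicit and keeps the same OR-accumulated error handling. Everything here is a host-side illustration: `fake_write_byte` is a hypothetical stand-in for the real SPMI write and only records calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One register write: SPMI slave id, register address, value. */
struct reg_write {
    uint32_t slave_id;
    uint16_t addr;
    uint8_t  data;
};

/* The six writes from the SBL change above, in the same order. */
static const struct reg_write ldo3_range_writes[] = {
    {1, 0x42d0, 0xA5},
    {1, 0x426a, 0x02},
    {1, 0x42d0, 0xA5},
    {1, 0x426b, 0x34},
    {1, 0x4240, 0x02},
    {1, 0x4241, 0x34},
};

#define NUM_WRITES (sizeof(ldo3_range_writes) / sizeof(ldo3_range_writes[0]))

/* Hypothetical host-side writer: counts calls and remembers the last address. */
unsigned fake_write_count = 0;
uint16_t fake_last_addr = 0;

int fake_write_byte(uint32_t slave_id, uint16_t addr, uint8_t data)
{
    (void)slave_id;
    (void)data;
    fake_write_count++;
    fake_last_addr = addr;
    return 0; /* 0 means success in this sketch */
}

typedef int (*write_byte_fn)(uint32_t slave_id, uint16_t addr, uint8_t data);

/* Issue every write in table order and OR the results together. */
int apply_ldo3_range(write_byte_fn write_byte)
{
    int err = 0;
    assert(write_byte != NULL);
    for (size_t i = 0; i < NUM_WRITES; i++) {
        const struct reg_write *w = &ldo3_range_writes[i];
        err |= write_byte(w->slave_id, w->addr, w->data);
    }
    return err;
}
```

Calling `apply_ldo3_range(fake_write_byte)` issues all six writes in order and returns 0 when none of them reports an error.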
(2) RPM change:
\rpm_proc\core\power\railway_v2\src\8916\railway_config.c
// Must init VDDMX first, as voting on the other rails will cause Mx changes to occur.
{
.vreg_name = "vddmx",
.vreg_type = RPM_LDO_A_REQ,
.vreg_num = 3,
.pm_rail_id = PM_RAILWAY_MX,
.pmic_step_size = 12500, // not used
.initial_corner = RAILWAY_NOMINAL,
.supports_explicit_voltage_requests = true,
.default_uvs = (const unsigned[])
{
0, // RAILWAY_NO_REQUEST
750000+50000, // RAILWAY_RETENTION
1050000+50000, // RAILWAY_SVS_KRAIT
1050000+50000, // RAILWAY_SVS_SOC
1150000+50000, // RAILWAY_NOMINAL
1287500+50000, // RAILWAY_TURBO
1287500+50000, // RAILWAY_TURBO_HIGH
1287500+50000, // RAILWAY_SUPER_TURBO
1287500+50000, // RAILWAY_SUPER_TURBO_NO_CPR
},
(3) \rpm_proc\core\systemdrivers\pmic\config\msm8916\pm_config_target.c
The following code changes the range of LDO3 (vdd_mx) in software.
In pm_rpm_ldo_rail_info_type ldo_rail_a[NUM_OF_LDO_A] =
Change from:
{5, 62.5, 0, PM_ACCESS_ALLOWED, PM_ALWAYS_ON, PM_NPA_SW_MODE_LDO__IPEAK, PM_NPA_BYPASS_DISALLOWED, 750, 1350}, // LDO3 N1200_Stepper
To:
{5, 62.5, 0, PM_ACCESS_ALLOWED, PM_ALWAYS_ON, PM_NPA_SW_MODE_LDO__IPEAK, PM_NPA_BYPASS_DISALLOWED, 750, 1400}, // LDO3 N1200_Stepper
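The software range in (3) has to cover the corner voltages requested in (2). A minimal check of that relationship is sketched below, assuming the last two fields of the `ldo_rail_a` entry are the minimum and maximum voltage in mV; the helper name `corner_in_ldo_range` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a corner voltage (in uV) lies inside the LDO range given in mV. */
bool corner_in_ldo_range(uint32_t corner_uv, uint32_t min_mv, uint32_t max_mv)
{
    assert(min_mv <= max_mv);
    return corner_uv >= min_mv * 1000u && corner_uv <= max_mv * 1000u;
}
```

The bumped TURBO corner (1287500+50000 = 1337500 uV) fits in both the original 750 to 1350 range and the widened 750 to 1400 range. One more 50 mV step (1387500 uV) would fit only in the widened range.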
- Disable RPM CPR
/rpm_proc/core/power/rbcpr/src/target/8916/rbcpr_bsp.c
Apply the change to all four variables (gf_tn1_cpr_settings / gf_tn3_cpr_settings / tsmc_tn1_cpr_settings / tsmc_tn3_cpr_settings):
- .rbcpr_enablement = RBCPR_ENABLED_CLOSED_LOOP,
+ .rbcpr_enablement = RBCPR_DISABLED,
- If the issue still occurs after all of the tests above, do a swap test of the MSM chip.
- If the issue follows the MSM chip, an RMA is needed.
Note: Some of these code changes differ between builds; the code listed here is for reference only.
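The overall order of the tests in this note can be summarized as a small decision helper. This is only a sketch for keeping track of results during debugging: all type, field, and enumerator names are hypothetical, and the mapping from each result to an action is a simplified reading of the steps above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Results collected from the tests described in this note. */
struct triage_result {
    bool pdn_within_spec;        /* PDN report meets the rail requirements */
    bool ddr_test_passed;        /* DDR test passed */
    bool fixed_by_disabling_cpr; /* issue no longer reproduces with CPR off */
    bool follows_msm_chip;       /* swap test: issue moves with the chip */
};

enum triage_action {
    ACTION_BOOST_FAILING_RAIL,
    ACTION_INVESTIGATE_DDR,
    ACTION_TUNE_CPR,
    ACTION_RMA_CHIP,
    ACTION_INVESTIGATE_FURTHER
};

/* Pick the next action, checking the results in the order the note lists them. */
enum triage_action next_action(const struct triage_result *r)
{
    assert(r != NULL);
    if (!r->pdn_within_spec)       return ACTION_BOOST_FAILING_RAIL;
    if (!r->ddr_test_passed)       return ACTION_INVESTIGATE_DDR;
    if (r->fixed_by_disabling_cpr) return ACTION_TUNE_CPR;
    if (r->follows_msm_chip)       return ACTION_RMA_CHIP;
    return ACTION_INVESTIGATE_FURTHER;
}
```

For instance, when the PDN report and DDR test are clean, disabling CPR does not help, and the issue follows the chip in the swap test, the helper returns `ACTION_RMA_CHIP`.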