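CTS Issue Analysis 13: CTS Issue Analysis 10 (continued)

CTS/GTS Issue Analysis 13

Problem analysis

This is not the first time this problem has shown up; see CTS Issue Analysis 10. At that time there were more urgent issues, so the analysis stopped at one finding: a large number of CompatibilityTestSuite instances were being held in memory, and that caused the error during retry.

Now it has happened again, so it is worth investigating properly to make sure it does not recur.

Retry command: run retry --retry 0 --shard-count 2 -s 7c6252f -s 7c62472

Error log from the terminal: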

java.lang.OutOfMemoryError: GC overhead limit exceeded
Dumping heap to java_pid26338.hprof ...
Heap dump file created [5553157593 bytes in 101.829 secs]
01-29 16:09:47 E/CommandScheduler: GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.HashMap.newNode(HashMap.java:1747)
at java.util.HashMap.putVal(HashMap.java:631)
at java.util.HashMap.put(HashMap.java:612)
at java.util.HashSet.add(HashSet.java:220)
at java.util.AbstractCollection.addAll(AbstractCollection.java:344)
at com.android.tradefed.config.OptionSetter.setFieldValue(OptionSetter.java:452)
at com.android.tradefed.config.OptionSetter.setFieldValue(OptionSetter.java:549)
at com.android.tradefed.config.OptionCopier.copyOptions(OptionCopier.java:49)
at com.android.tradefed.config.OptionCopier.copyOptionsNoThrow(OptionCopier.java:60)
at com.android.tradefed.testtype.suite.ITestSuite.split(ITestSuite.java:662)
at com.android.compatibility.common.tradefed.testtype.retry.RetryFactoryTest.split(RetryFactoryTest.java:122)
at com.android.tradefed.invoker.shard.ShardHelper.shardTest(ShardHelper.java:123)
at com.android.tradefed.invoker.shard.ShardHelper.shardConfig(ShardHelper.java:30)
at com.android.tradefed.invoker.shard.StrictShardHelper.shardConfig(StrictShardHelper.java:51)
at com.android.tradefed.invoker.InvocationExecution.shardConfig(InvocationExecution.java:149)
at com.android.tradefed.invoker.TestInvocation.invoke(TestInvocation.java:656)
at com.android.tradefed.command.CommandScheduler$InvocationThread.run(CommandScheduler.java:1357)

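First, the stack trace shows the code path at the moment of failure. Following it tells us why so much memory is being used.

How the data structures are organized in a multi-device retry

From the earlier analysis we know the problem comes from a large number of CompatibilityTestSuite instances, each holding a large number of exclude entries. So let's follow the stack and work out how the CTS data structures are organized during a multi-device retry.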
tools/tradefederation/core/src/com/android/tradefed/invoker/shard/ShardHelper.java

    /**
     * Attempt to shard the configuration into sub-configurations, to be re-scheduled to run on
     * multiple resources in parallel.
     *
     * <p>A successful shard action renders the current config empty, and invocation should not
     * proceed.
     *
     * @see IShardableTest
     * @see IRescheduler
     * @param config the current {@link IConfiguration}.
     * @param context the {@link IInvocationContext} holding the tests information.
     * @param rescheduler the {@link IRescheduler}
     * @return true if test was sharded. Otherwise return <code>false</code>
     */
    @Override
    public boolean shardConfig(
            IConfiguration config, IInvocationContext context, IRescheduler rescheduler) {
        List<IRemoteTest> shardableTests = new ArrayList<IRemoteTest>();
        boolean isSharded = false;
        Integer shardCount = config.getCommandOptions().getShardCount();
        for (IRemoteTest test : config.getTests()) {
            // shardTest splits the test for the retry. At this point the test holds almost
            // nothing: only the known failures from cts-known-failures.xml, in its exclude list.
            isSharded |= shardTest(shardableTests, test, shardCount, context);
        }
        if (!isSharded) {
            return false;
        }
        // shard this invocation!
        // create the TestInvocationListener that will collect results from all the shards,
        // and forward them to the original set of listeners (minus any ISharddableListeners)
        // once all shards complete
        int expectedShard = shardableTests.size();
        if (shardCount != null) {
            expectedShard = Math.min(shardCount, shardableTests.size());
        }
        ShardMasterResultForwarder resultCollector =
                new ShardMasterResultForwarder(buildMasterShardListeners(config), expectedShard);

        resultCollector.invocationStarted(context);
        synchronized (shardableTests) {
            // When shardCount is available only create 1 poller per shard
            // TODO: consider aggregating both case by picking a predefined shardCount if not
            // available (like 4) for autosharding.
            if (shardCount != null) {
                // We shuffle the tests for best results: avoid having the same module sub-tests
                // contiguously in the list.
                Collections.shuffle(shardableTests);
                int maxShard = Math.min(shardCount, shardableTests.size());
                CountDownLatch tracker = new CountDownLatch(maxShard);
                for (int i = 0; i < maxShard; i++) {
                    IConfiguration shardConfig = config.clone();
                    shardConfig.setTest(new TestsPoolPoller(shardableTests, tracker));
                    rescheduleConfig(shardConfig, config, context, rescheduler, resultCollector);
                }
            } else {
                CountDownLatch tracker = new CountDownLatch(shardableTests.size());
                for (IRemoteTest testShard : shardableTests) {
                    CLog.i("Rescheduling sharded config...");
                    IConfiguration shardConfig = config.clone();
                    if (config.getCommandOptions().shouldUseDynamicSharding()) {
                        shardConfig.setTest(new TestsPoolPoller(shardableTests, tracker));
                    } else {
                        shardConfig.setTest(testShard);
                    }
                    rescheduleConfig(shardConfig, config, context, rescheduler, resultCollector);
                }
            }
        }
        // clean up original builds
        for (String deviceName : context.getDeviceConfigNames()) {
            config.getDeviceConfigByName(deviceName)
                    .getBuildProvider()
                    .cleanUp(context.getBuildInfo(deviceName));
        }
        return true;
    }

    // ...

    /**
     * Attempt to shard given {@link IRemoteTest}.
     *
     * @param shardableTests the list of {@link IRemoteTest}s to add to
     * @param test the {@link IRemoteTest} to shard
     * @param shardCount attempted number of shard, can be null.
     * @param context the {@link IInvocationContext} of the current invocation.
     * @return <code>true</code> if test was sharded
     */
    private static boolean shardTest(
            List<IRemoteTest> shardableTests,
            IRemoteTest test,
            Integer shardCount,
            IInvocationContext context) {
        boolean isSharded = false;
        if (test instanceof IShardableTest) {
            // inject device and build since they might be required to shard.
            if (test instanceof IBuildReceiver) {
                ((IBuildReceiver) test).setBuild(context.getBuildInfos().get(0));
            }
            if (test instanceof IDeviceTest) {
                ((IDeviceTest) test).setDevice(context.getDevices().get(0));
            }
            if (test instanceof IMultiDeviceTest) {
                ((IMultiDeviceTest) test).setDeviceInfos(context.getDeviceBuildMap());
            }
            if (test instanceof IInvocationContextReceiver) {
                ((IInvocationContextReceiver) test).setInvocationContext(context);
            }
            // (everything above just sets attributes on the test)
            IShardableTest shardableTest = (IShardableTest) test;
            Collection<IRemoteTest> shards = null;
            // Give the shardCount hint to tests if they need it.
            if (shardCount != null) {
                // Taken when a multi-device retry specifies shardCount:
                // this calls RetryFactoryTest.split.
                shards = shardableTest.split(shardCount);
            } else {
                shards = shardableTest.split();
            }
            if (shards != null) {
                shardableTests.addAll(shards);
                isSharded = true;
            }
        }
        if (!isSharded) {
            shardableTests.add(test);
        }
        return isSharded;
    }
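The pooled branch above (the one taken when shardCount is set) can be looked at in isolation. The following is only a simplified model, not Tradefed code: ShardPoolSketch and runPool are hypothetical names, plain strings stand in for IRemoteTest, and a bare thread stands in for TestsPoolPoller. It shows the three ingredients of the listing: shuffle the shared list, start min(shardCount, size) pollers, and let each poller take items until the pool is empty.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class ShardPoolSketch {

    /** Drains a shared pool with min(shardCount, pool size) pollers; returns how many items ran. */
    static int runPool(List<String> tests, int shardCount) {
        List<String> pool = new ArrayList<>(tests);
        // Shuffle so that pieces of the same module are not contiguous in the pool.
        Collections.shuffle(pool);
        int maxShard = Math.min(shardCount, pool.size());
        CountDownLatch tracker = new CountDownLatch(maxShard);
        AtomicInteger executed = new AtomicInteger();
        for (int i = 0; i < maxShard; i++) {
            new Thread(() -> {
                while (true) {
                    synchronized (pool) {
                        if (pool.isEmpty()) {
                            break;
                        }
                        // Take one test; a real poller would run it on its device here.
                        pool.remove(pool.size() - 1);
                    }
                    executed.incrementAndGet();
                }
                tracker.countDown();
            }).start();
        }
        try {
            tracker.await(); // wait until every poller has found the pool empty
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return executed.get();
    }

    public static void main(String[] args) {
        // Five hypothetical work items shared by two pollers; prints 5.
        System.out.println(runPool(List.of("m1", "m2", "m3", "m4", "m5"), 2));
    }
}
```

The point to take away for this analysis is that the pollers share one list, so every element of that list stays reachable until it has been run.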

test/suite_harness/common/host-side/tradefed/src/com/android/compatibility/common/tradefed/testtype/retry/RetryFactoryTest.java

    @Override
    public Collection<IRemoteTest> split(int shardCountHint) {
        try {
            // Note these two statements: this is where the data structures get built.
            CompatibilityTestSuite test = loadSuite();
            return test.split(shardCountHint);
        } catch (DeviceNotAvailableException e) {
            CLog.e("Failed to shard the retry run.");
            CLog.e(e);
        }
        return null;
    }

Creating a CompatibilityTestSuite

    /**
     * Helper to create a {@link CompatibilityTestSuite} from previous results.
     */
    private CompatibilityTestSuite loadSuite() throws DeviceNotAvailableException {
        // Create a compatibility test and set it to run only what we want.
        CompatibilityTestSuite test = createTest();

        CompatibilityBuildHelper buildHelper = new CompatibilityBuildHelper(mBuildInfo);
        // Create the helper with all the options needed.
        RetryFilterHelper helper = createFilterHelper(buildHelper); // creates a RetryFilterHelper
        // TODO: we have access to the original command line, we should accommodate more re-run
        // scenario like when the original cts.xml config was not used.
        helper.validateBuildFingerprint(mDevice);
        helper.setCommandLineOptionsFor(test);
        helper.setCommandLineOptionsFor(this);
        helper.populateRetryFilters(); // this is where the exclude entries are added

        try {
            OptionSetter setter = new OptionSetter(test);
            for (String moduleArg : mModuleArgs) {
                setter.setOptionValue("compatibility:module-arg", moduleArg);
            }
            for (String testArg : mTestArgs) {
                setter.setOptionValue("compatibility:test-arg", testArg);
            }
        } catch (ConfigurationException e) {
            throw new RuntimeException(e);
        }

        test.setIncludeFilter(helper.getIncludeFilters());
        test.setExcludeFilter(helper.getExcludeFilters());
        test.setDevice(mDevice);
        test.setBuild(mBuildInfo);
        test.setAbiName(mAbiName);
        test.setPrimaryAbiRun(mPrimaryAbiRun);
        test.setSystemStatusChecker(mStatusCheckers);
        test.setInvocationContext(mContext);
        test.setConfiguration(mMainConfiguration);
        // reset the retry id - Ensure that retry of retry does not throw
        test.resetRetryId();
        test.isRetry();
        // clean the helper
        helper.tearDown();
        return test;
    }

test/suite_harness/common/host-side/tradefed/src/com/android/compatibility/common/tradefed/util/RetryFilterHelper.java

    /**
     * Constructor for a {@link RetryFilterHelper}.
     *
     * @param build a {@link CompatibilityBuildHelper} describing the build.
     * @param sessionId The ID of the session to retry.
     * @param subPlan The name of a subPlan to be used. Can be null.
     * @param includeFilters The include module filters to apply
     * @param excludeFilters The exclude module filters to apply
     * @param abiName The name of abi to use. Can be null.
     * @param moduleName The name of the module to run. Can be null.
     * @param testName The name of the test to run. Can be null.
     * @param retryType The type of results to retry. Can be null.
     */
    public RetryFilterHelper(CompatibilityBuildHelper build, int sessionId, String subPlan,
            Set<String> includeFilters, Set<String> excludeFilters, String abiName,
            String moduleName, String testName, RetryType retryType) {
        this(build, sessionId);
        mSubPlan = subPlan;
        mIncludeFilters.addAll(includeFilters);
        mExcludeFilters.addAll(excludeFilters);
        mAbiName = abiName;
        mModuleName = moduleName;
        mTestName = testName;
        mRetryType = retryType;
    }

Up to this point mExcludeFilters still contains only the known failures recorded in cts-known-failures.xml. The key step is populateRetryFilters:

    /**
     * Populate mRetryIncludes and mRetryExcludes based on the options and the result set for
     * this instance of RetryFilterHelper.
     */
    public void populateRetryFilters() {
        mRetryIncludes = new HashSet<>(mIncludeFilters); // reset for each population
        mRetryExcludes = new HashSet<>(mExcludeFilters); // reset for each population
        if (RetryType.CUSTOM.equals(mRetryType)) {
            Set<String> customIncludes = new HashSet<>(mIncludeFilters);
            Set<String> customExcludes = new HashSet<>(mExcludeFilters);
            // A retry normally specifies no subplan, so this branch is not taken.
            if (mSubPlan != null) {
                ISubPlan retrySubPlan = SubPlanHelper.getSubPlanByName(mBuild, mSubPlan);
                customIncludes.addAll(retrySubPlan.getIncludeFilters());
                customExcludes.addAll(retrySubPlan.getExcludeFilters());
            }
            // If includes were added, only use those includes. Also use excludes added directly
            // or by subplan. Otherwise, default to normal retry.
            if (!customIncludes.isEmpty()) {
                mRetryIncludes.clear();
                mRetryIncludes.addAll(customIncludes);
                mRetryExcludes.addAll(customExcludes);
                return;
            }
        }
        // remove any extra filtering options
        // TODO(aaronholden) remove non-plan includes (e.g. those in cts-vendor-interface)
        // TODO(aaronholden) remove non-known-failure excludes
        mModuleName = null;
        mTestName = null;
        mSubPlan = null;
        populateFiltersBySubPlan();
        populatePreviousSessionFilters();
    }

So execution continues into populateFiltersBySubPlan:

    /* Generation of filters based on previous sessions is implemented thoroughly in SubPlanHelper,
     * and retry filter generation is just a subset of the use cases for the subplan retry logic.
     * Use retry type to determine which result types SubPlanHelper targets. */
    public void populateFiltersBySubPlan() {
        SubPlanHelper retryPlanCreator = new SubPlanHelper();
        retryPlanCreator.setResult(getResult());
        if (RetryType.FAILED.equals(mRetryType)) {
            // retry only failed tests
            retryPlanCreator.addResultType(SubPlanHelper.FAILED);
        } else if (RetryType.NOT_EXECUTED.equals(mRetryType)){
            // retry only not executed tests
            retryPlanCreator.addResultType(SubPlanHelper.NOT_EXECUTED);
        } else {
            // retry both failed and not executed tests
            retryPlanCreator.addResultType(SubPlanHelper.FAILED);
            retryPlanCreator.addResultType(SubPlanHelper.NOT_EXECUTED);
        }
        try {
            // The include and exclude lists built by SubPlanHelper are merged in here and
            // later end up in the CompatibilityTestSuite.
            ISubPlan retryPlan = retryPlanCreator.createSubPlan(mBuild);
            mRetryIncludes.addAll(retryPlan.getIncludeFilters());
            mRetryExcludes.addAll(retryPlan.getExcludeFilters());
        } catch (ConfigurationException e) {
            throw new RuntimeException ("Failed to create subplan for retry", e);
        }
    }

test/suite_harness/common/host-side/tradefed/src/com/android/compatibility/common/tradefed/result/SubPlanHelper.java
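createSubPlan is the most important step: it extracts information from the report being retried into the subplan's include list and exclude list, which populateFiltersBySubPlan then merges into mRetryIncludes and mRetryExcludes.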

    /**
     * Create a subplan derived from a result.
     * <p/>
     * {@link Option} values must be set before this is called.
     * @param buildHelper
     * @return subplan
     * @throws ConfigurationException
     */
    public ISubPlan createSubPlan(CompatibilityBuildHelper buildHelper)
            throws ConfigurationException {
        setupFields(buildHelper);
        ISubPlan subPlan = new SubPlan();

        // add filters from previous session to track which tests must run
        subPlan.addAllIncludeFilters(mIncludeFilters);
        subPlan.addAllExcludeFilters(mExcludeFilters);
        if (mLastSubPlan != null) {
            ISubPlan lastSubPlan = SubPlanHelper.getSubPlanByName(buildHelper, mLastSubPlan);
            subPlan.addAllIncludeFilters(lastSubPlan.getIncludeFilters());
            subPlan.addAllExcludeFilters(lastSubPlan.getExcludeFilters());
        }
        if (mModuleName != null) {
            addIncludeToSubPlan(subPlan, new TestFilter(mAbiName, mModuleName, mTestName));
        }
        Set<TestStatus> statusesToRun = getStatusesToRun();
        for (IModuleResult module : mResult.getModules()) {
            if (shouldRunModule(module)) {
                TestFilter moduleInclude =
                            new TestFilter(module.getAbi(), module.getName(), null /*test*/);
                if (shouldRunEntireModule(module)) {
                    // include entire module
                    // (every case in the module failed)
                    addIncludeToSubPlan(subPlan, moduleInclude);
                } else if (mResultTypes.contains(NOT_EXECUTED) && !module.isDone()) {
                    // add module include and test excludes
                    // (the module did not finish: done = false)
                    addIncludeToSubPlan(subPlan, moduleInclude);
                    for (ICaseResult caseResult : module.getResults()) {
                        for (ITestResult testResult : caseResult.getResults()) {
                            if (!statusesToRun.contains(testResult.getResultStatus())) {
                                TestFilter testExclude = new TestFilter(module.getAbi(),
                                        module.getName(), testResult.getFullName());
                                addExcludeToSubPlan(subPlan, testExclude);
                            }
                        }
                    }
                } else {
                    // Not-executed tests should not be rerun and/or this module is completed
                    // In any such case, it suffices to add includes for each test to rerun
                    // (the module finished, but some of its tests failed)
                    for (ICaseResult caseResult : module.getResults()) {
                        for (ITestResult testResult : caseResult.getResults()) {
                            if (statusesToRun.contains(testResult.getResultStatus())) {
                                TestFilter testInclude = new TestFilter(module.getAbi(),
                                        module.getName(), testResult.getFullName());
                                addIncludeToSubPlan(subPlan, testInclude);
                            }
                        }
                    }
                }
            } else {
                // module should not run, exclude entire module
                // (a module in which everything passed)
                TestFilter moduleExclude =
                        new TestFilter(module.getAbi(), module.getName(), null /*test*/);
                addExcludeToSubPlan(subPlan, moduleExclude);
            }
        }
        return subPlan;
    }
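To make the branches of createSubPlan easier to see, here is a reduced model of the per-module decision. It is a sketch under stated assumptions, not the real SubPlanHelper: RetryFilterSketch, ModuleResult, Filters and interrupted are hypothetical names, a test is reduced to passed or failed, filters are written as "abi module test" strings, and the shouldRunEntireModule branch is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RetryFilterSketch {

    /** Hypothetical stand-in for one module of a previous result. */
    static final class ModuleResult {
        final String abi;
        final String name;
        final boolean done;
        final Map<String, Boolean> passedByTest; // test name -> did it pass?

        ModuleResult(String abi, String name, boolean done, Map<String, Boolean> passedByTest) {
            this.abi = abi;
            this.name = name;
            this.done = done;
            this.passedByTest = passedByTest;
        }
    }

    static final class Filters {
        final List<String> includes = new ArrayList<>();
        final List<String> excludes = new ArrayList<>();
    }

    /** Mirrors the three remaining branches of the listing for a single module. */
    static Filters buildFilters(ModuleResult m) {
        Filters f = new Filters();
        String module = m.abi + " " + m.name;
        boolean hasFailure = m.passedByTest.containsValue(false);
        if (m.done && !hasFailure) {
            // Nothing to rerun: one exclude for the whole module.
            f.excludes.add(module);
        } else if (!m.done) {
            // Unfinished module: include the module, then exclude every test that passed.
            f.includes.add(module);
            for (Map.Entry<String, Boolean> t : m.passedByTest.entrySet()) {
                if (t.getValue()) {
                    f.excludes.add(module + " " + t.getKey());
                }
            }
        } else {
            // Finished module with failures: one include per failed test.
            for (Map.Entry<String, Boolean> t : m.passedByTest.entrySet()) {
                if (!t.getValue()) {
                    f.includes.add(module + " " + t.getKey());
                }
            }
        }
        return f;
    }

    /** Builds an unfinished module in which 'passed' tests had already passed. */
    static ModuleResult interrupted(int passed) {
        Map<String, Boolean> tests = new LinkedHashMap<>();
        for (int i = 0; i < passed; i++) {
            tests.put("Case#test" + i, true);
        }
        return new ModuleResult("arm64-v8a", "CtsDeqpTestCases", false, tests);
    }

    public static void main(String[] args) {
        Filters f = buildFilters(interrupted(350_000));
        // prints "1 include, 350000 excludes"
        System.out.println(f.includes.size() + " include, " + f.excludes.size() + " excludes");
    }
}
```

The asymmetry is the interesting part: a finished module costs one entry per failed test, while an unfinished one costs one entry per test that already passed.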

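At this point it is clear why CompatibilityTestSuite holds so many exclude entries. CtsDeqpTestCases did not complete: it was interrupted shortly before the end, so the module is marked as not done and every test that had already passed becomes its own exclude entry. This module has 350,000 cases in total (counting v7a or v8a alone).
The remaining initialization of CompatibilityTestSuite is not the focus of this article and is skipped here. Next, the logic of test.split(shardCountHint):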
tools/tradefederation/core/src/com/android/tradefed/testtype/suite/ITestSuite.java

    /** {@inheritDoc} */
    @Override
    public Collection<IRemoteTest> split(int shardCountHint) {
        if (shardCountHint <= 1 || mIsSharded) {
            // cannot shard or already sharded
            return null;
        }

        LinkedHashMap<String, IConfiguration> runConfig = loadAndFilter();
        if (runConfig.isEmpty()) {
            CLog.i("No config were loaded. Nothing to run.");
            return null;
        }
        injectInfo(runConfig);

        // We split individual tests on double the shardCountHint to provide better average.
        // The test pool mechanism prevent this from creating too much overhead.
        List<ModuleDefinition> splitModules =
                ModuleSplitter.splitConfiguration(
                        runConfig, shardCountHint, mShouldMakeDynamicModule);
        runConfig.clear();
        runConfig = null;
        // create an association of one ITestSuite <=> one ModuleDefinition as the smallest
        // execution unit supported.
        List<IRemoteTest> splitTests = new ArrayList<>();
        for (ModuleDefinition m : splitModules) {
            ITestSuite suite = createInstance();
            OptionCopier.copyOptionsNoThrow(this, suite);
            suite.mIsSharded = true;
            suite.mDirectModule = m;
            splitTests.add(suite);
        }
        // return the list of ITestSuite with their ModuleDefinition assigned
        return splitTests;
    }

First, the loadAndFilter logic:

    private LinkedHashMap<String, IConfiguration> loadAndFilter() {
        LinkedHashMap<String, IConfiguration> runConfig = loadTests();
        if (runConfig.isEmpty()) {
            CLog.i("No config were loaded. Nothing to run.");
            return runConfig;
        }
        if (mModuleMetadataIncludeFilter.isEmpty() && mModuleMetadataExcludeFilter.isEmpty()) {
            return runConfig;
        }
        LinkedHashMap<String, IConfiguration> filteredConfig = new LinkedHashMap<>();
        for (Entry<String, IConfiguration> config : runConfig.entrySet()) {
            if (!filterByConfigMetadata(
                    config.getValue(),
                    mModuleMetadataIncludeFilter,
                    mModuleMetadataExcludeFilter)) {
                // if the module config did not pass the metadata filters, it's excluded
                // from execution.
                continue;
            }
            if (!filterByRunnerType(config.getValue(), mAllowedRunners)) {
                // if the module config did not pass the runner type filter, it's excluded from
                // execution.
                continue;
            }
            filterPreparers(config.getValue(), mAllowedPreparers);
            filteredConfig.put(config.getKey(), config.getValue());
        }
        runConfig.clear();
        return filteredConfig;
    }

tools/tradefederation/core/src/com/android/tradefed/testtype/suite/BaseTestSuite.java
loadTests first reorganizes mIncludeFilters and mExcludeFilters into mIncludeFiltersParsed and mExcludeFiltersParsed:

    /** {@inheritDoc} */
    @Override
    public LinkedHashMap<String, IConfiguration> loadTests() {
        try {
            File testsDir = getTestsDir();
            setupFilters(testsDir);
            Set<IAbi> abis = getAbis(getDevice());

            // Create and populate the filters here
            // (parsed into <String, List> pairs: the key is the module name, the value is
            // the list of its tests)
            SuiteModuleLoader.addFilters(mIncludeFilters, mIncludeFiltersParsed, abis);
            SuiteModuleLoader.addFilters(mExcludeFilters, mExcludeFiltersParsed, abis);

            CLog.d(
                    "Initializing ModuleRepo\nABIs:%s\n"
                            + "Test Args:%s\nModule Args:%s\nIncludes:%s\nExcludes:%s",
                    abis, mTestArgs, mModuleArgs, mIncludeFiltersParsed, mExcludeFiltersParsed);
            mModuleRepo =
                    createModuleLoader(
                            mIncludeFiltersParsed, mExcludeFiltersParsed, mTestArgs, mModuleArgs);
            // Actual loading of the configurations.
            // (loads the configs of the modules that are going to run)
            return loadingStrategy(abis, testsDir, mSuitePrefix, mSuiteTag);
        } catch (DeviceNotAvailableException | FileNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    /**
     * Default loading strategy will load from the resources and the tests directory. Can be
     * extended or replaced.
     *
     * @param abis The set of abis to run against.
     * @param testsDir The tests directory.
     * @param suitePrefix A prefix to filter the resource directory.
     * @param suiteTag The suite tag a module should have to be included. Can be null.
     * @return A list of loaded configuration for the suite.
     */
    public LinkedHashMap<String, IConfiguration> loadingStrategy(
            Set<IAbi> abis, File testsDir, String suitePrefix, String suiteTag) {
        LinkedHashMap<String, IConfiguration> loadedConfigs = new LinkedHashMap<>();
        // Load configs that are part of the resources
        if (!mSkipJarLoading) {
            loadedConfigs.putAll(
                    getModuleLoader().loadConfigsFromJars(abis, suitePrefix, suiteTag));
        }

        // Load the configs that are part of the tests dir
        if (mConfigPatterns.isEmpty()) {
            // If no special pattern was configured, use the default configuration patterns we know
            mConfigPatterns.add(".*\\.config");
            mConfigPatterns.add(".*\\.xml");
        }
        loadedConfigs.putAll(
                getModuleLoader()
                        .loadConfigsFromDirectory(
                                testsDir, abis, suitePrefix, suiteTag, mConfigPatterns));
        return loadedConfigs;
    }
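The regrouping done by addFilters, as described in the comment on loadTests, can be pictured roughly like this. It is a minimal sketch and not SuiteModuleLoader itself: FilterParseSketch and parse are hypothetical names, the filter layout "abi module [test]" is an assumption, keying by module name alone follows the description in this article, and CtsFooTestCases is a made-up module name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class FilterParseSketch {

    /** Groups raw filter strings by module name; a module-level filter yields an empty list. */
    static Map<String, List<String>> parse(Set<String> rawFilters) {
        Map<String, List<String>> parsed = new LinkedHashMap<>();
        for (String raw : rawFilters) {
            // Assumed layout: "abi module" or "abi module test".
            String[] parts = raw.trim().split("\\s+");
            if (parts.length < 2 || parts.length > 3) {
                throw new IllegalArgumentException("Unexpected filter: " + raw);
            }
            List<String> tests = parsed.computeIfAbsent(parts[1], k -> new ArrayList<>());
            if (parts.length == 3) {
                tests.add(parts[2]);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        Map<String, List<String>> parsed = parse(Set.of(
                "arm64-v8a CtsDeqpTestCases dEQP-GLES2.info#vendor",
                "arm64-v8a CtsDeqpTestCases dEQP-GLES2.info#renderer",
                "arm64-v8a CtsFooTestCases"));
        System.out.println(parsed.get("CtsDeqpTestCases").size() + " test excludes for CtsDeqpTestCases");
    }
}
```

Whatever the exact key, the parsed map is a second structure whose size grows with the number of raw filter strings.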

tools/tradefederation/core/src/com/android/tradefed/testtype/suite/ModuleSplitter.java
Next, splitConfiguration is called:

    /**
     * Create a List of executable unit {@link ModuleDefinition}s based on the map of configuration
     * that was loaded.
     *
     * @param runConfig {@link LinkedHashMap} loaded from {@link ITestSuite#loadTests()}.
     * @param shardCount a shard count hint to help with sharding.
     * @return List of {@link ModuleDefinition}
     */
    public static List<ModuleDefinition> splitConfiguration(
            LinkedHashMap<String, IConfiguration> runConfig,
            int shardCount,
            boolean dynamicModule) {
        if (dynamicModule) {
            // We maximize the sharding for dynamic to reduce time difference between first and
            // last shard as much as possible. Overhead is low due to our test pooling.
            shardCount *= 2;
        }
        List<ModuleDefinition> runModules = new ArrayList<>();
        for (Entry<String, IConfiguration> configMap : runConfig.entrySet()) {
            // Check that it's a valid configuration for suites, throw otherwise.
            ValidateSuiteConfigHelper.validateConfig(configMap.getValue());

            // Create the ModuleDefinition(s) from the module name, config and shard count.
            createAndAddModule(
                    runModules,
                    configMap.getKey(),
                    configMap.getValue(),
                    shardCount,
                    dynamicModule);
        }
        return runModules;
    }

    private static void createAndAddModule(
            List<ModuleDefinition> currentList,
            String moduleName,
            IConfiguration config,
            int shardCount,
            boolean dynamicModule) {
        // If this particular configuration module is declared as 'not shardable' we take it whole
        // but still split the individual IRemoteTest in a pool.
        if (config.getConfigurationDescription().isNotShardable()
                || (!dynamicModule
                        && config.getConfigurationDescription().isNotStrictShardable())) {
            for (int i = 0; i < config.getTests().size(); i++) {
                if (dynamicModule) {
                    ModuleDefinition module =
                            new ModuleDefinition(
                                    moduleName,
                                    config.getTests(),
                                    clonePreparersMap(config),
                                    clonePreparers(config.getMultiTargetPreparers()),
                                    config);
                    currentList.add(module);
                } else {
                    addModuleToListFromSingleTest(
                            currentList, config.getTests().get(i), moduleName, config);
                }
            }
            return;
        }

        // If configuration is possibly shardable we attempt to shard it.
        for (IRemoteTest test : config.getTests()) {
            if (test instanceof IShardableTest) {
                Collection<IRemoteTest> shardedTests = ((IShardableTest) test).split(shardCount);
                if (shardedTests != null) {
                    // Test did shard we put the shard pool in ModuleDefinition which has a polling
                    // behavior on the pool.
                    if (dynamicModule) {
                        for (int i = 0; i < shardCount; i++) {
                            ModuleDefinition module =
                                    new ModuleDefinition(
                                            moduleName,
                                            shardedTests,
                                            clonePreparersMap(config),
                                            clonePreparers(config.getMultiTargetPreparers()),
                                            config);
                            currentList.add(module);
                        }
                    } else {
                        // We create independent modules with each sharded test.
                        for (IRemoteTest moduleTest : shardedTests) {
                            addModuleToListFromSingleTest(
                                    currentList, moduleTest, moduleName, config);
                        }
                    }
                    continue;
                }
            }
            // test is not shardable or did not shard
            addModuleToListFromSingleTest(currentList, test, moduleName, config);
        }
    }
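Since one suite is later created per ModuleDefinition, it helps to count how many definitions the listing produces for a single shardable test that did split. The helper below is a reading aid derived only from the loops shown above; ModuleCountSketch and modulesForShardableTest are hypothetical names, and the "not shardable" branch is not modelled.

```java
class ModuleCountSketch {

    /**
     * Number of ModuleDefinitions created for one shardable test whose split() returned
     * shardsReturned pieces, given the shard count hint passed to splitConfiguration.
     */
    static int modulesForShardableTest(
            boolean dynamicModule, int shardCountHint, int shardsReturned) {
        if (dynamicModule) {
            // splitConfiguration doubles the hint, then createAndAddModule adds one
            // ModuleDefinition per (doubled) shard, all sharing the same pool of sharded tests.
            return shardCountHint * 2;
        }
        // Non-dynamic: one independent ModuleDefinition per sharded test.
        return shardsReturned;
    }

    public static void main(String[] args) {
        // With --shard-count 2 and dynamic modules: prints 4.
        System.out.println(modulesForShardableTest(true, 2, 7));
    }
}
```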

Once the ModuleDefinition list has been created, the rest of the split works from it:

        for (ModuleDefinition m : splitModules) {
            ITestSuite suite = createInstance();
            // Note: the options of the CompatibilityTestSuite created earlier are copied here.
            OptionCopier.copyOptionsNoThrow(this, suite);
            suite.mIsSharded = true;
            // The new suite gets the ModuleDefinition created above as its mDirectModule.
            suite.mDirectModule = m;
            // splitTests is the list of CompatibilityTestSuite instances.
            splitTests.add(suite);
        }

This splitTests list is the CompatibilityTestSuite list that the hprof shows as the cause of the failure.
tools/tradefederation/core/src/com/android/tradefed/config/OptionCopier.java

    /**
     * Identical to {@link #copyOptions(Object, Object)} but will log instead of throw if exception
     * occurs.
     */
    public static void copyOptionsNoThrow(Object source, Object dest) {
        try {
            copyOptions(source, dest);
        } catch (ConfigurationException e) {
            CLog.e(e);
        }
    }

    /**
     * Copy the values from {@link Option} fields in <var>origObject</var> to <var>destObject</var>
     *
     * @param origObject the {@link Object} to copy from
     * @param destObject the {@link Object} to copy to
     * @throws ConfigurationException if options failed to copy
     */
    public static void copyOptions(Object origObject, Object destObject)
            throws ConfigurationException {
        Collection<Field> origFields = OptionSetter.getOptionFieldsForClass(origObject.getClass());
        Map<String, Field> destFieldMap = getFieldOptionMap(destObject);
        for (Field origField : origFields) {
            final Option option = origField.getAnnotation(Option.class);
            Field destField = destFieldMap.remove(option.name());
            if (destField != null) {
                Object origValue = OptionSetter.getFieldValue(origField,
                        origObject);
                OptionSetter.setFieldValue(option.name(), destObject, destField, origValue);
            }
        }
    }
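The stack trace at the top ends in AbstractCollection.addAll called from OptionSetter.setFieldValue, which suggests that a collection-typed option is copied element by element into the destination's own collection. The sketch below reproduces only that effect with plain reflection. It is not Tradefed code: OptionCopySketch, the @Opt annotation, the Suite class and totalEntriesAfterSplit are hypothetical stand-ins.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

class OptionCopySketch {

    /** Hypothetical stand-in for Tradefed's @Option annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Opt {
        String name();
    }

    /** Hypothetical stand-in for a suite with one collection-typed option. */
    static class Suite {
        @Opt(name = "exclude-filter")
        Set<String> excludeFilters = new HashSet<>();
    }

    /** Copies every @Opt field from src to dst; collections are copied element by element. */
    @SuppressWarnings("unchecked")
    static void copyOptions(Suite src, Suite dst) {
        for (Field field : Suite.class.getDeclaredFields()) {
            if (field.getAnnotation(Opt.class) == null) {
                continue;
            }
            try {
                field.setAccessible(true);
                Object value = field.get(src);
                if (value instanceof Collection) {
                    // dst keeps its own set, so it gets its own hash table and nodes.
                    ((Collection<Object>) field.get(dst)).addAll((Collection<Object>) value);
                } else {
                    field.set(dst, value);
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    /** Copies one source suite into 'shards' new suites and returns the total entries held. */
    static long totalEntriesAfterSplit(int shards, int excludes) {
        Suite source = new Suite();
        for (int i = 0; i < excludes; i++) {
            source.excludeFilters.add("arm64-v8a CtsDeqpTestCases Case#test" + i);
        }
        long total = 0;
        for (int s = 0; s < shards; s++) {
            Suite copy = new Suite();
            copyOptions(source, copy);
            total += copy.excludeFilters.size();
        }
        return total;
    }

    public static void main(String[] args) {
        // Four copies of a 1000-entry exclude set: prints 4000.
        System.out.println(totalEntriesAfterSplit(4, 1000));
    }
}
```

In this model the String objects are shared between copies, but each copy still pays for its own set structure, so the cost grows with the number of suites times the number of excludes.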

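The end result is a large number of copied CompatibilityTestSuite instances (when many modules need a retry), each of which holds a large number of exclude entries (350,000). Together they produce the error seen in the log.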
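A back-of-the-envelope calculation shows how quickly this multiplies. Only the 350,000 figure comes from the analysis above; the number of suites and the bytes per entry in the example call are made-up placeholders, not measurements from the heap dump, and ExcludeMemoryEstimate is a hypothetical helper.

```java
class ExcludeMemoryEstimate {

    /** suites x excludesPerSuite x bytesPerEntry, failing loudly on overflow. */
    static long estimateBytes(long suites, long excludesPerSuite, long bytesPerEntry) {
        return Math.multiplyExact(Math.multiplyExact(suites, excludesPerSuite), bytesPerEntry);
    }

    public static void main(String[] args) {
        // Assumed: 300 suites and 48 bytes of set overhead per entry (both placeholders).
        // prints 5040000000
        System.out.println(estimateBytes(300, 350_000, 48));
    }
}
```

With inputs of that order the total lands in the gigabyte range, which is why the copy step matters far more than the size of any single exclude list.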

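Summary

  1. When the very large CtsDeqpTestCases module is close to finishing, an adb disconnect or a similar event interrupts the run and leaves the module with done = false. On retry, a large number of exclude entries are therefore recorded in CompatibilityTestSuite.
  2. In a multi-device retry, CompatibilityTestSuite is copied, which amplifies the problem further and leads to the failure.
  3. Temporary workaround: run the CtsDeqpTestCases module on its own. This keeps the problem from occurring; even if the run is interrupted, retrying the CtsDeqpTestCases report alone does not trigger the copy. So for now, as long as CtsDeqpTestCases is tested separately, the problem should not recur, and Google permits running it this way.
  4. Suggest that Google change the CTS framework, for example by removing exclude entries that are not needed for the retry, or by sharing a single exclude list instance (singleton style) when CompatibilityTestSuite is copied. It is better for Google to make this fix: they know this logic best and have a dedicated team iterating on it.
  5. The first patch submitted to Google is only one possible approach and not a very good one; the recommendation is still for Google to fix this.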
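The two ideas in item 4 can be sketched as follows. This is only an illustration of the suggestions, not the patch that was sent to Google and not Tradefed API: SharedExcludesSketch, Suite, shareExcludes and pruneForModule are hypothetical names, and the "abi module [test]" filter layout is an assumption.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class SharedExcludesSketch {

    /** Hypothetical per-module suite that only keeps a reference to its excludes. */
    static final class Suite {
        final String module;
        final Set<String> excludes;

        Suite(String module, Set<String> excludes) {
            this.module = module;
            this.excludes = excludes;
        }
    }

    /** Idea 1: every shard points at the same read-only set instead of copying it. */
    static Suite shareExcludes(String module, Set<String> sharedReadOnly) {
        return new Suite(module, sharedReadOnly);
    }

    /** Idea 2: keep only the excludes that name the module this shard will run. */
    static Set<String> pruneForModule(Set<String> excludes, String module) {
        Set<String> kept = new HashSet<>();
        for (String filter : excludes) {
            // Assumed filter layout: "abi module [test]".
            String[] parts = filter.trim().split("\\s+");
            if (parts.length >= 2 && parts[1].equals(module)) {
                kept.add(filter);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> all = new HashSet<>();
        all.add("arm64-v8a CtsDeqpTestCases Case#test0");
        all.add("arm64-v8a CtsDeqpTestCases Case#test1");
        all.add("arm64-v8a CtsFooTestCases");
        Set<String> shared = Collections.unmodifiableSet(all);

        Suite a = shareExcludes("CtsFooTestCases", shared);
        Suite b = shareExcludes("CtsDeqpTestCases", shared);
        System.out.println(a.excludes == b.excludes); // prints true: one set, two references
        System.out.println(pruneForModule(all, "CtsFooTestCases").size()); // prints 1
    }
}
```

Sharing keeps the memory cost at one set regardless of the number of shards, while pruning shrinks what each shard has to hold; the two could also be combined.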