Blog 2: AP CSP MCQ Review and Analysis

MCQ Review

Practice Test Results

AP CSP College Board MCQ Analysis

Score Improvement

First Attempt: 44/66 (~67%), 2 hr 40 min
Second Attempt: 55/67 (~82%)
Improvement: +11 questions correct (+15 percentage points)
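
The percentages above can be double-checked with a few lines of Python. This is only a minimal sketch: the helper name `percent` is made up for illustration, and the totals (66 and 67) are used exactly as reported for each attempt.

```python
def percent(correct: int, total: int) -> float:
    """Return a score as a percentage of the total questions."""
    return 100 * correct / total

# Scores as reported above (the two attempts list different totals).
first = percent(44, 66)
second = percent(55, 67)

gain_questions = 55 - 44        # extra questions answered correctly
gain_points = second - first    # change in percentage points

print(f"First attempt:  {first:.1f}%")
print(f"Second attempt: {second:.1f}%")
print(f"Gain: +{gain_questions} questions, +{gain_points:.1f} percentage points")
```

Rounded to whole numbers, this agrees with the ~67%, ~82%, and +15 figures.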

Performance by Topic

Topic	Performance	Strengths	Improvements Needed
Collaboration	100%	Strong teamwork and collaboration skills.	None.
Program Function & Purpose	100%	Clear understanding of program goals.	None.
Debugging & Error Correction	71%	Some debugging ability.	Improve error identification.
Binary Numbers	100%	Confident with binary operations.	None.
Data Compression	100%	Strong understanding of compression.	None.
Extracting Information from Data	71%	Can handle basic data tasks.	Practice advanced data analysis.
Conditionals	100%	Mastery of logical statements.	None.
Computing Impacts (Bias, Ethics)	100%	Strong grasp of societal impacts.	None.

Question:

A computational biologist is studying the relationship between protein folding efficiency and various environmental factors. The researcher has developed a novel algorithm that predicts protein stability scores (PSS) based on amino acid sequence, temperature, pH, and ionic strength. The table below shows experimental results from five different proteins under varying conditions, with their measured stability compared to the algorithm’s predictions.

Protein ID	Amino Acid Count	Temperature (°C)	pH	Ionic Strength (mM)	Measured PSS	Predicted PSS	Deviation
PRO-A742	156	37.0	7.2	150	0.83	0.79	+0.04
PRO-B219	327	42.5	6.8	125	0.61	0.72	-0.11
PRO-C588	203	39.5	7.4	175	0.76	0.74	+0.02
PRO-D105	412	36.0	6.5	200	0.58	0.67	-0.09
PRO-E871	189	40.0	7.0	100	0.91	0.85	+0.06

The researcher is evaluating the algorithm’s performance using a significance threshold of ±0.08 for prediction deviation. Which of the following conclusions is most strongly supported by the data?

A. The algorithm systematically overestimates the stability of larger proteins (>300 amino acids) and underestimates smaller proteins, suggesting a fundamental flaw in the sequence analysis component.

B. The algorithm’s performance exceeds the significance threshold only when the ionic strength falls outside the 125-175 mM range, indicating that the ionic strength parameter requires recalibration.

C. There is a statistically significant negative correlation (p < 0.01) between pH and prediction accuracy, with the algorithm performing worse at lower pH values.

D. The algorithm’s predictions fall within the acceptable deviation threshold for 60% of the proteins tested, with the most significant deviations occurring in proteins with extreme combinations of size and temperature parameters.

Answer:

The correct answer is D. Three of the five proteins (PRO-A742, PRO-C588, and PRO-E871) have deviations within the ±0.08 threshold, which is 60%. The two proteins whose deviations exceed the threshold (PRO-B219 at -0.11 and PRO-D105 at -0.09) are also the ones with the most extreme combinations of size and temperature: PRO-B219 combines the highest temperature (42.5°C) with a large size (327 amino acids), while PRO-D105 combines the largest size (412 amino acids) with the lowest temperature (36.0°C).
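
The 60% figure can be reproduced directly from the table. The sketch below is my own check, not part of the original question; it assumes deviation means measured minus predicted (which matches the Deviation column), and the variable names are illustrative.

```python
# Rows copied from the table: (protein ID, measured PSS, predicted PSS)
results = [
    ("PRO-A742", 0.83, 0.79),
    ("PRO-B219", 0.61, 0.72),
    ("PRO-C588", 0.76, 0.74),
    ("PRO-D105", 0.58, 0.67),
    ("PRO-E871", 0.91, 0.85),
]
THRESHOLD = 0.08  # acceptable deviation is within ±0.08

def deviation(measured: float, predicted: float) -> float:
    """Measured minus predicted, rounded to 2 places to avoid float noise."""
    return round(measured - predicted, 2)

within = [pid for pid, m, p in results if abs(deviation(m, p)) <= THRESHOLD]
outside = [pid for pid, m, p in results if abs(deviation(m, p)) > THRESHOLD]
share_within = 100 * len(within) / len(results)

print("Within threshold: ", within)
print("Outside threshold:", outside)
print(f"Share within threshold: {share_within:.0f}%")
```

Three proteins land inside the threshold and two outside, which is where the 60% in option D comes from.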

Wrong Answer Analysis

A wrong answer I might have selected is A: The algorithm systematically overestimates the stability of larger proteins (>300 amino acids) and underestimates smaller proteins, suggesting a fundamental flaw in the sequence analysis component.

Why I might have thought it was correct:

I noticed that the two largest proteins (PRO-B219 and PRO-D105) both have negative deviations, meaning the algorithm overestimated their stability.
Conversely, I observed that smaller proteins like PRO-A742, PRO-C588, and PRO-E871 all have positive deviations, suggesting the algorithm underestimated their stability.
This pattern seems to indicate a systematic bias related to protein size, which would be a fundamental issue in the algorithm.
However, I failed to check whether this pattern is strong enough to be called “systematic”: the three positive deviations (+0.04, +0.02, +0.06) all fall within the ±0.08 threshold, and five proteins are too few to rule out other factors contributing more than protein size alone.
I also didn’t consider that a correlation with size does not show that the sequence analysis component causes the error, since temperature, pH, and ionic strength also vary across these proteins.
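
To see why option A is tempting, here is a rough check of its pattern against the table. It is only a sketch: the 300 amino acid cutoff comes from the answer choice, the ±0.08 threshold from the question, and the variable names are my own.

```python
# (protein ID, amino acid count, deviation) copied from the table
rows = [
    ("PRO-A742", 156, 0.04),
    ("PRO-B219", 327, -0.11),
    ("PRO-C588", 203, 0.02),
    ("PRO-D105", 412, -0.09),
    ("PRO-E871", 189, 0.06),
]
THRESHOLD = 0.08
SIZE_CUTOFF = 300  # option A's definition of a "larger" protein

large = [(pid, d) for pid, n, d in rows if n > SIZE_CUTOFF]
small = [(pid, d) for pid, n, d in rows if n <= SIZE_CUTOFF]

# Negative deviation: prediction was higher than measured (overestimate).
large_overestimated = all(d < 0 for _, d in large)
# Positive deviation: prediction was lower than measured (underestimate).
small_underestimated = all(d > 0 for _, d in small)
# Which small-protein deviations actually exceed the threshold?
small_significant = [pid for pid, d in small if abs(d) > THRESHOLD]

print("Large proteins all overestimated: ", large_overestimated)
print("Small proteins all underestimated:", small_underestimated)
print("Small proteins beyond threshold:  ", small_significant)
```

The sign pattern does hold for all five proteins, but none of the small-protein deviations exceed the threshold, so the data does not show a significant underestimate, and it says nothing about which component of the algorithm is responsible.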

Key Insights

Strengths: Binary Numbers, Conditionals, Societal Impacts
Weaknesses: Debugging, Advanced Data Analysis

Action Plan to Improve Score

Focus on Weak Areas:
- Practice debugging and error correction.
- Work on interpreting complex data.
Take More Timed Tests:
- Build speed and confidence under exam-like conditions.

Summary: Improved from 67% to 82%. Strengths are solid, but I need to refine debugging and data analysis skills for further growth.