Blog 2: AP CSP MCQ Review and Analysis
MCQ Review

AP CSP College Board MCQ Analysis
Score Improvement
- First Attempt: 44/66 (~67%), 2 hr 40 min
- Second Attempt: 55/67 (~82%)
- Improvement: +11 correct answers (+15 percentage points)
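The arithmetic behind these figures can be checked with a short sketch. The variable names are my own, and the totals are copied as reported above (the two attempts list different totals, 66 and 67, which I have left as they are).

```python
# Scores as reported in the list above.
# Assumption: the totals (66 and 67) are used exactly as written.
first_correct, first_total = 44, 66
second_correct, second_total = 55, 67

# Convert each attempt to a percentage.
first_pct = 100 * first_correct / first_total
second_pct = 100 * second_correct / second_total

# Improvement in raw correct answers and in percentage points.
gain_questions = second_correct - first_correct
gain_points = second_pct - first_pct

print(f"First attempt:  {first_pct:.1f}%")
print(f"Second attempt: {second_pct:.1f}%")
print(f"Gain: +{gain_questions} correct, +{gain_points:.1f} percentage points")
```

The gain is a difference between two percentages, so it is measured in percentage points rather than as a percent increase.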
Performance by Topic
| Topic | Performance | Strengths | Improvements Needed |
|---|---|---|---|
| Collaboration | 100% | Strong teamwork and collaboration skills. | None. |
| Program Function & Purpose | 100% | Clear understanding of program goals. | None. |
| Debugging & Error Correction | 71% | Some debugging ability. | Improve error identification. |
| Binary Numbers | 100% | Confident with binary operations. | None. |
| Data Compression | 100% | Strong understanding of compression. | None. |
| Extracting Information from Data | 71% | Can handle basic data tasks. | Practice advanced data analysis. |
| Conditionals | 100% | Mastery of logical statements. | None. |
| Computing Impacts (Bias, Ethics) | 100% | Strong grasp of societal impacts. | None. |
Question:
A computational biologist is studying the relationship between protein folding efficiency and various environmental factors. The researcher has developed a novel algorithm that predicts protein stability scores (PSS) based on amino acid sequence, temperature, pH, and ionic strength. The table below shows experimental results from five different proteins under varying conditions, with their measured stability compared to the algorithm’s predictions.
| Protein ID | Amino Acid Count | Temperature (°C) | pH | Ionic Strength (mM) | Measured PSS | Predicted PSS | Deviation |
|---|---|---|---|---|---|---|---|
| PRO-A742 | 156 | 37.0 | 7.2 | 150 | 0.83 | 0.79 | +0.04 |
| PRO-B219 | 327 | 42.5 | 6.8 | 125 | 0.61 | 0.72 | -0.11 |
| PRO-C588 | 203 | 39.5 | 7.4 | 175 | 0.76 | 0.74 | +0.02 |
| PRO-D105 | 412 | 36.0 | 6.5 | 200 | 0.58 | 0.67 | -0.09 |
| PRO-E871 | 189 | 40.0 | 7.0 | 100 | 0.91 | 0.85 | +0.06 |
The researcher is evaluating the algorithm’s performance using a significance threshold of ±0.08 for prediction deviation. Which of the following conclusions is most strongly supported by the data?
A. The algorithm systematically overestimates the stability of larger proteins (>300 amino acids) and underestimates smaller proteins, suggesting a fundamental flaw in the sequence analysis component.
B. The algorithm’s performance exceeds the significance threshold only when the ionic strength falls outside the 125-175 mM range, indicating that the ionic strength parameter requires recalibration.
C. There is a statistically significant negative correlation (p < 0.01) between pH and prediction accuracy, with the algorithm performing worse at lower pH values.
D. The algorithm’s predictions fall within the acceptable deviation threshold for 60% of the proteins tested, with the most significant deviations occurring in proteins with extreme combinations of size and temperature parameters.
Answer:
The correct answer is D. Three of the five proteins (PRO-A742, PRO-C588, and PRO-E871) have deviations within the ±0.08 threshold, which is 60%. The two proteins whose deviations exceed the threshold (PRO-B219 at -0.11 and PRO-D105 at -0.09) are also the ones with the most extreme combinations of size and temperature: PRO-B219 pairs the highest temperature (42.5°C) with a large size (327 amino acids), while PRO-D105 pairs the largest size (412 amino acids) with the lowest temperature (36.0°C).
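The 60% figure can be reproduced with a minimal sketch. The data is copied from the table in the question; the names `THRESHOLD`, `proteins`, and `deviation` are my own, and I am assuming deviation means measured minus predicted, which matches the table's Deviation column.

```python
THRESHOLD = 0.08  # significance threshold stated in the question

# (protein_id, measured_pss, predicted_pss), copied from the question's table
proteins = [
    ("PRO-A742", 0.83, 0.79),
    ("PRO-B219", 0.61, 0.72),
    ("PRO-C588", 0.76, 0.74),
    ("PRO-D105", 0.58, 0.67),
    ("PRO-E871", 0.91, 0.85),
]

def deviation(measured, predicted):
    # Assumption: deviation = measured - predicted.
    # Rounded to 2 decimals to avoid floating-point noise.
    return round(measured - predicted, 2)

# Split the proteins by whether the absolute deviation stays inside the threshold.
within = [pid for pid, m, p in proteins if abs(deviation(m, p)) <= THRESHOLD]
outside = [pid for pid, m, p in proteins if abs(deviation(m, p)) > THRESHOLD]

share_within = 100 * len(within) / len(proteins)
print(within, outside, share_within)
```

Comparing the absolute value of each deviation to the threshold is what makes the ± part of the threshold work: -0.09 and -0.11 both fall outside, even though they are negative.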
Wrong Answer Analysis
A wrong answer I might have selected is A: The algorithm systematically overestimates the stability of larger proteins (>300 amino acids) and underestimates smaller proteins, suggesting a fundamental flaw in the sequence analysis component.
Why I might have thought it was correct:
- I noticed that the two largest proteins (PRO-B219 and PRO-D105) both have negative deviations, meaning the algorithm overestimated their stability.
- Conversely, I observed that smaller proteins like PRO-A742, PRO-C588, and PRO-E871 all have positive deviations, suggesting the algorithm underestimated their stability.
- This pattern seems to indicate a systematic bias related to protein size, which would be a fundamental issue in the algorithm.
- However, I did not check whether five data points are enough to call this pattern “systematic”, and I did not consider that other factors might contribute to these deviations more than protein size alone.
- I also didn’t properly consider that correlation doesn’t necessarily imply causation in this complex multivariate system of protein folding.
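A rough sketch of why option A is tempting but not well supported: with values copied from the table above, the proteins over 300 amino acids are exactly the ones with negative deviations, but they are also exactly the two with the lowest pH. The variable names are my own, and this is only an illustration on five rows, not a statistical test.

```python
# (protein_id, amino_acid_count, ph, deviation), copied from the question's table
rows = [
    ("PRO-A742", 156, 7.2, 0.04),
    ("PRO-B219", 327, 6.8, -0.11),
    ("PRO-C588", 203, 7.4, 0.02),
    ("PRO-D105", 412, 6.5, -0.09),
    ("PRO-E871", 189, 7.0, 0.06),
]

# Proteins that option A calls "larger" (>300 amino acids).
large = {r[0] for r in rows if r[1] > 300}

# Proteins whose stability the algorithm overestimated (negative deviation).
overestimated = {r[0] for r in rows if r[3] < 0}

# The two proteins measured at the lowest pH values.
lowest_two_ph = {r[0] for r in sorted(rows, key=lambda r: r[2])[:2]}

print(large == overestimated)   # size lines up with overestimation
print(large == lowest_two_ph)   # but so does low pH
```

Because size and pH pick out the same two proteins, this table alone cannot say which factor (if either) drives the overestimation, so blaming the sequence analysis component goes further than the data allows.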
Key Insights
- Strengths: Binary Numbers, Conditionals, Societal Impacts
- Weaknesses: Debugging, Advanced Data Analysis
Action Plan to Improve Score
- Focus on Weak Areas:
  - Practice debugging and error correction.
  - Work on interpreting complex data.
- Take More Timed Tests:
  - Build speed and confidence under exam-like conditions.
Summary: Improved from 67% to 82%. Strengths are solid, but I need to refine debugging and data analysis skills for further growth.