In this work, we identify two fundamental challenges in existing semantic code search: (C1) the interpretability gap and (C2) the generalization collapse. While embedding-based retrievers perform well on in-distribution benchmarks, they often behave as black boxes and are fragile in real-world, out-of-distribution scenarios, where they frequently rely on shortcut cues such as leading tokens (e.g., function names) rather than on the code's true functional semantics.
We argue that the root cause lies in the learning paradigm: current retrievers mainly perform inductive matching via a single global query–code similarity, whereas developers need deductive verification—explicitly checking whether each functional requirement in the query is satisfied by the code. To bridge this gap, we introduce XSearch, which reformulates retrieval as a concept-to-code alignment problem and unifies explanations with ranking.
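To make the contrast between global matching and per-requirement verification concrete, the following toy sketch may help. It is not the XSearch implementation: the bag-of-tokens cosine is only a stand-in for an embedding similarity, and the query decomposition (`CONCEPTS`), the cue tokens, and both code snippets are hypothetical and hand-written for illustration.

```python
import re
from collections import Counter
from math import sqrt


def tokens(text):
    """Lowercase alphabetic tokens; snake_case names split on underscores."""
    return re.findall(r"[a-z]+", text.lower())


def global_similarity(query, code):
    """One cosine score over token counts (assumed stand-in for an embedding score)."""
    q, c = Counter(tokens(query)), Counter(tokens(code))
    dot = sum(q[t] * c[t] for t in q)
    norm = sqrt(sum(v * v for v in q.values())) * sqrt(sum(v * v for v in c.values()))
    return dot / norm if norm else 0.0


def concept_alignment(concepts, code):
    """Check each requirement separately; return (fraction satisfied, evidence per concept)."""
    code_tokens = set(tokens(code))
    evidence = {name: sorted(code_tokens & set(cues)) for name, cues in concepts.items()}
    satisfied = sum(1 for hits in evidence.values() if hits)
    return satisfied / len(concepts), evidence


QUERY = "read csv file and sort rows"

# Hypothetical decomposition of the query into requirements and cue tokens.
CONCEPTS = {
    "read file": ["open"],
    "parse csv": ["reader"],
    "sort rows": ["sorted"],
}

# A stub whose name mirrors the query but whose body does nothing.
STUB = "def read_csv_file_and_sort_rows(path):\n    pass"

# A snippet that does the work under an uninformative name.
REAL = "def load(p):\n    with open(p) as f:\n        return sorted(csv.reader(f))"

for label, code in [("stub", STUB), ("real", REAL)]:
    score, evidence = concept_alignment(CONCEPTS, code)
    print(label, round(global_similarity(QUERY, code), 3), score, evidence)
```

In this toy setting the single global score favors the stub, because its function name repeats the query tokens, while the per-concept check favors the snippet that actually opens, parses, and sorts. The per-concept evidence also doubles as a simple explanation of the ranking. A learned retriever would replace the exact token matching used here with learned alignment; the sketch only illustrates the shape of the two scoring schemes.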