Finding papers is easy. Finding papers that actually support a sentence is much harder.
That gap is where many literature tools break down. A result may be topically related, mention similar biomedical terms, or come from the right field, yet still fail to support the original claim. For researchers and academic writers, that creates unnecessary work: they paste a sentence, open multiple papers, and manually decide which ones are actually usable.
Our latest round of product improvements focused on exactly that problem.
The real problem is not "paper discovery"
In reverse literature search, users are rarely asking for a generic reading list. They are usually asking a much narrower question:
- Which papers support this sentence?
- Which papers only partially support it?
- Which papers point in the opposite direction?
That means relevance alone is not enough. We need to move closer to evidence.
Better handling of long and complex claims
Many users do not search with short keywords. They paste real draft sentences or paragraphs, often with multiple sub-claims in the same input.
We improved how long inputs are handled: complex claims are now broken into meaningful sentence-level units that can be checked independently. This helps prevent later clauses from being ignored and makes it easier to surface evidence for each part of the original text.
In practice, that means a long biomedical sentence is less likely to collapse into one broad, diluted search.
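To make the idea concrete, here is a minimal sketch of sentence-level claim splitting. It is illustrative only: the regex-based splitter and the `split_claim_units` name are assumptions for this post, and a production segmenter would be more sophisticated.

```python
import re

# Illustrative sketch, not our actual pipeline: break a long input
# into sentence-level claim units so each can be searched on its own.
def split_claim_units(text: str) -> list[str]:
    # Split on sentence boundaries first.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    units = []
    for sentence in sentences:
        # Then split compound sentences on coordinating clauses so
        # later clauses are not dropped during retrieval.
        clauses = re.split(r",\s*(?:and|but|while|whereas)\s+", sentence)
        units.extend(clause.strip() for clause in clauses if clause.strip())
    return units

claim = ("Statin therapy reduces LDL cholesterol, and it has been "
         "associated with a modest increase in new-onset diabetes.")
for unit in split_claim_units(claim):
    print(unit)
```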
Stronger support for Chinese input
Another important use case is Chinese-language drafting paired with English-language literature retrieval.
This is harder than basic translation. Scientific meaning is often lost when a claim is converted too literally, while over-broad rewriting can drift away from what the user intended.
We improved how Chinese claims are interpreted and matched so the system is more likely to recover the underlying scientific intent, not just the surface words.
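As a rough illustration, interpretation can be thought of as a two-stage process: translate first, then restate the claim as it would appear in the literature. The `llm` callable, the prompts, and the example below are placeholders for this post, not a description of our actual stack.

```python
# Hypothetical two-stage interpretation of a Chinese claim. The point
# is that the search query is built from the recovered scientific
# intent rather than a word-for-word rendering.
def interpret_chinese_claim(claim_zh: str, llm) -> str:
    # Stage 1: literal translation, keeping technical terms intact.
    literal = llm(
        f"Translate into English, preserving technical terms:\n{claim_zh}"
    )
    # Stage 2: restate the claim in the register of the literature,
    # e.g. 二甲双胍可改善多囊卵巢综合征患者的胰岛素抵抗 should become
    # "Metformin improves insulin resistance in patients with
    # polycystic ovary syndrome", not a surface-level gloss.
    return llm(
        "Rewrite this translated claim as a precise English scientific "
        f"statement suitable for literature search:\n{literal}"
    )
```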
Broader evidence recall
One of the clearest lessons from testing was this: weak top results do not necessarily mean strong evidence does not exist.
Sometimes the right paper is present in the broader literature but simply does not appear high enough in the initial results. So we expanded evidence recall to improve coverage before ranking.
This increases the chance of finding papers that are not just related to the topic, but meaningfully closer to the claim itself.
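Conceptually, the expansion looks something like the sketch below: query with several variants of the claim and merge the candidate pools before any ranking happens. The `search` callable and the naive `simplify` helper are stand-ins invented for illustration, not our real retrieval code.

```python
# Hypothetical sketch of recall expansion before ranking.
def simplify(claim: str) -> str:
    # Naive stand-in: keep the first clause as the core assertion.
    return claim.split(",")[0].strip()

def gather_candidates(claim: str, search, per_query: int = 200) -> list[dict]:
    variants = [claim, simplify(claim)]      # verbatim plus a broader form
    seen, pool = set(), []
    for query in variants:
        for paper in search(query, limit=per_query):
            if paper["id"] not in seen:      # de-duplicate across variants
                seen.add(paper["id"])
                pool.append(paper)
    return pool                              # ordering is left to the ranker
```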
Ranking that prefers support over loose similarity
A paper should not rank highly just because it shares several biomedical terms with the query.
It should rank highly because it is more likely to:
- directly support the claim
- partially support it
- contradict it
- or meaningfully engage with it
That shift matters. Users do not want a list of papers that feel adjacent. They want a shortlist that reduces judgment work.
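One way to picture this reranking is as a scoring pass over (claim, abstract) pairs. In the hedged sketch below, `evidence_model` is a placeholder for a classifier that returns a probability for each evidence type; the field names are illustrative.

```python
# Evidence types that count as meaningful engagement with the claim.
ENGAGING = ("support", "partial support", "against")

def rerank_by_evidence(claim: str, papers: list[dict], evidence_model) -> list[dict]:
    for paper in papers:
        # `probs` is a distribution over evidence types, e.g.
        # {"support": 0.6, "partial support": 0.2,
        #  "against": 0.05, "uncertain": 0.15}
        probs = evidence_model(claim, paper["abstract"])
        paper["evidence_type"] = max(probs, key=probs.get)
        # Score by how strongly the paper engages with the claim,
        # not by how many terms it shares with it.
        paper["evidence_score"] = sum(probs[t] for t in ENGAGING)
    return sorted(papers, key=lambda p: p["evidence_score"], reverse=True)
```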
Evidence-type filtering
We also added support for filtering by evidence type, including:
- support
- partial support
- against
- uncertain
This makes the result set much more practical. Instead of scanning a mixed list, users can immediately focus on the evidence pattern they care about most.
For example:
- If you want to defend a sentence, you can start with support.
- If you want to pressure-test a claim, you can inspect against and uncertain first.
- If the literature is mixed, partial support becomes especially useful.
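Mechanically, the filter itself is simple once results carry evidence-type labels. The sketch below assumes the labels produced in the reranking sketch above; the sample data is invented for illustration.

```python
# Minimal sketch of evidence-type filtering over labeled results.
def filter_by_evidence(papers: list[dict], types: set[str]) -> list[dict]:
    return [p for p in papers if p["evidence_type"] in types]

results = [
    {"title": "RCT of drug X",     "evidence_type": "support"},
    {"title": "Cohort study of X", "evidence_type": "partial support"},
    {"title": "Null trial of X",   "evidence_type": "against"},
]
# Defending a sentence: start from supporting papers.
print(filter_by_evidence(results, {"support"}))
# Pressure-testing a claim: inspect opposing and uncertain evidence first.
print(filter_by_evidence(results, {"against", "uncertain"}))
```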
What this improves for users
Taken together, these updates should make reverse literature search more useful in everyday writing and verification workflows.
Users should now see:
- better handling of long claims
- stronger support for Chinese input
- broader recall of candidate evidence
- fewer loosely related top results
- faster access to papers that are actually worth reading
- more control through evidence-type filters
The goal is simple: reduce the distance between a written claim and the evidence needed to evaluate it.
That is the direction we are continuing to push.
