API reference¶
Autogenerated from the public surface. Everything here is importable from the top-level toolpicker package.
ToolPicker¶
ToolPicker ¶
ToolPicker(
    source: ToolSource,
    *,
    embedder: EmbeddingProvider | None = None,
    intent_classifier: IntentClassifier | None = None,
    bm25_weight: float = 1.0,
    semantic_weight: float = 1.0,
    intent_weight: float = 1.0,
    rrf_k: int = 60,
    bm25_k1: float = 1.5,
    bm25_b: float = 0.75,
    bm25_stopwords: frozenset[str] | None = None,
)
Hybrid lexical + semantic tool selection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `source` | `ToolSource` | Anything satisfying | required |
| `embedder` | `EmbeddingProvider \| None` | Optional | `None` |
| `intent_classifier` | `IntentClassifier \| None` | Optional | `None` |
| `bm25_weight` | `float` | RRF weight for the BM25 retriever. Default 1.0. | `1.0` |
| `semantic_weight` | `float` | RRF weight for the semantic retriever. Default 1.0. | `1.0` |
| `intent_weight` | `float` | RRF weight for the intent classifier. Default 1.0. | `1.0` |
| `rrf_k` | `int` | RRF damping constant. Default 60. | `60` |
| `bm25_k1` | `float` | BM25 saturation parameter. Default 1.5. | `1.5` |
| `bm25_b` | `float` | BM25 length-normalisation parameter. Default 0.75. | `0.75` |
| `bm25_stopwords` | `frozenset[str] \| None` | Optional override for the BM25 stopword set. | `None` |
Example
from toolpicker import ToolPicker, FunctionSchemaSource, HashEmbedder
source = FunctionSchemaSource([
    {"name": "get_weather", "description": "Get weather for a city.",
     "parameters": {"type": "object", "properties": {"city": {"type": "string"}}}},
    {"name": "send_email", "description": "Send an email.",
     "parameters": {"type": "object", "properties": {"to": {"type": "string"}}}},
])
picker = ToolPicker(source, embedder=HashEmbedder())
tools = picker.select("what's the temperature in SF?", k=1)
# Returns a list of Tool objects matching the query.
tools property ¶
All tools the picker can return. Useful for debugging the corpus.
select ¶
Return tools for the query, ordered by fused relevance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | The user / agent input to route against. | required |
| `k` | `int` | Cap on the number of tools returned. | `5` |
| `token_budget` | `int \| None` | Optional. If set, only return tools whose serialised schemas fit under this total token budget. Greedy first-fit (skip-and-continue): a too-big tool at rank N doesn't block smaller tools at rank N+1. Returned list is bounded by | `None` |
Returns:
| Type | Description |
|---|---|
| `list[Tool]` | … retriever returns a hit (or token_budget is too small for even the smallest tool). |
Tool sources¶
Tool dataclass ¶
Tool(
    *,
    id: str,
    name: str,
    description: str,
    parameters_schema: dict[str, Any] = dict(),
    keywords: list[str] = list(),
    metadata: dict[str, Any] = dict(),
)
One callable surface an LLM agent might invoke.
Attributes:
| Name | Type | Description |
|---|---|---|
| `id` | `str` | Stable opaque identifier. Usually the function name; must be unique across the corpus the router sees. |
| `name` | `str` | Display name (typically same as |
| `description` | `str` | Natural-language description of what the tool does. This is what the semantic retriever embeds. |
| `parameters_schema` | `dict[str, Any]` | JSON-Schema-shaped dict for the tool's parameters. Same shape as OpenAI function-call schemas. |
| `keywords` | `list[str]` | Optional short tokens that boost lexical recall. Append domain-specific terms here that aren't in the name/params/desc (e.g. internal codes, account-type abbreviations). |
| `metadata` | `dict[str, Any]` | Free-form caller-supplied tags. Not used by retrieval; useful for downstream routing (group, owner, deprecation, etc.). |
ToolSource ¶
Bases: Protocol
Anything that knows how to enumerate a set of tools.
Implementations: FunctionSchemaSource (v0.1), OpenAPISource (v0.3),
MCPSource (v0.3). Each parses its input format into Tool objects.
The router reads from the source once at construction; reload the source
and rebuild the router if tools change.
FunctionSchemaSource ¶
FunctionSchemaSource(
    schemas: list[dict[str, Any]],
    *,
    keywords: dict[str, list[str]] | None = None,
)
Wrap a list of function-call schema dicts as a ToolSource.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `schemas` | `list[dict[str, Any]]` | Function-call schemas. Each may be either the bare schema ( | required |
| `keywords` | `dict[str, list[str]] \| None` | Optional mapping of tool name → keyword list. Lets callers attach domain-specific lexical hints without hand-building the | `None` |
OpenAPISource ¶
Wrap an OpenAPI 3.0 / 3.1 spec as a ToolSource.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `spec` | `dict[str, Any] \| str \| Path` | The spec, either as a parsed dict or a path to a YAML/JSON file. YAML files require | required |
| `validate` | `bool` | Whether to run | `True` |
MCPSource ¶
Wrap a list of MCP tool descriptions as a ToolSource.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `mcp_tools` | `list[dict[str, Any]]` | A list of dicts in MCP's tool-description format ( | required |
from_client async classmethod ¶
Introspect an MCP server and build a source from its advertised tools.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `session` | `ClientSession` | An already-initialized | required |
MergedSource ¶
Concatenate the tools from multiple sources, preserving order.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `*sources` | `ToolSource` | Any number of | `()` |
Retrievers¶
Retriever ¶
Bases: Protocol
A single ranking pass over the tool corpus.
Implementations are stateful: they build their index at construction time
from the full tool list, then retrieve() queries the prebuilt index.
retrieve ¶
Return up to k (tool_id, score) tuples, sorted by score desc.
Empty list if the corpus is empty or no tool clears the retriever's internal threshold. Score interpretation is retriever-specific.
BM25Retriever ¶
BM25Retriever(
    tools: list[Tool],
    *,
    k1: float = 1.5,
    b: float = 0.75,
    stopwords: frozenset[str] | None = None,
)
BM25 over the tool corpus.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `tools` | `list[Tool]` | Tools to index. Indexed at construction; rebuild for new tools. | required |
| `k1` | `float` | Term-frequency saturation parameter (default 1.5). | `1.5` |
| `b` | `float` | Length-normalisation parameter (default 0.75). | `0.75` |
| `stopwords` | `frozenset[str] \| None` | Set of tokens to drop from both the index and queries. Defaults to | `None` |
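To make the roles of `k1` and `b` concrete, here is a minimal sketch of Okapi BM25 scoring over pre-tokenised documents. It is not toolpicker's implementation: the function name `bm25_scores`, the tokenised-list input, and the non-negative IDF variant are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs, k1=1.5, b=0.75):
    """Score each tokenised document against the query (illustrative only)."""
    n_docs = len(docs)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in docs) / n_docs
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        # b controls how strongly long documents are penalised.
        length_norm = 1 - b + b * (len(doc) / avg_len) if avg_len else 1.0
        score = 0.0
        for term in query_tokens:
            if tf[term] == 0:
                continue
            # Assumed IDF variant: log(1 + ...) keeps the weight non-negative.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # k1 controls how quickly repeated occurrences stop adding score.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores
```

With `b=0` document length is ignored entirely; with a very small `k1` the score depends almost only on whether a term is present, not how often.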
SemanticRetriever ¶
Embedding-based retrieval over tool descriptions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `tools` | `list[Tool]` | Tools to index. Each tool's | required |
| `embedder` | `EmbeddingProvider` | Any | required |
reciprocal_rank_fusion ¶
reciprocal_rank_fusion(
    rankings: list[list[tuple[str, float]]],
    *,
    weights: list[float] | None = None,
    rrf_k: int = 60,
) -> list[tuple[str, float]]
Fuse N ranked lists into one ranked list.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `rankings` | `list[list[tuple[str, float]]]` | List of per-retriever rankings. Each ranking is a list of | required |
| `weights` | `list[float] \| None` | Optional per-retriever weight (same length as | `None` |
| `rrf_k` | `int` | The damping constant. Larger = less penalty on lower ranks. Default 60 per the original paper. | `60` |
Returns:
| Type | Description |
|---|---|
| `list[tuple[str, float]]` | A tool only appears if at least one retriever ranked it. |
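The fusion step can be sketched with the standard weighted RRF formula, where each list contributes `weight / (rrf_k + rank)` for every tool it ranks. This is an illustration under assumptions, not the library's code: the name `rrf_sketch`, 1-based ranks, and the tie-break on `tool_id` are choices made here.

```python
from collections import defaultdict


def rrf_sketch(rankings, weights=None, rrf_k=60):
    """Fuse ranked lists of (tool_id, score) tuples (illustrative only)."""
    if weights is None:
        weights = [1.0] * len(rankings)
    if len(weights) != len(rankings):
        raise ValueError("weights must have one entry per ranking")
    fused = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        # Only the position in the list matters; the raw score is ignored.
        for rank, (tool_id, _score) in enumerate(ranking, start=1):
            fused[tool_id] += weight / (rrf_k + rank)
    # Highest fused score first; ties broken by tool_id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


fused = rrf_sketch([
    [("get_weather", 3.0), ("send_email", 1.0)],
    [("send_email", 0.9), ("create_event", 0.1)],
])
```

Here `send_email` comes out on top because it appears in both lists, even though it leads only one of them. Because raw scores are discarded, retrievers with incomparable score scales can be fused directly.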
Embeddings¶
EmbeddingProvider ¶
OpenAIEmbeddings ¶
Real OpenAI embeddings. Defaults to text-embedding-3-small (1536-d).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `str` | OpenAI embedding model name. Default | `'text-embedding-3-small'` |
| `api_key` | `str \| None` | API key. Falls back to | `None` |
Auto-batches at 2048 inputs per call (OpenAI's hard limit on the batch size). Returned vectors are unit-normalised by the API; cosine similarity and dot product yield identical rankings.
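The auto-batching described above amounts to slicing the input into chunks of at most 2048 texts. The helper below is a sketch of that idea only; `batched` is a hypothetical name and makes no network calls.

```python
from typing import Iterator, Sequence


def batched(texts: Sequence[str], batch_size: int = 2048) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most batch_size texts, preserving order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


# 5000 inputs split into two full batches and one partial batch.
sizes = [len(batch) for batch in batched(["x"] * 5000)]
```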
HashEmbedder ¶
Deterministic non-semantic embedder. Same input → same vector, always.
Useful when:
* tests need reproducibility without an API key
* you want a fast no-network baseline for benchmarking
NOT useful for production semantic retrieval - it's hash-based and has no notion of meaning. Lexical retrieval (BM25) will beat it on most queries.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dimensions` | `int` | Output vector dimensionality. Default 16 (fast for tests). | `16` |
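One way a deterministic, non-semantic embedder can work is to derive vector components from a cryptographic hash of the text. The sketch below illustrates that idea only; the function `hash_embed`, the use of SHA-256, and the unit normalisation are assumptions, and HashEmbedder's real algorithm may differ.

```python
import hashlib
import math


def hash_embed(text: str, dimensions: int = 16) -> list[float]:
    """Map text to a deterministic unit-length vector (illustrative only)."""
    values: list[float] = []
    counter = 0
    while len(values) < dimensions:
        # Re-hash with a counter prefix so any dimensionality can be filled.
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        # Map each byte (0..255) onto the range [-1, 1].
        values.extend(byte / 127.5 - 1.0 for byte in digest)
        counter += 1
    values = values[:dimensions]
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]
```

Similar strings get unrelated vectors, which is why such an embedder suits reproducible tests but not semantic retrieval.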
CachedEmbedder ¶
CachedEmbedder(
    embedder: EmbeddingProvider,
    *,
    cache_path: Path | None = None,
    autosave: bool = True,
)
Wrap an EmbeddingProvider with a disk-backed content-hash cache.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `embedder` | `EmbeddingProvider` | The underlying provider. Must satisfy | required |
| `cache_path` | `Path \| None` | Where to persist the cache. Default | `None` |
| `autosave` | `bool` | Whether to write the cache to disk after each | `True` |
invalidate ¶
Drop entries from the cache.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `text` | `str \| None` | If given, drop only that single entry. If | `None` |
save ¶
Persist the in-memory cache to self.path.
Creates the parent directory if needed. Atomic-ish: writes to a temp file and renames, so a crash mid-write doesn't corrupt the existing cache.
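The write-to-temp-then-rename pattern mentioned above can be sketched as follows. The function name `atomic_write_json` and the JSON on-disk format are assumptions for illustration; the cache's actual file format is not specified here.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path: Path, data: dict) -> None:
    """Write data to path via a temp file and rename (illustrative only)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle)
        # os.replace overwrites the destination in a single step.
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the partial temp file; the old cache stays untouched.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A crash before `os.replace` leaves the previous cache file intact, since only the temp file was being written.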
Intent classifier¶
IntentClassifier ¶
Bases: Protocol
A ranker that scores tools by labelled-example similarity to the query.
Same shape as Retriever: returns (tool_id, score) tuples sorted
by score descending. Scores are aggregated across the k nearest
training examples; absolute values aren't comparable across classifier
implementations, but rank order is what RRF needs anyway.
classify ¶
Return up to k (tool_id, score) tuples, sorted by score desc.
Empty list if the example corpus is empty.
IntentExample dataclass ¶
A labelled (query, tool_id) example for intent training.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | The user / agent text the example represents. | required |
| `tool_id` | `str` | The tool that should be selected for this query. | required |
EmbeddingNNIntent ¶
EmbeddingNNIntent(
    *,
    examples: list[IntentExample],
    embedder: EmbeddingProvider,
    neighbours: int = 5,
)
k-NN intent classifier over embedded labelled queries.
At construction time, embeds every training example's query once and
keeps the unit-norm vector. At classify(query, k=K), embeds the
query, computes cosine similarity against each example, picks the K
most similar, and aggregates per tool_id (sum of similarities).
Tools with multiple similar training examples beat tools with a single
strong-but-isolated match - which is the right default for routing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `examples` | `list[IntentExample]` | Labelled training examples. Empty list is allowed (the classifier just returns | required |
| `embedder` | `EmbeddingProvider` | Any | required |
| `neighbours` | `int` | Number of nearest training examples to consider when classifying. Default 5; raise for noisy corpora. | `5` |
Example
from toolpicker import HashEmbedder
from toolpicker.intent import EmbeddingNNIntent, IntentExample
intent = EmbeddingNNIntent(
    examples=[
        IntentExample(query="ping the team", tool_id="send_email"),
        IntentExample(query="block the afternoon", tool_id="create_calendar_event"),
    ],
    embedder=HashEmbedder(dimensions=32),
)
hits = intent.classify("notify the team", k=2)
# Returns [(tool_id, score), ...] sorted by score desc.
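The aggregation described for EmbeddingNNIntent (take the nearest examples, then sum similarities per tool) can be sketched on plain vectors. This is an illustration, not the class's code: `knn_intent` is a hypothetical function, the inputs are assumed to be unit-norm already, and it is an assumption that `neighbours` bounds the examples considered while `k` caps the tools returned.

```python
from collections import defaultdict


def knn_intent(query_vec, example_vecs, example_tool_ids, neighbours=5, k=3):
    """Rank tool ids by summed similarity over the nearest examples."""
    # For unit-norm vectors the dot product equals cosine similarity.
    sims = [
        (sum(q * e for q, e in zip(query_vec, vec)), tool_id)
        for vec, tool_id in zip(example_vecs, example_tool_ids)
    ]
    nearest = sorted(sims, key=lambda pair: -pair[0])[:neighbours]
    totals = defaultdict(float)
    for sim, tool_id in nearest:
        totals[tool_id] += sim
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


ranked = knn_intent(
    [1.0, 0.0],
    [[1.0, 0.0], [0.8, 0.6], [0.6, 0.8], [0.0, 1.0]],
    ["send_email", "create_event", "create_event", "send_email"],
    neighbours=3,
)
```

In this toy input `create_event` wins with two moderately similar examples (0.8 + 0.6) over `send_email` with one exact match (1.0), which is the "multiple similar examples beat a single isolated match" behaviour described above.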
examples property ¶
All training examples. Returns a copy to avoid external mutation.
Token-budget packer¶
pack_to_budget ¶
pack_to_budget(
    tools: list[Tool],
    *,
    token_budget: int,
    token_counter: Callable[[Tool], int] | None = None,
    serialise: Callable[
        [Tool], dict[str, Any]
    ] = default_serialise,
) -> list[Tool]
Greedy first-fit packing of tools under a token budget.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `tools` | `list[Tool]` | Tools in rank order (most relevant first). | required |
| `token_budget` | `int` | Max total token cost across the returned tools. Must be positive. | required |
| `token_counter` | `Callable[[Tool], int] \| None` | Optional override for token cost per tool. Defaults to | `None` |
| `serialise` | `Callable[[Tool], dict[str, Any]]` | How to render a tool into a dict for token-counting. Defaults to OpenAI function-call envelope. | `default_serialise` |
Returns:
| Type | Description |
|---|---|
| `list[Tool]` | The subset of … tool whose cost exceeds the remaining budget is skipped; we keep going to try smaller tools further down the ranking. |
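The skip-and-continue behaviour can be sketched generically. The function below is an illustration over arbitrary items with a caller-supplied cost function; `pack_first_fit` is a hypothetical name, not part of the package.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def pack_first_fit(items: Sequence[T], budget: int, cost: Callable[[T], int]) -> list[T]:
    """Greedy first-fit: keep each item that still fits, in rank order."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    remaining = budget
    packed: list[T] = []
    for item in items:
        item_cost = cost(item)
        if item_cost <= remaining:
            packed.append(item)
            remaining -= item_cost
        # Otherwise skip this item and keep trying lower-ranked ones.
    return packed


# (name, token cost) pairs in rank order; "b" is too big once "a" is packed.
ranked = [("a", 50), ("b", 80), ("c", 30)]
kept = pack_first_fit(ranked, budget=100, cost=lambda pair: pair[1])
```

Here `kept` contains `a` and `c`: the oversized `b` is skipped without blocking the smaller tool ranked after it.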
count_tokens ¶
count_tokens(
    tool: Tool,
    *,
    serialise: Callable[
        [Tool], dict[str, Any]
    ] = default_serialise,
) -> int
Count tokens of the serialised tool.
Uses tiktoken (cl100k_base, the encoding used by GPT-4 and GPT-3.5 Turbo; newer
models such as GPT-4o use o200k_base, so counts for them are approximate)
when installed. Falls back to ceil(len(json) / 4) otherwise - close
enough for budget gating, off by ~10% in pathological cases.
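The character-based fallback is simple enough to sketch directly. The helper below covers only the no-tiktoken path; `estimate_tokens` is a hypothetical name, and the use of `json.dumps` with default separators is an assumption about how the JSON text is produced.

```python
import json
import math


def estimate_tokens(payload: dict) -> int:
    """Approximate the token count as ceil(len(json) / 4)."""
    text = json.dumps(payload)
    return math.ceil(len(text) / 4)


# '{"name": "x"}' is 13 characters long, so the estimate is ceil(13 / 4) = 4.
estimate = estimate_tokens({"name": "x"})
```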
default_serialise ¶
OpenAI function-call tool envelope.
What you'd pass to client.chat.completions.create(tools=[...]).
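For reference, the OpenAI chat-completions `tools` parameter expects entries with a `"type": "function"` wrapper around the name, description and parameters. The sketch below builds that shape from a stand-in dataclass; `SketchTool` and `serialise_sketch` are hypothetical, and whether default_serialise emits exactly these keys is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchTool:
    """Stand-in for Tool with only the fields the envelope needs."""
    name: str
    description: str
    parameters_schema: dict[str, Any] = field(default_factory=dict)


def serialise_sketch(tool: SketchTool) -> dict[str, Any]:
    """Render a tool as an OpenAI function-call tool entry."""
    return {
        "type": "function",
        "function": {
            "name": tool.name,
            "description": tool.description,
            "parameters": tool.parameters_schema,
        },
    }


envelope = serialise_sketch(
    SketchTool(
        name="get_weather",
        description="Get weather for a city.",
        parameters_schema={"type": "object", "properties": {"city": {"type": "string"}}},
    )
)
```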