Search for AIs
Valyu's Deepsearch API gives AI the context it needs. Integrate trusted, high-quality public and proprietary sources with full-text, multimodal retrieval.
Get $10 free credits for the Valyu API when you sign up at Valyu!
No credit card required.
We do all the heavy lifting for you - one unified API for all data:
- Academic & Research Content - Access millions of scholarly papers and textbooks
- Real-time Web Search - Get the latest information from across the internet
- Structured Financial Data - Stock prices, market data, and financial metrics
- Intelligent Reranking - Results across all sources are automatically sorted by relevance
- Transparent Pricing - Pay only for what you use with clear CPM pricing
Install the Valyu SDK using pip:

```shell
pip install valyu
```
Here's what it looks like: make your first query in just a few lines of code.

```python
from valyu import Valyu

valyu = Valyu(api_key="your-api-key-here")

response = valyu.search(
    "Implementation details of agentic search-enhanced large reasoning models",
    max_num_results=5,  # Limit to top 5 results
    max_price=10,       # Maximum price per thousand queries (CPM)
    fast_mode=True      # Enable fast mode for quicker, shorter results
)

print(response)
# Feed the results to your AI agent as you would with other search APIs
```
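The comment above suggests feeding results to an agent. One common pattern is to flatten results into a numbered context block for the model's prompt. The sketch below is hypothetical (not part of the SDK) and uses a stand-in dataclass with only the `title`, `url`, and `content` fields documented later, so it runs without an API key:

```python
from dataclasses import dataclass

@dataclass
class Result:
    # Stand-in for the SearchResult fields used here
    title: str
    url: str
    content: str

def format_results_for_prompt(results, max_chars=2000):
    """Build a numbered context block to prepend to an LLM prompt."""
    parts = []
    for i, r in enumerate(results, 1):
        snippet = r.content[:max_chars]  # keep the prompt within budget
        parts.append(f"[{i}] {r.title} ({r.url})\n{snippet}")
    return "\n\n".join(parts)

results = [
    Result("Agentic search", "https://example.com/a", "Large reasoning models..."),
    Result("RAG pipelines", "https://example.com/b", "Retrieval-augmented..."),
]
print(format_results_for_prompt(results))
```

Each result becomes a `[n] title (url)` header followed by a truncated content snippet, which keeps citations traceable in the agent's answer.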
The `search()` method is the core of the Valyu SDK. It accepts a query string as the first parameter, followed by optional configuration parameters.
```python
def search(
    query: str,                               # Your search query
    search_type: str = "all",                 # "all", "web", or "proprietary"
    max_num_results: int = 10,                # Maximum results to return (1-20)
    is_tool_call: bool = True,                # Whether this is an AI tool call
    relevance_threshold: float = 0.5,         # Minimum relevance score (0-1)
    max_price: int = 30,                      # Maximum price per thousand queries (CPM)
    included_sources: List[str] = None,       # Specific sources to search
    excluded_sources: List[str] = None,       # Sources to exclude from search
    country_code: str = None,                 # Country code filter (e.g., "US", "GB")
    response_length: Union[str, int] = None,  # "short"/"medium"/"large"/"max" or character count
    category: str = None,                     # Category filter
    start_date: str = None,                   # Start date (YYYY-MM-DD)
    end_date: str = None,                     # End date (YYYY-MM-DD)
    fast_mode: bool = False,                  # Enable fast mode for faster but shorter results
) -> SearchResponse
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `query` | `str` | required | The search query string |
| `search_type` | `str` | `"all"` | Search scope: `"all"`, `"web"`, or `"proprietary"` |
| `max_num_results` | `int` | `10` | Maximum number of results to return (1-20) |
| `is_tool_call` | `bool` | `True` | Whether this is an AI tool call (affects processing) |
| `relevance_threshold` | `float` | `0.5` | Minimum relevance score for results (0.0-1.0) |
| `max_price` | `int` | `30` | Maximum price per thousand queries in CPM |
| `included_sources` | `List[str]` | `None` | Specific data sources or URLs to search |
| `excluded_sources` | `List[str]` | `None` | Data sources or URLs to exclude from search |
| `country_code` | `str` | `None` | Country code filter (e.g., `"US"`, `"GB"`, `"JP"`, `"ALL"`) |
| `response_length` | `Union[str, int]` | `None` | Response length: `"short"`/`"medium"`/`"large"`/`"max"` or character count |
| `category` | `str` | `None` | Category filter for results |
| `start_date` | `str` | `None` | Start date filter in YYYY-MM-DD format |
| `end_date` | `str` | `None` | End date filter in YYYY-MM-DD format |
| `fast_mode` | `bool` | `False` | Enable fast mode for faster but shorter results; good for general-purpose queries |
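Several of these parameters have documented ranges, so a lightweight client-side check can catch bad values before a request is sent. The helper below is a hypothetical sketch (not part of the SDK) that mirrors the ranges in the table above:

```python
import re

def validate_search_params(max_num_results=10, relevance_threshold=0.5,
                           start_date=None, end_date=None):
    """Sanity-check a few search() arguments against their documented ranges."""
    if not 1 <= max_num_results <= 20:
        raise ValueError("max_num_results must be between 1 and 20")
    if not 0.0 <= relevance_threshold <= 1.0:
        raise ValueError("relevance_threshold must be between 0.0 and 1.0")
    for name, value in (("start_date", start_date), ("end_date", end_date)):
        if value is not None and not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
            raise ValueError(f"{name} must use YYYY-MM-DD format")
    return True
```

Failing fast locally avoids spending a request (and its CPM cost) on arguments the API would reject anyway.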
The search method returns a `SearchResponse` object with the following structure:
```python
class SearchResponse:
    success: bool                        # Whether the search was successful
    error: Optional[str]                 # Error message if any
    tx_id: str                           # Transaction ID for feedback
    query: str                           # The original query
    results: List[SearchResult]          # List of search results
    results_by_source: ResultsBySource   # Count of results by source type
    total_deduction_pcm: float           # Cost in CPM
    total_deduction_dollars: float       # Cost in dollars
    total_characters: int                # Total characters returned
```
Each `SearchResult` contains:
```python
class SearchResult:
    title: str                           # Result title
    url: str                             # Source URL
    content: Union[str, List[Dict]]      # Full content (text or structured)
    description: Optional[str]           # Brief description
    source: str                          # Source identifier
    price: float                         # Cost for this result
    length: int                          # Content length in characters
    image_url: Optional[Dict[str, str]]  # Associated images
    relevance_score: float               # Relevance score (0-1)
    data_type: Optional[str]             # "structured" or "unstructured"
```
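Because `content` may be either a plain string or a list of structured records, downstream code should normalize it before treating it as text. A minimal sketch (this helper is not part of the SDK):

```python
def content_as_text(content):
    """Flatten a SearchResult.content value, which may be a plain string
    or a list of dicts for structured data, into one text block."""
    if isinstance(content, str):
        return content
    lines = []
    for record in content:
        # Render each structured record as comma-separated key=value pairs
        lines.append(", ".join(f"{k}={v}" for k, v in record.items()))
    return "\n".join(lines)
```

For example, `content_as_text([{"ticker": "AAPL", "close": 227.5}])` yields `"ticker=AAPL, close=227.5"`, which drops cleanly into a prompt alongside unstructured results.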
The `contents()` method extracts clean, structured content from web pages with optional AI-powered data extraction and summarization.
```python
def contents(
    urls: List[str],                          # List of URLs to process (max 10)
    summary: Union[bool, str, Dict] = None,   # AI summary configuration
    extract_effort: str = None,               # "normal" or "high"
    response_length: Union[str, int] = None,  # Content length configuration
    max_price_dollars: float = None,          # Maximum cost limit in USD
) -> ContentsResponse
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `urls` | `List[str]` | required | List of URLs to process (maximum 10 URLs per request) |
| `summary` | `Union[bool, str, Dict]` | `None` | AI summary configuration: `False`/`None` = no AI processing (raw content); `True` = basic automatic summarization; `str` = custom instructions (max 500 chars); `dict` = JSON schema for structured extraction |
| `extract_effort` | `str` | `None` | Extraction thoroughness: `"normal"` (fast) or `"high"` (thorough but slower) |
| `response_length` | `Union[str, int]` | `None` | Content length per URL: `"short"` = 25,000 characters; `"medium"` = 50,000 characters; `"large"` = 100,000 characters; `"max"` = no limit; `int` = custom character limit |
| `max_price_dollars` | `float` | `None` | Maximum cost limit in USD |
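The string presets for `response_length` map to fixed character limits. A small lookup table (hypothetical, simply mirroring the documented mapping above) makes that explicit if you need to budget characters client-side:

```python
# Character limits for the documented response_length presets
LENGTH_PRESETS = {"short": 25_000, "medium": 50_000, "large": 100_000, "max": None}

def resolve_response_length(value):
    """Return the character limit for a preset; ints pass through unchanged,
    None means 'use the API default', and "max" means no limit."""
    if value is None or isinstance(value, int):
        return value
    if value not in LENGTH_PRESETS:
        raise ValueError(f"unknown response_length preset: {value!r}")
    return LENGTH_PRESETS[value]
```

This can be handy when estimating total characters (and therefore cost) before submitting a batch of URLs.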
The contents method returns a `ContentsResponse` object:
```python
class ContentsResponse:
    success: bool                  # Whether the request was successful
    error: Optional[str]           # Error message if any
    tx_id: str                     # Transaction ID for tracking
    urls_requested: int            # Number of URLs submitted
    urls_processed: int            # Number of URLs successfully processed
    urls_failed: int               # Number of URLs that failed
    results: List[ContentsResult]  # List of extraction results
    total_cost_dollars: float      # Total cost in dollars
    total_characters: int          # Total characters extracted
```
Each `ContentsResult` contains:
```python
class ContentsResult:
    url: str                             # Source URL
    title: str                           # Page/document title
    content: Union[str, int, float]      # Extracted content
    length: int                          # Content length in characters
    source: str                          # Data source identifier
    summary: Optional[Union[str, Dict]]  # AI-generated summary or structured data
    summary_success: Optional[bool]      # Whether summary generation succeeded
    data_type: Optional[str]             # Type of data extracted
    image_url: Optional[Dict[str, str]]  # Extracted images
    citation: Optional[str]              # APA-style citation
```
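Since `summary_success` can be `False` (generation failed) or `None` (no summary was requested), it is worth filtering on it before consuming summaries. A hypothetical helper, shown with stand-in result objects so it runs without the SDK:

```python
from types import SimpleNamespace

def successful_summaries(results):
    """Collect only summaries that were actually generated, skipping
    results where summarization failed (False) or was not requested (None)."""
    return [r.summary for r in results if getattr(r, "summary_success", None)]

# Stand-in results illustrating the three summary_success states
ok = SimpleNamespace(summary="key points", summary_success=True)
failed = SimpleNamespace(summary=None, summary_success=False)
raw = SimpleNamespace(summary=None, summary_success=None)

print(successful_summaries([ok, failed, raw]))  # → ['key points']
```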
```python
from valyu import Valyu

valyu = Valyu("your-api-key")

# Simple search across all sources
response = valyu.search("What is machine learning?")
print(f"Found {len(response.results)} results")

# Search academic papers on arXiv
response = valyu.search(
    "transformer architecture improvements",
    search_type="proprietary",
    included_sources=["valyu/valyu-arxiv"],
    relevance_threshold=0.7,
    max_num_results=10
)

# Search recent web content
response = valyu.search(
    "AI safety developments",
    search_type="web",
    start_date="2024-01-01",
    end_date="2024-12-31",
    max_num_results=5
)

# Search both web and proprietary sources
response = valyu.search(
    "quantum computing breakthroughs",
    search_type="all",
    category="technology",
    relevance_threshold=0.6,
    max_price=50
)
```
```python
response = valyu.search("climate change solutions")

if response.success:
    print(f"Search cost: ${response.total_deduction_dollars:.4f}")
    print(f"Sources: Web={response.results_by_source.web}, Proprietary={response.results_by_source.proprietary}")

    for i, result in enumerate(response.results, 1):
        print(f"\n{i}. {result.title}")
        print(f"   Source: {result.source}")
        print(f"   Relevance: {result.relevance_score:.2f}")
        print(f"   Content: {result.content[:200]}...")
else:
    print(f"Search failed: {response.error}")
```
```python
# Extract raw content from URLs
response = valyu.contents(
    urls=["https://techcrunch.com/2025/08/28/anthropic-users-face-a-new-choice-opt-out-or-share-your-data-for-ai-training/"]
)

if response.success:
    for result in response.results:
        print(f"Title: {result.title}")
        print(f"Content: {result.content[:500]}...")
```
```python
# Extract content with automatic summarization
response = valyu.contents(
    urls=["https://docs.python.org/3/tutorial/"],
    summary=True,
    response_length="max"
)

for result in response.results:
    print(f"Summary: {result.summary}")
```
```python
import json

# Extract structured data using a JSON schema
company_schema = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "founded_year": {"type": "integer"},
        "key_products": {
            "type": "array",
            "items": {"type": "string"},
            "maxItems": 3
        }
    }
}

response = valyu.contents(
    urls=["https://en.wikipedia.org/wiki/OpenAI"],
    summary=company_schema,
    response_length="max"
)

if response.success:
    for result in response.results:
        if result.summary:
            print(f"Structured data: {json.dumps(result.summary, indent=2)}")
```
```python
# Process multiple URLs with custom summary instructions
response = valyu.contents(
    urls=[
        "https://www.valyu.network/",
        "https://docs.valyu.network/overview",
        "https://www.valyu.network/blogs/why-ai-agents-and-llms-struggle-with-search-and-data-access"
    ],
    summary="Provide key takeaways in bullet points, and write in very emphasised singaporean english"
)

print(f"Processed {response.urls_processed}/{response.urls_requested} URLs")
print(f"Cost: ${response.total_cost_dollars:.4f}")
```
Set your API key in one of these ways:

- Environment variable (recommended):

  ```shell
  export VALYU_API_KEY="your-api-key-here"
  ```

- Direct initialization:

  ```python
  valyu = Valyu(api_key="your-api-key-here")
  ```
The SDK handles errors gracefully and returns structured error responses:
```python
response = valyu.search("test query")

if not response.success:
    print(f"Error: {response.error}")
    print(f"Transaction ID: {response.tx_id}")
else:
    # Process successful results
    for result in response.results:
        print(result.title)
```
- Sign up for a free account at Valyu
- Get your API key from the dashboard
- Install the SDK: `pip install valyu`
- Start building with the examples above
- Documentation: docs.valyu.network
- API Reference: Full parameter documentation above
- Examples: Check the `examples/` directory in this repository
- Issues: Report bugs on GitHub
This project is licensed under the MIT License.