.media argument on llm_message()
All non-text content now attaches to messages through a single .media argument that accepts any combination of media types. Pass a single object or a list:
# Single image
llm_message("What is in this image?",
.media = img("photo.jpg")) |>
chat(claude())
# Multiple images in one message
llm_message("Describe the difference between these two images.",
.media = list(img("before.jpg"), img("after.jpg"))) |>
chat(openai())
# Mixed types: image + PDF together
llm_message("Does this figure match what is reported in Table 2?",
.media = list(img("figure_3.png"),
pdf_file("paper.pdf", pages = 1:8))) |>
chat(gemini())
Three new constructors join img():
audio_file(path): attach audio inline; supported by gemini(), openrouter(), and mistral() (Voxtral models)
video_file(path): attach video inline; supported by gemini() and openrouter()
pdf_file(path, pages, .text_extract): attach a PDF; Claude and Gemini receive the binary file (preserving layout, tables, and scanned content); all other providers receive extracted text automatically
# Transcribe a recording
llm_message("Summarise what is discussed in this interview.",
.media = audio_file("bosch_interview.mp3")) |>
chat(gemini())
# Analyse a video clip with a JSON schema
video_schema <- tidyllm_schema(
title = field_chr("Title or subject of the clip"),
era = field_chr("Approximate decade or period depicted"),
key_people = field_chr("Names mentioned, semicolon-separated")
)
llm_message("Analyse this video clip.",
.media = video_file("documentary.mp4")) |>
chat(gemini(), .json_schema = video_schema)
# Extract references from a scanned PDF (binary path, no OCR needed)
llm_message("Extract all references in APA format.",
.media = pdf_file("1995_Neal_Industry_Specific.pdf")) |>
chat(claude(), .json_schema = ref_schema)
# Force text extraction for any provider
llm_message("Summarise this report.",
.media = pdf_file("annual_report.pdf", .text_extract = TRUE)) |>
chat(openai())
All providers that accept images now handle multiple images per message. Pass them as a list inside .media. Claude supports up to 600 images per message; Gemini up to 3,600.
A unified set of verbs manages files stored on provider servers. Upload once, reuse across many requests:
# Upload; returns a tidyllm_file handle
report <- upload_file(gemini(), .path = "quarterly_report.pdf")
# Attach the handle to any message via .files
llm_message("What were the key results this quarter?",
.files = report) |>
chat(gemini())
llm_message("List the top three risks in the document.",
.files = report) |>
chat(gemini())
# Inspect and manage uploaded files
list_files(gemini())
file_info(gemini(), report) # accepts a tidyllm_file or a plain ID string
delete_file(gemini(), report)
The same pattern works with claude() and openai(). Provider support:
A tidyllm_file is provider-specific: a file uploaded to Claude cannot be sent to Gemini. tidyllm validates provider match before every request.
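The provider check on file handles can be pictured with a small base-R sketch. The handle is modelled here as a plain list with a hypothetical `provider` field, and `new_file_handle()` and `check_provider_match()` are illustrative names; none of this is tidyllm's internal representation.

```r
# Hypothetical stand-in for an uploaded-file handle (not tidyllm's real class)
new_file_handle <- function(id, provider) {
  structure(list(id = id, provider = provider), class = "demo_file")
}

# Refuse to use a handle with a provider other than the one that issued it
check_provider_match <- function(file, provider) {
  stopifnot(inherits(file, "demo_file"))
  if (!identical(file$provider, provider)) {
    stop(sprintf("File '%s' was uploaded to %s and cannot be sent to %s.",
                 file$id, file$provider, provider), call. = FALSE)
  }
  invisible(TRUE)
}

report <- new_file_handle("files/abc123", provider = "gemini")
check_provider_match(report, "gemini")   # passes silently

# A mismatch raises an error; capture its message instead of aborting
mismatch <- tryCatch(check_provider_match(report, "claude"),
                     error = function(e) conditionMessage(e))
print(mismatch)
```

Failing before the request is built keeps the error local and readable, rather than surfacing as an opaque "file not found" from the remote API.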
openai() now uses the Responses API (POST /v1/responses). All existing workflows continue to work unchanged. New capabilities unlocked by the rewrite:
# Reasoning effort for o-series models
llm_message("Prove that there are infinitely many primes.") |>
chat(openai(.model = "o4-mini"), .reasoning_effort = "high")
# Stateful multi-turn conversations (server retains context by ID)
first <- llm_message("My name is Alex.") |>
chat(openai(), .stateful = TRUE)
second <- llm_message("What is my name?") |>
chat(openai(), .previous_response_id = first)
Batch processing (send_batch(openai())) continues to use the Chat Completions endpoint internally.
# Web search: the server runs the search, results appear in the reply
llm_message("What happened in AI research this week?") |>
chat(openai(), .tools = openai_websearch())
# Code interpreter
llm_message("Plot a histogram of 1,000 standard-normal samples.") |>
chat(openai(), .tools = openai_code_interpreter())
# Mix built-in and custom tools in one call
llm_message("Find today's EUR/USD rate and convert 500 EUR.") |>
chat(openai(), .tools = list(openai_websearch(), my_converter_tool))
# Background research job (slow; typically 5 to 30 minutes)
job <- llm_message("Survey the literature on causal inference with LLMs.") |>
deep_research(openai(.model = "o4-mini-deep-research"), .background = TRUE)
check_job(job)
result <- fetch_job(job)
A chat_completions() provider for any OpenAI-compatible endpoint (vLLM, LiteLLM, Together AI, Anyscale, and others), without having to repurpose openai():
llm_message("Hello!") |>
chat(chat_completions(
.api_url = "https://api.together.xyz/v1/",
.api_key_env_var = "TOGETHER_API_KEY",
.model = "meta-llama/Llama-3-8b-chat-hf"
))
.reasoning_effort parameter for Magistral thinking models ("low", "medium", "high"):
llm_message("Is this argument valid?", .media = pdf_file("proof.pdf")) |>
chat(mistral(.model = "magistral-medium-latest"), .reasoning_effort = "high")
audio_file() and video_file() now work with OpenRouter and are routed to the underlying model's audio/video endpoint. Filter for capable models by the audio modality at openrouter.ai/models.
The following are soft-deprecated with warnings in 0.5.0 and will remain as permanent aliases:
.imagefile on llm_message(): use .media = img(path) instead
.pdf on llm_message(): use .media = pdf_file(path) instead
claude_upload_file(), claude_delete_file(), claude_file_metadata(), claude_list_files(): use upload_file(claude()), delete_file(claude()), file_info(claude()), list_files(claude()) instead
gemini_upload_file(), gemini_delete_file(), gemini_file_metadata(), gemini_list_files(): use the corresponding upload_file(gemini()) etc. verbs instead
.file_ids on claude_chat() and .fileid on gemini_chat(): upload with upload_file() and attach with .files on llm_message() instead
file_info() and delete_file() accept a tidyllm_file object directly in addition to a plain ID string
claude() updated to claude-sonnet-4-6; fast model updated to claude-haiku-4-5
gemini() updated to gemini-2.5-flash; default embedding model updated to gemini-embedding-2-preview
voyage_embedding() updated to voyage-4
openai() updated to gpt-5.5 (released April 2026)
deepseek() updated to deepseek-v4-pro (DeepSeek V4, released April 2026); .thinking = TRUE now enables thinking mode via the thinking body parameter instead of switching to the deprecated deepseek-reasoner model name; both deepseek-v4-pro and deepseek-v4-flash support thinking mode
openrouter()
Access to 300+ models from a single API key via OpenRouter. Supports chat, embeddings, model listing, and fallback routing across providers:
# Chat with any model on OpenRouter
llm_message("What is the capital of France?") |>
chat(openrouter(.model = "anthropic/claude-3.5-sonnet"))
# List available models
list_models(openrouter())
# Check account credits
openrouter_credits()
# Retrieve generation metadata (tokens, cost) for a completed request
openrouter_generation(generation_id)
OpenRouter also supports provider fallback routing — specify a list of fallback providers to use if the primary model is unavailable.
llamacpp()
Full support for local llama.cpp servers, including chat, embeddings, reranking, and model management:
# Chat with a local llama.cpp server
llm_message("Explain R to a Python developer") |>
chat(llamacpp())
# Generate embeddings
c("text one", "text two") |> embed(llamacpp())
# Rerank documents by relevance
llamacpp_rerank("best R package for LLMs", c("tidyllm", "ellmer", "httr2"))
# Model management
llamacpp_list_local_models() # list models in the model directory
list_hf_gguf_files("Qwen/Qwen2.5-7B-Instruct-GGUF") # browse HuggingFace GGUF files
llamacpp_download_model("Qwen/Qwen2.5-7B-Instruct-GGUF", "qwen2.5-7b-instruct-q4_k_m.gguf")
llamacpp_delete_model("path/to/model.gguf")
llamacpp_health() # check server status
deep_research(), check_job(), fetch_job()
A new deep_research() verb for running long-horizon research tasks. Currently supported by perplexity() via the sonar-deep-research model:
# Blocking — waits for completion and returns an LLMMessage
result <- llm_message("Compare Rust and Go for systems programming") |>
deep_research(perplexity())
get_reply(result)
get_metadata(result)$api_specific[[1]]$citations
# Background — returns immediately, poll with check_job() / fetch_job()
job <- llm_message("Summarize the latest EU AI Act developments") |>
deep_research(perplexity(), .background = TRUE)
check_job(job) # poll status
result <- fetch_job(job) # retrieve when complete
check_job() and fetch_job() are type-dispatching aliases — they delegate to check_batch()/fetch_batch() for batch objects, or to perplexity_check_research()/perplexity_fetch_research() for research jobs.
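The idea of one verb that delegates according to the type of job can be sketched with plain S3 dispatch. The classes `demo_batch` and `demo_research` and the generic `check_job_demo()` are hypothetical names for illustration only; tidyllm's own job objects and dispatch mechanism may differ.

```r
# Hypothetical job classes used only for this illustration
batch_job    <- structure(list(id = "batch_1"),    class = "demo_batch")
research_job <- structure(list(id = "research_1"), class = "demo_research")

# One generic, with one method per job type
check_job_demo <- function(job, ...) UseMethod("check_job_demo")

check_job_demo.demo_batch <- function(job, ...) {
  paste("checking batch", job$id)
}

check_job_demo.demo_research <- function(job, ...) {
  paste("checking research job", job$id)
}

# Anything else is rejected with a readable error
check_job_demo.default <- function(job, ...) {
  stop("Unsupported job type: ", class(job)[1], call. = FALSE)
}

print(check_job_demo(batch_job))
print(check_job_demo(research_job))
```

The caller only needs to remember one verb; the object returned by the submitting call carries enough type information to reach the right backend.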
perplexity_deep_research(), perplexity_check_research(), perplexity_fetch_research() functions for async deep research via the sonar-deep-research model
.json_schema structured output support for both perplexity_chat() and perplexity_deep_research()
.search_domain_filter parameter to restrict or exclude domains (up to 10, prefix - to exclude)
.reasoning_effort parameter for perplexity_deep_research() ("low", "medium", "high")
Extended thinking is now available for two additional providers:
.thinking_budget parameter in gemini_chat() sets the token budget for internal reasoning (works with gemini-2.5-flash and gemini-2.5-pro). Thinking output is stored in get_metadata()$api_specific[[1]]$thinking.
.thinking = TRUE in deepseek_chat() switches to the deepseek-reasoner model and captures the reasoning trace in get_metadata()$api_specific[[1]]$thinking.
field_fct() (enum) and vector fields are now correctly serialised in tool parameter schemas
ellmer_tool() converts ellmer ToolDef objects to tidyllm TOOL objects, enabling tools defined in ellmer (and packages like btw) to be used directly with tidyllm:
btw_tool <- ellmer_tool(btw::btw_tool_files_list_files)
llm_message("List files in the R/ folder") |>
chat(claude(), .tools = btw_tool)
ellmer_tool() also supports provider-native builtin tools such as ellmer::claude_tool_web_search():
web_search <- ellmer_tool(ellmer::claude_tool_web_search())
llm_message("Latest AI safety news?") |>
chat(claude(), .tools = web_search)
chat_ellmer() lets you use any ellmer Chat object as a tidyllm provider, bridging the two ecosystems
.json_schema structured output support for groq_chat() and Groq batch requests
json_object mode in favour of proper JSON schema responses
send_groq_batch() to fetch_groq_batch()
voyage_rerank(): new reranking function using the rerank-2 model; returns a tibble sorted by relevance score
.output_dimension parameter for voyage_embedding(): control output vector size (256, 512, 1024, 2048) for Voyage-4 models
openrouter_embedding(): generate embeddings via OpenRouter
openrouter_credits(): check account balance and credit usage
openrouter_generation(): retrieve token and cost metadata for a completed generation
check_azure_openai_batch() and fetch_azure_openai_batch() to handle null values for created_at and expires_at fields returned by some deployments
ollama() changed to qwen3.5:4b (faster, better instruction following for local use)
ollama() changed to qwen3-embedding:0.6b
dispatch_to_provider() helper, reducing duplication across all verbs
send_mistral_batch() and send_groq_batch() to their respective fetch functions, causing structured-output batch results to be returned as raw text
perplexity_deep_research() API request format for the async endpoint
claude(). At the moment only implemented for the chat() verb
example_file <- here::here("vignettes","die_verwandlung.pdf") |>
claude_upload_file()
llm_message("Summarize the document in 100 words") |>
chat(claude(.file_ids = example_file$file_id))
#> Message History:
#> system:
#> You are a helpful assistant
#> --------------------------------------------------------------
#> user:
#> Summarize the document in 100 words
#> --------------------------------------------------------------
#> assistant:
#> This document is the German text of Franz Kafka's novella
#> "Die Verwandlung" (The Metamorphosis), published through
#> Project Gutenberg. The story follows Gregor Samsa, a
#> traveling salesman who wakes up one morning transformed into
#> a monstrous insect-like creature. Unable to work and support
#> his family, Gregor becomes isolated in his room while his
#> family struggles with the burden of his transformation.
#> His sister Grete initially cares for him, bringing food
#> and cleaning his room, but over time the family's situation
#> deteriorates financially and emotionally. The story explores
#> themes of alienation, family duty, and dehumanization as
#> Gregor gradually loses his human identity and connection to
#> his family. Eventually, Gregor dies, and his family, though
#> initially grief-stricken, ultimately feels relieved and
#> optimistic about their future without the burden of caring
#> for him. The text includes the complete three-part novella
#> along with Project Gutenberg licensing information.
#> --------------------------------------------------------------
perplexity() provider now supports more Perplexity API parameters, allowing you to set reasoning and search effort.
gemini() with most functionality of chat() requests.
This release marks a major internal refactor accompanied by a suite of subtle yet impactful improvements. While many changes occur under the hood, they collectively deliver a more robust, flexible, and maintainable framework.
Robust Streaming:
httr2::req_perform_connection() (httr2 ≥ 1.1.1), resulting in a more stable and reliable experience.
Optimized Internal Processing:
Schema support:
New field_object() function to allow for nested schemata
Expanded API Features:
mistral() now accepts the .json_schema argument.
claude() incorporates .json_schema via a JSON-extractor tool, in line with Anthropic's guidelines.
Claude now supports reasoning:
conversation <- llm_message("Are there an infinite number of prime numbers such that n mod 4 == 3?") |>
chat(claude(.thinking=TRUE)) |>
print()
#> Message History:
#> system:
#> You are a helpful assistant
#> --------------------------------------------------------------
#> user:
#> Are there an infinite number of prime numbers such that n
#> mod 4 == 3?
#> --------------------------------------------------------------
#> assistant:
#> # Infinitude of Primes Congruent to 3 mod 4
#>
#> Yes, there are infinitely many prime numbers $p$ such
#> that $p \equiv 3 \pmod{4}$ (when $p$ divided by 4 leaves
#> remainder 3).
#>
#> ## Proof by Contradiction
#>
#> I'll use a proof technique similar to Euclid's classic proof
#> of the infinitude of primes:
#>
#> 1) Assume there are only finitely many primes $p$ such that
#> $p \equiv 3 \pmod{4}$. Let's call them $p_1, p_2, ..., p_k$.
#>
#> 2) Consider the number $N = 4p_1p_2...p_k - 1$
#>
#> 3) Note that $N \equiv 3 \pmod{4}$ since $4p_1p_2...p_k
#> \equiv 0 \pmod{4}$ and $4p_1p_2...p_k - 1 \equiv -1 \equiv 3
#> \pmod{4}$
#>
#> 4) $N$ must have at least one prime factor $q$
#>
#> 5) For any $i$ between 1 and $k$, we have $N \equiv -1
#> \pmod{p_i}$, so $N$ is not divisible by any of the primes
#> $p_1, p_2, ..., p_k$
#>
#> 6) Therefore, $q$ is a prime not in our original list
#>
#> 7) Furthermore, $q$ must be congruent to 3 modulo 4:
#> - $q$ cannot be 2 because $N$ is odd
#> - If $q \equiv 1 \pmod{4}$, then $\frac{N}{q} \equiv 3
#> \pmod{4}$ would need another prime factor congruent to 3
#> modulo 4
#> - So $q \equiv 3 \pmod{4}$
#>
#> 8) This contradicts our assumption that we listed all primes
#> of the form $p \equiv 3 \pmod{4}$
#>
#> Therefore, there must be infinitely many primes of the form
#> $p \equiv 3 \pmod{4}$.
#> --------------------------------------------------------------
#Thinking process is stored in API-specific metadata
conversation |>
get_metadata() |>
dplyr::pull(api_specific) |>
purrr::map_chr("thinking") |>
cat()
#> The question is asking if there are infinitely many prime numbers $p$ such that $p \equiv 3 \pmod{4}$, i.e., when divided by 4, the remainder is 3.
#>
#> I know that there are infinitely many prime numbers overall. The classic proof is Euclid's proof by contradiction: if there were only finitely many primes, we could multiply them all together, add 1, and get a new number not divisible by any of the existing primes, which gives us a contradiction.
#>
#> For primes of the form $p \equiv 3 \pmod{4}$, we can use a similar proof strategy.
#>
#> Let's assume there are only finitely many primes $p_1, p_2, \ldots, p_k$ such that $p_i \equiv 3 \pmod{4}$ for all $i$.
#>
#> Now, consider the number $N = 4 \cdot p_1 \cdot p_2 \cdot \ldots \cdot p_k - 1$.
#>
#> Note that $N \equiv -1 \equiv 3 \pmod{4}$.
#>
#> Now, let's consider the prime factorization of $N$. If $N$ is itself prime, then we have found a new prime $N$ such that $N \equiv 3 \pmod{4}$, which contradicts our assumption that we enumerated all such primes.
#>
> ...
gemini(): System prompts were not sent to the API in older versions
A first tool usage system inspired by a similar system in ellmer has been introduced to tidyllm. At the moment tool use is available for claude(), openai(), mistral(), ollama(), gemini() and groq():
get_current_time <- function(tz, format = "%Y-%m-%d %H:%M:%S") {
format(Sys.time(), tz = tz, format = format, usetz = TRUE)
}
time_tool <- tidyllm_tool(
.f = get_current_time,
.description = "Returns the current time in a specified timezone. Use this to determine the current time in any location.",
tz = field_chr("The time zone identifier (e.g., 'Europe/Berlin', 'America/New_York', 'Asia/Tokyo', 'UTC'). Required."),
format = field_chr("Format string for the time output. Default is '%Y-%m-%d %H:%M:%S'.")
)
llm_message("What's the exact time in Stuttgart?") |>
chat(openai,.tools=time_tool)
#> Message History:
#> system:
#> You are a helpful assistant
#> --------------------------------------------------------------
#> user:
#> What's the exact time in Stuttgart?
#> --------------------------------------------------------------
#> assistant:
#> The current time in Stuttgart (Europe/Berlin timezone) is
#> 2025-03-03 09:51:22 CET.
#> --------------------------------------------------------------
You can use the tidyllm_tool() function to define tools available to a large language model.
Once a tool or a list of tools is passed to a model, it can request to run these
functions in your current session and use their output as context for further generation.
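What "the model requests to run a function in your session" means can be sketched in base R. The shape of `tool_request` (a name plus a named argument list) and the helper `run_tool()` are assumptions made for illustration; they are not the wire format or the internals that tidyllm uses.

```r
# A local function the model is allowed to call
get_current_time <- function(tz, format = "%Y-%m-%d %H:%M:%S") {
  format(Sys.time(), tz = tz, format = format, usetz = TRUE)
}

# Registry of callable tools, keyed by the name the model would use
tools <- list(get_current_time = get_current_time)

# Hypothetical shape of a tool request coming back from a model
tool_request <- list(name = "get_current_time",
                     arguments = list(tz = "Europe/Berlin"))

# Look up the requested function and call it with the supplied arguments
run_tool <- function(request, registry) {
  fn <- registry[[request$name]]
  if (is.null(fn)) stop("Unknown tool: ", request$name, call. = FALSE)
  do.call(fn, request$arguments)
}

tool_output <- run_tool(tool_request, tools)
print(tool_output)
```

In a real exchange the tool output would be sent back to the model as additional context, and the model would then produce its final answer.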
tidyllm now supports the DeepSeek API as a provider via deepseek_chat() or the deepseek() provider function.
DeepSeek supports logprobs just like openai(), which you can retrieve via get_logprobs().
At the moment, tool usage with DeepSeek is very inconsistent.
Voyage.ai offers a multimodal embeddings feature, allowing you to generate embeddings not only for text but also for images.
The new voyage_embedding() function in tidyllm supports this feature by handling different input types,
and it still accepts the same inputs as the other embedding functions.
The new img() function lets you create image objects for embedding. You can mix text and img() objects in a list and send them to Voyage AI for multimodal embeddings:
list("tidyllm", img(here::here("docs", "logo.png"))) |>
embed(voyage)
#> # A tibble: 2 × 2
#> input embeddings
#> <chr> <list>
#> 1 tidyllm <dbl [1,024]>
#> 2 [IMG] logo.png <dbl [1,024]>
In this example, both text ("tidyllm") and an image (logo.png) are embedded together. The function returns a tibble where the input column contains the text and labeled image names, and the embeddings column contains the corresponding embedding vectors.
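Once embeddings arrive as a list column, comparing two inputs is a matter of pulling the vectors out. The sketch below uses a plain data frame with made-up three-dimensional vectors as a stand-in for the tibble shown above; `cosine_similarity()` is a helper defined here, not a tidyllm function.

```r
# Toy stand-in for embed() output: one row per input, vectors in a list column
# (made-up 3-dimensional vectors; real embeddings have many more dimensions)
emb <- data.frame(input = c("tidyllm", "[IMG] logo.png"),
                  stringsAsFactors = FALSE)
emb$embeddings <- list(c(1, 0, 1), c(1, 1, 0))

# Cosine similarity: dot product divided by the product of the vector norms
cosine_similarity <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Stack the list column into a matrix with one row per input
emb_matrix <- do.call(rbind, emb$embeddings)
rownames(emb_matrix) <- emb$input

sim <- cosine_similarity(emb_matrix[1, ], emb_matrix[2, ])
print(sim)
```

Because text and image embeddings live in the same vector space, the same similarity measure applies to text-text, image-image, and text-image pairs.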
tidyllm_schema() and tidyllm_tool()
⚠️ There is a bad bug in the latest CRAN release in the fetch_openai_batch() function that is only fixed in version 0.3.2. In release 0.3.1, the fetch_openai_batch() function throws errors if logprobs are turned off.
.json_schema option of api-functions. Moreover, tidyllm_schema() now accepts ellmer types as field definitions. In addition, four ellmer-inspired type-definition functions field_chr(), field_dbl(), field_lgl() and field_fct() were added that allow you to set description fields in schemata
ellmer_adress <- ellmer::type_object(
street = ellmer::type_string("A famous street"),
houseNumber = ellmer::type_number("a 3 digit number"),
postcode = ellmer::type_string(),
city = ellmer::type_string("A large city"),
region = ellmer::type_string(),
country = ellmer::type_enum(values = c("Germany", "France"))
)
person_schema <- tidyllm_schema(
person_name = "string",
age = field_dbl("An age between 25 and 40"),
is_employed = field_lgl("Employment Status in the last year"),
occupation = field_fct(.levels=c("Lawyer","Butcher")),
address = ellmer_adress
)
address_message <- llm_message("imagine an address") |>
chat(openai,.json_schema = ellmer_adress)
person_message <- llm_message("imagine a person profile") |>
chat(openai,.json_schema = person_schema)
openai_chat() and send_openai_batch() and new get_logprobs() function:
badger_poem <- llm_message("Write a haiku about badgers") |>
chat(openai(.logprobs=TRUE,.top_logprobs=5))
badger_poem |> get_logprobs()
#> # A tibble: 19 × 5
#> reply_index token logprob bytes top_logprobs
#> <int> <chr> <dbl> <list> <list>
#> 1 1 "In" -0.491 <int [2]> <list [5]>
#> 2 1 " moon" -1.12 <int [5]> <list [5]>
#> 3 1 "lit" -0.00489 <int [3]> <list [5]>
#> 4 1 " forest" -1.18 <int [7]> <list [5]>
#> 5 1 "," -0.00532 <int [1]> <list [5]>
ollama_delete_model() function
list_models() is now a verb supporting most providers.
list_models(openai)
#> # A tibble: 52 × 3
#> id created owned_by
#> <chr> <chr> <chr>
#> 1 gpt-4o-mini-audio-preview-2024-12-17 2024-12-13 18:52:00 system
#> 2 gpt-4-turbo-2024-04-09 2024-04-08 18:41:17 system
#> 3 dall-e-3 2023-10-31 20:46:29 system
#> 4 dall-e-2 2023-11-01 00:22:57 system
send_ollama_batch() function to make use of the fast parallel request features of Ollama.
openai() reasoning models supported
perplexity() and gemini()
LLMMessage
tidyllm 0.3.0 represents a major milestone for tidyllm
The largest changes compared to 0.2.0 are:
chat(), embed(), send_batch(), check_batch(), and fetch_batch() to interact with APIs. These functions always work with a combination of verbs and providers:
Verbs (e.g., chat(), embed(), send_batch()) define the type of action you want to perform.
Providers (e.g., openai(), claude(), ollama()) are an argument of verbs; they specify the API that handles the action and take provider-specific arguments.
Each verb and provider combination routes the interaction to provider-specific functions like openai_chat() or claude_chat() that do the work in the background. These functions can also be called directly as an alternative, more verbose and provider-specific interface.
llm_message("Hello World") |>
openai(.model = "gpt-4o")
# Recommended Verb-Based Approach
llm_message("Hello World") |>
chat(openai(.model = "gpt-4o"))
# Or even configuring a provider outside
my_ollama <- ollama(.model = "llama3.2-vision:90B",
.ollama_server = "https://ollama.example-server.de",
.temperature = 0)
llm_message("Hello World") |>
chat(my_ollama)
# Alternative Approach is to use more verbose specific functions:
llm_message("Hello World") |>
openai_chat(.model = "gpt-4o")
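The routing from a verb plus a provider to a provider-specific function can be illustrated with a toy dispatcher. Everything prefixed with `demo_` is hypothetical and exists only for this sketch; it mirrors the idea described above, not tidyllm's actual implementation.

```r
# Hypothetical provider constructor: records which backend to use plus its options
demo_provider <- function(name, ...) {
  structure(list(name = name, args = list(...)), class = "demo_provider")
}

# Hypothetical provider-specific workers
demo_openai_chat <- function(msg, .model = "model-a") {
  sprintf("[openai/%s] %s", .model, msg)
}
demo_ollama_chat <- function(msg, .model = "model-b") {
  sprintf("[ollama/%s] %s", .model, msg)
}

# The verb looks up the worker for the provider and forwards the stored arguments
demo_chat <- function(msg, provider) {
  workers <- list(openai = demo_openai_chat, ollama = demo_ollama_chat)
  worker <- workers[[provider$name]]
  if (is.null(worker)) {
    stop("No chat support for provider: ", provider$name, call. = FALSE)
  }
  do.call(worker, c(list(msg), provider$args))
}

out <- demo_chat("Hello World", demo_provider("openai", .model = "gpt-4o"))
print(out)
```

Separating "what to do" (the verb) from "where to do it" (the provider object) is what lets a configured provider such as `my_ollama` be reused across many calls.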
Provider functions (openai(), claude(), etc.) still work if you directly supply an LLMMessage as argument, but issue deprecation warnings when used directly for chat.
R6-based saved LLMMessage objects are no longer compatible with the new version. Saved objects from earlier versions need to be re-created
gemini() and perplexity() as new supported API providers. gemini() brings interesting video and audio features as well as search grounding to tidyllm. perplexity() also offers well-cited, search-grounded assistant replies
mistral()
get_reply_metadata() to get information on token usage, or on other relevant metadata (like sources used for grounding)
R6 to S7 for the main LLMMessage class, improving maintainability, interoperability, and future-proofing.
.grounding_threshold argument added to the gemini_chat() function, allowing you to use Google searches to ground Gemini model responses in search results. For example, asking about the maintainer of an obscure R package works with grounding but leads only to a hallucination without it:
llm_message("What is tidyllm and who maintains this package?") |>
gemini_chat(.grounding_threshold = 0.3)
perplexity_chat(). The neat feature of perplexity is the up-to-date web search it does with detailed citations. Cited sources are available in the api_specific-list column of get_metadata()
.json_schema support for ollama() available with Ollama 0.5.0
get_metadata() returns a list column with API-specific metadata
R6 to S7 for the main LLMMessage class
df_llm_message()
APIProvider classes
api_openai.R, api_gemini.R, etc. files
as_tibble() S3 Generic for LLMMessage
track_rate_limit()
.onattach() removed
R6-based LLMMessage-objects are not compatible with the new version anymore! This also applies to saved objects, like lists of batch files.
here::here("local_wip","example.mp3") |> gemini_upload_file()
here::here("local_wip","legrille.mp4") |> gemini_upload_file()
file_tibble <- gemini_list_files()
llm_message("What are these two files about?") |>
gemini_chat(.fileid=file_tibble$name)
Better embedding functions with improved output and error handling and new documentation. New article on using embeddings with tidyllm. Support for embedding models on azure with azure_openai_embedding()
The output of embed() and the related API-specific functions was changed from a matrix to a tibble with an input column and a list column containing one embedding vector and one input per row.
One disadvantage of the first iteration of the new interface was that all arguments that needed to be passed to provider-specific functions went through the provider function. This feels unintuitive, because users expect common arguments (e.g., .model, .temperature) to be set directly in main verbs like chat() or send_batch(). Moreover, provider functions don't expose arguments for autocomplete, making it harder for users to explore options. Therefore, the main API verbs now directly accept common arguments and check them against the available arguments for each API.
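Checking verb-level arguments against what a backend accepts can be done by inspecting the formal arguments of the provider-specific function. The sketch below is a minimal illustration under that assumption; `demo_provider_chat()` and `check_verb_args()` are hypothetical names, and tidyllm's real check may work differently.

```r
# Hypothetical provider-specific function with its own set of arguments
demo_provider_chat <- function(.llm, .model = "default-model", .temperature = NULL) {
  list(model = .model, temperature = .temperature)
}

# Verb-level helper: accept common arguments, reject ones the backend lacks
check_verb_args <- function(provider_fn, ...) {
  supplied <- list(...)
  allowed  <- names(formals(provider_fn))
  unknown  <- setdiff(names(supplied), allowed)
  if (length(unknown) > 0) {
    stop("Unsupported argument(s) for this provider: ",
         paste(unknown, collapse = ", "), call. = FALSE)
  }
  supplied
}

# A supported argument passes through unchanged
ok <- check_verb_args(demo_provider_chat, .temperature = 0)

# An unsupported argument is rejected with a readable message
bad <- tryCatch(check_verb_args(demo_provider_chat, .top_k = 5),
                error = function(e) conditionMessage(e))
print(ok)
print(bad)
```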
tidyllm has introduced a verb-based interface overhaul to provide a more intuitive and flexible user experience. Previously, provider-specific functions like claude(), openai(), and others were directly used for chat-based workflows. Now, these functions primarily serve as provider configuration for some general verbs like chat().
chat(), embed(), send_batch(), check_batch(), and fetch_batch() to interact with APIs. These functions always work with a combination of verbs and providers:
Verbs (e.g., chat(), embed(), send_batch()) define the type of action you want to perform.
Providers (e.g., openai(), claude(), ollama()) are an argument of verbs; they specify the API that handles the action and take provider-specific arguments.
Each verb and provider combination routes the interaction to provider-specific functions like openai_chat() or claude_chat() that do the work in the background. These functions can also be called directly as an alternative, more verbose and provider-specific interface.
llm_message("Hello World") |>
openai(.model = "gpt-4o")
# Recommended Verb-Based Approach
llm_message("Hello World") |>
chat(openai(.model = "gpt-4o"))
# Or even configuring a provider outside
my_ollama <- ollama(.model = "llama3.2-vision:90B",
.ollama_server = "https://ollama.example-server.de",
.temperature = 0)
llm_message("Hello World") |>
chat(my_ollama)
# Alternative Approach is to use more verbose specific functions:
llm_message("Hello World") |>
openai_chat(.model = "gpt-4o")
Provider functions (openai(), claude(), etc.) still work if you directly supply an LLMMessage as argument, but issue deprecation warnings when used directly for chat.
gemini() main API-function
#Upload a file for use with gemini
upload_info <- gemini_upload_file("example.mp3")
#Make the file available during a Gemini API call
llm_message("Summarize this speech") |>
gemini(.fileid = upload_info$name)
#Delete the file from the Google servers
gemini_delete_file(upload_info$name)
tidyllm_schema()
gemini()-requests allow for a wide range of file types that can be used for context in messages
gemini() file workflows:
application/pdf, text/plain, text/html, text/css, text/md, text/csv, text/xml, text/rtf
gemini() file workflows:
application/x-javascript, text/javascript
application/x-python, text/x-python
gemini() file workflows:
image/png, image/jpeg, image/webp, image/heic, image/heif
gemini() file workflows:
video/mp4, video/mpeg, video/mov, video/avi, video/x-flv, video/mpg, video/webm, video/wmv, video/3gpp
gemini() file workflows:
audio/wav, audio/mp3, audio/aiff, audio/aac, audio/ogg, audio/flac
get_metadata() function to retrieve and format metadata from LLMMessage objects.
print method for LLMMessage to support printing metadata, controlled via the new tidyllm_print_metadata option or a new .meta-argument for the print method.
conversation <- llm_message("Write a short poem about software development") |>
claude()
#Get metadata on token usage and model as tibble
get_metadata(conversation)
#or print it with the message
print(conversation,.meta=TRUE)
#Or always print it
options(tidyllm_print_metadata=TRUE)
send_openai_batch() caused by a missing .json-argument not being passed for messages without schema
New CRAN release. Largest changes compared to 0.1.0:
Major Features:
.json_schema handling in openai(), enhancing support for well-defined JSON responses.
azure_openai() function for accessing the Azure OpenAI service, with full support for rate-limiting and batch operations tailored to Azure’s API structure.
mistral() function provides full support for Mistral models hosted in the EU, including rate-limiting and streaming capabilities.
pdf_page_batch() function, which processes PDFs page by page, allowing users to define page-specific prompts for detailed analysis.
.compatible argument (and flexible url and path) in openai() to allow compatibility with third-party OpenAI-compatible APIs.
Improvements:
to_api_format() to reduce code duplication, simplify API format generation, and improve maintainability.
httr2::req_retry() in addition to the rate-limit tracking functions in tidyllm, using 429 headers to wait for rate limit resets.
httptest2
Breaking Changes:
get_reply() was split into get_reply() for text outputs and get_reply_data() for structured outputs, improving type stability compared to an earlier function that had different outputs based on a .json-argument.
chatgpt(): The chatgpt() function has been deprecated in favor of openai() for feature alignment and improved consistency.
Minor Updates and Bug Fixes:
llm_message(): Allows extraction of specific page ranges from PDFs, improving flexibility in document handling.
ollama_download_model() function to download models from the Ollama API
.compatible-argument in openai() to allow working with compatible third-party APIs
to_api_format(): API format generation now has much less code duplication and is more maintainable.
get_reply() was split into two type-stable functions: get_reply() for text and get_reply_data() for structured outputs.
httr2::req_retry(): Rate limiting now uses the right 429 headers where they are available.
Enhanced Input Validation: All API functions now have improved input validation, ensuring better alignment with API documentation
Improved error handling: More human-readable error messages for failed requests from the API
Advanced JSON Mode in openai(): The openai() function now supports advanced .json_schemas, allowing structured output in JSON mode for more precise responses.
Reasoning Models Support: Support for O1 reasoning models has been added, with better handling of system prompts in the openai() function.
Streaming callback functions refactored: Given that the streaming callback format for OpenAI, Mistral and Groq is nearly identical, the three now rely on the same callback function.
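The retry-on-429 behaviour mentioned in the notes above can be sketched with httr2 directly. This is a minimal example of `httr2::req_retry()`, not tidyllm's actual configuration: the URL is a placeholder, and reading the wait time from a `Retry-After` header given in seconds is an assumption, since providers differ in which headers they send.

```r
library(httr2)

# Placeholder endpoint; the request is only built here, never sent
req <- request("https://api.example.com/v1/chat") |>
  req_retry(
    max_tries = 3,
    # Treat HTTP 429 (rate limited) as a transient, retryable failure
    is_transient = function(resp) resp_status(resp) == 429,
    # Wait as long as the server asks via Retry-After (assumed to be seconds);
    # returning NA falls back to httr2's default backoff
    after = function(resp) {
      wait <- resp_header(resp, "Retry-After")
      if (is.null(wait)) NA else as.numeric(wait)
    }
  )

print(class(req))
```

Passing the configured request to `req_perform()` would then retry automatically when the server answers with a 429.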
chatgpt() Deprecated: The chatgpt() function has been deprecated in favor of openai(). Users should migrate to openai() to take advantage of the new features and enhancements.
openai(), ollama(), and claude() functions now return more informative error messages when API calls fail, helping with debugging and troubleshooting.
ollama_embedding() to generate embeddings using the Ollama API.
openai_embedding() to generate embeddings using the OpenAI API.
mistral_embedding() to generate embeddings using the Mistral API.
llm_message(): The llm_message() function now supports specifying a range of pages in a PDF by passing a list with filename, start_page, and end_page. This allows users to extract and process specific pages of a PDF.
pdf_page_batch() function, which processes PDF files page by page, extracting text and converting each page into an image, allowing for a general prompt or page-specific prompts. The function generates a list of LLMMessage objects that can be sent to an API and work with the batch-API functions in tidyllm.
mistral() function to use Mistral models on La Plateforme on servers hosted in the EU, with rate-limiting and streaming support.
last_user_message() pulls the last message the user sent.
get_reply() gets the assistant reply at a given index of assistant messages.
get_user_message() gets the user message at a given index of user messages.
.dry_run argument, allowing users to generate an httr2-request for easier debugging and inspection.
httptest2-based tests with mock responses for all API functions, covering both basic functionality and rate-limiting.
ollama_download_model() function to download models from the Ollama API. It supports a streaming mode that provides live progress bar updates on the download progress.
llm_message()
groq() function now supports images.
llm_message().
JSON Mode: JSON mode is now more widely supported across all API functions, allowing for structured outputs when APIs support them.
The .json argument is now passed only to API functions, specifying how the API should respond, and it is not needed anymore in last_reply().
Improved last_reply() Behavior: The behavior of the last_reply() function has changed. It now automatically handles JSON replies by parsing them into structured data and falling back to raw text in case of errors. You can still force raw text replies even for JSON output using the .raw argument.
last_reply(): The .json argument is no longer used, and JSON replies are automatically parsed. Use .raw to force raw text replies.
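The "parse as JSON, fall back to raw text" behaviour described for last_reply() follows a common pattern that can be sketched with jsonlite. `parse_reply()` and its `raw` argument are hypothetical names for this illustration and are not the last_reply() implementation.

```r
library(jsonlite)

# Parse a reply as JSON when possible; otherwise hand back the raw text
parse_reply <- function(text, raw = FALSE) {
  if (raw) return(text)
  tryCatch(
    fromJSON(text),
    error = function(e) text   # not valid JSON: keep the original string
  )
}

structured <- parse_reply('{"name": "Ada", "age": 36}')
plain      <- parse_reply("Sorry, I cannot answer in JSON.")
forced_raw <- parse_reply('{"name": "Ada"}', raw = TRUE)

str(structured)
print(plain)
print(forced_raw)
```

The trade-off of this design is that the return type depends on the content of the reply, which is the type-stability concern that later led to separate functions for text and structured output.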