NEWS


tidyllm 0.3.3

Thinking support in Claude

Claude now supports extended thinking (reasoning), which can be enabled with the .thinking argument:

conversation <- llm_message("Are there an infinite number of prime numbers such that n mod 4 == 3?") |>
  chat(claude(.thinking = TRUE)) |>
  print()
   
#> Message History:
#> system:
#> You are a helpful assistant
#> --------------------------------------------------------------
#> user:
#> Are there an infinite number of prime numbers such that n
#> mod 4 == 3?
#> --------------------------------------------------------------
#> assistant:
#> # Infinitude of Primes Congruent to 3 mod 4
#> 
#> Yes, there are infinitely many prime numbers $p$ such
#> that $p \equiv 3 \pmod{4}$ (when $p$ divided by 4 leaves
#> remainder 3).
#> 
#> ## Proof by Contradiction
#> 
#> I'll use a proof technique similar to Euclid's classic proof
#> of the infinitude of primes:
#> 
#> 1) Assume there are only finitely many primes $p$ such that
#> $p \equiv 3 \pmod{4}$. Let's call them $p_1, p_2, ..., p_k$.
#> 
#> 2) Consider the number $N = 4p_1p_2...p_k - 1$
#> 
#> 3) Note that $N \equiv 3 \pmod{4}$ since $4p_1p_2...p_k
#> \equiv 0 \pmod{4}$ and $4p_1p_2...p_k - 1 \equiv -1 \equiv 3
#> \pmod{4}$
#> 
#> 4) $N$ must have at least one prime factor $q$
#> 
#> 5) For any $i$ between 1 and $k$, we have $N \equiv -1
#> \pmod{p_i}$, so $N$ is not divisible by any of the primes
#> $p_1, p_2, ..., p_k$
#> 
#> 6) Therefore, $q$ is a prime not in our original list
#> 
#> 7) Furthermore, $q$ must be congruent to 3 modulo 4:
#> - $q$ cannot be 2 because $N$ is odd
#> - If $q \equiv 1 \pmod{4}$, then $\frac{N}{q} \equiv 3
#> \pmod{4}$ would need another prime factor congruent to 3
#> modulo 4
#> - So $q \equiv 3 \pmod{4}$
#> 
#> 8) This contradicts our assumption that we listed all primes
#> of the form $p \equiv 3 \pmod{4}$
#> 
#> Therefore, there must be infinitely many primes of the form
#> $p \equiv 3 \pmod{4}$.
#> --------------------------------------------------------------

# Thinking process is stored in API-specific metadata
conversation |>
  get_metadata() |>
  dplyr::pull(api_specific) |>
  purrr::map_chr("thinking") |>
  cat()
   
#> The question is asking if there are infinitely many prime numbers $p$ such that $p \equiv 3 \pmod{4}$, i.e., when divided by 4, the remainder is 3.
#> 
#> I know that there are infinitely many prime numbers overall. The classic proof is Euclid's proof by contradiction: if there were only finitely many primes, we could multiply them all together, add 1, and get a new number not divisible by any of the existing primes, which gives us a contradiction.
#> 
#> For primes of the form $p \equiv 3 \pmod{4}$, we can use a similar proof strategy. 
#> 
#> Let's assume there are only finitely many primes $p_1, p_2, \ldots, p_k$ such that $p_i \equiv 3 \pmod{4}$ for all $i$. 
#> 
#> Now, consider the number $N = 4 \cdot p_1 \cdot p_2 \cdot \ldots \cdot p_k - 1$. 
#> 
#> Note that $N \equiv -1 \equiv 3 \pmod{4}$. 
#> 
#> Now, let's consider the prime factorization of $N$. If $N$ is itself prime, then we have found a new prime $N$ such that $N \equiv 3 \pmod{4}$, which contradicts our assumption that we enumerated all such primes.
#> 
#> ...

Bugfixes

tidyllm 0.3.2 (2025-03-07)

Tool usage introduced to tidyllm

A first tool-usage system, inspired by a similar system in ellmer, has been introduced to tidyllm. At the moment, tool use is available for claude(), openai(), mistral(), ollama(), gemini(), and groq():

get_current_time <- function(tz, format = "%Y-%m-%d %H:%M:%S") {
  format(Sys.time(), tz = tz, format = format, usetz = TRUE)
}

time_tool <- tidyllm_tool(
  .f = get_current_time,
  .description = "Returns the current time in a specified timezone. Use this to determine the current time in any location.",
  tz = field_chr("The time zone identifier (e.g., 'Europe/Berlin', 'America/New_York', 'Asia/Tokyo', 'UTC'). Required."),
  format = field_chr("Format string for the time output. Default is '%Y-%m-%d %H:%M:%S'.")
)


llm_message("What's the exact time in Stuttgart?") |>
  chat(openai,.tools=time_tool)
  
#> Message History:
#> system:
#> You are a helpful assistant
#> --------------------------------------------------------------
#> user:
#> What's the exact time in Stuttgart?
#> --------------------------------------------------------------
#> assistant:
#> The current time in Stuttgart (Europe/Berlin timezone) is
#> 2025-03-03 09:51:22 CET.
#> --------------------------------------------------------------  

You can use the tidyllm_tool() function to define tools available to a large language model. Once a tool or a list of tools is passed to a model, it can request to run these functions in your current session and use their output as additional context for generation.
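
If a model should have access to several tools at once, you can pass them as a list. A minimal sketch (weather_tool is hypothetical here and would be defined with tidyllm_tool() just like time_tool above):

# weather_tool is a hypothetical second tool, for illustration only
llm_message("What time is it in Tokyo, and is it raining there?") |>
  chat(openai, .tools = list(time_tool, weather_tool))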

Support for DeepSeek added

tidyllm now supports the DeepSeek API as a provider via deepseek_chat() or the deepseek() provider function. DeepSeek supports logprobs just like openai(), and you can get them via get_logprobs(). At the moment, tool usage for DeepSeek is very inconsistent.
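
A minimal sketch (this assumes a DeepSeek API key is configured and that deepseek() accepts a .logprobs argument analogous to openai()):

# Request logprobs from DeepSeek and extract them (illustrative sketch)
llm_message("Name three prime numbers") |>
  chat(deepseek(.logprobs = TRUE)) |>
  get_logprobs()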

Support for Voyage.ai and Multimodal Embeddings Added

Voyage.ai offers a unique multimodal embeddings feature that lets you generate embeddings not only for texts but also for images. The new voyage_embedding() function in tidyllm supports this functionality by seamlessly handling different input types: it works with the same inputs as the other embedding functions as well as with the new image inputs.

The new img() function lets you create image objects for embedding. You can mix text and img() objects in a list and send them to Voyage.ai for multimodal embeddings:

list("tidyllm", img(here::here("docs", "logo.png"))) |>
  embed(voyage)
#> # A tibble: 2 × 2
#>   input          embeddings   
#>   <chr>          <list>       
#> 1 tidyllm        <dbl [1,024]>
#> 2 [IMG] logo.png <dbl [1,024]>

In this example, both text ("tidyllm") and an image (logo.png) are embedded together. The function returns a tibble where the input column contains the text and labeled image names, and the embeddings column contains the corresponding embedding vectors.

New Tests and Bugfixes

tidyllm 0.3.1 (2025-02-24)

⚠️ There is a bad bug in the fetch_openai_batch() function in this CRAN release that is only fixed in version 0.3.2: in 0.3.1, fetch_openai_batch() throws errors if logprobs are turned off.

Changes compared to last release

# Schemas defined with ellmer can be used directly in tidyllm:
ellmer_adress <- ellmer::type_object(
  street = ellmer::type_string("A famous street"),
  houseNumber = ellmer::type_number("a 3 digit number"),
  postcode = ellmer::type_string(),
  city = ellmer::type_string("A large city"),
  region = ellmer::type_string(),
  country = ellmer::type_enum(values = c("Germany", "France"))
)

# tidyllm_schema() now supports typed field_*() helpers and nested ellmer schemas:
person_schema <- tidyllm_schema(
  person_name = "string",
  age = field_dbl("An age between 25 and 40"),
  is_employed = field_lgl("Employment Status in the last year"),
  occupation = field_fct(.levels = c("Lawyer", "Butcher")),
  address = ellmer_adress
)

address_message <- llm_message("imagine an address") |>
  chat(openai, .json_schema = ellmer_adress)

person_message <- llm_message("imagine a person profile") |>
  chat(openai, .json_schema = person_schema)

# openai() can now return logprobs, accessible via get_logprobs():
badger_poem <- llm_message("Write a haiku about badgers") |>
  chat(openai(.logprobs = TRUE, .top_logprobs = 5))

badger_poem |> get_logprobs()
#> # A tibble: 19 × 5
#>   reply_index token          logprob bytes     top_logprobs
#>          <int> <chr>            <dbl> <list>    <list>      
#>  1           1 "In"       -0.491      <int [2]> <list [5]>  
#>  2           1 " moon"    -1.12       <int [5]> <list [5]>  
#>  3           1 "lit"      -0.00489    <int [3]> <list [5]>  
#>  4           1 " forest"  -1.18       <int [7]> <list [5]>  
#>  5           1 ","        -0.00532    <int [1]> <list [5]>  

# List the models available from an API provider:
list_models(openai)
#> # A tibble: 52 × 3
#>    id                                   created             owned_by
#>    <chr>                                <chr>               <chr>   
#>  1 gpt-4o-mini-audio-preview-2024-12-17 2024-12-13 18:52:00 system  
#>  2 gpt-4-turbo-2024-04-09               2024-04-08 18:41:17 system  
#>  3 dall-e-3                             2023-10-31 20:46:29 system  
#>  4 dall-e-2                             2023-11-01 00:22:57 system  

tidyllm 0.3.0 (2024-12-08)

tidyllm 0.3.0 represents a major milestone for tidyllm.

The largest changes compared to 0.2.0 are:

New Verb-Based Interface

Each verb and provider combination routes the interaction to provider-specific functions like openai_chat() or claude_chat() that do the work in the background. These functions can also be called directly, offering an alternative, more verbose, provider-specific interface.

Old Usage:

llm_message("Hello World") |>
  openai(.model = "gpt-4o")

New Usage:

# Recommended Verb-Based Approach
llm_message("Hello World") |>
  chat(openai(.model = "gpt-4o"))
  
# Or configure a provider object once and reuse it
my_ollama <- ollama(.model = "llama3.2-vision:90B",
       .ollama_server = "https://ollama.example-server.de",
       .temperature = 0)

llm_message("Hello World") |>
  chat(my_ollama)

# Alternatively, use the more verbose provider-specific functions:
llm_message("Hello World") |>
  openai_chat(.model = "gpt-4o")

Backward Compatibility:

Breaking Changes:

Other Major Features:

Improvements:

tidyllm 0.2.7

Major Features

llm_message("What is tidyllm and who maintains this package?") |>
  gemini_chat(.grounding_threshold = 0.3)

Improvements

tidyllm 0.2.6

Large Refactor of package internals

Breaking Changes

Minor Features

here::here("local_wip","example.mp3") |> gemini_upload_file()
here::here("local_wip","legrille.mp4") |> gemini_upload_file()

file_tibble <- gemini_list_files()

llm_message("What are these two files about?") |>
  gemini_chat(.fileid=file_tibble$name)

tidyllm 0.2.5

Major Features

Better embedding functions with improved output and error handling, and new documentation, including a new article on using embeddings with tidyllm. Support for embedding models on Azure with azure_openai_embedding().
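
For example, texts can be embedded on Azure like this (a minimal sketch, assuming your Azure OpenAI endpoint and API key are already configured in the environment):

# Embed a character vector with an Azure-hosted embedding model
c("tidyllm", "embeddings on Azure") |>
  azure_openai_embedding()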

Breaking Changes

tidyllm 0.2.4

Refinements of the new interface

One disadvantage of the first iteration of the new interface was that all arguments for provider-specific functions had to be passed through the provider function. This felt unintuitive, because users expect common arguments (e.g., .model, .temperature) to be set directly in main verbs like chat() or send_batch(). Moreover, provider functions don't expose arguments for autocomplete, making it harder for users to explore options. Therefore, the main API verbs now directly accept common arguments and check them against the arguments available for each API.
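
For example, a model and temperature can now be set directly in chat() (a minimal sketch):

# Common arguments are accepted by the verb itself and checked against the chosen API
llm_message("Hello World") |>
  chat(openai, .model = "gpt-4o", .temperature = 0)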

Bug-fixes

tidyllm 0.2.3

Major Interface Overhaul

tidyllm has introduced a verb-based interface overhaul to provide a more intuitive and flexible user experience. Previously, provider-specific functions like claude(), openai(), and others were used directly for chat-based workflows. Now, these functions primarily serve as provider configuration for general verbs like chat().

Key Changes:

Each verb and provider combination routes the interaction to provider-specific functions like openai_chat() or claude_chat() that do the work in the background. These functions can also be called directly, offering an alternative, more verbose, provider-specific interface.

Old Usage:

llm_message("Hello World") |>
  openai(.model = "gpt-4o")

New Usage:

# Recommended Verb-Based Approach
llm_message("Hello World") |>
  chat(openai(.model = "gpt-4o"))
  
# Or configure a provider object once and reuse it
my_ollama <- ollama(.model = "llama3.2-vision:90B",
       .ollama_server = "https://ollama.example-server.de",
       .temperature = 0)

llm_message("Hello World") |>
  chat(my_ollama)

# Alternatively, use the more verbose provider-specific functions:
llm_message("Hello World") |>
  openai_chat(.model = "gpt-4o")

tidyllm 0.2.2

Major Features

# Upload a file for use with Gemini
upload_info <- gemini_upload_file("example.mp3")

# Make the file available during a Gemini API call
llm_message("Summarize this speech") |>
  gemini(.fileid = upload_info$name)

# Delete the file from the Google servers
gemini_delete_file(upload_info$name)

tidyllm 0.2.1

Major Features:

conversation <- llm_message("Write a short poem about software development") |>
  claude()

# Get metadata on token usage and model as a tibble
get_metadata(conversation)

# Or print it together with the message
print(conversation, .meta = TRUE)

# Or always print metadata
options(tidyllm_print_metadata = TRUE)

Bug-fixes:

tidyllm 0.2.0 (2024-11-07)

New CRAN release. Largest changes compared to 0.1.0:

Major Features:

Improvements:

Breaking Changes:

Minor Updates and Bug Fixes:

tidyllm 0.1.11

Major Features

Improvements

tidyllm 0.1.10

Breaking Changes

Improvements

tidyllm 0.1.9

Major Features

Breaking Changes

Improvements


tidyllm 0.1.8

Major Features

Improvements


tidyllm 0.1.7

Major Features


tidyllm 0.1.6

Major Features


tidyllm 0.1.5

Major Features

Improvements


tidyllm 0.1.4

Major Features

Improvements


tidyllm 0.1.3

Major Features

Breaking Changes


tidyllm 0.1.2

Improvements


tidyllm 0.1.1

Major Features

Breaking Changes