CHANGELOG
2024-06-28
Dolma v1.7 is now available! We will deprecate the Dolma v1.6 index (
v4_dolma-v1_6_llama
) on July 7.
2024-06-12
The
count
query now has two optional fields,max_clause_freq
andmax_diff_tokens
, both are for customizing CNF queries.The
ntd
andinfgram_ntd
queries now has an optional field,max_support
, which is to customize the accuracy of result.The
search_docs
query now has four optional fields.max_clause_freq
andmax_diff_tokens
which are similar to those incount
queries.max_disp_len
which controls how many tokens to return per document.maxnum
is now optional with a default value of 1.
2024-06-06
The input field
corpus
is renamed toindex
. Support forcorpus
will be discontinued sometime in the future. Please update your scripts accordingly.count
,ntd
, andinfgram_ntd
queries now returns an extra fieldapprox
.infgram_prob
andinfgram_ntd
queries now returns an extra fieldsuffix_len
.For
search_docs
queries, each returned document now contains an extra fieldtoken_ids
.
2024-05-08
The
count
query now supports CNF inputs, similar tosearch_docs
.
2024-04-15
We’re lifting the restriction on concurrent requests and sleeping between requests. Now it should be OK to issue concurrent requests. Though our server is serving a lot of requests and you may experience longer latency.
2024-03-02
The API now supports inputting a list of token IDs in place of a string as the query. Check out the
query_ids
field.
2024-02-23
The output field
tokenized
is deprecated and replaced bytoken_ids
andtokens
in all query types (except insearch_docs
, where the new fields aretoken_idsss
andtokensss
). Thetokenized
field will be removed on 2024-03-01.For
ntd
andinfgram_ntd
queries, there is now a new output fieldprompt_cnt
, and the output fieldntd
is deprecated and replaced byresult_by_token_id
. Thentd
field will be removed on 2024-03-01.For
search_docs
queries, the output fielddocs
is deprecated and replaced bydocuments
, which contains additional metadata of the retrieved documents. Thedocs
field will be removed on 2024-03-01.