Releases: NeoZhangJianyu/llama.cpp
b3943
llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745)
* refactor llama_batch_get_one
* adapt all examples
* fix simple.cpp
* fix llama_bench
* fix
* fix context shifting
* free batch before return
* use common_batch_add, reuse llama_batch in loop
* null terminated seq_id list
* fix save-load-state example
* fix perplexity
* correct token pos in llama_batch_allocr
b3942
rpc : backend refactoring (#9912)
* rpc : refactor backend
  Use structs for RPC request/response messages
* rpc : refactor server
b3831
Enable use of the ReBAR feature to upload buffers to the device. (#9251)
b3828
[SYCL] add missed dll file in package (#9577)
* update oneapi to 2024.2
* use 2024.1
---------
Co-authored-by: arthw <[email protected]>
update_oneapi-b3789-3ae8374
use 2024.1
update_oneapi-b3788-f557ccf
update oneapi to 2024.2
b3787
server : clean-up completed tasks from waiting list (#9531) ggml-ci
b3735
cann: Fix error when running a non-existent op (#9424)
b3678
server : simplify state machine for slot (#9283)
* server : simplify state machine for slot
* add SLOT_STATE_DONE_PROMPT
* pop_deferred_task
* add missing notify_one
* fix passkey test
* metrics : add n_busy_slots_per_decode
* fix test step
* add test
* maybe fix AddressSanitizer?
* fix deque ?
* missing lock
* pop_deferred_task: also notify
* Update examples/server/server.cpp
  Co-authored-by: Georgi Gerganov <[email protected]>
---------
Co-authored-by: Georgi Gerganov <[email protected]>
b3449
examples : Fix `llama-export-lora` example (#8607)
* fix export-lora example
* add more logging
* reject merging subset
* better check
* typo