Sentence transformers v4 by morrowind in c/localllama@sh.itjust.works

[-] morrowind@lemm.ee 3 points 1 day ago

I want to clarify something. Reranker is a general term that can refer to any model used for reranking. It is independent of implementation.

What you refer to

because reranker models look at the two pieces of content simultaneously and can be fine tuned to the domain in question. They shouldn't be used for the initial retrieval because the evaluation time is O(n²) as each combination of input

is a specific implementation known as a CrossEncoder, which is common for reranking models but not for retrieval ones, for the reasons you described. But you can also use any other architecture.
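To make the "independent of implementation" point concrete, here's a rough sketch where the reranking step accepts any function that scores a (query, document) pair. Everything in it is made up for illustration: `embedding_scorer` is a toy stand-in for an embedding-style model (each text encoded on its own), `pair_scorer` is a toy stand-in for a cross-encoder-style model (it looks at both texts together), and neither is a real model or a sentence-transformers API.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words vector computed from one text alone."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def embedding_scorer(query, doc):
    """Hypothetical embedding-style scorer: the two texts are encoded separately."""
    return cosine(embed(query), embed(doc))


def bigrams(text):
    """Set of adjacent word pairs in a text."""
    words = text.lower().split()
    return set(zip(words, words[1:]))


def pair_scorer(query, doc):
    """Hypothetical pair-style scorer: needs both texts at once.

    Returns the fraction of query bigrams that also occur in the document.
    """
    query_bigrams = bigrams(query)
    if not query_bigrams:
        return 0.0
    return len(query_bigrams & bigrams(doc)) / len(query_bigrams)


def rerank(query, candidates, scorer, top_k=None):
    """Reorder candidates by scorer(query, doc), highest score first.

    The scorer can be any callable; the reranking step does not care
    which architecture produced the score.
    """
    scored = [(doc, scorer(query, doc)) for doc in candidates]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored if top_k is None else scored[:top_k]


if __name__ == "__main__":
    docs = [
        "svg images are vector graphics",
        "bi encoder models embed each text separately",
        "cross encoder models score a query and document together",
    ]
    query = "how do cross encoder models score a document"
    for scorer in (embedding_scorer, pair_scorer):
        print(scorer.__name__, rerank(query, docs, scorer, top_k=2))
```

Both scorers plug into the same `rerank` call, which is all "reranker" means here. In practice the pair-style model is usually the slower one, so it only gets run on a short candidate list from the retrieval step.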

19

Sentence transformers v4 (lemm.ee)

submitted 1 day ago by morrowind@lemm.ee to c/localllama@sh.itjust.works

3 comments fedilink

Link to bluesky https://bsky.app/profile/tomaarsen.com/post/3llc2jvwah22f

Some more details https://huggingface.co/blog/train-reranker

11

NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms (electricalexis.github.io)

submitted 3 days ago by morrowind@lemm.ee to c/localllama@sh.itjust.works

0 comments fedilink

StarVector - a foundation model for generating svgs by morrowind in c/localllama@sh.itjust.works

[-] morrowind@lemm.ee 1 points 4 days ago

autotracers can't generate svgs from text

StarVector - a foundation model for generating svgs by morrowind in c/localllama@sh.itjust.works

[-] morrowind@lemm.ee 3 points 6 days ago

Claude frequently draws svgs to illustrate things for me (I'm guessing it's in the prompt), but even though it's better at it than all the other models, it still kinda sucks. It's just a fundamentally dumb task for a purely language model, similar to the arc-agi benchmark. It just makes more sense for a vision model, and trying to get an llm to do it is a waste.

19

StarVector - a foundation model for generating svgs (huggingface.co)

submitted 6 days ago by morrowind@lemm.ee to c/localllama@sh.itjust.works

5 comments fedilink

EXAONE Deep ━ Setting a New Standard for Reasoning AI - LG AI Research News by morrowind in c/localllama@sh.itjust.works

[-] morrowind@lemm.ee 1 points 1 week ago

what is the license? The link on hf just 404s

4

EXAONE Deep ━ Setting a New Standard for Reasoning AI - LG AI Research News (www.lgresearch.ai)

submitted 1 week ago by morrowind@lemm.ee to c/localllama@sh.itjust.works

4 comments fedilink

Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching by morrowind in c/localllama@sh.itjust.works

[-] morrowind@lemm.ee 2 points 2 weeks ago

Very similar to chain of draft but seems more thorough

12

Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching (arxiv.org)

submitted 2 weeks ago by morrowind@lemm.ee to c/localllama@sh.itjust.works

2 comments fedilink

5

Sorting-Free GPU Kernels for LLM Sampling (flashinfer.ai)

submitted 2 weeks ago by morrowind@lemm.ee to c/localllama@sh.itjust.works

0 comments fedilink

Reka Flash, open source 21B model comparable to QWQ 32B by morrowind in c/localllama@sh.itjust.works

[-] morrowind@lemm.ee 2 points 2 weeks ago* (last edited 2 weeks ago)

More info here https://www.reka.ai/news/introducing-reka-flash
HF: https://huggingface.co/RekaAI/reka-flash-3

18

Reka Flash, open source 21B model comparable to QWQ 32B (i.postimg.cc)

submitted 2 weeks ago by morrowind@lemm.ee to c/localllama@sh.itjust.works

2 comments fedilink

Qwen/QwQ-32B · Hugging Face by morrowind in c/localllama@sh.itjust.works

[-] morrowind@lemm.ee 3 points 3 weeks ago

It matches R1 in the given benchmarks. R1 has 671B params (37B activated) while this only has 32B.

Qwen/QwQ-32B · Hugging Face by morrowind in c/localllama@sh.itjust.works

[-] morrowind@lemm.ee 2 points 3 weeks ago

insane, absolutely insane

8

Chain of Draft: Thinking Faster by Writing Less (arxiv.org)

submitted 3 weeks ago by morrowind@lemm.ee to c/localllama@sh.itjust.works

0 comments fedilink

13

Atom of Thoughts (AOT): lifts gpt-4o-mini to 80.6% F1 on HotpotQA, surpassing o3-mini and DeepSeek-R1 (bsky.app)

submitted 3 weeks ago by morrowind@lemm.ee to c/localllama@sh.itjust.works

0 comments fedilink

morrowind

joined 1 month ago