4
submitted 3 weeks ago* (last edited 3 weeks ago) by andrew0@lemmy.dbzer0.com to c/aihorde@lemmy.dbzer0.com

Hello everyone!

TL;DR: I want to propose a community-driven effort to research and improve 1-bit LLM models, for use within and outside the Horde. I think having access to such models would be very useful for the overall project, as you do not need a lot of compute to run "bigger" models if they're compressed well. Relevant paper.

We currently live in very interesting times regarding AI development. The big companies in the US seem to still be ahead, but groups from China, working on open-weights models, are making very impressive strides. However, the focus has been and still is on developing really big models. Most of the impressive models that keep coming out are bigger and bigger, leaving most people to pay for API tokens if they want something useful. Probably this is also incentivised, as Nvidia wants to make more money from sales to data centers.

I have been keeping an eye on the AI Horde project, and I always wanted to help out, but never got around to setting it up properly due to my AMD GPU and running Windows. Still, it is a really cool project, and I wanted to congratulate everyone involved!

In my opinion, the goal of the AI Horde community also positions it as one of the very few that are capable of bringing forward some LLMs that can be "smart" on consumer-level hardware. That is, getting models that can handle long-horizon tasks well without paying providers for API access.

Since the first paper on 1.58-bit LLMs, there have been quite a few other suggestions thrown around (recent example). No one really uses these models much, as the quality is seriously degraded. Similarly, no one is really trying to improve these models further, as there is no incentive to do so when you can just pay someone $3 per million output tokens from an existing open-weight model. For example, to the best of my knowledge, no one has tried to introduce latent reasoning (example) in this context, or to specifically train/fine-tune models at 1-bit levels.

So, to get to the point, would it make sense to get some community-driven research in this area? I believe that we could all pool together compute, good training data, ideas for fine-tuning / RL-training, etc. If it works out, we could have a method that makes existing larger models (say, up to 200B) available on a single 24GB GPU.
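
As a rough sanity check on that target, the memory for the weights alone is just the parameter count times the bits per weight. A minimal sketch, assuming only the weights count (no KV cache, activations, or per-group scale factors); `weight_gib` is a hypothetical helper name and the 0.8 bits/weight entry is purely illustrative:

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Memory needed for the weights alone, in GiB."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 2**30


# (parameter count, bits per weight) pairs to compare
cases = [(200e9, 1.0), (70e9, 1.0), (70e9, 0.8), (32e9, 16.0)]
for n_params, bits in cases:
    print(f"{n_params / 1e9:.0f}B at {bits} bits/weight: "
          f"{weight_gib(n_params, bits):.1f} GiB")
```

At exactly 1 bit per weight, 200B parameters come to about 23.3 GiB, which leaves very little headroom on a 24GB card, and 70B comes to about 8.1 GiB, so fitting it into 8GB would need slightly less than 1 bit per weight on average.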

First thing I would try is to expand on the recent NanoQuant paper:

  1. Wait for weights to be released.
  2. If no weights come out, quantize a Qwen 3 32B, and try a more diverse dataset with more tokens to see if fidelity can improve. I could get some access to GPUs for this myself. Another option for 1-bit models would be using other existing ones (e.g., Unsloth), but performance degradation is much bigger in those versions, from what I have seen. Furthermore, the compression of these models is not as efficient, and you would not be able to fully run a 70B-parameter model (as described in the paper) with only 8GB VRAM.
  3. Get an LLM (or human volunteers) to determine behaviour on various tasks: find limitations, strengths, etc. Get some human preferences for RLHF, or use a bigger LLM to grade output quality. Preference here on logic tasks.
  4. Perform fine-tuning of 1-bit model based on the gathered data, and deploy for use. Return to step 2 after a while.

Fine-tuning the 1-bit model might get a bit hairy, as the binary operations are not differentiable. We also wouldn't be able to up-cast back to F32 for regular training, as this would completely defeat the goal of consumer-level access to these models. The simplest idea would be to train a LoRA adapter, or do some stochastic-driven training (e.g., flipping bits). However, the latter would probably be very unstable and not work out, so LoRA might be the only option. I'm not a mathematician, so I am open to suggestions here :)
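
To make the LoRA idea concrete, here is a minimal pure-Python sketch of a forward pass where the binary weights stay frozen and only a small low-rank pair `A`, `B` would be trained. The per-row mean-absolute-value scale is a common choice in binary-network work and is assumed here rather than taken from the NanoQuant paper; all function names are hypothetical.

```python
def binarize(w):
    """Split a weight matrix into {-1, +1} signs and one scale per output row."""
    signs = [[1 if v >= 0 else -1 for v in row] for row in w]
    scales = [sum(abs(v) for v in row) / len(row) for row in w]
    return signs, scales


def matvec(m, x):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def lora_forward(signs, scales, A, B, x):
    """y = scale * (sign(W) @ x) + B @ (A @ x); only A and B would get gradients."""
    base = [s * y for s, y in zip(scales, matvec(signs, x))]
    delta = matvec(B, matvec(A, x))
    return [b + d for b, d in zip(base, delta)]


w = [[0.5, -1.5], [2.0, 1.0]]      # toy full-precision weights
signs, scales = binarize(w)
A = [[0.1, -0.2]]                  # rank-1 adapter, shape (1, 2)
B = [[0.0], [0.0]]                 # zero-initialised, shape (2, 1)
x = [1.0, 2.0]
print(lora_forward(signs, scales, A, B, x))
```

With `B` initialised to zeros the adapter contributes nothing, so training starts from the behaviour of the quantized model, and the non-differentiable sign step is never touched by the optimiser.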

Past the initial prototype, I would consider the following stuff that's already implemented in other quants:

  • Try to re-calibrate the LayerNorm / RMSNorm parameters based on the original model's activations.
  • Perform some regular KL-divergence distillation
  • Other than using an LLM as a judge for RL, perhaps one could also fine-tune using a semantic similarity metric while aligning with the output of the original model. This could ensure that the intent is the same, even if the style differs.
  • Depending on token complexity, look into reducing compression just for difficult tokens and compressing further for easy ones (à la FlexQuant)
  • Mixture-of-Experts-style quantization, where we increase compression for experts that are not important, and reduce it for higher-frequency ones.
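
The KL-divergence distillation item above can be sketched as follows. This is a minimal illustration for a single token position, assuming we have the logits of the original (teacher) and 1-bit (student) model over the same vocabulary; the function names and the temperature parameter are my own choices, not something from the papers mentioned.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-probabilities from raw logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_z for z in scaled]


def kl_divergence(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) for one token position."""
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


teacher = [2.0, 0.0, -1.0]
student = [1.5, 0.3, -0.5]
print(kl_divergence(teacher, student))
```

In an actual training loop this value would be averaged over all positions in a batch and minimised with respect to whatever parameters remain trainable in the quantized model.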

Curious what everyone thinks!

[-] andrew0@lemmy.dbzer0.com 50 points 4 months ago* (last edited 4 months ago)

This article just screams rage-bait. Not that I am against making people aware of this kind of privacy invasion, but the authors did not bother to do any fact checking.

Firstly, they mention that the vacuum was "transmitting logs and telemetry that [the guy] had never consented to share". If you set up an app with the robot vacuum company, I'm pretty sure you'll get a rather long terms of service document that you just skip past, because who bothers reading that?

Secondly, the ADB part is rather weird. The person probably tried to install Valetudo on it? Otherwise, I have no clue what they tried to say with "reprinting the devices’ circuit boards". I doubt that this guy was able to reverse engineer an entire circuit board, yet was surprised to see that ADB is enabled. ADB access is what makes it rather straightforward to install custom firmware on some devices that blocks all the cloud shenanigans, so I'm not sure why they're painting this as a horrifying thing. Of course, you're broadcasting your map data to the manufacturer so that you can use their shitty app.

The part saying that it had full root access and a kill-switch is a bit worse, but still... It doesn't have to be like this. Shout-out to the people working on the Valetudo project. If you're interested in getting a privacy-friendly robot vacuum, have a look at their website. It requires some know-how, but once it's done, you know for sure you don't need to worry about a 3rd party spying on you.

[-] andrew0@lemmy.dbzer0.com 61 points 4 months ago

Even the comic-book bullies are better than this... The sad part is that the West will continue to lick the boot, hoping everyone will just forget. I really hope that that is not the case. What Greta did here is very impressive, and I hope that her spirit will inspire other young people to vote out these dumbfucks in government that try to do damage control in this situation.

[-] andrew0@lemmy.dbzer0.com 46 points 5 months ago* (last edited 5 months ago)

Get a dog. I'm now forced to get up early to take it out, otherwise it will pee on my bed.

(Do not actually get a pet if you cannot take care of them.)

[-] andrew0@lemmy.dbzer0.com 57 points 6 months ago

Tesla's bang for buck is horrible. You get a shitty car made from the worst plastic possible, and on top of that they don't even have good quality control. The only thing that differentiated Tesla from the competition previously was the battery technology, but they no longer have that edge nowadays.

The Norwegians are probably getting them because they got used to it, and probably don't want to rely on Chinese cars. Beats me why they would select a Tesla nowadays over the European brands.

[-] andrew0@lemmy.dbzer0.com 32 points 10 months ago* (last edited 10 months ago)

I heard that Poland is also cheering for some MAGA guy in the next election... Troubling times ahead.

For Romania, there might still be a chance in the run-off. However, the difference between the two candidates was quite large (20% difference; 1.8 million votes). Similarly, the other candidates seemed to have voters that would rather vote for the nazi. Most likely all hope is lost, but that 1% chance is still there.

34
submitted 11 months ago* (last edited 11 months ago) by andrew0@lemmy.dbzer0.com to c/foss@beehaw.org

Hello everyone! I am interested in replacing the Google Speech Recognition and Synthesis app on Android. For Speech-to-Text (STT), I've tried Whisper and FUTO, and settled on the latter because it seemed to be more versatile. FUTO also seems to have decent recognition, but it is not yet capable of handling all the languages that I want. Regardless, I am happy with STT so far. The only annoyance I have is that it does not appear as an option in the settings for Speech recognition :(

However, I can't seem to find any replacements that have good Text-to-Speech (TTS) quality. I tried espeak-ng and RHVoice, but both have robotic outputs.

Given the recent advancements in AI, I was expecting that there would be ways to incorporate open source TTS models like Kokoro to generate speech on the go. Nevertheless, I could not really find any such apps so far.

Has anyone managed to completely replace the Google app with (an)other privacy-focused FOSS app(s)?

31
submitted 1 year ago* (last edited 1 year ago) by andrew0@lemmy.dbzer0.com to c/europe@feddit.org

Some great news regarding the development of computer chips in the EU. However, the total of €240 million allocated to the project is not as much as I would say we need to invest in this area. Let's see how things change in the next few years!

116
submitted 1 year ago* (last edited 1 year ago) by andrew0@lemmy.dbzer0.com to c/europe@feddit.org

Previously used link: https://archive.ph/ICJZZ

Link to petition

Until now, the EU has allowed a majority of countries to rely on American big tech companies for communication and storage of sensitive data. For example, many universities across Europe rely on Google or Microsoft for email services, research data storage, and department communication. Similarly, many of them write their research using Microsoft Word, which could be used by these big companies to train their own AI models.

A majority of regular citizens rely on Meta's apps (WhatsApp for instant messaging, Facebook, Instagram), but also on X (formerly Twitter) and TikTok. None of these apps are properly regulated even with the EU's efforts, leaving people exposed to other states' attempts at polarization. There is also the problem of mass profiling of users, which is used to supply targeted advertisements and sometimes influence public opinion on certain topics (cough Musk tweaking the Twitter algorithm to promote AfD cough).

The article that I supplied focuses mainly on the aspect of maintaining data privacy when our data is harvested by outside entities. However, this is, in my opinion, a horrible approach. We need to move everything ASAP to open source alternatives, and preferably EU based ones. Some attempts at this have been previously made in Germany, which should give hope to other countries in the EU.

The cost of moving away from Google/Microsoft tech stacks will be a drop in the bucket compared to the wealth that these companies extract from the EU. Similarly, offering alternatives to social media like Friendica, Mastodon, Pixelfed, Lemmy, and perhaps PeerTube, would be a huge win against disinformation and propaganda from other countries. We should also push for instant messaging platforms like SimpleX that do not rely on Google's proprietary Push Notification services, and perhaps deGoogled Android devices.

If the recent events are not a catalyst to push everyone away from US software in the EU, I do not know what else will. Do you think that this would be possible at all?

238
submitted 1 year ago* (last edited 1 year ago) by andrew0@lemmy.dbzer0.com to c/europe@feddit.org

I have never donated money in my life before, but what happened yesterday really upset me. Ended up sending some money this morning. I know that my small donation won't contribute to much, but I am trying to help :D

I hope this post doesn't break any rules!

31

Hi! I'm trying to archive papers as soon as they appear in a scientific journal, and I've attempted to search for PDF links on each page using some regular web scraping.
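
A minimal sketch of that kind of scraping, assuming the journal page exposes a `citation_pdf_url` meta tag in its HTML head (many publishers include it for indexing, but certainly not all, and the linked file may still sit behind a login). The class and function names are hypothetical, and the sample HTML is made up for illustration.

```python
from html.parser import HTMLParser


class PdfMetaParser(HTMLParser):
    """Collects the content of <meta name="citation_pdf_url"> tags."""

    def __init__(self):
        super().__init__()
        self.pdf_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name == "citation_pdf_url" and attr.get("content"):
            self.pdf_urls.append(attr["content"])


def find_pdf_urls(html: str) -> list:
    """Return all PDF links advertised through citation meta tags."""
    parser = PdfMetaParser()
    parser.feed(html)
    return parser.pdf_urls


sample = (
    '<html><head>'
    '<meta name="citation_title" content="Some paper">'
    '<meta name="citation_pdf_url" content="https://example.org/paper.pdf">'
    '</head><body></body></html>'
)
print(find_pdf_urls(sample))
```

This only covers pages that ship the tag in their static HTML; pages that build the PDF viewer with JavaScript are exactly where this approach falls short.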

The problem is that most of these journals add their own fancy PDF readers, and downloading the file is not as straightforward as it seems. However, the Zotero Connector works flawlessly when you trigger the extension. Therefore, I attempted to set up a Selenium instance with this extension to download the papers given a link, but I struggle to actually get the extension to trigger. I tried sending a Shift + Ctrl + S command, but that doesn't seem to get picked up. Similarly, I can't figure out how to call the extension from the console.

Did anyone else attempt such a workflow before? Am I doing something completely unnecessary, as there are better options available? Help a fellow sailor out. Thanks a lot in advance for your help!

[-] andrew0@lemmy.dbzer0.com 34 points 1 year ago

If you already have medical knowledge, why not look into bioinformatics? Cyber security would be a pretty big jump if you're not into tweaking computers as a hobby. For example, have you ever set up Linux on your own?

Certifications will give you a starting point, but it will take years for all the information to settle properly in your mind.

90
submitted 1 year ago* (last edited 1 year ago) by andrew0@lemmy.dbzer0.com to c/technology@lemmy.world

I recently discovered that Redox OS got a new release earlier this month. I'm quite surprised how far they managed to get, given that only a handful of people are working on this project (compared to the Linux kernel).

Now, I'm curious what it would take to get bigger players to focus on this project. Given the recent Linux + Rust drama, it would surprise me if the backers of Rust for Linux would not give this project some attention.

[-] andrew0@lemmy.dbzer0.com 38 points 2 years ago

That person clearly hasn't witnessed Dutch students carrying a whole bedroom on the back of their bike.

[-] andrew0@lemmy.dbzer0.com 67 points 2 years ago

Wow, some of the comments on that article saying Google should have made Android closed source are mindboggling. They realize they never would have had their current worldwide marketshare if they did that, no?

But maybe if they did, we would have had more people working on true linux phones 🤔 I'm a bit torn on this one haha.

159
submitted 2 years ago by andrew0@lemmy.dbzer0.com to c/linux@lemmy.ml

Hello everyone! I've been playing around with Wayland for a bit and was hoping to start learning some more about it. For example, I would be interested in making a lock screen, similar to Swaylock, as a toy project.

What GUI toolkit would you use to develop apps on Wayland? I've added a little poll below with some of the popular choices I've seen thrown around. Feel free to add your own suggestions and maybe leave a comment as to why you'd use that!

Link to poll

[-] andrew0@lemmy.dbzer0.com 41 points 2 years ago

Cheats nowadays don't even need to run on your machine. You can get a second computer that is connected to your computer via a capture card, analyze your video feed with an AI and send mouse commands wirelessly from it (mimicking the signal for your USB receiver).

These anti-cheats are nothing more than privacy invasion, and any game maker that believes they have the upper hand on people that want to cheat is very wrong.

Opening up anti-cheat support for Linux would at least make them more creative at finding these people from their behaviour, and not from analysing everything that's running in the background.

[-] andrew0@lemmy.dbzer0.com 59 points 2 years ago

It's amazing that Linux gaming is becoming a thing that's better sometimes than Windows gaming (minus the getting banned part in some games). I also like that AMD is making some big pushes on open source drivers, plus their ROCm open-source alternative to CUDA.

This is a great time for Linux users! :)

[-] andrew0@lemmy.dbzer0.com 104 points 2 years ago

What a stupid article. It's like saying "stop using electric vehicles because you can't use gas stations". I don't understand why he's so adamant about this. It's not like Wayland has had the roughly 20 years of extra development time that X11 had. People keep working on it, and it takes time to polish things.

24
submitted 2 years ago* (last edited 2 years ago) by andrew0@lemmy.dbzer0.com to c/archlinux@lemmy.ml

Hi! I am trying to automate my install process by creating a json file that can be used by archinstall (example). One of the examples shows how you can run custom commands to get paru (yay, but written in Rust):

"custom-commands": [
    "cd /home/devel; git clone https://aur.archlinux.org/paru.git",
    "chown -R devel:devel /home/devel/paru"
]

However, their example doesn't provide any further information about installing packages with paru. I would like to install some stuff just for my user.

My idea was the following:

  • using archinstall, install everything according to the config
  • disregard the "custom-commands" option in the config and create a separate custom script
  • get all the users from the system and allow user to choose which one to chroot as
  • run all commands as the chosen user ( e.g., install Rust with curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh )

I need to install a few packages that are not in the official repository, as well as moving my dotfiles in /home/user/.config and making sure everything is accessible by that user. If there are any better approaches to this, I would be glad to hear them!

An example of the script I am planning to use after running archinstall:

#!/bin/bash

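# Root of the freshly installed system (the path this script already chroots into)
target=/mnt/archinstall

# Find all users on the installed system by listing its home directories
users=()
for dir in "$target"/home/*/; do
    [ -d "$dir" ] || continue    # the glob matched nothing
    name=$(basename "$dir")
    if [ "$name" != "lost+found" ]; then
        users+=("$name")
    fi
done

if [ ${#users[@]} -eq 0 ]; then
    echo "No users found under $target/home" >&2
    exit 1
fi

# If there is more than one user, ask which user to install for
if [ ${#users[@]} -gt 1 ]; then
    echo "Multiple users found on system. Please select a user to install for:"
    select user in "${users[@]}"; do
        # select leaves $user empty when the choice is invalid
        if [ -n "$user" ]; then
            break
        else
            echo "Invalid selection"
        fi
    done
else
    user=${users[0]}
fi

echo "Installing for user $user"

# Run the remaining steps inside the chroot as the chosen user.
# Passing a command to arch-chroot avoids landing in an interactive bash.
# makepkg -si and paru call sudo, so the user needs sudo rights in the new system.
arch-chroot -u "$user" "$target" /bin/bash -c '
    set -e
    export HOME="/home/$1"    # set HOME explicitly in case the chroot keeps the old one
    cd "$HOME"

    # Install paru
    git clone https://aur.archlinux.org/paru.git
    cd paru
    makepkg -si --noconfirm

    # Install stuff with paru
    paru -S --noconfirm tlrc
' _ "$user"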
[-] andrew0@lemmy.dbzer0.com 129 points 2 years ago

Framework laptops are getting better. Not Apple levels good, but it certainly beats them in average longevity.

The only hope with Apple is having the EU step in again to stop this kind of bullcrap.

24
submitted 2 years ago* (last edited 2 years ago) by andrew0@lemmy.dbzer0.com to c/moddedminecraft@sopuli.xyz

Server performance is not very good with so many mods, and I have been looking into ways to fix this. One of the latest comments on the ATM8 page on CurseForge is from XZot1K, and says the following:

After lots of testing I resolved most of my issues by installing the following mods to the server (Ensure to install the correct versions, as of writing this the version is latest of each for 1.19.2):

https://www.curseforge.com/minecraft/mc-mods/too-fast

https://www.curseforge.com/minecraft/mc-mods/smooth-chunk-save

https://www.curseforge.com/minecraft/mc-mods/chunk-sending-forge-fabric

https://www.curseforge.com/minecraft/mc-mods/packet-size-doubler

These mods will resolve larger packet disconnect issues, chunk lag, and irregular movement rubber banding.

In addition to these, for further improvement, set the tick rate to -1 in the server.properties file.

Paste the following into the bottom of your "user_jvm_args.txt" (change the 6GB and 256m to your liking; Xms must be less than Xmx):

-Xmx6G -Xms256m -XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=200 -XX:+UnlockExperimentalVMOptions -XX:+DisableExplicitGC -XX:+AlwaysPreTouch -XX:G1NewSizePercent=30 -XX:G1MaxNewSizePercent=40 -XX:G1HeapRegionSize=32M -XX:G1ReservePercent=20 -XX:G1HeapWastePercent=5 -XX:G1MixedGCCountTarget=4 -XX:InitiatingHeapOccupancyPercent=15 -XX:G1MixedGCLiveThresholdPercent=90 -XX:G1RSetUpdatingPauseTimePercent=5 -XX:SurvivorRatio=32 -XX:+PerfDisableSharedMem -XX:MaxTenuringThreshold=1 -Dusing.aikars.flags=https://mcflags.emc.gs -Daikars.new.flags=true

Please note that while these additional mods do work on the client the major improvement comes from the server-side.

I've already used those JVM arguments, but I didn't look for performance mods before. Now, after fiddling around with them a bit, the server feels much snappier (and I don't have to install anything client side)! I'm hosting on Azure, with a Standard D2s v3 (2 vcpus, 8 GiB memory) VM, and when I did a /home from a far away place, it used to take a few seconds to load. Now, it's almost instantaneous! Thanks XZot1K! :)

The server also used to crash whenever multiple people entered the Nether, but I haven't been able to test this yet with the new configuration.

If you have any tips to improve performance, please share them here :)

52
Jump from Arch to NixOS? (lemmy.dbzer0.com)
submitted 2 years ago* (last edited 2 years ago) by andrew0@lemmy.dbzer0.com to c/linux@lemmy.ml

As the title implies, should I do it? I love Arch so far, and I can fix most issues that pop out. However, I sometimes wish to start fresh without too much hassle, but I get a feeling NixOS isn't as mature as Arch.

Have any of you used both, and if so, what do you miss from Arch? What are you grateful for in NixOS?

18
submitted 2 years ago* (last edited 2 years ago) by andrew0@lemmy.dbzer0.com to c/piracy@lemmy.dbzer0.com

Hi everyone! I'll soon take the DP-100 exam for Microsoft Azure, and I was interested in finding more leaked exam questions. At the moment, I was using examtopics for this, but it sucks because it basically cuts you off halfway through.

I heard there are some private trackers that specialize in exam questions, such as LearnFlakes, but I do not have anyone that can invite me to them. Therefore, I was wondering if there is another way to find the information I need for this exam.

Do you know any other sources that are fully free?

andrew0

joined 2 years ago