Compile with AI (beehaw.org)

Hello everyone,

I apologize if this debate has already taken place. If so, please delete the post and kindly tell me where I should send my message instead.

We all know that in technology there are always things where you have to trust the developer(s), whether in hardware or software. Some of that cannot realistically be changed in the short term, so that's not what I'm focusing on.

But something bothers me about "Open-Source" applications. I don't know how to compile, and I'm not willing to dedicate so many hours of my life to learning it. So, in addition to trusting reputable companies, I now choose to trust a reputable person or group, whose open-source code is likely audited. However, these audits cover the source code, not the binary that is actually compiled and ends up running on my device. In the end, each project is a bucket of trust unless I know how to compile it myself. And even then, something could still slip past us, but I understand that it would at least reduce the risk. I read that F-Droid did this: they didn't trust the app creator, but rather compiled their own version from the open-source code. It seemed fantastic to me, but the problem was always the delay.

The question is: Couldn't a program with AI be created to compile any GitHub repository directly? It would eliminate the need to trust even the developer themselves; we would only have to trust their code, as we already do today, and the audits would finally have real value. We would know that what we receive is that code.

I would also love for the concept of Flatpak to be like this: that the developer doesn't sign the binary, but only signs the code, and Flathub (or some backend) creates everything automatically. And if there are doubts about Flathub or any other backend, users could do it locally. It would be a bit more tedious, but its value in privacy would be enormous.
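
A minimal sketch of what "knowing that what we receive is that code" could look like in practice, assuming the project's build is reproducible (the same source always produces a byte-identical binary), which is not true of every project. In that case, checking a published binary against a build made locally or by a backend comes down to comparing file hashes. The function names and file paths below are hypothetical:

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large binaries do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(local_build: Path, published_binary: Path) -> bool:
    """True if both files are byte-identical (same SHA-256 digest).

    Assumption: the build is reproducible. If it is not, a mismatch
    says nothing about whether the published binary is trustworthy.
    """
    return sha256_of_file(local_build) == sha256_of_file(published_binary)
```

With hypothetical paths, a check would be `builds_match(Path("my-build/app.apk"), Path("downloads/app.apk"))`. A match means the published file is exactly what the source produces; it still says nothing about whether the source itself is safe.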

By the way, if any of this already works this way and I am confused, please enlighten me.

Thank you very much, everyone!

[-] argv_minus_one@beehaw.org 6 points 1 year ago

You're still trusting whoever runs the compiler. If you rely on an AI to run the compiler, then you're trusting the AI and whoever controls it.

Moreover, I don't believe AI is intelligent enough to meaningfully comprehend how to compile any project it's handed. Every project is different and has its own requirements, including libraries and tools that must be installed on the machine that is to compile the project.

There have been various attempts at standardizing the compilation of software, such that any standard-conforming project can be compiled in the same way as any other. F-Droid must have done that. But each of these standards makes assumptions about the nature of the project being compiled, which makes it infeasible to compile some projects with them. For example, the Linux distribution Debian has its own standards for how packages are to be compiled, and you can compile any Debian package from source code with the same sequence of commands, but you can only compile a Debian package this way, and not, for example, a Windows application.
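
To illustrate the limitation, here is a rough sketch (not how F-Droid or Debian actually do it) of a "standardized" builder that picks a build command from marker files in a repository. The table and the function name are hypothetical; the point is that any project falling outside the table's assumptions simply cannot be built this way:

```python
from typing import Optional

# Illustrative table only: marker file -> conventional build command.
# Each entry encodes an assumption about how the project is laid out.
BUILD_RECIPES = {
    "debian/control": ["dpkg-buildpackage", "-us", "-uc"],
    "Cargo.toml": ["cargo", "build", "--release"],
    "go.mod": ["go", "build", "./..."],
    "Makefile": ["make"],
}


def pick_build_command(repo_files: set[str]) -> Optional[list[str]]:
    """Return the first matching build command, or None if no recipe applies."""
    for marker, command in BUILD_RECIPES.items():
        if marker in repo_files:
            return command
    # No known convention matched: the standardized builder has to give up.
    return None
```

A repository containing `Cargo.toml` gets a build command, while one that matches none of the markers gets `None`, and even a matching recipe says nothing about the libraries and tools that must already be installed.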

There is value in what you're proposing, but I don't believe it's possible at this point.

[-] adespoton@lemmy.ca 4 points 1 year ago* (last edited 1 year ago)

There’s a more nefarious problem too — AI algorithms are a black box. This means it’s virtually impossible to trust an AI’s methodology. Also, if someone knows the algorithm used for the AI, they can exploit the training methodology to hide secrets in the AI model that nobody will find but can still be triggered to perform specific repeatable tasks.

Essentially, for the AI to be more trustworthy than the person you trust to compile your code, you’d have to build the AI, determine its algorithms, vet the source material and train the model yourself. This is MUCH harder than setting up a buildbot environment with some basic unit tests for privacy and security.

[-] o1i1wnkk@beehaw.org 1 points 1 year ago

I understand your concern about the black-box nature of AI and the potential for exploitation. It's indeed a serious challenge, but I still believe it’s possible to work towards solutions.

As AI continues to evolve, there's ongoing research into improving the transparency and interpretability of AI algorithms. Ideally, this could lead to AI models that can better explain their actions and decisions. We may not have reached this point yet, but it is an active area of research and progress is being made.

Furthermore, having open-source AI models could offer some degree of assurance. If an AI model is open source and has undergone rigorous audits, there's a higher level of transparency and trustworthiness. The community could scrutinize and vet the code, which might help to mitigate some of the risks associated with hidden secrets and exploitation of the AI's training methodology.

And about your point of building, training and vetting the AI ourselves being harder than setting up a buildbot environment: I agree, but the idea here is not to replace human compilers entirely... for now. Instead, the goal could be to have a tool that can aid in ensuring trustworthiness, especially for those of us without the technical background to compile code ourselves.

[-] o1i1wnkk@beehaw.org 1 points 1 year ago

I understand your point about the transfer of trust, and it is indeed a serious concern. However, I believe there are measures that could be taken. I'm not an expert myself and I won't pretend to be one, but it occurs to me that eventually technology will evolve to the point where we could ask the AI to explain step by step how it arrived at the final result. We could also potentially perform audits by cherry-picking the final results from different software to assess their accuracy.

If we were to use Open Source AI projects (like GPT4all, for example), maybe eventually we could run these models 100% locally and privately. Naturally, I understand that we are far from this scenario, either because of the resources required or because of the complexity involved. It's just an idea.

I would never think of bothering a developer by asking them to compile code step by step in front of me. First, because their time is valuable, and second, because the level of my questions would be frustrating. And third - and most importantly - because no one would accept such a whim.

However, I am willing to go step by step with an AI in some key software applications, such as communication, for example. Journalists or people in jobs where they cannot afford to trust blindly but lack the technical background might find benefit in these possibilities.

this post was submitted on 06 Jun 2023