IDE SSDs have been a niche but readily available product for as long as consumer SSDs have been mainstream. CompactFlash cards ensured that the necessary controller chips were available.
With those grippers, though? It's hard to make them scrunch up a sock, even though a sock does fit. Doing a long sleeve completely unanchored is probably physically possible with extreme care, but I see why they mark the robot down as physically unable.
Definitely not trivial, but I disagree with "impossible". Better to just not use that word and say "too difficult given our grippers". Saying "impossible" suggests they're limiting their own thinking, not to mention that RL often comes up with pretty interesting solutions.
> The way to test out this theory is to try out an experiment to see if this is so. If this experiment fails, we'll have to figure out why theory predicted it but the experiment didn't deliver.
If "this experiment" is trying to build a machine, then failure doesn't give much evidence against the theory. Most machine-building failures are caused by insufficient hardware/engineering.
Quantum theory predicts this: https://en.wikipedia.org/wiki/Threshold_theorem. An experiment can show that this prediction is false. This is a scientific problem, not an engineering one. Physical theories have to be verified with experiments. If the results of the experiment don't match what the theory predicts, then you have to do things like re-examine the data, revise the theory, etc.
But that theorem being true doesn't mean "they will work given enough time". That's my objection. If a setup is physically possible but sufficiently thorny to actually build, there's a good chance it won't be built ever.
In the specific spot I commented, I guess you were just talking about the physics part? But the GP was talking about both physics and physical realization, so I thought you were also talking about the combination too.
Yes we can probably test the quantum theory. But verifying the physics isn't what this comment chain is really about. It's about working machines. With enough reliable qubits to do useful work.
Of course, it's random and by chance - tokens are literally sampled from a predicted probability distribution. If you mean chance=uniform probability you have to articulate that.
It's trivially true that arbitrarily short reconstructions can be reproduced by virtually any random process, and that reconstruction length scales with how similar the output distribution is to the target's. This really shouldn't be controversial.
My point is that matching sequence length and distributional similarity are both quantifiable. Where do you draw the line?
> Of course, it's random and by chance - tokens are literally sampled from a predicted probability distribution.
Picking randomly out of a non-random distribution doesn't give you a random result.
And you don't have to use randomness to pick tokens.
> If you mean chance=uniform probability you have to articulate that.
Don't be a pain. This isn't about uniform distribution versus other generic distribution. This is about the very elaborate calculations that exist on a per-token basis specifically to make the next token plausible and exclude the vast majority of tokens.
> My point is that matching sequence length and distributional similarity are both quantifiable. Where do you draw the line?
Any reasonable line has examples that cross it from many models. Very long segments that can be reproduced. Because many models were trained in a way that overfits certain pieces of code and basically causes them to be memorized.
Right, and very short segments can also be reproduced. Let's say that "//" is an arbitrarily short segment that matches some source code. This is trivially true. I could write "//" on a coin and half the time it's going to land "//". Let's agree that's a lower bound.
I don't even disagree that there is an upper bound. Surely reproducing a repo in its entirety is a match.
So there must exist a line between the two that divides too short and too long.
Again, by what basis do you draw a line between a 1 token reproduction and a 1,000 token reproduction? 5, 10, 20, 50? How is it justified? Purely "reasonableness"?
There are very very long examples that are clearly memorization.
Like, if a model was trained on all the code in the world except that specific example, the chance of it producing that snippet is less than a billionth of a billionth of a percent. But that snippet got fed in so many times it gets treated like a standard idiom and memorized in full.
Is that a clear enough threshold for you?
I don't know where the exact line is, but I know it's somewhere inside this big ballpark, and there are examples that go past the entire ballpark.
I care that it's within the ballpark I spent considerable detail explaining. I don't care where inside the ballpark it is.
You gave an exaggerated upper limit, so extreme there's no ambiguity, of "entire repo".
I gave my own exaggerated upper limit, so extreme there's no ambiguity. And mine has examples of it actually happening. Incidents so extreme they're clear violations.
Maybe an analogy will help: The point at which a collection of sand grains becomes a heap is ambiguous. But when we have documented incidents involving a kilogram or more of sand in a conical shape, we can skip refining the threshold and simply declare that yes heaps are real. Incidents of major LLMs copying code, in a way that is full-on memorization and not just recreating things via chance and general code knowledge, are real.
You're the only person I've seen ever imply that true copying incidents are a statistical illusion, akin to a random die. Normally the debate is over how often and impactful they are, who is going to be held responsible, and what to do about them.
To recap, the original statement was, "Llm's do not verbatim disgorge chunks of the code they were trained on." We obviously both disagree with it.
While you keep trying to drag this toward an upper bound, I'm trying to illustrate that a coin with "//" reproduces a chunk of code. Again. I don't see much of a disagreement on that point either. What I continue to fail to elicit from you is the salient difference between the two.
I'm trying to find a razor that distills your vibes into a consistent rule, and each time it gets rebutted as if I'm trying to make an argument. If your system doesn't have consistency, just say so.
I have a consistent rule. The rule is that if an LLM meets the threshold I set then it definitely violated copyright, and if it doesn't meet the threshold then we need more investigation.
We have proof of LLMs going over the threshold. So that answers the question.
Your illustrations are all in the "needs more investigation" area and they don't affect the conclusion.
We both agree that 1 token by itself is fine, and that some number is too many.
So why do you keep asking about that, as if it makes my argument inconsistent in some way? We both say the same thing!
We don't need to know the exact cutoff, or calculate how it varies. We only need to find violators that are over the cutoff.
How about you tell me what you want me to say? Do you want me to say my system is inconsistent? It's not. Having an area where the answer is unclear means the system is not able to answer every question, but it doesn't need to answer every question.
If you're accusing me of using "vibes" in a way that ruins things, then I counter that no, I gave nice specific and super-rare probabilities that are no more "vibes"-based than your suggestion of an entire repo.
> What I continue to fail to elicit from you is the salient difference between the two.
Between what, "//" and the threshold I said?
The salient difference between the two is that one is too short to be copyright infringement and the other is so long and specific that it's definitely copyright infringement (when the source is an existing file under copyright without permission to copy). What more do you want?
Just like 1 grain of sand is definitely not a heap and 1kg of sand is definitely a heap.
If you ask me about 2, 3, 20 tokens my answer is I don't care and it doesn't matter and don't pretend it's relevant to the question of whether LLMs have been infringing copyright or not ("verbatim disgorge chunks").
> Is this website promoting Rust, memory unsafety and insecurity?
What a joke of a question. The website prioritizes programs with 98% safe code over programs with 0% safe code. Does that mean it's "promoting memory unsafety" because it's not demanding 100%? No.
> This code is 100% Safe Rust but it is also completely unsound. Changing the capacity violates the invariants of Vec (that cap reflects the allocated space in the Vec). This is not something the rest of Vec can guard against. It has to trust the capacity field because there's no way to verify it.
> Because it relies on invariants of a struct field, this unsafe code does more than pollute a whole function: it pollutes a whole module. Generally, the only bullet-proof way to limit the scope of unsafe code is at the module boundary with privacy.
If 2% unsafe code means that, wildly spitballing, 50% of the code has to be manually reviewed, it's not quite as impressive.
And it might be worse regarding memory safety than the "status quo" if one accepts the assumption that unsafe Rust is harder than C and C++, as many Rust developers do.
> It may be even less safe because of the strong aliasing rules. That is, it may be harder to write correct unsafe {} Rust than correct C. Only use unsafe {} when absolutely needed.
If you're letting safe code in the same module as unsafe code mess with invariants, then the whole module needs to be verified by hand, and should be kept as minimal as feasible. Anything outside the module doesn't need to be verified by hand. "Modules with unsafe" should be a lot lot less than 50%. Your spitball is not a fit to real code.
When I wrote "98% safe code" I meant the code that can be automatically verified by the compiler. I wish the terminology was better.
> If you're letting safe code in the same module as unsafe code mess with invariants, then the whole module needs to be verified by hand, and should be kept as minimal as feasible. Anything outside the module doesn't need to be verified by hand.
More or less correct, but in principle, it also requires modules to have proper encapsulation and interfaces. Otherwise, an interface could be made that enables safe API functions to cause memory unsafety if called incorrectly.
> "Modules with unsafe" should be a lot lot less than 50%. Your spitball is not a fit to real code.
I mean, you have the search results right there. You could always take the time to look for yourself instead of "wildly spitballing", especially since the codebase is not that large.
Might want to take a closer look at your search results anyways, since those results include the FAQ, the sudoers man page, and instances of #![forbid(unsafe_code)] and #![deny(unsafe_code)]. Not exactly a promising start...
But since you asked so nicely, by tokei's count and my transcribing file paths, the modules including the `unsafe` keyword account for ~39% (8207/21106) of the total lines of code in the repo. I don't think you'll need to actually look at anywhere near that amount of code, though; from a brief glance through the search results I suspect most of the `unsafe` usages are for FFI calls, and relatively self-documenting/self-contained ones at that. If there were modules with module-wide invariants, I do not appear to have stumbled upon them.
> When I wrote "98% safe code" I meant the code that can be automatically verified by the compiler. I wish the terminology was better.
That depends on what is meant by "automatically verified by the compiler". The compiler generally cannot verify code outside unsafe blocks that modifies invariants which code inside unsafe blocks relies on for memory safety, for instance.
I mean the portion of the code that has no write access to those fragile invariants. At minimum, this includes all modules that don't have unsafe blocks.
When talking about the kind that leads to torn memory writes, no, it doesn't have those. To share between threads you need to go through atomics or mutexes or other protection methods.
I don't understand what people mean when they say "barely portable" about a device that weighs less than 10 pounds. You can't use a big laptop one handed in midair, but that's not "portability". And it can't be a weight hauling problem when small children can handle that much. What is the issue?
There's a difference between carrying ten pounds small distances for short durations, and carrying an extra two pounds over twenty hours of travel, across multiple connecting international flights in a single day. It's also not just an extra two pounds, it's an additional proprietary power cord, bulk, more mass moving in and out under an airliner seat, it all adds up. Especially when you're sleep deprived and physically exhausted.
Any amount of weight is annoying after that long, but if the extra laptop weight is reduced to 10% of your 25 pound bag then it's even less able to be the deciding factor between "portable" and "barely portable".
I am not focused on doing it myself. The most that I care about doing myself is buying a new charging cable if somehow I damage the one the laptop comes with.
And I have this feeling most people are kind of on the same page as me.
But if you're stuck with hardware that old, an SSD isn't an option.