This is awesome, and is a great example of the type of funding structure that government orgs (looking at you, NIH) should be offering. Government-backed research is the bedrock upon which the US economy rests, and as science becomes more expensive, we need to support research at the intersection of academia and industry more explicitly.
ARPA-H was a great step towards this goal for public health-focused efforts (-omics experiments aren't going to pay for themselves, at least at first) but a more general funding mechanism has been needed. I think this is a great direction for the NSF, and to be honest it's refreshing to see something like this given the horrible stance that this government has taken towards science (which has been compounded by the biotech bubble/correction).
Can you fold very large proteins/complexes with the large amount of VRAM available on Macs? RAM limits that cap folding runs at roughly <1500 residues are an annoying nit in a lot of my protein folding workflows; I'd be curious to see if this helps.
Do we know if LLMs understand the concept of time? (e.g., I told you this in the past, but what I told you later should supersede it?)
I know there are classes of problems that LLMs can't natively handle (like doing math, even simple addition, or spatial reasoning; I would assume time is in there too). There are ways they can hack around this, like writing code that performs the math.
But how would you do that for chronological reasoning? That would help with compacting context: knowing what to remember and what to drop.
All it sees is a big blob of text, some of which can be structured to differentiate turns between "assistant", "user", "developer" and "system".
In theory you could attach metadata (with timestamps) to these turns, or include the timestamp in the text.
It does not change much, other than giving the model the possibility to make some inferences (e.g. that a previous message was from a different date, so its "today" is not the same "today" as in the latest message).
To chronologically fade out the importance of a conversation turn, you would need to either add more metadata (weak), progressively compact old turns (unreliable), or post-train a model to favor more recent parts of the context.
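Concretely, the in-text timestamp version is trivial. A minimal sketch in Python, assuming OpenAI-style message dicts (the exact schema is up to whatever API you use):

    from datetime import datetime, timezone

    def stamped(role, content):
        # Prepend a wall-clock timestamp so the model can at least see ordering.
        # This only enables inference; it doesn't change how attention weighs turns.
        ts = datetime.now(timezone.utc).isoformat(timespec="seconds")
        return {"role": role, "content": f"[{ts}] {content}"}

    messages = [
        stamped("user", "My meeting is on Friday."),
        stamped("user", "Actually, move it to Monday."),  # the later turn should win
    ]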
LLMs certainly don't experience time like we do. They live in a uni-dimensional world that consists of a series of tokens (though it gets more nuanced if you account for multi-modal or diffusion models). They pick up some sense of ordering from their training data, such as "disregard my previous instruction," but it's not something they necessarily understand intuitively. Fundamentally, they're just following whatever patterns happen to be in their training data.
It has to be addressed architecturally with some sort of extension to transformers that can focus the attention on just the relevant context.
People have tried to expand context windows by reducing the O(n^2) attention mechanism to something more sparse and it tends to perform very poorly. It will take a fundamental architectural change.
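To make "more sparse" concrete: the simplest variant restricts each token to a local window. A toy numpy sketch of the masking pattern (real implementations never materialize the full n-by-n matrix, which is the whole point, and this is exactly the kind of scheme that tends to underperform):

    import numpy as np

    def sliding_window_attention(q, k, v, window=4):
        # q, k, v: (n, d) arrays. Each position attends only to itself and the
        # window-1 positions before it, instead of all n earlier positions.
        n, d = q.shape
        scores = q @ k.T / np.sqrt(d)
        mask = np.full((n, n), -np.inf)
        for i in range(n):
            mask[i, max(0, i - window + 1):i + 1] = 0.0  # causal local window
        scores = scores + mask
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        return (w / w.sum(axis=-1, keepdims=True)) @ v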
I'm not an expert but it seemed fairly reasonable to me that a hierarchical model would be needed to approach what humans can do, as that's basically how we process data as well.
That is, humans usually don't store exactly what was written in a sentence five paragraphs ago, but rather the concept or idea conveyed. If we need details, we go back and reread.
And when we write or talk, we form first an overall thought about what to say, then we break it into pieces and order the pieces somewhat logically, before finally forming words that make up sentences for each piece.
From what I can see there's work on this, like this[1] and this more recent paper[2]. Again, not an expert, so I can't comment on the quality of the references; they're just some I found.
Can one instruct an LLM to pick the parts of the context that will be relevant going forward? And then discard the existing context, replacing it with the new 'summary'?
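Something like this is what I have in mind (a hypothetical sketch; compact_context and complete are made-up names, the latter standing in for whatever chat-completion client you use):

    def compact_context(messages, complete, keep_recent=4):
        # Ask the model to summarize everything but the newest turns,
        # then replace the old turns with that summary.
        old, recent = messages[:-keep_recent], messages[-keep_recent:]
        if not old:
            return messages
        summary = complete(old + [{
            "role": "user",
            "content": "Summarize the conversation so far, keeping only the "
                       "facts and decisions that will matter going forward.",
        }])
        return [{"role": "system",
                 "content": f"Summary of earlier turns: {summary}"}] + recent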
Not parent, but in my opinion the answer here is yes. I agree that there is a real need here and a potentially solid value proposition (which is not the case with a lot of vscode-fork+LLM-based startups), but the whole point should be to combat the verbosity and featurelessness of LLM-generated code and text. Using an LLM on the backend to discover meaningful connections in the codebase may sometimes be the right call, but the output of that analysis should be some simple visual indication of control flow or dependency like you mention. At first look, the output in the editor reads more like an expansion than a distillation.
Unrelated, but I don't know why I expected the website and editor theme to be hay-yellow, or hay-yellow and black, instead of the classic purple on black :)
Thanks for the opinion! That makes a lot of sense and I like the concept of being an extension of a user's own analysis vs hosing them with information.
Yeah originally I thought of using yellow/brown or yellow/black but for some reason I didn't like the color. Plenty of time to go back though!
This is really interesting. I wonder–would it be possible to listen to an audiobook or PDF at 800 wpm once one learns how to understand the screenreader "language"? Presumably the cognitive load would get heavy if the content were a stream of unstructured prose as opposed to code.
Yes, that is how I usually consume my content. Cognitive load is actually lower for unstructured prose compared to code, think about fiction for example. Code is much denser.
When I read to relax, it is for enjoyment, so I don't aim to read as fast as possible. This is why I still listen to human narrated audiobooks, since a good narrator adds to the experience.
1) Sadly there isn't really. There are a few good blogs like Derek Lowe's "In the Pipeline" that centralize news, but no anonymous online forum like this.
2) Google scholar alerts, Twitter, Bluesky, and word of mouth.
3) I think our understanding of biological processes at the mesoscale is about to hit an inflection point, largely through advances in electron microscopy (cryo-ET) and the ability to perform simulations at this scale.
4) Not harder but definitely more messy and progress is less linear.
This is really cool! Any tips for finding poems hidden in a large block of text?
It reminds me of the poem composed from one of Trump's tweets: "O, the Pelican. so smoothly doth he crest. a wind god!" There are lots of other examples on r/othepelican.
(Creator here) It's something I'd like to spend more time on! I didn't have a good time with LLM prompting but I think maybe something deterministic similar to Nutrimatic https://nutrimatic.org/2024/ might produce better results.
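The deterministic core would just be an in-order match, something like this (toy strings, not the actual tweet):

    def appears_in_order(source, poem):
        # True if the poem's words occur in the source text in order
        # (erasure-poem style); a real search would enumerate candidate
        # poems rather than check a single one.
        words = iter(source.lower().split())
        return all(w in words for w in poem.lower().split())

    appears_in_order(
        "o the pelican is so smoothly crossing doth he crest like a wind god",
        "o the pelican doth crest a wind god")  # True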
Likewise, I pop it on the charger in the shower and occasionally at work if I'm at my desk. Access to alarms, timers, and reminders (via Siri) more than makes up for the inconvenience of frequent charging. Notifications for messages and e-mails are just a bonus that sometimes ends up being a double-edged sword. The only downsides for me are long bike rides, and that it is ugly (and getting too big, coming from a Casio F-91W).
I’ve owned both and the truth is Garmin makes the best fitness tracker that can be a smartwatch and Apple makes the best smartwatch that can be a fitness tracker.
This comment hits the nail on the head. Another big consideration with the technology in this paper that hasn't been mentioned in this thread is that it opens up a huge range of possibilities for reaching "undruggable" protein targets. Most drugs are small molecules that bind to sites on (relatively much larger) proteins, thereby getting in the way of their function. Unfortunately, the vast majority of proteins do not have a site that can be bound by a molecule in a way that 1) has high affinity, 2) has high specificity (doesn't bind to other proteins), and 3) actually abolishes the protein's activity.
With "induced proximity" approaches like the one in this study, all you need is a molecule that binds the target protein somewhere. This idea has been validated extensively in the field of "targeted protein degradation", where a target protein and an E3 ubiquitin ligase, a protein that recruits the cell's native proteolysis machinery, are recruited to each other. The target protein doesn't have to be inactivated by the therapeutic molecule because the proteolysis machinery destroys it, so requirement #3 from above is effectively removed.
The molecule in this study does something similar to targeted protein degradation, but this time using a protein that affects gene expression instead of one that recruits proteolysis machinery. The article focuses on the fact that cancers are addicted to BCL6. That is an important part of the study and an active area of research (another example at [1]), but it leaves out the fact that these induced-proximity platforms are much more generalizable than traditional small molecules, because it's the proteins they recruit that do all the work rather than the molecules themselves. This study goes a long way toward validating this principle, pioneered by targeted protein degradation and PROTACs, and shows that it can be applied broadly.
There are two semi-connected concepts at play here. Polarization in this context refers to the imbalance between opposing (i.e. "up" vs "down") spins in a given system. For most nuclei in organic systems, like protons, carbons, and nitrogens, this imbalance is naturally very small, which is the reason that magnetic resonance approaches like MRI usually have poor signal-to-noise. Hyperpolarization techniques usually involve the transfer of polarization from a source with a high imbalance, like a free electron, to a relevant target (in the original poster's example, 13C in pyruvate). The product in this case is hyperpolarized 13C, which has an "up"-to-"down" spin ratio much higher than regular 13C, which makes the signal-to-noise you get from the pyruvate much higher than it would be otherwise. Tumors love pyruvate, so this approach means that tumors will light up like a beacon in your MRI.
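To put a rough number on it: for spin-1/2 nuclei at thermal equilibrium,

    P = \frac{N_\uparrow - N_\downarrow}{N_\uparrow + N_\downarrow}
      = \tanh\!\left(\frac{\gamma \hbar B_0}{2 k_B T}\right)

which for 13C at 3 T and body temperature works out to only a few parts per million. Hyperpolarization pushes P toward tens of percent, i.e. roughly four to five orders of magnitude more signal.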
The physical rotation/tumbling of molecules in an MRI is also very important, because the strong magnetic field is the thing inducing the "up"-vs-"down" split in the first place, and if the molecular motion happens at certain frequencies relative to the external magnetic field, other interactions can come into play that affect the coherence of the nuclear spins (i.e. they can fall out of sync). Thankfully, the rotation of a small molecule like pyruvate is very fast (much higher than the "spin" frequency, a.k.a. the Larmor frequency, of 13C at the magnetic field strengths involved in MRI), so the physical tumbling of pyruvate doesn't really come into play when trying to measure its signal. It can be another story for molecules that don't tumble quickly, like the ones that make up tissues, fat, etc.
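(For scale: the Larmor frequency is

    \omega_0 = \gamma B_0, \qquad f_0 = \frac{\gamma}{2\pi} B_0 \approx 10.7\,\text{MHz/T} \times 3\,\text{T} \approx 32\,\text{MHz}

for 13C at a clinical 3 T field, while a small molecule like pyruvate tumbles with a rotational correlation time of tens of picoseconds, i.e. at rates around 10^10-10^11 s^-1, orders of magnitude faster.)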
Great explanation. Indeed, the "rotation/tumbling" is nothing to worry about. One of our biggest challenges is relaxation: the moment we polarize, the clock starts ticking. Usually the usable window is around 15-90 s. I'm in the field of para-hydrogen and have built automated systems that carry out the chemical reactions, polarization transfers, and sample cleaning. Many hyperpolarization experiments with para-hydrogen prefer nasty solvents such as chloroform or methanol, and it is a technical challenge to replace them with water within seconds. One of my favorite topics is xenon hyperpolarization: it's very elegant, requires no cleaning and no wet chemistry, and provides amazing lung scans.