
I think eventually all OSS projects/repos will suffer from this.

My bet is that git hosting providers like GitHub etc. should start providing features that give us a better signal-to-noise ratio.


Why would GitHub develop features that are adversarial to one of Microsoft’s favorite products?


So that you pay for both.


Not even the mafia has it that good. You only pay them so they won’t beat you up. Imagine if you paid them to beat you up and then paid them to protect you from them.


That's how you profit


Learning from Cloudflare: Host malware and DDoSers AND provide protection against them = $$$


I feel my question is naive in retrospect.


E.g. to secure the quality of training data?


Depends. I'm not suffering from it at all, but mine is a sort of research project producing variations on audio processing under the MIT license.

And I don't take pull requests: the only exception has been to accommodate a downstream user who was running a script to incorporate the code, and that was so far outside my usual experience that it took way too long to register that it was a legitimate pull request.


GitHub's owner is betting the farm on pushing slop, so that seems unlikely to happen there anytime soon.


They just need to offer you more slop to review the slop and give it a sloppiness score.


qemu & libvirt are already seeing a bunch of these. Here's a recent spammer sending AI slop reports:

https://gitlab.com/ququruza


I think the title is stating this: "Postgres LISTEN/NOTIFY does not scale"

That means for moderate cases you do not even have to care about this; 99% of PostgreSQL instances out there are not operating at big "scale".

As a sr. engineer, it's your responsibility to decide whether to build for "scale" from day zero or to ignore this, mindful that it will not affect you until a certain point.
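For the moderate case, the straightforward pattern really is enough. A minimal sketch with psycopg2 (the "jobs" channel name is just for illustration):

    # Listen on a channel and drain notifications as they arrive.
    import select
    import psycopg2

    conn = psycopg2.connect("dbname=app")
    conn.autocommit = True  # LISTEN takes effect outside a transaction
    cur = conn.cursor()
    cur.execute("LISTEN jobs;")

    while True:
        # Block until the connection's socket is readable (5 s timeout).
        if select.select([conn], [], [], 5) == ([], [], []):
            continue  # timed out, loop again
        conn.poll()
        while conn.notifies:
            note = conn.notifies.pop(0)
            print(f"channel={note.channel} payload={note.payload}")

The scaling problem the article describes only shows up once NOTIFY traffic gets heavy.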


The parrot on the OG's shoulder is the goal (besides raising the ARR lol)


Funny thing is he's my fiancé's bird, which she's had for 20 years, but somehow the only person he likes now is me. Kind of a bummer for her. But he's happy I guess haha


I love it! Amazing work.

A slider to do a bit of time-travelling, if possible, would also be a nice-to-have.


> Start with: “How do you spend most of your time?” Not “What do you do?” It opens people up beyond job titles.

This is something that feels alien to people in SF. For example, a fundamental difference between Greece and SF:

- Greek opening question: "Which city are you from?"
- SF opening question: "Which company do you work for?"


> "Which city are you from?"

Many big tech companies have inclusion training calling this question out as inappropriate on the grounds it provides an opportunity to introduce bias.


Sure, that's advised in interviews, where you're about to make a decision on someone's livelihood, hence the importance of reducing bias. That's a completely different context than in casual conversation at a social event.


This is nonsense as there is no such thing as unbiased personal interaction.


It's a networking event, not an interview loop.

Along the same lines, don't ask anything; everything introduces bias. E.g. "What do you do?" is also a bias, as they are an engineer, or product, or sales, or whatever.


In most parts of the world this is not true.


Even in the US it's ridiculous advice, driven by fear rather than a rational assessment of policy. Asking people where they are from is just fine.


In SF, people are from so many different places that there is little to no chance you have personal experience or knowledge of the city the person would answer with. The same is probably not true of someone who grew up in Greece asking someone else who grew up in Greece.


It was really enjoyable to read. And I also do not read a lot of fiction, my last book being the Hitchhiker's Guide to the Galaxy series.

My verdict is that Project Hail Mary was much more engaging in terms of storytelling. The concepts were cool, and tbh I look forward to the movie to see whether the adaptation does it justice.


Long story short, I didn't want to make that analysis/distinction because it would miss the point.

They excel in their respective areas based on the architectural decisions they've made for the use cases they wanted to optimize for.

PlanetScale, with their latest Metal introduction, optimized for super low latency (they act like they've reinvented the wheel, lol), but they clearly have something in mind going in this direction.

Neon offers many managed features for serverless PostgreSQL that were missing in the market, like instant branching, and with auto-scaling, you may perform better with variable workloads. From their perspective, they wanted to serve other use cases.

There's no reason to always compare apples to oranges, and no reason to hate one another when everyone is pushing the managed database industry forward.


> PlanetScale, with their latest Metal introduction, optimized for super low latency (they act like they've reinvented the wheel, lol), but they clearly have something in mind going in this direction.

I’ve spoken to them personally, and didn’t get the impression at all that they think they’ve “re-invented the wheel.” More like they realized that separating compute and storage was a god-awful idea, and are bringing back how things used to be in the days of boring tech.

Also, re: branching, PS MySQL definitely has that. I assume they’ll bring it to Postgres.


Why is it an awful idea? I don't understand the trade-offs well.


Bear in mind I have a large bias towards performance, and am a DBRE, so I also have strong opinions about normalization.

Separating compute and storage means that if you ever have to hit the disk - which is every time for writes, and depending on your working set size, often for reads as well - you’re getting a massive latency hit. I’ll use Amazon Aurora as an example, because they’re quite open with their architecture design, they’re the largest player in this space, and I’m personally familiar with it.

Aurora’s storage layer consists of 6 nodes split across 3 AZs. For a write to be counted as durable, it needs to be ack’d by 4/6 nodes, which means 2/3 AZs. That’s typically a minimum of 1 msec, though they do get written in parallel, which helps. 1 msec may not sound like much, but it’s an eternity for traditional SSD access.
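To make the latency math concrete, here's a toy model of the quorum write (illustrative numbers only, not Aurora's real internals): with 6 parallel node writes and a 4/6 quorum, commit latency is the 4th-fastest ack, so one slow AZ still hurts you.

    # Toy model of a 4-of-6 quorum write; latencies are made up.
    import random

    def quorum_commit_latency_ms(n=6, quorum=4):
        # Simulated per-node ack latencies: ~0.8-2.0 ms cross-AZ round trips.
        acks = sorted(random.uniform(0.8, 2.0) for _ in range(n))
        return acks[quorum - 1]  # the write commits when the 4th ack lands

    samples = sorted(quorum_commit_latency_ms() for _ in range(10_000))
    print(f"median commit latency: {samples[len(samples) // 2]:.2f} ms")
    # Compare with a local NVMe write at roughly 0.02-0.1 ms.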

MySQL is even worse with Aurora, because of its change buffer. Normally, writes (including deletes) to indexed columns (secondary indices) result in the changes to the indices being buffered, which avoids random I/O. Since Aurora's architecture is so wildly different from vanilla MySQL, it can't do that, and all writes to secondary indices must happen synchronously.

Given most SaaS companies' tendency to eschew RDBMS expertise in favor of full-stack teams, and those teams' tendency to use JSON[B] for everything, poor normalization practices, and sub-optimal queries, all of this adds up to a disastrous performance experience.

I have a homelab with Dell R620s, which originally came out in 2012. Storage is via Ceph on Samsung PM983 NVMe drives, connected with Mellanox ConnectX3-Pro in a mesh. These drives are circa-2013. Despite the age of this system, it has consistently out-performed Aurora MySQL and Postgres in benchmarks I've done. The only instance classes that can match it are, unsurprisingly, those with local NVMe storage.

In fairness, it isn't _all_ awful. Aurora does have one feature that is extremely nice: survivable page cache. If an instance restarts, in most circumstances, you don't lose the buffer pool / shared buffers on the instance. This means you don't have the typical cold start performance hit. That is legitimately cool tech, and quite useful. I'm less sold on the other features, like auto-scaling. If you're planning for a peak event (e.g. a sales event for e-commerce), you know well in advance, and have plenty of time to bring new instances online. If you have a surprise peak event, auto-scaling is going to take 30 minutes to 1 hour for the new instances to come online, which is an extremely long time to be sitting in a degraded state. This isn't really any faster than RDS, though again to Aurora's credit, the fact that all instances share the same underlying cluster volume means that there is no delay when pulling in blocks from S3.

Finally, Aurora’s other main benefit, as I alluded to, is that its shared cluster volume means that replication lag is typically quite low; 10-30 msec IME. However, also IME, devs don’t design apps around this, and anything other than instantaneous is too slow, so it doesn’t really matter.


Just seeing this now. (Is there a way to turn on notifications here?)

Thanks for the breakdown!

And I suppose they are eating the cost of worse latency to allow for more scalability?


Well, I imagine at least the emotional aspect of this squabble had more than a billion reasons injected via Databricks.


It is always about the trade-off between those two parameters.

Of course an increase in both is optimal, but a small sacrifice in performance/accuracy for being 200% faster is worth noting. Around a 10% drop in accuracy for a 200% speed-up: some would take it!


Also, that “speed up” is actually hiding “less compute used”, which is a proxy for cost. Assuming this is 200% faster purely because it needs less compute, that should mean it costs roughly 1/3 as much to run for a 10% decrease in quality of output.
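Spelled out (assuming cost scales linearly with compute time):

    speedup = 1 + 2.00           # "200% faster" => 3x throughput
    relative_cost = 1 / speedup  # ~0.33, i.e. roughly 1/3 the cost
    quality_drop = 0.10
    print(f"cost: {relative_cost:.0%} of baseline, quality: -{quality_drop:.0%}")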



What is the full stack you are using? What parts also require scaling up? DB, bandwidth orrr?


So the stack is datastar + clojure + sqlite + caddy on a hetzner VPS 2 core 2G.

The scaling was preemptive as it was hitting 150% CPU (out of 200%). Needed to power down to rescale.

Now it's hovering at around 200% (out of 400%). About 80kb/s and 10 disk iops.

Everything goes via a sqlite db.


How are you doing the throttling? I imagine one malicious player might do like 10000 iops to your DB.


So there's no rate limiting.

But because it's sqlite there's a single writer. Everything gets batched as one transaction every 100ms. The operations on a single chunk get squashed into a single write.

Even without the squashing, sqlite can handle 10,000-20,000+ updates/s because of the transaction batching.

With the chunk based squashing all edits to a chunk in that 100ms window become one update, so it can scale quite well.
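Roughly the shape of that writer loop, as a minimal sketch (in Python rather than the site's Clojure; the table, schema, and queue are invented for illustration):

    # One writer thread owns the SQLite connection; request handlers
    # just enqueue edits. Every 100 ms the loop squashes edits per chunk
    # and commits them as a single transaction.
    import queue, sqlite3, threading, time

    edits = queue.Queue()  # (chunk_id, content) pairs from handlers

    def writer_loop(db_path="board.db"):
        conn = sqlite3.connect(db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS chunks (id INTEGER PRIMARY KEY, content TEXT)"
        )
        while True:
            time.sleep(0.1)  # the 100 ms batching window
            pending = {}
            while not edits.empty():
                chunk_id, content = edits.get_nowait()
                pending[chunk_id] = content  # later edits squash earlier ones
            if pending:
                with conn:  # one transaction per window
                    conn.executemany(
                        "INSERT INTO chunks (id, content) VALUES (?, ?) "
                        "ON CONFLICT(id) DO UPDATE SET content = excluded.content",
                        list(pending.items()),
                    )

    threading.Thread(target=writer_loop, daemon=True).start()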


But the question wasn't really if you can handle it, but more if someone could control the whole board themselves, I think?


Oh, I mean they could try. It's a very big board. Probably possible though if someone is sufficiently motivated.


I see a snake eating the 0,0 part now at least, heh.

Btw, if I add a ' to the string, it's impossible for others to override. At least in the UI on Firefox, the snake still ate it.


TIL higgs-bugson and Heisenbug

