AgentOrange1234 19 hours ago [-]
"Every optional field is a question the rest of the codebase has to answer every time it touches that data,"
This is a beautiful articulation of a major pet peeve when using these coding tools. One of my first review steps is just looking for all the extra optional arguments it's added instead of designing something good.
shepherdjerred 14 hours ago [-]
There's nothing specific to AI about this. Humans make the same mistake.
To solve this permanently, use a linter and apply a "ratchet" in CI so that the LLM cannot use ignore comments
oblio 12 hours ago [-]
Is there a Python linter that does this?
raulparada 6 hours ago [-]
Not out-of-the-box afaik, but we use https://ast-grep.github.io (on a pre-commit hook) for such cases, which bridges the linter gaps nicely.
datsci_est_2015 8 hours ago [-]
Not that I’m aware of (writing Python 10+ years). Suppose you could vibecode one yourself though.
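A hand-rolled ratchet really is only a few lines of Python. The sketch below is illustrative, not a real tool: the suppression patterns and baseline file name are arbitrary choices.

```python
# ratchet.py: fail CI if the number of lint-suppression comments grows.
# The patterns and baseline file name are illustrative choices.
import pathlib
import re
import sys

SUPPRESS = re.compile(r"#\s*(noqa|type:\s*ignore)")

def count_suppressions(root: str) -> int:
    # Count every line in the tree that carries a suppression comment.
    return sum(
        1
        for path in pathlib.Path(root).rglob("*.py")
        for line in path.read_text().splitlines()
        if SUPPRESS.search(line)
    )

def ratchet(root: str, baseline_file: str = ".suppression-baseline") -> bool:
    current = count_suppressions(root)
    path = pathlib.Path(baseline_file)
    baseline = int(path.read_text()) if path.exists() else current
    if current > baseline:
        print(f"suppressions grew: {current} > baseline {baseline}")
        return False
    path.write_text(str(current))  # ratchet down whenever the count improves
    return True

if __name__ == "__main__":
    sys.exit(0 if ratchet(sys.argv[1] if len(sys.argv) > 1 else ".") else 1)
```

Run it in CI and commit the baseline file; the count can only go down.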
KronisLV 7 hours ago [-]
I've been writing my own linter that's supposed to check projects regardless of the technology (e.g. something that focuses on architecture and conventions, alongside something like Oxlint/Oxfmt and Ruff and so on), with Go and goja: https://github.com/dop251/goja
Basically just a bunch of .js rules that are executed like:
projectlint run --rules-at ./projectlint-rules ./src
Which in practice works really well and can be in the loop during AI coding. For example, I can disallow stuff like eslint-disable for entire files and demand a reason comment to be added when disabling individual lines (that can then be critiqued in review afterwards), with even the error messages giving clear guidelines on what to do:
  var WHAT_TO_DO = "If you absolutely need to disable an ESLint rule, you must follow the EXACT format:\n\n" +
      "// prebuild-ignore-disallow-eslint-disable reason for disabling the rule below: [Your detailed justification here, at least 32 characters]\n" +
      "// eslint-disable-next-line specific-rule-name\n\n" +
      "Requirements:\n" +
      "- Must be at least 32 characters long, to enforce someone doesn't leave just a ticket number\n" +
      "- Must specify which rule(s) are being disabled (no blanket disables for ALL rules)\n" +
      "- File-wide eslint-disable is not allowed\n\n" +
      "This is done for long term maintainability of the codebase and to ensure conscious decisions about rule violations.";
The downside is that such an approach does mean that your rules files will need to try to parse what's in the code based on whatever lines of text there are (hasn't been a blocker yet), but the upside is that with slightly different rules I can support Java, .NET, Python, or anything else (and it's very easy to check when a rule works).
And since the rules are there to prevent AI (or me) from doing stupid shit, they don't have to be super complex or perfect either, just usable for me. Furthermore, since it's Go, the executable ends up being a 10 MB tool I can put in CI container images, or on my local machine, and for example add pre-run checks for my app, so that when I try to launch it in a JetBrains IDE, it can also check for example whether my application configuration is actually correct for development.
Currently I have plenty in regards to disabling code checks, that reusable components should show up in a showcase page in the app, checking specific configuration for the back end for specific Git branches, how to use Pinia stores on the front end, that an API abstraction must be used instead of direct Axios or fetch, how Celery tasks must be handled, how the code has to be documented (and what code needs comments, what format) and so on.
Obviously the codebase is more or less slop so I don't have anything publish worthy atm, but anyone can make something like that in a weekend, to supplement already existing language-specific linters. Tbh ECMAScript is probably not the best choice, but hey, it's just code with some imports like:
  // Standalone eslint-disable-next-line without prebuild-ignore
  if (trimmed.indexOf("// eslint-disable-next-line") === 0) {
    projectlint.error(file, "eslint-disable-next-line must be preceded by: " + IGNORE_MARKER, {
      line: lineNum,
      whatToDo: WHAT_TO_DO
    });
    continue;
  }
Can personally recommend the general approach, maybe someone could even turn it into real software (not just slop for personal use that I have), maybe with a more sane scripting language for writing those rules.
pipes 48 minutes ago [-]
Optional fields can be addressed with good defaults. Well, that is how I think about it. I.e. if they aren't passed in, they are set to a sensible default value.
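In Python terms, a small sketch of that idea (the field names are made up for illustration):

```python
# Sketch: defaults instead of Optional fields, so downstream code
# never has to ask "what if this is None?". Field names are invented.
from dataclasses import dataclass

@dataclass
class RetryPolicy:
    max_attempts: int = 3           # a safe default, not Optional[int]
    backoff_seconds: float = 1.0
    jitter: bool = True

# Callers who don't care just take the defaults:
default = RetryPolicy()
# Callers who do care override explicitly:
aggressive = RetryPolicy(max_attempts=10, backoff_seconds=0.1)
```

Every consumer of a `RetryPolicy` can rely on all three fields being set; nobody downstream has to branch on missing values.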
xiaolu627 16 hours ago [-]
What changed for me isn’t that AI writes bad code by default, but that it lowers the friction to adding code faster than the team can properly absorb it. The dangerous part is not obvious bugs, it’s subtle erosion of consistency.
vinnymac 7 hours ago [-]
Well said. I have to review PRs of non-software developers nowadays.
The “what is this trying to do?” question has never been harder to answer. It creates scenarios where 99% is correct, but the most important area is subtly broken. I prefer it to be human, where 60-80% will be correct and the problematic areas begin to smell more and more gradually.
In my experience LLMs, at times, may hide the truth from you in a haystack made of needles.
thienannguyencv 5 hours ago [-]
This matches my observation closely. The error isn't incorrect code. It's code that looks specific to your system but is actually a generic pattern applied from the training process. The structure is correct, the logic is sound; it just doesn't interact with what your source code actually does.
Harder to catch because nothing is factually wrong. You have to ask: could this output have been produced without actually reading my codebase?
riteshkew1001 3 hours ago [-]
Been seeing the exact same drift in tool configurations, not just source code. MCP server setups specifically: a tool declares "read-only file access" but the actual implementation has write capabilities nobody reviewed. Same dynamic as the optional-field creep, but at the integration boundary where nobody is even looking. Your linter catches a bad function signature; nothing catches a tool description that doesn't match what the tool actually does. And most of it is AI generated, not reviewed properly.
ChrisMarshallNY 22 hours ago [-]
Because of the way that I use AI, I am constantly looking at the code. I usually leave it alone, if I can; even if I don't really like it.
I will, often go back, after the fact, and ask for refactors and documentation.
It works. Probably a lot slower than using agents, but I test every step, and it is a lot faster than I would do it, unassisted.
benswerd 22 hours ago [-]
I don't think testing the product alone is good enough, because when you give it tests it has to pass it prioritizes passing them at the expense of everything else — including code quality. I've seen it pull in random variables, break semantic functions, etc.
theshrike79 10 hours ago [-]
Code quality can also be codified. If you can't express "code quality" deterministically, then it's all just feels.
And if you can define "quality" in a way the agent can check against it, it will follow the instructions.
embedding-shape 9 hours ago [-]
> then it's all just feels
Would that be so bad? "Readability" sure is subjective, so it seems "code quality" is.
Ask 10 programmers what quality a snippet of code is, and you'll get 10 different answers.
theshrike79 9 hours ago [-]
And there is the problem. Then you start arguing about brace positions and function names and whether simple data classes should have docstrings on properties or not.
All that time it's people arguing with people and wasting time on pure feels. People will get offended and angry and defensive, nothing good ever comes from it.
But when you pick a style and enforce it with a tool like gofmt or black both locally and in the CI, the arguments go away. That's the style all code merged to the codebase must look like and you will deal with it like a professional.
Go proverb: "Gofmt's style is no one's favorite, yet gofmt is everyone's favorite."
embedding-shape 9 hours ago [-]
"Style" is such a small part about what people generally care about when they talk about code quality though, useful/intuitive abstractions, the general design and more tends to be a lot more important and core to the whole code quality debate.
theshrike79 7 hours ago [-]
Linters can be set to check for cyclomatic complexity, using old/inefficient styles of programming (go fix ftw), etc. Formatting is just an easy and clear example that everyone should understand.
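In Python, for instance, Ruff ships a McCabe complexity check; a sketch of the pyproject.toml configuration (the threshold of 10 is an arbitrary choice, not a recommendation):

```toml
[tool.ruff.lint]
select = ["C901"]          # enable the McCabe complexity rule

[tool.ruff.lint.mccabe]
max-complexity = 10        # flag any function more complex than this
```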
embedding-shape 4 hours ago [-]
Right, but all of those are easy, left is the actually hard stuff...
mchaver 7 hours ago [-]
> And there is the problem. Then you start arguing about brace positions and function names and whether simple data classes should have docstrings on properties or not.
In my 15 years of experience I have not worked at a place like this. Those are distractions. Anytime something about style has been brought up, the solution was to just enforce a linter/pre-commit process/blacklist for certain functions, etc. It can easily be automated. When those tools don't exist for particular ecosystems we made our own.
datsci_est_2015 8 hours ago [-]
> And there is the problem. Then you start arguing about brace positions and function names and whether simple data classes should have docstrings on properties or not.
Holy strawman Batman!
Have you ever given a code review? These are the lowest items on the totem pole of things usually considered critical for a code review.
Here’s an example code review from this week from me to a colleague, paraphrased:
“We should consider using fewer log statements and raising more exceptions in functions like this. This condition shouldn’t happen very often, and a failure of this service is more desirable than it silently chugging along but filling STDOUT with error messages.”
theshrike79 7 hours ago [-]
So you're fine with people using, for example, different brace styles at random? Or one person uses var everywhere, other uses definite types. One adds standard docstrings on every function and property, one never comments a single line of code.
Don't you have "format on save" enabled in your editor? When you open a file, change two lines and save -> boom 500 changed lines because the previous programmer had different formatting rules than you. Whoops.
This is why the low totem pole stuff needs to be enforced automatically so that actual humans can focus on the higher stuff that's about feels and intuition - things that are highly context dependent and can't be codified into rules.
dotancohen 6 hours ago [-]
You're bikeshedding in a conversation about real issues.
After a certain point in your career you don't care what brace style the new dev used, even if the project has lint rules. You do care if critical errors are ignored and possibly incorrect data is returned. These two situations are in no way equivalent, no need to bikeshed the former when discussing the latter.
datsci_est_2015 7 hours ago [-]
> Code quality can also be codified. If you can't express "code quality" deterministically, then it's all just feels. And if you can define "quality" in a way the agent can check against it, it will follow the instructions.
…
> This is why the low totem pole stuff needs to be enforced automatically so that actual humans can focus on the higher stuff that's about feels and intuition - things that are highly context dependent and can't be codified into rules.
I’m confused, have you switched your position on this topic over the course of this thread? Maybe I’ve misinterpreted your position entirely. If so, my bad.
deadbabe 9 hours ago [-]
Amateurs are the one who argue about syntax.
Code quality is about how well a piece of code expresses what it intends to do. It’s like quality writing.
theshrike79 7 hours ago [-]
You start to care about standard syntactic rules and enforced naming conventions when you're the one waking up 4 in the morning on a Saturday to an urgent production issue and you need to fix someone else's code that's written in a completely incoherent style.
It "expresses what it intends to do" perfectly well - for the original author. Nobody else can decipher it without spending significant amounts of memory cycles.
Jack Kerouac is "quality writing" as is the Finnish national epic Kalevala.
But neither are the kind you want to read in a hurry when you need to understand something.
I want the code at work to be boring, standard and easy to understand. I can get excited by fancy expressive tricks on my own time.
deadbabe 6 hours ago [-]
What do you mean exactly? Are you the type that hates seeing comprehensions and higher-order functions and would rather just see long for loops and nested ifs?
ChrisMarshallNY 9 hours ago [-]
Syntax and style can be very important, when transferring code.
I’m generally of the opinion that LLM-supplied code is “prolix,” but works well. I don’t intend to be personally maintaining the code, and plan to have an LLM do that, so I ask the LLM to document the code, with the constraint being, that an LLM will be reading the code.
It tends to write somewhat wordy documentation, but quite human-understandable.
In fact, it does such a good job, that I plan on having an LLM rewrite a lot of my docs (and I have a lot of code documentation. My cloc says that it’s about 50/50, between code and documentation).
Personally, I wish Apple would turn an LLM loose on the header docs for their SwiftUI codebase. It would drastically improve their docs (which are clearly DocC).
[EDITED TO ADD] By the way, it warms my heart to see actual discussion threads on code Quality, on HN.
benswerd 4 hours ago [-]
I disagree with this.
My team will send me random snippets from OSS libraries and we all go WTF what is that, and my team will also send really clever lines and we'll go wow.
"Good code" is subjective, but good engineers have good taste, and taste is real.
datsci_est_2015 8 hours ago [-]
> Code quality can also be codified.
Do you think that no one has tried this over the past 80 years with human programmers, but now with LLMs we can suddenly manage to do it? Why do linters and formal verification and testing exist if we could've just codified code quality in the first place?
To me, this is like telling a carpenter that we can codify what makes a chair comfortable or not.
stpedgwdgfhgdd 13 hours ago [-]
You can ask it to /simplify
Related, it seems to me that there are two types of tests, the ones created in a TDD style and can be modified and the ones that come from acceptance criteria and should only be changed very carefully.
ChrisMarshallNY 22 hours ago [-]
Oh, no. I test. Each. and. Every. Step.
I use a test harness, and step through the code, look at debug logs, and abuse the code, as much as possible.
This could have been html instead of whatever awful moving pattern it is.
christophilus 7 hours ago [-]
Wow. You weren’t joking.
abdusco 4 hours ago [-]
God forbid people use CSS to build something cool
mattacular 20 hours ago [-]
Code cannot and should not be self documenting at scale. You cannot document "the why" with code. In my experience, that is only ever used as an excuse not to write actual documentation or use comments thoughtfully in the codebase by lazy developers.
bdangubic 20 hours ago [-]
This always starts out right, but over the years the code changes and its documentation seldom does, even on the best of teams. The amount of code documentation I have seen that is just plain wrong (it was right at some point) far outnumbers the amount that was actually in sync with the code. 30 years in the industry, so large sample size. Now I prefer no code documentation in general.
layer8 5 hours ago [-]
The good thing about having documentation in the (version-controlled) code is that it allows you to retrace when it was correct (using git blame or equivalent), and that gives you background about why certain things are the way they are. I 100% prefer outdated documentation in the code to no documentation.
derrak 19 hours ago [-]
Are there any good systems that somehow enforce consistency between documentation and code? Maybe the problem is fundamentally ill-posed.
It's not a massively complex AI monstrosity (it's from 2018 after all) or a perfect solution, but it's a good jumping off point.
With a slight sprinkling of LLM this could be improved quite a bit. Not by having the agent write the documentation necessarily, but for checking the parity and flagging it for users.
For example a CI job that checks that relevant documentation has been created / updated when new functionality is added or old one is changed.
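A minimal sketch of such a parity check in Python. The src/ and docs/ layout and the origin/main base branch are assumptions, not part of any real tool:

```python
# docs_parity.py: fail CI when source changed but no docs did.
# The src/ and docs/ paths and the origin/main base are assumptions.
import subprocess

def changed_files(base: str = "origin/main") -> list[str]:
    # Ask git which files differ from the base branch.
    out = subprocess.run(
        ["git", "diff", "--name-only", base],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.split()

def docs_parity_ok(files: list[str]) -> bool:
    src_changed = any(f.startswith("src/") for f in files)
    docs_changed = any(f.startswith("docs/") for f in files)
    # Only complain when source moved and documentation didn't.
    return docs_changed or not src_changed

if __name__ == "__main__":
    raise SystemExit(0 if docs_parity_ok(changed_files()) else 1)
```

An LLM step could then judge whether the touched docs actually describe the touched code, rather than just checking that both changed.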
dec0dedab0de 9 hours ago [-]
interesting that they don’t mention doctest which has been a python built-in for quite a while.
It allows you to write simple unit tests directly in your doc strings, by essentially copying the repl output so it doubles as an example.
combined with something like sphinx that is almost exactly what you’re looking for.
doctest kind of sucks for anything where you need to set up state, but if you’re writing functional code it is often a quick and easy way to document and test your code/documentation at the same time.
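For instance, a hypothetical pure function whose docstring doubles as its test suite:

```python
# A hypothetical pure function: the REPL-style examples in the
# docstring are executed by doctest, so docs and tests stay in sync.
import re

def slugify(title: str) -> str:
    """Turn a title into a URL slug.

    >>> slugify("Hello, World!")
    'hello-world'
    >>> slugify("  Already  spaced  ")
    'already-spaced'
    """
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # silent when all docstring examples pass
```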
That system is a unit test that checks that functions are documented in the documentation. Nothing to do with docstrings.
dec0dedab0de 7 hours ago [-]
right but docstrings are documentation, so if your doctest is working, then at least that part of the documentation is correct.
Even without doctest, generating your documentation from docstrings is much easier to keep updated than writing your documentation somewhere else, because it is right there as you are making changes.
sgc 16 hours ago [-]
I am not saying it doesn't matter because it does, but how much does it matter now since we can get documentation on the fly?
I started working on something today I hadn't touched in a couple years. I asked for a summary of code structure, choices I made, why I made them, required inputs and expected outputs. Of course it wasn't perfect, but it was a very fast way to get back up to speed. Faster than picking through my old code to re-familiarize myself for sure.
codingdave 8 hours ago [-]
We cannot get full documentation on the fly, though. We can get "what this does" level of documentation for the system that AI is looking at. And if all you are doing is writing some code, maybe that is enough. But AI cannot offer the bigger picture of where it fits in the overall infrastructure, nor the business strategy. It cannot tell you why technical debt was chosen on some feature 5-10 years ago. And those types of documentation are far more important these days, as people write less of the code by hand.
This is the same discussion that goes round ad nauseum about comments. Nobody needs comments to tell us what the code does. We need comments to explain why choices were made.
reverius42 15 hours ago [-]
Keeping the documentation in the repo (Markdown files) and using an AI coding agent to update the code seems to work quite well for keeping documentation up to date (especially if you have an AGENTS.md/CLAUDE.md in the repo telling it to always make sure the documentation is up to date).
jurgenburgen 15 hours ago [-]
Ultimately the code is the documentation.
benswerd 14 hours ago [-]
This is correct. Comments serve a purpose too, but they should only be used when code fails to self document which should be the exception.
earljwagner 15 hours ago [-]
The concepts of Semantic Functions and Pragmatic Functions seem to be analogous to a Functional Core and Imperative shell (FCIS):
The key insight of FCIS is that complicated logic with large dependencies leads to a large test suite that runs slowly. The solution is to isolate the complicated logic in the functional core. Test that separately from the simpler, more sequential tests of the imperative shell.
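A toy Python sketch of the split (the order/payment names and collaborators are invented for illustration):

```python
# Functional core: pure logic, no IO, cheap to test exhaustively.
def apply_discount(total_cents: int, loyalty_years: int) -> int:
    rate = min(loyalty_years, 10) * 0.01   # 1% per year, capped at 10%
    return round(total_cents * (1 - rate))

# Imperative shell: sequencing and IO, tested more lightly.
# db and gateway are hypothetical collaborators, not a real API.
def checkout(order_id: str, db, gateway) -> None:
    order = db.load_order(order_id)
    total = apply_discount(order["total_cents"], order["loyalty_years"])
    gateway.charge(order["customer"], total)
    db.mark_paid(order_id)
```

The core needs no database or payment mocks at all; the shell only needs a couple of smoke tests for its sequencing.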
bcjdjsndon 7 hours ago [-]
I think it's much better put in your link. OP is too vague on what constitutes pragmatic vs semantic, when what he should just say is: make it pure functional, because then you don't have to simulate a database in your test suite.
abcde666777 21 hours ago [-]
My intentionality is that I'll never let it make the changes. I make the changes. I might make changes it suggests, but only upon review and only written with my hands.
benswerd 21 hours ago [-]
I think this style of work will go away. I was skeptical but I now write the majority of my code through agents.
abcde666777 15 hours ago [-]
I don't think it will go away, I think there will remain a niche for code where we care about precision. Maybe that niche will get smaller over time, but I think it will be a hold out for quite a while. A loose analogy I've found myself using of late is comparing it to bespoke vs off the shelf suits.
For instance, two things I'm currently working on:
- A reasonably complicated indie game project I've been doing solo for four years.
- A basic web API exposing data from a legacy database for work.
I can see how the API could be developed mostly by agents - it's a pretty cookie cutter affair and my main value in the equation is just my knowledge of the legacy database in question.
But for the game... man, there's a lot of stuff in there that's very particular when it comes to performance and the logic flow. An example: entities interacting with each other. You have to worry about stuff like the ordering of events within a frame, what assumptions each entity can make about the other's state, when and how they talk to each other given there's job based multi-threading, and a lot of performance constraints to boot (thousands of active entities at once). And that's just a small example from a much bigger iceberg.
I'm pretty confident that if I leaned into using agents on the game I'd spend more time re-explaining things to them than I do just writing the code myself.
benswerd 14 hours ago [-]
I write systems rust on the cutting edge all day. My work is building instant MicroVM sandboxes.
I was shocked recently when it helped me diagnose a musl compile issue, fork a sys package, and rebuild large parts of it in 2 hours. Would've taken me at least 2 weeks to do it without AI.
Don't want to reveal the specific task, but it was a far out of training data problem and it was able to help me take what would've normally taken 2 weeks down to 2 hours.
Since then I've been going pretty hard at maximizing my agent usage, and tend to have a few going at most times.
brabel 7 hours ago [-]
Yeah, a lot of us were at the point the other guy is now, thinking that writing code by hand is still an acceptable way to go. It just isn't anymore, unless you can justify spending 5 times more time on a task because you have some principle that code needs to be written by hand. And the funny thing is that the more complex the code base, the more appropriate it becomes to only touch it with AI, since AI can keep a lot more concepts in its mind than us humans with our paltry 7 or so. I think only a few die-hard programmers will still think that way a year from now.
abcde666777 7 hours ago [-]
That you even describe it as holding concepts in its mind sounds like confusion to me.
As does the reductionist idea that human thinking is something crude in comparison.
layer8 4 hours ago [-]
Diagnosis is very different from writing code, though. I fully agree that it can be very helpful for analysis and search, but I don’t let it write code.
newsicanuse 12 hours ago [-]
People like OP are the reason why the demand for software engineers will rise exponentially.
dougg 16 hours ago [-]
I see this a lot in research as well, unfortunately including myself. I do miss college where I would hand write a few thousand lines of code in a month, but i’m just so much more productive now.
thepukingcat 19 hours ago [-]
+1 for this. Once you have a solid plan with the AI and prompt it to make one small change at a time, reviewing as you go, you can still be in control of your code without writing a single line.
android521 13 hours ago [-]
unfortunately, unless you are god level good (i would say top 100 developers in the entire world), you will be fired eventually.
bandrami 13 hours ago [-]
Dude I still get contracts in ColdFusion. You guys have no idea how slowly actual businesses actually move.
sfn42 11 hours ago [-]
Lol
gravitronic 21 hours ago [-]
*adds "be intentional" to the prompt*
Got it, good idea.
clbrmbr 22 hours ago [-]
Page not rendering well on iPhone Safari.
Good content tho!
deadlypointer 8 hours ago [-]
Site is totally broken on mobile. Just because Cursor can vibe code a nice rolling-scrolling shithole, chances are it will break on some platform/browser.
maciejj 5 hours ago [-]
I've noticed the cleaner the codebase, the better AI agents perform on it. They pick up on existing patterns and follow them. Throw them at a messy repo and they'll invent a new pattern every time.
It's basically like hiring a new developer for one task and letting them go right after. They don't know your conventions, your history, or why things are the way they are. The only thing they have is what they can see in the code. Your code quality is basically the prompt now.
divyanshu_dev 13 hours ago [-]
The velocity problem is real. AI makes it easy to add things faster than you can understand what you added. The intentionality has to come before you prompt, not after you review.
theshrike79 10 hours ago [-]
Why?
You can ask the agent to make 10 different solutions in the time it takes you to make 0.5.
Then you review them based on whatever criteria you feel is right and either throw them all away and do it yourself (maybe with inspiration from the other solutions) or pick one to progress further.
diatone 8 hours ago [-]
If 9 of those solutions are crummy and reviewing them takes longer than just doing it right once…
amavashev 8 hours ago [-]
Agree, you need to do your own code review, although as AI gets better, this problem will most likely be solved.
benswerd 23 hours ago [-]
I've seen a lot of people talking about how AI is making codebases worse. I reject that, people are making codebases worse by not being intentional about how their AI writes code.
This is my take on how to not write slop.
peacebeard 22 hours ago [-]
Agreed. When you submit code you must take responsibility for its quality. Blaming AI for low quality code is like blaming hammers for giant holes in the drywall. If you don't know how to use AI tools without confidence that your code is high quality, you need to re-assess how you use those tools. I'm not saying AI tools are bad. They're great. But the prevalence of people pushing the tools beyond their limits is not a failure of the tools. Vibe coding may be fun but tight-leash high-oversight AI usage is underrated in my opinion.
newAccount2025 20 hours ago [-]
I think this is mostly right.
In a blameless postmortem style process, you would look at not just the mistake itself but the factors influencing the mistake and how to mitigate them. E.g., doctor was tired AND the hospital demanded long hours AND the industry has normalized this.
So yes, the programmers need to hold the line AND ALSO the velocity of the tool makes it easy to get tired AND its confidence and often-good results promote laziness (or maybe folks just don't know better) AND it can thrash your context and bounce you around the code base, making it hard to remember the subtleties, AND on and on.
Anyway, strong agree on “dude, review better” as a key part of the answer. Also work on all this other stuff and understand the cost of VeLOciTy…
tabwidth 22 hours ago [-]
The intention part is right but the bottleneck is review. AI is really good at turning your clean semantic functions into pragmatic ones without you noticing. You ask for a feature, it slips a side effect into something that was pure, tests still pass. By the time you catch it you've got three more PRs built on top.
peacebeard 22 hours ago [-]
In my experience trying to push the onus of filtering out slop onto reviewers is both ineffective and unfair to the reviewer. When you submit code for review you are saying "I believe to the best of my ability that this code is high quality and adequate but it's best to have another person verify that." If the AI has done things without you noticing, you haven't reviewed its output well enough yet and shouldn't be submitting it to another person yet.
skydhash 21 hours ago [-]
Code review should be a transmission of ideas and helping spotting errors that can slip in due to excessive familiarity with the changes (which are often glaring to anyone other than the author).
If you're not familiar with the patch enough to answer any question about it, you shouldn't submit it for review.
systemsweird 22 hours ago [-]
I think there’s just a lot of people who would love to push lower quality code for a variety of legitimate and illegitimate reasons (time pressure, cost, laziness, skill issues, bad management, etc). AI becomes a perfect scapegoat for lowered code quality.
And you’re completely right, humans are still the ones in control here. It’s entirely possible to use AI without lowering your standards.
lukaslalinsky 12 hours ago [-]
Fully agree. AI or not, it's still the human developer's responsibility to make sure the code is correct and integrates well into the codebase. AI just made it easier to be sloppy about it, but that doesn't mean that's the only way to use these tools.
mrbluecoat 22 hours ago [-]
..but unintentional AI (aka Modern Chaos Monkey) is so much more fun!
benswerd 22 hours ago [-]
LOL fr. I've been talking with some friends about RL on chaos monkeying the codebase to benchmark on feature isolation for measuring good code.
ares623 16 hours ago [-]
What if it's not _my_ codebase?
heliumtera 9 hours ago [-]
Dog, be intentional with your web page.
Holy fuck, Batman.
p1necone 22 hours ago [-]
I haven't really extensively evaluated this, but my instinct is to really aggressively trim any 'instructions' files. I try to keep mine at a mid-double-digit linecount and leave out anything that's not critically important. You should also be skeptical of any instructions that basically boil down to "please follow this guideline that's generally accepted to be best practice" - most current models are probably already aware - stick to things that are unique to your project, or value decisions that aren't universally agreed upon.
benswerd 21 hours ago [-]
Wrestled with this a bit. The struggle with this one in particular is that it's as much for people to read as it is for agents, and the agents are secondary in its case.
I generally agree on this as best practice today, though I think it will become irrelevant in the next 2 generations of models.
w29UiIm2Xz 21 hours ago [-]
Shouldn't all of this be implicit from the codebase? Why do I have to write a file telling it these things?
cjonas 21 hours ago [-]
For any sufficiently large codebase, the agent only ever has a very small percentage of the code loaded into context. Context engineering strategies like "skills" allow the agent to more efficiently discover the key information required to produce consistent code.
cyanydeez 21 hours ago [-]
Mostly because reading the code base fills up the context window; as you aggregate context, you then need to synthesize the basics. These things aren't intelligence; they don't know what's useless and what's useful. They're as accurate as the structure you surround them with.
keeganpoppen 21 hours ago [-]
it’s not that shorter rules are intrinsically better, it’s that longer rules tend to have irrelevant junk in them. ceteris paribus, longer rules are better. it’s just most of the time the longer rules fall under the Blaise Pascal-ian “i regret i didn’t have time to make this shorter”.
slopinthebag 21 hours ago [-]
AI comments are against the rules. Fuck off, bot.
devnotes77 20 hours ago [-]
[dead]
ueda_keisuke 13 hours ago [-]
AI feels less like an autonomous programmer and more like a very capable junior engineer.
The useful part is not just asking it to write code, but giving it context:
how the codebase got here,
what constraints are intentional,
where the sharp edges are,
and what direction we want to take.
With that guidance, it can be excellent.
Without it, it tends to produce changes that make sense in isolation but not in the system.
slopinthebag 13 hours ago [-]
Fuck off bot
thienannguyencv 5 hours ago [-]
Yes, it scored 84% in GPTZero's AI test, but it was still "good enough" to pass HN's anti-AI test.
The downside is that such an approach means your rules have to parse whatever lines of text are in the code (which hasn't been a blocker yet), but the upside is that with slightly different rules I can support Java, .NET, Python, or anything else (and it's very easy to check whether a rule works).
And since the rules are there to prevent AI (or me) from doing stupid shit, they don't have to be super complex or perfect, just usable for me. Furthermore, since it's Go, the executable ends up being a 10 MB tool I can put in CI container images or on my local machine, and I can, for example, add pre-run checks for my app, so that when I try to launch it in a JetBrains IDE it also checks whether my application configuration is actually correct for development.
Currently I have plenty of rules: about disabling code checks, that reusable components should show up in a showcase page in the app, checking specific back-end configuration for specific Git branches, how to use Pinia stores on the front end, that an API abstraction must be used instead of direct Axios or fetch calls, how Celery tasks must be handled, how the code has to be documented (what code needs comments, and in what format), and so on.
Obviously the codebase is more or less slop so I don't have anything publish-worthy atm, but anyone can make something like that in a weekend, to supplement already existing language-specific linters. Tbh ECMAScript is probably not the best choice, but hey, it's just code with some imports like:
Can personally recommend the general approach; maybe someone could even turn it into real software (not just slop for personal use like mine), maybe with a saner scripting language for writing those rules.
The "what is this trying to do?" question has never been harder to answer. It creates scenarios where 99% is correct, but the most important area is subtly broken. I prefer it to be human, where 60-80% will be correct, and the problematic areas begin to smell more and more gradually.
In my experience LLMs, at times, may hide the truth from you in a haystack made of needles.
Harder to catch because nothing is factually wrong. You have to ask: could this output have been produced without actually reading my codebase?
I will often go back, after the fact, and ask for refactors and documentation.
It works. Probably a lot slower than using agents, but I test every step, and it is a lot faster than I would do it, unassisted.
And if you can define "quality" in a way the agent can check against it, it will follow the instructions.
Would that be so bad? "Readability" sure is subjective, so it seems "code quality" is.
Ask 10 programmers what quality a snippet of code is, and you'll get 10 different answers.
All that time it's people arguing with people and wasting time on pure feels. People will get offended and angry and defensive, nothing good ever comes from it.
But when you pick a style and enforce it with a tool like gofmt or black both locally and in the CI, the arguments go away. That's the style all code merged to the codebase must look like and you will deal with it like a professional.
Go proverb: "Gofmt's style is no one's favorite, yet gofmt is everyone's favorite."
In my 15 years of experience I have not worked at a place like this. Those are distractions. Anytime something about style has been brought up, the solution was to just enforce a linter/pre-commit process/blacklist for certain functions, etc. It can easily be automated. When those tools don't exist for particular ecosystems we made our own.
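For ecosystems without an off-the-shelf linter, a homemade check really can be tiny. A rough Python sketch of the idea (the banned-call list and messages are made up, not from any real ruleset):

```python
import re
from pathlib import Path

# Hypothetical house rules: calls we never want merged unreviewed.
BANNED = {
    r"\beval\(": "eval() is banned; parse the input explicitly",
    r"\bprint\(": "use the logging module instead of print()",
}

def check_file(path: Path) -> list[tuple[int, str]]:
    """Return (line_number, message) pairs for every violation in one file."""
    violations = []
    for lineno, line in enumerate(path.read_text().splitlines(), start=1):
        for pattern, message in BANNED.items():
            if re.search(pattern, line):
                violations.append((lineno, message))
    return violations
```

Wire it into a pre-commit hook or a CI step that exits non-zero when any file in the source tree returns violations, and you have your ratchet.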
Holy strawman Batman!
Have you ever given a code review? These are the lowest items on the totem pole of things usually considered critical for a code review.
Here’s an example code review from this week from me to a colleague, paraphrased:
“We should consider using fewer log statements and raising more exceptions in functions like this. This condition shouldn’t happen very often, and a failure of this service is more desirable than it silently chugging along but filling STDOUT with error messages.”
Don't you have "format on save" enabled in your editor? When you open a file, change two lines and save -> boom 500 changed lines because the previous programmer had different formatting rules than you. Whoops.
This is why the low totem pole stuff needs to be enforced automatically so that actual humans can focus on the higher stuff that's about feels and intuition - things that are highly context dependent and can't be codified into rules.
After a certain point in your career you don't care what brace style the new dev used, even if the project has lint rules. You do care if critical errors are ignored and possibly incorrect data is returned. These two situations are in no way equivalent, no need to bikeshed the former when discussing the latter.
…
> This is why the low totem pole stuff needs to be enforced automatically so that actual humans can focus on the higher stuff that's about feels and intuition - things that are highly context dependent and can't be codified into rules.
I’m confused, have you switched your position on this topic over the course of this thread? Maybe I’ve misinterpreted your position entirely. If so, my bad.
Code quality is about how well a piece of code expresses what it intends to do. It’s like quality writing.
It "expresses what it intends to do" perfectly well - for the original author. Nobody else can decipher it without spending significant amounts of memory cycles.
Jack Kerouac is "quality writing" as is the Finnish national epic Kalevala.
But neither are the kind you want to read in a hurry when you need to understand something.
I want the code at work to be boring, standard and easy to understand. I can get excited by fancy expressive tricks on my own time.
I’m generally of the opinion that LLM-supplied code is “prolix,” but works well. I don’t intend to be personally maintaining the code, and plan to have an LLM do that, so I ask the LLM to document the code, with the constraint being, that an LLM will be reading the code.
It tends to write somewhat wordy documentation, but quite human-understandable.
In fact, it does such a good job, that I plan on having an LLM rewrite a lot of my docs (and I have a lot of code documentation. My cloc says that it’s about 50/50, between code and documentation).
Personally, I wish Apple would turn an LLM loose on the header docs for their SwiftUI codebase. It would drastically improve their docs (which are clearly DocC).
[EDITED TO ADD] By the way, it warms my heart to see actual discussion threads on code Quality, on HN.
My team will send me random snippets from OSS libraries and we all go WTF what is that, and my team will also send really clever lines and we'll go wow.
"Good code" is subjective, but good engineers have good taste, and taste is real.
Do you think that no one has tried this over the past 80 years with human programmers, but now with LLMs we can suddenly manage to do it? Why do linters and formal verification and testing exist if we could've just codified code quality in the first place?
To me, this is like telling a carpenter that we can codify what makes a chair comfortable or not.
Related, it seems to me that there are two types of tests, the ones created in a TDD style and can be modified and the ones that come from acceptance criteria and should only be changed very carefully.
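One way to make that split visible in the test suite itself (a sketch; the `acceptance` marker name is an invention, and pytest markers are just one possible mechanism):

```python
import pytest

def parse(text):
    """Split a comma-separated string into its non-empty fields."""
    return [field for field in text.split(",") if field]

def total(lines):
    """Sum (name, qty, unit_price) invoice lines, rounded to cents."""
    return round(sum(qty * price for _, qty, price in lines), 2)

# TDD-style test: scaffolding that is free to change as the design evolves.
def test_parse_handles_empty_input():
    assert parse("") == []

# Acceptance test: encodes an agreed requirement; change it only
# deliberately, together with whoever owns the criterion.
@pytest.mark.acceptance
def test_invoice_total_matches_contract():
    assert total([("widget", 2, 9.99)]) == 19.98
```

A marker like this also lets CI treat the two kinds differently, e.g. require an extra approval when a test tagged `acceptance` is modified.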
I use a test harness, and step through the code, look at debug logs, and abuse the code, as much as possible.
Kind of a pain, but I find unit tests are a bit of a "false hope" kind of thing: https://littlegreenviper.com/testing-harness-vs-unit/
It's not a massively complex AI monstrosity (it's from 2018 after all) or a perfect solution, but it's a good jumping off point.
With a slight sprinkling of LLM this could be improved quite a bit. Not by having the agent write the documentation necessarily, but for checking the parity and flagging it for users.
For example a CI job that checks that relevant documentation has been created / updated when new functionality is added or old one is changed.
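A minimal sketch of such a parity check (the `src/` and `docs/` directory names are assumptions); it only flags the mismatch and leaves the judgment to a human or an LLM reviewer:

```python
import subprocess

def changed_files(base="origin/main"):
    """Paths changed relative to the base branch, via `git diff --name-only`."""
    out = subprocess.run(
        ["git", "diff", "--name-only", base],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.splitlines()

def docs_kept_in_sync(paths, code_dir="src/", docs_dir="docs/"):
    """True unless code changed while the docs stayed untouched."""
    code_touched = any(p.startswith(code_dir) for p in paths)
    docs_touched = any(p.startswith(docs_dir) for p in paths)
    return docs_touched or not code_touched
```

In CI you would fail the job (or ping the reviewer) whenever `docs_kept_in_sync(changed_files())` comes back false.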
It allows you to write simple unit tests directly in your doc strings, by essentially copying the repl output so it doubles as an example.
combined with something like sphinx that is almost exactly what you’re looking for.
doctest kind of sucks for anything where you need to set up state, but if you’re writing functional code it is often a quick and easy way to document and test your code/documentation at the same time.
https://docs.python.org/3/library/doctest.html
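For the unfamiliar, a doctest is just a copied REPL session in the docstring that doubles as a test (the `slugify` function here is invented for illustration):

```python
import re

def slugify(title):
    """Convert a post title to a URL slug.

    >>> slugify("Hello, World!")
    'hello-world'
    >>> slugify("  spaces   everywhere ")
    'spaces-everywhere'
    """
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # re-runs the examples above and reports any drift
```

If the implementation and the documented examples ever disagree, the doctest run fails, which is exactly the docs/code parity check being asked for.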
That system is a unit test that checks that functions are documented in the documentation. Nothing to do with docstrings.
Even without doctest, generating your documentation from docstrings is much easier to keep updated than writing your documentation somewhere else, because it is right there as you are making changes.
I started working on something today I hadn't touched in a couple years. I asked for a summary of code structure, choices I made, why I made them, required inputs and expected outputs. Of course it wasn't perfect, but it was a very fast way to get back up to speed. Faster than picking through my old code to re-familiarize myself for sure.
This is the same discussion that goes round ad nauseum about comments. Nobody needs comments to tell us what the code does. We need comments to explain why choices were made.
https://testing.googleblog.com/2025/10/simplify-your-code-fu...
The key insight of FCIS is that complicated logic with large dependencies leads to a large test suite that runs slowly. The solution is to isolate the complicated logic in the functional core. Test that separately from the simpler, more sequential tests of the imperative shell.
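A minimal sketch of the FCIS split (all names illustrative): the pure core carries the tricky logic and needs no mocks, while the thin shell does the I/O:

```python
# Functional core: pure, deterministic, no I/O. Test with plain asserts.
def apply_discount(subtotal, loyalty_years):
    """Return the discounted total: 5% off per loyalty year, capped at 25%."""
    rate = min(0.05 * loyalty_years, 0.25)
    return round(subtotal * (1 - rate), 2)

# Imperative shell: sequencing and side effects, kept as thin as possible.
def checkout(order_id, load_order, charge_card):
    order = load_order(order_id)        # I/O: read from a DB or API
    total = apply_discount(order["subtotal"], order["loyalty_years"])
    charge_card(order["card"], total)   # I/O: call the payment provider
    return total
```

The slow, mocked tests are then confined to `checkout`, while all the combinatorial edge cases live in fast tests of `apply_discount`.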
For instance, two things I'm currently working on: - A reasonably complicated indie game project I've been doing solo for four years. - A basic web API exposing data from a legacy database for work.
I can see how the API could be developed mostly by agents - it's a pretty cookie cutter affair and my main value in the equation is just my knowledge of the legacy database in question.
But for the game... man, there's a lot of stuff in there that's very particular when it comes to performance and the logic flow. An example: entities interacting with each other. You have to worry about stuff like the ordering of events within a frame, what assumptions each entity can make about the other's state, when and how they talk to each other given there's job based multi-threading, and a lot of performance constraints to boot (thousands of active entities at once). And that's just a small example from a much bigger iceberg.
I'm pretty confident that if I leaned into using agents on the game I'd spend more time re-explaining things to them than I do just writing the code myself.
I was shocked recently when it helped me diagnose a musl compile issue, fork a sys package, and rebuild large parts of it in 2 hours. It would've taken me at least 2 weeks to do without AI.
Don't want to reveal the specific task, but it was a far out of training data problem and it was able to help me take what would've normally taken 2 weeks down to 2 hours.
Since then I've been going pretty hard at maximizing my agent usage, and tend to have a few going at most times.
As does the reductionist idea that human thinking is something crude in comparison.
Got it, good idea.
Good content tho!
It's basically like hiring a new developer for one task and letting them go right after. They don't know your conventions, your history, or why things are the way they are. The only thing they have is what they can see in the code. Your code quality is basically the prompt now.
You can ask the agent to make 10 different solutions in the time it takes you to make 0.5.
Then you review them based on whatever criteria you feel is right and either throw them all away and do it yourself (maybe with inspiration from the other solutions) or pick one to progress further.
This is my take on how to not write slop.
In a blameless postmortem style process, you would look at not just the mistake itself but the factors influencing the mistake and how to mitigate them. E.g., doctor was tired AND the hospital demanded long hours AND the industry has normalized this.
So yes, the programmers need to hold the line AND ALSO the velocity of the tool makes it easy to get tired AND its confidence and often-good results promote laziness (or maybe folks just don't know better) AND it can thrash your context and bounce you around the code base, making it hard to remember the subtleties AND on and on.
Anyway, strong agree on “dude, review better” as a key part of the answer. Also work on all this other stuff and understand the cost of VeLOciTy…
If you're not familiar with the patch enough to answer any question about it, you shouldn't submit it for review.
And you’re completely right, humans are still the ones in control here. It’s entirely possible to use AI without lowering your standards.