Algorithm and Intent
When a decision has no decision-maker, the question of responsibility disappears
In 2014, Amazon began building a hiring tool. The system was designed to review résumés and score applicants on a scale of one to five, the way the company rated products. Engineers trained it on ten years of hiring data — patterns extracted from the résumés of people the company had previously selected. The model learned to identify what a successful Amazon applicant looked like. By 2015, the team noticed the model had taught itself to penalize résumés containing the word “women’s,” as in “women’s chess club” or “women’s studies.” It downgraded graduates of two all-women’s colleges. The engineers attempted to neutralize the variable. The model found proxies. They corrected the proxies. It found others. Amazon disbanded the project by 2017; Reuters reported the story the following year. The tool was never used as the sole determinant in hiring. But the pattern it revealed was not a malfunction. The model had done what it was told: identify what a successful Amazon hire looks like. What a successful Amazon hire looked like, based on ten years of data, was a man.
The instinct is to treat this as a cautionary tale about flawed data. Fix the data, fix the outcome. But the data was not flawed. The data was accurate. Amazon had, in fact, hired overwhelmingly male candidates for technical positions over the previous decade. The historical record was correct. The model ingested it faithfully. The bias was not introduced by the algorithm. It was already in the institution, encoded in a decade of human decisions, and the algorithm made those decisions operational at scale. The difference between a biased hiring manager and a biased algorithm is not the bias. It is the speed, the consistency, and the invisibility.
A human interviewer who systematically penalized candidates from women’s colleges would eventually be noticed. A model that does it is noticed only when someone audits the model — and most models are not audited.
The inheritance problem
A credit score is a three-digit number, typically ranging from 300 to 850. It is built from data compiled by three private credit bureaus — Equifax, Experian, and TransUnion — and calculated with proprietary scoring models that are not publicly disclosed in full. FICO, the most widely used of those models, weights five categories: payment history, amounts owed, length of credit history, new credit, and credit mix. These categories sound neutral. They are not. Each one encodes assumptions about what kind of financial behavior constitutes reliability, and those assumptions were calibrated on the behavior of populations that had access to credit in the first place.
Length of credit history, for instance, rewards duration. A person whose parents added them to a credit card at eighteen has a longer history than a first-generation immigrant who opened their first account at thirty. The immigrant’s financial behavior may be identical in every respect — same payment record, same debt ratios, same income — and they will score lower because the metric measures time, and time is not distributed equally. Amounts owed, weighted at 30 percent, penalizes those who use a higher proportion of available credit — but available credit is itself determined by previous credit scores, creating a recursion in which past disadvantage compounds into present penalty. A person with a $2,000 limit who carries $1,000 in debt is measured differently from a person with a $20,000 limit who carries $1,000 in debt, even though the dollar amount is identical. The metric does not measure financial responsibility. It measures financial headroom. And headroom is a proxy for prior advantage.
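The headroom point can be made concrete with a toy calculation. This is not FICO's actual formula, which is proprietary; the 30 percent threshold and the penalty scale below are invented for illustration:

```python
# Toy sketch of utilization as "headroom"; not FICO's real scoring logic.
# The 30% threshold and the penalty scale are invented for illustration.

def utilization_penalty(balance: float, credit_limit: float) -> float:
    """Penalty grows with the share of available credit being used."""
    utilization = balance / credit_limit
    return max(0.0, (utilization - 0.30) * 100)

# Identical $1,000 of debt, different limits.
print(utilization_penalty(1_000, 2_000))   # 50% utilization -> 20.0 point penalty
print(utilization_penalty(1_000, 20_000))  # 5% utilization  -> 0.0, no penalty
```

The same dollar figure lands differently because the metric is a ratio, and the denominator is the prior advantage.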
Cathy O’Neil documented this architecture in Weapons of Math Destruction, tracing how models trained on historical data do not merely reflect historical patterns — they operationalize them. A model that scores creditworthiness based on past credit access is not predicting future behavior in a vacuum. It is predicting future behavior as a function of prior access, which is itself shaped by redlining, by discriminatory lending, by decades of policy that determined who received credit and on what terms. The algorithm inherits the history. It does not interrogate it. O’Neil called these models “opinions embedded in mathematics.” The phrase is precise. The score presents itself as a measurement — as neutral as a thermometer reading — but what it measures is a combination of behavior and circumstance, and the circumstance carries the weight of decisions made long before the person being scored was born.
(Between the discriminatory lending practices of the 1960s and a credit score generated in 2025, there are six decades, multiple policy reforms, and several generations of algorithmic iteration. The bank that redlined a neighborhood in 1965 no longer exists. The FICO model that penalizes the grandchild of someone denied a mortgage in that neighborhood is a different instrument built by different people. The effect persists. The responsibility has evaporated somewhere in the iterations. I do not yet know how to think clearly about accountability across that kind of distance — whether the absence of a responsible party is a genuine philosophical problem or merely a convenient one.)
What the score touches
The credit score was designed for lending. It is no longer used only for lending. Landlords use credit scores to screen tenants. Employers in most U.S. states can check credit history as part of hiring decisions. Auto insurance companies in forty-seven states use credit-based insurance scores to set premiums. Utility companies use them to determine deposit requirements. The three-digit number that was built to assess the likelihood that a borrower would repay a loan now determines whether a person can rent an apartment, get a job, insure a car, or turn on their electricity.
Each of these expansions follows its own logic. Landlords use credit scores because they predict, statistically, the likelihood of rent default. Insurers use them because credit history correlates, actuarially, with claims frequency. Employers use them because — the reasoning gets thinner here — financial instability is assumed to indicate unreliability or vulnerability to fraud. Each user of the score can point to a statistical correlation that justifies the practice. The correlations are real. The question is what a correlation means when the variable being measured is itself shaped by the conditions being predicted.
A person denied a job because of a low credit score may have a low credit score because they were previously denied employment. A person charged higher insurance premiums because of poor credit may have poor credit because a medical emergency forced them into debt, and the medical debt exists because their previous insurer denied a claim. A person denied housing because of a thin credit file may have a thin file because previous landlords required the credit history they could not yet build. Each system treats the score as an input. None of them treat it as an output of the other systems — which is what it also is.
The circularity is not a bug. It is the architecture.
Consider the texture of the interaction. A person applies for an apartment. They have toured the unit, measured the bedroom doorframe with a tape measure to confirm a bed frame will fit through it, checked the water pressure in the kitchen sink. The property management company runs a credit check through an automated screening service. The applicant never meets the person making the decision because there is no person making the decision. The screening service returns a recommendation — accept, deny, or require additional deposit — based on a threshold the property manager set months ago and has not revisited. The denial arrives by email, a two-paragraph template in the system’s default sans-serif. At the bottom, in smaller type than the denial itself, a notice required by the Fair Credit Reporting Act informs the applicant of their right to request a copy of the report that was used. The right is real. The process for exercising it is bureaucratic enough that most people do not.
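In most screening products, the logic behind that email is not much more elaborate than the sketch below; the cutoffs and categories are hypothetical, set once and rarely revisited:

```python
# Minimal sketch of an automated tenant-screening decision.
# Score cutoffs are hypothetical; real services use proprietary rules.

def screening_recommendation(credit_score: int,
                             accept_above: int = 650,
                             deposit_above: int = 600) -> str:
    """Map a credit score to accept / conditional / deny via fixed thresholds."""
    if credit_score >= accept_above:
        return "accept"
    if credit_score >= deposit_above:
        return "require additional deposit"
    return "deny"

# The applicant sees only the templated email, never the thresholds.
print(screening_recommendation(612))  # "require additional deposit"
```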
The question of neutrality
The architecture of these systems distributes a specific kind of power. Not the power to decide — no individual decides — but the power to determine the framework within which outcomes become inevitable. The person who designed FICO’s weighting system made a choice about what matters: that payment history should count for 35 percent and credit mix for 10 percent. This is a judgment, not a discovery. A different weighting would produce different scores, different populations flagged as risky, different distributions of access and denial. The choice was made. It was not made democratically, or publicly, or with input from the populations it would sort. It was made by a private company, in the course of building a product, and it became infrastructure.
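To see how much the weighting itself decides, consider a sketch: the same applicant profile scored under FICO's published category weights and under an arbitrary alternative that privileges payment behavior. The component scores are invented; only the first set of weights corresponds to FICO's stated percentages.

```python
# Same applicant, two weightings. Component scores (0-100) are made up;
# the first weight set mirrors FICO's published category percentages,
# the second is an arbitrary alternative for comparison.

applicant = {
    "payment_history": 95,    # never missed a payment
    "amounts_owed": 40,       # high utilization on a small limit
    "length_of_history": 30,  # first account opened at thirty
    "new_credit": 80,
    "credit_mix": 50,
}

fico_weights = {"payment_history": 0.35, "amounts_owed": 0.30,
                "length_of_history": 0.15, "new_credit": 0.10, "credit_mix": 0.10}

alt_weights = {"payment_history": 0.70, "amounts_owed": 0.10,
               "length_of_history": 0.05, "new_credit": 0.10, "credit_mix": 0.05}

def composite(profile, weights):
    return sum(profile[k] * w for k, w in weights.items())

print(composite(applicant, fico_weights))  # 62.75 -- flagged as risky under a 70 cutoff
print(composite(applicant, alt_weights))   # 82.5  -- comfortably above the same cutoff
```

Nothing about the applicant changes between the two lines; only the judgment about what matters does.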
This same structure replicates in every domain where algorithms make or shape decisions. Predictive policing systems like PredPol — now rebranded as Geolitica, in a nominal distancing from its own reputation — deploy patrol officers to neighborhoods flagged as high-risk by models trained on historical arrest data. Neighborhoods with more prior arrests get more patrols, more patrols produce more arrests, more arrests confirm the model’s prediction. The feedback loop is not hidden. Multiple studies have demonstrated that the predictions track policing patterns more closely than they track crime patterns. The algorithm is not predicting where crime will occur. It is predicting where police will be sent, which is a function of where police have already been, which is a function of enforcement priorities that predate the algorithm by decades. The model inherited the pattern. It did not create it. But it made the pattern faster, more consistent, and harder to see because the output carries the authority of mathematics rather than the visible discretion of a precinct commander.
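The loop is simple enough to reproduce in a few lines. The sketch below is a toy simulation, not Geolitica's model: two neighborhoods with identical underlying crime rates, a model that flags whichever one had more arrests last year, and recorded arrests that scale with police presence rather than with crime.

```python
# Toy feedback-loop simulation; not Geolitica's actual model.
# Both neighborhoods have the SAME underlying crime rate. Recorded arrests
# track police presence, and presence follows last year's arrests.

TRUE_CRIME_RATE = 0.5            # identical in both neighborhoods
BASE_PATROLS, SURGE = 30, 40     # every area gets a base; the flagged area gets the surge
arrests = [120.0, 80.0]          # inherited historical arrest counts

for year in range(1, 6):
    flagged = 0 if arrests[0] >= arrests[1] else 1       # "high-risk" = more past arrests
    patrols = [BASE_PATROLS, BASE_PATROLS]
    patrols[flagged] += SURGE
    arrests = [p * TRUE_CRIME_RATE for p in patrols]     # arrests follow presence, not crime
    share = arrests[0] / sum(arrests)
    print(f"year {year}: neighborhood A accounts for {share:.0%} of recorded arrests")
```

The record settles at 70 percent for neighborhood A even though the underlying rate never differed, and any later model trained on that record inherits the figure as fact.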
Optimization is not neutral. The act of optimizing requires a target, and the choice of target determines what is maximized, what is minimized, and what is ignored. A hiring algorithm optimized for “candidates who resemble past successful hires” will reproduce the demographics of past hiring. A credit model optimized for “predicting default among populations with established credit histories” will penalize populations without established histories. A policing model optimized for “deploying resources to areas with high arrest rates” will intensify policing in already-policed areas. Each optimization is technically correct. Each achieves its stated objective. And each encodes a set of values into a system that presents itself as value-free — a system that claims to measure reality while actively constructing it, sorting populations according to patterns it inherited and then treating those patterns as though they were discovered rather than chosen. The further the algorithm runs, the more the pattern solidifies, because each output becomes the next cycle’s input, and the distance between the original human decision and its algorithmic descendant grows until the decision’s fingerprints are no longer visible on the surface of the data. That is where the accountability evaporates: not in any single moment of error or malice, but in the accumulation of iterations, each one correct, each one justified, each one moving the pattern one step further from the point where someone could have chosen differently.
What replaces discretion
Algorithmic systems were adopted, in most domains, for a defensible reason: human judgment is inconsistent. Two loan officers reviewing the same application will reach different conclusions. A hiring manager’s assessment shifts depending on whether the interview is before or after lunch. An algorithm removes that variation. It applies the same rules to every applicant. It produces the same output for the same inputs. Consistency, at scale, looks like fairness.
But what the algorithm actually does is replace individual discretion with structural discretion — the discretion embedded in the training data, the weighting of variables, the choice of optimization target. Individual discretion is visible, local, and contestable. Structural discretion is none of these things. It does not reside in any person. It resides in the system. And the system, when questioned, can produce a mathematically precise explanation for every output, which looks like transparency but functions as a wall.
The question is what happens to accountability when a consequential decision has no identifiable decision-maker. When a human denies your loan application, there is a person. You can appeal. You can argue. The person can be wrong, and their wrongness can be named. When an algorithm denies it, there is a score. The score can be explained. The explanation will be technically accurate. And the process of challenging it requires you to argue not with a judgment but with a formula — which requires expertise most people do not possess, resources most people cannot afford, and access to proprietary systems most people are not granted.
The burden of proof shifts from the decision-maker to the person affected by the decision.
The system does not need to be right about everyone. It needs to be right often enough that the exceptions — the people wrongly scored, wrongly denied, wrongly sorted — are too few to organize, too expensive to litigate, too invisible to matter to the metrics by which the system evaluates itself. Whether it works for the people inside it is a different question, measured by a different instrument, and that instrument does not yet exist.
-Aimé
Aimé Halden writes Uninsurable, a newsletter about the systems that shape who is protected and who is not. Subscribe for weekly analysis.
