Discussion of AI and the Future of Work
Artificial intelligence has captured the public imagination, promising to revolutionize everything from healthcare and finance to education and entertainment. Yet behind the hype lurk countless examples of overpromised, underdelivered, and occasionally harmful applications. In 2019, Arvind Narayanan—a computer science professor at Princeton University specializing in privacy and security—found himself at the heart of the debate over “AI snake oil” when he encountered recruitment tools that claimed to assess job candidates’ personalities from mere 30-second video clips. Narayanan’s skepticism about these dubious claims led to a viral talk on “How to Recognize AI Snake Oil,” and over the next five years, he and graduate student Sayash Kapoor conducted in-depth research into the foundational issues, failures, and ethical dilemmas surrounding modern AI. Their efforts culminated in a book that not only maps the landscape of AI’s true capabilities but also offers a practical framework for separating genuine technological breakthroughs from misleading or harmful applications.
In this article, we’ll explore the origins of the “AI Snake Oil” concept, unpack the distinctions between various forms of AI (especially predictive and generative models) and examine why some AI deployments are fraught with ethical pitfalls. We’ll introduce a two-dimensional framework for evaluating AI applications based on functionality and harm potential, and lay out a call to action for policymakers, developers, and end users alike. Drawing on insights from a fireside chat with economist Daron Acemoglu and a lively audience Q&A, we’ll highlight the pitfalls of unchecked AI development, the enduring promise of “background AI,” and the necessity of a tempered, “normal technology” perspective that envisions AI as a decades-long transformation rather than a magical panacea or a passing fad.
The Origin of “AI Snake Oil” and the Book’s Genesis
It all began in 2019, when Arvind Narayanan first noticed a proliferation of recruitment software companies pitching AI-driven personality assessments. These vendors claimed their algorithms could analyze a candidate’s body language, speech patterns, and even micro-expressions from a brief, thirty-second video to derive multi-dimensional personality scores—labels such as “change agent,” “team player,” or “leadership potential,” each purportedly gauged with uncanny precision. Recruitment managers were told that these tools could sift through hundreds or thousands of applicants and surface hidden gems merely by analyzing their nonverbal cues. Narayanan’s immediate reaction was astonishment and skepticism: how could a few seconds of video truly reveal deep-seated behavioral tendencies? Was there any empirical basis linking these “personality scores” to actual job performance?
When investigative journalists later put these claims to the test, they exposed glaring flaws. Small tweaks to video backgrounds—like adding or removing a bookshelf—caused dramatic swings in the purported personality metrics. Candidates with identical credentials could receive wildly different “scores” simply because of a minor change in lighting or camera angle. Perhaps most troublingly, there was no transparent, peer-reviewed evidence connecting these AI-generated personality evaluations to workplace success, longevity, or cultural fit. In essence, the technology was a sophisticated marketing pitch dressed up as scientific innovation, yet it lacked rigorous validation.
Narayanan took to the stage at MIT with a talk titled “How to Recognize AI Snake Oil,” which quickly went viral. He recounted his encounters with vendors promising everything from “neuro-linguistic programming” analytics to “facial micro-expression” inference and pointed out how easily these systems could be gamed or manipulated. His message was simple but powerful: skepticism is healthy, and practitioners must demand empirical validation rather than blindly trusting glossy marketing materials. Invitations poured in for Narayanan and Kapoor to dig deeper, and over the next five years, they examined a broad range of AI applications, evaluating their technical underpinnings, potential failure modes, and ethical implications. Their research yielded a book that not only chronicles the evolution of these concerns but also serves as a clarion call for careful, evidence-based deployment of AI.
AI Is Not One Monolithic Technology
A central theme in Narayanan and Kapoor’s work is the imperative to distinguish between assorted flavors of AI. The term “artificial intelligence” has become a catchall for any computational system that exhibits some semblance of “learning” or “intelligence,” but in reality, there are diverse subfields with disparate mechanics, use cases, and societal ramifications. Two broad categories stand out:
- Predictive AI relies on historical, labeled data to forecast future outcomes about individuals. Examples include credit scoring (predicting loan repayment risk), recidivism risk scoring in criminal justice, and recruitment algorithms that assess an applicant’s chances of succeeding in a given role. These systems typically hinge on supervised machine learning models (logistic regression, random forests, or gradient-boosted trees) that output probability scores (e.g., the likelihood a defendant will skip bail).
- Generative AI leverages large-scale unsupervised or self-supervised learning—often in the form of transformer-based architectures like GPT—to produce novel content. Generative models can write essays, generate images, compose music, or even construct code snippets. Rather than merely classifying or ranking existing data, generative AI synthesizes new material that did not exist in the training set.
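To make the predictive/generative distinction concrete, here is a minimal toy sketch in Python (the synthetic data, the unnamed features, and the tiny bigram table standing in for a trained language model are all assumptions of this illustration, not anything from the book): the predictive model emits a probability score about an individual, while the generative model samples new text from learned statistics.

```python
# Minimal toy contrast between predictive and generative AI.
# All data here is synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# --- Predictive AI: supervised model -> probability score about an individual ---
X = rng.normal(size=(500, 3))                  # stand-in for tabular applicant features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=500) > 0).astype(int)
clf = LogisticRegression().fit(X, y)
applicant = rng.normal(size=(1, 3))
print("Predicted probability of a 'positive' outcome:", clf.predict_proba(applicant)[0, 1])

# --- Generative AI (toy): sample new text from learned statistics ---
corpus = "the model writes text the model samples words the model remixes patterns".split()
bigrams = {}
for a, b in zip(corpus, corpus[1:]):
    bigrams.setdefault(a, []).append(b)        # a bigram table stands in for a trained LM

word, output = "the", ["the"]
for _ in range(8):
    word = rng.choice(bigrams.get(word, corpus))
    output.append(word)
print("Generated text:", " ".join(output))
```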
These categories differ not only in their technical design but also in their failure modes. Predictive AI can suffer from biased training data, low overall accuracy, and opaque decision boundaries—failures that can lead to unfair or unethical outcomes when used for high-stakes decisions (e.g., courtroom bail determinations). Generative AI, meanwhile, can hallucinate entirely fabricated facts, produce low-quality outputs, or produce content that amplifies toxic stereotypes, because it is essentially remixing statistical patterns without rigorous factual grounding.
Understanding these distinctions is not merely an academic exercise. When a CEO or HR director hears “AI,” they may be equally excited by a predictive hiring screening tool and a generative text assistant; yet the implications of deploying each in a production environment are vastly different. To navigate this landscape responsibly, stakeholders must ask: What type of AI is this? What are its known limitations? How was it validated? What metrics truly matter—accuracy, fairness, safety, user satisfaction—and how do we measure them?

Critique of Predictive AI
Predictive AI systems have proliferated in domains where historical data abounds, but decision-making processes are complex and resource intensive. Credit scoring, loan underwriting, and criminal-justice risk assessment are prime examples. On paper, these tools promise to streamline decisions, reduce human bias, and increase efficiency. In practice, however, they often fall short of their lofty goals.
Low Accuracy and Questionable Utility
Many criminal-justice risk-scoring tools report area under the ROC curve (AUC) values in the 0.65–0.70 range, only modestly better than chance (0.50). In a pretrial setting, where computational models predict the likelihood of a defendant failing to appear in court or committing a new crime, an AUC around 0.70 means that when the model compares a randomly chosen defendant who will reoffend or skip court with one who will not, it ranks them correctly only about 70 percent of the time; roughly one such comparison in three goes the wrong way. When judges rely on these tools, a false positive (incorrectly predicting an elevated risk) can lead to unjustified pretrial detention; a false negative (predicting low risk when the defendant is high risk) can compromise public safety. Neither outcome is acceptable in an ethical justice system.
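To see what an AUC in that range looks like, the simulation below (entirely synthetic scores and labels, with an assumed 30 percent base rate; not real pretrial data) produces an AUC near 0.68 and checks how often the model ranks a randomly chosen positive case above a negative one, which is precisely what AUC measures.

```python
# Synthetic illustration of what an AUC of roughly 0.68 means.
# Scores and labels are simulated; this is not real criminal-justice data.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 10_000
positive = rng.random(n) < 0.30                      # assumed 30% base rate
# Risk scores that only weakly separate the two groups -> AUC lands near 0.68.
scores = rng.normal(loc=np.where(positive, 0.65, 0.0), scale=1.0)

print("AUC:", round(roc_auc_score(positive, scores), 3))

# AUC = probability that a randomly chosen positive case outranks a negative one.
pos, neg = scores[positive], scores[~positive]
pairs = rng.integers(0, [len(pos), len(neg)], size=(50_000, 2))
correctly_ranked = np.mean(pos[pairs[:, 0]] > neg[pairs[:, 1]])
print("Share of positive/negative pairs ranked correctly:", round(correctly_ranked, 3))
# At an AUC of about 0.68, roughly one comparison in three goes the wrong way.
```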
The Bias Dilemma
Proponents of algorithmic “debiasing” argue that predictive AI could be made fairer by adjusting thresholds or reweighting outcomes to equalize error rates across demographic groups. In practice, such fixes, applied to tools like the widely used COMPAS risk-assessment system, face both legal and technical hurdles. If a model’s error rates differ by race, one remedy is to set different decision thresholds for different racial groups (e.g., automatically granting bail to individuals from a group with historically higher false-positive rates). However, adjusting thresholds explicitly based on race can run afoul of civil rights laws, which prohibit decision-making based on protected attributes. Furthermore, once you tweak thresholds to equalize false-positive rates, you may inadvertently widen false-negative disparities: improving one fairness metric often worsens another. There is no “one size fits all” fix, and each adjustment carries its own tradeoffs.
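The mechanics of that tradeoff show up even in a few lines of simulation. The sketch below uses two invented groups whose score distributions differ in both shift and quality (an assumption chosen purely to illustrate the mechanism, not to model any real jurisdiction): raising one group’s threshold until its false-positive rate matches the other’s pushes its false-negative rate further out of line.

```python
# Toy demonstration of the debiasing tradeoff on synthetic data: equalizing
# false-positive rates across two groups by moving one group's threshold
# widens the gap in false-negative rates. All distributions are invented.
import numpy as np

rng = np.random.default_rng(7)

def simulate_group(n, base_rate, neg_mean, pos_mean):
    """Synthetic group: binary outcomes plus risk scores of varying quality."""
    y = rng.random(n) < base_rate
    scores = rng.normal(loc=np.where(y, pos_mean, neg_mean), scale=1.0)
    return y, scores

def error_rates(y, scores, threshold):
    pred = scores > threshold
    fpr = np.mean(pred[~y])     # flagged as high risk despite a negative outcome
    fnr = np.mean(~pred[y])     # missed despite a positive outcome
    return round(float(fpr), 3), round(float(fnr), 3)

# Group B's scores are both inflated and less informative than group A's.
y_a, s_a = simulate_group(50_000, 0.3, neg_mean=0.0, pos_mean=1.5)
y_b, s_b = simulate_group(50_000, 0.3, neg_mean=0.3, pos_mean=1.0)

# One shared threshold: group B suffers more false positives and false negatives.
print("shared threshold 0.75:  A", error_rates(y_a, s_a, 0.75), "B", error_rates(y_b, s_b, 0.75))

# Raise B's threshold until its false-positive rate matches A's...
print("A at 0.75, B at 1.05:   A", error_rates(y_a, s_a, 0.75), "B", error_rates(y_b, s_b, 1.05))
# ...and B's false-negative rate climbs, widening that disparity instead.
```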
Ethical and Legal Concerns
Even if a predictive model’s architects could ensure perfectly balanced error rates across all subgroups, low overall accuracy still renders these tools ethically dubious. It is one thing to automate routine tasks (e.g., sorting emails or recommending products), but quite another to determine a person’s freedom or livelihood. In domains like lending or hiring, a mistaken prediction can cost someone their job prospects or deny them access to credit. Algorithmic opacity compounds the problem: if an applicant is denied a loan, they are rarely given a transparent explanation. They may never know which features or past behaviors drove the adverse decision. The lack of recourse heightens perceptions of unfairness and undermines public trust.
Generative AI: Promise and Peril
Generative AI has dominated headlines in recent years, fueled by the rapid ascent of models like OpenAI’s GPT series. These systems can draft blog posts, write functional code, and even craft images from text prompts with impressive flair. For knowledge workers, the promise is tantalizing: imagine a teacher generating personalized lesson plans in seconds, a marketer churning out social media campaigns in hours instead of days, or a designer sketching visual concepts on the fly. Generative AI can augment creativity, automate routine writing tasks, and turbocharge ideation.
The Upside: Productivity and Creativity
- Educational Applications: Educators can deploy AI to create interactive learning modules tailored to individual students. For instance, a math teacher might ask an AI to produce multiple examples of fraction-word problems at varying difficulty levels. Students can receive instant feedback, freeing teachers to focus on strategy and mentorship rather than rote drill.
- Content Creation: Journalists or bloggers can leverage generative AI to draft first-pass articles, perform background research, and generate outlines. A well-tuned prompt might produce a coherent 500-word article on a given topic, which the human author can then refine. This reduces the time spent on initial drafts and allows creatives to focus on narrative flow, voice, and fact-checking.
- Software Prototyping: Developers can ask AI to generate code snippets, boilerplate functions, or even entire microservices in popular languages like Python or JavaScript. While not production-ready code, these drafts can accelerate development cycles by providing a starting point that human engineers can review and enhance.
However, generative AI’s allure comes with significant caveats.
The Downside: Hallucinations, Misinformation, and Labor Exploitation
- Hallucinations and Low-Quality Output: Generative models, by their very nature, “hallucinate,” producing plausible-sounding sentences that are factually incorrect or internally inconsistent. Observers of e-commerce marketplaces discovered that unscrupulous sellers were using AI to produce “foraging guides” and “plant identification books” that not only contained factual errors but sometimes recommended poisonous plants as edible. Consumers relying on those guides could suffer real harm.
- Deepfakes and Nonconsensual Imagery: Advances in image-generation models have enabled the creation of hyper-realistic “deepfake” images and videos. Worse, some apps have emerged that allow users to create nude or sexualized images of individuals without their consent. These tools disproportionately target women, celebrities, and marginalized communities. The psychological and reputational damage inflicted by nonconsensual AI-generated imagery is staggering. Legal frameworks struggle to keep pace with the rapid evolution of such tools.
- Precarious Labor behind the Scenes: Behind the shimmering interfaces of generative AI lie vast networks of human annotators, often based in developing countries, hired to label data, filter toxic content, or curate training corpora. Many of these workers operate under precarious conditions: low pay, long hours, and exposure to disturbing or traumatic content. While end users bask in the seamless experience of “just type a prompt,” the hidden labor that powers these models raises serious questions about exploitation and human welfare.
A Two-Dimensional Framework for Evaluating AI Applications
Narayanan and Kapoor propose a simple yet powerful framework for assessing any AI deployment. Imagine a two-dimensional grid:
- X-axis (Functionality): How well does the AI actually perform its intended function? Does it work as claimed, partially, or not at all? The scale runs from “Snake Oil” (fails catastrophically or delivers no real value) to “Reliable Technology” (delivers on promises consistently).
- Y-axis (Harm Potential): How harmful could this AI be? Harm may arise because technology malfunctions (e.g., false positives in a high-stakes context) or because it succeeds too well and is used for unethical ends (e.g., mass surveillance via facial recognition). The vertical axis runs from “Low Harm Potential” (unlikely to cause significant adverse impacts) to “High Harm Potential” (poses grave ethical, social, or legal risks).
This yields four broad categories:
Bottom-Right: Positive AI
- Performs reliably and has low harm potential. Examples:
- Autocomplete in email or search bars.
- Spell checkers in word processors.
- Basic speech-to-text transcription for accessibility.
These systems have become so ubiquitous that users barely notice them. They “fade into the background” and deliver consistent value.
Top-Left: Overhyped Snake Oil
- Failing to deliver functional value yet marketed as revolutionary. Examples:
- Video-based personality assessments for hiring (no empirical backing and easily manipulated).
- Predictive policing tools that claim to forecast crime hotspots with high precision (often inaccurate and prone to bias).
These applications generate hype but ultimately waste resources, mislead stakeholders, and can cause injustice.
Top-Right: Effective but Dangerous Technology
- Highly functional yet carries significant ethical or social harm. Examples:
- Facial recognition deployed for mass surveillance or “minority report”-style predictions.
- Deepfake generators that produce highly realistic but false videos, enabling political misinformation or personal defamation.
Although technically impressive, these tools can be weaponized to infringe on privacy, manipulate public opinion, or commit fraud.
Far Bottom-Right: Ideal Utilitarian AI
- Rare and precious. These applications sit at the far corner of the reliable, low-harm region: they perform dependably and deliver a profound, positive societal impact. Examples might include:
- AI-assisted medical diagnostics where algorithms detect early signs of cancer with high accuracy, under transparent validation and regulatory oversight.
- Climate models that help forecast extreme weather events and guide disaster preparedness.
Reaching this corner of the grid is exceedingly difficult but worth aspiring toward.
Narayanan and Kapoor emphasize that stakeholders should aim to promote bottom-right technologies, actively stamp out or “call out” top-left snake oil, and carefully regulate or repurpose top-right tools to minimize harm. The ideal far-corner successes, while scarce, should be recognized, audited, and scaled responsibly.
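One way to make the grid operational, offered strictly as an illustrative sketch (the example applications, the 0-to-1 ratings, the 0.5 cutoffs, and the label for the remaining low-functionality, low-harm cell are all invented here, not taken from the book), is to treat each application as a pair of ratings and map it onto the grid:

```python
# Illustrative mapping from (functionality, harm potential) ratings to grid regions.
# Ratings, cutoffs, and example applications are invented for this sketch.

def grid_region(functionality: float, harm: float) -> str:
    """Place an AI application on the two-dimensional grid (both axes on 0-1 scales)."""
    works = functionality >= 0.5    # right half of the grid
    risky = harm >= 0.5             # top half of the grid
    if works and not risky:
        return "Positive AI -> promote"
    if not works and risky:
        return "Overhyped snake oil -> call out"
    if works and risky:
        return "Effective but dangerous -> regulate"
    return "Fails, but low stakes -> mostly wasted effort"

examples = {
    "email autocomplete":              (0.9, 0.1),
    "video personality assessment":    (0.1, 0.8),
    "facial recognition surveillance": (0.8, 0.9),
    "validated cancer-screening aid":  (0.8, 0.2),
}

for name, (functionality, harm) in examples.items():
    print(f"{name:34s} {grid_region(functionality, harm)}")
```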

The “AI Snake Oil” Call to Action
Having laid out this framework, the authors stress a clear set of imperatives for AI practitioners, regulators, and consumers:
Identify and Reject Overhyped or Harmful Applications
- If an AI claim sounds too good to be true (say, “a 30-second video can predict your leadership potential”), it likely is. Professionals should demand transparent validation studies, open-source code, and independent audits before deploying such tools, especially in high-stakes domains like hiring, lending, or criminal justice.
Institute Robust Guardrails for Promising Technologies
- For AI that does function (say, facial recognition or speech-to-text), robust regulations and governance mechanisms are required. This includes transparency measures (e.g., disclosing when AI is used), accountability mechanisms (e.g., human oversight or appeals processes), and civil liberties protections (e.g., limiting government surveillance powers).
Address Structural Inequities
- AI rarely exists in a vacuum. Its deployment often amplifies corporate power, concentrates wealth, and exacerbates societal disparities. For instance, venture capital flows into flashy generative AI startups, while underfunding workplace-augmenting applications that could benefit everyday workers. Policymakers must consider tax incentives, public funding, or regulatory frameworks that steer investment toward socially beneficial AI, such as tools that enhance worker skills or improve public services.
Promote AI Literacy and Public Engagement
- End users, from hiring managers to consumers, must develop a basic understanding of AI’s capabilities and limitations. Schools, universities, and industry organizations should integrate AI literacy into curricula, teaching not just how to use AI tools but also how to critically evaluate AI claims, recognize bias, and interpret error metrics.
Foster a Culture of Continuous Stewardship
- AI is not a “set it and forget it” technology. It demands ongoing monitoring, iterative feedback loops, and regular updates. Much like self-driving cars accumulate data and refine their models over time, generative models and predictive algorithms require continuous calibration, retraining, and auditing to maintain performance and fairness.
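As a minimal sketch of what that stewardship can look like in practice (the window size, accuracy floor, and the retraining hook named here are assumptions for illustration, not a prescribed standard), a deployed model’s recent performance can be tracked and flagged for retraining or audit when it drifts:

```python
# Minimal monitoring loop: track a deployed model's recent accuracy and flag it
# for retraining or audit when performance drifts. Thresholds are illustrative.
from collections import deque

class ModelMonitor:
    def __init__(self, window: int = 500, min_accuracy: float = 0.80):
        self.outcomes = deque(maxlen=window)    # 1 = correct prediction, 0 = wrong
        self.min_accuracy = min_accuracy

    def record(self, prediction, actual) -> None:
        self.outcomes.append(1 if prediction == actual else 0)

    def needs_attention(self) -> bool:
        if len(self.outcomes) < self.outcomes.maxlen:
            return False                         # not enough recent evidence yet
        return sum(self.outcomes) / len(self.outcomes) < self.min_accuracy

# Usage sketch: feed in labeled outcomes as they arrive from production.
monitor = ModelMonitor(window=500, min_accuracy=0.80)
# for prediction, actual in production_stream:    # hypothetical stream of outcomes
#     monitor.record(prediction, actual)
#     if monitor.needs_attention():
#         trigger_retraining_and_audit()          # hypothetical retraining/audit hook
```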
“AI as Normal Technology” Vision
One of the most compelling narratives advanced by Narayanan and Kapoor is the idea of “AI as normal technology.” Rather than succumbing to either polar extreme (rosy utopian fantasies of superintelligence or dystopian doomsday scenarios), this perspective frames AI as a long-term, incremental transformation akin to the advent of electricity, automobiles, or the internet.
Key Tenets of the “Normal Technology” Narrative
Gradual Adoption and Integration
- Historically, transformative technologies take decades, if not centuries, to fully permeate society. The electric grid, for instance, first illuminated a handful of homes in the late 19th century but only became widespread by the mid-20th century. Similarly, self-driving cars won’t reach ubiquitous reliability overnight; they’ll progress through incremental improvements, pilot programs, geo-fenced deployments, and semi-autonomous features before fully autonomous vehicles become mainstream.
Mixed Impact across Domains
- Some AI applications will become seamlessly integrated into daily life (think autocomplete in search engines or voice assistants that set your calendar reminders). Others will remain ethically fraught or technically unsolved, such as algorithms predicting “pre-crime” or systems that attempt to decode brain signals. Viewing AI through a “normal technology” lens helps set realistic expectations.
Emphasis on Institutional and Regulatory Evolution
- As AI advances, laws and norms will adapt in tandem. Traffic regulations for self-driving cars will evolve through pilot programs and regulatory sandboxes rather than through a single legislative decree. Privacy laws will grapple with generative deepfakes, and new statutes will emerge to govern data provenance, consent, and usage.
Collective Human Knowledge as a Limiting Factor
- AI does not pop into existence fully formed; it builds upon vast repositories of human-generated data, models, and processes. Achieving “artificial general intelligence” (AGI) would require replicating centuries of collective experimentation, trial and error, and social dynamics within algorithms. This historical perspective tempers wild AGI predictions, underscoring that replicating human-level cognition involves not just scaling up data but also capturing embedded social, cultural, and ethical dimensions.
By embracing AI as a “normal technology,” we can avoid cyclical hype and disappointment. Rather than expecting overnight miracles or fearing imminent robot overlords, we recognize that AI’s most profound impacts will be subtle, diffuse, and often invisible, much like the electric motor quietly powering your household appliances.
Fireside Chat Highlights with Daron Acemoglu
The discussion culminated in a fireside chat with economist Daron Acemoglu, whose insights underscored many of the structural and normative concerns surrounding AI.
Predictive AI’s Foundational Concerns
Acemoglu emphasized that predictive AI is not merely a neutral tool; it is deeply embedded in social and power dynamics. In settings like pretrial detention or credit underwriting, algorithmic forecasts inherently reify existing inequalities. A model trained on historical arrest records may perpetuate biases against marginalized communities, because those communities have historically been over-policed and over-arrested. Even if you improve the model’s technical accuracy, you still risk entrenching unjust power asymmetries by deciding a person’s liberty on the basis of statistical probabilities rather than individual circumstances.
Moreover, Acemoglu argued that normative questions about the legitimacy of delegating decision-making authority to algorithms remain unresolved. Even if a predictive model were 99% accurate, does that justify depriving individuals of due process? Is it ethical to base someone’s future on a probabilistic estimate? These questions extend beyond technical optimization into the realm of societal values and principles.
Generative AI’s Limited Real-World Reliability
While demos of generative AI can dazzle, Acemoglu cautioned that real-world, production-grade deployments demand near-perfect reliability, often upwards of 99.99% uptime or accuracy. He noted that early attempts to replace domain-specific software (for instance, a hypothetical “AI-driven” spreadsheet that fully automates financial analysis) floundered because the cost of a single error could be catastrophic. In contrast, a chatbot that occasionally misinterprets a grocery list may be amusing, but a financial modeling AI that miscalculates risk could precipitate billions in losses.
Acemoglu stressed that generative AI will see real breakthroughs only through sector-by-sector, iterative deployments, much like self-driving cars collecting millions of miles of data to refine their perception modules. It’s not enough to train a massive language model on the entire internet; successful integration into domains like medicine or law requires deep, co-design between AI engineers and domain experts, continuous feedback loops, and rigorous validation protocols.
Skepticism toward AGI Timelines
Dubbing rampant AGI predictions “magical thinking,” Acemoglu reminded listeners that since the 1950s, AI luminaries have repeatedly overestimated how soon “general intelligence” would materialize. They overlooked the fact that human intelligence is not an isolated computational process but a byproduct of centuries of societal collaboration, incremental technological progress, and ethical deliberation. Without replicating these intangible human experiences—cultural norms, moral reasoning, tacit knowledge—an AI system cannot achieve genuine understanding. He advised the audience to remain skeptical of any AGI timeline that lacks a clear roadmap through the “intermediate rungs of the ladder”: the smaller milestones that bridge narrow AI tasks to broad, flexible cognition.
Pro-Worker AI and Incentive Structures
Arguably the most urgent challenge Acemoglu identified is the misalignment of market incentives. Venture capital and corporate research budgets tend to gravitate toward flashy generative AI startups, because these ventures capture headlines and promise rapid scaling. Meanwhile, “pro-worker” AI applications—tools designed to augment skills, enhance productivity, or create new job categories—often languish underfunded. For example, AI-driven tutoring systems that provide personalized feedback to students, or AI-assisted manufacturing tools that help workers optimize production lines, generate benefits for labor but do not promise the same speculative returns as a viral chatbot.
Acemoglu suggested policy interventions—tax credits for AI investments that demonstrably boost worker productivity, public funding for noncommercial AI research, and regulations that encourage redistributive outcomes. Without such measures, capitalist market dynamics will continue to prioritize short-term returns over long-term societal welfare, deepening wage stagnation and job insecurity.

Audience Q&A Takeaways
During the lively Q&A session following the fireside chat, several key themes emerged:
Hallucinations Are Inherent in Generative Models
- Even with impeccable training datasets, generative models produce outputs by sampling from statistical patterns. This stochastic process inevitably yields fabricated or misleading content, commonly known as “hallucinations.” While future research may reduce hallucination rates, they will never disappear entirely. End users must adopt a critical mindset: treat AI-generated content as a draft requiring verification rather than a definitive source of truth.
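A toy sampler illustrates why (the prompt, vocabulary, and probabilities below are invented): each next word is drawn from learned probabilities, and nothing in that sampling step checks the result against verified facts.

```python
# Toy illustration of why sampling-based generation can "hallucinate":
# the next word is drawn from learned probabilities, not checked against facts.
# The prompt, vocabulary, and probabilities are invented for illustration.
import random

next_word_probs = {
    "The study was published in": [("Nature", 0.4), ("2019", 0.3),
                                   ("a journal that does not exist", 0.3)],
}

def sample_continuation(prompt: str, seed: int) -> str:
    random.seed(seed)
    words, weights = zip(*next_word_probs[prompt])
    return prompt + " " + random.choices(words, weights=weights, k=1)[0]

for seed in range(3):
    print(sample_continuation("The study was published in", seed))
# Every continuation is fluent and confident; none of them has been fact-checked.
```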
Debiasing Algorithms vs. Human Judgment
- While debiasing algorithms through reweighting or retraining can mitigate some disparities, these fixes often run headlong into legal and transparency constraints. For instance, using protected attributes like race or gender to correct for bias may violate anti-discrimination statutes, even if the intention is to equalize error rates. Conversely, human decision-makers (judges, loan officers, hiring managers) receive no formal training in bias mitigation and often rely on implicit heuristics. The audience debate underscored a sobering reality: both algorithmic and human decision-making processes are susceptible to bias, but our legal and regulatory frameworks are structured to address human bias more readily than algorithmic bias.
Context Matters
- Blanket statements like “generative AI is better than predictive AI” are meaningless without specifying the domain. In creative writing tasks, generative models can outperform rule-based text generators. In high-stakes domains like bail decisions or medical diagnoses, predictive models—if sufficiently accurate—might offer more objective, consistent evaluations than humans. However, if the data feeding those models reflect historical injustices, the predictive AI simply reenacts bias. Listeners were reminded to ground any comparative claims in specific contexts, performance metrics, and intended uses.
Investment Allocation and “Herding” Behavior
- The audience explored how the AI research community’s herd mentality—chasing after large-scale neural architectures and massive training corpora—stifles alternative approaches. Could we have breakthrough improvements if we pursued smaller, more specialized models or rule-based systems in parallel? The consensus was that diversified R&D portfolios, akin to financial portfolio theory, would enable higher “expected value” in AI innovation. Yet restructuring funding incentives remains an uphill battle in a venture capital–driven ecosystem that prizes headline-grabbing metrics (e.g., model size, parameter count) over incremental, domain-specific advances.
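One minimal, back-of-the-envelope reading of the portfolio analogy (all numbers invented): if several research directions have roughly the same expected payoff, spreading funding across them preserves that expected value while sharply reducing the chance of ending up with nothing to show.

```python
# Back-of-the-envelope portfolio illustration with invented numbers: five research
# bets, each with a 10% chance of a payoff of 10 (expected value 1.0 per unit funded).
import numpy as np

rng = np.random.default_rng(1)
n_trials, n_bets = 100_000, 5
hits = rng.random((n_trials, n_bets)) < 0.10

concentrated = 10 * hits[:, 0]           # all funding on a single bet
diversified = 10 * hits.mean(axis=1)     # funding spread evenly across five bets

print("concentrated: mean %.2f, std %.2f" % (concentrated.mean(), concentrated.std()))
print("diversified:  mean %.2f, std %.2f" % (diversified.mean(), diversified.std()))
# Same expected payoff, far less variance in the overall outcome.
```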
AI’s Impact on Wages & Productivity
- AI can boost worker output—automating repetitive tasks, surfacing insights from large datasets, and enabling human-machine collaboration. But without policies to share gains equitably, productivity gains risk translating into greater profits for shareholders rather than higher wages for employees. Panelists discussed potential solutions: profit-sharing plans tied to productivity metrics, democratized ownership of AI-driven tools within companies, and targeted retraining programs financed by an AI tax on companies deploying labor-displacing systems.
Public Fear and Misconceptions
- Lastly, anxiety about AI often reflects broader societal concerns—job security, income inequality, and technology-driven surveillance—rather than genuine technical limitations. Effective two-way communication is vital: AI developers must explain their systems’ real capabilities and constraints in plain language, while policymakers, journalists, and domain experts should convey nuanced, evidence-based perspectives rather than apocalyptic sound bites. A well-informed public is less likely to succumb to hype cycles and more likely to support pragmatic policies that balance innovation with social welfare.
Overarching Themes and Lessons
Drawing together these strands of discussion, several overarching themes emerge:
Distinguish Hype from Genuine Value
- Too many high-stakes AI claims rest on flimsy empirical foundations. Whether it’s predicting job fit from a video or forecasting recidivism with pinpoint precision, decision-makers must demand transparent validation, peer review, and reproducible results. Without rigorous scrutiny, stakeholders risk pouring resources into applications that yield zero tangible benefit or, worse, inflict harm.
Evaluate AI by Both Functionality and Harm Potential
- The two-dimensional framework—functionality vs. harm potential—provides a pragmatic roadmap for prioritizing investments, regulatory focus, and public awareness. Technologies that function reliably and pose minimal harm should be encouraged (the bottom-right quadrant). Conversely, those that fail to deliver or that threaten individual rights and social equity warrant strict scrutiny or outright prohibition (the top-left and top-right quadrants).
Promote “Background AI”
- The most enduring AI applications are those that operate quietly in the background—autocomplete, spam filters, basic recommendation systems, and voice assistants. These tools deliver incremental benefits that sum to meaningful productivity gains but avoid the pitfalls of overhyped, revolutionary proclamations. Recognizing AI as an invisible infrastructure empowers organizations to adopt modest, incremental innovations that deliver real value without courting controversy.
Guard Against Power Imbalances
- AI can amplify existing inequalities if deployed without transparency, accountability, and thoughtful safeguards. Credit scoring algorithms trained on biased data can perpetuate systemic racism. Facial recognition used by law enforcement without oversight can erode civil liberties. To prevent such abuses, regulators should require algorithmic impact assessments, mandate explainability standards, and empower affected communities with avenues for redress.
Envision AI as a Decades-Long Evolution
- Reject doomsday AGI timelines and utopian fantasies alike. Instead, embrace a “normal technology” perspective, recognizing that transformative impacts unfold gradually. Whether it’s medical AI that incrementally improves diagnostic accuracy or autonomous vehicles that slowly expand their operational domains, the real story of AI is one of steady progress, constant iteration, and evolving governance frameworks.
Final Thoughts
The journey from Narayanan’s initial discovery of video-based personality assessment tools to a comprehensive critique of modern AI reflects a broader reckoning with technological exuberance. “AI Snake Oil” serves as both a cautionary tale and a compass, reminding us that not all AI is created equal, and that hype often overshadows hard evidence. By distinguishing between predictive and generative models, assessing applications through a two-dimensional lens, and advocating for “background AI,” we can steer the AI revolution toward genuine value creation rather than empty promises or dystopian specters.
Moving forward, stakeholders must heed the call to action: reject overhyped or harmful applications, build robust guardrails around promising technologies, and invest in AI applications that empower workers and address social inequities. As Acemoglu’s insights underscore, AI’s true potential lies not in replacing human agency but in augmenting human capabilities—if we can realign incentives to reward socially beneficial outcomes rather than speculative, short-lived valuations.
In embracing AI as a “normal technology,” we acknowledge that its most profound effects will likely manifest over decades rather than overnight. By fostering continuous stewardship through regulation, public engagement, and a commitment to evidence-based deployment, we can ensure that AI’s trajectory aligns with core human values: fairness, accountability, opportunity, and collective well-being. This is neither a utopian vision nor a dystopian warning but a sober, informed roadmap toward an AI-infused future that serves not just a few, but society at large.