Where do we draw the line?

On legitimate uses of AI and open information commons, and commitment to community values

Zachary Batist, PhD

McGill University

Re-Defining Open Social Scholarship in an Age of Generative "Intelligence"

Implementing New Knowledge Environments / Canadian Society for Digital Humanities


zackbatist@archaeo.social | zackbatist | 0000-0003-0435-508X | zackbatist.info

Researchers share their work openly.
AI consumes it at scale.
Is this a new problem, or an old one made visible?

The tension we’re told exists

Researchers are obligated to share their work openly — for public benefit.

That work is being scraped to train AI models — for someone else’s benefit.

We are told this requires new rules, new laws, new technical processes.

But this framing takes the status quo for granted.
Translating data from original contexts to secondary applications has always been lossy.
This predates AI.

Commons are social systems

Researchers have always participated in information commons:

  • Citing prior work, sharing data, building on others’ findings
  • These commons are scaffolded by norms and commitments — not just platforms and mandates
  • Instilled through participation in communities of practice

Technical camp

Infrastructures & mandates
Information as disembodied
Problems are legalistic in nature

Humanistic camp

Reforming science as inclusive
Knowledge as socially situated
Problems are rooted in culture

The technical camp dominates policy. The humanistic camp understands the actual problem.

The “no boundaries” claim is a happy lie

“Universal access” sounds egalitarian.

But even the most committed open advocates expect citation, attribution, appropriate reuse.

Commons have never been, and don’t need to be, egalitarian.
They are governed — who contributes, who accesses, in what ways.

The boundary is not a wall. It’s a sniff test,
judged by behavior, care, and commitment, not credentials alone.

Case 1: PDF piracy

Aaron Swartz
Shared academic papers widely
Lionized

Meta
Scraped LibGen at scale
Condemned

Same act. Radically different valence.

One serves the community — provides access to underserved members, contributes to the commons.

The other extracts from it — a predatory actor in it for themselves alone.

The difference is relational, not technical or legal.

Case 2: The Internet Archive

The National Emergency Library (2020) — granting access to books during Covid-19.

  • Informed by professional archival principles ✓
  • Driven by confidence in technical solutions to social problems ✗
  • A genuinely ambiguous case — neither fully inside nor outside

It applied a technical resolution to an underlying social problem.

The manner of the intervention mattered as much as the outcome.

Case 3: Big data archaeology

Computer scientists / data scientists / physicists publish high-profile papers “solving” archaeological problems with large integrated datasets.

                   | Outside the community                   | Inside the community
Treatment of data  | Neutral, abstract, recombinant          | Storied, situated, context-laden
Epistemic stance   | Data taken as source of universal truth | Contributing one point of view
Community response | Intuitively rejected                    | Treated with consideration

The difference is not credentials. It’s care.

Understanding a dataset’s circumstances of creation comes from community membership.

AI as amplifier, not origin

We put our data out there. We made it FAIR.
We imagined certain use-cases. Certain users.

AI scraped it for different ends.

This is structurally similar to UNESCO World Heritage claims:
“for all humanity” — often at the expense of those who inhabit and maintain the site.

The problem isn’t AI per se.
It’s AI deployed without community membership
— without the gradual participation that builds the tacit knowledge to use data responsibly.

Is AI being done by humanists,
or by engineers playing humanist
for an afternoon?

Humanities defined by practice, not data type

The humanities are not defined by the kinds of data they use.

They are defined by the care and approaches we take:
the training, the community embedding, the commitments.

You can’t substitute a flashy method for a good question,
a rigorous orientation, or an open mind.

Tool-fetishizing has always been a dead end.

This is actually a case for humanists to engage with AI —
critically, carefully, on their own terms —
because humanists are especially well equipped to do so.

Where do we draw the line?

Re-defining open social scholarship is not about
redrawing the line from scratch.

It’s about making the existing lines more legible,
more porous where appropriate,
and defended where they matter.

The line isn’t gone.
It’s ours to tend.

Thank you

These slides and the abstract are available at:
zackbatist.info/inke-2026