<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>Zack Batist</title>
<link>https://zackbatist.info/posts.html</link>
<atom:link href="https://zackbatist.info/posts.xml" rel="self" type="application/rss+xml"/>
<description>Rambles about archaeological practice, open science, and collaboration
</description>
<generator>quarto-1.8.25</generator>
<lastBuildDate>Wed, 10 Dec 2025 05:00:00 GMT</lastBuildDate>
<item>
  <title>A few new papers from 2025</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2025-12-10-a-few-new-papers-from-2025.html</link>
  <description><![CDATA[ 





<p>Two additional papers deriving from my dissertation have been successfully peer reviewed and/or published during 2025. These complement <strong>On the Value of Informal Communication in Archaeological Data Work</strong>, which was published by Open Archaeology last year (<a href="https://doi.org/10.1515/opar-2024-0014">DOI: 10.1515/opar-2024-0014</a>).</p>
<p><strong>Locating Creative Agency in Archaeological Data Work</strong><br>
<a href="https://doi.org/10.17613/8eqy4-n1m82">DOI: 10.17613/8eqy4-n1m82</a> / <a href="https://zackbatist.info/locating-creative-agency/">Preprint</a> / <a href="https://github.com/zackbatist/locating-creative-agency">GitHub</a></p>
<p>Through comparison of illustration, photography, photogrammetry practices, and different approaches for collecting and managing spatial data, I identified the roles that fieldworkers and other actors play while constituting data. I submitted this paper for open review through Peer Community in Archaeology, which was a great experience! You can read the reviews <a href="https://doi.org/10.24072/pci.archaeo.100608">at the PCI-archaeo website</a>.</p>
<p><strong>Balancing situated and objective representations in archaeological fieldwork</strong><br>
<a href="http://doi.org/10.1017/aap.2025.10101">DOI: 10.1017/aap.2025.10101</a> / <a href="https://zackbatist.info/fuzzy-concrete/">Preprint</a> / <a href="https://github.com/zackbatist/fuzzy-concrete">GitHub</a></p>
<p>This paper has just been published in Advances in Archaeological Practice. It articulates tensions between systemic warrants to formalize archaeological observations and the embodied experiences of actually collecting data during fieldwork. I also include the peer reviews in the <a href="https://github.com/zackbatist/fuzzy-concrete/blob/main/sections/13_peer-review.qmd">git repository</a>.</p>
<p><strong>Data Management is People Management: On Abstraction of Data and Labor in Archaeological Projects</strong><br>
DOI: TBD / <a href="https://zackbatist.info/data-tech-poli-arch/">Preprint</a> / <a href="https://github.com/zackbatist/data-tech-poli-arch">GitHub</a></p>
<p>I presented another outcome from my dissertation at the <a href="https://uib.no/en/nia/179128/data-and-technology-politics-archaeology">Data and Technology Politics in Archaeology Workshop</a> at the Norwegian Institute at Athens last week. A revised version of this will be published in the workshop proceedings sometime next year.</p>



 ]]></description>
  <category>publication</category>
  <category>data work</category>
  <category>archaeological fieldwork</category>
  <category>science studies</category>
  <guid>https://zackbatist.info/posts/2025-12-10-a-few-new-papers-from-2025.html</guid>
  <pubDate>Wed, 10 Dec 2025 05:00:00 GMT</pubDate>
</item>
<item>
  <title>Scholarly metadata pertaining to monographs and grey literature</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2025-08-18-open-scholarly-metadata/</link>
  <description><![CDATA[ 





<p>Since the spring, I’ve been working with Julie Lund and Isak Roalkvam on a project to analyze the development of ideas concerning the origins of the Viking Age through a bibliometric analysis of a large corpus of published work. The project itself is very interesting, and I will post more about it when we get some significant findings, but now I just want to share some frustrations and challenges we’ve been experiencing, which may represent a bigger problem with <a href="https://www.cambridge.org/core/elements/philosophy-of-open-science/0D049ECF635F3B676C03C6868873E406#A-sec-6">representativeness of scholarly pluralism in open science</a>.</p>
<p>Basically, we want to explore the web of citations being made in hundreds of scientific works, and in order to do this we need to obtain information about each work and about the works that they are citing. We will then conduct a network analysis of these bibliographic citations, while also incorporating some additional information such as affiliations as well as qualitative evaluation of topics covered by each paper.</p>
<p>However, we’re facing significant problems obtaining reliable scholarly metadata about monographs, chapters of edited volumes, and grey literature, which make up a significant portion of archaeological literature. So in this post I outline the series of decisions we made and the roadblocks we experienced. We still haven’t quite arrived at a solution, so in a way what you’re about to read is an articulation of a work in progress, but one which reveals some inadequacies with the open infrastructures that we have built and attests to our overconfidence in them.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="toot.png" class="lightbox" data-gallery="quarto-lightbox-gallery-1" title="Metadata is only useful if it exists."><img src="https://zackbatist.info/posts/2025-08-18-open-scholarly-metadata/toot.png" class="img-fluid figure-img" alt="Metadata is only useful if it exists."></a></p>
<figcaption>Metadata is only useful if it exists.</figcaption>
</figure>
</div>
<section id="tapping-scholarly-metadata-resources" class="level1">
<h1>Tapping scholarly metadata resources</h1>
<p>The most common approach in bibliometric studies, or in meta-research that relies in whole or in part on scholarly metadata, is to access <a href="https://www.crossref.org/">Crossref</a>. Crossref is an open resource that registers information from virtually all modern academic publishers. It is one of a few such registries, existing alongside <a href="https://scopus.com/">Scopus</a>, <a href="https://www.webofscience.com/">Web of Science</a> and <a href="https://scholar.google.ca/">Google Scholar</a>. What makes Crossref distinct is that it’s a registry rather than an original source of data: publishers enter required and optional metadata into Crossref, and Crossref draws associated records together using a unique digital object identifier. A similar process applies for <a href="https://datacite.org/">DataCite</a>, which specializes in data resources, as opposed to published journal articles, and <a href="https://pubmed.ncbi.nlm.nih.gov/">PubMed</a>, which specializes in topics relating to medicine. Several additional services like <a href="https://openalex.org/">OpenAlex</a>, <a href="https://www.lens.org/">Lens</a> and <a href="https://www.semanticscholar.org/">Semantic Scholar</a> run on top of Crossref and other related data sources, but they are fundamentally based on the same data submitted by publishers. <a href="https://scholar.social/@steko/114873209915515620">This is why your Zotero library is so messy.</a></p>
<section id="problems-of-scope" class="level2">
<h2 class="anchored" data-anchor-id="problems-of-scope">Problems of scope</h2>
<p>Crossref is the most open scholarly metadata resource, but it’s also quite limited in its scope: it primarily comprises journal articles, and has very limited coverage of monographs, book chapters, pre-prints, and experimental genres and styles. Using Crossref as a data source for meta-research is therefore much more appropriate for analysis based in science, technology, engineering and medicine (STEM), since these fields rely fairly uniformly on journal publishing. On the other hand, the social sciences and humanities (SSH) regularly publish their work as whole monographs or as chapters within edited volumes, publish in smaller journals or government-hosted “grey literature” venues not registered by Crossref, and participate in experimental publishing practices. In other words, the range of scholarly works is much broader in SSH than the scopes that delimit Crossref and other scholarly metadata registries.</p>
<p>Since the corpus we want to compile includes monographs, chapters in edited volumes and grey literature, we are unable to rely on Crossref and other scholarly metadata resources alone. Moreover, after briefly investigating the field of scholarly metadata, I was left extremely disappointed by the prospect that we would ever be able to obtain reliable metadata at scale pertaining to these missing kinds of resources. It’s especially difficult to obtain the lists of references cited in each of these resources.</p>
<p>While Google Scholar has by far the best range of resources, it has a few significant drawbacks:</p>
<ul>
<li>There is no API. It can be scraped using some workarounds (see <a href="https://harzing.com/resources/publish-or-perish/manual/using/data-sources/google-scholar">the documentation provided by Publish or Perish</a>), but this is not ideal.</li>
<li>It does not provide unique identifiers such as DOI, and these need to be added through reference to the Crossref API, which re-introduces the problem of scope.</li>
<li>It does not include lists of references cited, which is a core set of information we need.</li>
</ul>
<p>With regard to Scopus and Web of Science, based on my own brief experiences, they do include a significant amount of data on monographs and book chapters, but the quality seems very inconsistent. Quality is especially low for monographs not written in English, which is the case for much of the literature included in our analysis of largely Scandinavian-led archaeology.</p>
</section>
<section id="open-citations" class="level2">
<h2 class="anchored" data-anchor-id="open-citations">Open Citations</h2>
<p>A key aspect of our study is analysis of citations, and it’s fantastic that the <a href="https://opencitations.net/">Open Citations</a> database exists to support this kind of work. However, initial testing showed significant discrepancies between the numbers of citations in Open Citations, Crossref and Google Scholar. So for the sake of consistency, we decided it would be best to rely on one source of information; i.e., if we are using Crossref, use the information provided by Crossref in all respects.</p>
<p>Moreover, although I haven’t tested it explicitly, it seems that Open Citations primarily indexes <em>recent</em> journal articles, similarly to Crossref. This may significantly impact the accuracy of its data, especially in fields that rely on monographs and grey literature.</p>
<p>Web of Science also includes a significant amount of data about bibliographic references made by various works, but as with other aspects of its database, the scope of this facet, especially when it comes to books, is inconsistent.</p>
<p>I was also advised to look at OpenAlex to investigate webs of references. However, it is now well documented that this aspect of the OpenAlex database is extremely deficient and unreliable <span class="citation" data-cites="alperin2024 culbert2025">(see Alperin et al. 2024; Culbert et al. 2025)</span>.</p>
</section>
</section>
<section id="extracting-metadata-from-pdfs" class="level1">
<h1>Extracting metadata from PDFs</h1>
<p>To work around these issues, we tried circumventing scholarly metadata resources and looking directly into the texts themselves, especially monographs and grey literature whose coverage is most significantly lacking. Specifically, we experimented with <a href="https://grobid.readthedocs.io/">Grobid</a>, a machine learning system designed to extract scholarly metadata about individual files and about the references they cite.</p>
<p>Grobid is actually quite good at reading PDFs of journal articles, but generally fails to produce reliable data for monographs and grey literature. This is because of the training data that informs the machine learning algorithm: it is only designed to work with journal articles. It’s also not trained on what the maintainer refers to as “humanities style references”, which may refer to references to books, but may also include footnote-style references or references that include <em>ibid</em>, <em>id</em> or other Latin abbreviations beyond <em>et al</em>. Moreover, it is primarily trained on a corpus of works written in the English language. This is evident through discussion in this GitHub issue posted by a team looking to extract information from dissertations: <a href="https://github.com/kermitt2/grobid/issues/809" class="uri">https://github.com/kermitt2/grobid/issues/809</a>. Of course, this is a matter of training the model to pick up on these things, but generally speaking there is so much diversity in SSH works and citation styles that I will always be less confident in the results.</p>
<p>One significant aspect of the Grobid tool is the ability to “<a href="https://grobid.readthedocs.io/en/latest/Consolidation/">consolidate</a>” the references, which effectively normalizes the extracted records against the Crossref database, or against a combination of the Crossref, PubMed and <a href="https://www.inist.fr/services/acceder/istex/">ISTEX</a> databases. However, this simply re-introduces the same biases as mentioned above. I did create a <a href="https://github.com/kermitt2/grobid/issues/1322">GitHub Issue</a> to inquire about the feasibility of consolidating against Google Scholar, and I regret that I haven’t had much time to reach out for more in-depth support, which the maintainers have kindly offered.</p>
</section>
<section id="moving-forward" class="level1">
<h1>Moving forward</h1>
<p>So right now we’re at a point where we need to reckon with what we really need to get the job done. Realistically, to construct networks, we only really need unique identifiers for each record, not the full record. So that might simplify our processes. We may also simply sample the dataset based on preliminary exploratory findings.</p>
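<p>As a rough illustration of why identifiers alone might be enough, here is a minimal sketch in Python. It assumes we already have pairs of (citing work, cited work) identifiers; the function names and the example identifiers are hypothetical placeholders, not real records or part of our actual pipeline.</p>

```python
def build_citation_network(citations):
    """Build a directed graph from (citing_id, cited_id) pairs.

    Returns a dict mapping each identifier to the set of identifiers it cites.
    Only identifiers are needed; no titles, authors, or other metadata.
    """
    graph = {}
    for citing_id, cited_id in citations:
        graph.setdefault(citing_id, set()).add(cited_id)
        # Make sure works that are only ever cited still appear as nodes.
        graph.setdefault(cited_id, set())
    return graph


def times_cited(graph):
    """Count how many works in the graph cite each identifier."""
    counts = {node: 0 for node in graph}
    for cited_ids in graph.values():
        for cited_id in cited_ids:
            counts[cited_id] += 1
    return counts


# Hypothetical identifiers: a DOI where one exists, a local key otherwise.
example_citations = [
    ("doi:10.0000/example-article", "local:monograph-1"),
    ("local:report-7", "local:monograph-1"),
    ("doi:10.0000/example-article", "local:report-7"),
]
network = build_citation_network(example_citations)
print(times_cited(network))
```

<p>The point of the sketch is that a locally assigned key can stand in wherever a monograph or report has no DOI, so long as each work is given exactly one identifier across the corpus.</p>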
<p>Ideally, we would be able to do something like what Alex Brandsen <span class="citation" data-cites="brandsen2023">(2023)</span> accomplished: extracting specific kinds of information from archaeological publications using machine learning techniques. But neither of us really has that kind of practical expertise, at least not yet, anyway.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="brandsen.png" class="lightbox" data-gallery="quarto-lightbox-gallery-2" title="Screenshot from Brandsen (2023) highlighting named entity recognition detected in archaeological publications."><img src="https://zackbatist.info/posts/2025-08-18-open-scholarly-metadata/brandsen.png" class="img-fluid figure-img" alt="Screenshot from Brandsen (2023) highlighting named entity recognition detected in archaeological publications."></a></p>
<figcaption>Screenshot from <span class="citation" data-cites="brandsen2023">Brandsen (2023)</span> highlighting named entity recognition detected in archaeological publications.</figcaption>
</figure>
</div>
<p>Overall, the situation is kind of challenging, but potentially very rewarding too. Grappling with these issues makes the difference between a rather uniformly lazy meta-science approach and good social scientific research. It is not acceptable for us to simply pick up the data that’s already available, since unfortunately this includes significant biases that would severely hinder our findings. So we need to actively make the data work for us so that they address our research questions, rather than do things the other way around.</p>



</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-bibliography"><h2 class="anchored quarto-appendix-heading">References</h2><div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0">
<div id="ref-alperin2024" class="csl-entry">
Alperin, Juan Pablo, Jason Portenoy, Kyle Demes, Vincent Larivière, and Stefanie Haustein. 2024. <span>“An Analysis of the Suitability of OpenAlex for Bibliometric Analyses.”</span> April 26, 2024. <a href="https://doi.org/10.48550/arXiv.2404.17663">https://doi.org/10.48550/arXiv.2404.17663</a>.
</div>
<div id="ref-brandsen2023" class="csl-entry">
Brandsen, Alex. 2023. <span>“Information Extraction and Machine Learning for Archaeological Texts.”</span> In <em>Discourse and Argumentation in Archaeology: Conceptual and Computational Approaches</em>, edited by Cesar Gonzalez-Perez, Patricia Martin-Rodilla, and Martín Pereira-Fariña, 229–61. Quantitative Archaeology and Archaeological Modelling. Cham: Springer International Publishing. <a href="https://doi.org/10.1007/978-3-031-37156-1_11">https://doi.org/10.1007/978-3-031-37156-1_11</a>.
</div>
<div id="ref-culbert2025" class="csl-entry">
Culbert, Jack H., Anne Hobert, Najko Jahn, Nick Haupka, Marion Schmidt, Paul Donner, and Philipp Mayr. 2025. <span>“Reference Coverage Analysis of OpenAlex Compared to Web of Science and Scopus.”</span> <em>Scientometrics</em> 130 (4): 2475–92. <a href="https://doi.org/10.1007/s11192-025-05293-3">https://doi.org/10.1007/s11192-025-05293-3</a>.
</div>
</div></section></div> ]]></description>
  <category>scholarly metadata</category>
  <category>meta-research</category>
  <category>bibliometrics</category>
  <category>machine learning</category>
  <category>BibVik</category>
  <guid>https://zackbatist.info/posts/2025-08-18-open-scholarly-metadata/</guid>
  <pubDate>Mon, 18 Aug 2025 04:00:00 GMT</pubDate>
</item>
<item>
  <title>AI and extraction from the open scientific commons</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2025-05-05-ai-and-extraction-from-the-open-scientific-commons.html</link>
  <description><![CDATA[ 





<p>There’s a pretty clear tension between researchers’ obligations to publish their work (ideally, in a widely accessible manner) and the use of these materials to train AI. On one hand, researchers produce knowledge for communal benefit, and on the other hand their work is being used for some pretty egregious purposes. A lot has already been written about this tension, but I’m not really satisfied with any explanations or recommendations I’ve read so far. Chief among my concerns is that an impasse is often contrived by framing it through technical, legal or procedural outlooks which take for granted several assumptions about the status quo. So I’ll jump in with my own take, as a scholar of scientific practice whose research is about the formation and management of information commons among scientists.</p>
<p>I don’t see this as a novel disruption imposed by the AI industry, but a continuation of a broader social phenomenon. Specifically, I look at this in terms of shifting social norms with regards to the governance of knowledge commons. When researchers do their work, they produce, rely on, or otherwise participate in information commons. Moreover, their engagement with the commons is scaffolded by technical and administrative systems and by collaborative norms and expectations.</p>
<p>For instance, while citing sources, researchers will access prior work made available to them by their library or on the publisher’s website, they identify — based on norms established through their earlier education — when and how to cite sources, and if they fail to comply with these norms they are either corrected by reviewers, criticized for bad behaviour or formally sanctioned by their institutions on charges of plagiarism. Researchers expect to be able to access and cite any prior work produced in their field, and expect to be cited when their contributions are being used by others. Contributing to and extracting from the information commons of scholarly literature is therefore scaffolded by commitments to a collective enterprise, which constitute norms and expectations instilled through participation in a community of practice.</p>
<p>Moreover, it should be emphasized that commons of all kinds do not have to be, and usually are not, egalitarian. Communities devise norms that determine who can access them and in what ways. As such, the boundaries that effectively limit access are essentially social in nature, but are enforced through technical and legalistic mechanisms. In other words, access to a commons is deemed either acceptable or unacceptable based on the actors’ relationship to the commons, and more specifically whether they commit to the norms that govern access.</p>
<p>With regard to AI’s access to academic research outputs, a significant concern is that the commitments and values which govern access are heterogeneous and in flux, especially in light of the evolution of the open science movement. Although this is probably a simplification, I consider there to be two primary camps in open science. One is driven by the design of technical infrastructures and national policy mandates that facilitate and enforce information sharing. This is driven by a transactional vision of information sharing, and considers information as something that can be disembodied, recombined, and easily recontextualized. The other camp is more concerned with reforming science as a more inclusive humanistic enterprise. This means distributing material resources more equitably and removing barriers to participating as a scientist. Whereas the former considers the problems that science faces as technical and legalistic in nature, the latter recognizes the root social problems that underlie the mechanisms through which those problems are enforced.<sup>1</sup></p>
<p>The problem is that open infrastructures and policy mandates, which are largely devised by the technical camp, are incommensurate with how science actually works in practice, and impose new commitments for participating in the commons that are not valued by the communities that actually make these commons possible.<sup>2</sup> And to a greater extreme, they undermine the established norms that research communities have developed for themselves.<sup>3</sup> Specifically, community norms, which previously served to establish the boundaries that govern who can access the commons and in what ways, have essentially been undermined through claims of universal access, or claims that there should be no boundaries whatsoever.</p>
<p>But this claim of removing boundaries is a happy lie we tell ourselves — even the most hardcore open bros expect their work to be cited, and often want their work to be used in a way that is commensurate with their intentions, and not misrepresented. These are very reasonable expectations, and most researchers will adhere to them due to their common upbringing within a community of practice that instilled these norms. But outside actors who are not familiar with this decorum, or who simply refuse to adhere to these rules, are acting out of line, at least in the minds of those who expect or who are trying to foster respectful forms of community engagement.</p>
<hr>
<p>Determining who counts as an outsider or an insider does matter. However, these distinctions are slippery and shifting. Moreover, they are not really tied to identity, but to the manner in which they engage with the community and its norms and values. It could be helpful to complicate a few common dichotomies in order to explore this further.</p>
<p>For example, we might ask about when it is or is not acceptable to enforce or not enforce copyright. Academics share PDFs of their papers all the time even when they have no legal right to do so, and at the same time react intensely when Meta engages in the same practices. However, these are not the same things: academic piracy is deemed good when it provides access to underserved community members, and bad when it is purely extractive.<sup>4</sup> In other words, Aaron Swartz is lionized for his piracy because of his service to the research community, whereas Meta is an outside and threatening actor in it for themselves and themselves alone. One is a community member who enhances the commons by contributing to it, the other is a notoriously predatory entity that extracts while providing little in return. Another more ambiguous example that lies between these extremes is the Internet Archive, whose work is both informed by professional archival principles while also being driven by a tech-bro attitude regarding how to resolve fundamental questions concerning access to copyrighted work; in developing the National Emergency Library to grant access to reading material during the Covid-19 pandemic, they succumbed to a technical means of resolving an underlying social problem.</p>
<p>Another example is a scenario that happens from time to time, when computer scientists or physicists publish high-profile papers in <em>Science</em> or <em>Nature</em> that claim to “solve” archaeological problems using large and integrated datasets that they know nothing about. Any archaeologist who reads these works will intuitively reject them based on the fact that the authors take the data at face value, without understanding the complicated and storied histories of the datasets they rely on, which can only be truly understood through experience working as an archaeologist. And when archaeologists publish similar papers, with care and concern for evaluating the data as potentially mismatched with their intended analytical use-case, and while taking into account ethical and epistemological concerns, these works are considered more legitimate and are treated with kinder consideration. The difference is that in the latter scenario, archaeologists are engaging with the data in meaningful ways, and account for the decisions, actions and circumstances of the data’s creation, which they understand through their common upbringing as archaeologists. Computer scientists, on the other hand, view the data as neutral and abstract representations that can be mixed and matched with ease, which is appealing in that field. But any archaeologist knows that this is not true and that pretending that it is produces bad or wrong research outcomes.</p>
<p>Another emerging ambiguity is using AI locally and on-device, while respecting privacy, and while using it as a tool to address legitimate research questions. In other words, using AI within the bounds of established professional decorum, values and principles, as determined by a discipline or community of practice. Using AI for the sake of it, or without care for its impact on the work, is unacceptable due to the sense that this behaviour signals a disconnection with disciplinary norms and expectations for the sake of easy hype, which is typically celebrated by AI industry shills and marks you as serving their ends.</p>
<hr>
<p>I’m not really sure how to end this post. It’s tempting to apply the label of extraction, but I don’t think I know enough about this emerging framework to make that connection. And I think that, at the same time, extraction is a bit of a misnomer, since it draws an insider/outsider dichotomy that I don’t believe is consistently warranted. We must also acknowledge that <em>we did this</em>, we put our data out there, we linked them, we made them FAIR.</p>
<p>But it does seem that the treatment of scholarly commons as materials that can be justifiably exploited is dependent on the notion of universal access. It brings to mind the idea of UNESCO World Heritage sites, which claim exotic built environments for the sake of all humanity, often at the expense of those who are made to live within and around them.<sup>5</sup> When we make our data Findable, Accessible, Interoperable and Reusable, we imagine certain outcomes and use-cases, we anticipate who we are making our data FAIR for. But it’s important to recognize that the infrastructures we rely on to share our data have distinct and overlapping goals. We need to ask what open science is for, and who it is for. In other words, we need to consider the scientific commons as a social commons, which includes expectations and boundaries.</p>




<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>This is evident in the mechanisms that each employs to try and resolve their target of concern: the former see publishing as the business of typesetting and copyright law, which they could hack to render moot using automated publishing workflows and by encouraging use of open licensing agreements; viewed as merely technical systems, these could be resolved through technical means. Whereas the latter tend to be concerned with experimental publishing and pushing the boundaries regarding what constitutes legitimate media for scholarly communication, collaboration, evaluation and review. One is shallow, the other deep.↩︎</p></li>
<li id="fn2"><p>As an aside, it’s frustrating that this is commonly framed as the culture not yet having “caught up” to the brave new world of total open science.↩︎</p></li>
<li id="fn3"><p>Whether or not this is warranted is a topic for another discussion, but the fact <em>that</em> this is happening is important for framing the concern over AI’s access to research outputs.↩︎</p></li>
<li id="fn4"><p>I wonder if this would have been framed differently if Meta decided to seed the libgen torrents or take a stand against restrictive copyright laws altogether, rather than claiming that their use-case is exceptional.↩︎</p></li>
<li id="fn5"><p>This also brings to mind how digitally-crafted reconstructions of the Palmyra gate have been shopped around by non-archaeologists as a form of <a href="https://doi.org/10.1017/S1380203820000239">digital colonialism</a>.↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>open science</category>
  <category>commons</category>
  <category>ai</category>
  <guid>https://zackbatist.info/posts/2025-05-05-ai-and-extraction-from-the-open-scientific-commons.html</guid>
  <pubDate>Mon, 05 May 2025 04:00:00 GMT</pubDate>
</item>
<item>
  <title>Migrating from Wordpress</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2025-03-09-migrating-from-wordpress.html</link>
  <description><![CDATA[ 





<p>I’m moving my blog from WordPress to a <a href="https://quarto.org/">quarto</a>-based system, which I’m already using to maintain other aspects of my professional website. The blog is now located at <a href="https://zackbatist.info/posts">zackbatist.info/posts</a>.</p>
<p>I’m doing this for a few reasons:</p>
<ul>
<li>To make my site more consistent from a visitor perspective</li>
<li>To match my other writing habits and workflows</li>
<li>To save some money on VPS hosting</li>
<li>I’ve been having some weird database corruption issues</li>
<li>All the WordPress drama</li>
</ul>
<p>This means that I’ll no longer be using the <a href="https://blog.zackbatist.info">blog.zackbatist.info</a> subdomain, and all existing links will be broken. I reformatted all the older posts and added the existing URLs to the new versions’ metadata, but this won’t actually do any active redirection.</p>
<p>I’m also shutting down my mediawiki instance (<a href="https://wiki.zackbatist.info">wiki.zackbatist.info</a>), which I was barely using to begin with. I consolidated and refactored all the pages worth keeping and will continue to write additional notes at <a href="https://zackbatist.info/notes">zackbatist.info/notes</a>.</p>
<p>Please update your RSS feeds if you want to keep following!</p>



 ]]></description>
  <category>meta</category>
  <category>writing</category>
  <category>website</category>
  <guid>https://zackbatist.info/posts/2025-03-09-migrating-from-wordpress.html</guid>
  <pubDate>Sun, 09 Mar 2025 05:00:00 GMT</pubDate>
</item>
<item>
  <title>Starting as a postdoc at McGill</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2024-12-29-mcgill-postdoc.html</link>
  <description><![CDATA[ 





<p>I’m delighted to finally share that I’ll be starting as a postdoc at McGill University in January!</p>
<p>I’ll be joining the team at the Covid-19 Immunity Task Force to investigate data-sharing initiatives in epidemiological research, including the <a href="https://databank.citf.mcgill.ca/">CITF Databank</a>. My project will articulate the myriad social, technical, institutional and epistemic mechanisms that scaffold different approaches to data-sharing, and identify factors that enhance the value of data-sharing initiatives. Although this strays a little from my roots in archaeology, it’s actually a natural fit to continue exploring data-sharing as a social, collaborative and value-laden experience.</p>
<p>I set up a little website where I’m posting my research protocols and reflections on my work as it unfolds, and you can follow along at <a href="https://zackbatist.info/CITF-Postdoc">zackbatist.info/CITF-Postdoc</a>.</p>



 ]]></description>
  <category>CITF</category>
  <category>data-sharing</category>
  <category>postdoc</category>
  <category>science studies</category>
  <guid>https://zackbatist.info/posts/2024-12-29-mcgill-postdoc.html</guid>
  <pubDate>Sun, 29 Dec 2024 05:00:00 GMT</pubDate>
</item>
<item>
  <title>New Paper: Informal communication and archaeological data work</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2024-10-01-informal-communication-and-archaeological-data-work.html</link>
  <description><![CDATA[ 





<p>The first peer-reviewed paper deriving from my dissertation is finally published in Open Archaeology! It showcases qualitative research on scholarly communication within archaeological projects — specifically the role of informal communication styles in archaeological knowledge production, and how they complement more formally structured documentary media.</p>
<p><strong>On the Value of Informal Communication in Archaeological Data Work</strong><br>
<a href="https://doi.org/10.1515/opar-2024-0014">https://doi.org/10.1515/opar-2024-0014</a></p>
<p>Archaeological data simultaneously serve as formal documentary evidence that supports and legitimizes chains of analytical inference and as communicative media that bind together scholarly activities distributed across time, place, and social circumstance. This contributes to a sense of “epistemic anxiety,” whereby archaeologists require that data be objective and decisive to support computational analysis but also intuitively understand data to be subjective and situated based on their own experiences as participants in an archaeological community of practice. In this article, I present observations of and elicitations about archaeological practices relating to the constitution and transformation of data in three cases in order to articulate this tension and document how archaeologists cope with it. I found that archaeologists rely on a wide variety of situated representations of archaeological experiences – which are either not recorded at all or occupy entirely separate and unpublished data streams – to make sense of more formal records. This undervalued information is crucial for ensuring that relatively local, bounded, and private collaborative ties may be extended beyond the scope of a project and, therefore, should be given more attention as we continue to develop open data infrastructures.</p>



 ]]></description>
  <category>abstract</category>
  <category>formality</category>
  <category>notions of data</category>
  <category>open data</category>
  <category>publication</category>
  <category>science studies</category>
  <guid>https://zackbatist.info/posts/2024-10-01-informal-communication-and-archaeological-data-work.html</guid>
  <pubDate>Tue, 01 Oct 2024 04:00:00 GMT</pubDate>
</item>
<item>
  <title>New paper: Exploring collaborative practices in archaeological software development</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2024-07-18-exploring-collaborative-practices-in-archaeological-software-development.html</link>
  <description><![CDATA[ 





<p>I’m happy to announce that <a href="https://joeroe.io/">Joe Roe</a> and I just published a paper in Internet Archaeology that explores collaborative practices in archaeological open source software development. This paper has been in development for a while, and we’re glad to finally release our work.</p>
<ul>
<li>Version of record: <a href="https://doi.org/10.11141/ia.67.13">https://doi.org/10.11141/ia.67.13</a></li>
<li>Postprint: <a href="https://zackbatist.info/openarchaeo-collaboration">https://zackbatist.info/openarchaeo-collaboration</a></li>
<li>Research compendium: <a href="https://github.com/zackbatist/openarchaeo-collaboration">https://github.com/zackbatist/openarchaeo-collaboration</a></li>
<li>Zenodo archive: <a href="https://zenodo.org/records/12752060">https://zenodo.org/records/12752060</a></li>
</ul>
<p>To briefly summarize: we investigated the under-explored practices involved in research software engineering in archaeology, with an emphasis on collaborative experiences involved in open source software development. We identified not only what kinds of software archaeologists are making, but how archaeologists create these tools as part of a broader community of practice. We conducted exploratory data analysis and network analysis on data from <a href="https://open-archaeo.info/">open-archaeo</a>, supplemented with additional data pulled from the GitHub API, to trace how archaeologists use various languages, forges, licenses and supporting features (e.g.&nbsp;issues, stars, pull requests), and to discern trends regarding projects’ longevity, degree of community participation, and overall structure of collaborative ties.</p>
<p><strong>Open archaeology, open source? Collaborative practices in an emerging community of archaeological software engineers</strong></p>
<p>Surveying the first quarter-century of computer applications in archaeology, Scollar (1999) lamented that the field relied almost exclusively on “hand-me-down” tools repurposed from other disciplines. Twenty five years later, this is no longer the case: computational archaeologists often find themselves practicing the dual roles of data analyst and research software engineer (Baxter et al.&nbsp;2012; Schmidt and Marwick 2020), developing and applying new tools that are tailored specifically to archaeological problems and archaeological methods. Though this trend can be traced to the very earliest days of the field (Cowgill 1967), its most recent manifestation is distinguished by its apparent embrace of practices from free and open source software. Most prominently, since around 2015, there has been a rapid uptake of workflow tools designed for open source development communities, such as the version control system git and associated online source code management platforms (e.g.&nbsp;GitHub, GitLab). These tools facilitate collaboration among developers and users of open source software using patterns that can diverge quite radically from conventional scholarly norms (Tennant et al.&nbsp;2020).</p>
<p>In this paper, we investigate modes of collaboration in this emerging community of practice using ‘open-archaeo’, a curated list of archaeological software, and data on the activity of associated GitHub repositories and users. We conduct an exploratory quantitative analysis to characterize the nature and intensity of these collaborations and map the collaborative networks that emerge from them. We document uneven adoption of open source collaborative practices beyond the basic use of git as a version control system and GitHub to host source code. Most projects do make use of collaborative features and, through shared contributions, we can trace a collaborative network that includes the majority of archaeologists active on GitHub. However, a majority of repositories have 1–3 contributors, with only a few projects distinguished by an active and diverse developer base. Direct collaboration on code or other repository content (as opposed to the more passive, social media-style interaction that GitHub supports) remains very limited. In other words, there is little evidence that archaeologists’ adoption of open source tools (git and GitHub) has been accompanied by the decentralized, participatory forms of collaboration that characterise other open source communities. On the contrary, our results indicate that research software engineering in archaeology remains largely embedded in conventional professional norms and organizational structures of academia.</p>



 ]]></description>
  <category>abstract</category>
  <category>foss</category>
  <category>network analysis</category>
  <category>open archaeology</category>
  <category>open science</category>
  <category>open-archaeo</category>
  <category>science studies</category>
  <guid>https://zackbatist.info/posts/2024-07-18-exploring-collaborative-practices-in-archaeological-software-development.html</guid>
  <pubDate>Thu, 18 Jul 2024 04:00:00 GMT</pubDate>
</item>
<item>
  <title>Pruning open-archaeo</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2024-02-02-pruning-open-archaeo.html</link>
  <description><![CDATA[ 





<p>I’ve been working with Joe Roe to analyze open-archaeo to better understand the community of practice around archaeological software development. This prompted us to go back and remove records that are arguably out of scope. We identified a few dozen items that don’t really fit with our objective of assembling software made by archaeologists for archaeological purposes. Many of these are general-purpose tools that happen to be relevant to archaeological use-cases, and whose contributors and primary maintainers are largely, if not entirely, comprised of non-archaeologists (e.g.&nbsp;<a href="https://github.com/tschoonj/xraylib">xraylib</a>, <a href="https://github.com/opengisch/QField">QField</a>). Others deal with specific processes that form parts of methods from other disciplines that archaeologists have come to work with and rely on, such as genetics, ecology and earth science, but which seem too distant from archaeology to warrant inclusion in open-archaeo.</p>
<p>It’s a bit jarring to make such a big update — especially one that removes around 15% of the dataset — so soon after we published a <a href="https://doi.org/10.5334/joad.111">data paper</a> about it. However, that paper was meant to communicate the data collection methods, the data structure, and the rationale, purpose and value that underlie open-archaeo and its continued development. We have always been very clear and upfront that this is a live project, but it’s still a bit awkward trying to align our work with more traditional forms of scholarly communication that are suited for more stable outcomes than what continuous and open-source projects afford.</p>
<p>You can see the changes in <a href="https://github.com/zackbatist/open-archaeo/commit/c801a3ca46796c5099c33d8be7d08c9b44516a62">this git commit</a>. Don’t hesitate to get in touch if you have any questions or concerns. This is still a community effort and I could not do this without all of your contributions 🙂</p>



<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{batist2024,
  author = {Batist, Zack},
  title = {Pruning Open-Archaeo},
  date = {2024-02-02},
  url = {https://zackbatist.info/posts/2024-02-02-pruning-open-archaeo.html},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-batist2024" class="csl-entry quarto-appendix-citeas">
Batist, Zack. 2024. <span>“Pruning Open-Archaeo.”</span> February 2,
2024. <a href="https://zackbatist.info/posts/2024-02-02-pruning-open-archaeo.html">https://zackbatist.info/posts/2024-02-02-pruning-open-archaeo.html</a>.
</div></div></section></div> ]]></description>
  <category>open-archaeo</category>
  <guid>https://zackbatist.info/posts/2024-02-02-pruning-open-archaeo.html</guid>
  <pubDate>Fri, 02 Feb 2024 05:00:00 GMT</pubDate>
</item>
<item>
  <title>ArcheoFOSS XVII</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2023-12-17-archeofoss-xvii/</link>
  <description><![CDATA[ 





<p>This week I participated in ArcheoFOSS in Turin, Italy. I’ve always been keen to present at this conference but somehow never really felt I had much important to say (aside from open-archaeo stuff, but Joe Roe and I already presented about it at the <a href="https://github.com/zackbatist/caa2021-openarchaeo">2021 CAA conference</a>, and more detailed analysis is still in the works). But this year Joe and I took the opportunity to co-lead a panel on archaeology and the fediverse based on our experiences administrating and moderating the archaeo-social mastodon instance. Our panel was meant to highlight key challenges and opportunities for collectively-owned and community-led scholarly social media, and while it only consisted of a few papers, it definitely got the ball rolling on further critical discussion regarding the role of the fediverse and decentralized communication protocols in online archaeological discourse. Joe and I are initiating work on a position paper that assembles the main ideas presented during the panel and subsequent discussion, so stay tuned for more on that. In the meantime, you can access our introductory remarks on <a href="https://doi.org/10.5281/zenodo.10362684">zenodo</a> and <a href="https://github.com/archaeo-social/archeofoss23_fediverse">github</a>.</p>
<p>I also presented a paper on the challenges I experienced integrating and reusing data during my Master’s thesis, which I completed 8 years ago, basically summarizing its failures (trying to channel Shawn Graham’s <a href="https://doi.org/10.31356/dpb015">Failing Gloriously</a>). This basically served as a venue for finally presenting my long-held yet unpublished uncertainties about the value of analyses that integrate legacy data, drawn from my personal experiences.</p>
<p>I really appreciated how low-key and relaxing the conference was. It was great to just have a casual experience with a relatively small group of like-minded researchers. I was very fortunate to be able to travel to Turin and participate in person. I also went on a nice post-conference excursion to Genoa, which is a truly lovely city. Thanks to Stefano Costa for informing me about the best places to visit and eat!</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="IMG_1542-1200x1600.png" class="lightbox" data-gallery="quarto-lightbox-gallery-1" title="Mole Antonelliana, Turin"><img src="https://zackbatist.info/posts/2023-12-17-archeofoss-xvii/IMG_1542-1200x1600.png" class="img-fluid figure-img" alt="Mole Antonelliana, Turin"></a></p>
<figcaption>Mole Antonelliana, Turin</figcaption>
</figure>
</div>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="IMG_1556-2048x1536.png" class="lightbox" data-gallery="quarto-lightbox-gallery-2" title="Po River, Turin"><img src="https://zackbatist.info/posts/2023-12-17-archeofoss-xvii/IMG_1556-2048x1536.png" class="img-fluid figure-img" alt="Po River, Turin"></a></p>
<figcaption>Po River, Turin</figcaption>
</figure>
</div>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="IMG_1566-2048x1536.png" class="lightbox" data-gallery="quarto-lightbox-gallery-3" title="Po River, Turin"><img src="https://zackbatist.info/posts/2023-12-17-archeofoss-xvii/IMG_1566-2048x1536.png" class="img-fluid figure-img" alt="Po River, Turin"></a></p>
<figcaption>Po River, Turin</figcaption>
</figure>
</div>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="IMG_1582-2048x975.jpg" class="lightbox" data-gallery="quarto-lightbox-gallery-4" title="Sunset in Genoa"><img src="https://zackbatist.info/posts/2023-12-17-archeofoss-xvii/IMG_1582-2048x975.jpg" class="img-fluid figure-img" alt="Sunset in Genoa"></a></p>
<figcaption>Sunset in Genoa</figcaption>
</figure>
</div>



 ]]></description>
  <category>archaeo-social</category>
  <category>archeoFOSS</category>
  <category>conference</category>
  <category>DObsiss</category>
  <category>fediverse</category>
  <category>open archaeology</category>
  <guid>https://zackbatist.info/posts/2023-12-17-archeofoss-xvii/</guid>
  <pubDate>Sun, 17 Dec 2023 05:00:00 GMT</pubDate>
</item>
<item>
  <title>Finished my dissertation!</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2023-09-24-finished-my-dissertation.html</link>
  <description><![CDATA[ 





<p>I finally defended my doctoral dissertation a few weeks ago, and after 7 years I’m happy to put it out into the world: <a href="https://doi.org/10.5281/zenodo.8373390">https://doi.org/10.5281/zenodo.8373390</a></p>
<p>To briefly summarize: I observed and interviewed archaeologists while they worked, focusing on how they collaborate to produce information commons within relatively small, bounded communities. I relate these observations to issues experienced when sharing data globally on the web using open data platforms. This is part of an effort to reorient data sharing (and other aspects of open science) as a social, collaborative, communicative, and commensal experience.</p>
<p>Many thanks to my supervisor, Costis Dallas, for being such a great mentor, and to Matt Ratto and Ted Banning for their constant constructive feedback. And special thanks to the external examiners, Jeremy Huggett and Ed Swenson, for critically engaging with my work.</p>
<p><strong>Archaeological data work as continuous and collaborative practice</strong></p>
<p>This dissertation critically examines the sociotechnical structures that archaeologists rely on to coordinate their research and manage their data. I frame data as discursive media that communicate archaeological encounters, which enable archaeologists to form productive collaboration relationships. All archaeological activities involve data work, as archaeologists simultaneously account for the decisions and circumstances that framed the information they rely on to perform their own practices, while anticipating how their information outputs will be used by others in the future. All archaeological activities are therefore loci of practical epistemic convergence, where meanings are negotiated in relation to communally-held objectives.</p>
<p>Through observations of and interviews with archaeologists at work, and analysis of the documents they produce, I articulate how data sharing relates distributed work experiences as part of a continuum of practice. I highlight the assumptions and value regimes that underlie the social and technical structures that support productive archaeological work, and draw attention to the inseparable relationship between the management of labour and data. I also relate this discursive view of data sharing to the open data movement, and suggest that it is necessary to develop new collaborative commitments pertaining to data publication and reuse that are more in line with disciplinary norms, expectations, and value regimes.</p>



 ]]></description>
  <category>dissertation</category>
  <category>open archaeology</category>
  <category>open data</category>
  <category>open science</category>
  <category>science studies</category>
  <guid>https://zackbatist.info/posts/2023-09-24-finished-my-dissertation.html</guid>
  <pubDate>Sun, 24 Sep 2023 04:00:00 GMT</pubDate>
</item>
<item>
  <title>open-archaeo data paper</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2023-09-11-open-archaeo-data-paper/</link>
  <description><![CDATA[ 





<p>Today, <a href="https://joeroe.io/">Joe Roe</a> and I published a data paper in the Journal of Open Archaeology Data on <a href="https://github.com/zackbatist/open-archaeo">open-archaeo</a>, the comprehensive list of open source archaeological software and resources that we maintain. In this paper, we outline the data collection methods and conceptual model, and highlight open-archaeo’s value as a public resource and as a dataset for examining the emerging community of practice surrounding open source software development in research contexts. In fact, open-archaeo serves as the basis for an extended dataset in a study we are currently working on (investigating collaborative coding experiences) and we think there is a lot of potential for additional analysis in the future.</p>
<p><strong>Open-archaeo: A Resource for Documenting Archaeological Software Development Practices</strong><br>
<a href="https://doi.org/10.5334/joad.111">https://doi.org/10.5334/joad.111</a></p>
<p>Open-archaeo (https://open-archaeo.info) is a comprehensive list of open software and resources created by and for archaeologists. It is a living collection—itself an open project—which as of writing includes 548 entries and associated metadata. Open-archaeo documents what kinds of software and resources archaeologists have produced, enabling further investigation of research software engineering and digital peer-production practices in the discipline, both under-explored aspects of archaeological research practice.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="open-archaeo-CM.png" class="lightbox" data-gallery="quarto-lightbox-gallery-1" title="Conceptual model documenting relationships between data recorded in open-archaeo and other relevant information in the source material and elsewhere on the web."><img src="https://zackbatist.info/posts/2023-09-11-open-archaeo-data-paper/open-archaeo-CM.png" class="img-fluid figure-img" alt="Conceptual model documenting relationships between data recorded in open-archaeo and other relevant information in the source material and elsewhere on the web."></a></p>
<figcaption>Conceptual model documenting relationships between data recorded in open-archaeo and other relevant information in the source material and elsewhere on the web.</figcaption>
</figure>
</div>



 ]]></description>
  <category>foss</category>
  <category>open archaeology</category>
  <category>open-archaeo</category>
  <category>publication</category>
  <guid>https://zackbatist.info/posts/2023-09-11-open-archaeo-data-paper/</guid>
  <pubDate>Mon, 11 Sep 2023 04:00:00 GMT</pubDate>
</item>
<item>
  <title>Some thoughts on data formality</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2023-06-13-some-thoughts-on-data-formality.html</link>
  <description><![CDATA[ 





<p>I’m using this post to draw out some thoughts that I feel are coherent in my mind but I struggle to communicate in writing. The general topic is the notion of formality, and how it is expressed in data work and data records.</p>
<p>In a management sense, formality involves adhering to standard protocol. It involves checking all the boxes, sticking to the book, and ensuring that behaviour conforms to institutional expectations. In this way, bureaucracy is the essence of formality. By extension, formality is a means through which power is expressed, in that it binds interactions to a certain set of acceptable possibilities. In effect, formality renders individual actions in subservience to a broader system of control.</p>
<p>But formality is also useful. Formality reduces friction involved in transforming and transmitting information across contexts. Any application that implements a formal standard can access and transmit information according to the standard, which reduces cognitive overhead on the part of actors responsible for processing information. Formal standards relocate creative agency upstream, towards managers of data and of labour, who make decisions regarding how other actors (human and non-human alike) may interact with the system before those interactions ever occur. This basically manifests itself in workflows, which are essentially disciplined ways of working directed towards targeted outcomes (I wrote about workflows <a href="https://doi.org/10.1515/opar-2020-0217">in a 2021 paper</a> and in my dissertation, which draws from <a href="https://doi.org/10.17613/zdmt-0a89">Bill Caraher’s contribution</a> to <a href="https://escholarship.org/uc/item/0vh9t9jq">Critical Archaeology in the Digital Age</a>, among other work he’s written on the topic). To be clear, I do not mean to imply that adopting workflows constitutes a negative act. An independent scholar may apply a workflow to help achieve their goals more effectively and efficiently, which empowers them to get the most out of the resources at their disposal. However, one of the key findings from my dissertation is that when applied in collective enterprises, workflows tend to genericize labour and data for the purpose of extraction and appropriation, which is understood to be an ordinary aspect of archaeological research, as is evident from how actors performing genericized labour internalize this as part of their work role.</p>
<p>Developing a workflow essentially entails adopting and enforcing protocols and formats, which are series of documented norms and expectations that ensure that information may be made interchangeable. Protocols are standards that dictate means of direct communication, and formats are standards that dictate how information should be stored. Forms are interfaces through which information is translated from real-world experiences into standardized formats.</p>
<p>Formal data are information whose variables and values are arranged according to a formally-defined schema. A formal dataset comprises a series of records collated in a consistent manner, motivated by a need, desire or warrant to render them comparable. The formally-defined schema makes this potential for comparison much easier. A common means of representing formal data is through tables, which are comprised of rows and columns. Each row represents a record, and each column a variable that describes a facet of each record. The values recorded for each variable constitute observations or descriptive characterizations pertaining to the object of each record. One can therefore determine what kinds of structured observations were made about a recorded object by finding the values located at the intersection of records and variables (i.e., individual cells in a table). Each record relates to a set of variables applied to the whole set and documented in the schema.</p>
<p>At its most extreme, formality entails a realm of total control, where all information is collected and processed according to an all-encompassing model of the world. It is not coincidental that models are the primary outlooks through which both managers and computer systems engage with the world. It has been the dream of bureaucrats and computer scientists alike to develop such systems (see the work of <a href="https://en.wikipedia.org/wiki/Paul_Otlet">Paul Otlet</a>, <a href="https://en.wikipedia.org/wiki/Vannevar_Bush">Vannevar Bush</a>, and <a href="https://forum.zettelkasten.de/discussion/comment/7689/#Comment_7689">the weirdly techno-libertarian crowd associated with structured note-taking and personal knowledge management</a>). Nor is it a coincidence that formality is a requisite aspect of both bureaucracies and computers. Computational environments and bureaucracies both effectively capture and maintain institutional power dynamics.</p>
<p>In some cases, such as with text boxes, the variable may be precisely defined but the values are left open-ended. However, users are still expected to provide certain kinds of information in these fields (I remember that at a conference in 2019, <a href="https://istohuvila.se/">Isto Huvila</a> (whose work on archaeological records management is also a great source of inspiration) referred to these as “white boxes”, which conveys their literal appearance and is a clever and ironic play on words referring to the notion of “black boxes” that hide the intricate details of a process behind an opaque connotative entity). In this sense, the standards are mediated by social and professional norms, but exist nonetheless. This reflects the fact that social and professional norms, standards, and expectations will never go away; they are fundamental aspects of communication and participation within communities.</p>
<p>The million-dollar question nowadays (at least in my own mind) is this: how can we create information infrastructures that strike a balance between the need to transmit information succinctly between computers via the web, and the capability to share context and subtext whose significance originates in the gaps between recorded information and which gain meaning only in relation to the shared experience of communicating agents as members of a social or professional community?</p>



 ]]></description>
  <category>formality</category>
  <category>infrastructure</category>
  <category>notions of data</category>
  <category>open data</category>
  <category>workflows</category>
  <guid>https://zackbatist.info/posts/2023-06-13-some-thoughts-on-data-formality.html</guid>
  <pubDate>Tue, 13 Jun 2023 04:00:00 GMT</pubDate>
</item>
<item>
  <title>Recap: Digital Archaeology Bern 2023</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2023-02-06-recap-digital-archaeology-bern-2023/</link>
  <description><![CDATA[ 





<p>Last week I travelled to Switzerland to participate in <a href="https://dab23.archaeological.science/">Digital Archaeology Bern (2023)</a>. The conference was themed “advancing open research into the next decade” and served as a way to take stock of developments since the <a href="https://www.tandfonline.com/toc/rwar20/44/4">2012 World Archaeology Special Issue on Open Archaeology</a> and Ben Marwick’s influential 2017 paper <a href="https://doi.org/10.1007/s10816-015-9272-9">Computational Reproducibility in Archaeological Research</a>, which came out 10 and 5 years ago, respectively. I think that the conference was a remarkable success, and all 50-60 participants were actively engaged in critical discussions on what it means to do open archaeology. You can find my slides and presentation notes on <a href="https://github.com/zackbatist/DAB23">GitHub</a>.</p>
<p>Although there were some elements of this, the conference was not just superficial open-boosting. Most, if not all, participants highlighted challenges and unanticipated implications of being open that they have recently experienced. Looking back, a few themes stood out:</p>
<ul>
<li>Thinking about the value proposition that openness entails, which necessarily involves accounting for specific use cases and imagined future stakeholders.</li>
<li>Thinking about the needs and values of all stakeholders involved in doing archaeology, including local and Indigenous communities, land-owners, archivists, government agencies, and related parties, and what openness means for them.</li>
<li>Thinking about how we might reconcile our values as archaeologists with the values demanded and afforded by the infrastructures and communities with whom we must work.</li>
</ul>
<p>I got to meet so many interesting people. I already knew many of them from social media, virtually-hosted talks, or brief in-person interactions at the CAA back in 2018, and it was really great to put a face to each person’s name. Most serious work in digital archaeology, especially productive work developing open data infrastructures, is being done in Europe, and I was very grateful to have this opportunity to connect with that crowd (especially since I’m currently entering the post-PhD academic job market). I think my paper was well-received and valued, and it opened the door to many interesting discussions during the breaks between sessions and elsewhere.</p>
<p>I was also able to tack on a couple days at the start to work with Joe Roe on an article we’ve been writing for the better part of 3 years, about collaborative aspects of open source software development among archaeologists. <a href="https://github.com/zackbatist/caa2021-openarchaeo">We presented a paper at the 2021 CAA conference</a> on the composition of <a href="http://open-archaeo.info/">open-archaeo</a>, the list of open source software and resources made by and for archaeology that I maintain, and we’re trying to expand on it a little bit more with some network analysis type stuff. So this time together really gave us an opportunity to discuss what we really want out of the paper, to actually talk through the results, and generally helped motivate us to get this done. We still have some work cut out for us, but that probably warrants its own blog post.</p>
<p>Anyway, here are some cool pictures from the trip!</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="IMG_1118-1200x1600.jpg" class="lightbox" data-gallery="quarto-lightbox-gallery-1" title="View of the street from my hostel"><img src="https://zackbatist.info/posts/2023-02-06-recap-digital-archaeology-bern-2023/IMG_1118-1200x1600.jpg" class="img-fluid figure-img" alt="View of the street from my hostel"></a></p>
<figcaption>View of the street from my hostel</figcaption>
</figure>
</div>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="396594BF-8D85-4415-BAE1-845284D219CC-1200x1500.jpg" class="lightbox" data-gallery="quarto-lightbox-gallery-2" title="Lithics on display at the Laténium"><img src="https://zackbatist.info/posts/2023-02-06-recap-digital-archaeology-bern-2023/396594BF-8D85-4415-BAE1-845284D219CC-1200x1500.jpg" class="img-fluid figure-img" alt="Lithics on display at the Laténium"></a></p>
<figcaption>Lithics on display at the Laténium</figcaption>
</figure>
</div>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="5E5B8638-7A9A-4764-8F19-7C3C864C3132-2048x1536.jpg" class="lightbox" data-gallery="quarto-lightbox-gallery-3" title="Some stuff I saw when I got lost one morning"><img src="https://zackbatist.info/posts/2023-02-06-recap-digital-archaeology-bern-2023/5E5B8638-7A9A-4764-8F19-7C3C864C3132-2048x1536.jpg" class="img-fluid figure-img" alt="Some stuff I saw when I got lost one morning"></a></p>
<figcaption>Some stuff I saw when I got lost one morning</figcaption>
</figure>
</div>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="IMG_1121-2048x942.jpg" class="lightbox" data-gallery="quarto-lightbox-gallery-4" title="View of the city from a perch near the university"><img src="https://zackbatist.info/posts/2023-02-06-recap-digital-archaeology-bern-2023/IMG_1121-2048x942.jpg" class="img-fluid figure-img" alt="View of the city from a perch near the university"></a></p>
<figcaption>View of the city from a perch near the university</figcaption>
</figure>
</div>



 ]]></description>
  <category>conference</category>
  <category>DAB23</category>
  <category>open archaeology</category>
  <category>open data</category>
  <category>open-archaeo</category>
  <guid>https://zackbatist.info/posts/2023-02-06-recap-digital-archaeology-bern-2023/</guid>
  <pubDate>Mon, 06 Feb 2023 05:00:00 GMT</pubDate>
</item>
<item>
  <title>I like LaTeX</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2023-01-10-i-like-latex.html</link>
  <description><![CDATA[ 





<p>So I think I finally understand LaTeX. Of course there’s still a lot for me to learn, but I think I’m at a point where I am really harnessing its true value.</p>
<p>I’ve been working with plaintext since I started writing my dissertation. Until very recently my workflow closely resembled an RMarkdown setup, and largely corresponded with <a href="https://github.com/benmarwick/atom-for-scholarly-writing-with-markdown">this guide written by Ben Marwick</a>. My simple understanding is that Pandoc passes the Markdown through LaTeX to produce a viable PDF, which makes it possible to scatter LaTeX throughout the content and in the YAML front matter. So I had a mish-mash of both Markdown and LaTeX conventions in most of my documents. For instance, I was using the comprehensive <a href="https://ctan.org/pkg/graphicx?lang=en">graphicx</a> package to render my figures and I was referring to endnotes stored in a separate tex file using the <a href="https://ctan.org/pkg/sepfootnotes?lang=en">sepfootnotes</a> package, all while using <a href="https://pandoc.org/MANUAL.html#citations">Pandoc’s citeproc</a> bibliographic referencing system. Eventually I came to realize that my workflow was fundamentally built upon LaTeX, or comprises functions that closely correlate with common LaTeX macros.</p>
<p>I worked using this hybrid Markdown/LaTeX setup for years, until last week when I was prompted by a member of my supervisory committee to compile a unified document so he could better understand the flow of my thesis. I had anticipated that I would need to convert everything to pure LaTeX at some point, and it was as good a time as any.</p>
<p>Previously, I had mashed together code from various Stack Overflow posts in order to generate something functional, but it’s a completely different experience when you start from scratch. I started by following the Overleaf guide on <a href="https://www.overleaf.com/learn/latex/How_to_Write_a_Thesis_in_LaTeX_(Part_1)%3A_Basic_Structure">how to structure a thesis using LaTeX</a>, and now I have a very robust and elegant setup that compiles a single, tidy, and systematic PDF from multiple sources. Here’s the current state of my main tex file:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode latex code-with-copy"><code class="sourceCode latex"><span id="cb1-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% Page layout %</span></span>
<span id="cb1-2"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">\documentclass</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">report</span>}</span>
<span id="cb1-3"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">\usepackage</span>[margin=1in]{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">geometry</span>}</span>
<span id="cb1-4"></span>
<span id="cb1-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% Line and paragraph spacing %</span></span>
<span id="cb1-6"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">\usepackage</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">setspace</span>}</span>
<span id="cb1-7"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\doublespacing</span></span>
<span id="cb1-8"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\setlength</span>{<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\lineskip</span>}{3.5pt}</span>
<span id="cb1-9"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\setlength</span>{<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\lineskiplimit</span>}{2pt}</span>
<span id="cb1-10"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\setlength</span>{<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\parindent</span>}{20pt}</span>
<span id="cb1-11"></span>
<span id="cb1-12"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% Table of contents %</span></span>
<span id="cb1-13"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">\usepackage</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">tocloft</span>}</span>
<span id="cb1-14"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\setlength</span>{<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\cftbeforesecskip</span>}{4pt}</span>
<span id="cb1-15"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">\usepackage</span>[page,titletoc,title]{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">appendix</span>}</span>
<span id="cb1-16"></span>
<span id="cb1-17"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% Verbatim %</span></span>
<span id="cb1-18"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">\usepackage</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">fvextra</span>}</span>
<span id="cb1-19"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\DefineVerbatimEnvironment</span>{Highlighting}{Verbatim}{breaklines,commandchars=<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\\\{\}</span>,breaksymbol=}</span>
<span id="cb1-20"></span>
<span id="cb1-21"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% Data references %</span></span>
<span id="cb1-22"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">\usepackage</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">sepfootnotes</span>}</span>
<span id="cb1-23"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\newendnotes</span>{x}</span>
<span id="cb1-24"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\newendnotes</span>{y}</span>
<span id="cb1-25"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\newendnotes</span>{z}</span>
<span id="cb1-26"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\input</span>{refs/XXXX-refs.tex}</span>
<span id="cb1-27"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\input</span>{refs/YYYY-refs.tex}</span>
<span id="cb1-28"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\input</span>{refs/ZZZZ-refs.tex}</span>
<span id="cb1-29"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\input</span>{refs/misc-refs.tex}</span>
<span id="cb1-30"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\renewcommand\thexnote</span>{A<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\arabic</span> {xnote}}</span>
<span id="cb1-31"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\renewcommand\theynote</span>{B<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\arabic</span> {ynote}}</span>
<span id="cb1-32"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\renewcommand\theznote</span>{C<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\arabic</span> {znote}}</span>
<span id="cb1-33"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">\usepackage</span>[multiple]{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">footmisc</span>}</span>
<span id="cb1-34"></span>
<span id="cb1-35"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% Figures and text boxes %</span></span>
<span id="cb1-36"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">\usepackage</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">graphicx</span>}</span>
<span id="cb1-37"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\graphicspath</span>{{images/}}</span>
<span id="cb1-38"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">\usepackage</span>[font=footnotesize,labelfont=bf]{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">caption</span>}</span>
<span id="cb1-39"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">\usepackage</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">float</span>}</span>
<span id="cb1-40"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">\usepackage</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">mdframed</span>}</span>
<span id="cb1-41"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\mdfdefinestyle</span>{quotes}{</span>
<span id="cb1-42">linecolor=black,linewidth=1pt,</span>
<span id="cb1-43">leftmargin=1cm,rightmargin=1cm,</span>
<span id="cb1-44">skipabove=12pt,skipbelow=12pt</span>
<span id="cb1-45">}</span>
<span id="cb1-46"></span>
<span id="cb1-47"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% Block quotes %</span></span>
<span id="cb1-48"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">\usepackage</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">csquotes</span>}</span>
<span id="cb1-49"></span>
<span id="cb1-50"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% Tables %</span></span>
<span id="cb1-51"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">\usepackage</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">tabularx</span>}</span>
<span id="cb1-52"></span>
<span id="cb1-53"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% Epigraph %</span></span>
<span id="cb1-54"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">\usepackage</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">epigraph</span>}</span>
<span id="cb1-55"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\setlength</span>{<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\epigraphwidth</span>}{4in}</span>
<span id="cb1-56"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\renewcommand</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">\textflush</span>}{flushright}</span>
<span id="cb1-57"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\renewcommand</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">\epigraphsize</span>}{<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\footnotesize</span>}</span>
<span id="cb1-58"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\setlength\epigraphrule</span>{1pt}</span>
<span id="cb1-59"></span>
<span id="cb1-60"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% Bibliographic citations %</span></span>
<span id="cb1-61"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">\usepackage</span>[</span>
<span id="cb1-62">citestyle=authoryear,</span>
<span id="cb1-63">bibstyle=authoryear,</span>
<span id="cb1-64">maxcitenames=2,</span>
<span id="cb1-65">maxbibnames=99,</span>
<span id="cb1-66">uniquelist=false,</span>
<span id="cb1-67">date=year,</span>
<span id="cb1-68">url=false</span>
<span id="cb1-69">]{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">biblatex</span>}</span>
<span id="cb1-70"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\addbibresource</span>{/Users/zackbatist/Dropbox/zotero/zack.bib}</span>
<span id="cb1-71"></span>
<span id="cb1-72"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\renewcommand*</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">\postnotedelim</span>}{<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\addcolon\space</span>}</span>
<span id="cb1-73"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\DeclareFieldFormat</span>{postnote}{#1}</span>
<span id="cb1-74"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\DeclareFieldFormat</span>{multipostnote}{#1}</span>
<span id="cb1-75"></span>
<span id="cb1-76"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% Hyperlinks %</span></span>
<span id="cb1-77"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">\usepackage</span>[colorlinks=true]{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">hyperref</span>}</span>
<span id="cb1-78"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\hypersetup</span>{</span>
<span id="cb1-79">linkcolor=black,</span>
<span id="cb1-80">citecolor=blue,</span>
<span id="cb1-81">urlcolor=blue</span>
<span id="cb1-82">}</span>
<span id="cb1-83"></span>
<span id="cb1-84"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% Title page %</span></span>
<span id="cb1-85"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\title</span>{</span>
<span id="cb1-86">{Thesis Title}<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\\</span></span>
<span id="cb1-87">{<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\large</span> University of Toronto}<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\\</span></span>
<span id="cb1-88">{<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\normalsize</span> Faculty of Information}<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\\</span></span>
<span id="cb1-89"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% {\includegraphics{university.jpg}}</span></span>
<span id="cb1-90">}</span>
<span id="cb1-91"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\author</span>{Zack Batist}</span>
<span id="cb1-92"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\date</span>{<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\today</span>}</span>
<span id="cb1-93"></span>
<span id="cb1-94"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\begin</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">document</span>}</span>
<span id="cb1-95"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\maketitle</span></span>
<span id="cb1-96"></span>
<span id="cb1-97"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% Front matter %</span></span>
<span id="cb1-98"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% \chapter*{Abstract}</span></span>
<span id="cb1-99"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% \chapter*{Dedication}</span></span>
<span id="cb1-100"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% \chapter*{Declaration}</span></span>
<span id="cb1-101"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% \chapter*{Acknowledgements}</span></span>
<span id="cb1-102"></span>
<span id="cb1-103"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\tableofcontents</span></span>
<span id="cb1-104"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\listoffigures</span></span>
<span id="cb1-105"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\listoftables</span></span>
<span id="cb1-106"></span>
<span id="cb1-107"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% Content %</span></span>
<span id="cb1-108"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\chapter</span>{Introduction}</span>
<span id="cb1-109"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\input</span>{chapters/introduction.tex}</span>
<span id="cb1-110"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\chapter</span>{Notions of Archaeological Data}</span>
<span id="cb1-111"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\input</span>{chapters/notions-of-archaeological-data.tex}</span>
<span id="cb1-112"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\chapter</span>{Theories of Discursive Action}</span>
<span id="cb1-113"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\input</span>{chapters/theories-of-discursive-action.tex}</span>
<span id="cb1-114"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\chapter</span>{Methods}</span>
<span id="cb1-115"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\input</span>{chapters/methods.tex}</span>
<span id="cb1-116"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\chapter</span>{Social Worlds}</span>
<span id="cb1-117"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\input</span>{chapters/social-worlds.tex}</span>
<span id="cb1-118"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\chapter</span>{Sites of Discursive Negotiation}</span>
<span id="cb1-119"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\input</span>{chapters/sites-of-discursive-negotiation.tex}</span>
<span id="cb1-120"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\chapter</span>{Sociotechnical Tensions Relating to Data}</span>
<span id="cb1-121"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\input</span>{chapters/sociotechnical-tensions-relating-to-data.tex}</span>
<span id="cb1-122"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\chapter</span>{Discussion / Future Directions}</span>
<span id="cb1-123">(in progress)</span>
<span id="cb1-124"></span>
<span id="cb1-125"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% Bibliography %</span></span>
<span id="cb1-126"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\printbibliography</span>[heading=bibintoc]</span>
<span id="cb1-127"></span>
<span id="cb1-128"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% Appendices %</span></span>
<span id="cb1-129"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\begin</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">appendices</span>}</span>
<span id="cb1-130"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\chapter</span>{Summary of Code System}</span>
<span id="cb1-131"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\chapter</span>{Data Management Protocols}</span>
<span id="cb1-132"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\chapter</span>{Open Data Supplement}</span>
<span id="cb1-133"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\section</span>{Case A}</span>
<span id="cb1-134"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\thexnotes</span></span>
<span id="cb1-135"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\section</span>{Case B}</span>
<span id="cb1-136"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\theynotes</span></span>
<span id="cb1-137"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\section</span>{Case C}</span>
<span id="cb1-138"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">\theznotes</span></span>
<span id="cb1-139"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\end</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">appendices</span>}</span>
<span id="cb1-140"></span>
<span id="cb1-141"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">\end</span>{<span class="ex" style="color: null;
background-color: null;
font-style: inherit;">document</span>}</span></code></pre></div></div>
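<p>For anyone puzzling over the “Data references” part of that preamble, here is a minimal sketch of how I understand the sepfootnotes pattern to work. The key <code>interview-01</code> and the note text are made up for illustration, and in practice the <code>\xnotecontent</code> line would live in a separate file that gets pulled in with <code>\input</code>.</p>
<div class="sourceCode"><pre class="sourceCode latex"><code class="sourceCode latex">\documentclass{report}
\usepackage{sepfootnotes}

% Declare an endnote series with the prefix "x"; as I read the package
% documentation, this provides \xnote, \xnotecontent and \thexnotes
\newendnotes{x}

% Number the marks A1, A2, ... to match the preamble above
\renewcommand\thexnote{A\arabic{xnote}}

% Store a note under a key (hypothetical key and text; normally kept
% in a separate refs file loaded with \input)
\xnotecontent{interview-01}{Placeholder text describing a data source.}

\begin{document}

\chapter{Methods}
% Refer to the stored note by its key
Participants described their recording habits in detail.\xnote{interview-01}

\chapter{Open Data Supplement}
\section{Case A}
% Print all notes collected in the x series
\thexnotes

\end{document}</code></pre></div>
<p>If this behaves the way I expect, the mark in the text comes out as A1 and the note itself only appears where <code>\thexnotes</code> is called, which is why the appendix in my main file has one section per series.</p>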
<p>A large part of this work involves figuring out trends in the package ecosystem. I really struggled to differentiate between the various packages for bibliographic formatting, footnotes, figures and floats and their cross-compatibility. Some packages seem to be developed according to common tendencies, sort of like the R Tidyverse (in the sense of holding a generally common syntax, not in terms of cult behaviour), based around central cores such as <a href="https://ctan.org/pkg/biblatex?lang=en">biblatex</a> and <a href="https://ctan.org/pkg/hyperref?lang=en">hyperref</a>.</p>
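<p>As a small illustration of how settings get layered on top of one of these central cores, the sketch below isolates the biblatex postnote tweaks from my preamble. The bibliography entry is a placeholder written into a throwaway <code>demo.bib</code> file, not a real reference, and the sketch assumes a reasonably recent LaTeX installation with biber as the backend.</p>
<div class="sourceCode"><pre class="sourceCode latex"><code class="sourceCode latex">% Write a placeholder bibliography file next to the document
\begin{filecontents}{demo.bib}
@book{placeholder2020,
  author    = {Placeholder, Pat},
  title     = {A Made-Up Book},
  year      = {2020},
  publisher = {Example Press}
}
\end{filecontents}

\documentclass{report}
\usepackage[citestyle=authoryear,bibstyle=authoryear]{biblatex}
\addbibresource{demo.bib}

% Separate the page reference from the year with a colon
\renewcommand*{\postnotedelim}{\addcolon\space}
% Print the postnote as given, without an automatic page prefix
\DeclareFieldFormat{postnote}{#1}

\begin{document}
A claim with a page reference \parencite[45]{placeholder2020}.

\printbibliography
\end{document}</code></pre></div>
<p>With these settings the citation should come out roughly as (Placeholder 2020: 45), rather than with the comma and page prefix that the authoryear style uses by default.</p>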
<p>I’ve also been using LaTeX to format <a href="../../cv.pdf">my CV</a>, and today I started using it to create slides <a href="https://dab23.archaeological.science/abstracts/batist/">for an upcoming conference presentation</a>. I like how programmatic the CV feels: you script a macro once, call it along with the text you want it to parse, and you get a pretty little table of all your career accomplishments! I’m still kind of undecided about using beamer for conference slides, but I do like how it encourages me to write concurrently as I create the slides.</p>
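<p>To give a sense of what scripting a macro once can look like, here is a stripped-down sketch. The macro name <code>\cvrow</code> and the placeholder rows are hypothetical, and this is not the code behind my actual CV.</p>
<div class="sourceCode"><pre class="sourceCode latex"><code class="sourceCode latex">\documentclass{article}
\usepackage{tabularx}

% Hypothetical helper: typeset one CV row as "dates | description"
\newcommand{\cvrow}[2]{#1 &amp; #2 \\[4pt]}

\begin{document}

\section*{Education}
\noindent
% Fixed-width date column, then a column that fills the remaining width
\begin{tabularx}{\textwidth}{@{}l X@{}}
  \cvrow{20XX--20YY}{Degree, Institution (placeholder)}
  \cvrow{20XX--20YY}{Another degree, another institution (placeholder)}
\end{tabularx}

\end{document}</code></pre></div>
<p>The appeal is that the layout lives in one place: changing the spacing or column order in <code>\cvrow</code> restyles every row at once.</p>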



 ]]></description>
  <category>dissertation</category>
  <category>latex</category>
  <category>plain text</category>
  <category>writing</category>
  <guid>https://zackbatist.info/posts/2023-01-10-i-like-latex.html</guid>
  <pubDate>Tue, 10 Jan 2023 05:00:00 GMT</pubDate>
</item>
<item>
  <title>Comments on ‘The rise and fall of peer review’</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2022-12-17-comments-on-the-rise-and-fall-of-peer-review.html</link>
  <description><![CDATA[ 





<p><a href="https://experimentalhistory.substack.com/p/the-rise-and-fall-of-peer-review">A substack post about peer review</a> is getting a lot of attention, and I’m here to rant about it. Basically the post is calling out the peer review process as a terrible and broken system. And it is. But the author’s rhetoric about it is kind of problematic.</p>
<p><strong>1. Peer review is not an experiment.</strong></p>
<p>The author claims that it is, but contradicts himself straight away:</p>
<blockquote class="blockquote">
<p>The experimental design wasn’t great; there was no randomization and no control group. Nobody was in charge, exactly, and nobody was really taking consistent measurements. And yet it was the most massive experiment ever run, and it included every scientist on Earth.</p>
</blockquote>
<p>These are not just things that make an experiment bad, they are things that preclude peer review from being an experiment altogether. Experiments are run on samples, they are run with intent, they are performed in controlled environments. As someone who calls himself an experimental psychologist and who calls his blog “experimental history”, he really extends the term experiment in weird ways. This use of the term is like referring to the “experiment of democracy”, basically just grand rhetoric for “we’re figuring things out and learning as we go”.</p>
<p>The author also seems to think of the experiment of peer review as a pass/fail test, which, again, is not what experiments are for. He sets bars for what successful scientific evaluation ought to look like, and measures his experiences of peer review against them. But this is not an experiment; this is qualitative assessment. There’s nothing wrong with that, but it’s troubling how the author wraps his proclamation that peer review is bad and should be abolished within some phony hearkening to science-core.</p>
<p><strong>2. General discourse is not enough to validate truth statements.</strong></p>
<p>Various parts of the post indicate that the author considers science to be the evaluation of statements of truth, which can only be verified by their fidelity to observed reality. Ok, fair enough. But he refers to Einstein’s large body of non-reviewed work as an argument for relying on discourse among educated fellows as an efficient way of evaluating the quality of scientific work. Einstein’s apparent genius is cemented in popular imagination, <a href="https://www.inverse.com/science/what-einstein-got-wrong">but he also happened to be wrong about some things</a>, and this is not reason enough to abolish peer review. Moreover, the author does not consider that the general acceptance of non-reviewed ideas that happened to be wrong is a counterpoint that clearly refutes his main point.</p>
<p><strong>3. What about non-experimental methods?</strong></p>
<p>The author has a huge blind spot for non-experimental methods. He suggests that if the results of a scientific analysis can be replicated, then that is good enough for acceptance into an authoritative canon of truth. Moreover, he indicates that work that cannot replicate is “a whole lotta money for nothing”, basically a waste of time and resources. But a lot of science cannot be replicated, by virtue of the fact that science doesn’t always follow experimental protocols that allow for replication tests to be performed. Fantastic and valuable work that relies on non-experimental heuristics, including a lot of work in the social sciences and humanities, climate science, ecology, astronomy and various other fields, is left in the lurch. His take on non-replicability in these disciplines reads a lot like the unethical and ironically non-replicable Sokal hoaxes that serve as the basis for unhinged right-wing attacks on the social sciences and humanities.</p>
<p>This also contradicts the author’s hearkening to discourse among learned men of olde as a way of dealing with problems relating to peer review. Opening up the comments section, even if just limited to a curated list of credentialed scholars, is not the same as conducting independent replication studies. I think the reasoning behind this link is that if other people have experienced similar phenomena in their own labs, then it’s more likely to be accepted as true. But this is not the same as replication under the same conditions, it is just the same uncontrolled consensus-based evaluation criteria as peer review but with an open filter.</p>
<p><strong>4. Peer-review in context</strong></p>
<p>I agree with many of the things that the author is saying. Yes, there are many ways in which peer review is broken and could be improved. For instance, I agree with the notion that peer reviewers do not dive deep enough into the data and aren’t always critical enough. But I think that this is because most people are unprepared to do so, either because they do not have access to data or do not know how to work with statistics or read code. Moreover, certain journals like PNAS give preferential treatment to certain authors over others, and there are definitely major issues with racism and sexism in the evaluation process. Open peer review does not resolve these issues, mainly because it treats peer review in isolation.</p>
<p>The only way to make peer review better is by instilling good scholarly practices in the next generation of scholars. However, this is inhibited by structural issues, such as the tight job market that favours quantity of peer reviewed articles over any other factor, and the general prestige economy of academia. These are the root issues. The foul state of peer review is one aspect of this mess, alongside structural racism, sexism and transphobia, the sheer expense of obtaining an advanced degree and excelling in the years immediately post-PhD, and the pressures to conform to trends that get you funding. You cannot separate the problems with peer review from these issues. Yet somehow the author manages to completely sidestep these concerns, identifying the broken peer review system as a purely epistemic problem, rather than a problem with tangible and far-reaching social implications.</p>



 ]]></description>
  <category>open science</category>
  <category>peer review</category>
  <category>science core</category>
  <guid>https://zackbatist.info/posts/2022-12-17-comments-on-the-rise-and-fall-of-peer-review.html</guid>
  <pubDate>Sat, 17 Dec 2022 05:00:00 GMT</pubDate>
</item>
<item>
  <title>Comments on a recent ‘science mapping’ paper</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2022-12-08-review-of-a-recent-science-mapping-paper/</link>
  <description><![CDATA[ 





<p>A new paper examining published research outputs to describe the makeup of archaeology as a discipline just dropped, and it’s getting a lot of positive attention.</p>
<blockquote class="blockquote">
<p>Sinclair, A. 2022 Archaeological Research 2014 to 2021: an examination of its intellectual base, collaborative networks and conceptual language using science maps, Internet Archaeology 59. <a href="https://doi.org/10.11141/ia.59.10">https://doi.org/10.11141/ia.59.10</a></p>
</blockquote>
<p>I see some issues with the paper that I think are worth addressing. This is not a comprehensive review, more like a commentary based on my own interests and experiences. I welcome dialog with the author and anyone else who is interested in discussing this further.</p>
<p>I’m a bit hesitant to post this because I do not know the author, Anthony Sinclair, and I don’t want to come across as too harsh. I intentionally did not look him up prior to writing this post. This is a commentary on the paper, not the person behind it.</p>
<section id="simplistic-description-of-network-graphs" class="level2">
<h2 class="anchored" data-anchor-id="simplistic-description-of-network-graphs">Simplistic description of network graphs</h2>
<p>My first criticism is about the surface-level description of the network visualizations. Network visualizations are one of many ways of rendering a dataset, and the paper would have really benefited from more multifaceted statistical analysis of the underlying data. For example, it would have been nice to see the distribution of nodes with different degrees of centrality compared against some other variable, such as gender. The author falls back on a plain and simple citation count in his analysis of gender disparities, and misses a great opportunity to draw upon centrality measurements as a key indicator of inequitable aspects of professional development across the genders.</p>
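<p>To make this suggestion concrete, here is a minimal Python sketch of the kind of comparison I have in mind. Everything in it is an assumption made for illustration: the edges and the gender labels are invented rather than drawn from Sinclair’s dataset, the function names are hypothetical, and degree centrality is only one of several centrality measures that could be compared in this way.</p>

```python
from collections import defaultdict
from statistics import mean

# Hypothetical co-citation edges between six authors (invented for illustration).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E"), ("E", "F")]

# Hypothetical node attribute; a real analysis would need a documented source for it.
gender = {"A": "m", "B": "f", "C": "m", "D": "f", "E": "m", "F": "f"}


def degree_centrality(edge_list):
    """Share of the other nodes that each node is directly connected to."""
    neighbours = defaultdict(set)
    for u, v in edge_list:
        neighbours[u].add(v)
        neighbours[v].add(u)
    n = len(neighbours)
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


def mean_centrality_by_group(scores, groups):
    """Average the centrality scores of the nodes that share a group label."""
    by_group = defaultdict(list)
    for node, score in scores.items():
        by_group[groups[node]].append(score)
    return {label: mean(values) for label, values in by_group.items()}


centrality = degree_centrality(edges)
summary = mean_centrality_by_group(centrality, gender)
for label, value in sorted(summary.items()):
    print(label, round(value, 3))
```

<p>With real data, the full distribution of scores within each group would be more informative than a single mean, but even this simple summary goes a step beyond a raw citation count.</p>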
<p>The author also annotated the graphs with diagrams that look kind of like a compass rose. I only found one instance in the text describing them and their function:</p>
<p>“In certain maps, the key dimensions that affect the layout of the maps are identified in one of the upper corners of the map.”</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="figure6.png" class="lightbox" data-gallery="quarto-lightbox-gallery-1" title="One of the network visualizations from the original paper. Note the compass rose in the top left corner."><img src="https://zackbatist.info/posts/2022-12-08-review-of-a-recent-science-mapping-paper/figure6.png" class="img-fluid figure-img" alt="One of the network visualizations from the original paper. Note the compass rose in the top left corner."></a></p>
<figcaption>One of the network visualizations from the original paper. Note the compass rose in the top left corner.</figcaption>
</figure>
</div>
<p>What do these compass roses actually represent? Are they derived from the author’s interpretations, or are they derived from the dataset? This is unclear. In either case, I would have liked to understand the reasoning or approach for identifying the extremes at each end of the gradients, and how a node’s situation along the scale is determined.</p>
</section>
<section id="framing-of-science-and-non-science" class="level2">
<h2 class="anchored" data-anchor-id="framing-of-science-and-non-science">Framing of science and non-science</h2>
<p>This paper perpetuates an outdated dichotomy between science and the arts and humanities. It never really defines either of these things, nor does it attempt to reconcile the terms used by the citation databases with the author’s own notion of what science and the arts and humanities mean. But these terms appear in the compass roses and in the descriptions of the graph visualizations as if their meanings are self-evident.</p>
<p>Also very interesting is that the paper says a lot about science but not much about the arts and humanities. In fact, it may be more apt to say that this paper describes science and non-science, rather than some alternative other cohesive entity. The author describes journals, topics and methods that he identifies as scientific, but does not do this at all for entities that he relates to the arts and humanities. The lack of distinction between these terms reveals a lack of willingness to treat the things they represent as things in themselves rather than as a lack of something, namely, science.</p>
<p>The paper also relies on really outdated visions of the character of various disciplines and of archaeology specifically. As far as I can tell, it relies on two sources:</p>
<ul>
<li>Pantin, C.F.A. 1968 The Relations Between the Sciences, Cambridge: Cambridge University Press</li>
<li>Becher, T. and Trowler, P.R. 2001 Academic Tribes and Territories, 2nd Edition, Maidenhead: Society for Research into Higher Education/Open University Press.</li>
</ul>
<p>Becher is extremely outdated and falls within a period when scientists (especially social scientists, including Binford) were aching to make their disciplines seem more scientific. So there is a strange value judgement at play, and they often failed to capture the reality of how science actually works. The other source is mentioned only very briefly in passing, but follows a similar essentialist rhetoric regarding the fundamental nature of specific disciplines, which rubs me the wrong way. A lot of excellent work that examines the pragmatic reality of scientific practice, which highlights contradictions and misrepresentations, and that presents the fluidity across disciplines rather than hard distinctions, is simply ignored (e.g.&nbsp;Latour and Woolgar’s Laboratory Life, Latour’s Pandora’s Hope, Knorr-Cetina’s Epistemic Cultures, Bowker’s Science on the Run, to name just a few).</p>
</section>
<section id="critical-reflection-on-what-the-networks-actually-represent" class="level2">
<h2 class="anchored" data-anchor-id="critical-reflection-on-what-the-networks-actually-represent">Critical reflection on what the networks actually represent</h2>
<p>The value of analyzing citation networks is unclear to me, and the author doesn’t really convince me that they represent a “window on the shape of the discipline”. Citation networks depict clusters of citations, but the jump to making these clusters meaningful in relation to some broader social or epistemic phenomenon is never really articulated. Moreover, the author indicates that he applied the Girvan-Newman method for identifying clusters, but doesn’t really incorporate the means through which this algorithm operates, including its limitations, into the analysis. Clusters do not simply exist; they are highlighted through some method, which impacts what we see.</p>
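<p>To show how much the method shapes the clusters, here is a rough Python sketch of the Girvan-Newman procedure: repeatedly remove the edge with the highest shortest-path betweenness and record the partition each time the network splits. It runs on an invented six-node graph rather than on the paper’s data, the function names are hypothetical, and it is not meant to reproduce whatever software the paper used.</p>

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def components(nodes, edges):
    """Connected components, each returned as a sorted list of nodes."""
    adj = build_adjacency(edges)
    seen, found = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), []
        while queue:
            node = queue.popleft()
            members.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        found.append(sorted(members))
    return found


def edge_betweenness(nodes, edges):
    """Brandes-style betweenness for an undirected, unweighted graph.

    Each pair of nodes is counted from both ends, which doubles every
    score but leaves the ranking of edges unchanged.
    """
    adj = build_adjacency(edges)
    score = {frozenset(e): 0.0 for e in edges}
    for source in nodes:
        dist, order = {source: 0}, []
        sigma, preds = defaultdict(float), defaultdict(list)
        sigma[source] = 1.0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    return score


def girvan_newman_levels(nodes, edges):
    """Yield the partition each time an edge removal splits a component."""
    remaining = [tuple(e) for e in edges]
    count = len(components(nodes, remaining))
    while remaining:
        score = edge_betweenness(nodes, remaining)
        top = max(remaining, key=lambda e: score[frozenset(e)])
        remaining.remove(top)
        parts = components(nodes, remaining)
        if len(parts) != count:
            count = len(parts)
            yield parts


# Invented example: two triangles joined by a single bridge (3, 4).
nodes = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
levels = list(girvan_newman_levels(nodes, edges))
for parts in levels:
    print(len(parts), parts)
```

<p>The same graph yields anything from two clusters to six singletons depending on where the edge removal stops. That stopping point is an analytical decision rather than a property of the data.</p>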
<p>After writing an initial draft I found that the author has done other relevant scientometric work with a more targeted scope:</p>
<blockquote class="blockquote">
<p>Sinclair, A. (2020). From Specialty to Specialist: a citation analysis of Evolutionary Anthropology, Palaeolithic Archaeology and the Work of John Gowlett 1970-2018. In J. Cole, J. McNabb, M. Grove, &amp; R. Hosfield (Eds.), Landscapes of Evolution: Studies in Honour of John Gowlett (pp.&nbsp;175-201). Oxford: Archaeopress.</p>
</blockquote>
<p>Although I do not have access to this paper, it is likely to be much more effective, because these kinds of analyses tend to work better when a more specific objective is outlined: it’s easier to ground the relationships within a specific set of experiences, rather than relying too much on generalizations and abstractions.</p>
</section>
<section id="analysis-of-language-and-push-for-standardized-terminology" class="level2">
<h2 class="anchored" data-anchor-id="analysis-of-language-and-push-for-standardized-terminology">Analysis of language and push for standardized terminology</h2>
<p>I like the analysis of language and keywords. I think it’s the strongest part of the paper, and there’s a lot of potential there. However, the author draws this into a push towards standardization, which seems kind of forced and not relevant to the analysis of keywords across the literature. The author frames the diverse array of terminology as a problem that needs to be overcome, rather than a very interesting aspect of archaeological research practice with its own benefits and affordances. Standards implemented in harder sciences are stated as goals worth attaining, but I’m left unconvinced that this is really worth doing based on the findings presented here.</p>


</section>

 ]]></description>
  <category>bibliometrics</category>
  <category>network analysis</category>
  <category>science studies</category>
  <guid>https://zackbatist.info/posts/2022-12-08-review-of-a-recent-science-mapping-paper/</guid>
  <pubDate>Thu, 08 Dec 2022 05:00:00 GMT</pubDate>
</item>
<item>
  <title>Open science and its weird conception of data</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2022-11-28-open-science-and-its-weird-conception-of-data.html</link>
  <description><![CDATA[ 





<p>In an early draft of one of my dissertation’s background chapters I wrote a ranty section about notions of data held by the open science movement that I find really annoying. I eventually excised this bit of text, and while it isn’t really worth assembling into any publication, I thought it may still be worth sharing here. So here is a lightly adapted version, original circa May 2022.</p>
<p>The past decade has seen a major push to develop information infrastructures, or “set[s] of organizational practices, technical infrastructure, and social norms that collectively provide for the smooth operation of scientific work at a distance” <span class="citation" data-cites="bowker2010">(Bowker et al. 2010: 102)</span>, that are specifically oriented towards facilitating data sharing and reuse among archaeologists. These efforts frequently identify as participating within the open access (OA) movement, which is a distributed grassroots campaign that encourages greater accessibility of scientific research outputs. The OA movement is in turn inspired by the free and open source software (FOSS) movement, whose goals are to ensure that anyone can run, study, modify and share software without restriction, but which is more popularly identified with collective and non-commercial software development processes guided by the aforementioned principles rather than by a pursuit of profit <span class="citation" data-cites="kelty2008 costa2013">(Kelty 2008: 254-255; Costa 2013: 449-450)</span>. While the OA and FOSS movements are distinct in that OA largely deals with scientific practices and the outcomes of scientific research while FOSS is concerned with software development, they share a common concern with inclusive, transparent and collective integration of knowledge and labour. The principles that inform both movements are indeed commendable; however, the ways that they tend to envisage science and scientific knowledge production warrant critical reflection. Here I identify some concerning issues that have been incorporated into the open data infrastructures that archaeologists have begun to adopt, and which contribute to a problematic and counter-intuitive conception of archaeological data and of archaeological knowledge production in general.</p>
<p>Data are the material records that archaeologists produce to store, transmit and communicate meaning. They are functional in nature, meaning they are tools that archaeologists rely on to extend their understanding of phenomena of interest. In other words, data work alongside analytical methods so that archaeologists can get from one point of understanding to another. Data are produced and used through pragmatic action and necessity, and exhibit characteristics that derive from the circumstances of their creation and their intended purposes. This notion that data are the products of social and material decisions is widely accepted by scholars of scientific practice, but many academics who generate and make use of data in their day-to-day work, including archaeologists, still commonly consider data to be disembodied statements about the world as it truly is, which inform more elaborate and complex ways of understanding particular phenomena <span class="citation" data-cites="kintigh2015">(Kintigh et al. 2015: 2)</span>. This aligns with a popularly held view of science as the collective pursuit of a unified and unambiguous understanding of nature. In this general vision of science, knowledge accumulates on a grand, global scale and is thought to inform technological change in a way that reflects arbitrarily defined scales of technological progress. For instance, early hominins’ ability to craft stone tools is thought to have been necessary for the development of iron production, which necessarily informed the invention of steel, and eventually led humans to create electronic computers, and so on into an imagined future that conspicuously resembles the future worlds depicted by a particular cohort of post-war science fiction writers (e.g.&nbsp;Arthur C. Clarke’s 2001: A Space Odyssey, H.G. Wells’ World Brain, Isaac Asimov’s Foundation trilogy, Gene Roddenberry’s Star Trek).</p>
<p>Interestingly enough, many non-archaeologist adherents to this ideology rely on outmoded archaeological frameworks that have been discredited for decades, namely grand and deterministic histories of scientific and technological progression. They use their limited understanding of archaeology to identify “technologies out of place”, or artefacts that do not fit in with the just-so technological narratives summarized above. Things like the Antikythera mechanism, the Baghdad Battery, and the archaeological site of Göbekli Tepe are sources of fervent discussion that often feeds into colonial and racist pseudoarchaeological tropes regarding “lost civilizations” that seeded the remnants of their knowledge to the ancestors of indigenous peoples.</p>
<p>The accumulation and assembly of data is thought to contribute to a species-level understanding of the world, which is not held by any one individual but is stored in media such as books, scientific reports, and internet-connected archives <span class="citation" data-cites="bush1945">(cf. Bush 1945)</span>. These documents are perceived of as value-neutral and purely representational in nature, and are thought to be produced and maintained by scholars and librarians exhibiting the virtuous traits of scientific objectivity, openness to alternative perspectives, and the capability to critique ideas in a rational and structured manner <span class="citation" data-cites="ettarh2018">(cf. Ettarh 2018)</span>.</p>
<p>This idealized vision of science is commonly illustrated through the data/information/knowledge/wisdom (DIKW) pyramid, which relates basic and synthetic ways of understanding the world. More specifically, the DIKW model states that representations of a natural truth (data) undergird more complex statements (information) which inform explanations (knowledge) and eventually contribute to intrinsic understanding (wisdom). Despite the implication of movement from one stage to the next, this model fails to account for how these transitions actually occur in practice. Moreover, the flow is assumed to be unidirectional up the pyramid towards a pinnacle of pure human thought. This resembles another popular metaphor in knowledge management, that of the oil pipeline, whereby data are presented as scarce natural resources that are harvested and gradually refined to create more stable and marketable products. Mining for data is characterized as visceral and material work that occurs close to nature, while refining and synthesizing are imagined as more mental and formulaic processes. Like the DIKW scheme, the pipeline model assumes that the starting point is a natural and free-flowing repository of truth, and that researchers must contain and channel it to give it greater value, while giving it artificial shape in the process <span class="citation" data-cites="huggett2020 dallas2015">(Huggett 2020: S8-S9; Dallas 2015: 194)</span>. Any sources of friction that impede the flow of data are considered as obstructions that must be cleared or worked around to facilitate the development of more elaborate forms of understanding <span class="citation" data-cites="huggett2022a">(Huggett 2022: 284)</span>. At the same time, the systems engineered to channel information are meant to protect us, to ensure that we do not get swept away by the “data deluge”, unmoored and lost at sea <span class="citation" data-cites="bevan2015">(cf. Bevan 2015)</span>.</p>
<p>This drives an obsessive concern with workflows pertaining to legal and logistical issues surrounding scientific publishing that OA advocates deem problematic. Many OA advocates see publishing as the business of typesetting and copyright law, which they could render moot using automated publishing workflows and by encouraging use of open licensing agreements <span class="citation" data-cites="foster2017 harnad1998">(cf. Foster and Deardorff 2017; Harnad 1998)</span>. If publishing and ownership were merely technical systems, their problems could be resolved through technical means. However, academic publishing and ownership involve social arrangements that serve to stabilize knowledge, grant authority to validated claims, and enable science to move forward <span class="citation" data-cites="kelty2008">(Kelty 2008: 274-275)</span>. In other words, technocentric visions of publication workflows tend to ignore the fact that publication is a cultural phenomenon, whereby projects are made complete and knowledge claims are articulated, credited, and rendered accountable to the people who proposed them.</p>
<p>The technocratic system imagined by OA advocates envisions a global web of information, whereby new forms of knowledge emerge through novel integrations <span class="citation" data-cites="tennant2020 harnad1998">(cf. Tennant et al. 2020; Harnad 1998)</span>. If bits of data click together and are consistent with a pre-existing understanding of the world (which already clicks together), the new knowledge is deemed legitimate. Science is therefore considered to be self-correcting, since the act of assembling data to produce new knowledge is itself the means through which claims are verified or refuted. This resembles the means of evaluating contributions in open source projects, which emphasizes code’s functionality as a primary factor. If code does not run, there must be a bug, or an inconsistency in the program, that renders it non-functional <span class="citation" data-cites="kelty2008">(Kelty 2008: 220)</span>. Contributions to a collective code base are therefore said to be based on merit, or the skillful implementation of code, rather than according to the qualities of the programmer who committed the code <span class="citation" data-cites="coleman2012">(Coleman 2012: 121)</span>. This parallels the idealized conception of scientific knowledge production described above, in that contributions to a collective enterprise are considered to be disembodied, unambiguous and lacking positionality ascribed by the people who contributed them.</p>
<p>These transformations are genuine attempts to reify the adage that “information wants to be free”, which implies that information is constrained by external domineering forces (namely, the arbitrary restrictions imposed by copyright law and the use of proprietary media formats and communications protocols) and that information can and should exist in a boundless state, which is assumed to be more natural <span class="citation" data-cites="harnad1998">(cf. Harnad 1998)</span>. However, this bears an unsettling resemblance to libertarian ideology in that each involves a spurious assumption that its key agents of concern are naturally independent and asocial beings, and that these isolated and atomic units may vaguely combine to form products of greater value, i.e.&nbsp;communities or states and knowledge, respectively. Value is ascribed based on market-based solutions (“the marketplace of ideas”), which assume that all actors within the system behave rationally and in accordance with the system’s built-in assumptions <span class="citation" data-cites="wellen2004">(Wellen 2004: 110)</span>. In a world where digital communications platforms have come to resemble state institutions to a great extent <span class="citation" data-cites="gorwa2019 nieborg2018">(cf. Gorwa 2019; Nieborg and Poell 2018)</span>, OA promises to enact a populist and anti-establishment vision for the future of scholarly communications, as illustrated by <span class="citation" data-cites="suber2003">Suber (2003)</span> who remarked that the OA revolution is the start of a new era wherein “scientific communication can be in the hands of scientists, who answer to one another, rather than corporations, who answer to shareholders.”</p>
<p>All of this colloquial imagery frames science as largely responsive to nature, and ignores how science constructs data. OA visions of science rarely take into account the double hermeneutic that is characteristic of social science research methods, which considers researchers’ roles in ascribing meaning to the objects that they identify and collect to begin with. And yet, the myth of unidirectional data production and processing persists even in archaeology, despite the discipline’s strong tendency for critical reflection regarding its own practices <span class="citation" data-cites="sorensen2017">(Sørensen 2017: 107)</span>. This is perhaps due to the practical and bureaucratic contexts within which archaeological research operates, namely the need to produce particular kinds of research outputs that require specific kinds of stable inputs. For instance, the functional value of archaeological data is associated with their usefulness for generating the kinds of reports that archaeologists deem valuable, which are typically one-way forms of communication that contribute to an additive process of knowledge production (e.g.&nbsp;journal articles, book manuscripts, conference presentations). Anything else, including “unofficial” or non-authoritative perspectives and discourse, is therefore commonly deemed extraneous for the purposes of formal analysis, and tends to be dismissed as a lower or less professional form of archaeological engagement, despite wide recognition of its value in theoretical discourse <span class="citation" data-cites="hodder1989 joyce2002">(Hodder 1989: 273-274; Joyce 2002: 138-139)</span>. Thus, the information infrastructures and sociopolitical pressures that frame the value regimes of archaeological research together contribute to a particular vision of what data are and how they should be acted upon.</p>




<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-bibliography"><h2 class="anchored quarto-appendix-heading">References</h2><div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0">
<div id="ref-bevan2015" class="csl-entry">
Bevan, Andrew. 2015. <span>“The Data Deluge.”</span> <em>Antiquity</em> 89 (348): 1473–84. <a href="https://doi.org/10.15184/aqy.2015.102">https://doi.org/10.15184/aqy.2015.102</a>.
</div>
<div id="ref-bowker2010" class="csl-entry">
Bowker, Geoffrey C., Karen Baker, Florence Millerand, and David Ribes. 2010. <span>“Toward Information Infrastructure Studies: Ways of Knowing in a Networked Environment.”</span> In <em>International Handbook of Internet Research</em>, edited by Jeremy Hunsinger, Lisbeth Klastrup, and Matthew Allen, 97–117. Dordrecht: Springer Netherlands. <a href="https://doi.org/10.1007/978-1-4020-9789-8_5">https://doi.org/10.1007/978-1-4020-9789-8_5</a>.
</div>
<div id="ref-bush1945" class="csl-entry">
Bush, Vannevar. 1945. <span>“As We May Think.”</span> <em>The Atlantic Monthly</em> 176 (1): 101–8. <a href="http://www.ias.ac.in/describe/article/reso/005/11/0094-0103">http://www.ias.ac.in/describe/article/reso/005/11/0094-0103</a>.
</div>
<div id="ref-coleman2012" class="csl-entry">
Coleman, E. Gabriella. 2012. <em>Coding Freedom: The Ethics and Aesthetics of Hacking</em>. Princeton University Press. <a href="https://doi.org/10.1515/9781400845293">https://doi.org/10.1515/9781400845293</a>.
</div>
<div id="ref-costa2013" class="csl-entry">
Costa, Cristina. 2013. <span>“The Habitus of Digital Scholars.”</span> <em>Research in Learning Technology</em> 21 (1): 21274. <a href="https://doi.org/10.3402/rlt.v21.21274">https://doi.org/10.3402/rlt.v21.21274</a>.
</div>
<div id="ref-dallas2015" class="csl-entry">
Dallas, Costis. 2015. <span>“Curating Archaeological Knowledge in the Digital Continuum: From Practice to Infrastructure.”</span> <em>Open Archaeology</em> 1 (1): 176–207. <a href="https://doi.org/10.1515/opar-2015-0011">https://doi.org/10.1515/opar-2015-0011</a>.
</div>
<div id="ref-ettarh2018" class="csl-entry">
Ettarh, Fobazi. 2018. <span>“Vocational Awe and Librarianship: The Lies We Tell Ourselves.”</span> <em>In the Library with the Lead Pipe</em>. <a href="https://www.inthelibrarywiththeleadpipe.org/2018/vocational-awe/">https://www.inthelibrarywiththeleadpipe.org/2018/vocational-awe/</a>.
</div>
<div id="ref-foster2017" class="csl-entry">
Foster, Erin D., and Ariel Deardorff. 2017. <span>“Open Science Framework (OSF).”</span> <em>Journal of the Medical Library Association : JMLA</em> 105 (2): 203–6. <a href="https://doi.org/10.5195/jmla.2017.88">https://doi.org/10.5195/jmla.2017.88</a>.
</div>
<div id="ref-gorwa2019" class="csl-entry">
Gorwa, Robert. 2019. <span>“What Is Platform Governance?”</span> <em>Information, Communication &amp; Society</em> 22 (6): 854–71. <a href="https://doi.org/10.1080/1369118X.2019.1573914">https://doi.org/10.1080/1369118X.2019.1573914</a>.
</div>
<div id="ref-harnad1998" class="csl-entry">
Harnad, Stevan. 1998. <span>“Learned Inquiry and the Net: The Role of Peer Review, Peer Commentary and Copyright.”</span> <em>Learned Publishing</em> 11 (4): 283–92. <a href="https://doi.org/10.1087/09531519850146229">https://doi.org/10.1087/09531519850146229</a>.
</div>
<div id="ref-hodder1989" class="csl-entry">
Hodder, Ian. 1989. <span>“Writing Archaeology: Site Reports in Context.”</span> <em>Antiquity</em> 63 (239): 268–74. <a href="https://doi.org/10.1017/S0003598X00075980">https://doi.org/10.1017/S0003598X00075980</a>.
</div>
<div id="ref-huggett2020" class="csl-entry">
Huggett, Jeremy. 2020. <span>“Is Big Digital Data Different? Towards a New Archaeological Paradigm.”</span> <em>Journal of Field Archaeology</em> 45 (February):S8–17. <a href="https://doi.org/10.1080/00934690.2020.1713281">https://doi.org/10.1080/00934690.2020.1713281</a>.
</div>
<div id="ref-huggett2022a" class="csl-entry">
———. 2022. <span>“Data Legacies, Epistemic Anxieties, and Digital Imaginaries in Archaeology.”</span> <em>Digital</em> 2 (2): 267–95. <a href="https://doi.org/10.3390/digital2020016">https://doi.org/10.3390/digital2020016</a>.
</div>
<div id="ref-joyce2002" class="csl-entry">
Joyce, Rosemary. 2002. <em>The Languages of Archaeology: Dialogue, Narrative, and Writing</em>. Wiley. <a href="https://books.google.com?id=k51TlhQeeQsC">https://books.google.com?id=k51TlhQeeQsC</a>.
</div>
<div id="ref-kelty2008" class="csl-entry">
Kelty, Christopher M. 2008. <em>Two Bits: The Cultural Significance of Free Software</em>. Duke University Press.
</div>
<div id="ref-kintigh2015" class="csl-entry">
Kintigh, Keith W., Jeffrey H. Altschul, Ann P. Kinzig, W. Fredrick Limp, William K. Michener, Jeremy A. Sabloff, Edward J. Hackett, Timothy A. Kohler, Bertram Ludäscher, and Clifford A. Lynch. 2015. <span>“Cultural Dynamics, Deep Time, and Data: Planning Cyberinfrastructure Investments for Archaeology.”</span> <em>Advances in Archaeological Practice</em> 3 (1): 1–15. <a href="https://doi.org/10.7183/2326-3768.3.1.1">https://doi.org/10.7183/2326-3768.3.1.1</a>.
</div>
<div id="ref-nieborg2018" class="csl-entry">
Nieborg, David B, and Thomas Poell. 2018. <span>“The Platformization of Cultural Production: Theorizing the Contingent Cultural Commodity.”</span> <em>New Media &amp; Society</em> 20 (11): 4275–92. <a href="https://doi.org/10.1177/1461444818769694">https://doi.org/10.1177/1461444818769694</a>.
</div>
<div id="ref-sorensen2017" class="csl-entry">
Sørensen, Tim Flohr. 2017. <span>“The Two Cultures and a World Apart: Archaeology and Science at a New Crossroads.”</span> <em>Norwegian Archaeological Review</em> 50 (2): 101–15. <a href="https://doi.org/10.1080/00293652.2017.1367031">https://doi.org/10.1080/00293652.2017.1367031</a>.
</div>
<div id="ref-suber2003" class="csl-entry">
Suber, Peter. 2003. <span>“Open Access to Science and Scholarship.”</span> In. Geneva: World Summit on the Information Society (WSIS). <a href="https://dash.harvard.edu/bitstream/handle/1/4552056/suber_wsis.htm">https://dash.harvard.edu/bitstream/handle/1/4552056/suber_wsis.htm</a>.
</div>
<div id="ref-tennant2020" class="csl-entry">
Tennant, Jonathan, Ritwik Agarwal, Ksenija Baždarić, David Brassard, Tom Crick, Daniel J. Dunleavy, Thomas Rhys Evans, et al. 2020. <span>“A Tale of Two ’Opens’: Intersections Between Free and Open Source Software and Open Scholarship.”</span> <em>SocArXiv</em>, March. <a href="https://doi.org/10.31235/osf.io/2kxq8">https://doi.org/10.31235/osf.io/2kxq8</a>.
</div>
<div id="ref-wellen2004" class="csl-entry">
Wellen, Richard. 2004. <span>“Taking on Commercial Scholarly Journals: Reflections on the <span>‘Open Access’</span> Movement.”</span> <em>Journal of Academic Ethics</em> 2 (1): 101–18. <a href="https://doi.org/10.1023/B:JAET.0000039010.14325.3d">https://doi.org/10.1023/B:JAET.0000039010.14325.3d</a>.
</div>
</div></section></div> ]]></description>
  <category>notions of data</category>
  <category>open data</category>
  <category>open science</category>
  <guid>https://zackbatist.info/posts/2022-11-28-open-science-and-its-weird-conception-of-data.html</guid>
  <pubDate>Mon, 28 Nov 2022 05:00:00 GMT</pubDate>
</item>
<item>
  <title>Abstract submitted for DAB23 (Bern, Switzerland)</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2022-11-27-abstract-submitted-for-dab23-bern-switzerland/</link>
  <description><![CDATA[ 





<p>Today I submitted an abstract to present at the DAB23 colloquium hosted by the Bern Computational and Digital Archaeology lab. The conference is about “advancing open research into the next decade” and my paper is titled <strong>Documenting the collaborative commitments that support data sharing within archaeological project collectives</strong>. Here is <a href="dab23-abstract.pdf">the abstract</a>:</p>
<blockquote class="blockquote">
<p>Archaeological research is inherently collaborative, in that it involves many people coming together to examine a material assemblage of mutual interest by implementing a variety of tools and methods in tandem. Independent projects establish organizational structures and information systems to help coordinate labour and pool information derived thereof into a communal data stream, which can then be applied towards the production and publication of analytical findings. Albeit not necessarily egalitarian, and with different expectations set for people assigned different roles, archaeological projects thus constitute a form of commons, whereby participants contribute to and obtain value from a collective endeavour. Adopting open research practices, including sharing data beyond a project’s original scope, involves altering the collaborative commitments that bind work together. This paper, drawn from my doctoral dissertation, examines how archaeologists are presently navigating this juncture between established professional norms and expectations on the one hand, and the potential benefits and limitations afforded by open research on the other.</p>
<p>I applied an abductive qualitative data analysis approach based on recorded observations, interviews, and documents collected from three cases, including two independent archaeological projects and one regional data sharing consortium with limited scope and targeted research objectives. My analysis documents a few underappreciated aspects of archaeological projects’ sociotechnical arrangements that open data infrastructures should account for more thoroughly:</p>
<ol type="1">
<li>boundaries, whether they restrict membership within a collective, delimit a project’s scope, or limit the time frame under which a project operates, have practical positive value, and are not just arbitrary impediments;</li>
<li>systems designed to direct the flow of information do so via the coordination of labour, and the strategic arrangement of human and object agency, as well as resistances against such managerial control, are rarely accounted for in data documentation; and</li>
<li>information systems and the institutional structures that support them tend to reinforce and reify existing power structures and divisions of labour, including implicit rules that govern ownership and control over research materials and that designate who may benefit from their use.</li>
</ol>
<p>By framing data sharing, whether it occurs between close colleagues or as mediated by open data platforms among strangers, as comprising a series of collaborative commitments, my work highlights the broader social contexts within which we develop open archaeological research infrastructures. As we move forward, we should be aware of and account for how the data governance models embedded within open research infrastructures either complement or challenge existing social dynamics.</p>
</blockquote>



 ]]></description>
  <category>abstract</category>
  <category>conference</category>
  <category>DAB23</category>
  <category>open archaeology</category>
  <category>open science</category>
  <guid>https://zackbatist.info/posts/2022-11-27-abstract-submitted-for-dab23-bern-switzerland/</guid>
  <pubDate>Sun, 27 Nov 2022 05:00:00 GMT</pubDate>
</item>
<item>
  <title>Mastodon and the potential for community growth</title>
  <dc:creator>Zack Batist</dc:creator>
  <link>https://zackbatist.info/posts/2022-11-23-mastodon-and-the-potential-for-community-growth/</link>
  <description><![CDATA[ 





<p>It’s a weird time to be starting a blog. Twitter has imploded and many of my colleagues have started using mastodon instead. There’s a lot of talk about the virtues of decentralization and of establishing and maintaining firm community values as reflected in content moderation policies and practices. Of course, all of this discourse is happening in microblog format, and is restricted by the usual inability to have any kind of nuanced conversation on the web. I feel that when posting on twitter and on mastodon alike, I’m bound to a formal position, and I find it hard to establish a tone that is my own. This makes it difficult for me to be casual, and to express my thoughts in a way that makes sense to me, especially when my ideas are really half-baked or vaguely critical. So I started this blog to help me retain a more tentative voice that I often express in casual conversations, and which I’m terrified of letting out in more formal or professional spaces.</p>
<p>The shift to mastodon has been interesting. It definitely has a very different vibe, but there’s a chance that this might just be due to the novelty of the experience. Sure, there are affordances built in to the platform that enable or encourage certain behaviours, such as content warnings, image descriptions and various means of controlling post visibility, but the value of these features will depend on whether people take action and actually use them.</p>
<p>I think that the biggest change, whose ramifications we’re just starting to see, has to do with community governance. On twitter, the usual and pretty much only way of responding to inadequate content moderation was to complain and put up with it. But on mastodon there are three main ways you can deal with it:</p>
<ol type="1">
<li>put up with it,</li>
<li>switch to another instance, or</li>
<li>get involved, give feedback, make change</li>
</ol>
<p>People are very used to the first option, and the latter two require more work. The second option involves a bit of work to find another instance that appeals to you, to create a new introduction post and build out your profile again, and to re-link all your other socials, etc. The third option seems like the most exciting one, since it actually feels like a potential venue for dynamic community building, for personal and collective growth. The distinction between the second and third options may also have a lot to do with a weird tension between techno-libertarian and anarcho-syndicalist visions of (web-based) community building (but more on that in another post…).</p>
<p>This is the sort of thing that is on my mind as <a href="https://archaeo.social/about">archaeo.social</a> continues to develop. <a href="https://archaeo.social/@joeroe">Joe Roe</a> started archaeo.social as a mastodon instance for archaeologists, and I joined him soon after to help with content moderation and to plan some community guidelines (still in progress). I’m learning a lot through this whole experience. I’m learning to be more patient, more open to other perspectives, less controlling, and less apprehensive. It’s still early days, and Joe is encouraging me to sit tight, let the community do its thing, and let them shape the path ahead, which scares the hell out of me. Hours before archaeo.social launched, I even posted a very critical toot about how this would be bound to fail, but look at me now, riding shotgun!</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="Screen-Shot-2022-11-23-at-3.10.14-AM.png" class="lightbox" data-gallery="quarto-lightbox-gallery-1" title="Me being wrong on the internet, hopefully."><img src="https://zackbatist.info/posts/2022-11-23-mastodon-and-the-potential-for-community-growth/Screen-Shot-2022-11-23-at-3.10.14-AM.png" class="img-fluid figure-img" alt="Me being wrong on the internet, hopefully."></a></p>
<figcaption>Me being wrong on the internet, hopefully.</figcaption>
</figure>
</div>
<p>In retrospect, the kind of attitude I expressed in that post a couple weeks ago may be what’s holding us back. We need to try things out and play around to find out what else could come from all of this. I’m very eager to have been wrong.</p>



 ]]></description>
  <category>archaeo-social</category>
  <category>mastodon</category>
  <category>meta</category>
  <guid>https://zackbatist.info/posts/2022-11-23-mastodon-and-the-potential-for-community-growth/</guid>
  <pubDate>Wed, 23 Nov 2022 05:00:00 GMT</pubDate>
</item>
</channel>
</rss>
